I wrote an R script whose first few lines define my working directory and the names of some .csv files. I run this script by submitting it as a Slurm job, and I'm wondering how I can define those values from the Slurm script instead of hard-coding them in the R script.
As an example, my Rscript.R would look like this:
setwd(my.wd)
x <- read.csv(f1)
y <- read.csv(f2)
out.df <- rbind(x, y)
write.csv(out.df, file = "out.csv")
And I'd imagine my Slurm job script myjob.run would look like this:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=1GB
#SBATCH --time=1:00:00
module load r/4.4.2
R
my.wd <- "/path/to/files"
f1 <- "file1.csv"
f2 <- "file2.csv"
Rscript Rscript.R
However, myjob.run would not be able to run the line Rscript Rscript.R, because the language has been switched from bash to R.
2 Answers
1. Rewrite your R script to retrieve command args dynamically:
## parms
ncpu <- 8 ## later use in e.g. `parallel::mclapply(..., mc.cores=ncpu)`
## Get command args
args <- commandArgs(trailingOnly = TRUE)
## initialize flags
my_wd <- NULL
f1 <- NULL
f2 <- NULL
## loop through args, picking up the value that follows each flag
## (note: incrementing `i` inside an R for loop does not skip the next
## iteration, but the value arguments match none of the flags, so they
## are simply ignored)
for (i in seq_along(args)) {
  arg <- args[i]
  if (arg == '-d' && length(args) > i) {
    my_wd <- args[i + 1]
  } else if (arg == '-f1' && length(args) > i) {
    f1 <- args[i + 1]
  } else if (arg == '-f2' && length(args) > i) {
    f2 <- args[i + 1]
  }
}
setwd(my_wd)
x <- read.csv(f1)
y <- read.csv(f2)
out_df <- rbind(x, y)
write.csv(out_df, file = "./out.csv") ## specify full path!
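For reference, here is a small sketch (using the placeholder paths from this answer) of what the parsing loop sees when the script is launched with the flags shown in step 2:
## commandArgs(trailingOnly = TRUE) for the call
##   Rscript --vanilla Rscript.R -d "/path/to/files" -f1 "file1.csv" -f2 "file2.csv"
## returns this character vector:
args <- c("-d", "/path/to/files", "-f1", "file1.csv", "-f2", "file2.csv")
## after the loop above:
##   my_wd == "/path/to/files", f1 == "file1.csv", f2 == "file2.csv"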
2. In your bash script, use flags:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=1GB
#SBATCH --time=1:00:00
module load r/4.4.2
#R ## shouldn't be necessary
## parms
R_script_path="./Rscript.R"
my_wd="/path/to/files"
f1="file1.csv"
f2="file2.csv"
## run script
Rscript --vanilla "$R_script_path" -d "$my_wd" -f1 "$f1" -f2 "$f2"
Called via
$ sbatch foo.sh
Alternatively, you could provide the flags in the command line:
## parms
R_script_path="./Rscript.R"
#my_wd="/path/to/files"
#f1="file1.csv"
#f2="file2.csv"
## run script
Rscript --vanilla "$R_script_path" "$@"
Then called via
$ sbatch foo.sh -d "/path/to/files" -f1 "file1.csv" -f2 "file2.csv"
Anything placed after the script name on the sbatch command line is passed to the batch script as its positional parameters, so "$@" forwards the flags unchanged to Rscript.
PS: Avoid periods (.) in R object names; they can have a special meaning (e.g. in S3 method dispatch). Use underscores (_) instead.
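A minimal illustration of why the dot matters (the class name myclass here is made up for the example):
## In S3, a function named "generic.class" is picked up as that generic's
## method for that class -- print.myclass() becomes the print() method
## for objects of class "myclass".
print.myclass <- function(x, ...) cat("custom print for myclass\n")
obj <- structure(list(), class = "myclass")
print(obj)   # dispatches to print.myclass()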
You should pass the variables as command-line arguments in your Slurm script.
Then, modify Rscript.R to accept arguments.
This way, the Slurm script remains in Bash, and R receives the necessary variables as arguments.
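A minimal sketch of that approach, assuming the three values are simply passed positionally (working directory first, then the two file names):
## Rscript.R -- invoked from the job script as, e.g.:
##   Rscript Rscript.R "/path/to/files" "file1.csv" "file2.csv"
args <- commandArgs(trailingOnly = TRUE)
my_wd <- args[1]
f1    <- args[2]
f2    <- args[3]

setwd(my_wd)
x <- read.csv(f1)
y <- read.csv(f2)
out_df <- rbind(x, y)
write.csv(out_df, file = "out.csv")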