I wrote an R script whose first few lines define my working directory and the names of some .csv files. I run this script by submitting it as a Slurm job, and I'm wondering how I can define those values from the Slurm script instead of hard-coding them in the R script.
As an example, my Rscript.R would look like this:
setwd(my.wd)
x <- read.csv(f1)
y <- read.csv(f2)
out.df <- rbind(x, y)
write.csv(out.df, file = "out.csv")
And I'd imagine my Slurm job script myjob.run would look like this:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=1GB
#SBATCH --time=1:00:00
module load r/4.4.2
R
my.wd <- "/path/to/files"
f1 <- "file1.csv"
f2 <- "file2.csv"
Rscript Rscript.R
However, myjob.run would not be able to run the line Rscript Rscript.R, because the language has been switched from bash to R.
2 Answers
1. Rewrite your R script to retrieve command args dynamically:
## parms
ncpu <- 8 ## later use in e.g. `parallel::mclapply(..., mc.cores=ncpu)`
## Get command args
args <- commandArgs(trailingOnly = TRUE)
## initialize flags
my_wd <- NULL
f1 <- NULL
f2 <- NULL
## loop through args, picking up the value that follows each flag
## (note: incrementing `i` inside an R for loop does not skip the next
## iteration, but the value arguments match none of the flags, so they
## are simply ignored)
for (i in seq_along(args)) {
  arg <- args[i]
  if (arg == '-d' && length(args) > i) {
    my_wd <- args[i + 1]
  } else if (arg == '-f1' && length(args) > i) {
    f1 <- args[i + 1]
  } else if (arg == '-f2' && length(args) > i) {
    f2 <- args[i + 1]
  }
}
setwd(my_wd)
x <- read.csv(f1)
y <- read.csv(f2)
out_df <- rbind(x, y)
write.csv(out_df, file = "./out.csv") ## specify full path!
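For reference, here is a small sketch (using the placeholder paths from this answer) of what the parsing loop sees when the script is launched with the flags shown in step 2:
## commandArgs(trailingOnly = TRUE) for the call
##   Rscript --vanilla Rscript.R -d "/path/to/files" -f1 "file1.csv" -f2 "file2.csv"
## returns this character vector:
args <- c("-d", "/path/to/files", "-f1", "file1.csv", "-f2", "file2.csv")
## after the loop above:
##   my_wd == "/path/to/files", f1 == "file1.csv", f2 == "file2.csv"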
2. In your bash script, use flags:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=1GB
#SBATCH --time=1:00:00
module load r/4.4.2
#R ## shouldn't be necessary
## parms
R_script_path="./Rscript.R"
my_wd="/path/to/files"
f1="file1.csv"
f2="file2.csv"
## run script
Rscript --vanilla "$R_script_path" -d "$my_wd" -f1 "$f1" -f2 "$f2"
Called via
$ sbatch foo.sh
Alternatively, you could provide the flags in the command line:
## parms
R_script_path="./Rscript.R"
#my_wd="/path/to/files"
#f1="file1.csv"
#f2="file2.csv"
## run script
Rscript --vanilla "$R_script_path" "$@"
Then called via
$ sbatch foo.sh -d "/path/to/files" -f1 "file1.csv" -f2 "file2.csv"
Anything placed after the script name on the sbatch command line is passed to the batch script as its positional parameters, so "$@" forwards the flags unchanged to Rscript.
PS: Avoid periods (.) in R object names; they can have a special meaning (e.g. in S3 method dispatch). Use underscores (_) instead.
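A minimal illustration of why the dot matters (the class name myclass here is made up for the example):
## In S3, a function named "generic.class" is picked up as that generic's
## method for that class -- print.myclass() becomes the print() method
## for objects of class "myclass".
print.myclass <- function(x, ...) cat("custom print for myclass\n")
obj <- structure(list(), class = "myclass")
print(obj)   # dispatches to print.myclass()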
You should pass the variables as command-line arguments in your Slurm script.
Then, modify Rscript.R to accept arguments.
This way, the Slurm script remains in Bash, and R receives the necessary variables as arguments.
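A minimal sketch of that approach, assuming the three values are simply passed positionally (working directory first, then the two file names):
## Rscript.R -- invoked from the job script as, e.g.:
##   Rscript Rscript.R "/path/to/files" "file1.csv" "file2.csv"
args <- commandArgs(trailingOnly = TRUE)
my_wd <- args[1]
f1    <- args[2]
f2    <- args[3]

setwd(my_wd)
x <- read.csv(f1)
y <- read.csv(f2)
out_df <- rbind(x, y)
write.csv(out_df, file = "out.csv")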