Defining vectors in a slurm job to be used in R script - Stack Overflow

I wrote an R script whose first few lines define my working directory and the names of some .csv files. I am running this script by submitting it as a Slurm job, and I'm wondering how I can define those values in the job script before the R script runs.

As an example my Rscript.R would look like this:

setwd(my.wd)

x <- read.csv(f1)
y <- read.csv(f2)

out.df <- rbind(x, y)

write.csv(out.df, file = "out.csv")

And I'd imagine my slurm job myjob.run would look like this:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=1GB
#SBATCH --time=1:00:00

module load r/4.4.2

R

my.wd <- "/path/to/files"
f1 <- "file1.csv"
f2 <- "file2.csv"

Rscript Rscript.R

However, myjob.run would not be able to run the line Rscript Rscript.R because the language has been switched from bash to R.

Asked Feb 3 at 4:41 by cfausto

2 Answers


1. Rewrite your R script to retrieve command args dynamically:

## parms
ncpu <- 8  ## later use in e.g. `parallel::mclapply(..., mc.cores=ncpu)`

## Get command args (only those after the script name)
args <- commandArgs(trailingOnly=TRUE)

## initialize flags
my_wd <- NULL
f1 <- NULL
f2 <- NULL

## loop through args; each flag takes the value that follows it
## (reassigning i inside an R for loop has no effect, so no manual skipping)
for (i in seq_along(args)) {
  arg <- args[i]
  if (arg == '-d' && length(args) > i) {
    my_wd <- args[i + 1]
  }
  else if (arg == '-f1' && length(args) > i) {
    f1 <- args[i + 1]
  }
  else if (arg == '-f2' && length(args) > i) {
    f2 <- args[i + 1]
  }
}

setwd(my_wd)

x <- read.csv(f1)
y <- read.csv(f2)

out_df <- rbind(x, y)

write.csv(out_df, file = "./out.csv")  ## relative to my_wd; specify a full path to write elsewhere

2. In your bash script, use flags:

#!/bin/bash

#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8
#SBATCH --mem=1GB
#SBATCH --time=1:00:00

module load r/4.4.2

#R  ## not needed: Rscript starts R itself

## parms
R_script_path="./Rscript.R"
my_wd="/path/to/files"
f1="file1.csv"
f2="file2.csv"

## run script
Rscript --vanilla "$R_script_path" -d "$my_wd" -f1 "$f1" -f2 "$f2"

Called via

$ sbatch foo.sh

Alternatively, you could provide the flags in the command line:

## parms
R_script_path="./Rscript.R"
#my_wd="/path/to/files"
#f1="file1.csv"
#f2="file2.csv"

## run script
Rscript --vanilla "$R_script_path" "$@"

Then called via

$ sbatch foo.sh -d "/path/to/files" -f1 "file1.csv" -f2 "file2.csv"
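
To see why the quoted "$@" matters here, the following is a minimal sketch that runs without Slurm or R. sbatch hands everything after the script name to the batch script as positional parameters, and "$@" passes them on one-for-one. The functions show_args and forward are hypothetical stand-ins (for Rscript and for foo.sh, respectively), and the path containing a space is made up for the illustration.

```shell
#!/bin/bash

# Hypothetical stand-in for Rscript: prints each argument it receives in brackets.
show_args() {
  for a in "$@"; do
    printf '[%s]\n' "$a"
  done
}

# Hypothetical stand-in for foo.sh: forwards its own arguments unchanged.
forward() {
  show_args "$@"
}

# A directory name with a space stays a single argument because "$@" is quoted.
forwarded=$(forward -d "/path/with space" -f1 "file1.csv")
printf '%s\n' "$forwarded"
```

With an unquoted $@ or $*, the path would be split at the space and the R script would receive it as two separate arguments.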

PS: Avoid periods (.) in R object names; they can have a special meaning (e.g. S3 method names such as print.data.frame). Use underscores (_) instead.

You should pass the variables as command-line arguments in your Slurm script.

Then, modify Rscript.R to accept arguments.

This way, the Slurm script remains in Bash, and R receives the necessary variables as arguments.
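
As a rough illustration of this approach, the job script could take the three values as positional arguments. This is only a sketch: the argument order (directory, first file, second file), the fallback defaults, and the availability checks are assumptions made for the example, and Rscript.R would need to read the values in the same order from the character vector returned by commandArgs(trailingOnly = TRUE).

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=1:00:00

# Load R only where the 'module' command exists (module name taken from the question).
command -v module >/dev/null 2>&1 && module load r/4.4.2

# Positional arguments, falling back to the values from the question.
# The order (directory, file 1, file 2) is an assumption of this sketch.
my_wd="${1:-/path/to/files}"
f1="${2:-file1.csv}"
f2="${3:-file2.csv}"

# Record the values in the Slurm log for later debugging.
echo "wd=$my_wd f1=$f1 f2=$f2"

# Run R only if both the interpreter and the script can be found.
if command -v Rscript >/dev/null 2>&1 && [ -f Rscript.R ]; then
  Rscript --vanilla Rscript.R "$my_wd" "$f1" "$f2"
else
  echo "Rscript or Rscript.R not found; nothing was run" >&2
fi
```

It could then be submitted as

$ sbatch myjob.run "/path/to/files" "file1.csv" "file2.csv"

and on the R side args[1], args[2] and args[3] would hold the directory and the two file names.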
