python - loading a dataset with datasets.load_dataset is hanging - Stack Overflow

I'm trying to load some data using datasets.load_dataset. It runs correctly on a head node, but hangs on a Slurm compute node. I'm using a conda env with datasets installed.

When I run on head node with the conda env active, this command works:

python -c "from datasets import load_dataset; d=load_dataset(\"json\", data_files={\"train\": \"/scratch/train/shard1.jsonl\"}); print(d)"

The issue occurs when I submit the job to the cluster. This hangs:

salloc --nodes 1 --qos interactive --time 00:15:00 --constraint gpu --account=my_account  --mem=1G --gres=gpu:1

srun --nodes=1 --ntasks-per-node=1 --constraint=gpu --account=my_account --gres=gpu:1 \
    bash -c '
    source /global/homes/my_username/miniconda3/etc/profile.d/conda.sh &&
    conda activate my_env &&
    python -c "from datasets import load_dataset; load_dataset(\"json\", data_files={\"train\": \"/scratch/my_username/train/shard1.jsonl\"})"
    '

I get similar behavior when I submit with sbatch. I'm using a tiny data file to test this:

{"text": "ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT"}

asked Mar 10 at 20:01 by ate50eggs

1 Answer

The issue was an incompatibility between my cluster's shared filesystem and the datasets library's caching behavior. Using the cache_dir argument of load_dataset to point the cache at the worker node's local tmp directory fixed the hang.
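A minimal sketch of that fix, assuming the compute node exposes local scratch via SLURM_TMPDIR (or plain /tmp); the exact local-disk path is cluster-specific. The cache can be redirected either through the HF_DATASETS_CACHE environment variable (set before importing datasets) or through load_dataset's cache_dir parameter:

```python
import os
import tempfile

# Pick a node-local directory for the cache instead of the shared
# filesystem, whose file-locking semantics can make load_dataset hang.
# SLURM_TMPDIR is an assumption; fall back to the system tmp directory.
local_root = os.environ.get("SLURM_TMPDIR", tempfile.gettempdir())
cache_dir = os.path.join(local_root, "hf_datasets_cache")
os.makedirs(cache_dir, exist_ok=True)

# Option 1: set the environment variable before importing datasets.
os.environ["HF_DATASETS_CACHE"] = cache_dir

# Option 2: pass cache_dir directly to load_dataset:
# from datasets import load_dataset
# d = load_dataset(
#     "json",
#     data_files={"train": "/scratch/my_username/train/shard1.jsonl"},
#     cache_dir=cache_dir,
# )

print(cache_dir)
```

Note that a per-job local cache is discarded when the job ends, so each job re-processes the data; for a tiny JSONL shard like the one above that cost is negligible.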
