Commit

update slurm sbatch examples
gagandaroach committed Feb 2, 2021
1 parent 8e985a1 commit 14c0608
Showing 10 changed files with 80 additions and 157 deletions.
2 changes: 1 addition & 1 deletion slurm/README.md
@@ -10,4 +10,4 @@ To execute these scripts, use sbatch on the management nodes.
$ sbatch train_1gpu_t4.sh
```

-**Note:** Read **srun** and **sbatch** manual pages for detailed run parameters.
+**Note:** Read **srun** and **sbatch** manual pages for detailed run parameters. `$ man sbatch`.
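
As a companion to the README's instruction to use sbatch on the management nodes, here is a minimal sketch of submitting one of the example scripts and finding its output file. `parse_job_id` and `submit_and_report` are hypothetical helper names, not part of this repository. The sketch assumes sbatch prints its usual `Submitted batch job <id>` confirmation and that the script sets `--output=job_%j.out`, as `t4_gpu.sh` does.

```shell
#!/bin/bash

# Extract the job ID from sbatch's confirmation line,
# e.g. "Submitted batch job 12345" -> "12345".
# Assumption: the ID is the last word of the message.
parse_job_id() {
    local message="$1"
    printf '%s\n' "${message##* }"
}

# Hypothetical helper: submit a script, show its queue state, and print
# the output file name implied by --output=job_%j.out.
submit_and_report() {
    local script="$1"
    local message job_id
    message="$(sbatch "$script")" || return 1   # stop if submission fails
    job_id="$(parse_job_id "$message")"
    echo "Submitted ${script} as job ${job_id}"
    squeue -j "$job_id"                         # current state of the job
    echo "Output file: job_${job_id}.out"
}
```

On a management node this could be called as `submit_and_report t4_gpu.sh`. Nothing is submitted until the function is called.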
18 changes: 18 additions & 0 deletions slurm/cpu_basic.sh
@@ -0,0 +1,18 @@
#!/bin/bash

#SBATCH --job-name="Sbatch Example"
#SBATCH --output=cpu_job_%j.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=[email protected]
#SBATCH --partition=teaching
#SBATCH --nodes=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=2GB

## SCRIPT START

srun echo "Hello from the executing node!"
srun hostname
srun python --version

## SCRIPT END
21 changes: 21 additions & 0 deletions slurm/dgx_singularity.sh
@@ -0,0 +1,21 @@
#!/bin/bash

#SBATCH --job-name="DGX 8 GPU"
#SBATCH --output=train_%N_%j.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=[email protected]
#SBATCH --partition=dgx
#SBATCH --nodes=1
#SBATCH --gres=gpu:v100:8
#SBATCH --cpus-per-gpu=8

SCRIPT_NAME="Rosie DGX Script"
CONTAINER="/data/containers/msoe-tensorflow.sif"
SCRIPT_PATH=""
SCRIPT_ARGS=""

## SCRIPT
echo "SBATCH SCRIPT: ${SCRIPT_NAME}"
srun hostname; pwd; date;
srun singularity exec --nv -B /data:/data "${CONTAINER}" python3 "${SCRIPT_PATH}" ${SCRIPT_ARGS}
echo "END: ${SCRIPT_NAME}"
24 changes: 0 additions & 24 deletions slurm/sbatch_example.sh

This file was deleted.

30 changes: 0 additions & 30 deletions slurm/schedule_python_script.sh

This file was deleted.

19 changes: 19 additions & 0 deletions slurm/t4_gpu.sh
@@ -0,0 +1,19 @@
#!/bin/bash

#SBATCH --job-name="Sbatch Example"
#SBATCH --output=job_%j.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=[email protected]
#SBATCH --partition=teaching
#SBATCH --nodes=1
#SBATCH --gres=gpu:t4:1
#SBATCH --cpus-per-gpu=4

## SCRIPT START

srun echo "Hello from the executing node!"
srun hostname
srun python --version
srun nvidia-smi

## SCRIPT END
21 changes: 21 additions & 0 deletions slurm/t4_singularity.sh
@@ -0,0 +1,21 @@
#!/bin/bash

#SBATCH --job-name="Sbatch Example"
#SBATCH --output=job_%j.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=[email protected]
#SBATCH --partition=teaching
#SBATCH --nodes=1
#SBATCH --gres=gpu:t4:1
#SBATCH --cpus-per-gpu=4

SCRIPT_NAME="Rosie Job Script"
CONTAINER="/data/containers/msoe-tensorflow.sif"
SCRIPT_PATH=""
SCRIPT_ARGS=""

## SCRIPT
echo "SBATCH SCRIPT: ${SCRIPT_NAME}"
srun hostname; pwd; date;
srun singularity exec --nv -B /data:/data "${CONTAINER}" python3 "${SCRIPT_PATH}" ${SCRIPT_ARGS}
echo "END: ${SCRIPT_NAME}"
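
Both Singularity templates ship with `SCRIPT_PATH=""`, so a job submitted without editing that variable has no script to run. The following is a minimal sketch of a guard that could sit before the `srun singularity exec` line. `require_script` is a hypothetical helper, not part of this commit, and it assumes the script path is visible from the batch node (for example under `/data`, which the templates bind into the container).

```shell
#!/bin/bash

# Hypothetical guard: succeed only if the given path is non-empty and
# refers to a readable file; otherwise print a reason to stderr.
require_script() {
    local path="$1"
    if [ -z "$path" ]; then
        echo "SCRIPT_PATH is empty; set it before submitting" >&2
        return 1
    fi
    if [ ! -r "$path" ]; then
        echo "Cannot read script: $path" >&2
        return 1
    fi
}

# Intended use inside the batch script, before the srun line:
#   require_script "${SCRIPT_PATH}" || exit 1
```

Failing early this way leaves a clear message in the job's output file instead of an error from inside the container.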
34 changes: 0 additions & 34 deletions slurm/train_1gpu_t4.sh

This file was deleted.

34 changes: 0 additions & 34 deletions slurm/train_2gpu_t4.sh

This file was deleted.

34 changes: 0 additions & 34 deletions slurm/train_8gpu_dgx.sh

This file was deleted.
