Job script examples – Slurm
In the following, example batch scripts for different types of jobs are given. Note that all of them have to be adapted to your specific application and the target system. You can find more examples in the cluster-specific documentation.
Parallel jobs
For submitting parallel jobs, a few rules have to be understood and followed; they depend on the type of parallelization and the target architecture.
Job script with OpenMP
OpenMP is not Slurm-aware, so you need to specify OMP_NUM_THREADS in your script. It should match the number of cores requested via --cpus-per-task.
OpenMP applications can only make use of one node, therefore --nodes=1 and --ntasks-per-node=1 are necessary. The number of allocated CPUs (--cpus-per-task), and thus the number of OpenMP threads, depends on the system and has to be adjusted accordingly.

For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved with the environment variables OMP_PLACES=cores and OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=64
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

module load <modules>

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

./openmp_application
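If you also want the thread pinning described above, a minimal sketch of the corresponding part of the script adds the two environment variables before the application is started:

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# pin the OpenMP threads to physical cores
export OMP_PLACES=cores
export OMP_PROC_BIND=true

./openmp_application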
Job script with MPI
For pure MPI jobs, you have two different options for defining the allocated resources, depending on the needs of your application and the intended distribution of tasks on the compute nodes.
Option 1: define the total number of MPI processes that should be started via --ntasks. Adjust the value according to the available CPUs per node of the respective cluster. Beware that this might lead to load imbalances if --ntasks does not correspond to "full" nodes: for example, on nodes with 64 cores each, --ntasks=96 fills one node completely but only half of a second one.
#!/bin/bash -l
#SBATCH --ntasks=64
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

module load <modules>

srun ./mpi_application
Option 2: define the number of MPI processes that should be started via the number of nodes (--nodes) and --ntasks-per-node. Use this variant to control more precisely how the MPI processes are distributed across the nodes, especially if you are not using one process per physical core! --ntasks-per-node has to be adapted to the hardware of the specific cluster.
#!/bin/bash -l
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=20
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

module load <modules>

srun ./mpi_application
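One simple way to check the resulting task placement is to run a trivial command with one instance per task inside the allocation; srun's --label option prefixes every output line with the rank that produced it:

# sanity check: print the host name once per MPI rank, labelled with the rank number
srun --label hostname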
Job script for hybrid MPI/OpenMP, single-node
Adjust the number of MPI processes via --ntasks and the corresponding number of OpenMP threads per MPI process via --cpus-per-task according to the available hardware configuration and the needs of your application.

OpenMP is not Slurm-aware, so you need to specify OMP_NUM_THREADS in your script. It should match the number of cores requested via --cpus-per-task.

For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved with the environment variables OMP_PLACES=cores and OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.
#!/bin/bash -l
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=8
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# for Slurm version >22.05: cpus-per-task has to be set again for srun
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK

srun ./hybrid_application
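Instead of exporting SRUN_CPUS_PER_TASK, the same value can also be passed to srun directly on the command line; a sketch of the equivalent launch line:

# equivalent alternative for Slurm > 22.05: pass the core count to srun explicitly
srun --cpus-per-task=$SLURM_CPUS_PER_TASK ./hybrid_application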
Job script for hybrid MPI/OpenMP, multi-node
Adjust the number of MPI processes via --nodes and --ntasks-per-node and the corresponding number of OpenMP threads per MPI process via --cpus-per-task according to the available hardware configuration and the needs of your application.

OpenMP is not Slurm-aware, so you need to specify OMP_NUM_THREADS in your script. It should match the number of cores requested via --cpus-per-task.

For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved with the environment variables OMP_PLACES=cores and OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.
#!/bin/bash -l
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=8
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
# for Slurm version >22.05: cpus-per-task has to be set again for srun
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK

srun ./hybrid_application
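To verify that the pinning actually takes effect, a runtime that supports OpenMP 5.0 can be asked to report the affinity of every thread at program start; one possible addition to the script:

# optional: have the OpenMP runtime print each thread's binding at startup (OpenMP 5.0 runtimes)
export OMP_DISPLAY_AFFINITY=true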
GPU jobs
GPUs are only available in TinyGPU and Alex. To submit a job to one of those clusters, you have to specify the number of GPUs that should be allocated to your job. This is done via the Slurm option --gres=gpu:<number_gpus_per_node>. The available number of GPUs per node varies between 4 and 8, depending on the cluster and node configuration.
Single-node job
The number of requested GPUs has to be less than or equal to the available number of GPUs per node. In this case, the compute nodes are not allocated exclusively but are shared among several jobs – the GPUs themselves are always granted exclusively. The corresponding share of the resources of the host system (CPU cores, RAM) is automatically allocated.
If you are using hybrid MPI/OpenMP or pure MPI code, adjust --cpus-per-task or --ntasks according to the above examples. Do not specify more cores/tasks than are available for the number of requested GPUs!
#!/bin/bash -l
#SBATCH --gres=gpu:1
#SBATCH --time=01:00:00

srun ./cuda_application
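If the application also uses MPI, the GPU request can be combined with the task options from the examples above. A hypothetical sketch for a single-node job with two GPUs and one MPI rank per GPU (the binary name is a placeholder; adjust the numbers to the cluster at hand):

#!/bin/bash -l
#SBATCH --gres=gpu:2
#SBATCH --ntasks=2
#SBATCH --time=01:00:00

# one MPI rank per requested GPU; cuda_mpi_application is a placeholder name
srun ./cuda_mpi_application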