Batch Processing

When logging into an HPC system, you are placed on a login node. From there, you can manage your data, set up your workflow, and prepare and submit jobs. The compute nodes cannot be accessed directly; they run under the control of a batch system. The batch system queues jobs into different partitions (depending on the required resources, e.g. runtime) and sorts them according to a priority scheme. A job will run when the required resources become available.

The login nodes are not suitable for computational work, since they are shared among all users. We do not allow MPI-parallel applications on the frontends; short parallel test runs must be performed using batch jobs. It is also possible to submit interactive batch jobs that, once started, open a shell on one of the assigned compute nodes and let you run interactive programs there. On most clusters, a number of nodes are reserved during working hours for short test runs with less than one hour of runtime.

This documentation gives you a general overview of how to use the Slurm batch system and is applicable to all clusters (except woody, emmy and parts of TinyGPU). For more cluster-specific information, consult the respective cluster documentation!

The woody and emmy clusters and parts of TinyGPU currently do not use Slurm but Torque as their batch system. Please refer to the Torque batch system documentation!

 

The basic usage of the Slurm batch system is outlined on this page. Information on the following topics is given below:

  • Batch Job Submission
  • Interactive Jobs
  • Options for sbatch/salloc/srun
  • Environment Variables
  • Manage and Control Jobs
  • Job Scripts – General structure

For more detailed information, please refer to the official Slurm documentation and the official Slurm tutorials.

Batch job submission

Apart from short test runs and interactive work, it is recommended to submit your jobs using the command sbatch. It queues the job for later execution when the specified resources become available. You can specify the resources either via command-line options or, more conveniently, directly in your job file using the script directive #SBATCH. The job file is basically a script stating the resource requirements, environment settings, and commands for executing the application. Examples are given below.

The batch file is submitted by using

sbatch [options] job_file

After submission, sbatch will output the Job ID of your job. It can later be used for identification purposes and is also available as the environment variable $SLURM_JOBID in job scripts.
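
For illustration, a submission and the resulting output could look like the following sketch (the script name job_script.sh and the Job ID 1234567 are placeholders):

sbatch job_script.sh
Submitted batch job 1234567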

For TinyFat and TinyGPU, use the respective command wrapper sbatch.tinyfat/sbatch.tinygpu.

Interactive jobs

For interactive work and test runs, the command salloc can be used to get an interactive shell on a compute node. After issuing salloc, do not close your terminal session but wait until the resources become available. You will then be logged directly into the first granted compute node. When you close your terminal, the allocation is automatically revoked. There is currently no way to request X11 forwarding for an interactive Slurm job.

To run an interactive job with Slurm on Meggie, Alex and Fritz:

salloc [Options for number of nodes, walltime, etc.]

For TinyFat and TinyGPU, use the respective command wrapper salloc.tinyfat/salloc.tinygpu.
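
A minimal interactive session could look like this sketch (node count, walltime, and the application name are placeholders):

salloc --nodes=1 --time=01:00:00   # wait here until the allocation is granted
./application [options]            # runs on the first granted compute node
exit                               # closing the shell revokes the allocation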

Options for sbatch/salloc/srun

The following parameters can be specified as options for sbatch, salloc, and srun, or included in the job script by using the script directive #SBATCH:

  • --job-name=<name> – Specifies the name which is shown by squeue. If the option is omitted, the name of the batch script file is used.
  • --nodes=<number> – Number of nodes requested. Default value is 1.
  • --ntasks=<number> – Overall number of tasks (MPI processes). Can be omitted if --nodes and --ntasks-per-node are given. Default value is 1.
  • --ntasks-per-node=<number> – Number of tasks (MPI processes) per node.
  • --cpus-per-task=<number> – Number of threads (logical cores) per task. Used for OpenMP or hybrid jobs.
  • --time=HH:MM:SS – Specifies the required wall clock time (runtime). When the job reaches this walltime, it is sent a TERM signal; if it has not ended after a few seconds, it is sent KILL. If you omit the walltime option, a very short default time is used. Please specify a reasonable runtime, since the scheduler also bases its decisions on this value (short jobs are preferred).
  • --mail-user=<address> and --mail-type=<type> – Sends an e-mail to <address> depending on the specified type. As a type, you can choose BEGIN, END, FAIL, TIME_LIMIT or ALL; specifying more than one type is also possible.
  • --output=<file_name> – Filename for the standard output stream. This should normally not be used, since a suitable name is automatically composed from the job name and the job ID.
  • --error=<file_name> – Filename for the standard error stream. By default, stderr is merged with stdout.
  • --partition=<partition> – Specifies the partition/queue to which the job is submitted. If no partition is given, the default partition of the respective cluster is used (see cluster documentation).
  • --constraint=hwperf – Grants access to hardware performance counters (e.g. using likwid-perfctr). Only request this feature if you really want to access the hardware performance counters; it is not required for e.g. likwid-pin or likwid-mpirun.
  • --export=none – Only available for sbatch. Environment variables of the submission environment (e.g. PATH set by modules) will not be exported to the submitted job. Must be combined with unset SLURM_EXPORT_ENV inside the job script to ensure proper execution of the application (see notes below).
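
As a sketch of how several of these options are typically combined in a job script header (all values below are placeholders and must be adapted to your workload and cluster):

#SBATCH --job-name=myExperiment
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=20
#SBATCH --time=06:00:00
#SBATCH --mail-user=<address>
#SBATCH --mail-type=END,FAIL
#SBATCH --export=NONE

The same options can instead be passed on the command line, e.g. sbatch --nodes=2 --time=06:00:00 job_script.sh.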

Many more options are available. For details, refer to the official Slurm documentation for sbatch, salloc or srun.

Environment variables

The scheduler sets a number of environment variables that tell the job which resources were allocated to it. These can also be used in batch scripts. A complete list can be found in the official Slurm documentation. The most useful ones are given below:

  • $SLURM_JOB_ID – Job ID
  • $SLURM_SUBMIT_DIR – Directory from which the job was submitted
  • $SLURM_JOB_NODELIST – List of nodes on which the job runs
  • $SLURM_JOB_NUM_NODES – Number of nodes allocated to the job
  • $SLURM_CPUS_PER_TASK – Number of cores per task; set $OMP_NUM_THREADS to this value for OpenMP/hybrid applications
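
Inside a job script, these variables can be used directly; a small illustrative excerpt (assuming --cpus-per-task was requested, otherwise $SLURM_CPUS_PER_TASK is not set):

cd "$SLURM_SUBMIT_DIR"                        # change to the submission directory
echo "Job $SLURM_JOB_ID uses $SLURM_JOB_NUM_NODES node(s): $SLURM_JOB_NODELIST"
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK   # thread count for OpenMP/hybrid applications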

Slurm automatically propagates environment variables that are set in the shell at the time of submission into the Slurm job. This includes currently loaded module files. To get a clean environment in job scripts, it is recommended to add #SBATCH --export=NONE and unset SLURM_EXPORT_ENV to the job script; otherwise, the job will inherit some settings from the submitting shell. The additional unsetting of SLURM_EXPORT_ENV inside the job script ensures propagation of all Slurm-specific variables and loaded modules to the srun call. Specifying export SLURM_EXPORT_ENV=ALL is equivalent to unset SLURM_EXPORT_ENV and can be used interchangeably.

Manage and control jobs

Job and cluster status

  • squeue <options> – Displays status information on queued jobs. Only the user's own jobs are displayed. Use -t running to show only currently running jobs and -j <JobID> to show information on a specific job.
  • scontrol show job <JobID> – Displays very detailed information on a job.
  • sinfo – Overview of the cluster status. Shows the available partitions and the availability of nodes.
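
For example (the Job ID shown is hypothetical):

squeue                      # list your own queued and running jobs
squeue -t running           # show only your currently running jobs
scontrol show job 1234567   # detailed information on job 1234567
sinfo                       # partition overview and node availability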

Editing jobs

If your job is not running yet, it is possible to change details of the resource allocation, e.g. the runtime, with scontrol update timelimit=4:00:00 jobid=<jobid>. For more details and available options, see the official documentation.
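
For example, to extend the time limit of a job that is still pending (the Job ID is hypothetical):

scontrol update jobid=1234567 timelimit=04:00:00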

Cancelling jobs

To cancel a job and remove it from the queue, use scancel <JobID>. This works for queued as well as running jobs. To cancel all of your jobs at once, use scancel -u <your_username>.

Job scripts – general structure

A batch or job file is generally a script holding information like resource requirements, environment settings, and the commands to execute an application during the runtime of the job. The following example shows the general structure of a job script. More detailed examples are available in Job Script Examples.

#!/bin/bash -l                     # Batch script starts with shebang line
#                                  # -l is necessary to initialize modules correctly!
#SBATCH --ntasks=20                # All #SBATCH lines have to follow uninterrupted
#SBATCH --time=01:00:00            # comments start with # and do not count as interruptions
#SBATCH --job-name=fancyExp 
#SBATCH --export=NONE              # do not export environment from submitting shell
                                   # first non-empty non-comment line ends SBATCH options
unset SLURM_EXPORT_ENV             # enable export of environment from this script to srun

module load <modules>              # Load necessary modules

srun ./application [options]       # Execute parallel application with srun

 
