GROMACS

GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics package primarily designed for simulations of proteins, lipids and nucleic acids.

Availability / Target HPC systems

  • TinyGPU: best value if only one GPU is used per run – use the latest GROMACS versions, as they allow more and more work to be offloaded to the GPU
  • parallel computers: experiment to find a proper setting for -npme (see the sketch after this list)
  • throughput cluster Woody: best suited for small systems
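
A minimal sketch of such an -npme experiment is shown below. The rank count, the candidate -npme values, and the step numbers are examples only and have to be adapted to your system and node count; alternatively, gmx tune_pme can automate this search.

# Benchmarking sketch (all values are examples only): run a short
# simulation for several -npme settings and compare the performance
# (ns/day) reported at the end of each md.log.
for npme in 4 8 12 16; do
    mkdir -p npme_${npme} && cd npme_${npme}
    mpirun -n 80 mdrun_mpi -npme ${npme} -s ../my.tpr -nsteps 10000 -resethway
    cd ..
done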

New versions of GROMACS are installed by RRZE upon request.

Notes

GROMACS can produce large amounts of data in small increments:

  • Try to reduce the frequency and amount of output as much as possible, e.g. remove the -v flag for verbose output from the program call.
  • It may also be useful to stage the generated output in the node’s RAM disk (i.e. in the directory /dev/shm/) first and copy it back to e.g. $WORK only once, just before the job ends (see the sketch after this list).
  • Writing small amounts of data at high frequency is NOT suitable for $FASTTMP.
  • For serial and single-node simulations you have to use gmx mdrun;
    for multi-node simulations, the binary to use with mpirun is mdrun_mpi or mdrun_mpi+OMP. See the sample scripts below!
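
A minimal sketch of this staging scheme is given below; the file names, the target directory under $WORK, and the use of $SLURM_JOB_ID are examples only and need to be adapted (Torque jobs would use $PBS_JOBID instead).

# create a job-private directory in the node-local RAM disk
SCRATCH=/dev/shm/${SLURM_JOB_ID}
mkdir -p ${SCRATCH}
cp my.tpr ${SCRATCH}
cd ${SCRATCH}

gmx mdrun -maxh 6 -s my.tpr

# copy the results back to $WORK only once, just before the job ends
mkdir -p ${WORK}/my-gmx-run
cp *.log *.edr *.cpt *.xtc confout.gro ${WORK}/my-gmx-run/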

Sample job scripts

serial job on Woody

#!/bin/bash -l
#PBS -lnodes=1:ppn=4,walltime=10:00:00
#PBS -N my-gmx
#PBS -j eo

cd $PBS_O_WORKDIR

module load gromacs/2019.3-mkl

### the argument of -maxh should match the requested walltime!
gmx mdrun -maxh 10 -s my.tpr

### try automatic restart (adapt the conditions to fit your needs)
if [ -f confout.gro ]; then
   echo "*** confout.gro found; no re-submit required"
   exit
fi
if [ $SECONDS -lt 1800 ]; then
   echo "*** no automatic restart as runtime of the present job was too short"
   exit
fi
qsub $0

parallel job on Emmy

#!/bin/bash -l
#PBS -lnodes=4:ppn=40,walltime=10:00:00
#PBS -N my-gmx
#PBS -j eo

cd $PBS_O_WORKDIR

module load gromacs/2019.3-mkl-IVB

### 1) The argument of -maxh should match the requested walltime!
### 2) Performance can often be optimized by specifying -npme # with a proper number of PME tasks;
###    experiment or use gmx tune_pme to find the optimal value.
###    Using the SMT threads can sometimes be beneficial, but requires testing.
mpirun -n 80 mdrun_mpi [-npme #] -maxh 10 -s my.tpr

### try automatic restart (adapt the conditions to fit your needs)
if [ -f confout.gro ]; then
   echo "*** confout.gro found; no re-submit required"
   exit
fi
if [ $SECONDS -lt 1800 ]; then
   echo "*** no automatic restart as runtime of the present job was too short"
   exit
fi
qsub $0

parallel job on Meggie

#!/bin/bash -l
#
# allocate 4 nodes with 20 cores per node = 4*20 = 80 MPI tasks
#SBATCH --nodes=4
#SBATCH --tasks-per-node=20
#
# allocate nodes for 6 hours
#SBATCH --time=06:00:00
# job name 
#SBATCH --job-name=my-gmx
# do not export environment variables
#SBATCH --export=NONE
#
# first non-empty non-comment line ends SBATCH options

# do not export environment variables
unset SLURM_EXPORT_ENV
# jobs always start in submit directory

module load gromacs/2020.1-mkl

### 1) The argument of -maxh should match the requested walltime!
### 2) Performance can often be optimized by specifying -npme # with a proper number of PME tasks;
###    experiment or use gmx tune_pme to find the optimal value.
###    Using the SMT threads can sometimes be beneficial, but requires testing.
### 3) The number of OpenMP threads also has to be tested beforehand and is limited by the number of PME tasks.
srun mdrun_mpi+OMP [-npme #] [-ntomp #] -maxh 6 -dlb yes -s my.tpr

parallel job on Fritz

#!/bin/bash -l
#SBATCH --job-name=my-gmx
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=72
#SBATCH --partition=multinode
#SBATCH --cpus-per-task=1
#SBATCH --time=10:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

module load gromacs/2021.5-gcc11.2.0-impi-mkl

srun gmx_mpi mdrun [-npme #] -maxh 9.5 [-ntomp #] -dlb yes -s my.tpr

single GPU job on TinyGPU (Torque)

#!/bin/bash -l
#PBS -lnodes=1:ppn=4,walltime=10:00:00
#PBS -N my-gmx
#PBS -j eo

cd $PBS_O_WORKDIR

module load gromacs/2019.3-mkl-CUDA101

### 1) the argument of -maxh should match the requested walltime!
### 2) optional arguments are: -pme gpu -npme 1
###                            -bonded gpu
###                            -update gpu
gmx mdrun -maxh 10 -s my.tpr

### try automatic restart (adapt the conditions to fit your needs)
if [ -f confout.gro ]; then
   echo "*** confout.gro found; no re-submit required"
   exit
fi
if [ $SECONDS -lt 1800 ]; then
   echo "*** no automatic restart as runtime of the present job was too short"
   exit
fi
qsub $0

single GPU job on TinyGPU (Slurm)

#!/bin/bash -l
# requests 16 OpenMP threads 
#SBATCH --cpus-per-task=16
# allocate nodes for 6 hours 
#SBATCH --time=06:00:00 
# job name 
#SBATCH --job-name=Testjob 
# allocated one A100 GPGPU 
#SBATCH --gres=gpu:a100:1
#SBATCH --partition=a100
# do not export environment variables 
#SBATCH --export=NONE 

# do not export environment variables 
unset SLURM_EXPORT_ENV 

module load gromacs/2021.1-gcc-mkl-cuda11.2

### 1) the argument of -maxh should match the requested walltime!
### 2) optional arguments are: -pme gpu -npme 1
###                            -bonded gpu
###                            -update gpu
gmx mdrun -maxh 6 -s my.tpr

### try automatic restart (adapt the conditions to fit your needs)
if [ -f confout.gro ]; then
   echo "*** confout.gro found; no re-submit required"
   exit
fi
if [ $SECONDS -lt 1800 ]; then
   echo "*** no automatic restart as runtime of the present job was too short"
   exit
fi
sbatch $0

multiple GPUs job on TinyGPU (Torque)

The performance benefit of using multiple GPUs is often very low! You usually get much better throughput if you run multiple independent jobs on a single GPU each, as shown above.

Even when using multiple GPUs, do not use the MPI-parallel version (mdrun_mpi) but the thread-MPI version (gmx mdrun) of GROMACS. -ntmpi # should usually match the number of available GPUs.

#!/bin/bash -l
#PBS -lnodes=1:ppn=8,walltime=10:00:00
#PBS -N my-gmx
#PBS -j eo

cd $PBS_O_WORKDIR

module load gromacs/2021.1-gcc-mkl-cuda11.2

### 1) The argument of -maxh should match the requested walltime!
### 2) Typical optional arguments are: -pme gpu -npme 1
###                                    -bonded gpu

# these variables are needed for halo exchange and 
# optimized communication between the GPUs 
export GMX_GPU_DD_COMMS=true 
export GMX_GPU_PME_PP_COMMS=true 
export GMX_GPU_FORCE_UPDATE_DEFAULT_GPU=true
gmx mdrun -ntmpi 2 -ntomp 4 -maxh 10 -s my.tpr

### try automatic restart (adapt the conditions to fit your needs)
if [ -f confout.gro ]; then
   echo "*** confout.gro found; no re-submit required"
   exit
fi
if [ $SECONDS -lt 1800 ]; then
   echo "*** no automatic restart as runtime of the present job was too short"
   exit
fi
qsub $0

multiple GPUs job on TinyGPU (Slurm)

The performance benefit of using multiple GPUs is often very low! You usually get much better throughput if you run multiple independent jobs on a single GPU each, as shown above.

Even when using multiple GPUs, do not use the MPI-parallel version (mdrun_mpi) but the thread-MPI version (gmx mdrun) of GROMACS. -ntmpi # should usually match the number of available GPUs.

#!/bin/bash -l
# request 16 cores (4 thread-MPI ranks x 4 OpenMP threads, see below)
#SBATCH --cpus-per-task=16
# allocate four GPUs (should match -ntmpi below)
#SBATCH --gres=gpu:gtx3080:4
# allocate nodes for 6 hours 
#SBATCH --time=06:00:00 
# job name
#SBATCH --job-name=Testjob
# do not export environment variables 
#SBATCH --export=NONE 

# do not export environment variables 
unset SLURM_EXPORT_ENV 

module load gromacs/2021.1-gcc-mkl-cuda11.2

### 1) The argument of -maxh should match the requested walltime!
### 2) Typical optional arguments are: -pme gpu -npme 1
###                                    -bonded gpu

# these variables are needed for halo exchange and 
# optimized communication between the GPUs 
export GMX_GPU_DD_COMMS=true 
export GMX_GPU_PME_PP_COMMS=true 
export GMX_GPU_FORCE_UPDATE_DEFAULT_GPU=true
gmx mdrun -ntmpi 4 -ntomp 4 -maxh 6 -s my.tpr

### try automatic restart (adapt the conditions to fit your needs)
if [ -f confout.gro ]; then
   echo "*** confout.gro found; no re-submit required"
   exit
fi
if [ $SECONDS -lt 1800 ]; then
   echo "*** no automatic restart as runtime of the present job was too short"
   exit
fi
sbatch $0

multiple-walker metadynamics on multiple GPUs on TinyGPU (Slurm)

This is an example script for running a metadynamics simulation with 32 walkers, using GROMACS patched with PLUMED, on eight of our RTX3080 GPUs. Transferring it to other GPU hardware is possible but may require adjusting the settings (e.g. whether to use the MPS server, the mpirun flags, and the GROMACS program flags).

Please note: the run input file (*.tpr) of each walker needs to be in its own directory, and it must have the same file name in every directory (see the preparation sketch below).
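
A minimal preparation sketch is given below; the base name "name", the directory names dir0 to dir31, and copying a single template .tpr are examples only – in practice each walker typically gets its own run input generated with gmx grompp.

# one directory per walker, each holding the run input under the same name
TPR=name
for i in $(seq 0 31); do
    mkdir -p dir${i}
    cp ${TPR}.tpr dir${i}/${TPR}.tpr
done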

#!/bin/bash -l
# requests 16 OpenMP threads 
#SBATCH --cpus-per-task=16 
# allocate eight GPUs
#SBATCH --gres=gpu:gtx3080:8
# allocate nodes for 6 hours 
#SBATCH --time=06:00:00 
# job name 
#SBATCH --job-name=Testjob 
# do not export environment variables 
#SBATCH --export=NONE 

# do not export environment variables 
unset SLURM_EXPORT_ENV 

module load gromacs/2021.1-gcc-mkl-cuda11.2

TPR=name

# not necessary, but makes sure the directories are in correct order
directories=`echo dir{0..9} dir{1..2}{0..9} dir3{0..1}`

# these variables are needed to start the MPS-server
# Select a location that’s accessible to the given $UID
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps.$SLURM_JOB_ID
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log.$SLURM_JOB_ID
# Start the daemon.
nvidia-cuda-mps-control -d 

# these variables need to be placed directly before the Gromacs invocation
# these variables are needed for halo exchange and 
# optimized communication between the GPUs 
export GMX_GPU_DD_COMMS=true 
export GMX_GPU_PME_PP_COMMS=true 
export GMX_GPU_FORCE_UPDATE_DEFAULT_GPU=true

# --oversubscribe is necessary, otherwise mpirun aborts
# -s is needed, otherwise gromacs complains
# -pme -nb -update -bonded make sure everything is offloaded to the GPU
# -pin -pinstride order the threads on the CPU, otherwise there's 
#  wild chaos on the CPU
# -plumed ../plumed_in.dat needs to point to where the file is relative 
#  to the directory the .tpr is in

mpirun -np 32 --oversubscribe gmx_mpi mdrun -s $TPR -pme gpu -nb gpu -update gpu -bonded gpu -pin on -pinstride 1 -plumed ../plumed_in.dat -multidir ${directories} -cpi $TPR -maxh 6

# this will stop the MPS-server
echo quit | nvidia-cuda-mps-control

Further information

  • https://manual.gromacs.org/documentation/current/
  • https://doi.org/10.1002/jcc.26011 – More bang for your buck: Improved use of GPU nodes for GROMACS 2018
  • our own evaluations:
    • Multi-GPU Gromacs Jobs on TinyGPU
    • Gromacs Shootout: Intel Xeon Ice Lake vs. NVIDIA A100, A40, and others
    • Gromacs performance on different GPU types

Mentors

  • Dr. A. Kahler, RRZE, hpc-support@fau.de
  • AG Böckmann (Professorship of Computational Biology, Faculty of Sciences)