Amber/AmberTools
Amber and AmberTools are a suite of biomolecular simulation programs. Here, the term “Amber” does not refer to the set of molecular mechanics force fields for the simulation of biomolecules but to the package of molecular simulation programs consisting of AmberTools (sander and many more) and Amber (pmemd).
AmberTools is open source, while Amber (pmemd) requires a license. NHR@FAU holds a “compute center license” of Amber; thus, Amber is generally available to everyone for non-profit use, i.e. for academic research.
Availability / Target HPC systems
- TinyGPU and Alex: typically use pmemd.cuda, which uses a single GPU. Thermodynamic integration (TI) may require special tuning; contact us!
- Woody (throughput cluster) and parallel computers: use pmemd.MPI whenever possible and only choose sander.MPI if the former does not support your input.
New versions of Amber/AmberTools are installed by RRZE upon request.
Notes
Modules for CPU and GPU are called amber; the numbers in the module name specify the Amber version, the Amber patch level, the AmberTools version, and the AmberTools patch level, complemented by the compilers/tools used, e.g. amber/22p00-at22p03-gnu-cuda11.5 or amber/20p12-at21p11-impi-intel. GPU module names contain the string cuda, parallel CPU module names the string mpi.
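To see which versions are currently installed and to pick a matching module, the standard environment-modules commands can be used; the module name below is just one of the examples above:

# list the installed Amber modules and load one of them
module avail amber
# the exact version string is an example; use one reported by "module avail"
module load amber/22p00-at22p03-gnu-cuda11.5
module list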
cpptraj is also available in parallel CPU versions (cpptraj.OMP and cpptraj.MPI) and in a GPU version (cpptraj.cuda). For resource-intensive analyses, you may run these on the cluster as separate jobs, as sketched below.
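A minimal sketch of such a separate analysis job using the MPI build of cpptraj; the module name is one of the examples above, and prmtop and analysis.in are placeholder names for your topology and cpptraj input file:

#!/bin/bash -l
#SBATCH --ntasks=16
#SBATCH --time=01:00:00
#SBATCH --export=NONE
unset SLURM_EXPORT_ENV

# module name is an example; pick an installed version
module load amber/22p00-at22p03-gnu-cuda11.5

# prmtop and analysis.in are placeholders for your own files
srun cpptraj.MPI -p prmtop -i analysis.in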
pmemd and sander have no internal mechanism to limit the run time. You therefore have to estimate beforehand how many time steps can finish within the requested wall time and set that number in your mdin file. Keep in mind the maximum wall time of 24 h (cf. the sample chain job script below).
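A rough sketch of such an estimate, assuming a hypothetical measured throughput of 100 ns/day for your system and a 2 fs time step; stay somewhat below the 24 h limit as a safety margin:

NS_PER_DAY=100   # measured performance of your system (example value)
DT_FS=2          # MD time step in fs
HOURS=23         # stay safely below the 24 h wall-time limit

# steps = ns/day * (hours/24) * 1,000,000 fs/ns / (fs per step)
NSTLIM=$(( NS_PER_DAY * HOURS * 1000000 / (24 * DT_FS) ))
echo "set nstlim = $NSTLIM in the mdin file"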
Dumping intermediate results to the hard disk slows down your job, possibly needlessly:
- Estimate how many snapshots you want to analyze from your simulation in order to obtain meaningful results and adjust your trajectory output frequency accordingly.
- Adapt the frequency of writing to the output file to this value, unless you need the additional information.
- Set the frequency for restart-file output to a value at which you would really use this file for a restart in case of a crash instead of simply restarting the complete calculation; in the latter case, writing the restart file at the end of your simulation is sufficient.
Of course, all this also depends on your specific project needs; see the sketch below for the relevant mdin parameters.
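A hedged sketch of an NPT production mdin with reduced output frequencies; ntpr controls the energy output to mdout, ntwx the trajectory frames, and ntwr the restart file. All values are illustrative only and must be adapted to your project:

cat > mdin <<'EOF'
NPT production, reduced output frequency (illustrative values only)
 &cntrl
  imin=0, irest=1, ntx=5,
  nstlim=5000000, dt=0.002, ig=-1,
  ntt=3, gamma_ln=2.0, temp0=300.0,
  ntp=1, ntb=2, taup=2.0,
  ntc=2, ntf=2, cut=8.0,
  ntpr=50000, ntwx=50000, ntwr=500000,
 /
EOF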
GPU versions of Amber are deterministic with respect to the GPU card type: using the same binary restart file with coordinates and velocities for restarting a calculation will yield the exact same trajectory on the same GPU card type (e.g. A40).
Amber benchmark calculations on Alex have shown that A40 cards are more economical (better price/performance ratio) than A100 cards and are therefore recommended for molecular dynamics simulations with Amber.
Originally, the Amber authors suggested running the heating and (pressure) equilibration steps on the CPU, but using the GPU version may also work. In case of problems, consider splitting your simulation into more chunks, since a program restart also resets some global parameters and may thus adapt better to a rapidly changing system (e.g. box size).
Recent versions of AmberTools install their own version of Python, which is independent of the Python of the Linux distribution and of the usual Python modules of RRZE.
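To check which interpreter is actually picked up after loading an Amber module (the module name is again just an example):

module load amber/22p00-at22p03-gnu-cuda11.5
command -v python3   # shows which python3 comes first on PATH
python3 --version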
Sample job scripts
pmemd on TinyGPU
#!/bin/bash -l
#SBATCH --time=06:00:00
#SBATCH --job-name=Testjob
#SBATCH --gres=gpu:1
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

module add amber-gpu/20p08-at20p12-gnu-cuda11.2

### there is no need to fiddle around with CUDA_VISIBLE_DEVICES!
pmemd.cuda -O -i mdin ...
pmemd on Alex
#!/bin/bash -l
#
#SBATCH --job-name=my-pmemd
#SBATCH --ntasks=16
#SBATCH --time=06:00:00
# use gpu:a100:1 and partition=a100 for A100
#SBATCH --gres=gpu:a40:1
#SBATCH --partition=a40
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV

module load amber/20p12-at21p11-gnu-cuda11.5

pmemd.cuda -O -i mdin -c inpcrd -p prmtop -o output
parallel pmemd on Meggie
#!/bin/bash -l
#
# allocate 4 nodes with 20 cores per node = 4*20 = 80 MPI tasks
#SBATCH --nodes=4
#SBATCH --tasks-per-node=20
#
# allocate nodes for 6 hours
#SBATCH --time=06:00:00
# job name
#SBATCH --job-name=my-pmemd
# do not export environment variables
#SBATCH --export=NONE
#
# first non-empty non-comment line ends SBATCH options

# do not export environment variables
unset SLURM_EXPORT_ENV

# jobs always start in submit directory
module load amber/20p03-at20p07-intel17.0-intelmpi2017

# run
srun pmemd.MPI -O -i mdin ...
Sample chain job script for automatic resubmission (Alex, A40)
#!/bin/bash -l
# Sample chain job script for automatic resubmission (Alex, A40)
#
# Required files in the current directory:
# * this slurm file
# * top file: topology    -> <variable1>
# * inp file: MD input    -> <variable2>
# * crd file: coordinates -> <variable3>_md.crd
# * COUNT: Initial counter for MD-1, i.e. initially 0; echo 0 > COUNT
# * MAXCOUNT: Final counter for MD, e.g. 10; echo 10 > MAXCOUNT
#
# For each simulation cycle n, a subdirectory "MDn" is created into which
# the result files are moved after completion. The restart file is renamed,
# and the topology file is kept in the current directory.
#
# Note: For extending a set of MAXCOUNT simulations, simply increase
#       the number in MAXCOUNT to the new maximum and resubmit this
#       slurm job file.
#
# SLURM settings
#SBATCH --partition=a40
#SBATCH --gres=gpu:a40:1
#SBATCH --export=NONE
# USER: Please adjust time and job-name
#SBATCH --time=12:00:00
#SBATCH --job-name=XYZ

unset SLURM_EXPORT_ENV

# Change into the submit directory
cd $SLURM_SUBMIT_DIR

# Add your favorite AMBER module here
module load amber/20p12-at21p11-gnu-cuda11.5
export SANDER=pmemd.cuda

# USER: Please adjust file names for input, topology, and basis name
# (YYY and ZZZ may be the same)
inp=XXX.inp
top=YYY.top
bas=ZZZ_md

echo "Starting AMBER calculation for $bas"
$SANDER -O -i $inp -p $top -c ${bas}.crd \
        -o ${bas}.out -r ${bas}.rst -e ${bas}.ene \
        -x ${bas}.ndf -inf ${bas}.inf
echo "Finished Amber calculation"

# Get current simulation cycle number and increase it
COUNT=`cat COUNT`; let COUNT++; echo $COUNT > COUNT

# Save work in subdirectories MDn, keep restart file
DD=MD${COUNT}
echo Saving work in $DD
mkdir $DD
mv ${bas}.crd $DD
mv ${bas}.out $DD
mv ${bas}.ene $DD
mv ${bas}.ndf $DD
mv ${bas}.inf $DD
cp ${bas}.rst $DD ; mv ${bas}.rst ${bas}.crd

# Resubmit
MAXCOUNT=`cat MAXCOUNT`
if [[ $COUNT -ge $MAXCOUNT ]]; then
    echo "Maximum number of invocations reached: $MAXCOUNT"
else
    let COUNT++ ; newname=$bas-$COUNT
    echo "Resubmitting $0 with jobname $newname"
    sbatch --job-name $newname $0
fi
sync
# End of sample chain job script
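A hedged usage sketch, assuming the script above is saved as chain_job.slurm; the file name and the cycle count 10 are only examples:

echo 0  > COUNT        # counter of completed cycles, initially 0
echo 10 > MAXCOUNT     # total number of cycles to run
sbatch chain_job.slurm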
Sample job script for HREMD (Alex, A40)
#!/bin/bash -l
#SBATCH --gres=gpu:a40:4
#SBATCH --export=NONE
#SBATCH --time=12:00:00
#SBATCH --job-name=my-hremd

unset SLURM_EXPORT_ENV

module load amber/20p12-at21p11-openmpi-gnu-cuda11.5

# number of replicas, e.g. 32
NG=32

# these variables are needed to start the MPS server
# Select a location that is accessible to the given $UID
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps.$SLURM_JOB_ID
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-log.$SLURM_JOB_ID

# Start the daemon.
nvidia-cuda-mps-control -d

mpirun --oversubscribe -np $NG pmemd.cuda.MPI -O -ng $NG -groupfile groupfile -rem 3

# this will stop the MPS server
echo quit | nvidia-cuda-mps-control
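The groupfile contains one command line per replica. A minimal sketch for generating it; all file names (mdin.001, inpcrd.001, prmtop, ...) are placeholders that must match your actual replica setup:

NG=32
rm -f groupfile
for i in $(seq -f "%03g" 1 $NG); do
  echo "-O -i mdin.$i -p prmtop -c inpcrd.$i -o mdout.$i -r restrt.$i -x mdcrd.$i -inf mdinfo.$i" >> groupfile
done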
Further information
- Amber benchmark suite results on Alex
- http://ambermd.org
- http://ambermd.org/doc12/Amber23.pdf
- http://ambermd.org/tutorials/
- http://ambermd.org/GPULogistics.php
- https://amberhub.chpc.utah.edu/
- https://www.exxactcorp.com/blog/Molecular-Dynamics/rtx3090-benchmarks-for-hpc-amber-a100-vs-rtx3080-vs-2080ti-vs-rtx6000
Mentors
- Dr. A. Kahler, RRZE, hpc-support@fau.de
- AG Sticht (Professorship of Bioinformatics, Faculty of Medicine)