Fritz parallel cluster (NHR+Tier3)
FAU’s Fritz cluster (system integrator: Megware) is a high-performance compute resource with a high-speed interconnect, i.e., a parallel computer. It is intended for multi-node parallel workloads. Fritz serves both as FAU’s basic Tier3 resource and as a resource for NHR projects.
- 4 front end nodes with the same CPUs as the compute nodes but 512 GB of RAM, and 100 GbE connection to RRZE’s network backbone.
- 1 visualization node with the same CPUs as the compute nodes but 1024 GB of RAM, one Nvidia A16 GPU, 30 TB of local NVMe SSD storage, and 100 GbE connection to RRZE’s network backbone. Contact us if you have a need for remote visualization!
- 992 compute nodes with direct liquid cooling (DLC), each with two Intel Xeon Platinum 8360Y “Ice Lake” processors (36 cores per chip) running at a base frequency of 2.4 GHz, 54 MB of shared L3 cache per chip, and 256 GB of DDR4 RAM.
- 64 huge-memory compute nodes with direct liquid cooling (DLC), each with two Intel Xeon Platinum 8470 “Sapphire Rapids” processors (52 cores per chip) running at a base frequency of 2.0 GHz with 105 MB of shared L3 cache per chip. 48 of these nodes have 1 TB of DDR5 RAM, while 16 even have 2 TB of DDR5 RAM.
These huge-memory compute nodes are NOT available within the free Tier3 service. NHR projects can get access upon explicit request to hpc-support@fau.de; please provide a short motivation why you need the huge memory.
- Lustre-based parallel filesystem with a capacity of about 3.5 PB and an aggregated parallel I/O bandwidth of > 20 GB/s.
- Blocking HDR100 Infiniband with up to 100 GBit/s bandwidth per link and direction. There are islands of 64 nodes (i.e. 4,608 cores). The blocking factor between islands is 1:4.
- Measured LINPACK performance of 1.84 PFlop/s on 512 nodes (April 2022), 2.233 PFlop/s on 612 nodes (May 2022; place 323 of the June 2022 Top500 list), and 3.578 PFlop/s on 986 nodes (November 2022; place 151 of the November 2022 Top500 list).
The name “Fritz” is a play on the name of FAU’s founder Friedrich, Margrave of Brandenburg-Bayreuth (1711-1763).
Fritz has been financed by:
- German Research Foundation (DFG) as part of INST 90/1171-1 (440719683),
- NHR funding of federal and state authorities (BMBF and Bavarian State Ministry of Science and the Arts, respectively),
- eight of the huge-memory nodes are dedicated to HS Coburg as part of the BMBF proposal “HPC4AAI” within the call “KI-Nachwuchs@FH”,
- and financial support of FAU to strengthen HPC activities.
This website shows information regarding the following topics:
Access, User Environment, and File Systems
Access to the machine
Note that FAU HPC accounts are not automatically enabled for Tier3 access to Fritz. To request Tier3 access to Fritz, you need to work on a project with extended demands that is not feasible on Woody/Meggie but still below the NHR thresholds. You have to provide proof of that together with a short description of what you want to do there: https://hpc.fau.de/tier3-access-to-fritz/.
The rules for NHR access are described on our page on NHR application rules.
Users can connect to fritz.nhr.fau.de (keep the “nhr” instead of “rrze” in mind!) and will be randomly routed to one of the four front ends. All systems in the cluster, including the front ends, have private IPv4 addresses in the 10.28.64.0/20 range and IPv6 addresses in the 2001:638:a000:3964::/64 range. They can normally only be accessed directly from within the FAU networks. There is one exception: if your internet connection supports IPv6, you can directly ssh to the front ends (but not to the compute nodes). Otherwise, if you need access from outside of FAU, you usually have to connect, for example, to the dialog server cshpc.rrze.fau.de first and then ssh to fritz.nhr.fau.de from there. For HPC portal/NHR users we provide a template that can be added to the local .ssh/config.
SSH public host keys of the (externally) reachable hosts
ssh-dss AAAAB3NzaC1kc3MAAACBAJj7qaKbgbgtUCB8iXOzCY8Vs0ZbdnlZsNDdKZct3vOzt2B4js2yBEGs0Fsmvjy88ro33TeI1JjMsnNP6T4wmeNPIGLkUvtX2fPcLL3c9WbVkeOf3R2b5VMRIJ+l3rVgwvHihBJFcgaAQO/mB75hdtzY6Pk5cReVYR2uidD3HkKtAAAAFQC1cIYQ0jUUXnJeeMG/t/8muhFaxQAAAIBrGWK0GhIAFu9cSy/pVXiJP7rJrSvsDGGdZ1HgB7GEjeN4O9ZFZS+vj2m4ij0LjOVtulf9LDJR56fA+3Jpjcu32/L7IPCm5/nqJd7An9/xt8D+tUPhOZfRugol9f6tV/oDRI3Y7rMDjChpjpkuN9bP2vshveHLlA0WB9Lqdgu2fgAAAIBKS/RFirbOnuP38OJ6mTXLeSlNsLEs+zW+vHhL5a08MXrAUQHYUwZplH2bNQpMyeRH55UoRJC0XDHpJzW8yafcwpO6k7uL1CWi3Gnhya9EbX2GIe8cYRrhYhcO+0M8UrVXKksVVWyAfkXZsIjTQCEcsCNhl0no5xS0/yOB6b6WzQ== fritz.nhr.fau.de ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBBxc984fY6EkjlFQBFTtu/9X9EolCSz1OHNzaa8VWBj5TxV9GF8RTBXJ6why2AdK3dVrv+Qyko+X5vsMMflEiRc= fritz.nhr.fau.de ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILqv3FDYom0c4HgfCzLw9Ts2PE0GYqWaaOrM9EfQxvTI fritz.nhr.fau.de ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDLCrYuIIKhA+F4hktnR4VKAZ44J6CWIMfC9mCei80YZ0294kFwpYQg3RYRLHSyL6XqLgxaZN2kFm0V0NpUpEYSP2V9eWpuyfeB6a3M1I8yy4rDagHgWuHYuX2fSm8uwfndnJJ6hV1xfZuoZrZJIMkdy8qBl4y1cxn8G6CS1KEFkeJp7wMuIdIruFbJa5eQXVgAxaqKPQYRldpK8c1OAByfQv9cBXF53cNZhtlwkUes6/PqNyU1aIodfahdYh6mxn/4Rzy+NMD0YS066P8xWP1n+bsTBpZ51pH7qTIiW1yRKFmAeFvkWVnS6N5qIwJnzB3J7DRUue9h1EhW4HCo6CEX3GCOt0kuV4ax0JgYO/Lz0cUTdDcgkWOpVtQ+WyLNUech+TsOREn19QjaK9QRriBOvBcNCnbpBXHZSqfOYGB6uggkVjyDPI6S5pclt544ie6pklAOSzrha5CLnzD4U8oVuqhFHteO39qXpbvxkUDuNsDf9t8K5fmWgCXXWtJSVhE= fritz.nhr.fau.de
fingerprints of the SSH public host keys of fritz.nhr.fau.de (as of 11/2021)
1024 SHA256:1jZOatgzkjn7G3b1/K48T+cQ8MUB60oU7j+CK5FWId8 fritz.nhr.fau.de (DSA)
256 SHA256:2NzGq1vTO//RNtVLEGrp5yMyPHmtAw6nguSxcFUHHWU fritz.nhr.fau.de (ECDSA)
256 SHA256:5km4SnsTbyBG6gX1y11imEEU8QKP8EbrqFNPce1eEU4 fritz.nhr.fau.de (ED25519)
3072 SHA256:p7+HzbUyVSjYh2hx2nQcrIuWnZZKJhnNUoI9kz+Q4yw fritz.nhr.fau.de (RSA)
SSH public host keys of cshpc.rrze.fau.de (as of 11/2021)
ssh-dss AAAAB3NzaC1kc3MAAACBAO2L8+7bhJm7OvvJMcdGSJ5/EaxvX5RRzE9RrB8fx5H69ObkqC6Baope4rOS9/+2gtnm8Q3gZ5QkostCiKT/Wex0kQQUmKn3fx6bmtExLq8YwqoRXRmNTjBIuyZuZH9w/XFK36MP63p/8h7KZXvkAzSRmNVKWzlsAg5AcTpLSs3ZAAAAFQCD0574+lRlF0WONMSuWeQDRFM4vwAAAIEAz1nRhBHZY+bFMZKMjuRnVzEddOWB/3iWEpJyOuyQWDEWYhAOEjB2hAId5Qsf+bNhscAyeKgJRNwn2KQMA2kX3O2zcfSdpSAGEgtTONX93XKkfh6JseTiFWos9Glyd04jlWzMbwjdpWvwlZjmvPI3ATsv7bcwHji3uA75PznVUikAAACBANjcvCxlW1Rjo92s7KwpismWfcpVqY7n5LxHfKRVqhr7vg/TIhs+rAK1XF/AWxyn8MHt0qlWxnEkbBoKIO5EFTvxCpHUR4TcHCx/Xkmtgeq5jWZ3Ja2bGBC3b47bHHNdDJLU2ttXysWorTXCoSYH82jr7kgP5EV+nPgwDhIMscpk cshpc.rrze.fau.de ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNVzp97t3CxlHtUiJ5ULqc/KLLH+Zw85RhmyZqCGXwxBroT+iK1Quo1jmG6kCgjeIMit9xQAHWjS/rxrlI10GIw= cshpc.rrze.fau.de ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPSIFF3lv2wTa2IQqmLZs+5Onz1DEug8krSrWM3aCDRU cshpc.rrze.fau.de 1024 35 135989634870042614980757742097308821255254102542653975453162649702179684202242220882431712465065778248253859082063925854525619976733650686605102826383502107993967196649405937335020370409719760342694143074628619457902426899384188195801203193251135968431827547590638365453993548743041030790174687920459410070371 cshpc.rrze.fau.de ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAs0wFVn1PN3DGcUtd/JHsa6s1DFOAu+Djc1ARQklFSYmxdx5GNQMvS2+SZFFa5Rcw+foAP9Ks46hWLo9mOjTV9AwJdOcSu/YWAhh+TUOLMNowpAEKj1i7L1Iz9M1yrUQsXcqDscwepB9TSSO0pSJAyrbuGMY7cK8m6//2mf7WSxc= cshpc.rrze.fau.de
fingerprints of the SSH public host keys of cshpc.rrze.fau.de (as of 11/2021)
1024 SHA256:A82eA7py46zE/TrSTCRYnJSW7LZXY16oOBxstJF3jxU cshpc.rrze.fau.de (DSA)
256 SHA256:wFaDywle3yJvygQ4ZAPDsi/iSBTaF6Uoo0i0z727aJU cshpc.rrze.fau.de (ECDSA)
256 SHA256:is52MRsxMgxHFn58o0ZUh8vCzIuE2gYanmhrxdy0rC4 cshpc.rrze.fau.de (ED25519)
1024 SHA256:Za1mKhTRFDXUwn7nhPsWc7py9a6OHqS2jin01LJC3ro cshpc.rrze.fau.de (RSA)
While it is possible to ssh directly to a compute node, users are only allowed to do this while they have a batch job running there. When all batch jobs of a user on a node have ended, all of their processes, including any open shells, will be killed automatically.
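For example, you can look up the node(s) of a running job with squeue and then log in there (a minimal sketch; the node name is a placeholder):
squeue -u $USER -t RUNNING -o "%i %P %N"   # list your running jobs with partition and node list
ssh <nodename>                             # only works while your job is running on that node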
Software environment
The login and compute nodes run AlmaLinux 8 (which is basically Red Hat Enterprise Linux 8 without the support).
The login shell for all users on Fritz is always bash and cannot be changed.
As on many other HPC systems, environment modules are used to facilitate access to software packages. Type “module avail” to get a list of available packages. Even more packages will become visible once one of the 000-all-spack-pkgs modules has been loaded. Most of the software is installed using “Spack” as an enhanced HPC package manager.
General notes on how to use certain software on our systems (including, in some cases, sample job scripts) can be found on the Special applications, and tips & tricks pages. Specific notes on how some software provided via modules on the Fritz cluster has been compiled can be found in the following accordion:
Intel tools (compiler, MPI, MKL, TBB)
The modules intel (and the Spack-internal intel-oneapi-compilers) provide the legacy Intel compilers icc, icpc, and ifort as well as the new LLVM-based ones (icx, icpx, dpcpp, ifx).
Recommended compiler flags are: -O3 -xHost
If you want to enable full-width AVX512 SIMD support you have to additionally set the flag: -qopt-zmm-usage=high
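For illustration, a hypothetical compile line with the LLVM-based C compiler combining these flags (source and binary names are placeholders):
icx -O3 -xHost -qopt-zmm-usage=high -o my_app my_app.c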
The modules intelmpi (and the Spack-internal intel-oneapi-mpi) provide Intel MPI. To use the legacy Intel compilers with Intel MPI, just use the wrappers with the Intel compiler names, i.e. mpiicc, mpiicpc, mpiifort. To use the new LLVM-based Intel compilers with Intel MPI, you have to specify them explicitly, i.e. use mpiicc -cc=icx, mpiicpc -cxx=icpx, or mpiifort -fc=ifx. Executing mpicc, mpicxx, or mpif90 results in using the GNU compilers.
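For illustration, hypothetical compile lines for an MPI code (file names are placeholders):
mpiicc -O3 -xHost -o mpi_app mpi_app.c           # legacy Intel compiler (icc)
mpiicc -cc=icx -O3 -xHost -o mpi_app mpi_app.c   # LLVM-based Intel compiler (icx)
mpicc -O3 -march=native -o mpi_app mpi_app.c     # GNU compiler (gcc)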
The modules mkl and tbb (and the Spack-internal intel-oneapi-mkl and intel-oneapi-tbb) provide Intel MKL and TBB. Use Intel’s MKL Link Line Advisor to figure out the appropriate command line for linking with MKL. Intel MKL also includes drop-in wrappers for FFTW3.
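As a simple example, linking the sequential MKL with the Intel compilers can be done via the -qmkl flag (a sketch; for other configurations consult the Link Line Advisor, and the file names are placeholders):
icx -O3 -xHost -o my_app my_app.c -qmkl=sequential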
Further Intel tools may be added in the future.
The Intel modules on Fritz, Alex, and Slurm-TinyGPU/TinyFAT behave differently than on the older RRZE systems: (1) The intel64 module has been renamed to intel and no longer automatically loads intel-mpi and mkl. (2) intel-mpi/VERSION-intel and intel-mpi/VERSION-gcc have been unified into intel-mpi/VERSION. The compiler is selected by the wrapper name, e.g. mpicc = GCC, mpiicc = Intel; mpif90 = GFortran; mpiifort = Intel.
GNU compiler (gcc/g++/gfortran)
The GNU compilers are available in the version coming with the operating system (currently 8.5.0) as well as via modules (currently versions 9.4, 10.3, and 11.2).
Recommended compiler flags are: -O3 -march=native
If you want to enable full-width AVX512 SIMD support you have to additionally set the flag: -mprefer-vector-width=512
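For illustration, a hypothetical compile line combining these flags (file names are placeholders):
gcc -O3 -march=native -mprefer-vector-width=512 -o my_app my_app.c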
Open MPI
Using srun instead of mpirun is recommended. Open MPI is built using Spack:
- with the compiler mentioned in the module name; the corresponding compiler will be loaded as a dependency when the Open MPI module is loaded
- without support for thread-multiple
- with fabrics=ucx
- with support for Slurm as scheduler (and internal PMIx of Open MPI)
Python and conda environments
Do not rely on the Python installation from the operating system. Use our python modules instead. These installations will be updated in place from time to time. We can add further packages from the Miniconda distribution as needed.
You can modify the Python environment as follows:
Set the location where pip and conda install packages to $WORK; see Python and Jupyter for details, and the sketch after this list. By default, packages will be installed in $HOME, which has limited capacity.
Extend the base environment:
$ pip install --user <packages>
Create a new environment of your own:
$ conda create -n <environment_name> <packages>
Clone and modify the base environment:
$ conda create --name myclone --clone base
$ conda install --name myclone new_package
See also https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html.
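One possible way to point pip and conda at $WORK is sketched below; it assumes the generic pip/conda mechanisms and is not meant to replace the recommended setup described on the Python and Jupyter page:
# let "pip install --user" place packages below $WORK instead of $HOME
export PYTHONUSERBASE=$WORK/python
# keep conda environments and the conda package cache on $WORK
conda config --add envs_dirs $WORK/conda/envs
conda config --add pkgs_dirs $WORK/conda/pkgs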
GDB - GNU Project debugger
When using gdb -p <pid> (or the equivalent attach <pid> command in gdb) to attach to a process running in a SLURM job, you might encounter errors or warnings about executable and library files that cannot be opened. Such issues will also prevent symbols from being resolved correctly, making debugging really difficult.
The reason this happens is that processes in a SLURM job get a slightly different view of the file system mounts (using a so-called namespace). When you want to attach GDB to a running process and use SSH to log into the node where the process is running, the gdb process will not be in the same namespace, so GDB cannot directly access the binary (and its libraries) you’re trying to debug.
The workaround is to use a slightly different method for attaching to the process:
$ gdb <executable>
(gdb) set sysroot /
(gdb) attach <pid>
(Thanks to our colleagues at SURFsara for figuring this out!)
Arm DDT
Arm DDT is a powerful parallel debugger. NHR@FAU holds a license for 32 processes.
Amber
NHR@FAU holds a “compute center license” of Amber, thus, Amber is generally available to everyone for non-profit use, i.e. for academic research.
Amber usually delivers the most economic performance using GPGPUs. Thus, the Alex GPGPU cluster might be a better choice.
Gromacs
Gromacs often delivers the most economic performance if GPGPUs are used. Thus the Alex GPGPU cluster might be a better choice.
If running on Fritz, it is mandatory in most cases to optimize the number of PME processes experimentally. “gmx tune_pme” REQUIRES FURTHER WORK AS A NON-MPI BINARY HAS TO BE USED
TODO: How exactly to run gmx tune_pme
…
Do not start gmx mdrun with the option -v. The verbose output will only create extra-large Slurm stdout files, and your jobs will suffer if the NFS servers have high load. There is also only very limited use in seeing in the stdout all the time when the job is expected to reach the specified number of steps.
LAMMPS
lammps/20211027-gcc11.2.0-ompi-mkl has been compiled using GCC 11.2.0, Open MPI 4.1.1, and Intel oneAPI MKL using
- -DBUILD_SHARED_LIBS:BOOL=ON -DLAMMPS_EXCEPTIONS:BOOL=OFF -DBUILD_MPI=ON -DBUILD_OMP:BOOL=ON -DPKG_OPENMP=ON -DPKG_GPU=OFF -DBUILD_LIB=ON -DWITH_JPEG:BOOL=ON -DWITH_PNG:BOOL=ON -DWITH_FFMPEG:BOOL=ON -DPKG_ASPHERE=ON -DPKG_BODY=ON -DPKG_CLASS2=ON -DPKG_COLLOID=ON -DPKG_COMPRESS=ON -DPKG_CORESHELL=ON -DPKG_DIPOLE=ON -DPKG_GRANULAR=ON -DPKG_KSPACE=ON -DPKG_KOKKOS=ON -DPKG_LATTE=ON -DPKG_MANYBODY=ON -DPKG_MC=ON -DPKG_MEAM=OFF -DPKG_MISC=ON -DPKG_MLIAP=OFF -DPKG_MOLECULE=ON -DPKG_MPIIO=ON -DPKG_OPT=OFF -DPKG_PERI=ON -DPKG_POEMS=ON -DPKG_PYTHON=ON -DPKG_QEQ=ON -DPKG_REPLICA=ON -DPKG_RIGID=ON -DPKG_SHOCK=ON -DPKG_SNAP=ON -DPKG_SPIN=ON -DPKG_SRD=ON -DPKG_USER-ATC=ON -DPKG_USER-ADIOS=OFF -DPKG_USER-AWPMD=OFF -DPKG_USER-BOCS=OFF -DPKG_USER-CGSDK=OFF -DPKG_USER-COLVARS=OFF -DPKG_USER-DIFFRACTION=OFF -DPKG_USER-DPD=OFF -DPKG_USER-DRUDE=OFF -DPKG_USER-EFF=OFF -DPKG_USER-FEP=OFF -DPKG_USER-H5MD=ON -DPKG_USER-LB=ON -DPKG_USER-MANIFOLD=OFF -DPKG_USER-MEAMC=ON -DPKG_USER-MESODPD=OFF -DPKG_USER-MESONT=OFF -DPKG_USER-MGPT=OFF -DPKG_USER-MISC=ON -DPKG_USER-MOFFF=OFF -DPKG_USER-NETCDF=ON -DPKG_USER-OMP=ON -DPKG_USER-PHONON=OFF -DPKG_USER-PLUMED=OFF -DPKG_USER-PTM=OFF -DPKG_USER-QTB=OFF -DPKG_USER-REACTION=OFF -DPKG_USER-REAXC=ON -DPKG_USER-SDPD=OFF -DPKG_USER-SMD=OFF -DPKG_USER-SMTBQ=OFF -DPKG_USER-SPH=OFF -DPKG_USER-TALLY=OFF -DPKG_USER-UEF=OFF -DPKG_USER-YAFF=OFF -DPKG_VORONOI=ON -DPKG_KIM=ON -DFFT=MKL -DEXTERNAL_KOKKOS=ON
Run module avail lammps to see all currently installed LAMMPS modules. Allocate an interactive job and run mpirun -np 1 lmp -help to see which LAMMPS packages have been included in a specific build.
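For example, to inspect a specific build interactively (a sketch; the module version is the one mentioned above and may differ for other builds):
salloc -N 1 --partition=singlenode --time=00:10:00
module load lammps/20211027-gcc11.2.0-ompi-mkl
mpirun -np 1 lmp -help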
NAMD
NAMD comes with a license which prohibits us from “just installing it so that everyone can use it”. We therefore need individual users to print and sign the NAMD license. Subsequently, we will set the permissions accordingly.
TODO – no module yet
At the moment, we provide the official pre-built Linux-x86_64-multicore binary.
VASP
VASP comes with a license which prohibits us from “just installing it so that everyone can use it”. We have to individually check each VASP user.
At the moment we provide VASP 5.4.x and VASP 6.3.x modules to eligible users.
The module vasp6/6.3.0-hybrid-intel-impi-AVX2-with-addons includes DFTD4, libbeef, and sol_compat/VASPsol.
Feel free to compile software yourself in the versions and with the options you need. This is perfectly fine, yet support for self-installed software cannot be granted. We can only provide software centrally that is of importance for multiple groups. If you want to use Spack for compiling additional software, you can load our user-spack module to make use of the packages we already built with Spack (if the concretization matches) instead of starting from scratch. Once user-spack is loaded, the command spack will be available (as an alias); you will inherit the pre-sets we defined for certain packages (e.g. Open MPI to work with Slurm), but you will install everything into your own directories ($WORK/USER-SPACK).
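A minimal sketch of this workflow (the package name is a placeholder):
module load user-spack
spack spec <package>      # check how the package would be concretized with our pre-sets
spack install <package>   # builds and installs into your own $WORK/USER-SPACK
spack find                # list the packages installed so far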
You can also bring your own environment in a container using Singularity/Apptainer.
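For illustration, a hypothetical container workflow (image and application names are placeholders):
apptainer pull my_container.sif docker://ubuntu:22.04    # convert a Docker image into a SIF file
srun apptainer exec my_container.sif ./my_application    # run the application inside the container within a Slurm job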
File Systems
The following table summarizes the available file systems and their features. It is only an excerpt from the description of the HPC file system.
Mount point | Access via | Purpose | Technology, size | Backup | Data lifetime | Quota |
---|---|---|---|---|---|---|
/home/hpc | $HOME | Storage of source, input and important results | NFS on central servers, small | YES + Snapshots | Account lifetime | YES (restrictive) |
/home/vault | $HPCVAULT | Medium- to long-term high-quality storage | central servers | YES + Snapshots | Account lifetime | YES |
/home/{woody, saturn, titan, janus, atuin} | $WORK | Short- to medium-term storage or small files | central NFS server | NO | Account lifetime | YES |
/lustre | $FASTTMP | High performance parallel I/O; short-term storage | Lustre-based parallel file system, 3.5 PB | NO | High watermark deletion | NO |
NFS file system $HOME
When connecting to one of the front end nodes, you’ll find yourself in your regular HPC $HOME directory (/home/hpc/...). There are relatively tight quotas there, so it will most probably be too small for the inputs/outputs of your jobs. It does, however, offer a lot of nice features, like fine-grained snapshots, so use it for “important” stuff, e.g. your job scripts or the source code of the program you’re working on. See the HPC file system page for a more detailed description of the features.
Quota in $HOME is very limited as snapshots are made every 30 minutes. Put simulation data in $WORK! Do not rely on the specific path of $WORK, as it may change over time when your work directory is relocated to a different NFS server.
Parallel file system $FASTTMP
The cluster’s parallel file system is mounted on all nodes under /lustre/$GROUP/$USER/ and is available via the $FASTTMP environment variable. It supports parallel I/O using the MPI-I/O functions and can be accessed with an aggregate bandwidth of > 20 GB/s.
The parallel file system is strictly intended to be a high-performance short-term storage, so a high watermark deletion algorithm is employed: When the filling of the file system exceeds a certain limit (e.g. 80%), files will be deleted starting with the oldest and largest files until a filling of less than 60% is reached.
Note that parallel filesystems generally are not made for handling large amounts of small files. This is by design: Parallel filesystems achieve their amazing speed by writing to multiple different servers at the same time. However, they do that in blocks, in our case 1 MB. That means that for a file that is smaller than 1 MB, only one server will ever be used, so the parallel filesystem can never be faster than a traditional NFS server – on the contrary: due to larger overhead, it will generally be slower. They can only show their strengths with files that are at least a few megabytes in size, and excel if very large files are written by many nodes simultaneously (e.g. checkpointing).
For that reason, we have set a limit on the number of files you can store there.
Batch processing
As with all production clusters at RRZE, resources are controlled through a batch system. The front ends can be used for compiling and very short serial test runs, but everything else has to go through the batch system to the cluster.
Fritz uses SLURM as a batch system. Please see our general batch system description for further details.
The granularity of batch allocations is complete nodes, i.e. nodes are never shared. As a parallel computer, Fritz is not made for single-node jobs, as a lot of money was spent on the fast HDR100 interconnect.
Partition | min – max walltime | min – max nodes | availability | Comments |
---|---|---|---|---|
singlenode (there is no need to specify the partition explicitly) | 0 – 24:00:00 | 1 | Tier3&NHR | Jobs run on nodes without Infiniband; nodes are exclusive |
multinode (there is no need to specify the partition explicitly) | 0 – 24:00:00 | 1-64 | Tier3&NHR | Jobs run on nodes with Infiniband; nodes are exclusive |
spr1tb (the partition has to be specified explicitly) | 0 – 24:00:00 | 1-8 | NHR-only | huge-memory nodes with (at least) 1 TB main memory; access is restricted – NHR projects can contact hpc-support@fau.de to get enabled. Please provide a short motivation why you need the huge memory. |
spr2tb (the partition has to be specified explicitly) | 0 – 24:00:00 | 1-2 | NHR-only | huge-memory nodes with 2 TB main memory; access is restricted – NHR projects can contact hpc-support@fau.de to get enabled. Please provide a short motivation why you need the huge memory. |
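As an illustration, a batch script requesting one huge-memory node could look like this (a sketch analogous to the job scripts below; the application name is a placeholder and your account must already be enabled for the spr1tb partition):
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=104   # 2 x 52 physical cores per Sapphire Rapids node
#SBATCH --partition=spr1tb      # huge-memory partitions have to be specified explicitly
#SBATCH --time=01:00:00
#SBATCH --export=NONE
unset SLURM_EXPORT_ENV
module load XXX
srun ./mpi_application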
Interactive job (single-node)
Interactive jobs can be requested by using salloc instead of sbatch and specifying the respective options on the command line.
The following will give you an interactive shell on one node for one hour:
salloc -N 1 --partition=singlenode --time=01:00:00
Settings from the calling shell (e.g. loaded module paths) will be inherited by the interactive job!
Interactive job (multi-node)
Interactive jobs can be requested by using salloc instead of sbatch and specifying the respective options on the command line.
The following will give you four nodes with an interactive shell on the first node for one hour:
salloc -N 4 --partition=multinode --time=01:00:00
Settings from the calling shell (e.g. loaded module paths) will be inherited by the interactive job!
MPI parallel job (single-node)
In this example, the executable will be run on one node, using 72 MPI processes, i.e. one per physical core.
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=72
#SBATCH --partition=singlenode
#SBATCH --time=01:00:00
#SBATCH --export=NONE
unset SLURM_EXPORT_ENV
module load XXX
srun ./mpi_application
OpenMP job (single-node)
In this example, the executable will be run using 72 OpenMP threads (i.e. one per physical core) for a total job walltime of 1 hour.
For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved by the following environment variables: OMP_PLACES=cores, OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=72
#SBATCH --partition=singlenode
#SBATCH --time=01:00:00
#SBATCH --export=NONE
unset SLURM_EXPORT_ENV
module load XXX
# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
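# pin the OpenMP threads to the physical cores (as recommended above)
export OMP_PLACES=cores
export OMP_PROC_BIND=true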
./openmp_application
Hybrid OpenMP/MPI job (single-node)
In this example, the executable will be run using 2 MPI processes with 36 OpenMP threads (i.e. one per physical core) for a total job walltime of 1 hour.
For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved by the following environment variables: OMP_PLACES=cores, OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=36
#SBATCH --partition=singlenode
#SBATCH --time=1:00:00
#SBATCH --export=NONE
unset SLURM_EXPORT_ENV
module load XXX
# cpus-per-task has to be set again for srun
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
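# pin the OpenMP threads to the physical cores (as recommended above)
export OMP_PLACES=cores
export OMP_PROC_BIND=true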
srun ./hybrid_application
MPI parallel job (multi-node)
In this example, the executable will be run on four nodes, using 72 MPI processes per node, i.e. one per physical core.
#!/bin/bash -l
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=72
#SBATCH --partition=multinode
#SBATCH --time=1:0:0
#SBATCH --export=NONE
unset SLURM_EXPORT_ENV
module load XXX
srun ./mpi_application
Hybrid OpenMP/MPI job (multi-node)
In this example, the executable will be run on four nodes with 2 MPI processes per node and 36 OpenMP threads each (i.e. one per physical core) for a total job walltime of 1 hour.
For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved by the following environment variables: OMP_PLACES=cores, OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.
#!/bin/bash -l
#SBATCH --partition=multinode
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=36
#SBATCH --time=01:00:00
#SBATCH --export=NONE
unset SLURM_EXPORT_ENV
module load XXX
# cpus-per-task has to be set again for srun
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
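# pin the OpenMP threads to the physical cores (as recommended above)
export OMP_PLACES=cores
export OMP_PROC_BIND=true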
srun ./hybrid_application
Further Information
Intel Xeon Platinum 8360Y “IceLake” Processor
Hyperthreading (SMT) is disabled; sub-NUMA clustering (Cluster-on-Die, CoD) is activated. This results in 4 NUMA domains with 18 cores each per compute node.
The processor can be operated in 3 modes; in Fritz it’s running in its default mode with 36 cores and 250 W TDP.
Launch Date | Q2’21 |
Lithography | 10 nm |
Total Cores (Threads) | 36 (72 – SMT is disabled on Fritz) |
Max Turbo Frequency (non-AVX code) | 3.50 GHz (significantly lower for heavy AVX2/AVX512 workload) |
Processor Base Frequency (non-AVX code) | 2.40 GHz (significantly lower for heavy AVX2/AVX512 workload) |
Last level cache (L3) | 54 MB |
# of UPI Links | 3 |
TDP | 250 W |
Memory Channels & Memory Type | 8 channels DDR4 @ 3200 per socket (in Fritz: 16x 16 GB DDR4-3200 per node) |
Instruction Set Extensions | Intel SSE4.2, Intel AVX, Intel AVX2, Intel AVX-512 |
# of AVX-512 FMA Units | 2 |
See https://ark.intel.com/content/www/us/en/ark/products/212459/intel-xeon-platinum-8360y-processor-54m-cache-2-40-ghz.html for full processor details.
Intel Xeon Platinum 8470 “Sapphire Rapids” Processors
Hyperthreading (SMT) is disabled; sub-NUMA clustering (Cluster-on-Die, CoD) is activated. This results in 8 NUMA domains with 13 cores each per compute node.
Launch Date | Q1’23 |
Lithography | Intel 7 |
Total Cores (Threads) | 52 (104 – SMT is disabled on Fritz) |
Max Turbo Frequency (non-AVX code) | 3.80 GHz (significantly lower for heavy AVX2/AVX512 workload) |
Processor Base Frequency (non-AVX code) | 2.0 GHz (significantly lower for heavy AVX2/AVX512 workload) |
Last level cache (L3) | 105 MB |
# of UPI Links | 4 |
TDP | 350 W |
Memory Channels & Memory Type | 8 channels DDR5 @ 4400 MT/s with 2 DPC per socket (in Fritz: 32x 64/128 GB DDR5-4800 per node) |
Instruction Set Extensions | Intel SSE4.2, Intel AVX, Intel AVX2, Intel AVX-512 |
# of AVX-512 FMA Units | 2 |
Intel® On Demand Feature Activation | none |
See https://ark.intel.com/content/www/us/en/ark/products/231728/intel-xeon-platinum-8470-processor-105m-cache-2-00-ghz.html for full processor details.
Network topology
Fritz uses unmanaged 40-port Mellanox HDR switches. Eight HDR200 links per edge switch are connected to the spine level. Using splitter cables, 64 compute nodes are connected with HDR100 to each edge switch. This results in a 1:4 blocking fat tree. Each island of 64 nodes has a total of 4,608 cores. Slurm is aware of the topology, but minimizing the number of switches per job does not have a high priority.
Direct liquid cooling (DLC) of the compute nodes


ICX nodes:
- https://www.intel.com/content/www/us/en/products/details/servers/multi-node-server-systems/server-system-d50tnp.html
- https://www.intel.com/content/www/us/en/content-details/630955/intel-server-d50tnp-family-integration-and-service-guide-production-version.html
- https://www.intel.com/content/www/us/en/content-details/749227/intel-server-d50tnp-family-configuration-guide-production-version.html
- https://www.intel.com/content/www/us/en/content-details/647674/intel-server-d50tnp-family-technical-product-specification-production-version.html
SPR nodes:
- https://www.intel.com/content/www/us/en/products/sku/232185/intel-server-system-d50dnp1mhcplc-compute-module/specifications.html
- https://www.intel.com/content/www/us/en/content-details/766296/intel-server-d50dnp-family-integration-and-service-guide-1-0.html
- https://www.intel.com/content/www/us/en/content-details/765893/intel-server-d50dnp-family-technical-product-specification-1-0.html