Fritz parallel cluster (NHR+Tier3)

FAU’s Fritz cluster (system integrator: Megware) is a high-performance compute resource with a high-speed interconnect, i.e., a parallel computer. It is intended for multi-node parallel workloads. Fritz serves both as FAU’s basic Tier3 resource and as an NHR project resource.

  • 4 front end nodes with the same CPUs as the compute nodes but 512 GB of RAM and a 100 GbE connection to RRZE’s network backbone.
  • 1 visualization node with the same CPUs as the compute nodes but 1024 GB of RAM, one Nvidia A16 GPU, 30 TB of local NVMe SSD storage, and a 100 GbE connection to RRZE’s network backbone. Contact us if you need remote visualization!
  • 992 compute nodes with direct liquid cooling (DLC), each with two Intel Xeon Platinum 8360Y “Ice Lake” processors (36 cores per chip) running at a base frequency of 2.4 GHz, 54 MB of shared L3 cache per chip, and 256 GB of DDR4 RAM.
  • Lustre-based parallel file system with a capacity of about 3.5 PB and an aggregated parallel I/O bandwidth of > 20 GB/s.
  • Blocking HDR100 Infiniband with up to 100 GBit/s bandwidth per link and direction. There are islands with 64 nodes (i.e. 4,608 cores); the blocking factor between islands is 1:4.
  • Measured LINPACK performance of 1.84 PFlop/s on 512 nodes in April 2022,
    2.233 PFlop/s on 612 nodes in May 2022, resulting in place 323 of the June 2022 Top500 list, and
    3.578 PFlop/s on 986 nodes in November 2022, resulting in place 151 of the November 2022 Top500 list.

The name “Fritz” is a play on the name of FAU’s founder Friedrich, Margrave of Brandenburg-Bayreuth (1711-1763).

Fritz has been financed by:

  • German Research Foundation (DFG) as part of INST 90/1171-1 (440719683),
  • NHR funding of federal and state authorities (BMBF and Bavarian State Ministry of Science and the Arts, respectively),
  • and financial support of FAU to strengthen HPC activities.

This page provides information on the following topics:

  • Access, User Environment, File Systems
    • Access to the machine
    • File systems
    • Batch processing
  • Further Information
    • Technical data Intel Xeon Platinum 8360Y “IceLake” processor
    • Network topology
    • Direct liquid cooling (DLC) of the compute nodes

Access, User Environment, and File Systems

Access to the machine

Note that FAU HPC accounts are not automatically enabled for Tier3 access to Fritz. To request Tier3 access to Fritz, you need to work on a project with extended demands that is not feasible on Woody/Meggie but still below the NHR thresholds. You have to demonstrate this and provide a short description of what you want to do there: https://hpc.fau.de/tier3-access-to-fritz/.

The rules for NHR access are described on our page on NHR application rules.

Users can connect to fritz.nhr.fau.de (keep the “nhr” instead of “rrze” in mind!) and will be randomly routed to one of the four front ends. All systems in the cluster, including the front ends, have private IPv4 addresses in the 10.28.64.0/20 range and IPv6 addresses in the 2001:638:a000:3964::/64 range. They can normally only be accessed directly from within the FAU networks. There is one exception: if your internet connection supports IPv6, you can directly ssh to the front ends (but not to the compute nodes). Otherwise, if you need access from outside of FAU, you usually have to connect, for example, to the dialog server cshpc.rrze.fau.de first and then ssh to fritz.nhr.fau.de from there. For HPC portal/NHR users we provide a template that can be added to the local .ssh/config; a sketch of such an entry is shown below.
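
A minimal sketch of what such a .ssh/config entry could look like when jumping over the dialog server (the official template from the HPC portal may differ; the account name is a placeholder):

Host fritz
    HostName fritz.nhr.fau.de
    User <your_hpc_account>
    ProxyJump <your_hpc_account>@cshpc.rrze.fau.de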

SSH public host keys of the (externally) reachable hosts

SSH public host keys of fritz.nhr.fau.de (as of 11/2021)
ssh-dss AAAAB3NzaC1kc3MAAACBAJj7qaKbgbgtUCB8iXOzCY8Vs0ZbdnlZsNDdKZct3vOzt2B4js2yBEGs0Fsmvjy88ro33TeI1JjMsnNP6T4wmeNPIGLkUvtX2fPcLL3c9WbVkeOf3R2b5VMRIJ+l3rVgwvHihBJFcgaAQO/mB75hdtzY6Pk5cReVYR2uidD3HkKtAAAAFQC1cIYQ0jUUXnJeeMG/t/8muhFaxQAAAIBrGWK0GhIAFu9cSy/pVXiJP7rJrSvsDGGdZ1HgB7GEjeN4O9ZFZS+vj2m4ij0LjOVtulf9LDJR56fA+3Jpjcu32/L7IPCm5/nqJd7An9/xt8D+tUPhOZfRugol9f6tV/oDRI3Y7rMDjChpjpkuN9bP2vshveHLlA0WB9Lqdgu2fgAAAIBKS/RFirbOnuP38OJ6mTXLeSlNsLEs+zW+vHhL5a08MXrAUQHYUwZplH2bNQpMyeRH55UoRJC0XDHpJzW8yafcwpO6k7uL1CWi3Gnhya9EbX2GIe8cYRrhYhcO+0M8UrVXKksVVWyAfkXZsIjTQCEcsCNhl0no5xS0/yOB6b6WzQ== fritz.nhr.fau.de
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBBxc984fY6EkjlFQBFTtu/9X9EolCSz1OHNzaa8VWBj5TxV9GF8RTBXJ6why2AdK3dVrv+Qyko+X5vsMMflEiRc= fritz.nhr.fau.de
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAILqv3FDYom0c4HgfCzLw9Ts2PE0GYqWaaOrM9EfQxvTI fritz.nhr.fau.de
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABgQDLCrYuIIKhA+F4hktnR4VKAZ44J6CWIMfC9mCei80YZ0294kFwpYQg3RYRLHSyL6XqLgxaZN2kFm0V0NpUpEYSP2V9eWpuyfeB6a3M1I8yy4rDagHgWuHYuX2fSm8uwfndnJJ6hV1xfZuoZrZJIMkdy8qBl4y1cxn8G6CS1KEFkeJp7wMuIdIruFbJa5eQXVgAxaqKPQYRldpK8c1OAByfQv9cBXF53cNZhtlwkUes6/PqNyU1aIodfahdYh6mxn/4Rzy+NMD0YS066P8xWP1n+bsTBpZ51pH7qTIiW1yRKFmAeFvkWVnS6N5qIwJnzB3J7DRUue9h1EhW4HCo6CEX3GCOt0kuV4ax0JgYO/Lz0cUTdDcgkWOpVtQ+WyLNUech+TsOREn19QjaK9QRriBOvBcNCnbpBXHZSqfOYGB6uggkVjyDPI6S5pclt544ie6pklAOSzrha5CLnzD4U8oVuqhFHteO39qXpbvxkUDuNsDf9t8K5fmWgCXXWtJSVhE= fritz.nhr.fau.de

fingerprints of the SSH public host keys of fritz.nhr.fau.de (as of 11/2021)

1024 SHA256:1jZOatgzkjn7G3b1/K48T+cQ8MUB60oU7j+CK5FWId8 fritz.nhr.fau.de (DSA)
256  SHA256:2NzGq1vTO//RNtVLEGrp5yMyPHmtAw6nguSxcFUHHWU fritz.nhr.fau.de (ECDSA)
256  SHA256:5km4SnsTbyBG6gX1y11imEEU8QKP8EbrqFNPce1eEU4 fritz.nhr.fau.de (ED25519)
3072 SHA256:p7+HzbUyVSjYh2hx2nQcrIuWnZZKJhnNUoI9kz+Q4yw fritz.nhr.fau.de (RSA)

SSH public host keys of cshpc.rrze.fau.de (as of 11/2021)

ssh-dss AAAAB3NzaC1kc3MAAACBAO2L8+7bhJm7OvvJMcdGSJ5/EaxvX5RRzE9RrB8fx5H69ObkqC6Baope4rOS9/+2gtnm8Q3gZ5QkostCiKT/Wex0kQQUmKn3fx6bmtExLq8YwqoRXRmNTjBIuyZuZH9w/XFK36MP63p/8h7KZXvkAzSRmNVKWzlsAg5AcTpLSs3ZAAAAFQCD0574+lRlF0WONMSuWeQDRFM4vwAAAIEAz1nRhBHZY+bFMZKMjuRnVzEddOWB/3iWEpJyOuyQWDEWYhAOEjB2hAId5Qsf+bNhscAyeKgJRNwn2KQMA2kX3O2zcfSdpSAGEgtTONX93XKkfh6JseTiFWos9Glyd04jlWzMbwjdpWvwlZjmvPI3ATsv7bcwHji3uA75PznVUikAAACBANjcvCxlW1Rjo92s7KwpismWfcpVqY7n5LxHfKRVqhr7vg/TIhs+rAK1XF/AWxyn8MHt0qlWxnEkbBoKIO5EFTvxCpHUR4TcHCx/Xkmtgeq5jWZ3Ja2bGBC3b47bHHNdDJLU2ttXysWorTXCoSYH82jr7kgP5EV+nPgwDhIMscpk cshpc.rrze.fau.de
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNVzp97t3CxlHtUiJ5ULqc/KLLH+Zw85RhmyZqCGXwxBroT+iK1Quo1jmG6kCgjeIMit9xQAHWjS/rxrlI10GIw= cshpc.rrze.fau.de
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPSIFF3lv2wTa2IQqmLZs+5Onz1DEug8krSrWM3aCDRU cshpc.rrze.fau.de
1024 35 135989634870042614980757742097308821255254102542653975453162649702179684202242220882431712465065778248253859082063925854525619976733650686605102826383502107993967196649405937335020370409719760342694143074628619457902426899384188195801203193251135968431827547590638365453993548743041030790174687920459410070371 cshpc.rrze.fau.de
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAs0wFVn1PN3DGcUtd/JHsa6s1DFOAu+Djc1ARQklFSYmxdx5GNQMvS2+SZFFa5Rcw+foAP9Ks46hWLo9mOjTV9AwJdOcSu/YWAhh+TUOLMNowpAEKj1i7L1Iz9M1yrUQsXcqDscwepB9TSSO0pSJAyrbuGMY7cK8m6//2mf7WSxc= cshpc.rrze.fau.de

fingerprints of the SSH public host keys of cshpc.rrze.fau.de (as of 11/2021)

1024 SHA256:A82eA7py46zE/TrSTCRYnJSW7LZXY16oOBxstJF3jxU cshpc.rrze.fau.de (DSA)
256  SHA256:wFaDywle3yJvygQ4ZAPDsi/iSBTaF6Uoo0i0z727aJU cshpc.rrze.fau.de (ECDSA)
256  SHA256:is52MRsxMgxHFn58o0ZUh8vCzIuE2gYanmhrxdy0rC4 cshpc.rrze.fau.de (ED25519)
1024 SHA256:Za1mKhTRFDXUwn7nhPsWc7py9a6OHqS2jin01LJC3ro cshpc.rrze.fau.de (RSA)

While it is possible to ssh directly to a compute node, users are only allowed to do this while they have a batch job running there. When all batch jobs of a user on a node have ended, all of their processes, including any open shells, will be killed automatically.
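
For illustration, you can look up the nodes of your running jobs with standard Slurm commands and then ssh to one of them (the node name below is a placeholder):

$ squeue -u $USER     # lists your jobs and the nodes they run on
$ ssh <nodename>      # only possible while your job is running on that node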

Software environment

The login and compute nodes run AlmaLinux 8 (which is basically Red Hat Enterprise Linux 8 without the support).

The login shell for all users on Fritz is always bash and cannot be changed.

As on many other HPC systems, environment modules are used to facilitate access to software packages. Type “module avail” to get a list of available packages. Even more packages will become visible once one of the 000-all-spack-pkgs modules has been loaded. Most of the software is installed using “Spack“ as an enhanced HPC package manager.
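
A typical sequence could look like this (the package name and version are placeholders; the exact name of the 000-all-spack-pkgs module may carry a version suffix):

$ module avail                      # list the directly visible packages
$ module load 000-all-spack-pkgs    # make the additional Spack-installed packages visible
$ module load <package>/<version>   # load the package you need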

General notes on how to use certain software on our systems (including, in some cases, sample job scripts) can be found on the Special applications, and tips & tricks pages. Specific notes on how some of the software provided via modules on the Fritz cluster has been compiled can be found in the following sections:

Intel tools (compiler, MPI, MKL, TBB)

Intel One API is installed in the “Free User” edition via Spack.

The modules intel (and the Spack-internal intel-oneapi-compilers) provide the legacy Intel compilers icc, icpc, and ifort as well as the new LLVM-based ones (icx, icpx, dpcpp, ifx).

Recommended compiler flags are: -O3 -xHost

If you want to enable full-width AVX512 SIMD support, you additionally have to set the flag -qopt-zmm-usage=high.

The modules intelmpi (and the Spack-internal intel-oneapi-mpi) provide Intel MPI. To use the legacy Intel compilers with Intel MPI, just use the appropriate wrappers with the Intel compiler names, i.e. mpiicc, mpiicpc, mpiifort. To use the new LLVM-based Intel compilers with Intel MPI, you have to specify them explicitly, i.e. use mpiicc -cc=icx, mpiicpc -cxx=icpx, or mpiifort -fc=ifx. Running mpicc, mpicxx, or mpif90 results in using the GNU compilers.
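
For illustration (the source file and binary names are placeholders), compiling an MPI code with the recommended flags could look like this:

$ module load intel intelmpi
$ mpiicc -O3 -xHost -o app_legacy app.c          # legacy Intel compiler (icc) via the wrapper
$ mpiicc -cc=icx -O3 -xHost -o app_llvm app.c    # new LLVM-based compiler (icx) via the wrapper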

The modules mkl and tbb (and the Spack-internal intel-oneapi-mkl and intel-oneapi-tbb) provide Intel MKL and TBB. Use Intel’s MKL link line advisor to figure out the appropriate command line for linking with MKL. The Intel MKL also includes drop-in wrappers for FFTW3.

Further Intel tools may be added in the future.

The Intel modules on Fritz, Alex, and the Slurm-based TinyGPU/TinyFat behave differently than on the older RRZE systems: (1) The intel64 module has been renamed to intel and no longer automatically loads intel-mpi and mkl. (2) intel-mpi/VERSION-intel and intel-mpi/VERSION-gcc have been unified into intel-mpi/VERSION. The compiler is selected by the wrapper name, e.g. mpicc = GCC, mpiicc = Intel; mpif90 = GFortran, mpiifort = Intel.

GNU compiler (gcc/g++/gfortran)

The GNU compilers are available in the version coming with the operating system (currently 8.5.0) as well as via modules (currently versions 9.4, 10.3, and 11.2).

Recommended compiler flags are: -O3 -march=native

If you want to enable full-width AVX512 SIMD support, you additionally have to set the flag -mprefer-vector-width=512.
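
For illustration (assuming the compiler modules are simply called gcc; the source file name is a placeholder):

$ module load gcc/11.2.0
$ gcc -O3 -march=native -mprefer-vector-width=512 -o app app.c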

Open MPI

Open MPI is the default MPI for the Fritz cluster. Usage of srun instead of mpirun is recommended.

Open MPI is built using Spack:

  • with the compiler mentioned in the module name; the corresponding compiler will be loaded as a dependency when the Open MPI module is loaded
  • without support for thread-multiple
  • with fabrics=ucx
  • with support for Slurm as scheduler (and internal PMIx of Open MPI)

Python and conda environments

Do not rely on the Python installation from the operating system. Use our python modules instead. These installations will be updated in place from time to time. We can add further packages from the Miniconda distribution as needed.

You can modify the Python environment as follows:

Set the location where pip and conda install packages to $WORK; see Python and Jupyter for details. By default, packages will be installed in $HOME, which has limited capacity. A sketch of such a setup is shown below.
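
A minimal sketch, with illustrative paths under $WORK (the officially recommended settings are described on the Python and Jupyter page):

$ export PYTHONUSERBASE=$WORK/software/python            # target for "pip install --user"
$ conda config --add pkgs_dirs $WORK/software/conda/pkgs
$ conda config --add envs_dirs $WORK/software/conda/envs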

Extend the base environment
$ pip install --user <packages>

Create a new one of your own
$ conda create -n <environment_name> <packages>

Clone and modify this environment

$ conda create --name myclone --clone base
$ conda install --name myclone new_package

See also https://docs.conda.io/projects/conda/en/latest/user-guide/getting-started.html.

GDB - GNU Project debugger

When using gdb -p <pid> (or the equivalent attach <pid> command in gdb) to attach to a process running in a Slurm job, you might encounter errors or warnings about executable and library files that cannot be opened. Such issues will also prevent symbols from being resolved correctly, making debugging really difficult.

The reason this happens is that processes in a Slurm job get a slightly different view of the file system mounts (using a so-called namespace). If you attach GDB to a running process after logging into the node via SSH, the gdb process will not be in the same namespace, so GDB has trouble directly accessing the binary (and its libraries) you are trying to debug.

The workaround is to use a slightly different method for attaching to the process:

  1. $ gdb <executable>
  2. (gdb) set sysroot /
  3. (gdb) attach <pid>

(Thanks to our colleagues at SURFsara for figuring this out!)

Arm DDT

Arm DDT is a powerful parallel debugger. NHR@FAU holds a license for 32 processes.

Amber

NHR@FAU holds a “compute center license” of Amber; thus, Amber is generally available to everyone for non-profit use, i.e. for academic research.

Amber usually delivers the most economic performance using GPGPUs. Thus, the Alex GPGPU cluster might be a better choice.

Gromacs

We provide Gromacs versions without and with PLUMED. Gromacs (and PLUMED) are built using Spack.

Gromacs often delivers the most economic performance if GPGPUs are used. Thus the Alex GPGPU cluster might be a better choice.

If running on Fritz, it is mandatory in most cases to optimize the number of PME processes experimentally. “pme_tune” REQUIRES FURTHER WORK AS A NON-MPI BINARY HAS TO BE USED

TODO: How to exactly run gmx pme_tune …

Do not start gmx mdrun with the option -v. The verbose output will only create extra-large Slurm stdout files, and your jobs will suffer if the NFS servers have high load. There is also very limited benefit in following the stdout all the time when the job is simply expected to run for the specified number of steps.
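
As a rough sketch only (the MPI binary is assumed to be called gmx_mpi; the .tpr file name and the number of separate PME ranks are placeholders, see the TODO above for proper PME tuning):

srun gmx_mpi mdrun -s topol.tpr -maxh 24 -npme <number_of_pme_ranks>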

LAMMPS

The module lammps/20211027-gcc11.2.0-ompi-mkl has been compiled with GCC 11.2.0, Open MPI 4.1.1, and Intel oneAPI MKL, using the following options:
  • -DBUILD_SHARED_LIBS:BOOL=ON -DLAMMPS_EXCEPTIONS:BOOL=OFF -DBUILD_MPI=ON -DBUILD_OMP:BOOL=ON -DPKG_OPENMP=ON -DPKG_GPU=OFF -DBUILD_LIB=ON -DWITH_JPEG:BOOL=ON -DWITH_PNG:BOOL=ON -DWITH_FFMPEG:BOOL=ON -DPKG_ASPHERE=ON -DPKG_BODY=ON -DPKG_CLASS2=ON -DPKG_COLLOID=ON -DPKG_COMPRESS=ON -DPKG_CORESHELL=ON -DPKG_DIPOLE=ON -DPKG_GRANULAR=ON -DPKG_KSPACE=ON -DPKG_KOKKOS=ON -DPKG_LATTE=ON -DPKG_MANYBODY=ON -DPKG_MC=ON -DPKG_MEAM=OFF -DPKG_MISC=ON -DPKG_MLIAP=OFF -DPKG_MOLECULE=ON -DPKG_MPIIO=ON -DPKG_OPT=OFF -DPKG_PERI=ON -DPKG_POEMS=ON -DPKG_PYTHON=ON -DPKG_QEQ=ON -DPKG_REPLICA=ON -DPKG_RIGID=ON -DPKG_SHOCK=ON -DPKG_SNAP=ON -DPKG_SPIN=ON -DPKG_SRD=ON -DPKG_USER-ATC=ON -DPKG_USER-ADIOS=OFF -DPKG_USER-AWPMD=OFF -DPKG_USER-BOCS=OFF -DPKG_USER-CGSDK=OFF -DPKG_USER-COLVARS=OFF -DPKG_USER-DIFFRACTION=OFF -DPKG_USER-DPD=OFF -DPKG_USER-DRUDE=OFF -DPKG_USER-EFF=OFF -DPKG_USER-FEP=OFF -DPKG_USER-H5MD=ON -DPKG_USER-LB=ON -DPKG_USER-MANIFOLD=OFF -DPKG_USER-MEAMC=ON -DPKG_USER-MESODPD=OFF -DPKG_USER-MESONT=OFF -DPKG_USER-MGPT=OFF -DPKG_USER-MISC=ON -DPKG_USER-MOFFF=OFF -DPKG_USER-NETCDF=ON -DPKG_USER-OMP=ON -DPKG_USER-PHONON=OFF -DPKG_USER-PLUMED=OFF -DPKG_USER-PTM=OFF -DPKG_USER-QTB=OFF -DPKG_USER-REACTION=OFF -DPKG_USER-REAXC=ON -DPKG_USER-SDPD=OFF -DPKG_USER-SMD=OFF -DPKG_USER-SMTBQ=OFF -DPKG_USER-SPH=OFF -DPKG_USER-TALLY=OFF -DPKG_USER-UEF=OFF -DPKG_USER-YAFF=OFF -DPKG_VORONOI=ON -DPKG_KIM=ON -DFFT=MKL -DEXTERNAL_KOKKOS=ON

NAMD

NAMD comes with a license that prevents us from simply installing it for everyone to use. We therefore need individual users to print and sign the NAMD license. Subsequently, we will set the permissions accordingly.

TODO – no module yet

At the moment, we provide the official pre-built Linux-x86_64-multicore binary.

VASP

VASP comes with a license that prevents us from simply installing it for everyone to use. We have to check each VASP user individually.

At the moment we provide VASP 5.4.x and VASP 6.3.x modules to eligible users.

The module vasp6/6.3.0-hybrid-intel-impi-AVX2-with-addons includes DFTD4, libbeef, and sol_compat/VASPsol.

Feel free to compile software yourself in the versions and with the options you need. This is perfectly fine, but support for self-installed software cannot be granted. We can only provide software centrally if it is of importance for multiple groups. If you want to use Spack for compiling additional software, you can load our user-spack module to make use of the packages we have already built with Spack (if the concretization matches) instead of starting from scratch. Once user-spack is loaded, the command spack will be available (as an alias), you will inherit the presets we defined for certain packages (e.g. Open MPI to work with Slurm), but you will install everything into your own directories ($WORK/USER-SPACK). A short example is given below.
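
A minimal sketch of this workflow (the package name is a placeholder):

$ module load user-spack
$ spack spec <package>        # check how the package would be concretized
$ spack install <package>     # installs into your own $WORK/USER-SPACK directories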

You can also bring your own environment in a container using Singularity (nowadays called Apptainer). However, building Singularity containers on the HPC systems themselves is not supported (as that would require root access). The Infiniband drivers from the host are not mounted into your container. All filesystems will also be available by default in the container. In certain use cases it might be a good idea to avoid bind-mounting your normal $HOME directory with all its “dot directories” into the container by explicitly specifying a different home directory, e.g. -H $HOME/my-container-home; an example is shown below.
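
For illustration (the container image name is a placeholder), running a command in your own container with a separate container home could look like this:

$ mkdir -p $HOME/my-container-home
$ singularity exec -H $HOME/my-container-home my-container.sif ./my_application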

File Systems

The following table summarizes the available file systems and their features. It is only an excerpt from the description of the HPC file system.

File system overview for the Fritz cluster
  • /home/hpc, accessed via $HOME: storage of source code, input files, and important results; NFS on central servers (small); backup: yes, plus snapshots; data lifetime: account lifetime; quota: yes (restrictive)
  • /home/vault, accessed via $HPCVAULT: medium- to long-term high-quality storage; central servers; backup: yes, plus snapshots; data lifetime: account lifetime; quota: yes
  • /home/{woody, saturn, titan, janus, atuin}, accessed via $WORK: short- to medium-term storage of small files; central NFS servers; backup: no; data lifetime: account lifetime; quota: yes
  • /lustre, accessed via $FASTTMP: high-performance parallel I/O, short-term storage; Lustre-based parallel file system, 3.5 PB; backup: no; data lifetime: high watermark deletion; quota: no

NFS file system $HOME

When connecting to one of the front end nodes, you’ll find yourself in your regular HPC $HOME directory (/home/hpc/...). There are relatively tight quotas there, so it will most probably be too small for the inputs/outputs of your jobs. It, however, does offer a lot of nice features, like fine-grained snapshots, so use it for “important” stuff, e.g. your job scripts, or the source code of the program you’re working on. See the HPC file system page for a more detailed description of the features.

Quota in $HOME is very limited as snapshots are made every 30 minutes. Put simulation data in $WORK! Do not rely on the specific path of $WORK, as it may change over time when your work directory is relocated to a different NFS server.

Parallel file system $FASTTMP

The cluster’s parallel file system is mounted on all nodes under /lustre/$GROUP/$USER/ and available via the $FASTTMP environment variable. It supports parallel I/O using the MPI-I/O functions and can be accessed with an aggregate bandwidth of > 20 GB/s.

The parallel file system is strictly intended to be a high-performance short-term storage, so a high watermark deletion algorithm is employed: When the filling of the file system exceeds a certain limit (e.g. 80%), files will be deleted starting with the oldest and largest files until a filling of less than 60% is reached.

Note that parallel filesystems generally are not made for handling large numbers of small files. This is by design: parallel filesystems achieve their amazing speed by writing to multiple different servers at the same time. However, they do that in blocks, in our case 1 MB. That means that for a file smaller than 1 MB, only one server will ever be used, so the parallel filesystem can never be faster than a traditional NFS server; on the contrary, due to the larger overhead it will generally be slower. Parallel filesystems can only show their strengths with files that are at least a few megabytes in size, and they excel when very large files are written by many nodes simultaneously (e.g. checkpointing).
For that reason, we have set a limit on the number of files you can store there.

Batch processing

As with all production clusters at RRZE, resources are controlled through a batch system. The front ends can be used for compiling and very short serial test runs, but everything else has to go through the batch system to the cluster.

Fritz uses SLURM as a batch system. Please see our general batch system description for further details.

The granularity of batch allocations is complete nodes, i.e. nodes are never shared. As a parallel computer, Fritz is not intended for single-node jobs, since a lot of money was spent on the fast HDR100 interconnect.

Partitions on the Fritz cluster (preliminary definition)
  • singlenode: walltime 0 – 24:00:00; 1 node; always available; jobs run on nodes without Infiniband; nodes are exclusive
  • multinode: walltime 0 – 24:00:00; 1 – 32 nodes; available on demand; jobs run on nodes with Infiniband; nodes are exclusive

The partition configuration is still subject to change for final production use.

Interactive job (single-node)

Interactive jobs can be requested by using salloc instead of sbatch and specifying the respective options on the command line.

The following will give you an interactive shell on one node for one hour:

salloc -N 1 --partition=singlenode --time=01:00:00

Settings from the calling shell (e.g. loaded module paths) will be inherited by the interactive job!

Interactive job (multi-node)

Interactive jobs can be requested by using salloc instead of sbatch and specifying the respective options on the command line.

The following will give you four nodes with an interactive shell on the first node for one hour:

salloc -N 4 --partition=multinode --time=01:00:00

Settings from the calling shell (e.g. loaded module paths) will be inherited by the interactive job!

MPI parallel job (single-node)

In this example, the executable will be run on one node, using 72 MPI processes, i.e. one per physical core.

#!/bin/bash -l
#SBATCH --nodes=1 
#SBATCH --ntasks-per-node=72
#SBATCH --partition=singlenode
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV
module load XXX 

srun ./mpi_application

OpenMP job (single-node)

In this example, the executable will be run using 72 OpenMP threads (i.e. one per physical core) for a total job walltime of 1 hour.

For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved by the following environment variables: OMP_PLACES=cores, OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=72
#SBATCH --partition=singlenode
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV 
module load XXX 

# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK 
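# additionally pin the OpenMP threads to physical cores, as recommended above
export OMP_PLACES=cores
export OMP_PROC_BIND=true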
./openmp_application

Hybrid OpenMP/MPI job (single-node)

In this example, the executable will be run using 2 MPI processes with 36 OpenMP threads each (i.e. one per physical core) for a total job walltime of 1 hour.

For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved by the following environment variables: OMP_PLACES=cores, OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.

#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=36
#SBATCH --partition=singlenode
#SBATCH --time=1:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV 
module load XXX 

# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK 
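# additionally pin the OpenMP threads to physical cores, as recommended above
export OMP_PLACES=cores
export OMP_PROC_BIND=true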
srun ./hybrid_application

MPI parallel job (multi-node)

In this example, the executable will be run on four nodes, using 72 MPI processes per node, i.e. one per physical core.

#!/bin/bash -l
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=72
#SBATCH --partition=multinode
#SBATCH --time=1:0:0
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV 
module load XXX 

srun ./mpi_application

Hybrid OpenMP/MPI job (multi-node)

In this example, the executable will be run on four nodes with 2 MPI processes per node and 36 OpenMP threads each (i.e. one per physical core) for a total job walltime of 1 hour.

For more efficient computation, OpenMP threads should be pinned to the compute cores. This can be achieved by the following environment variables: OMP_PLACES=cores, OMP_PROC_BIND=true. For more information, see e.g. the HPC Wiki.

#!/bin/bash -l
#SBATCH --partition=multinode
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=36
#SBATCH --time=01:00:00
#SBATCH --export=NONE

unset SLURM_EXPORT_ENV 
module load XXX 

# set number of threads to requested cpus-per-task
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK 
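# additionally pin the OpenMP threads to physical cores, as recommended above
export OMP_PLACES=cores
export OMP_PROC_BIND=true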
srun ./hybrid_application

Further Information

Intel Xeon Platinum 8360Y “IceLake” Processor

Hyperthreading (SMT) is disabled; sub-NUMA clustering (Cluster-on-Die, CoD) is activated. This results in 4 NUMA domains with 18 cores each per compute node.

The processor can be operated in 3 modes; in Fritz it’s running in its default mode with 36 cores and 250 W TDP.

Launch Date: Q2’21
Lithography: 10 nm
Total Cores (Threads): 36 (72 – SMT is disabled on Fritz)
Max Turbo Frequency (non-AVX code): 3.50 GHz (significantly lower for heavy AVX2/AVX512 workloads)
Processor Base Frequency (non-AVX code): 2.40 GHz (significantly lower for heavy AVX2/AVX512 workloads)
Last level cache (L3): 54 MB
# of UPI Links: 3
TDP: 250 W
Memory Channels & Memory Type: 8 channels DDR4 @ 3200 per socket (in Fritz: 16x 16 GB DDR4-3200 per node)
Instruction Set Extensions: Intel SSE4.2, Intel AVX, Intel AVX2, Intel AVX-512
# of AVX-512 FMA Units: 2

See https://ark.intel.com/content/www/us/en/ark/products/212459/intel-xeon-platinum-8360y-processor-54m-cache-2-40-ghz.html for full processor details.

Network topology

Fritz uses unmanaged 40-port Mellanox HDR switches. 8 HDR200 links per edge switch are connected to the spine level. Using splitter cables, 64 compute nodes are connected with HDR100 to each edge switch. This results in a 1:4 blocking fat tree. Each island of 64 nodes has a total of 4,608 cores. Slurm is aware of the topology, but minimizing the number of switches per job does not have a high priority.

Direct liquid cooling (DLC) of the compute nodes
