Fritz parallel cluster (NHR+Tier3)

FAU’s Fritz cluster (system integrator: Megware) is a high-performance compute resource with a high-speed interconnect, i.e., a parallel computer. It is intended for multi-node parallel workloads. Fritz serves both as FAU’s basic Tier3 resource and as an NHR project resource.

  • 4 front end nodes with the same CPUs as the compute nodes but 512 GB of RAM, and 100 GbE connection to RRZE’s network backbone.
  • 944 compute nodes with direct liquid cooling (DLC), each with two Intel Xeon Platinum 8360Y “IceLake” processors (36 cores per chip, base frequency 2.4 GHz, 54 MB shared L3 cache per chip) and 256 GB of DDR4 RAM.
  • Lustre-based parallel filesystem with a capacity of about 3 PB and an aggregated parallel I/O bandwidth of > 20 GB/s.
  • Blocking HDR100 InfiniBand with up to 100 GBit/s bandwidth per link and direction. There are islands of 64 nodes (i.e. 4,608 cores). The blocking factor between islands is 1:4.
  • Measured LINPACK performance of ### TFlop/s.

The name “Fritz” is a play on the name of FAU’s founder Friedrich, Margrave of Brandenburg-Bayreuth (1711-1763).

Fritz is not (fully) delivered yet. It is not yet ready for use!

All documentation is preliminary and subject to change.

This website shows information regarding the following topics:

Access, User Environment, and File Systems

Access to the machine

Note that access to Fritz is not yet open. If you want to be among the first to get access to Fritz once early operation starts, you will need to provide a short description of what you want to do there.

Users can connect to fritz.nhr.fau.de (keep the “nhr” instead of “rrze” in mind!) and will be randomly routed to one of the four front ends. All systems in the cluster, including the front ends, have private IPv4 addresses in the 10.28.64.0/20 range and IPv6 addresses in the 2001:638:a000:3964::/64 range. They can normally only be accessed directly from within the FAU networks. There is one exception: if your internet connection supports IPv6, you can ssh directly to the front ends (but not to the compute nodes). Otherwise, if you need access from outside of FAU, you usually have to connect to the dialog server cshpc.rrze.fau.de first and then ssh to fritz.nhr.fau.de from there.
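For example, with OpenSSH the hop through cshpc.rrze.fau.de can be automated using the ProxyJump option; this is only a sketch, and the account name is a placeholder:

    # direct login (works via IPv6 or from within the FAU networks)
    ssh <HPC-account>@fritz.nhr.fau.de

    # from outside FAU over IPv4: jump via the dialog server
    ssh -J <HPC-account>@cshpc.rrze.fau.de <HPC-account>@fritz.nhr.fau.de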

SSH public host keys of fritz.nhr.fau.de (as of 11/2021)
ssh-dss ### fritz.nhr.fau.de
ecdsa-sha2-nistp256 ### fritz.nhr.fau.de
ssh-ed25519 ### fritz.nhr.fau.de
ssh-rsa ### fritz.nhr.fau.de

fingerprints of the SSH public host keys of fritz.nhr.fau.de (as of 11/2021)

1024 SHA256:### fritz.nhr.fau.de (DSA)
256 SHA256:### fritz.nhr.fau.de (ECDSA)
256 SHA256:### fritz.nhr.fau.de (ED25519)
3072 SHA256:### fritz.nhr.fau.de (RSA)

SSH public host keys of cshpc.rrze.fau.de (as of 11/2021)

ssh-dss AAAAB3NzaC1kc3MAAACBAO2L8+7bhJm7OvvJMcdGSJ5/EaxvX5RRzE9RrB8fx5H69ObkqC6Baope4rOS9/+2gtnm8Q3gZ5QkostCiKT/Wex0kQQUmKn3fx6bmtExLq8YwqoRXRmNTjBIuyZuZH9w/XFK36MP63p/8h7KZXvkAzSRmNVKWzlsAg5AcTpLSs3ZAAAAFQCD0574+lRlF0WONMSuWeQDRFM4vwAAAIEAz1nRhBHZY+bFMZKMjuRnVzEddOWB/3iWEpJyOuyQWDEWYhAOEjB2hAId5Qsf+bNhscAyeKgJRNwn2KQMA2kX3O2zcfSdpSAGEgtTONX93XKkfh6JseTiFWos9Glyd04jlWzMbwjdpWvwlZjmvPI3ATsv7bcwHji3uA75PznVUikAAACBANjcvCxlW1Rjo92s7KwpismWfcpVqY7n5LxHfKRVqhr7vg/TIhs+rAK1XF/AWxyn8MHt0qlWxnEkbBoKIO5EFTvxCpHUR4TcHCx/Xkmtgeq5jWZ3Ja2bGBC3b47bHHNdDJLU2ttXysWorTXCoSYH82jr7kgP5EV+nPgwDhIMscpk cshpc.rrze.fau.de
ecdsa-sha2-nistp256 AAAAE2VjZHNhLXNoYTItbmlzdHAyNTYAAAAIbmlzdHAyNTYAAABBBNVzp97t3CxlHtUiJ5ULqc/KLLH+Zw85RhmyZqCGXwxBroT+iK1Quo1jmG6kCgjeIMit9xQAHWjS/rxrlI10GIw= cshpc.rrze.fau.de
ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIPSIFF3lv2wTa2IQqmLZs+5Onz1DEug8krSrWM3aCDRU cshpc.rrze.fau.de
1024 35 135989634870042614980757742097308821255254102542653975453162649702179684202242220882431712465065778248253859082063925854525619976733650686605102826383502107993967196649405937335020370409719760342694143074628619457902426899384188195801203193251135968431827547590638365453993548743041030790174687920459410070371 cshpc.rrze.fau.de
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAs0wFVn1PN3DGcUtd/JHsa6s1DFOAu+Djc1ARQklFSYmxdx5GNQMvS2+SZFFa5Rcw+foAP9Ks46hWLo9mOjTV9AwJdOcSu/YWAhh+TUOLMNowpAEKj1i7L1Iz9M1yrUQsXcqDscwepB9TSSO0pSJAyrbuGMY7cK8m6//2mf7WSxc= cshpc.rrze.fau.de

fingerprints of the SSH public host keys of cshpc.rrze.fau.de (as of 11/2021)

1024 SHA256:A82eA7py46zE/TrSTCRYnJSW7LZXY16oOBxstJF3jxU root@wtest05 (DSA)
256 SHA256:wFaDywle3yJvygQ4ZAPDsi/iSBTaF6Uoo0i0z727aJU root@cshpc (ECDSA)
256 SHA256:is52MRsxMgxHFn58o0ZUh8vCzIuE2gYanmhrxdy0rC4 root@cshpc (ED25519)
1024 SHA256:Za1mKhTRFDXUwn7nhPsWc7py9a6OHqS2jin01LJC3ro root@wtest05 (RSA)

While it is possible to ssh directly to a compute node, users are only allowed to do this while they have a batch job running there. When all batch jobs of a user on a node have ended, all of their processes, including any open shells, will be killed automatically.
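For instance, you can look up the nodes of one of your running jobs with squeue and then ssh to one of them; the node name below is only a placeholder:

    squeue -u $USER    # shows your running jobs and their node lists (NODELIST column)
    ssh <node-name>    # only works while your job is running on that node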

Software environment

The login and compute nodes run AlmaLinux 8 (which is basically Red Hat Enterprise Linux 8 without the commercial support).

The login shell for all users on Fritz is always bash and cannot be changed.

As on many other HPC systems, environment modules are used to facilitate access to software packages. Type “module avail” to get a list of available packages. Even more packages become visible once one of the 000-all-spack-pkgs modules has been loaded. Most of the software is installed using Spack as an enhanced HPC package manager.
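A short sketch of a typical session; the exact module names and versions will differ on the installed system:

    module avail                     # list the currently visible modules
    module load 000-all-spack-pkgs   # pick one of the 000-all-spack-pkgs modules actually offered
    module avail                     # now shows the extended list of Spack-built packages
    module load <package>            # placeholder: load one of the application modules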

General notes on how to use certain software on our systems (including, in some cases, sample job scripts) can be found on the Special applications, and tips & tricks pages. Specific notes on how some of the software provided via modules on the Fritz cluster has been compiled can be found in the following accordion:

Intel oneAPI is installed in the “Free User” edition via Spack.

The modules intel and intel-oneapi-compilers provide the legacy Intel compilers icc, icpc, and ifort as well as the new LLVM-based ones (icx, icpx, dpcpp, ifx).

The modules intelmpi and intel-oneapi-mpi provide Intel MPI. To use the legacy Intel compilers with Intel MPI, just use the wrappers with the Intel compiler names, i.e. mpiicc, mpiicpc, and mpiifort. To use the new LLVM-based Intel compilers with Intel MPI, you have to specify them explicitly, i.e. use mpiicc -cc=icx, mpiicpc -cxx=icpx, or mpiifort -fc=ifx. The wrappers mpicc, mpicxx, and mpif90 use the GNU compilers.
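As an illustration, assuming the corresponding modules are loaded and a source file hello.c (the file name is a placeholder):

    module load intel intelmpi           # or: intel-oneapi-compilers intel-oneapi-mpi
    mpiicc -O2 -o hello hello.c          # legacy Intel compiler (icc) via the Intel MPI wrapper
    mpiicc -cc=icx -O2 -o hello hello.c  # new LLVM-based Intel compiler (icx)
    mpicc -O2 -o hello hello.c           # GNU compiler (gcc)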

The modules mkl, tbb, intel-oneapi-mkl, and intel-oneapi-tbb provide Intel MKL and TBB. Use Intel’s MKL link line advisor to figure out the appropriate command line for linking with MKL. The Intel MKL also includes drop-in wrappers for FFTW3.
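Just as an illustration of what the link line advisor produces, a sequential, LP64, dynamically linked build with the legacy Intel C compiler might look like the line below; verify the exact combination of interface, threading, and MPI support for your case with the advisor (MKLROOT is usually set by the mkl module):

    icc myprog.c -o myprog -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl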

Further Intel tools may be added in the future.

Open MPI is the default MPI for the Fritz cluster (TO BE CONFIRMED). Using srun instead of mpirun is recommended (TO BE CONFIRMED).

Open MPI is built using Spack:

  • with the compiler mentioned in the module name; the corresponding compiler will be loaded as a dependency when the Open MPI module is loaded
  • without support for thread-multiple
  • with fabrics=ofi
  • with support for Slurm as scheduler (and internal PMIx of Open MPI)
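A minimal job script sketch for launching an MPI binary with srun, assuming Slurm and one of these Open MPI modules; module name, node count, time limit, and binary are placeholders:

    #!/bin/bash -l
    #SBATCH --nodes=4
    #SBATCH --ntasks-per-node=72
    #SBATCH --time=01:00:00

    module load openmpi      # placeholder: pick the Open MPI module matching your compiler
    srun ./my_mpi_binary     # srun instead of mpirun, as recommended above (to be confirmed)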

TBD

Amber is currently only available to eligible groups. We’ll upgrade to a compute center license in 2022 to make Amber generally available.

Amber usually delivers the most economic performance using GPGPUs. Thus, the Alex GPGPU cluster might be a better choice.

We provide Gromacs versions without and with PLUMED. Gromacs (and PLUMED) are built using Spack.

Gromacs often delivers the most economic performance if GPGPUs are used. Thus the Alex GPGPU cluster might be a better choice.

If running on Fritz, it is mandatory to optimize the number of PME processes experimentally.
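A hedged sketch of such an experiment, scanning a few counts of dedicated PME ranks with mdrun’s -npme option; the binary name, input file, and rank counts are placeholders:

    for npme in 8 16 24; do
        srun gmx_mpi mdrun -s topol.tpr -npme ${npme} -deffnm scan_npme_${npme} -maxh 0.25
    done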

TODO – no module yet
The modules lammps/20211027* have been compiled using GCC 10.3.0, Intel oneAPI MKL, Open MPI 4.1.1, and with
  • KOKKOS package API: OpenMP Serial; KOKKOS package precision: double
  • Installed packages: ASPHERE BODY CLASS2 COLLOID COMPRESS CORESHELL DIPOLE GPU GRANULAR KIM KOKKOS KSPACE LATTE MANYBODY MC MISC MOLECULE MPIIO PERI POEMS PYTHON QEQ REPLICA RIGID SHOCK SPIN SRD VORONOI
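Since the KOKKOS package is built with the OpenMP and Serial backends, a run could enable it on the command line roughly as follows; the binary name, input file, and thread count are assumptions:

    export OMP_NUM_THREADS=4
    srun lmp -k on t ${OMP_NUM_THREADS} -sf kk -in in.myscript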

NAMD comes with a license that prohibits us from just installing it so that everyone can use it. We therefore need individual users to print and sign the NAMD license. Subsequently, we will set the permissions accordingly.

TODO – no module yet

At the moment, we provide the official pre-built Linux-x86_64-multicore binary.
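A single-node launch with the multicore binary typically pins one worker thread per core, e.g. as sketched below; the input file is a placeholder and 72 matches the core count of one Fritz node:

    namd2 +p72 +setcpuaffinity myconfig.namd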

VASP comes with a license that prohibits us from just installing it so that everyone can use it. We have to check each VASP user individually.

TODO – no module yet

At the moment we provide a VASP 6.2.3 module to eligible users.

Feel free to compile software yourself in the versions and with the options you need. This is perfectly fine, yet support for self-installed software cannot be granted. We can only provide software centrally that is of importance for multiple groups. If you want to use Spack for compiling additional software, you can load our user-spack module to make use of the packages we already built with Spack (if the concretization matches) instead of starting from scratch. Once user-spack is loaded, the command spack will be available (as an alias), you will inherit the presets we defined for certain packages (e.g. Open MPI to work with Slurm), but you’ll install everything into your own directories ($WORK/USER-SPACK).
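A sketch of that workflow; the package name is just an example:

    module load user-spack
    spack spec <package>      # check how the package would be concretized
    spack install <package>   # builds into your own $WORK/USER-SPACK directories
    spack find                # list what you have installed so far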

You can also bring your own environment in a container using Singularity. However, building Singularity containers on the HPC systems themselves is not supported (as that would require root access). The InfiniBand drivers from the host are not mounted into your container. All filesystems will also be available by default in the container. In certain use cases it might be a good idea to avoid bind-mounting your normal $HOME directory with all its “dot directories” into the container by explicitly specifying a different directory, e.g. -H $HOME/my-container-home.
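For example, an interactive run with a separate container home might look like this; the image name is a placeholder:

    mkdir -p $HOME/my-container-home
    singularity exec -H $HOME/my-container-home my-image.sif bash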

File Systems

The following table summarizes the available file systems and their features. It is only an excerpt from the description of the HPC file system.

Further details will follow once Fritz is open for users.

Batch processing

As with all production clusters at RRZE, resources are controlled through a batch system. The front ends can be used for compiling and very short serial test runs, but everything else has to go through the batch system to the cluster.

Fritz uses Slurm as the batch system. Please see our general batch system description for further details.

The granularity of batch allocations is complete nodes, i.e. nodes are never shared. As a parallel computer, Fritz is not made for single-node jobs, since a lot of money was spent on the fast HDR100 interconnect.
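As an illustration of the node-exclusive allocation, an interactive test allocation could look like this; the node count and time limit are placeholders, and no partition is given since the queue configuration is not yet fixed:

    salloc --nodes=2 --ntasks-per-node=72 --time=00:30:00
    srun ./my_mpi_binary      # runs on 2 x 72 = 144 cores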

The following queues are available on this cluster:

Details on the queue configuration will follow once Fritz is open for users.

Further Information

Intel Xeon Platinum 8360Y “IceLake” Processor

Network topology

Direct liquid cooling (DLC) of the compute nodes