This website shows information regarding the following topics:
- File systems
- Advanced Topics
- Further information on HPC storage
A number of file systems are available at RRZE. They differ in available storage size, backup, and intended use. Please consider these properties when looking for a place to store your files. More details on the respective systems are listed below.
There is one simple rule to keep in mind: everything that starts with
/home/ is available throughout the RRZE, which naturally includes all HPC systems. For example,
/home/woody is accessible from all clusters, even though it was originally bought together with the Woody cluster and mainly for use by that cluster.
|Mount point||Access via||Purpose||Size||Backup||Data lifetime||Quota||Remarks|
||Storage of source, input and important results||40 TB||Yes||Account lifetime||Yes (restrictive)|
||Mid- to long-term storage; especially for large files||2 PB||Yes||Account lifetime||Yes|
||general purpose work directory and storage for small files||n/a||NO||Account lifetime||Yes||will point to either
||general purpose work directory and storage for small files (used to be cluster local storage for woody cluster)||88 TB||(No)||Account lifetime||Yes||There may be limited backup, meaning that backup on this filesystem does not run daily and data is only kept in backup for a rather short time.|
|(only defined if you are eligible)||general purpose work directory and storage for small to large files||2x 300 TB||NO||Account lifetime||Yes (group quota)||There is no backup and it is a shareholder-only filesystem, i.e. only groups who paid for the file server have access.|
||High performance parallel I/O; short-term storage; no large ASCII files!||430 TB||NO||High watermark deletion||No; but number of files/directories limited||only available on the emmy cluster|
||High performance parallel I/O; short-term storage; no large ASCII files!||850 TB||NO||High watermark deletion||No; but number of files/directories limited||only available on the meggie cluster|
|often somewhere in||job-specific storage (either located in main memory [RAM disk] or, if available, on a local HDD / SSD)||from some GB to several hundreds of GB||NO||job lifetime||No; but space is of course very limited||it’s always node-local only; see cluster-specific documentation for details, especially concerning size|
The home directories of the HPC users are housed in the HPC storage system. These directories are available under the path
/home/hpc/GROUPNAME/USERNAME on all RRZE HPC systems. The home directory is the directory in which you are placed right after login, and where most programs try to save settings and similar things. When this directory is unavailable, most programs will stop working or show really strange behaviour – which is why we tried to make the system highly redundant.
The home directory is protected by fine-grained snapshots, and additionally by regular backups. It should therefore be used for "important" data, e.g. your job scripts, the source code of the program you’re working on, or unrecoverable input files. Quotas there are comparatively small, so the home directory will most probably be too small for the inputs/outputs of your jobs.
Each user gets a standard quota of 50 Gigabytes for the home. Quota extensions are not possible.
Additional storage is provided in a second part of the HPC storage system called “vault”. Each HPC user has a directory there that is available under the path
/home/vault/GROUPNAME/USERNAME on all RRZE HPC systems.
This filesystem is also protected by regular snapshots and backups, although not as fine-grained as on $HOME. It is suitable for mid- and long-term storage of files.
The default quota for each user is 500 Gigabytes.
The recommended work directory is
$WORK. Its destination may point to different file servers and file systems:
Despite the name,
$WOODYHOME is available from all HPC systems under the path
/home/woody/GROUPNAME/USERNAME. It is intended as a general purpose work directory and should be used for input/output files and as a storage location for small files.
However, bear in mind that backup on
$WOODYHOME is limited. Backup is not run daily and is also only kept for a short amount of time. Hence, important data should be archived in other locations.
The standard quota for each user is 200 Gigabytes.
Access to this shareholder-only filesystem is only available for eligible users. It is intended as a general purpose work directory for both small and large files. Keep in mind that no backup or snapshots are available here.
The quota for this file system is defined for the whole group, not for the individual user. It is dependent on the respective share the group has paid for. If your group is interested in contributing, please contact HPC Services.
Shareholders can look up information on their group quota on
$SATURNHOME in text files available as
The emmy and meggie clusters each have a local parallel filesystem for high performance short-term storage. Please note that these are entirely separate systems, i.e. you cannot see the files on emmy’s
$FASTTMP in the
$FASTTMP on meggie. They are not available on systems outside of the respective clusters.
The parallel file systems use a high watermark deletion algorithm: when the fill level of the file system exceeds a certain limit (e.g. 70%), files are deleted, starting with the oldest and largest files, until the fill level drops below 60%. Be aware that the normal
tar -x command preserves the modification time of the original file instead of the time when the archive is unpacked. So unpacked files may become one of the first candidates for deletion. Use
tar -mx or
touch in combination with
find to work around this. Be aware that the exact time of deletion is unpredictable.
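The two workarounds mentioned above can be sketched as small shell functions. This is only an illustration: the function names, the archive name and the target directory are made up for the example, and it assumes a tar that supports the -m option (GNU tar and bsdtar do).

```shell
# Unpack an archive and give every extracted file the current time as its
# modification time (-m: do not restore the times stored in the archive).
unpack_fresh() {
    archive=$1
    target=$2
    mkdir -p "$target" && tar -mxf "$archive" -C "$target"
}

# Alternative for files that were already unpacked with a plain "tar -x":
# set the modification time of every regular file below a directory to now.
refresh_mtime() {
    find "$1" -type f -exec touch {} +
}
```

A call could look like `unpack_fresh input.tar "$FASTTMP/myrun"`, where `myrun` is a placeholder directory name. Neither variant protects files from deletion; it only keeps freshly unpacked data from being among the first candidates.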
Note that parallel filesystems generally are not made for handling large amounts of small files or ASCII files. This is by design: Parallel filesystems achieve their amazing speed by writing binary streams to multiple different servers at the same time. However, they do that in blocks, in our case 1 MB. That means that for a file that is smaller than 1 MB, only one server will ever be used, so the parallel filesystem can never be faster than a traditional NFS server – on the contrary: due to larger overhead, it will generally be slower. They can only show their strengths with files that are at least a few megabytes in size, and excel if very large files are written by many nodes simultaneously (e.g. checkpointing).
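To get a rough idea of whether a directory is a good fit for the parallel filesystem, you can count how many of its files are smaller than one block. The following is a minimal sketch; the function name is made up, and it assumes that the 1 MB block size mentioned above means 1048576 bytes.

```shell
# Count regular files below a directory that are smaller than 1048576 bytes,
# i.e. files that fit into a single block and gain nothing from striping.
# "-size -1048576c" means "fewer than 1048576 bytes". Note that "-size -1M"
# rounds sizes up to whole MiB and would therefore match empty files only.
count_small_files() {
    find "$1" -type f -size -1048576c | wc -l
}
```

If most files of a job show up in this count, a traditional work directory is probably the better place for them.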
Snapshots work mostly as the name suggests. At certain intervals, the filesystem takes a “snapshot”, which is an exact read-only copy of the contents of the whole filesystem at one moment in time. In a way, a snapshot is similar to a backup, but with one great restriction: as the “backup” is stored on the exact same filesystem, it offers no protection against disasters – if for some reason the filesystem fails, all snapshots will be gone as well. Snapshots do however provide great protection against user errors, which have always been the number one cause of data loss on the RRZE HPC systems. Users can restore important files that have been deleted or overwritten from an earlier snapshot.
Snapshots are stored in a hidden directory
.snapshots. Please note that this directory is more hidden than usual: It will not even show up on
ls -a, it will only appear when it is explicitly requested.
This is best explained by an example: let’s assume you have a file
important.txt in your home directory
/home/hpc/exam/example1 that you have been working on for months. You accidentally delete that file. Thanks to snapshots, you should be able to recover most of the file, and “only” lose the last few hours of work. If you do a
ls -l /home/hpc/exam/example1/.snapshots/, you should see something like this:
drwx------ 49 example1 exam 32768  8. Feb 10:54 @GMT-2019.02.10-03.00.00
drwx------ 49 example1 exam 32768 16. Feb 18:06 @GMT-2019.02.17-03.00.00
drwx------ 49 example1 exam 32768 24. Feb 00:15 @GMT-2019.02.24-03.00.00
drwx------ 49 example1 exam 32768 28. Feb 23:06 @GMT-2019.03.01-03.00.00
drwx------ 49 example1 exam 32768  1. Mär 21:34 @GMT-2019.03.03-03.00.00
drwx------ 49 example1 exam 32768  1. Mär 21:34 @GMT-2019.03.02-03.00.00
drwx------ 49 example1 exam 32768  3. Mär 23:54 @GMT-2019.03.04-03.00.00
drwx------ 49 example1 exam 32768  4. Mär 17:01 @GMT-2019.03.05-03.00.00
Each of these directories contains an exact read-only copy of your home directory at the time that is given in the name. To restore the file in the state as it was at 3:00 UTC on the 5th of March, you can just copy it from there to your current work directory again:
cp '/home/hpc/exam/example1/.snapshots/@GMT-2019.03.05-03.00.00/important.txt' '/home/hpc/exam/example1/important.txt'
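The manual steps above (look into .snapshots, pick a snapshot, copy the file back) can be wrapped into two helper functions. This is a sketch only: the function names are invented for this example, and it assumes the @GMT-… naming scheme shown in the listing above.

```shell
# Print the names of all snapshots that contain a given file.
# $1: directory that holds .snapshots, $2: path of the file relative to it.
snapshots_with_file() {
    base=$1
    relpath=$2
    for snap in "$base"/.snapshots/@GMT-*; do
        [ -e "$snap/$relpath" ] && basename "$snap"
    done
    return 0
}

# Copy a file from one snapshot back to its original place. Refuses to
# overwrite a file that currently exists, so nothing newer is lost.
restore_from_snapshot() {
    base=$1
    snap=$2
    relpath=$3
    if [ -e "$base/$relpath" ]; then
        echo "refusing to overwrite $base/$relpath" >&2
        return 1
    fi
    cp -p "$base/.snapshots/$snap/$relpath" "$base/$relpath"
}
```

With the example from above, `snapshots_with_file /home/hpc/exam/example1 important.txt` would list the candidate snapshots, and `restore_from_snapshot /home/hpc/exam/example1 @GMT-2019.03.05-03.00.00 important.txt` would bring the file back.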
Snapshots are enabled on both the home directories and the vault section, but they are made much more often on the home directories than on vault. Please note that the exact snapshot intervals and the number of snapshots retained may change at any time – you should not rely on the existence of a specific snapshot. Also note that any times given are in GMT / UTC. That means that, depending on whether daylight saving time is active or not, 03:00 UTC works out to either 05:00 or 04:00 German time. At the time of this writing, snapshots were configured as follows:
|Interval||x Copies retained||= covered timespan|
|30 minutes (every half and full hour)||6||3 hours|
|2 hours (every odd-numbered hour – 01:00, 03:00, 05:00, …)||12||1 day|
|1 day (at 03:00)||7||1 week|
|1 week (Sundays at 03:00)||4||4 weeks|
|Interval||x Copies retained||= covered time span|
|1 day (at 03:00)||7||1 week|
|1 week (Sundays at 03:00)||4||4 weeks|
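Because the snapshot names carry their time in UTC, it can help to convert a name into a timestamp that other tools understand. The function below is a small sketch (its name is made up) that only rewrites the string; it assumes the @GMT-YYYY.MM.DD-hh.mm.ss naming scheme described above.

```shell
# Turn a snapshot name such as "@GMT-2019.03.05-03.00.00" into the
# timestamp string "2019-03-05 03:00:00 UTC". Prints nothing if the name
# does not match the expected pattern.
snapshot_to_utc() {
    echo "$1" | sed -n 's/^@GMT-\([0-9]\{4\}\)\.\([0-9]\{2\}\)\.\([0-9]\{2\}\)-\([0-9]\{2\}\)\.\([0-9]\{2\}\)\.\([0-9]\{2\}\)$/\1-\2-\3 \4:\5:\6 UTC/p'
}
```

On a system with GNU date and time zone data installed, `TZ=Europe/Berlin date -d "$(snapshot_to_utc @GMT-2019.03.05-03.00.00)"` should then show the corresponding German local time.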
Please note that having a large number of small files is pretty bad for the filesystem performance. This is actually true for almost any filesystem and certainly for all RRZE fileservers, but it is a bit tougher for the HPC storage system (
$HPCVAULT) due to the underlying parallel filesystem and the snapshots. We have therefore set a limit on the number of files a user is allowed. That limit is set rather high for the home section, so that you are unlikely to hit it unless you try to, because small files are part of the intended usage there. It is however set rather tight on the vault section, especially compared to the large amount of space available there. If you are running into the file limit, you can always put small files that you don’t use regularly into an archive (tar, zip, etc.).
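Packing rarely used small files into an archive could look like the following sketch. The function names are invented for this example; whether directories count towards the file limit is not stated here, so the first function simply counts everything below a path.

```shell
# Count all entries (files and directories, including the start directory)
# below a path, to see which directories contribute most to the file count.
count_entries() {
    find "$1" | wc -l
}

# Pack a directory into a compressed tar archive next to it. The original
# directory is removed only if the archive was written and can be listed.
pack_directory() {
    dir=${1%/}
    archive="$dir.tar.gz"
    tar -czf "$archive" -C "$(dirname "$dir")" "$(basename "$dir")" &&
        tar -tzf "$archive" >/dev/null &&
        rm -rf "$dir"
}
```

For example, `pack_directory "$HPCVAULT/old_results"` (with `old_results` as a placeholder name) would replace that directory by a single file `old_results.tar.gz`, which can later be unpacked again with `tar -xzf`.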
Besides the normal Unix permissions that you set with chmod (where you can set permissions for the owning user, the owning group, and everyone else), the system also supports more advanced ACLs.
However, they are not done in the traditional (and non-standardized) way with setfacl / getfacl that users of Linux or Solaris might be familiar with, but in the new standardized way that NFS version 4 uses. You can set these ACLs from an NFS client (e.g. a cluster frontend) under Linux using the
nfs4_setfacl and nfs4_getfacl commands. The ACLs are also practically compatible with what Windows does, meaning that you can edit them from a Windows client through the usual Explorer interface.
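As a rough illustration of what such an ACL entry looks like: an NFSv4 access control entry (ACE) has the form type:flags:principal:permissions. The helper below only builds such a string. Its name, the user name and the domain are placeholders, and the form of the principal (in particular the part after the @) depends on the site configuration, so treat the commented commands as an assumption rather than a recipe.

```shell
# Build an "allow" ACE without flags for a given principal.
# "A" = allow; R, W and X are the generic read / write / execute
# permission aliases accepted by nfs4_setfacl.
allow_ace() {
    printf 'A::%s:%s\n' "$1" "$2"
}

# Hypothetical usage on an NFSv4 mount (principal and path are placeholders):
#   nfs4_getfacl "$HPCVAULT/shared"
#   nfs4_setfacl -a "$(allow_ace colleague@example.org RX)" "$HPCVAULT/shared"
```

Running nfs4_getfacl before and after the change shows which entries are in effect.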
The system serves two functions: it houses the normal home directories of all HPC users, and it provides tape-backed mid- to long-term storage for user data. It is based on Lenovo hardware and IBM software (Spectrum Scale/GPFS) and went into operation in September 2020.
- 5 file servers, Lenovo ThinkSystem SR650, 128 GB RAM, 100 Gbit Ethernet
- 1 archive frontend, Lenovo ThinkSystem SR650, 128 GB RAM, 100 Gbit Ethernet
- 1 TSM server, Lenovo ThinkSystem SR650, 512 GB RAM, 100 Gbit Ethernet
- IBM TS4500 tape library, currently with
  - 8 LTO8 tape drives and two expansion frames
  - 3370 LTO8 tape slots
  - >700 LTO7 tapes
- 4 Lenovo DE6000H storage arrays
- plus 8 Lenovo DE600S expansion units (2 per DE6000H)
- redundant controllers
- Usable data capacity: 2 PB for vault, 40 TB for homes