Woody throughput cluster

RRZE’s “Woody” is the preferred cluster for serial and single-node throughput jobs.

The cluster has changed significantly over time. You can find more about its history at the end of this page. The current hardware configuration is:

  • 40 compute nodes (w10xx nodes) with Xeon E3-1280 CPUs (“SandyBridge”, 4 cores, HT disabled, 3.5 GHz base frequency; only AVX but no AVX2), 8 GB RAM, 500 GB HDD – from 12/2011; these nodes were shut down in October 2020
  • 70 compute nodes (w11xx nodes) with Xeon E3-1240 v3 CPUs (“Haswell”, 4 cores, HT disabled, 3.4 GHz base frequency), 8 GB RAM, 1 TB HDD – from 09/2013
  • 64 compute nodes (w12xx/w13xx nodes) with Xeon E3-1240 v5 CPUs (“Skylake”, 4 cores, HT disabled, 3.5 GHz base frequency), 32 GB RAM, 1 TB HDD – from 04/2016 and 01/2017
  • 112 compute nodes (w14xx/w15xx nodes) with Xeon E3-1240 v6 CPUs (“Kaby Lake”, 4 cores, HT disabled, 3.7 GHz base frequency), 32 GB RAM, 960 GB SSD – from Q3/2019

The w11xx nodes in Woody

Access, User Environment, and File Systems

Access to the machine

Access to the system is granted through the frontend nodes via ssh. Please connect to


and you will be randomly routed to one of the frontends. All systems in the cluster, including the frontends, have private IP addresses in the range. Thus they can only be accessed directly from within the FAU networks. If you need access from outside of FAU you have to connect for example to the dialog server cshpc.rrze.uni-erlangen.de first and then ssh to Woody from there. While it is possible to ssh directly to a compute node, a user is only allowed to do this when they have a batch job running there. When all batch jobs of a user on a node have ended, all of their shells will be killed automatically.

The login and compute nodes run a 64-bit Ubuntu LTS release. As on most other RRZE HPC systems, a modules environment is provided to facilitate access to software packages. Type “module avail” to get a list of available packages.

File Systems

The following table summarizes the available file systems and their features. Also check the description of the HPC file systems.

File system overview for the Woody cluster

| Mount point | Access via | Purpose | Technology, size | Backup | Data lifetime | Quota |
| --- | --- | --- | --- | --- | --- | --- |
| /home/hpc | $HOME | Storage of source, input and important results | central servers, 5 TB | YES + Snapshots | Account lifetime | YES (very restrictive) |
| /home/vault | | Mid- to long-term storage | central servers, HSM | YES + Snapshots | Account lifetime | YES |
| /home/woody | $WOODYHOME | Storage for small files | NFS, 88 TB | limited | Account lifetime | YES |
| /tmp | $TMPDIR | Temporary job data directory | Node-local, between 400 and 900 GB | NO | Job runtime | NO |

Node-local storage $TMPDIR

Each node has at least 400 GB of local hard drive capacity for temporary files available under /tmp (also accessible via /scratch/). All files in these directories will be deleted at the end of a job without any notification.

If possible, compute jobs should use the local disk for scratch space as this reduces the load on the central servers. In batch scripts the shell variable $TMPDIR points to a node-local, job-exclusive directory whose lifetime is limited to the duration of the batch job. This directory exists on each node of a parallel job separately (it is not shared between the nodes). It will be deleted automatically when the job ends. Important data to be kept can be copied to a cluster-wide volume at the end of the job, even if the job is cancelled by a time limit. Please see the section on batch processing for examples on how to use $TMPDIR.
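The pattern described above (compute on the node-local disk, then copy the important data to a cluster-wide volume) could look roughly like the following batch script. This is a minimal sketch, not an official template: the resource request, the `results` directory, the file names, and the placeholder computation are assumptions made for illustration. It also assumes that the batch system sends SIGTERM to the job shortly before killing it at the walltime limit. Outside the batch system, where `$TMPDIR` and `$PBS_O_WORKDIR` may be unset, the script falls back to a fresh temporary directory and the current directory so that it can be tried by hand.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=4,walltime=01:00:00
#PBS -N tmpdir-demo

# Directory the job was submitted from (set by Torque); fall back to the
# current directory when the script is run by hand.
submit_dir="${PBS_O_WORKDIR:-$PWD}"
# Node-local, job-exclusive scratch directory; fall back to a fresh
# temporary directory outside the batch system.
scratch="${TMPDIR:-$(mktemp -d)}"
# Hypothetical destination for the results on a cluster-wide file system.
result_dir="$submit_dir/results"

# Copy everything worth keeping from the node-local disk.
save_results() {
    mkdir -p "$result_dir"
    cp -r "$scratch/output/." "$result_dir/"
}

# Assumption: the job receives SIGTERM shortly before it is killed.
trap 'save_results; exit 143' TERM

mkdir -p "$scratch/output"
cd "$scratch" || exit 1

# Placeholder for the real application writing into the scratch directory.
echo "42" > output/result.txt

# Normal end of the job: copy the results back before $TMPDIR is deleted.
save_results
```

The time available between the termination signal and the forced end of the job is limited, so the copy step should only move data that is really needed.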

Batch Processing

All user jobs except short serial test runs must be submitted to the cluster by means of the Torque Resource Manager. The submitted jobs are routed into a number of queues (depending on the needed resources, e.g. runtime) and sorted according to a priority scheme. It is normally not necessary to explicitly specify the queue when submitting a job to the cluster; the sorting into the proper queue happens automatically.

Please see the batch system description for further details.

The following queues are available on this cluster. There is no need to specify a queue manually!

Queues on the Woody cluster

| Queue | min – max walltime | Comments |
| --- | --- | --- |
| route | N/A | Default router queue; sorts jobs into execution queues |
| devel | 0 – 01:00:00 | Some nodes reserved for this queue during working hours |
| work | 01:00:01 – 24:00:00 | “Workhorse” |
| onenode | 01:00:01 – 48:00:00 | Only very few jobs from this queue are allowed to run at the same time |

Regular jobs are always required to request all CPUs in a node (ppn=4). Using fewer than 4 CPUs per node is only supported in the SandyBridge segment.
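Because a regular job always occupies all four cores of a node, one common way to use them for serial work is to start four independent tasks inside a single job. The following is a sketch under assumptions: `run_task` is a hypothetical stand-in for the real serial program, and the walltime and log file names are illustrative.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=4,walltime=02:00:00

# Hypothetical stand-in for a serial program; replace with the real command.
run_task() {
    echo "task $1 done"
}

# Directory for the per-task logs (mktemp honours $TMPDIR if it is set).
logdir="$(mktemp -d)"

# Start one task per core (ppn=4) in the background ...
for i in 1 2 3 4; do
    run_task "$i" > "$logdir/task_$i.log" &
done
# ... and block until all of them have finished.
wait

cat "$logdir"/task_*.log
```

The `wait` is essential: without it the job script would end while the background tasks are still running, and the job would be terminated.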

If you submit jobs, then by default you can get any type of node: SandyBridge, Haswell, Skylake, or Kabylake based w1xxx-nodes. They all have the same number of cores (4) and minimum memory (at least 8 GB) per node, but the speed of the CPUs can be different, which means that job runtimes will vary. You will have to calculate the walltime you request from the batch system so that your jobs can finish even on the slowest nodes.
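The walltime calculation can be sketched as a small helper that pads a measured runtime by a safety margin and prints it in the HH:MM:SS form used by the batch system. The function names and the 50 % margin in the example are purely illustrative assumptions; the actual speed difference between the node types depends on the application and has to be measured.

```shell
# Convert a number of seconds into HH:MM:SS.
to_walltime() {
    local s=$1
    printf '%02d:%02d:%02d\n' $((s / 3600)) $((s % 3600 / 60)) $((s % 60))
}

# padded_walltime MEASURED_SECONDS MARGIN_PERCENT
# Adds MARGIN_PERCENT on top of a runtime measured on a fast node, so the
# job can still finish on a slower node type.
padded_walltime() {
    local measured=$1 percent=$2
    to_walltime $(( measured * (100 + percent) / 100 ))
}

# A job that took 2 hours (7200 s) on a fast node, padded by 50 %:
padded_walltime 7200 50   # prints 03:00:00
```

The result can then be used in the resource request, e.g. as walltime=03:00:00.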

It is also possible to request certain kinds of nodes from the batch system. Besides the obvious “benchmarking”, this has two major use cases. First, jobs that use less than a full node are currently only allowed on the SandyBridge nodes, so you need to request those explicitly. Second, some applications can benefit from using AVX2, which is not available on the SandyBridge-based nodes. Moreover, the Skylake- and Kabylake-based nodes have more memory (32 GB). You request a node property by adding it to your -lnodes=... request string, e.g.: qsub -l nodes=1:ppn=4:hw. In general, the following node properties are available:

Available node properties on the Woody cluster

| Property | Matching nodes (#) | Comments |
| --- | --- | --- |
| (none specified) | w1xxx (286) | Can run on any node, that is, all the SandyBridge, Haswell, Skylake, and Kabylake nodes. |
| :sb | w10xx (40) | Can run on the SandyBridge nodes only. Required for jobs with ppn other than 4. |
| :hw | w11xx (70) | Can run on the Haswell nodes only. |
| :sl32g | w12xx and w13xx (64) | Can run on the Skylake nodes (with 32 GB RAM) only. |
| :kl32g | w14xx and w15xx (112) | Can run on the Kabylake nodes (with 32 GB RAM) only. |
| :any32g | w12xx, w13xx, w14xx and w15xx (176) | Can run on the Skylake or Kabylake nodes (with 32 GB RAM) only. |
| :hdd900 | w1[1-5]xx (246) | Can run on any node with (at least) 900 GB scratch on HDD/SSD. |

Note: The properties :any, :avx, :sb, :sl, and :sl16g are no longer supported.
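To illustrate how a node property combines with the rest of a resource request, the following sketch builds the argument for `qsub -l` from a walltime and an optional property. The helper `resource_request` and the script name `job.sh` are hypothetical; the commands are only printed, not submitted.

```shell
# Build the argument for "qsub -l": a full node (ppn=4), an optional node
# property (given without the leading colon), and a walltime.
resource_request() {
    local walltime=$1 property=${2:-}
    local nodes="nodes=1:ppn=4"
    if [ -n "$property" ]; then
        nodes="$nodes:$property"
    fi
    echo "$nodes,walltime=$walltime"
}

# job.sh is a hypothetical batch script. The commands are only printed here;
# drop the leading "echo" to actually submit.
echo qsub -l "$(resource_request 06:00:00)" job.sh
echo qsub -l "$(resource_request 06:00:00 any32g)" job.sh
echo qsub -l "$(resource_request 06:00:00 hdd900)" job.sh
```

The first command accepts any node type, the second restricts the job to nodes with 32 GB RAM, and the third to nodes with at least 900 GB of local scratch space.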


OpenMP

The installed Intel compilers support at least the relevant parts of recent OpenMP standards. The compiler recognizes OpenMP directives if you supply the command line option -openmp or -qopenmp. This option is also required for the link step.


MPI

Although the cluster is basically able to support many different MPI versions, we maintain and recommend Intel MPI. Intel MPI supports different compilers (GCC, Intel). If you use Intel compilers, the appropriate intelmpi module is loaded automatically upon loading the intel64 compiler module. The standard MPI scripts mpif77, mpif90, mpicc and mpicxx are then available. If you load an intelmpi/201X.XXX-gnu module instead of the default intelmpi, those scripts will use GCC.

Further Information


History

The cluster was originally delivered at the end of 2006 by the companies Bechtle and HP, with 180 compute nodes, each with two Xeon 5160 “Woodcrest” chips (4 cores) running at 3.0 GHz with 4 MB shared Level 2 cache per dual core, 8 GB of RAM, 160 GB of local scratch disk, and a half-DDR/half-SDR high-speed InfiniBand network. The cluster was expanded to 212 nodes within a year. At that time it was the main cluster at RRZE, intended for distributed-memory (MPI) or hybrid parallel programs with medium to high communication requirements. It was also the first cluster at RRZE that employed a parallel filesystem (HP SFS), with a capacity of 15 TB and an aggregated parallel I/O bandwidth of > 900 MB/s. That filesystem was retired in 2012. The original nodes were replaced over time and turned off one by one; none of them remain today.

The Woody cluster in 2006

The system entered the November 2006 Top500 list at rank 124 and was ranked number 329 in November 2007.

In 2012, 40 single-socket compute nodes with Intel Xeon E3-1280 processors (4-core “SandyBridge”, 3.5 GHz, 8 GB RAM and 400 GB of local scratch disk) were added (w10xx nodes). These nodes are only connected by Gbit Ethernet. Therefore, only single-node (or single-core) jobs are allowed in this segment. These nodes were shut down in October 2020.

In 2013, 72 single-socket compute nodes with Intel Xeon E3-1240 v3 processors (4-core “Haswell”, 3.4 GHz, 8 GB RAM and 900 GB of local scratch disk) were added (w11xx nodes). These nodes are only connected by Gbit Ethernet. Therefore, only single-node jobs are allowed in this segment. These nodes replaced three racks full of old w0xxx nodes, providing significantly more compute power at a fraction of the power usage.

In 2016/2017, 64 single-socket compute nodes with Intel Xeon E3-1240 v5 processors (4-core “Skylake”, 3.5 GHz, 32 GB RAM and 900 GB of local scratch disk) were added (w12xx/w13xx nodes). Only single-node jobs are allowed in this segment.

In autumn 2019, 112 single-socket compute nodes with Intel Xeon E3-1240 v6 processors (4-core “Kabylake”, 3.7 GHz, 32 GB RAM and 900 GB of local scratch SSD) were added (w14xx/w15xx nodes). Only single-node jobs are allowed in this segment, too.

Although Woody was originally designed for running parallel programs on significantly more than one node, its communication network is weak compared to our other clusters and today’s standards. It is therefore now intended for running single-node jobs. Note that you cannot reserve single cores; the minimum allocation is one node. As an exception, single cores can be requested in the w10xx segment.