ClusterCockpit stable release 1.0.0 is out


We are happy and proud to announce the initial stable release of cc-backend, the central Web UI and API backend for the ClusterCockpit job-specific monitoring framework. At the same time, we have launched a ClusterCockpit website.

ClusterCockpit is a job-specific performance and power monitoring framework for HPC clusters. For cluster users, it provides a detailed, time-resolved overview of their jobs’ behavior, including hardware metrics such as memory bandwidth, floating-point performance, vectorization ratio, load balance, communication metrics, and more. Performance engineers can use the data to find optimization opportunities or starting points for further investigation. Finally, cluster administrators get an overview of utilization metrics, job statistics, and the efficient use of resources, including power.

The NHR center NHR@FAU is the lead developer, with contributions from several other HPC centers. All ClusterCockpit components are released under the MIT open source license and can be downloaded from our GitHub page. In addition to software development, the ClusterCockpit project is actively working on establishing format and interface standards for HPC monitoring environments. ClusterCockpit is funded by BMBF Germany as part of the EE-HPC project.

Cutout of a job-specific performance and power monitoring view.

The ClusterCockpit monitoring framework focuses on simple installation and maintenance, high security, and intuitive usage. Besides cc-backend, ClusterCockpit also includes cc-metric-collector (a node agent for measuring, processing, and forwarding node-level metrics) and cc-metric-store (a simple in-memory metric store).

ClusterCockpit can be used as a complete integrated monitoring solution, but its individual components can also be combined with external tools. All components are developed in the Go programming language. The web frontend UI is implemented as reusable Svelte components.

ClusterCockpit is stable and used in production at NHR@FAU, PC2 Paderborn, and DKRZ. Contact us if you are interested in deploying ClusterCockpit at your center. We can also help you plan and set up ClusterCockpit at your site.

Dr. Jan Eitzinger

Head of Software & Tools

Erlangen National High Performance Computing Center
Software & Tools