Tools and Libraries
A number of software tools and libraries are being developed in our group.
LIKWID tool suite
LIKWID is a node-level tool suite and library for performance-aware developers. It features a collection of useful command-line tools for topology exploration, affinity control, hardware performance monitoring, hardware configuration, and microbenchmarking.
Kerncraft is a loop kernel analysis and performance modeling toolkit. It automatically analyzes loop kernels using the Execution-Cache-Memory (ECM) model and the Roofline model, and validates the resulting predictions against actual benchmarks. Kerncraft provides a framework for investigating data reuse and cache requirements by static code analysis. In combination with the Intel IACA tool, Kerncraft can give a good overview of both in-core and memory bottlenecks and use that data to apply performance models. For stencil codes, it can use its built-in layer condition analyzer to automatically generate tuning advice, i.e., determine favorable loop blocking factors that reduce the code balance. Kerncraft contains a Python-based cache hierarchy simulator that is also available as a standalone tool.
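To make the two modeling ideas mentioned above more concrete, here is a minimal Python sketch. It is not Kerncraft's API: the function names `roofline` and `max_inner_block`, the machine numbers, and the "half the cache" safety factor are illustrative assumptions. The first function evaluates the basic Roofline bound, the minimum of peak performance and memory bandwidth divided by code balance. The second applies a rule-of-thumb layer condition for a 2D stencil, assuming that three rows of the inner dimension must fit into a fraction of the cache.

```python
def roofline(peak_gflops, mem_bw_gbytes, code_balance):
    """Roofline bound in GFlop/s.

    peak_gflops   -- in-core peak performance (GFlop/s)
    mem_bw_gbytes -- attainable memory bandwidth (GByte/s)
    code_balance  -- data traffic per unit of work (byte/flop)
    """
    return min(peak_gflops, mem_bw_gbytes / code_balance)


def max_inner_block(cache_bytes, layers=3, elem_bytes=8, safety=0.5):
    """Largest inner dimension for which `layers` rows fit in cache.

    Assumption: only a fraction `safety` of the cache is usable for
    the stencil's layers (a common rule of thumb, not a hard limit).
    """
    return int(safety * cache_bytes // (layers * elem_bytes))


if __name__ == "__main__":
    # Hypothetical machine: 100 GFlop/s peak, 50 GByte/s memory bandwidth.
    print(roofline(100.0, 50.0, code_balance=2.0))   # memory bound
    print(roofline(100.0, 50.0, code_balance=0.25))  # core bound
    # Hypothetical 32 KiB cache, double-precision data.
    print(max_inner_block(32 * 1024))
```

Reducing the code balance, for example by choosing a blocking factor that satisfies the layer condition in a larger cache level, moves a memory-bound kernel toward the in-core limit in this simple model.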
GHOST is the “General, Hybrid, and Optimized Sparse Toolkit.” It provides basic building blocks for computations with very large sparse or dense matrices. GHOST is being developed as part of the ESSEX project under the umbrella of the Priority Programme 1648: Software for Exascale Computing (SPPEXA) of the German Research Foundation (DFG). The library supports standard multicore CPUs, NVIDIA GPGPUs, and Intel Xeon Phi coprocessors, and it allows heterogeneous parallelism across all three architectures in the same program. GHOST is running successfully on current post-petascale systems such as Oakforest-PACS at the University of Tokyo (Top500 #7 in June 2017) and Piz Daint at the Swiss National Supercomputing Centre (CSCS) in Lugano (Top500 #3 in June 2017).
The CRAFT library (Checkpoint/Restart and Automatic Fault Tolerance) provides an easy-to-use interface to checkpoint/restart and dynamic process recovery capabilities. Both of these features can be used independently as well as combined. CRAFT is being developed as part of the ESSEX project under the umbrella of the Priority Programme 1648: Software for Exascale Computing (SPPEXA) of the German Research Foundation (DFG).
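The general checkpoint/restart pattern can be sketched in a few lines of Python. This is a generic illustration under stated assumptions and not the CRAFT interface: the helpers `save_checkpoint`, `load_checkpoint`, and `run`, the JSON file format, and the simulated failure are all hypothetical.

```python
import json
import os


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it atomically."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # a crash never leaves a half-written checkpoint


def load_checkpoint(path, default):
    """Return the saved state if a checkpoint exists, else `default`."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return default


def run(path, n_iters, every=10, fail_at=None):
    """Sum 0..n_iters-1, checkpointing every `every` iterations."""
    state = load_checkpoint(path, {"iter": 0, "total": 0})
    while state["iter"] < n_iters:
        if fail_at is not None and state["iter"] == fail_at:
            raise RuntimeError("simulated failure")
        state["total"] += state["iter"]
        state["iter"] += 1
        if state["iter"] % every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

If a run fails partway through, calling `run` again with the same checkpoint path resumes from the last saved iteration instead of starting over.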
The Open Source Architecture Code Analyzer (OSACA) is a tool that can analyze assembly or machine code and produce a best-case (throughput-limited) runtime prediction, assuming that all data resides in the L1 cache. Such a tool is sorely needed for processor architectures other than Intel’s. Intel provides the Intel Architecture Code Analyzer (IACA) for free, but it is not open source and its future development path is unclear. OSACA can do some things that IACA can’t, such as analyzing non-compiled assembly code or extending its own database with new instructions.
Why such a tool? Analytic performance models, such as the ECM model, depend on an accurate assessment of in-core execution performance. You can either make that assessment manually by inspecting the source or assembly code, or you can use a tool that knows the instruction set and the limitations of a particular microarchitecture. The data flow analysis must be done separately; again, it’s either your brain or, e.g., our Kerncraft tool.
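A best-case, throughput-limited prediction of the kind described above can be illustrated with a toy port model. The sketch below is not OSACA's implementation or database: the `PORT_MODEL` table, the port names, and the assumption that an instruction is spread evenly over all ports that can execute it are hypothetical simplifications.

```python
from collections import defaultdict

# Hypothetical port model: mnemonic -> (eligible ports, cycles occupied).
PORT_MODEL = {
    "load": (("2", "3"), 1.0),
    "store": (("4",), 1.0),
    "add": (("0", "1"), 1.0),
    "mul": (("0",), 1.0),
}


def throughput_bound(kernel, model=PORT_MODEL):
    """Return (cycles per iteration, per-port pressure) for a loop body.

    Assumption: each instruction is distributed evenly across its
    eligible ports, and the busiest port limits the throughput.
    """
    pressure = defaultdict(float)
    for instr in kernel:
        ports, cycles = model[instr]
        for port in ports:
            pressure[port] += cycles / len(ports)
    return max(pressure.values()), dict(pressure)


if __name__ == "__main__":
    # Hypothetical loop body: two loads, one multiply, one add, one store.
    cycles, pressure = throughput_bound(["load", "load", "mul", "add", "store"])
    print(cycles, pressure)
```

In this example, port "0" carries the multiply plus half of the add, so it is the bottleneck of the toy model. Latency along dependency chains and data transfers beyond L1 are deliberately ignored, which is why such a number is only a lower bound on runtime.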