The Roofline model is arguably the simplest but also the most successful performance model in High Performance Computing. Based on a few assumptions about a code and the hardware it runs on, it yields an upper limit for the performance of a loop. This makes it an indispensable analysis device: comparing the predicted upper limit with the measured performance reveals hardware bottlenecks and deficiencies in the code, which in turn points to optimization opportunities.
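As a rough illustration of the idea, the upper limit in its basic form is the smaller of the machine's peak performance and the product of the loop's computational intensity and the memory bandwidth. The following minimal sketch assumes a hypothetical machine (the peak and bandwidth values are made up for illustration) and a double-precision vector triad whose data traffic is counted as 24 bytes per iteration, ignoring write-allocate transfers. The function and variable names are ours, not part of any tool.

```python
def roofline_limit(peak_flops, mem_bandwidth, intensity):
    """Basic Roofline upper limit in flop/s.

    peak_flops    -- peak arithmetic performance in flop/s
    mem_bandwidth -- attainable memory bandwidth in byte/s
    intensity     -- computational intensity of the loop in flop/byte
    """
    return min(peak_flops, intensity * mem_bandwidth)


# Hypothetical machine parameters (assumptions, not measurements).
PEAK = 2.0e12   # 2 Tflop/s
BW = 200.0e9    # 200 Gbyte/s

# Vector triad a[i] = b[i] + s * c[i] in double precision:
# 2 flops per iteration, 3 * 8 bytes of traffic (write-allocate ignored).
triad_intensity = 2 / 24
triad_limit = roofline_limit(PEAK, BW, triad_intensity)

# A loop with a much higher intensity hits the compute ceiling instead.
dense_limit = roofline_limit(PEAK, BW, 20.0)

print(f"triad limit: {triad_limit / 1e9:.1f} Gflop/s (memory bound)")
print(f"dense limit: {dense_limit / 1e9:.1f} Gflop/s (compute bound)")
```

Dividing a measured performance by the corresponding limit gives the fraction of the model prediction that the code actually achieves. A value well below one suggests that an assumption of the model is violated or that the code has a deficiency worth investigating.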
Although several tools exist that can help with Roofline analysis on CPUs and GPUs, applying the Roofline model effectively requires a basic understanding of computer architecture, code execution, and hardware bottlenecks. This tutorial provides the necessary knowledge, backed up by case studies and hands-on exercises, to let participants use the Roofline model as a powerful, scientifically well-founded analysis tool for CPU and GPU code. We also point out the strengths and weaknesses of various performance tools and show how they can be of use in different scenarios and for different groups of developers and analysts.
Level: Beginner
Language: English (German upon request for bespoke courses)
Price and Eligibility: Refer to the registration page for each event (generally free of charge for members of academia from Europe).
- A modern web browser (for JupyterHub access to NHR@FAU’s HPC clusters)
- Compute node architecture and bottlenecks – CPUs and GPUs
- Thinking in rooflines: code and machine characterization
- Hardware performance counters and profiling tools
- Case studies and examples: Sparse MVM, CG solver, molecular dynamics proxy app, lattice-Boltzmann code
For an overview of all NHR@FAU courses, visit the course overview page.