Many developers invest heavily in parallelism while overlooking the efficiency of their serial code. Slow serial code tends to scale well in terms of parallel speedup, which can mask the fact that compute resources are being wasted. This course closes that gap by conveying a thorough understanding of the interactions between software and hardware at the level of a single CPU core and the L1 cache. Topics include superscalar out-of-order execution, instruction throughput, critical path and loop-carried dependencies, and the architectural differences between x86 and ARM processors. Participants also learn to read and interpret compiler-generated assembly in AT&T (x86) and AArch64 syntax, and to apply the Open Source Architecture Code Analyzer (OSACA) together with the Compiler Explorer to assess and model performance properties.
Level: Advanced
Language: English (German upon request for bespoke courses)
Price and Eligibility: Refer to the registration page for each event (generally free of charge for members of academia from Europe).
Knowledge
- Programming experience in C or C++ at a level sufficient to read and understand simple loop kernels.
- A basic understanding of how CPUs work (registers, instruction execution, data transfers) and what ‘machine instructions’ are.
- A basic understanding of the Roofline model is recommended. Introductory material is available as lecture slides and in the publication by S. Williams.
- Some experience with the Compiler Explorer is recommended but not required. A two-part introduction by Matt Godbolt is available (part 1, part 2).
Technical
- A modern web browser
Course dates
- 2026, Oct 5: full-day online course (Register)
- 2026, Jan 31: full-day on-site tutorial in Sydney, Australia, as part of CGO26
- 2025, Nov 17: half-day on-site tutorial in St. Louis, MO, USA, as part of SC25
- 2025, Oct 6: full-day online course
- 2025, Jun 13: half-day on-site tutorial in Hamburg, Germany, as part of ISC High Performance
- 2024, Nov 18: half-day on-site tutorial in Atlanta, GA, USA, as part of SC24
- 2024, Oct 8: full-day online course
- 2024, Sep 8: full-day on-site tutorial in Ostrava, Czech Republic, as part of PPAM 2024
- 2023, Oct 21: full-day on-site tutorial in Vienna, Austria, as part of PACT 2023
- 2023, Oct 12: full-day on-site course at NHR@FAU
- 2023, Apr 16: full-day on-site tutorial in Coimbra, Portugal, as part of ICPE 2023
For an overview of all NHR@FAU courses, visit the course overview page.