Performance Engineering for Linear Solvers

This tutorial covers code analysis, performance modeling, and optimization for linear solvers on CPU and GPU nodes. Performance engineering is often taught using simple loops as instructive examples of performance models and of how they can guide optimization. Full, preconditioned linear solvers, however, comprise multiple back-to-back loops enclosed in an iteration scheme that is executed until convergence is achieved. Consequently, the concept of “optimal performance” has to account for both hardware resource efficiency and iterative solver convergence.
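To make the "multiple back-to-back loops enclosed in an iteration scheme" concrete, the following is a minimal sketch of a Jacobi-preconditioned Conjugate Gradient iteration in plain Python. It is not taken from the tutorial material: the function names (`spmv`, `pcg_jacobi`, `laplacian_1d_csr`), the CSR storage layout, the Jacobi preconditioner, and the 1D Laplacian test matrix are all assumptions chosen for illustration.

```python
import math

def spmv(row_ptr, col_idx, vals, x):
    """Sparse matrix-vector product y = A x for A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += vals[k] * x[col_idx[k]]
        y[i] = s
    return y

def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

def pcg_jacobi(row_ptr, col_idx, vals, diag, b, tol=1e-10, max_iter=1000):
    """Jacobi-preconditioned CG for a symmetric positive definite matrix.

    Returns (x, iterations). Each iteration is a sequence of simple loops:
    one SpMV, several dot products, and several vector updates.
    """
    n = len(b)
    x = [0.0] * n
    b_norm = math.sqrt(dot(b, b))
    if b_norm == 0.0:
        return x, 0                                # trivial solution x = 0
    r = list(b)                                    # residual for x = 0
    z = [ri / di for ri, di in zip(r, diag)]       # apply preconditioner
    p = list(z)
    rz = dot(r, z)
    for it in range(1, max_iter + 1):
        q = spmv(row_ptr, col_idx, vals, p)        # loop 1: SpMV
        alpha = rz / dot(p, q)                     # loop 2: dot product
        x = [xi + alpha * pi for xi, pi in zip(x, p)]   # loop 3: axpy
        r = [ri - alpha * qi for ri, qi in zip(r, q)]   # loop 4: axpy
        if math.sqrt(dot(r, r)) <= tol * b_norm:   # loop 5: norm
            return x, it
        z = [ri / di for ri, di in zip(r, diag)]   # loop 6: preconditioner
        rz_new = dot(r, z)                         # loop 7: dot product
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]    # loop 8: update
        rz = rz_new
    return x, max_iter

def laplacian_1d_csr(n):
    """CSR arrays of the tridiagonal matrix with stencil (-1, 2, -1)."""
    row_ptr, col_idx, vals = [0], [], []
    for i in range(n):
        if i > 0:
            col_idx.append(i - 1)
            vals.append(-1.0)
        col_idx.append(i)
        vals.append(2.0)
        if i < n - 1:
            col_idx.append(i + 1)
            vals.append(-1.0)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, vals

if __name__ == "__main__":
    n = 20
    row_ptr, col_idx, vals = laplacian_1d_csr(n)
    b = spmv(row_ptr, col_idx, vals, [1.0] * n)    # exact solution is all ones
    x, iterations = pcg_jacobi(row_ptr, col_idx, vals, [2.0] * n, b)
    print(iterations, max(abs(xi - 1.0) for xi in x))
```

In this sketch, time to solution is the product of the iteration count and the cost of one pass through the loops, which is why hardware efficiency and convergence have to be considered together.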

We present a performance engineering process geared towards iterative linear solvers. After introducing basic notions of hardware organization and storage for dense and sparse data structures, we show how the Roofline performance model can be applied to such solvers in predictive and diagnostic ways and how it can be used to assess the hardware efficiency of a solver, covering important corner cases such as pure memory boundedness. We then advance to the structure of preconditioned solvers, using the Conjugate Gradient (CG) method as a leading example. We identify hotspots and bottlenecks of the complete solver and then introduce advanced performance optimization techniques such as preconditioning and cache blocking.
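As a rough illustration of the predictive use of the Roofline model, the sketch below estimates upper performance bounds for three kernels that typically appear in CG. The Roofline bound itself is the standard formula min(peak, intensity × bandwidth). Everything else is an assumption made for this example: the machine numbers (`PEAK_GFLOPS`, `BANDWIDTH_GBS`) are hypothetical, the function names are invented, and the CSR traffic model (double-precision values, 4-byte indices, 16 bytes of result-vector traffic and 4 bytes of row pointer per row, and a parameter `alpha` for right-hand-side vector traffic per nonzero) is a simplification.

```python
# Hypothetical machine parameters; replace with measured values.
PEAK_GFLOPS = 1000.0     # assumed double-precision peak in GFlop/s
BANDWIDTH_GBS = 200.0    # assumed sustained memory bandwidth in GB/s

def roofline_gflops(intensity, peak=PEAK_GFLOPS, bandwidth=BANDWIDTH_GBS):
    """Roofline bound in GFlop/s for a given intensity in flop/byte."""
    return min(peak, intensity * bandwidth)

def dot_intensity():
    """Dot product: 2 flops per element, two 8-byte loads."""
    return 2.0 / 16.0

def axpy_intensity():
    """y = y + a*x: 2 flops per element, two 8-byte loads and one store."""
    return 2.0 / 24.0

def spmv_csr_intensity(nnz_per_row, alpha):
    """CSR SpMV under the simplified traffic model described above.

    Per nonzero: 8 bytes (value) + 4 bytes (column index)
    + 8*alpha bytes (right-hand-side vector)
    + (16 + 4)/nnz_per_row bytes (result vector and row pointer).
    """
    bytes_per_nonzero = 8.0 + 4.0 + 8.0 * alpha + 20.0 / nnz_per_row
    return 2.0 / bytes_per_nonzero

if __name__ == "__main__":
    kernels = {
        "dot": dot_intensity(),
        "axpy": axpy_intensity(),
        # 7 nonzeros per row, each vector element loaded once from memory
        "spmv": spmv_csr_intensity(nnz_per_row=7, alpha=1.0 / 7.0),
    }
    machine_balance = PEAK_GFLOPS / BANDWIDTH_GBS   # flop/byte
    for name, intensity in kernels.items():
        bound = roofline_gflops(intensity)
        regime = "memory bound" if intensity < machine_balance else "compute bound"
        print(f"{name}: {intensity:.3f} flop/byte, {bound:.1f} GFlop/s, {regime}")
```

With these assumed numbers all three kernels fall far below the machine balance of 5 flop/byte, which is the pure memory-bound corner case: the bound is set by memory bandwidth alone, and reducing data traffic is the only way to raise it.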

Level: Advanced

Language: English

Price and Eligibility: Refer to the registration page for each event (generally free of charge for members of academia from Europe).

  • 2026, Jun 22: half-day on-site tutorial in Hamburg, Germany, in collaboration with TU Delft and TU Munich, as part of ISC High Performance
  • 2025, Nov 16: half-day on-site tutorial in St. Louis, MO, USA, in collaboration with TU Delft and TU Munich, as part of SC25
  • 2025, Jun 13: half-day on-site tutorial in Hamburg, Germany, in collaboration with TU Delft and TU Munich, as part of ISC High Performance
  • 2024, Nov 18: half-day on-site tutorial in Atlanta, GA, USA, in collaboration with TU Delft and TU Munich, as part of SC24
  • 2024, May 12: full-day on-site tutorial in Hamburg, Germany, in collaboration with TU Delft and TU Munich, as part of ISC High Performance

For an overview of all NHR@FAU courses, visit the course overview page.