Performance Analysis on GPUs with NVIDIA Tools

This course provides a practical introduction to GPU performance analysis with two NVIDIA tools: Nsight Systems for application-level timeline profiling and Nsight Compute for profiling individual kernels. Code examples are provided in C++ with OpenMP offloading and CUDA.

This course was discontinued in 2025 and replaced by the significantly extended GPU Performance Engineering course. NHR@FAU also offers a condensed two-hour GPU Performance Analysis module for integration into summer schools and other larger events.

Level: Intermediate to advanced

Language: English (German upon request for bespoke courses)

Price and Eligibility: Refer to the registration page for each event (generally free of charge for members of academia from Europe).

Prerequisite knowledge

  • Experience with GPU programming in CUDA or OpenMP offloading using C/C++
  • Familiarity with compiling and running code from the command line

Technical requirements

  • SSH setup to remotely access NHR@FAU’s HPC clusters
  • A local installation of NVIDIA Nsight Systems and Nsight Compute (no local GPU required)

After completing this course, you will be able to:

  • Use Nsight Systems to capture and analyze application-level GPU timelines
  • Use Nsight Compute to assess individual CUDA kernel performance

Topics

  • Application timeline analysis with Nsight Systems
  • Kernel-level profiling with Nsight Compute
  • Case study: neural network calculations on the GPU

Past course dates

  • 2024, Oct 9: half-day online course
  • 2024, Mar 19: full-day online course
  • 2023, Oct 10: full-day online course
  • 2023, Apr 4: full-day online course
  • 2022, Sep 29: full-day online course

For an overview of all NHR@FAU courses, visit the course overview page.