This course provides a practical introduction to GPU performance analysis using NVIDIA Nsight Systems for application-level timeline profiling and Nsight Compute for individual kernel assessment. Code examples are provided in C++ with OpenMP offloading and CUDA.
This course was discontinued in 2025 and replaced by the significantly extended GPU Performance Engineering course. NHR@FAU also offers a condensed two-hour GPU Performance Analysis module for integration into summer schools and other larger events.
Level: Intermediate to advanced
Language: English (German upon request for bespoke courses)
Price and Eligibility: Refer to the registration page for each event (generally free of charge for members of academia from Europe).
Knowledge prerequisites
- Experience with GPU programming in CUDA or OpenMP offloading using C/C++
- Familiarity with compiling and running code from the command line
Technical prerequisites
- A working SSH setup for remote access to NHR@FAU’s HPC clusters
- A local installation of NVIDIA Nsight Systems and Nsight Compute (no local GPU required)
After completing this course, you will be able to:
- Use Nsight Systems to capture and analyze application-level GPU timelines
- Use Nsight Compute to assess individual CUDA kernel performance
Course content
- Application timeline analysis with Nsight Systems
- Kernel-level profiling with Nsight Compute
- Case study: neural network calculations on the GPU
Past events
- 2024, Oct 9: half-day online course
- 2024, Mar 19: full-day online course
- 2023, Oct 10: full-day online course
- 2023, Apr 4: full-day online course
- 2022, Sep 29: full-day online course
For an overview of all NHR@FAU courses, visit the course overview page.