GPU Performance Engineering
Course Description
Porting code to the GPU can yield significant speedups but often presents challenges. This advanced course introduces NVIDIA’s profiling tools and shows how to use them to identify common performance issues that arise during porting. Performance analysis is guided by straightforward, resource-based models that help developers evaluate how close their code is to the optimal performance target.
The course was restructured and extended at the beginning of 2025.
We offer a comprehensive GPU Performance Engineering course, along with a condensed GPU Performance Analysis module that can be incorporated into larger events.
Learning Objectives
This course focuses on assessing the performance of GPU-accelerated applications using NVIDIA’s profiling tools. Topics include:
- GPU architecture review
- Using NVTX markers to instrument GPU-accelerated applications
- The Nsight Systems command line interface for summarizing application-level behavior
- The Nsight Systems GUI for visualizing a timeline of the entire application
- The Nsight Compute command line interface for focusing on performance aspects of individual kernels
- The Nsight Compute GUI for obtaining a comprehensive view of kernel performance
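As an illustration of the NVTX instrumentation listed above, the following is a minimal sketch of a scoped range built on the NVTX C API calls `nvtxRangePushA` and `nvtxRangePop`. The wrapper class `ScopedRange`, the macro `HAVE_NVTX`, and the function `sum_of_squares` are hypothetical names chosen for this example and are not taken from the course material. The sketch assumes the header-only NVTX 3 distribution is on the include path; if it is not, the range compiles to a no-op.

```cpp
// Use NVTX only if its header is available (__has_include requires C++17).
#if __has_include(<nvtx3/nvToolsExt.h>)
#include <nvtx3/nvToolsExt.h>
#define HAVE_NVTX 1
#else
#define HAVE_NVTX 0
#endif

// Hypothetical RAII helper: opens a named NVTX range on construction and
// closes it on destruction, so the range always matches the enclosing scope.
class ScopedRange {
public:
    explicit ScopedRange(const char* name) {
#if HAVE_NVTX
        nvtxRangePushA(name);  // start a named range on the calling thread
#else
        (void)name;            // NVTX not available: do nothing
#endif
    }
    ~ScopedRange() {
#if HAVE_NVTX
        nvtxRangePop();        // end the most recently pushed range
#endif
    }
    ScopedRange(const ScopedRange&) = delete;
    ScopedRange& operator=(const ScopedRange&) = delete;
};

// Example workload on the host; in a real application the range would
// typically enclose kernel launches and data transfers.
double sum_of_squares(int n) {
    ScopedRange range("sum_of_squares");
    double sum = 0.0;
    for (int i = 1; i <= n; ++i) {
        sum += static_cast<double>(i) * i;
    }
    return sum;
}
```

When such a program runs under Nsight Systems with NVTX tracing enabled, for example via `nsys profile --trace=cuda,nvtx ./app` (where `./app` is a placeholder for the binary), the named range should appear on the timeline. Without an attached profiler the NVTX calls have negligible cost. Depending on the platform, linking against `libdl` may be required.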
Participants will follow live demonstrations and complete hands-on exercises on the NHR@FAU clusters, gaining practical experience that reinforces the concepts covered.
Certification
A certificate of participation will be awarded to all participants who actively engage in the course.
Prerequisites
Participants should meet the following requirements:
- A basic understanding of programming in C++
- Experience with GPU programming using one or more of the following: CUDA, OpenMP, OpenACC
- Familiarity with compiling applications using a command-line compiler
Upcoming Iterations and Additional Courses
You can find dates and registration links for this and other upcoming NHR@FAU courses at https://hpc.fau.de/teaching/tutorials-and-courses/.