NHR@FAU offers a broad range of GPU programming courses, from first-contact introductions to CUDA and high-level programming models all the way to advanced multi-GPU scaling and performance engineering with NVIDIA’s profiling tools. Courses are developed and maintained by NHR@FAU staff and regularly updated to reflect current hardware and software.
- 2026, Jul 17: GPU Performance Analysis
  in Perth, Australia, half-day
- 2026, Sep 3-4: Introduction to CUDA C/C++
  online, two half-day
  Registration Link
- 2026, Sep 7-9: Scaling CUDA-Accelerated Applications
  online, three-day
  Registration Link
- 2026, Sep 28-30: GPU Performance Engineering
  online, three half-day
  Registration Link
- 2026, Oct 27-29: Fundamentals of Accelerated Computing with Modern CUDA C++
  online, three half-day
  Registration Link
- 2026, Nov 9-10: Choosing GPU Programming Approaches
  online, two half-day
  Registration Link
- Choosing GPU Programming Approaches
  Overview of common GPU programming approaches (CUDA and HIP, OpenMP and OpenACC offloading, SYCL, Kokkos, Thrust, and standard C++ parallel algorithms) to help practitioners choose the right tool for their application.
- Fundamentals of Accelerated Computing with Modern CUDA C++
  Introduces GPU acceleration of C++ applications using modern CUDA features, parallel algorithms, and CUDA streams.
- Fundamentals of Accelerated Computing with OpenMP and Kokkos
  One-day course combining an introduction to GPU programming with OpenMP offloading and the Kokkos performance-portability library.
- GPU Performance Analysis
  Condensed two-hour GPU performance analysis module for integration into summer schools and other larger events.
- GPU Performance Engineering
  Introduces NVIDIA’s profiling tools together with roofline and resource-based performance models to identify, quantify, and resolve GPU performance bottlenecks.
- Introduction to CUDA C/C++
  Hands-on introduction to GPU programming with CUDA C/C++, covering the programming model, porting strategies, GPU architecture, and optimization patterns using interactive Jupyter notebooks.
- Scaling CUDA-Accelerated Applications
  Advanced course on scaling CUDA-accelerated applications from single-node multi-GPU to multi-node deployments.
- Accelerating CUDA C++ Applications with Multiple GPUs (discontinued)
  Advanced techniques for extending single-GPU CUDA C++ applications to utilize multiple GPUs within a single compute node.
- Fundamentals of Accelerated Computing with CUDA C/C++ (discontinued)
  Hands-on introduction to GPU programming in C/C++ with CUDA, covering data parallelism, memory optimization, and profiling.
- Fundamentals of Accelerated Computing with CUDA Python (discontinued)
  Hands-on introduction to GPU-accelerating Python applications using CUDA and Numba, covering device kernels and memory optimization.
- Fundamentals of Accelerated Computing with OpenACC (discontinued)
  Introduction to GPU-accelerated computing using the OpenACC directive-based programming model.
- Performance Analysis on GPUs with NVIDIA Tools (discontinued)
  Half-day course introducing NVIDIA profiling tools and resource-based performance models for GPU performance analysis.
- Scaling CUDA C++ Applications to Multiple Nodes (discontinued)
  Advanced multi-node GPU programming using MPI and NVSHMEM to distribute CUDA workloads across cluster nodes.
For an overview of all NHR@FAU courses, visit the course overview page.