Dr. Sebastian Kuckuk

Erlangen National High Performance Computing Center
Head of Training

Room 1.131
Martensstraße 1
91058 Erlangen

Fax number: +49 9131 302941
Email: sebastian.kuckuk@fau.de
Website: https://hpc.fau.de/person/dr-sebastian-kuckuk/

Sebastian holds a PhD in computer science from FAU, where he currently works as head of training for NHR@FAU. His primary interests revolve around enhancing performance portability and programmer productivity through the use of domain-specific programming languages, code generation techniques, automatic parallelization, and GPU programming. He applies these methodologies in the context of massively parallel numerical solvers for computational fluid dynamics applications and other related fields.

In 2021, Sebastian joined NHR@FAU as a liaison scientist for the Chair of System Simulation. At the same time, he became a member of the Training and Support division, which he joined as a full member in spring 2024. In addition, he serves as an NVIDIA Deep Learning Institute (DLI) university ambassador and is certified to teach DLI courses covering both introductory and advanced concepts in GPU programming. Sebastian contributes his expertise through smaller project-driven consultation activities, lecture exercises, and single- and multi-day tutorials. He has served as head of training since summer 2025.

During his free time, Sebastian finds pleasure in cooking and biking. He also pursues his interests in learning the Japanese language and exploring topics in psychology and philosophy.

Sebastian provides support and consulting for KONWIHR and NHR projects revolving around GPU programming, performance analysis and optimization.

Sebastian offers multiple courses from the NVIDIA Deep Learning Institute (DLI) portfolio revolving around GPU programming. He attained certification as a DLI university ambassador and is certified to teach the following courses:

Fundamentals of Accelerated Computing with CUDA C/C++ (course retired in 2025)
Fundamentals of Accelerated Computing with Modern CUDA C++
Fundamentals of Accelerated Computing with CUDA Python
Fundamentals of Accelerated Computing with OpenACC (course retired in 2025)
Accelerating CUDA C++ Applications with Multiple GPUs (course currently on hold)
Scaling CUDA C++ Applications to Multiple Nodes (course currently on hold)

He also conducts the following trainings:

GPU Performance Engineering and GPU Performance Analysis
Choosing GPU Programming Approaches
Introduction to OpenMP

A list of these and all other courses offered by NHR@FAU, along with upcoming and past course dates, is available at https://go-nhr.de/trainings.

Sebastian contributes his expertise to the following lectures and exercises:

High-End Simulation in Practice (HESP)
Programming Techniques for Supercomputers (PTfS)

Sebastian was a maintainer and developer of

the ExaStencils code generation framework for massively parallel multigrid solvers, and
the GHODDESS module for quadrature-free higher-order discretizations of the shallow water equations.

Further information on both projects can be found on the official page https://www.exastencils.fau.de/.

2025

Faghih-Naini S., Aizinger V., Kuckuk S., Angersbach R., Köstler H.:
p-adaptive discontinuous Galerkin method for the shallow water equations on heterogeneous computing architectures
In: GEM - International Journal on Geomathematics 16 (2025), Article No.: 8
ISSN: 1869-2672
DOI: 10.1007/s13137-025-00267-2

2024

Angersbach R., Köstler H., Kuckuk S.:
Code Generation for Octree-Based Multigrid Solvers with Fused Higher-Order Interpolation and Communication
Euro-Par 2024 (Madrid, 2024-08-26 - 2024-08-30)
DOI: 10.1007/978-3-031-69583-4_17

2023

Angersbach R., Kuckuk S., Köstler H.:
Generating Coupling Interfaces for Multiphysics Simulations with ExaStencils and waLBerla
International Parallel and Distributed Processing Symposium (IPDPS) (St. Petersburg, Florida USA, 2023-05-15 - 2023-05-19)
In: 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Los Alamitos, CA, USA: 2023
DOI: 10.1109/IPDPSW59300.2023.00112
URL: https://ieeexplore.ieee.org/document/10196550
Faghih-Naini S., Kuckuk S., Zint D., Kemmler S., Köstler H., Aizinger V.:
Discontinuous Galerkin method for the shallow water equations on complex domains using masked block-structured grids
In: Advances in Water Resources 182 (2023), Article No.: 104584
ISSN: 0309-1708
DOI: 10.1016/j.advwatres.2023.104584

2022

Angersbach R., Köstler H., Kuckuk S.:
Fusion of Massively-Parallel Simulation Frameworks and Code Generation Methodologies for Lattice Boltzmann and Multigrid Applications
Platform for Advanced Scientific Computing (PASC) Conference (Congress Center Basel, Switzerland, 2022-06-27 - 2022-06-29)
URL: https://pasc22.pasc-conference.org/fileadmin/user_upload/pasc22/pdf/P41_pos144s2-file1.pdf
Zint D., Grosso R., Aizinger V., Faghih-Naini S., Kuckuk S., Köstler H.:
Automatic Generation of Load-Balancing-Aware Block-Structured Grids for Complex Ocean Domains
2022 SIAM International Meshing Roundtable
In: Robinson, Trevor; Moxey, David; Tomov, Vladimir Z. (ed.): Proceedings of the 2022 SIAM International Meshing Roundtable 2022
DOI: 10.5281/zenodo.6562440

2021

Schmitt J., Kuckuk S., Köstler H.:
EvoStencils: a grammar-based genetic programming approach for constructing efficient geometric multigrid methods
In: Genetic Programming and Evolvable Machines (2021)
ISSN: 1389-2576
DOI: 10.1007/s10710-021-09412-w
URL: http://link.springer.com/article/10.1007/s10710-021-09412-w

2020

Faghih-Naini S., Kuckuk S., Aizinger V., Zint D., Grosso R., Köstler H.:
Quadrature-free discontinuous Galerkin method with code generation features for shallow water equations on automatically generated block-structured meshes
In: Advances in Water Resources 138 (2020), Article No.: 103552
ISSN: 0309-1708
DOI: 10.1016/j.advwatres.2020.103552
Köstler H., Heisig M., Kohl N., Kuckuk S., Bauer M., Rüde U.:
Code generation approaches for parallel geometric multigrid solvers
In: Analele Stiintifice ale Universitatii Ovidius Constanta, Seria Matematica 28 (2020), p. 123-152
ISSN: 1844-0835
DOI: 10.2478/auom-2020-0038
Lengauer C., Apel S., Bolten M., Chiba S., Rüde U., Teich J., Größlinger A., Hannig F., Köstler H., Claus L., Grebhahn A., Groth S., Kronawitter S., Kuckuk S., Rittich H., Schmitt C., Schmitt J.:
ExaStencils: Advanced multigrid solver generation
In: Hans-Joachim Bungartz, Severin Reiz, Benjamin Uekermann, Philipp Neumann, Wolfgang E. Nagel (ed.): Software for Exascale Computing - SPPEXA 2016-2019, Cham: Springer, 2020, p. 405-452 (Lecture Notes in Computational Science and Engineering, Vol.136)
ISBN: 978-3-030-47955-8
DOI: 10.1007/978-3-030-47956-5
URL: https://library.oapen.org/bitstream/handle/20.500.12657/41289/2020_Book_SoftwareForExascaleComputing-S.pdf?sequence=1#page=411
Lengauer C., Apel S., Bolten M., Chiba S., Rüde U., Teich J., Größlinger A., Hannig F., Köstler H., Claus L., Grebhahn A., Groth S., Kronawitter S., Kuckuk S., Rittich H., Schmitt C., Schmitt J.:
ExaStencils – Advanced Multigrid Solver Generation
In: Hans-Joachim Bungartz, Severin Reiz, Philipp Neumann, Benjamin Uekermann, Wolfgang Nagel (ed.): Software for Exascale Computing – SPPEXA 2016-2019, Springer, 2020, p. 405-452 (Lecture Notes in Computational Science and Engineering, Vol.136)
ISBN: 978-3-030-47955-8
DOI: 10.1007/978-3-030-47956-5_14
URL: https://www12.cs.fau.de/downloads/hannig/publications/ExaStencils_Advanced_Multigrid_Solver_Generation.pdf
Schmitt J., Kuckuk S., Köstler H.:
Constructing Efficient Multigrid Solvers with Genetic Programming
Genetic and Evolutionary Computation Conference (GECCO '20) (Cancún, 2020-07-08 - 2020-07-12)
In: Association for Computing Machinery (ed.): Proceedings of the 2020 Genetic and Evolutionary Computation Conference, New York, NY, USA: 2020
DOI: 10.1145/3377930.3389811

2019

Kuckuk S.:
Automatic Code Generation for Massively Parallel Applications in Computational Fluid Dynamics (Dissertation, 2019)
URL: https://opus4.kobv.de/opus4-fau/frontdoor/index/index/docId/13050
Schmitt J., Kuckuk S., Köstler H.:
Towards the automatic optimization of geometric multigrid methods with evolutionary computation
19th Copper Mountain Conference On Multigrid Methods (Copper Mountain, Colorado, 2019-03-24 - 2019-03-28)
DOI: 10.29007/1c29
URL: https://easychair.org/smart-slide/slide/7g69

2018

Kuckuk S., Köstler H.:
Code Generation for Massively Parallel PDE Solvers
Computational Science at Scale (CoSaS) (Erlangen, 2018-09-05 - 2018-09-07)
Kuckuk S., Köstler H.:
Generation of Highly Parallel Multigrid Solvers for CFD Applications
SIAM Conference on Parallel Processing for Scientific Computing 2018 (Tokyo, 2018-03-07 - 2018-03-10)
Kuckuk S., Köstler H.:
Towards Whole Program Generation for Ocean Modeling
PASC'18 (Basel, 2018-07-02 - 2018-07-04)
Kuckuk S., Köstler H.:
Whole Program Generation of Massively Parallel Shallow Water Equation Solvers
2018 IEEE International Conference on Cluster Computing (CLUSTER) (Belfast, 2018-09-10 - 2018-09-13)
In: 2018 IEEE International Conference on Cluster Computing (CLUSTER) 2018
DOI: 10.1109/CLUSTER.2018.00020
URL: https://ieeexplore.ieee.org/document/8514861
Schmitt C., Schmid M., Kuckuk S., Köstler H., Teich J., Hannig F.:
Reconfigurable Hardware Generation of Multigrid Solvers with Conjugate Gradient Coarse-Grid Solution
In: Parallel Processing Letters 28 (2018), Article No.: 1850016
ISSN: 0129-6264
DOI: 10.1142/S0129626418500160

2017

Kuckuk S., Haase G., Vasco DA., Köstler H.:
Towards Generating Efficient Flow Solvers with the ExaStencils Approach
In: Concurrency and Computation-Practice & Experience 29 (2017), p. 4062:1-4062:17
ISSN: 1532-0626
DOI: 10.1002/cpe.4062
URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.4062
Kuckuk S., Köstler H.:
Metaprogramming for Unstructured Mesh Applications in Ocean Modeling
SIAM Conference on Computational Science and Engineering 2017 (Atlanta, 2017-02-27 - 2017-03-03)
Kuckuk S., Köstler H.:
Whole Program Generation for Complex Fluid Flow Solvers
PASC'17 (Lugano, 2017-06-26 - 2017-06-28)
Kuckuk S., Leitenmaier L., Schmitt C., Schönwetter D., Köstler H., Fey D.:
Towards Virtual Hardware Prototyping for Generated Geometric Multigrid Solvers
CS 2017-01 (2017), p. 1-8
ISSN: 2191-5008
Open Access: http://nbn-resolving.de/urn:nbn:de:bvb:29-opus4-83179
(Techreport)
Köstler H., Schmitt C., Kuckuk S., Kronawitter S., Hannig F., Teich J., Rüde U., Lengauer C.:
A Scala Prototype to Generate Multigrid Solver Implementations for Different Problems and Target Multi-Core Platforms
In: International Journal of Computational Science and Engineering 14 (2017), p. 150-163
ISSN: 1742-7185
DOI: 10.1504/IJCSE.2017.10003829

2016

Kronawitter S., Kuckuk S., Lengauer C.:
Redundancy Elimination in the ExaStencils Code Generator
First International Workshop on Data Locality in Modern Computing Systems (DLMCS 2016) (Granada, Spain, 2016-12-14 - 2016-12-16)
In: Proceedings of the First International Workshop on Data Locality in Modern Computing Systems (DLMCS), Berlin, Heidelberg, New York: 2016
DOI: 10.1007/978-3-319-49956-7_13
Kuckuk S.:
Challenges in Fully Generating Multigrid Solvers for the Simulation of non-Newtonian Fluids
HiStencils 2016 (Prague, 2016-01-18 - 2016-01-18)
Kuckuk S., Köstler H.:
Automatic Code Generation for Simulating Non-Newtonian Fluid Flows with ExaStencils
SIAM Conference on Parallel Processing 2016 (Paris, 2016-04-12 - 2016-04-15)
Kuckuk S., Köstler H.:
Automatic Code Generation for Simulations of Non-Newtonian Fluids
PASC'16 (Lausanne, 2016-06-08 - 2016-06-10)
Kuckuk S., Köstler H.:
Automatic Generation of Massively Parallel Codes from ExaSlang
In: Computation 4 (2016), p. 1-20
ISSN: 2079-3197
Open Access: http://www.mdpi.com/2079-3197/4/3/27
URL: http://www.mdpi.com/2079-3197/4/3/27/pdf
Schmitt C., Kuckuk S., Hannig F., Teich J., Köstler H., Rüde U., Lengauer C.:
Systems of Partial Differential Equations in ExaSlang
In: Software for Exascale Computing - SPPEXA 2013-2015, Berlin, Heidelberg, New York: Springer, 2016, p. 47-67 (Lecture Notes in Computational Science and Engineering, Vol.113)
ISBN: 9783319405261
DOI: 10.1007/978-3-319-40528-5_3

2015

Kuckuk S., Schmitt C., Kronawitter S.:
ExaSlang and the ExaStencils Code Generator
PASC'15 (Zürich, 2015-06-01 - 2015-06-03)
Schmitt C., Schmid M., Hannig F., Teich J., Kuckuk S., Köstler H.:
Generation of Multigrid-based Numerical Solvers for FPGA Accelerators
2nd International Workshop on High-Performance Stencil Computations (HiStencils) (Amsterdam, 2015-01-20 - 2015-01-20)
In: Proceedings of the 2nd International Workshop on High-Performance Stencil Computations (HiStencils) 2015
URL: https://www12.cs.fau.de/downloads/schmittch/publications/SSHTKK15histencils.pdf

2014

Grebhahn A., Kuckuk S., Schmitt C., Köstler H., Siegmund N., Apel S., Hannig F., Teich J.:
Experiments on Optimizing the Performance of Stencil Codes with SPL Conqueror
In: Parallel Processing Letters 24 (2014)
ISSN: 0129-6264
DOI: 10.1142/S0129626414410011
Grebhahn A., Siegmund N., Apel S., Kuckuk S., Schmitt C.:
Optimizing the Performance of Customizable Stencil Codes with Feature-Interaction Detection
Grebhahn A., Siegmund N., Apel S., Kuckuk S., Schmitt C., Köstler H.:
Optimizing Performance of Stencil Code with SPL Conqueror
1st International Workshop on High-Performance Stencil Computations (HiStencils) (Vienna, 2014-01-20 - 2014-01-20)
In: Proceedings of the 1st International Workshop on High-Performance Stencil Computations (HiStencils) 2014
URL: https://www12.cs.fau.de/downloads/schmittch/publications/GSAKSK14histencils.pdf
Kuckuk S.:
Generating Data Structures and Communication for Highly Parallel Geometric Multigrid Solvers
SPPEXA Doktorandenkolloquium (Erlangen, 2014-07-02 - 2014-07-02)
Kuckuk S., Gmeiner B., Köstler H., Rüde U.:
A Generic Prototype to Benchmark Algorithms and Data Structures
In: Parallel Computing: Accelerating Computational Science and Engineering (CSE), Berlin: IOS Press, 2014, p. 813-822 (Advances in Parallel Computing, Vol.25)
ISBN: 978-1-61499-380-3
DOI: 10.3233/978-1-61499-381-0-813
URL: http://ebooks.iospress.nl/volumearticle/35957
Kuckuk S., Schmitt C., Köstler H., Hannig F., Teich J.:
Generating Highly Parallel Geometric Multigrid Solvers with the ExaStencils Approach
3rd Workshop on Extreme-scale Programming Tools (New Orleans, 2014-11-17 - 2014-11-17)
Köstler H., Kuckuk S.:
Automatic Generation of Algorithms and Data Structures for Geometric Multigrid
SIAM Conference on Parallel Processing for Scientific Computing (Portland, 2014-02-18 - 2014-02-21)
Lengauer C., Apel S., Bolten M., Größlinger A., Hannig F., Köstler H., Rüde U., Teich J., Grebhahn A., Kronawitter S., Kuckuk S., Rittich H., Schmitt C.:
ExaStencils: Advanced Stencil-Code Engineering - First Project Report
(2014)
Open Access: http://www.fim.uni-passau.de/fileadmin/files/forschung/mip-berichte/MIP1401.pdf
(Techreport)
Lengauer C., Apel S., Größlinger A., Grebhahn A., Kronawitter S., Bolten M., Rittich H., Hannig F., Köstler H., Rüde U., Teich J., Kuckuk S., Schmitt C.:
ExaStencils: Advanced Stencil-Code Engineering
Euro-Par: Parallel Processing Workshops (Porto, 2014-08-25 - 2014-08-26)
In: Proceedings of Euro-Par 2014: Parallel Processing Workshops, Berlin; Heidelberg: 2014
DOI: 10.1007/978-3-319-14313-2_47
URL: http://link.springer.com/content/pdf/10.1007/978-3-319-14313-2_47.pdf
Schmitt C., Kuckuk S., Hannig F., Köstler H., Teich J.:
ExaSlang: A Domain-Specific Language for Highly Scalable Multigrid Solvers
4th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC) (New Orleans, LA, USA, 2014-11-17 - 2014-11-17)
In: Proc. of the 4th International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC), New York, NY, USA: 2014
DOI: 10.1109/WOLFHPC.2014.11
Schmitt C., Kuckuk S., Köstler H., Hannig F., Teich J.:
An Evaluation of Domain-Specific Language Technologies for Code Generation
14th International Conference on Computational Science and its Applications (ICCSA) (Minho, Guimaraes, 2014-06-30 - 2014-07-03)
In: Proc. of the 14th International Conference on Computational Science and its Applications (ICCSA), New York, NY, USA: 2014
DOI: 10.1109/ICCSA.2014.16

2013

Kuckuk S., Köstler H.:
A Framework for Interactive Physical Simulations on Remote HPC Clusters
CS-2013-06 (2013), p. 1-10
ISSN: 2191-5008
Open Access: https://opus4.kobv.de/opus4-fau/frontdoor/index/index/docId/5024
URL: http://opus4.kobv.de/opus4-fau/files/5024/Kuckuk_2013_VIPER.pdf
(Techreport)
Kuckuk S., Preclik T., Köstler H.:
Interactive particle dynamics using OpenCL and Kinect
In: International Journal of Parallel, Emergent and Distributed Systems 28 (2013), p. 518-536
ISSN: 1744-5760
DOI: 10.1080/17445760.2012.745671
URL: http://www.tandfonline.com/doi/pdf/10.1080/17445760.2012.745671

Automatic Code Generation for Massively Parallel Applications in Computational Fluid Dynamics

An open access version of the thesis is available at https://opus4.kobv.de/opus4-fau/frontdoor/index/index/docId/13050.

Abstract

Solving partial differential equations (PDEs) is a fundamental challenge in many application domains in industry and academia alike. With increasingly large problems, efficient and highly scalable implementations become more and more crucial. Today, facing this challenge is more difficult than ever due to the increasingly heterogeneous hardware landscape. One promising approach is developing domain‐specific languages (DSLs) for a set of applications. Using code generation techniques then allows targeting a range of hardware platforms while concurrently applying domain‐specific optimizations in an automated fashion. The present work aims to further the state of the art in this field. As our domain, we choose PDE solvers and, in particular, those from the group of geometric multigrid methods. To avoid too broad a focus, we restrict ourselves to methods working on structured and patch‐structured grids.

We face the challenge of handling a domain as complex as ours, while providing different abstractions for diverse user groups, by splitting our external DSL ExaSlang into multiple layers, each specifying different aspects of the final application. Layer 1 is designed to resemble LaTeX and allows inputting continuous equations and functions. Their discretization is expressed on layer 2. It is complemented by algorithmic components which can be implemented in a Matlab‐like syntax on layer 3. All information provided to this point is summarized on layer 4, enriched with particulars about data structures and the employed parallelization. Additionally, we support automated progression between the different layers. All ExaSlang input is processed by our jointly developed Scala code generation framework to ultimately emit C++ code. We particularly focus on how to generate applications parallelized with, e.g., MPI and OpenMP that are able to run on workstations and large‐scale clusters alike.

We showcase the applicability of our approach by implementing simple test problems, like Poisson’s equation, as well as relevant applications from the field of computational fluid dynamics (CFD). In particular, we implement scalable solvers for the Stokes, Navier‐Stokes and shallow water equations (SWE) discretized using finite differences (FD) and finite volumes (FV). For the case of Navier‐Stokes, we also extend our implementation towards non‐uniform grids, thereby enabling static mesh refinement, and advanced effects such as the simulated fluid being non‐Newtonian and non‐isothermal.
