Jan Laukemann

Jan Laukemann, M. Sc.

PhD Student

Central Scientific Institutions
Erlangen National High Performance Computing Center

Room 04.139
Martensstraße 1
91058 Erlangen

Email: jan.laukemann@fau.de
Website: https://hpc.fau.de/people

Short bio

Jan Laukemann is a PhD student at the University of Erlangen-Nürnberg (FAU) and works at the Erlangen National High Performance Computing Center (NHR@FAU). Previously, he completed his Master’s degree at FAU and worked as a Research Scientist at Intel Parallel Computing Labs (Intel PCL). His work focuses on application optimization and performance engineering for HPC systems, and on novel algorithms for scalable linear algebra, tensor decomposition, and graph computations. His research interests primarily include x86 and non-x86 computer architectures, their node-level performance behavior, and vectorization techniques. He is the main developer of the Open Source Architecture Code Analyzer (OSACA), a static in-core kernel analysis tool, and is a member of the organizing committee of the annual HPC-AI Advisory Council Student Cluster Competition at ISC High Performance.

Research fields

List of publications

2026

Laukemann J., Hager G., Wellein G.:
Microarchitectural comparison, in-core modeling, and memory hierarchy analysis of state-of-the-art CPUs: Grace, Sapphire Rapids, and Genoa
In: Parallel Computing 127 (2026), Article No.: 103183
ISSN: 0167-8191
DOI: 10.1016/j.parco.2026.103183

2025

Laukemann J.:
Reproducibility Report for SC25 Paper Bine Trees: Enhancing Collective Operations by Optimizing Communication Locality
2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 (St. Louis, MO, USA, 2025-11-16 - 2025-11-21)
In: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 2025
DOI: 10.1145/3712285.3769440
Laukemann J., Helal AE., Anderson SIG., Checconi F., Soh Y., Tithi JJ., Ranadive T., Gravelle BJ., Petrini F., Choi J.:
Accelerating Sparse Tensor Decomposition Using Adaptive Linearized Representation
In: IEEE Transactions on Parallel and Distributed Systems (2025)
ISSN: 1045-9219
DOI: 10.1109/TPDS.2025.3553092
Wolfson-Pou J., Laukemann J., Petrini F.:
MAGNUS: Generating Data Locality to Accelerate Sparse Matrix-Matrix Multiplication on CPUs
ICS '25: 2025 International Conference on Supercomputing (Salt Lake City, USA, 2025-06-08 - 2025-06-11)
In: ICS '25: Proceedings of the 39th ACM International Conference on Supercomputing 2025
DOI: 10.1145/3721145.3725773

2024

Laukemann J., Gruber T., Hager G., Oryspayev D., Wellein G.:
CloverLeaf on Intel Multi-Core CPUs: A Case Study in Write-Allocate Evasion
38th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2024 (San Francisco, CA, 2024-05-27 - 2024-05-31)
In: 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2024
DOI: 10.1109/IPDPS57955.2024.00038
Laukemann J., Hager G., Wellein G.:
Microarchitectural comparison and in-core modeling of state-of-the-art CPUs: Grace, Sapphire Rapids, and Genoa
SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis (Atlanta, 2024-11-17 - 2024-11-22)
In: SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, New York City: 2024
DOI: 10.1109/SCW63240.2024.00181

2023

Laukemann J., Hager G.:
Core-Level Performance Engineering with the Open-Source Architecture Code Analyzer (OSACA) and the Compiler Explorer
14th Annual ACM/SPEC International Conference on Performance Engineering, ICPE 2023 (Coimbra, 2023-04-15 - 2023-04-19)
In: ICPE 2023 - Companion of the 2023 ACM/SPEC International Conference on Performance Engineering 2023
DOI: 10.1145/3578245.3583716
Ravedutti Lucio Machado R., Eitzinger J., Laukemann J., Hager G., Köstler H., Wellein G.:
MD-Bench: A performance-focused prototyping harness for state-of-the-art short-range molecular dynamics algorithms
In: Future Generation Computer Systems-The International Journal of Grid Computing Theory Methods and Applications 149 (2023), p. 25-38
ISSN: 0167-739X
DOI: 10.1016/j.future.2023.06.023
Soh Y., Helal AE., Checconi F., Laukemann J., Tithi JJ., Ranadive T., Petrini F., Choi JW.:
Dynamic Tensor Linearization and Time Slicing for Efficient Factorization of Infinite Data Streams
International Symposium on Parallel and Distributed Processing (IPDPS) (St. Petersburg, FL, 2023-05-15 - 2023-05-19)
In: 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2023
DOI: 10.1109/IPDPS54959.2023.00048

2022

Nguyen A., Helal AE., Checconi F., Laukemann J., Tithi JJ., Soh Y., Ranadive T., Petrini F., Choi JW.:
Efficient, out-of-memory sparse MTTKRP on massively parallel architectures
36th ACM International Conference on Supercomputing, ICS 2022 (Online, 2022-06-27 - 2022-06-30)
In: Proceedings of the International Conference on Supercomputing 2022
DOI: 10.1145/3524059.3532363

2021

Alappat C., Meyer N., Laukemann J., Gruber T., Hager G., Wellein G., Wettig T.:
Execution-Cache-Memory modeling and performance tuning of sparse matrix-vector multiplication and Lattice quantum chromodynamics on A64FX
In: Concurrency and Computation-Practice & Experience (2021)
ISSN: 1532-0626
DOI: 10.1002/cpe.6512
URL: https://onlinelibrary.wiley.com/doi/full/10.1002/cpe.6512
Helal AE., Laukemann J., Checconi F., Tithi JJ., Ranadive T., Petrini F., Choi J.:
ALTO: Adaptive Linearized Storage of Sparse Tensors
ICS '21: 2021 International Conference on Supercomputing (Virtual Event, USA, 2021-06-14 - 2021-06-17)
DOI: 10.1145/3447818.3461703
URL: https://dl.acm.org/doi/10.1145/3447818.3461703

2020

Alappat C., Laukemann J., Gruber T., Hager G., Wellein G., Meyer N., Wettig T.:
Performance Modeling of Streaming Kernels and Sparse Matrix-Vector Multiplication on A64FX
2020 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2020 (2020-11-12)
In: Proceedings of PMBS 2020: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems 2020
DOI: 10.1109/PMBS51919.2020.00006

2019

Laukemann J., Hammer J., Hager G., Wellein G.:
Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels
10th IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2019
DOI: 10.1109/PMBS49563.2019.00006
Laukemann J., Hammer J., Hofmann J., Hager G., Wellein G.:
Automated instruction stream throughput prediction for Intel and AMD microarchitectures
2018 IEEE/ACM Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, PMBS 2018 (Dallas, TX, 2018-11-12)
In: Proceedings of PMBS 2018: Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, Held in conjunction with SC 2018: The International Conference for High Performance Computing, Networking, Storage and Analysis 2019
DOI: 10.1109/PMBS.2018.8641578

List of activities

2023

OSACA – A Multi-Platform Static Code Analyzer for In-core Performance Prediction
(Speech / Talk)
2023-06-19, Event: 2023 Scalable Tools Workshop, Rice University
URL: https://dyninst.github.io/scalable_tools_workshop/petascale2023/monday.html

2021

OOKAMI A64FX Webinar: LIKWID, OSACA, SpMV
(Speech / Talk)
2021-07-27, Event: OOKAMI Hackathon
URL: https://moodle.nhr.fau.de/course/view.php?id=8