NHR PerfLab Seminar: From Picojoules to Gigawatt-hours: Energy-to-Completion and GPU DVFS for LLM Workloads (January 20, hybrid)

Picture of a circuit board in green, with the inscription “Perflab Seminar” in front of it and the NHR@FAU logo in the left upper corner.
Image: NHR@FAU

Topic: From Picojoules to Gigawatt-hours: Energy-to-Completion and GPU DVFS for LLM Workloads

Speaker: Prof. Dr. Holger Fröning (Institute of Computer Engineering (ZITI), Heidelberg University)

Date and time: Tuesday, January 20, 2026, at 2:00 p.m. CET

Location: Seminar Room 2.049 (RRZE) and online via Zoom

Abstract:
Modern LLM workloads span energy scales from picojoules per operation to gigawatt-hours per training run. This talk argues that energy-to-completion (total joules to finish a fixed workload) is the right bridge metric, and that the common “race-to-idle” default is often an incomplete answer for transformer workloads. Using GPU DVFS sweeps on GPT-/LLaMA-style decoder layers, we observe characteristic U-shaped energy-frequency curves whose optima shift systematically with layer shape and sequence length. A Pareto analysis shows that, across representative decoder configurations, ~10–20% energy reductions are often achievable at the cost of roughly similar runtime penalties, while very high boost clocks can be energetically inefficient. To connect micro-level DVFS choices to system-scale outcomes, we then look at large-scale training-time modeling: how data/tensor/pipeline parallelism, pipeline bubbles, and communication overhead shape iteration time, and what today’s graph-based simulators can (and cannot yet) predict reliably. We close with open problems toward DVFS-aware LLM runtimes and end-to-end energy optimization.
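
To give a concrete feel for the kind of DVFS sweep described in the abstract, the following minimal Python sketch measures energy-to-completion for a fixed workload at several locked GPU core clocks and reports the energy-optimal setting. It is an illustrative sketch under assumptions not taken from the talk: pynvml (nvidia-ml-py) is installed, the GPU exposes the NVML total-energy counter (Volta or newer), clock locking via nvidia-smi is permitted (typically requires root), and run_fixed_workload() is a hypothetical placeholder for a decoder-layer benchmark.

# DVFS sweep sketch: energy-to-completion vs. locked GPU core clock.
# Assumptions: recent NVIDIA GPU with NVML energy counter, root rights
# for clock locking, and a user-supplied fixed workload.

import subprocess
import time
import pynvml


def run_fixed_workload():
    # Hypothetical placeholder: run the same decoder-layer benchmark
    # (fixed batch size, sequence length, and iteration count) each time.
    time.sleep(5.0)


def energy_to_completion(handle, clock_mhz):
    # Lock the GPU core clock for the duration of this run.
    subprocess.run(["nvidia-smi", f"--lock-gpu-clocks={clock_mhz},{clock_mhz}"],
                   check=True)
    try:
        e0 = pynvml.nvmlDeviceGetTotalEnergyConsumption(handle)  # millijoules
        t0 = time.time()
        run_fixed_workload()
        runtime_s = time.time() - t0
        energy_j = (pynvml.nvmlDeviceGetTotalEnergyConsumption(handle) - e0) / 1e3
    finally:
        subprocess.run(["nvidia-smi", "--reset-gpu-clocks"], check=True)
    return energy_j, runtime_s


if __name__ == "__main__":
    pynvml.nvmlInit()
    gpu = pynvml.nvmlDeviceGetHandleByIndex(0)
    results = {}
    for clk in [900, 1100, 1300, 1500, 1700]:  # example core clocks in MHz
        energy, runtime = energy_to_completion(gpu, clk)
        results[clk] = (energy, runtime)
        print(f"{clk} MHz: {energy:.1f} J, {runtime:.1f} s")
    best = min(results, key=lambda c: results[c][0])
    print(f"Energy-optimal clock in this sweep: {best} MHz")
    pynvml.nvmlShutdown()

Plotting energy over clock frequency from such a sweep typically yields the U-shaped curves mentioned above; a Pareto-style analysis would additionally compare the (energy, runtime) pairs rather than picking the energy minimum alone.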

Image of Holger Fröning
Image: Tobias Schwerdt

Short bio:
Holger Fröning is a full professor and leads the Hardware and Artificial Intelligence (HAWAII) Lab at the Institute of Computer Engineering at Heidelberg University. His research centers on embedded machine learning and high-performance computing, encompassing hardware and software architectures, programmability, co-design, data movement optimizations, and power and energy considerations. His work adopts a vertically integrated approach, addressing neural architecture optimizations, intermediate layers for compilation, mapping, and instrumentation, and various hardware targets. Current major research projects include accelerating Bayesian Neural Networks, promoting green machine learning by simplifying deep neural network architectures, and exploring emerging computing forms such as analog electrical and photonic computing, and resistive memory.


For a list of past and upcoming NHR PerfLab seminar events, please see: https://hpc.fau.de/research/nhr-perflab-seminar-series/