Publications, Posters and Talks

Below you will also find lists of our posters and talks.

Publications

[2019] [2018] [2017] [2016] [2015] [2014] [2013] [2012] [2011] [2010] [2009] [2008] [2007] [2006] [2005]

2019

  • J. Laukemann, J. Hammer, G. Hager, and G. Wellein: Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels.  10th IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS19), Denver, CO, USA. PMBS19 Best Late-Breaking Paper Award. Preprint: arXiv:1910.00214
  • J. Eitzinger, T. Gruber, A. Afzal, T. Zeiser, and G. Wellein: ClusterCockpit – A web application for job-specific performance monitoring. Accepted for HPCMASPA 2019, the Workshop for Monitoring and Analysis for High Performance Computing Systems and Applications, September 23, 2019, Albuquerque, NM, USA. Held in conjunction with IEEE Cluster 2019.
  • C. L. Alappat, G. Hager, O. Schenk, J. Thies, A. Basermann, A. R. Bishop, H. Fehske, and G. Wellein: A Recursive Algebraic Coloring Technique for Hardware-Efficient Symmetric Sparse Matrix-Vector Multiplication. Submitted. Preprint: arXiv:1907.06487
  • M. Bauer, J. Hötzer, D. Ernst, J. Hammer, M. Seiz, H. Hierl, J. Hönig, H. Köstler, G. Wellein, B. Nestler, and U. Rüde: Code Generation for Massively Parallel Phase-Field Simulations. Proc. International Conference for High Performance Computing, Networking, Storage and Analysis (SC19), Denver, CO, November 17-22, 2019. DOI: 10.1145/3295500.3356186
  • J. Hornich, J. Hammer, G. Hager, T. Gruber, and G. Wellein: Collecting and Presenting Reproducible Intranode Stencil Performance: INSPECT. Supercomputing Frontiers and Innovations 6(3), 4-25 (2019). ISSN 2313-8734, DOI: 10.14529/jsfi190301
  • J. Hofmann, C. L. Alappat, G. Hager, D. Fey, and G. Wellein: Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors. Submitted. Preprint: arXiv:1907.00048
  • A. Afzal, G. Hager, and G. Wellein: Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study. Proc. 2019 IEEE International Conference on Cluster Computing (CLUSTER), Albuquerque, NM, September 23-26, 2019. DOI: 10.1109/CLUSTER.2019.8890995, Preprint: arXiv:1905.10603
  • D. Ernst, G. Hager, J. Thies, and G. Wellein: Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs. Accepted for PPAM’2019, the 13th International Conference on Parallel Processing and Applied Mathematics,  September 8-11, 2019, Białystok, Poland. PPAM 2019 Best Paper Award. Preprint: arXiv:1905.03136
  • F. Cremonesi, G. Hager, G. Wellein, and F. Schürmann: Analytic Performance Modeling and Analysis of Detailed Neuron Simulations. Submitted. Preprint: arXiv:1901.05344
  • A. Alvermann, A. Basermann, H.-J. Bungartz, C. Carbogno, D. Ernst, H. Fehske, Y. Futamura, M. Galgon, G. Hager, S. Huber, T. Huckle, A. Ida, A. Imakura, M. Kawai, S. Köcher, M. Kreutzer, P. Kus, B. Lang, H. Lederer, V. Manin, A. Marek,  K. Nakajima, L. Nemec, K. Reuter, M. Rippl, M. Röhrig-Zöllner, T. Sakurai, M. Scheffler, C. Scheurer, F. Shahzad, D. Simoes Brambila, J. Thies, and G. Wellein: Benefits from using mixed precision computations in the ELPA-AEO and ESSEX-II eigensolver projects. Proc. EPASA 2018, Japan Journal of Industrial and Applied Mathematics, 36(2), 699-717, DOI: 10.1007/s13160-019-00360-8. Preprint: arXiv:1806.01036.
  • F. Shahzad, J. Thies, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: CRAFT: A library for easier application-level checkpoint/restart and automatic fault tolerance. IEEE Transactions on Parallel and Distributed Systems 30(3), 501-514 (2019). DOI: 10.1109/TPDS.2018.2866794, Preprint: arXiv:1708.02030

2018

  • J. Schmitt, H. Köstler, J. Eitzinger, and R. Membarth: Unified Code Generation for the Parallel Computation of Pairwise Interactions Using Partial Evaluation. Proc. 17th International Symposium on Parallel and Distributed Computing (ISPDC), Geneva, Switzerland, 2018, pp. 17-24. DOI: 10.1109/ISPDC2018.2018.00012.
  • G. Hager and G. Wellein: Performance Engineering. Informatik Spektrum, ISSN 1432-122X, Online first, DOI: 10.1007/s00287-018-1122-1. (in German)
  • J. Laukemann, J. Hammer, J. Hofmann, G. Hager, and G. Wellein: Automated Instruction Stream Throughput Prediction for Intel and AMD Microarchitectures. 2018 IEEE/ACM Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS), Dallas, TX, USA, 2018, pp. 121-131. DOI: 10.1109/PMBS.2018.8641578. Preprint: arXiv:1809.00912
  • M. Wittmann, G. Hager, R. Janalík, M. Lanser, A. Klawonn, O. Rheinbach, O. Schenk, and G. Wellein: Multicore Performance Engineering of Sparse Triangular Solves Using a Modified Roofline Model. Proc. 2018 30th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), September 24-27, 2018, Lyon, France, 233-241. DOI: 10.1109/CAHPC.2018.8645938
  • J. Hofmann, G. Hager, and D. Fey: On the accuracy and usefulness of analytic energy models for contemporary multicore processors. In: R. Yokota, M. Weiland, D. Keyes, and C. Trinitis (eds.): High Performance Computing: 33rd International Conference, ISC High Performance 2018, Frankfurt, Germany, June 24-28, 2018, Proceedings, Springer, Cham, LNCS 10876, ISBN 978-3-319-92040-5 (2018), 22-43. DOI: 10.1007/978-3-319-92040-5_2, Preprint: arXiv:1803.01618. Winner of the ISC 2018 Gauss Award.
  • J. Seiferth, C.L. Alappat, M. Korch, and T. Rauber: Applicability of the ECM Performance Model to Explicit ODE Methods on Current Multi-Core Processors. In: R. Yokota, M. Weiland, D. Keyes, and C. Trinitis (eds.): High Performance Computing: 33rd International Conference, ISC High Performance 2018, Frankfurt, Germany, June 24-28, 2018, Proceedings, Springer, Cham, LNCS 10876, ISBN 978-3-319-92040-5 (2018), 163-183. DOI: 10.1007/978-3-319-92040-5_9.
  • M. Kreutzer, G. Hager, D. Ernst, H. Fehske, A.R. Bishop, and G. Wellein: Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs. In: R. Yokota, M. Weiland, D. Keyes, and C. Trinitis (eds.): High Performance Computing: 33rd International Conference, ISC High Performance 2018, Frankfurt, Germany, June 24-28, 2018, Proceedings, Springer, Cham, LNCS 10876, ISBN 978-3-319-92040-5 (2018), 329-349. DOI: 10.1007/978-3-319-92040-5_17. ISC 2018 Hans Meuer Award Finalist.
  • J. Hornich, G. Hager, and C. Pflaum: Efficient optical simulation of nano structures in thin-film solar cells. Proc. SPIE 10694, Computational Optics II, 106940R (28 May 2018); DOI: 10.1117/12.2312545
  • M. Wittmann, V. Haag, T. Zeiser, H. Köstler, and G. Wellein: Lattice Boltzmann Benchmark Kernels as a Testbed for Performance Analysis. Computers & Fluids, Special Issue DSFD2017, (2018). DOI: 10.1016/j.compfluid.2018.03.030. Preprint: arXiv:1711.11468.

2017

  • M. Galgon, L. Krämer, B. Lang, A. Alvermann, H. Fehske, A. Pieper, G. Hager, M. Kreutzer, F. Shahzad, G. Wellein, A. Basermann, M. Röhrig-Zöllner, and J. Thies: Improved coefficients for polynomial filtering in ESSEX. In T. Sakurai, S.-L. Zhang, T. Imamura, Y. Yamamoto, Y. Kuramashi, and T. Hoshi (eds.), Eigenvalue Problems: Algorithms, Software and Applications, in Petascale Computing. Proc. EPASA 2015, Tsukuba, Japan, September 2015, volume 117 of LNCSE, pages 63-79. Springer International Publishing, 2017. DOI: 10.1007/978-3-319-62426-6_5
  • T. Heidig, T. Zeiser, and H. Freund: Influence of resolution of rasterized geometries on porosity and specific surface area exemplified for model geometries of porous media. Transport in Porous Media 120 (1), 207–225 (2017). DOI: 10.1007/s11242-017-0916-y.
  • S. Bauer, M. Mohr, U. Rüde, J. Weismüller, M. Wittmann, and B. Wohlmuth: A two-scale approach for efficient on-the-fly operator assembly in massively parallel high performance multigrid codes. Applied Numerical Mathematics 122 (Supplement C), 14-38 (2017). DOI: 10.1016/j.apnum.2017.07.006.
  • A. Pieper, G. Hager, and H. Fehske: A domain-specific language and matrix-free stencil code for investigating electronic properties of Dirac and topological materials. Submitted. Preprint: arXiv:1708.09689
  • T. Röhl, J. Eitzinger, G. Hager, and G. Wellein: LIKWID Monitoring Stack: A flexible framework enabling job specific performance monitoring for the masses. Accepted for the HPCMASPA 2017, the Workshop on Monitoring and Analysis for High Performance Computing Systems Plus Applications, held in conjunction with IEEE Cluster 2017, Honolulu, HI, September 5, 2017. DOI: 10.1109/CLUSTER.2017.115. Preprint: arXiv:1708.01476
  • T. M. Malas, G. Hager, H. Ltaief, and D. E. Keyes: Multi-dimensional intra-tile parallelization for memory-starved stencil computations. ACM Transactions on Parallel Computing 4(3), 12:1-12:32 (2017). DOI: 10.1145/3155290, Preprint: arXiv:1510.04995
  • J. Hofmann, G. Hager, G. Wellein, and D. Fey: An analysis of core- and chip-level architectural features in four generations of Intel server processors. In: J. Kunkel et al. (eds.), High Performance Computing: 32nd International Conference, ISC High Performance 2017, Frankfurt, Germany, June 18-22, 2017, Proceedings, Springer, Cham, LNCS 10266, ISBN 978-3-319-58667-0 (2017), 294-314. DOI: 10.1007/978-3-319-58667-0_16. Preprint: arXiv:1702.07554
  • J. Hammer, J. Eitzinger, G. Hager, and G. Wellein: Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels. In: Niethammer C., Gracia J., Hilbrich T., Knüpfer A., Resch M., Nagel W. (eds), Tools for High Performance Computing 2016, ISBN 978-3-319-56702-0, 1-22 (2017). Proceedings of IPTW 2016, the 10th International Parallel Tools Workshop, October 4-5, 2016, Stuttgart, Germany. Springer, Cham. DOI: 10.1007/978-3-319-56702-0_1, Preprint: arXiv:1702.04653

2016

  • H. Anzt, J. Dongarra, M. Kreutzer, G. Wellein and M. Köhler: Efficiency of General Krylov Methods on GPUs – An Experimental Study. Proc. 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), Chicago, IL, 683-691 (2016). DOI: 10.1109/IPDPSW.2016.45
  • H. Anzt, M. Kreutzer, E. Ponce, G. D. Peterson, G. Wellein, and J. Dongarra: Optimization and performance evaluation of the IDR iterative Krylov solver on GPUs. International Journal of High Performance Computing Applications (2016), ISSN: 1094-3420. DOI: 10.1177/1094342016646844
  • T. Röhl, J. Eitzinger, G. Hager, and G. Wellein: Validation of Hardware Events for Successful Performance Pattern Identification in High Performance Computing. In: A. Knüpfer et al. (eds.), Tools for High Performance Computing 2015, Springer International Publishing, ISBN 978-3-319-39589-0 (2016), 17-28. DOI: 10.1007/978-3-319-39589-0_2. Preprint: arXiv:1710.04094
  • F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, and G. Wellein: Building and utilizing fault tolerance support tools for the GASPI applications. International Journal of High Performance Computing Applications (2016). First published date: November-28-2016, DOI: 10.1177/1094342016677085. Preprint (post-review): ft-gaspi-ijhpca.pdf
  • M. Kreutzer, J. Thies, M. Röhrig-Zöllner, A. Pieper, F. Shahzad, M. Galgon, A. Basermann, H. Fehske, G. Hager, and G. Wellein: GHOST: Building blocks for high performance sparse linear algebra on heterogeneous systems. International Journal of Parallel Programming (2016). DOI: 10.1007/s10766-016-0464-z. Preprint: arXiv:1507.08101
  • A. Pieper, M. Kreutzer, A. Alvermann, M. Galgon, H. Fehske, G. Hager, B. Lang, and G. Wellein: High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations. Journal of Computational Physics 325, 226-243 (2016). DOI: 10.1016/j.jcp.2016.08.027, Preprint: arXiv:1510.04895
  • J. Hofmann, D. Fey, J. Eitzinger, G. Hager, and G. Wellein: Analysis of Intel’s Haswell Microarchitecture Using the ECM Model and Microbenchmarks. Proc. Architecture of Computing Systems — ARCS 2016, Volume 9637 of the series Lecture Notes in Computer Science, 210-222 (2016). DOI: 10.1007/978-3-319-30695-7_16
  • J. Hofmann, D. Fey, M. Riedmann, J. Eitzinger, G. Hager, and G. Wellein: Performance analysis of the Kahan-enhanced scalar product on current multi- and manycore processors. Concurrency Computat.: Pract. Exper., 29: e3921 (2016). DOI: 10.1002/cpe.3921. Preprint: arXiv:1604.01890
  • M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: Modeling and analyzing performance for highly optimized propagation steps of the lattice Boltzmann method on sparse lattices. Preprint: arXiv:1410.0412
  • T. M. Malas, J. Hornich, G. Hager, H. Ltaief, C. Pflaum, and D. E. Keyes: Optimization of an electromagnetics code with multicore wavefront diamond blocking and multi-dimensional intra-tile parallelization. Proc. IPDPS16, the 30th IEEE International Parallel & Distributed Processing Symposium, May 23-27, 2016, Chicago, IL. DOI: 10.1109/IPDPS.2016.87. Preprint: arXiv:1510.05218
  • J. Thies, M. Galgon, F. Shahzad, A. Alvermann, M. Kreutzer, A. Pieper, M. Röhrig-Zöllner, A. Basermann, H. Fehske, G. Hager, B. Lang, and G. Wellein: Towards an Exascale Enabled Sparse Solver Repository. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 295-316 (2016). DOI: 10.1007/978-3-319-40528-5_13. Preprint: lncs_CWPs-4.pdf
  • M. Kreutzer, J. Thies, A. Pieper, A. Alvermann, M. Galgon, M. Röhrig-Zöllner, F. Shahzad, A. Basermann, A. R. Bishop, H. Fehske, G. Hager, B. Lang, and G. Wellein: Performance Engineering and Energy Efficiency of Building Blocks for Large, Sparse Eigenvalue Computations on Heterogeneous Supercomputers. In: Software for Exascale Computing – SPPEXA 2013-2015, Volume 113 of the series Lecture Notes in Computational Science and Engineering, 317-338 (2016). DOI: 10.1007/978-3-319-40528-5_14

2015

  • B. Gmeiner, U. Rüde, H. Stengel, C. Waluga, and B. Wohlmuth: Towards Textbook Efficiency for Parallel Multigrid. Numerical Mathematics-Theory Methods and Applications 8 (2015), p. 22-46, ISSN: 1004-8979, DOI: 10.4208/nmtma.2015.w10si
  • B. Gmeiner, U. Rüde, H. Stengel, C. Waluga, and B. Wohlmuth: Performance and Scalability of Hierarchical Hybrid Multigrid Solvers for Stokes Systems. SIAM Journal on Scientific Computing 37 (2015), p. C143-C168, ISSN: 1064-8275, DOI: 10.1137/130941353
  • C. Feichtinger, J. Habich, H. Köstler, U. Rüde, and T. Aoki: Performance modeling and analysis of heterogeneous lattice Boltzmann simulations on CPU-GPU clusters. Parallel Computing 46, 1-13 (2015). DOI: 10.1016/j.parco.2014.12.003
  • J. Hammer, G. Hager, J. Eitzinger, and G. Wellein: Automatic Loop Kernel Analysis and Performance Modeling With Kerncraft. Proc. PMBS15, the 6th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems, in conjunction with ACM/IEEE Supercomputing 2015 (SC15), November 16, 2015, Austin, TX. DOI: 10.1145/2832087.2832092, Preprint: arXiv:1509.03778
  • J. Hofmann, D. Fey, J. Eitzinger, G. Hager, and G. Wellein: Performance analysis of the Kahan-enhanced scalar product on current multicore processors. In: R. Wyrzykowski et al. (eds.), Parallel Processing and Applied Mathematics: 11th International Conference, PPAM 2015, Krakow, Poland, September 6-9, 2015. Revised Selected Papers, Part I. LNCS vol. 9573, 63-73 (2016). DOI: 10.1007/978-3-319-32149-3_7. Preprint: arXiv:1505.02586
  • F. Shahzad, M. Kreutzer, T. Zeiser, R. Machado, A. Pieper, G. Hager, G. Wellein: Building a fault tolerant application using the GASPI communication layer. Proc. FTS 2015, the 1st International Workshop on Fault-Tolerant Systems, in conjunction with IEEE Cluster 2015, September 8, 2015, Chicago, IL. DOI: 10.1109/CLUSTER.2015.106, Preprint: arXiv:1505.04628
  • T. M. Malas, G. Hager, H. Ltaief, H. Stengel, G. Wellein, and D. E. Keyes: Multicore-optimized wavefront diamond blocking for optimizing stencil updates. SIAM Journal on Scientific Computing 37(4), C439-C464 (2015). DOI: 10.1137/140991133, Preprint: arXiv:1410.3060
  • M. Röhrig-Zöllner, J. Thies, M. Kreutzer, A. Alvermann, A. Pieper, A. Basermann, G. Hager, G. Wellein, and H. Fehske: Increasing the performance of the Jacobi-Davidson method by blocking. SIAM Journal on Scientific Computing, 37(6), C697–C722 (2015). DOI: 10.1137/140976017, Preprint: http://elib.dlr.de/89980/
  • H. Stengel, J. Treibig, G. Hager, and G. Wellein: Quantifying performance bottlenecks of stencil computations using the Execution-Cache-Memory model. Proc. ICS15, the 29th International Conference on Supercomputing, June 8-11, 2015, Newport Beach, CA. DOI: 10.1145/2751205.2751240. Preprint: arXiv:1410.5010
  • H. Fehske, G. Hager, and A. Pieper: Electron confinement in graphene with gate-defined quantum dots. Phys. Status Solidi B, 252: 1868–1871 (2015). DOI: 10.1002/pssb.201552119. Preprint: arXiv:1503.05815
  • M. Wittmann, G. Hager, T. Zeiser, J. Treibig, and G. Wellein: Chip-level and multi-node analysis of energy-optimized lattice-Boltzmann CFD simulations. Concurrency and Computation: Practice and Experience 28(7), 2295-2315 (2015). DOI: 10.1002/cpe.3489. Preprint: arXiv:1304.7664
  • M. Kreutzer, G. Hager, G. Wellein, A. Pieper, A. Alvermann, and H. Fehske: Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems. Proc. IPDPS15, the 29th IEEE International Parallel & Distributed Processing Symposium, May 25-29, 2015, Hyderabad, India. DOI: 10.1109/IPDPS.2015.76, Preprint: arXiv:1410.5242

2014

  • R. Schöne, J. Treibig, M.F. Dolz, C. Guillen, C. Navarrete, M. Knobloch, and B. Rountree: Tools and methods for measuring and tuning the energy efficiency of HPC systems. Scientific Programming 22(4), 273-283 (2014). DOI: 10.3233/SPR-140393
  • T. Röhl, J. Treibig, G. Hager, and G. Wellein: Overhead Analysis of Performance Counter Measurements. In: Proc. PSTI 2014, the Fifth International Workshop on Parallel Software Tools and Tool Infrastructures, Sept 11, 2014, Minneapolis, MN. DOI: 10.1109/ICPPW.2014.34
  • T. M. Malas, G. Hager, H. Ltaief, and D. E. Keyes: Towards energy efficiency and maximum computational intensity for stencil algorithms using wavefront diamond temporal blocking. Preprint: arXiv:1410.5561
  • A. Alvermann, A. Basermann, H. Fehske, M. Galgon, G. Hager, M. Kreutzer, L. Krämer, B. Lang, A. Pieper, M. Röhrig-Zöllner, F. Shahzad, J. Thies, and G. Wellein: ESSEX: Equipping Sparse Solvers for Exascale. In: L. Lopes et al. (Eds.): Euro-Par 2014 Workshops, Part II, LNCS 8806, 577-588 (2014). DOI: 10.1007/978-3-319-14313-2_49. Preprint
  • M. Kreutzer, G. Hager, G. Wellein, H. Fehske, and A. R. Bishop: A unified sparse matrix data format for efficient general sparse matrix-vector multiplication on modern processors with wide SIMD units. SIAM Journal on Scientific Computing 36(5), C401–C423 (2014). DOI: 10.1137/130930352, Preprint: arXiv:1307.6209, BibTeX
  • J. Hofmann, J. Treibig, G. Hager, and G. Wellein: Comparing the Performance of Different x86 SIMD Instruction Sets for a Medical Imaging Application on Modern Multi- and Manycore Chips. Accepted for WPMVP 2014, the Workshop on Programming Models for SIMD/Vector Processing at PPoPP 2014, Orlando, FL, Feb 16, 2014. DOI: 10.1145/2568058.2568068, Preprint: arXiv:1401.7494
  • J. Hofmann, J. Treibig, G. Hager, and G. Wellein: Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator. Accepted for PASA 2014, the 11th Workshop on Parallel Systems and Algorithms, Lübeck, Germany, Feb 25-26, 2014. IEEE Archive, Preprint: arXiv:1401.3615
  • S. Kronawitter, H. Stengel, G. Hager, and C. Lengauer: Domain-Specific Optimization of Two Jacobi Smoother Kernels and Their Evaluation in the ECM Performance Model. Parallel Processing Letters 24, 1441004 (2014). DOI: 10.1142/S0129626414410047
  • G. Hager, J. Treibig, J. Habich, and G. Wellein: Exploring performance and power properties of modern multicore chips via simple machine models. Concurrency and Computation: Practice and Experience 28(2), 189-210 (2016). First published online December 2013, DOI: 10.1002/cpe.3180, Preprint: arXiv:1208.2908

2013

  • M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: Domain decomposition and locality optimization for large-scale lattice Boltzmann simulations. Computers & Fluids 80 (2013), 283-289. DOI: 10.1016/j.compfluid.2012.02.007. Preprint: arXiv:1111.1129 (2011).
  • M. Wittmann, G. Hager, G. Wellein, T. Zeiser, and B. Krammer: MPC and Coarray Fortran: Alternatives to Classic MPI Implementations on the Examples of Scalable Lattice Boltzmann Flow Solvers. In: W. E. Nagel et al. (eds.), High Performance Computing in Science and Engineering ‘12, Springer, ISBN 978-3-642-33373-6 (2013) 367-372. DOI: 10.1007/978-3-642-33374-3_27
  • C. Scheit, G. Hager, J. Treibig, S. Becker, and G. Wellein: Optimization of FASTEST-3D for Modern Multicore Systems. Preprint: arXiv:1303.4538
  • T. Scharpff, K. Iglberger, G. Hager, and U. Rüde: Model-guided Performance Analysis of the Sparse Matrix-Matrix Multiplication. Proc. 2013 International Conference on High Performance Computing & Simulation (HPCS 2013), July 1-5, 2013, Helsinki, Finland. DOI: 10.1109/HPCSim.2013.6641452, Preprint: arXiv:1303.1651
  • M. Wittmann, G. Hager, T. Zeiser, and G. Wellein: Asynchronous MPI for the Masses. Preprint: arXiv:1302.4280
  • F. Shahzad, M. Wittmann, T. Zeiser, G. Hager, and G. Wellein: An Evaluation of Different IO Techniques for Checkpoint/Restart. Workshop on Large-Scale Parallel Processing 2013 (LSPP13). DOI: 10.1109/IPDPSW.2013.145, Preprint: asyn_ckpt_130115.pdf
  • F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: A survey of checkpoint/restart techniques on distributed memory systems. Parallel Processing Letters 23(04), 1340011-1340030 (2013). DOI: 10.1142/S0129626413400112
  • F. Shahzad, M. Wittmann, M. Kreutzer, T. Zeiser, G. Hager, and G. Wellein: PGAS implementation of SpMVM and LBM with GPI. Proceedings of the 7th International Conference on PGAS Programming Models, Oct. 3-4, 2013, Edinburgh, Scotland, 172-184 (2013).

2012

2011

  • G. Schubert, H. Fehske, G. Hager, and G. Wellein: Hybrid-parallel sparse matrix-vector multiplication with explicit communication overlap on current multicore-based systems. Parallel Processing Letters 21(3), 339-358 (2011). DOI: 10.1142/S0129626411000254, Preprint: arXiv:1106.5908
  • G. Hager, G. Schubert, T. Schoenemeyer, and G. Wellein: Prospects for Truly Asynchronous Communication with Pure MPI and Hybrid MPI/OpenMP on Current Supercomputing Platforms. Proc. Cray Users Group Conference 2011 (CUG 2011), May 23-26, 2011, Fairbanks, AK. Hager-Paper-CUG11.pdf
  • J. Treibig, G. Hager, and G. Wellein: LIKWID performance tools. In: C. Bischof et al. (eds.), Competence in High Performance Computing 2010. Springer, ISBN 978-3-642-24025-6 (2012), 165-175. DOI: 10.1007/978-3-642-24025-6_14, Preprint: arXiv:1104.4874
  • G. Schubert, G. Hager, H. Fehske and G. Wellein: Parallel sparse matrix-vector multiplication as a test case for hybrid MPI+OpenMP programming. Workshop on Large-Scale Parallel Processing (LSPP 2011), May 20th, 2011, Anchorage, AK. DOI:10.1109/IPDPS.2011.332, Preprint: arXiv:1101.0091
  • J. Treibig, G. Wellein and G. Hager: Efficient multicore-aware parallelization strategies for iterative stencil computations. Journal of Computational Science 2, 130-137 (2011). DOI: 10.1016/j.jocs.2011.01.010, Preprint: arXiv:1004.1741

2010

  • M. Wittmann and G. Hager: Optimizing ccNUMA locality for task-parallel execution under OpenMP and TBB on multicore-based systems. Preprint: arXiv:1101.0093
  • G. Hager and G. Wellein: Introduction to High Performance Computing for Scientists and Engineers. CRC Press, ISBN 978-1439811924, 356 pages, July 2010. Available as eBook.
  • C. Feichtinger, J. Habich, H. Köstler, G. Hager, U. Rüde and G. Wellein: A Flexible Patch-Based Lattice Boltzmann Parallelization Approach for Heterogeneous GPU-CPU Clusters. Parallel Computing 37(9), 536-549 (2011). DOI: 10.1016/j.parco.2011.03.005. Preprint: arXiv:1007.1388
  • M. Wittmann, G. Hager, J. Treibig and G. Wellein: Leveraging shared caches for parallel temporal blocking of stencil codes on multicore processors and clusters. Parallel Processing Letters 20 (4), 359-376 (2010). DOI: 10.1142/S0129626410000296. Preprint: arXiv:1006.3148
  • H. Fehske and G. Hager: Luttinger, Peierls or Mott? Quantum Phase Transitions in Strongly Correlated 1D Electron-Phonon Systems. In: F. Hensel, P. Edwards and R. Redmer (Eds.), Metal-to-Nonmetal Transitions. Springer Series in Material Sciences, Vol. 132, (Springer) 1-22, 2010. DOI: 10.1007/978-3-642-03953-9_1
  • J. Treibig, G. Hager, M. Meier and G. Wellein: LIKWID performance tools. InSiDE 8(1), 50-53 (Spring 2010).
  • J. Treibig, G. Hager and G. Wellein: LIKWID: A lightweight performance-oriented tool suite for x86 multicore environments. Proceedings of PSTI2010, the First International Workshop on Parallel Software Tools and Tool Infrastructures, San Diego CA, September 13, 2010. DOI: 10.1109/ICPPW.2010.38. Preprint: arXiv:1004.4431
  • J. Treibig, G. Hager and G. Wellein: Complexities of Performance Prediction for Bandwidth-Limited Loop Kernels on Multi-Core Architectures. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 3-12. DOI: 10.1007/978-3-642-13872-0_1, Preprint (Multi-core architectures: Complexities of performance prediction and the impact of cache topology): arXiv:0910.4865.
  • G. Schubert, G. Hager and H. Fehske: Performance limitations for sparse matrix-vector multiplications on current multicore environments. In: S. Wagner et al., High Performance Computing in Science and Engineering, Garching/Munich 2009. Springer, ISBN 978-3642138713 (2010), 13-26. DOI: 10.1007/978-3-642-13872-0_2, Preprint: arXiv:0910.4836.
  • M. Wittmann, G. Hager and G. Wellein: Multicore-aware parallel temporal blocking of stencil codes for shared and distributed memory. Workshop on Large-Scale Parallel Processing at IPDPS 2010, April 23rd, 2010, Atlanta, GA. Preprint: arXiv:0912.4506, DOI: 10.1109/IPDPSW.2010.5470813
  • J. Habich, T. Zeiser, G. Hager, G. Wellein: Performance analysis and optimization strategies for a D3Q19 Lattice Boltzmann Kernel on nVIDIA GPUs using CUDA. Advances in Engineering Software 42 (5), 266-272 (2011). DOI: 10.1016/j.advengsoft.2010.10.007

2009

  • T. Zeiser, G. Hager and G. Wellein: Benchmark analysis and application results for lattice Boltzmann simulations on NEC SX vector and Intel Nehalem systems. Parallel Processing Letters 19 (4), 491-511 (2009) DOI:10.1142/S0129626409000389
  • J. Treibig and G. Hager: Introducing a Performance Model for Bandwidth-Limited Loop Kernels. Proceedings of the Workshop “Memory issues on Multi- and Manycore Platforms” at PPAM 2009, the 8th International Conference on Parallel Processing and Applied Mathematics, Wroclaw, Poland, September 13-16, 2009. Lecture Notes in Computer Science Volume 6067, 2010, pp 615-624. DOI: 10.1007/978-3-642-14390-8_64. arXiv:0905.0792
  • T. Zeiser, G. Hager and G. Wellein: The world’s fastest CPU and SMP node: Some performance results from the NEC SX-9. Proceedings of LSPP 2009 at IPDPS09, Rome, Italy, May 25-29, 2009. DOI:10.1109/IPDPS.2009.5161089
  • G. Hager, G. Jost, and R. Rabenseifner: Communication Characteristics and Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-core SMP Nodes. In: Proceedings of the Cray Users Group Conference 2009 (CUG 2009), Atlanta, GA, USA, May 4-7, 2009. cug09_hager_jost_rabenseifner.pdf
  • G. Wellein, G. Hager, T. Zeiser, M. Wittmann, and H. Fehske: Efficient temporal blocking for stencil computations by multicore-aware wavefront parallelization. Proceedings of COMPSAC 2009, the 33rd Annual IEEE International Computer Software and Applications Conference, Seattle, WA, July 20-24, 2009. DOI:10.1109/COMPSAC.2009.82
  • J. Habich, T. Zeiser, G. Hager, and G. Wellein: Speeding up a Lattice Boltzmann Kernel on nVIDIA GPUs. Proceedings of PARENG09-S01, the First International Conference on Parallel, Distributed and Grid Computing for Engineering, Pecs, Hungary, April 2009. DOI:10.4203/ccp.90.17
  • M. Wittmann and G. Hager: A Proof of Concept for Optimizing Task Parallelism by Locality Queues. arXiv:0902.1884
  • R. Rabenseifner, G. Hager, and G. Jost: Hybrid MPI/OpenMP Parallel Programming on Clusters of Multi-Core SMP Nodes. In Didier El Baz et al. (Eds.), Proceedings of the 17th Euromicro International Conference on Parallel, Distributed, and network-based Processing PDP 2009, Feb 18-20, 2009, Weimar, Germany. Computer Society Press, pp. 427-436. DOI:10.1109/PDP.2009.43 hjr.pdf
  • S. Ejima, G. Hager, and H. Fehske: Quantum phase transition in a 1D transport model with boson affected hopping: Luttinger liquid versus charge-density-wave behavior. Phys. Rev. Lett. 102, 106404 (2009), DOI: 10.1103/PhysRevLett.102.106404, arXiv:0811.0742
  • T. Zeiser, G. Hager, and G. Wellein: Vector computers in a world of commodity clusters, massively parallel systems and many-core many-threaded CPUs: recent experience based on advanced lattice Boltzmann flow solvers. In: W. E. Nagel, D. B. Kröner, M. Resch (eds.), High Performance Computing in Science and Engineering ’08, Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2008, Springer, ISBN 978-3-540-88301-2, (2009) 333-347. DOI:10.1007/978-3-540-88303-6.

2008

  • N. Schindzielorz, J. Erler, P. Klüpfel, P.-G. Reinhard, and G. Hager: Fission of super-heavy nuclei explored with Skyrme forces. Int. J. Mod. Phys. E 18(4), 773-781 (2009). DOI:10.1142/S0218301309012860
  • M. Breuer, P. Lammers, T. Zeiser, G. Hager, and G. Wellein: Towards the simulation of the turbulent flow over dimples – Code evaluation and optimization for the NEC SX-8. In: W.E. Nagel, D. Kröner, M. Resch (eds.), High Performance Computing in Science and Engineering ’07, Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2007, Springer, ISBN 978-3-540-74739-0 / 978-3-540-74738-3, (2008) 303-318. DOI:10.1007/978-3-540-74739-0_21.
  • H. Fehske, G. Hager and J. Jeckelmann: Metallicity in the half-filled Holstein-Hubbard model. Europhys. Lett. 84, 57001 (2008), DOI:10.1209/0295-5075/84/57001, arXiv:0808.1675
  • G. Hager, T. Zeiser and G. Wellein: Data access optimizations for highly threaded multi-core CPUs with multiple memory controllers. Workshop on Large-Scale Parallel Processing 2008, DOI:10.1109/IPDPS.2008.4536341, arXiv:0712.2302
  • G. Hager, T. Zeiser and G. Wellein: Data access characteristics and optimizations for Sun UltraSPARC T2 and T2+ systems. Parallel Processing Letters, Vol. 18, No. 4 (2008) 471-490. DOI:10.1142/S0129626408003521 Preprint: ppl-hzw.pdf

2007

  • G. Hager, A. Weiße, G. Wellein, E. Jeckelmann and H. Fehske: The spin-Peierls chain revisited. J. Magn. Magn. Mater. 310, 1380-1382 (2007). Erratum: J. Magn. Magn. Mater. 316, 43 (2007). Proceedings of the 17th International Conference on Magnetism (ICM 2006), Aug 20-25 2006, Kyoto, Japan. arXiv:cond-mat/0606360
  • M. Hohenadler, G. Hager, G. Wellein and H. Fehske: Carrier-density effects in many-polaron systems. J. Phys.: Condens. Matter 19 (2007) 255202. arXiv:cond-mat/0609296
  • T. Zeiser, G. Wellein, A. Nitsure, K. Iglberger, U. Rüde and G. Hager: Introducing a parallel cache oblivious blocking approach for the lattice Boltzmann method. Progress in Computational Fluid Dynamics, An Int. J. Vol. 8, No.1/2/3/4 (2008) 179-188. Proceedings of ICMMES 2006. DOI:10.1504/PCFD.2008.018088
  • G. Hager and G. Wellein: Architectures and Performance Characteristics of Modern High Performance Computers. In Fehske et al., Lect. Notes Phys. 739, 681-730 (2008), ISBN: 978-3-540-74685-0
  • G. Hager and G. Wellein: Optimization Techniques for Modern High Performance Computers. In Fehske et al., Lect. Notes Phys. 739, 731-767 (2008), ISBN: 978-3-540-74685-0
  • G. Hager, H. Stengel, T. Zeiser and G. Wellein: RZBENCH: Performance evaluation of current HPC architectures using low-level and application benchmarks. In: S. Wagner et al. (Eds.), High Performance Computing in Science and Engineering, Garching/Munich 2007. Transactions of the Third Joint HLRB and KONWIHR Status and Result Workshop, Dec 3-4, 2007, LRZ Garching, Springer, ISBN 978-3-540-69181-5 (2009) 485-501. arXiv:0712.3389
  • M. Stürmer, G. Wellein, G. Hager, H. Köstler and U. Rüde: Challenges and potentials of emerging multicore architectures. In: S. Wagner et al. (Eds.), High Performance Computing in Science and Engineering, Garching/Munich 2007. Transactions of the Third Joint HLRB and KONWIHR Status and Result Workshop, Dec 3-4, 2007, LRZ Garching, Springer, ISBN 978-3-540-69181-5 (2009) 551-566.

2006

  • G. Wellein, P. Lammers, G. Hager, S. Donath and T. Zeiser: Towards optimal performance for lattice Boltzmann applications on terascale computers. In: A. Deane et al. (eds), Parallel Computational Fluid Dynamics – Theory and Applications. Proceedings of the Parallel CFD 2005 Conference, College Park, MD, USA, May 24-27, 2005. Elsevier, ISBN 0-444-52206-9 (2006) 31-40.
  • H. Fehske, G. Hager, G. Wellein and E. Jeckelmann: Hole-doped Hubbard ladders. Physica B 378-380, 319-320 (2006). arXiv:cond-mat/0505666
  • G. Schubert, A. Alvermann, A. Weiße, G. Hager, G. Wellein and H. Fehske: Spectral Properties of Strongly Correlated Electron Phonon Systems. NIC Symposium 2006, G. Münster, D. Wolf, M. Kremer (Editors), John von Neumann Institute for Computing, Jülich, NIC Series, Vol. 32, ISBN 3-00-017351-X, pp. 201-210, 2006.
  • A. Weiße, G. Hager, A. R. Bishop and H. Fehske: Phase diagram of the spin-Peierls chain with local coupling. Phys. Rev. B 74, 214426 (2006). arXiv:cond-mat/0607209
  • A. Nitsure, K. Iglberger, U. Rüde, C. Feichtinger, G. Wellein, G. Hager: Optimization of Cache Oblivious Lattice Boltzmann Method in 2D and 3D. In: Becker, Matthias; Szczerbicka, Helena (Eds.): Simulationstechnique – 19th Symposium in Hannover, September 2006 (ASIM 2006 – 19. Symposium Simulationstechnik, Hannover, 12. – 14. 09. 2006). Erlangen, SCS Publishing House, 2006, pp. 265-270 (Frontiers in Simulation, Vol. 16)
  • P. Lammers, G. Wellein, T. Zeiser, G. Hager, M. Breuer: Have the vectors the continuing ability to parry the attack of the killer micros? In: M. Resch, T. Bönisch, K. Benkert, T. Furui, Y. Seo, W. Bez (editors): High Performance Computing on Vector Systems. Proceedings of the High Performance Computing Center Stuttgart, March 2005, Springer, ISBN 3-540-29124-5, (2006) 25-39. doi:10.1007/3-540-35074-8_2.

2005

  • G. Hager: A parallelized density matrix renormalization group algorithm and its application to strongly correlated quantum systems. Dissertation, Ernst-Moritz-Arndt-Universität Greifswald, 2005. URN: urn:nbn:de:gbv:9-000024-1
  • G. Hager, T. Zeiser and H. Heller: Setting up ByGRID – First Steps Towards an e-Science Infrastructure in Bavaria. In: A. Bode, F. Durst (Eds.): High Performance Computing in Science and Engineering, Garching 2005. Transactions of the KONWIHR Result Workshop, October 14-15, 2004, Technical University of Munich, Garching, Springer, ISBN 3-540-26145-1 (2005) 97-102.
  • G. Hager, G. Wellein, E. Jeckelmann and H. Fehske: Stripe formation in doped Hubbard ladders. Phys. Rev. B 71, 075108 (2005). arXiv:cond-mat/0409321
  • H. Fehske, G. Wellein, G. Hager, A. Weiße, K.W. Becker and A.R. Bishop: Luttinger liquid versus charge density wave behaviour in the one-dimensional spinless fermion Holstein model. Physica B 359-361, 699-701 (2005). arXiv:cond-mat/0406023
  • G. Hager, T. Zeiser, J. Treibig and G. Wellein: Optimizing performance on modern HPC systems: learning from simple kernel benchmarks. In: Proceedings of the 2nd Russian-German Advanced Research Workshop on Computational Science and High Performance Computing, HLRS, Stuttgart, March 14 – 16, 2005.
  • G. Wellein, T. Zeiser, S. Donath and G. Hager: On the Single Processor Performance of Simple Lattice Boltzmann Kernels. Proc. ICMMES, 2004. Computers & Fluids 35, 910-919 (2006). DOI:10.1016/j.compfluid.2005.02.008
  • S. Donath, T. Zeiser, G. Hager, J. Habich and G. Wellein: Optimizing Performance of the Lattice Boltzmann Method for Complex Structures on Cache-based Architectures. In: F. Huelsemann, M. Kowarschik, U. Ruede (Eds.): Frontiers in Simulation: Simulation Techniques – 18th Symposium in Erlangen, September 2005 (ASIM), pp. 728-735, SCS Publishing House, Erlangen, 2005.
  • G. Hager, B. Bergen, P. Lammers and G. Wellein: Taming the Bandwidth Behemoth – First Experiences on a Large SGI Altix System. InSiDE 3, No. 2, Autumn 2005, pp. 24-25 (2005).

2004

  • G. Hager, E. Jeckelmann, H. Fehske and G. Wellein: Parallelization Strategies for Density Matrix Renormalization Group Algorithms on Shared-Memory Systems. J. Comput. Phys. 194(2), 795 (2004). arXiv:cond-mat/0305463
  • H. Fehske, G. Wellein, G. Hager, A. Weiße and A. R. Bishop: Quantum Lattice Dynamical Effects on Single-Particle Excitations in One-dimensional Mott and Peierls Insulators. Phys. Rev. B 69, 165115 (2004). arXiv:cond-mat/0312426
  • G. Hager, G. Wellein, E. Jeckelmann and H. Fehske: DMRG Investigation of Stripe Formation in Doped Hubbard Ladders. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2004 – Transactions of the Second Joint HLRB and KONWIHR Result and Reviewing Workshop (Second Joint HLRB and KONWIHR Result and Reviewing Workshop Munich – Germany 2-3 March 2004). Berlin: Springer, 2004.
  • G. Hager, E. Jeckelmann, H. Fehske and G. Wellein: Exact Numerical Treatment of Finite Quantum Systems using Leading-Edge Supercomputers. In: Modelling, Simulation and Optimization of Complex Processes, Eds. H. G. Bock, E. Kostina, H.-X. Phu, R. Rannacher, Springer-Verlag Berlin Heidelberg (2005), pp 165-175.
  • G. Wellein, T. Zeiser, G. Hager and P. Lammers: Application Performance of Modern Number Crunchers. CSAR Focus, Ed. 12, Summer-Autumn 2004, pp. 17-19 (2004).

2003

  • G. Wellein, G. Hager, A. Basermann and H. Fehske: Fast sparse matrix-vector multiplication for TFlop/s computers. In: J.M.L.M. Palma; J. Dongarra (Eds.): High Performance Computing for Computational Science – VECPAR2002 (High Performance Computing for Computational Science – VECPAR2002 Porto – Portugal 26-28 June 2002). Berlin: Springer, 2003.
  • H. Fehske, G. Wellein, A. P. Kampf, M. Sekania, G. Hager, A. Weiße, H. Büttner and A. R. Bishop: One-dimensional electron-phonon systems: Mott- versus Peierls-insulators. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2002 – Transactions of the First Joint HLRB and KONWIHR Result and Reviewing Workshop (First Joint HLRB and KONWIHR Result and Reviewing Workshop Garching – Germany 10-11 October 2002). Berlin: Springer, 2003.
  • G. Hager, F. Deserno and G. Wellein: Pseudo-Vectorization and RISC Optimization Techniques for the Hitachi SR8000 architecture. In: A. Bode (Ed.): High Performance Computing in Science and Engineering 2002 – Transactions of the First Joint HLRB and KONWIHR Result and Reviewing Workshop (First Joint HLRB and KONWIHR Result and Reviewing Workshop Garching – Germany 10-11 October 2002). Berlin: Springer, 2003.
  • G. Hager, F. Brechtefeld, P. Lammers and G. Wellein: Processor Architecture and Application Performance in Modern Supercomputers. InSiDE 1, No. 1, Spring 2003, pp. 8-13 (2003).

2001

  • G. Wellein, G. Hager, A. Basermann and H. Fehske: Exact Diagonalization of Large Sparse Matrices: A Challenge for Modern Supercomputers. In: Proceedings of CRAY Users Group (CUG) Summit 2001 (CUG Summit 2001 Indian Wells – USA May 2001). 2001, CD-ROM.

Posters

2019

  • T. Gruber, J. Eitzinger, G. Hager, and G. Wellein: LIKWID 5: Lightweight Performance Tools. Poster at SC19.
  • J. Hammer, J. Hornich, G. Hager, T. Gruber, and G. Wellein: INSPECT Intranode Stencil Performance Evaluation Collection. Poster at SC19.
  • A. Afzal, G. Hager, and G. Wellein: Delay Flow Mechanisms on Clusters. Poster at EuroMPI 2019. EuroMPI2019_AHW-Poster.pdf, EuroMPI2019-AHW-Summary.pdf

2018

2016

2015

2014

2013

2011

  • J. Treibig, G. Hager, M. Meier, and G. Wellein: LIKWID Performance Tools. Poster at SC11.

Talks

[2019] [2018] [2017] [2016] [2015]

2019

  • J. Laukemann: Automatic Throughput and Critical Path Analysis of x86 and ARM Assembly Kernels. 10th IEEE International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS19), Denver, CO, November 18, 2019 (co-located with SC19).
  • J. Eitzinger: Software-Eigenentwicklungen am RRZE – Ein Erfahrungsbericht. ZKI AK Supercomputing, FU Berlin, September 26-27, 2019.
  • J. Eitzinger: KONWIHR – Kompetenznetzwerk für wissenschaftliches Höchstleistungsrechnen in Bayern. ZKI AK Supercomputing, FU Berlin, September 26-27, 2019.
  • A. Afzal: Propagation and Decay of Injected One-Off Delays on Clusters: A Case Study. Paper presentation at IEEE Cluster 2019, Albuquerque, NM, September 25, 2019.
  • D. Ernst: Performance Engineering for a Tall & Skinny Matrix Multiplication Kernel on GPUs. Paper presentation at PPAM 2019, Bialystok, Poland, September 10, 2019.
  • C. L. Alappat: First results for performance modeling on ARM CPUs. 2nd ARM HPC Workshop, Shanghai, China, July 12, 2019.
  • C. L. Alappat: Recursive Algebraic Coloring Engine. Lawrence Berkeley National Laboratory (LBNL), Berkeley CA, June 14, 2019.
  • C. L. Alappat: Recursive Algebraic Coloring Engine. Georgia Institute of Technology, Atlanta, GA, June 18, 2019.
  • G. Hager: Von der Wettervorhersage zur Kernwaffe: Supercomputer – was sie sind und was sie können. Night of Science, Universität Frankfurt, June 14, 2019.

2018

  • T. Gruber: LIKWID – Detecting performance limiting factors with hardware monitoring. Talk at aiXcelerate 2018 (HPC Tuning Workshop) at IT Center of RWTH Aachen University, Aachen, Germany, December 4, 2018. LIKWID.pdf
  • T. Gruber: Single node optimization. Lecture at International HPC Summer School 2018, IT4Innovations National Supercomputing Center Ostrava, Czech Republic, July 12, 2018.
  • T. Gruber: The Performance Addict’s Toolbox. Talk at Heise Parallel 2018 conference, Heidelberg, Germany, March 8, 2018.
  • G. Hager: Making sense of performance numbers. Invited talk at OpenMPCon 2018, Barcelona, Spain, September 24-26, 2018.
  • G. Hager: Thirteen modern ways to fool the masses with performance results on parallel computers. GridKa School 2018, Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany, August 29, 2018.
  • C. Alappat: RACE: Recursive Algebraic Coloring Engine. PASC MS23, Basel, Switzerland, July 2-4, 2018.
  • G. Hager: Performance Engineering – Why and How? PASC MS05, Basel, Switzerland, July 2-4, 2018. PASC18_MS05_Hager.pdf
  • G. Wellein: Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs. ISC High Performance 2018, Frankfurt, Germany, June 24-28, 2018.
  • T. Köster: Porting Physical Parameterizations from a Climate Model to Accelerators. PASC MS25, Basel, Switzerland, July 2-4, 2018.
  • J. Eitzinger: Der unstillbare Hunger nach Rechenleistung: 25 Jahre High Performance Computing am RRZE. Video recording, Campustreffen Sommersemester 2018, RRZE Erlangen, June 7, 2018.
  • G. Hager: Von der Wettervorhersage zur Kernwaffe: Supercomputer – was sie sind und was sie können. Night of Science, Universität Frankfurt, June 8, 2018.
  • G. Wellein: “Ja wie schnell laufen sie denn?!” Performance Engineering fürs Höchstleistungsrechnen. Inauguration ceremony of LiDo3, TU Dortmund, May 16, 2018.
  • J. Eitzinger: ProPE: Node Level Performance Engineering and Performance Patterns, Workshop: Parallele Programmierung in Computational Engineering and Science, RWTH Aachen, March 15, 2018
  • G. Hager: “If it doesn’t work, we learn something.” Instructive case studies from performance engineering. Minisymposium MS29 at SIAM PP18, the 2018 Conference on Parallel Processing, March 8, 2018, Tokyo, Japan.
  • G. Wellein: Performance Engineering for Sparse Linear Algebra Kernels: Navigating Between Models and Expectations. Minisymposium MS85 at SIAM PP18, the 2018 Conference on Parallel Processing, March 9, 2018, Tokyo, Japan.
  • J. Hammer: Cache-aware Scheduling and Performance Modeling with LLVM-Polly and Kerncraft. Second LLVM Performance Workshop at CGO, Saturday February 24th, 2018, Vienna, Austria.
  • J. Eitzinger: SIMD – past, present and future, Keynote talk, WPMVP Workshop at PPoPP 2018, February 24th, 2018, Vienna, Austria.

2017

  • T. Gruber: LIKWID Monitoring Stack – A flexible framework enabling job specific performance monitoring for the masses. Talk at HPCMASPA Workshop held in conjunction with IEEE Cluster 2017, Honolulu, Hawaii, USA, September 5, 2017.
  • T. Gruber: LIKWID and performance monitoring with ECM. Talk at Lawrence Berkeley National Laboratory, Berkeley, USA, August 15, 2017.
  • T. Gruber: Performance Analysis with LIKWID. Talk at “Parallel Programming in Computational Engineering and Science 2017” at IT Center of RWTH Aachen University, Aachen, Germany, March 21, 2017.
  • G. Wellein: Performance Engineering for Scalable Sparse Eigensolvers in the DFG Project ESSEX: From basic building blocks to full scale applications. JST/CREST International Symposium on Post Petascale System Software, December 12th, 2017, Tokyo, Japan.
  • G. Wellein: Performance Engineering: Welcome to the world of FLOPs, Bytes and Cycles! 34th ASE Seminar, December 13th, 2017, The University of Tokyo, Tokyo, Japan.
  • J. Eitzinger: Components for practical performance engineering in a computing center environment: The ProPE project, 7. GA Status Konferenz, HLRS Stuttgart, December 4, 2017.
  • J. Eitzinger: Eine kurze Einführung in Rechnerarchitektur und Programmierung von Hochleistungsrechnern als zentrales Werkzeug in der Simulation. Video recording, Collegium Alexandrinum, Erlangen, November 30, 2017.
  • G. Wellein: Performance Engineering for HPC: Models Generating Insights.
    • Austrian HPC Meeting 2017 (AHPC17), March 1-3, 2017, Grundlsee (Austria)
    • Seminar, Faculty of Informatics, USI Lugano, March 29, 2017, Lugano (Switzerland)
    • Invited Talk, General Assembly of SFB/TRR-55, June 2, 2017, Regensburg (Germany)
    • EoCoE Face-to-Face Meeting Autumn 2017, November 29, 2017, Toulouse (France)
  • J. Eitzinger: Defining upper performance bounds using analytic performance models – Opportunities and Limitations, Dagstuhl Seminar 17431 “Performance Portability in Extreme Scale Computing: Metrics, Challenges, Solutions”, Schloß Dagstuhl, October 22, 2017
  • J. Eitzinger: Introduction and Demo: Likwid Performance Tools. Seminar Talk at University Regensburg, October 20, 2017
  • G. Hager: The curses and blessings of analytic performance modeling. Invited talk at PPAM 2017, the 12th International Conference on Parallel Processing and Applied Mathematics, Lublin, Poland, September 10-13, 2017.
  • J. Eitzinger: Evaluation of Intel Xeon Phi “Knights Landing”: Initial impressions and benchmarking results, Prace Xeon Phi User Forum, LRZ Garching, June 28, 2017.
  • G. Hager: Supercomputer: Mächtiges Werkzeug und Forschungsobjekt. Night of Science, Universität Frankfurt, June 9, 2017.
  • G. Hager: Thirteen modern ways to fool the masses with performance results on parallel computers. Evening talk at the Course on “Parallel Programming of High Performance Systems 2017”, LRZ Garching, March 6-10, 2017.
  • G. Hager: Making sense of temporally blocked stencil performance via analytic modeling. Invited talk at the 7th AICS International Symposium, Integrated Research Center of Kobe University, Kobe, Japan, February 23-24, 2017.

2016

  • T. Gruber: Performance Analysis with LIKWID on IBM POWER8 chips. Talk at PADC Workshop 2016 at JSC Jülich, Jülich, Germany, October 18, 2016.
  • T. Gruber: Performance Analysis with LIKWID. Talk at “Parallel Programming in Computational Engineering and Science 2016” at IT Center of RWTH Aachen University, Aachen, Germany, March 17, 2016.
  • G. Hager: Performance Engineering for Algorithmic Building Blocks in GHOST. Talk at the ESSEX Minisymposium at the SPPEXA Symposium 2016, LRZ Garching, Germany, January 25-27, 2016
  • J. Hammer: From Regular to Irregular Algorithm Performance Modeling. Talk at UT Austin, May 6, 2016.
  • J. Eitzinger: Pattern-driven Performance Engineering. Talk at NEC User group, Osaka, Japan, May 23, 2016.
  • J. Eitzinger: Evaluation of Intel Xeon Phi “Knights Corner”: Opportunities and Shortcomings, Prace Xeon Phi User Forum, LRZ Garching, June 29, 2016.
  • J. Hammer: Modeling Approaches of Graph Algorithm Performance. Talk at UT Austin, August 12, 2016.
  • J. Hammer: From Tool Supported Performance Modeling of Regular Algorithms to Modeling of Irregular Algorithms. Talk at the Scalable Tools Workshop, Lake Tahoe, CA, August 2, 2016.
  • J. Hammer: Kerncraft: A Tool for Analytic Performance Modeling of Loop Kernels. Talk and poster at the International Parallel Tools Workshop, HLRS Stuttgart, October 5, 2016.
  • J. Eitzinger: Thoughts on Whitebox-Performance Modeling. Talk at the University of Amsterdam, February 15, 2016.
  • J. Hammer: Automatic Loop Kernel Analysis and Performance Modeling with Kerncraft. Talk at the University of Amsterdam, February 15, 2016.

2015

  • J. Eitzinger: Systematic Node-Level Performance Engineering. Seminar talk, UT Austin, Austin, TX, USA, February 2, 2015.
  • J. Eitzinger: Employing the ECM performance model on SIMD Kahan summation. Talk at the Workshop on Programming Models for SIMD/Vector Processing, San Francisco, CA, USA, February 8, 2015.
  • J. Eitzinger: Evaluierung von Co-Array Fortran als Alternative zu MPI. Talk at the Parallel 2015 conference, Karlsruhe, Germany, April 22, 2015.
  • J. Eitzinger: Introducing the ECM diagnostic performance model. Invited talk at Scalperf 2015, Bertinoro, Italy, September 23, 2015.
  • G. Hager: Systematic Node-Level Performance Engineering. Talk at the SPEC DevOps Meeting, University of Würzburg, Germany, February 20, 2015.
  • G. Hager: Insight into stencil performance by analytic modeling. Talk at the Dagstuhl Seminar on Advanced Stencil Code Engineering, Schloss Dagstuhl, Wadern, Germany, April 13-17, 2015.
  • G. Hager: White-box modeling for performance and energy: Useful patterns for resource optimization. Invited lecture at PACO 2015, the Workshop on Power-Aware Computing, Max Planck Institute for Dynamics of Complex Technical Systems, Magdeburg, Germany, July 6-7, 2015.
  • G. Hager: Model-guided performance engineering of numerical kernels. Invited talk at the meeting of the SFB Transregio 55 “Hadron Physics from Lattice QCD,” University of Wuppertal, Germany, July 10, 2015.
  • G. Hager: Holistic node-level performance engineering for maximum resource efficiency on modern multi-core CPUs. Talk at ParisTech TELECOM, Paris, France, September 7, 2015.
  • J. Hammer: Automatic Loop Kernel Analysis and Performance Modeling. Talk at the workshop on “Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems”, Supercomputing 2015, Austin, TX, November 13, 2015.
  • M. Kreutzer: Performance Engineering of the Kernel Polynomial Method on Large-Scale CPU-GPU Systems. Talk at the “2015 IEEE International Parallel and Distributed Processing Symposium” (IPDPS), Hyderabad, India, May 26, 2015.
  • F. Shahzad: Building a Fault Tolerant Application Using the GASPI Communication Layer. Paper presentation at the “1st International Workshop on Fault Tolerant Systems” (FTS2015), in conjunction with IEEE Cluster 2015, Chicago, IL, September 8, 2015.
  • M. Wittmann: Performance Modeling and Analysis of Stencil operations in Earth Mantle Convection Simulations. Talk at ParCo 2015, Symposium on Parallel solvers for very large PDE based systems in the earth and atmospheric sciences, Edinburgh, Scotland, September 1-4, 2015.
  • M. Wittmann: Locality and Performance Optimized Adjacency List Generation for Lattice Boltzmann Based Simulations. Talk at ParCFD 2015, Montreal, Canada, May 17-21, 2015.