Block-synchronous execution is a major source of parallel inefficiency. To improve the scalability of parallel codes, it can be crucial to replace block-synchronous execution with more fine-grained synchronization. OpenMP tasks with dependencies make it possible to express asynchronous execution with just the necessary synchronization at the process level. OpenMP 5.0 introduced detached tasks. In combination with MPI detached communication (a.k.a. MPI continuations), detached tasks make it possible to build task dependency graphs across MPI processes. In this presentation, you will learn how to integrate MPI detached communication into your project and profit from truly asynchronous communication. Using an example code, we will compare the parallel performance of different levels of synchronization. If you don’t want to use OpenMP tasks, the same approach also works with C++ futures/promises.
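To give a rough idea of the futures/promises variant, here is a minimal sketch and not the code shown in the talk. It uses no MPI at all: `fake_async_recv` is a hypothetical stand-in for a nonblocking receive that invokes a completion callback, and `sum_after_recv` is an illustrative name. The callback fulfills a `std::promise`, so the consumer waits only for the one message it depends on instead of a global synchronization point.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a nonblocking receive with a completion callback.
// A real code would hand the callback to the MPI library; here a helper
// thread fills the buffer after a short delay and then calls the callback.
std::thread fake_async_recv(std::vector<int>& buf,
                            std::function<void()> on_complete) {
    return std::thread([&buf, cb = std::move(on_complete)] {
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
        for (std::size_t i = 0; i < buf.size(); ++i) {
            buf[i] = static_cast<int>(i) + 1;  // "received" payload: 1..n
        }
        cb();  // signal completion of this one communication
    });
}

// Starts the simulated receive and a consumer that depends only on it.
int sum_after_recv(std::size_t n) {
    std::vector<int> buf(n, 0);
    std::promise<void> arrived;
    std::future<void> ready = arrived.get_future();
    assert(ready.valid());

    // The completion callback fulfills the promise.
    std::thread comm = fake_async_recv(buf, [&arrived] { arrived.set_value(); });

    // The consumer blocks on the future of its own input, nothing else.
    std::future<int> result = std::async(
        std::launch::async, [&buf, r = std::move(ready)]() mutable {
            r.get();  // returns once the callback has run
            int sum = 0;
            for (int v : buf) sum += v;
            return sum;
        });

    int sum = result.get();
    comm.join();
    return sum;
}
```

Calling `sum_after_recv(4)` returns 10 once the simulated message has arrived. With OpenMP, the role of the promise would be played by the event handle of a detached task.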
Joachim Jenke is a postdoctoral researcher at the IT Center of RWTH Aachen University. He received his doctoral degree from RWTH Aachen University in 2021. His research focuses on the correctness and performance of HPC applications. As leader of the OpenMP tools subcommittee and a member of the MPI tools working group, he is interested in pushing both programming models to new limits. He is the principal developer of the correctness analysis tools MUST and Archer.