⚡ Benchmarking Computation-Communication Overlap in MPI

In high-performance computing (HPC) applications, overlapping computation with communication is a key optimization technique that can significantly improve performance, especially when using the Message Passing Interface (MPI) for distributed computing. However, achieving effective overlap in MPI applications is challenging, because it depends on hardware capabilities, network latency, and how the MPI implementation makes progress on outstanding operations.
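
To make this concrete, the canonical pattern posts non-blocking point-to-point operations, performs computation that is independent of the message buffers while the transfer is (potentially) in flight, and waits for completion only when the data is actually needed. The sketch below illustrates this in C; the message size, the pairing of ranks via `rank ^ 1` (which assumes an even number of processes), and the dummy compute kernel are illustrative assumptions, not part of the topic description.

```c
#include <mpi.h>
#include <stdio.h>

#define N (1 << 20)  /* message size in doubles; illustrative choice */
static double sendbuf[N], recvbuf[N];

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int peer = rank ^ 1;  /* pair up neighboring ranks; assumes an even process count */

    for (int i = 0; i < N; i++) sendbuf[i] = (double)rank;

    /* Post receive and send without blocking. */
    MPI_Request reqs[2];
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);

    /* Computation that does not touch the message buffers can
       proceed while the transfer is (potentially) in flight. */
    double acc = 0.0;
    for (long i = 1; i <= 50 * 1000 * 1000L; i++)
        acc += 1.0 / (double)i;

    /* Only after the wait is it safe to reuse sendbuf or read recvbuf. */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    printf("rank %d: acc = %.6f, recvbuf[0] = %.1f\n", rank, acc, recvbuf[0]);
    MPI_Finalize();
    return 0;
}
```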

This thesis will focus on benchmarking and analyzing the potential for computation-communication overlap in MPI applications. The student will design experiments to evaluate how well computation can overlap with communication for different MPI communication patterns (e.g., point-to-point, collective operations) across various network configurations. The student will explore optimization techniques, such as non-blocking communication, advanced MPI features, and network optimizations, and assess their impact on overall application performance.
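
As one instance of the advanced MPI features mentioned above, non-blocking collectives (introduced in MPI-3) extend the same overlap pattern to collective operations. The following is a minimal sketch of overlapping an `MPI_Iallreduce` with independent computation; the vector length and the stand-in compute kernel are illustrative assumptions.

```c
#include <mpi.h>
#include <stdio.h>

#define N (1 << 20)  /* illustrative vector length */
static double local[N], global[N];

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < N; i++) local[i] = (double)rank;

    /* Start the reduction, then compute on data that does not
       depend on its result while it (potentially) progresses. */
    MPI_Request req;
    MPI_Iallreduce(local, global, N, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    double acc = 0.0;  /* stand-in compute kernel */
    for (long i = 1; i <= 50 * 1000 * 1000L; i++)
        acc += 1.0 / (double)i;

    /* The reduction result may only be read after the wait. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    if (rank == 0)
        printf("acc = %.6f, global[0] = %.1f\n", acc, global[0]);
    MPI_Finalize();
    return 0;
}
```

Whether overlap actually materializes here depends on the library's progress engine: many MPI implementations only advance outstanding operations from inside MPI calls, so techniques such as polling with `MPI_Test`, asynchronous-progress threads, or hardware offload on the network adapter are exactly the kinds of optimizations the experiments can compare.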

The work will involve running MPI applications on multiple testbed systems, from smaller clusters to large-scale HPC environments, and measuring the resulting performance improvements in terms of execution time, bandwidth utilization, and scaling behavior. The student will also explore the impact of different interconnects (e.g., InfiniBand, Ethernet) and their role in enabling or limiting effective overlap.
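
One common way to quantify overlap in such measurements is to time a compute-only phase, a communication-only phase, and a combined phase, and then report how much of the shorter phase was hidden behind the longer one. The harness below is a minimal sketch of that methodology; the buffer size, the stand-in compute kernel, the pairwise exchange, and the exact metric definition are assumptions for illustration, not a prescribed design.

```c
#include <mpi.h>
#include <stdio.h>

#define N (1 << 20)  /* 8 MiB of doubles per buffer; illustrative */
static double sendbuf[N], recvbuf[N];

/* Stand-in compute kernel; a real benchmark would use an
   application-relevant workload. */
static double compute(void) {
    double acc = 0.0;
    for (long i = 1; i <= 100 * 1000 * 1000L; i++)
        acc += 1.0 / (double)i;
    return acc;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int peer = rank ^ 1;  /* pair up neighboring ranks; assumes an even process count */

    /* Phase 1: computation only. */
    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    volatile double sink = compute();
    double t_comp = MPI_Wtime() - t0;

    /* Phase 2: communication only (blocking exchange). */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    MPI_Sendrecv(sendbuf, N, MPI_DOUBLE, peer, 0,
                 recvbuf, N, MPI_DOUBLE, peer, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    double t_comm = MPI_Wtime() - t0;

    /* Phase 3: overlapped (non-blocking exchange plus computation). */
    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    MPI_Request reqs[2];
    MPI_Irecv(recvbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sendbuf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &reqs[1]);
    sink = compute();
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    double t_both = MPI_Wtime() - t0;

    /* One possible overlap metric: the fraction of the shorter phase
       hidden behind the longer one (1.0 = perfect overlap,
       0.0 = fully serialized execution). */
    double shorter = t_comp < t_comm ? t_comp : t_comm;
    double overlap = (t_comp + t_comm - t_both) / shorter;

    if (rank == 0)
        printf("comp %.3f s  comm %.3f s  both %.3f s  overlap %.2f\n",
               t_comp, t_comm, t_both, overlap);

    (void)sink;
    MPI_Finalize();
    return 0;
}
```

In practice such runs would be repeated and averaged, since single-run timings on a shared system can be noisy; the sketch only shows the measurement structure.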

Skills required:

  • Strong understanding of MPI programming and parallel computing
  • Proficiency in C/C++ programming

Approximate composition: 25% State-of-the-art analysis, 25% Theory/Design, 50% Implementation/Experiments