Tuning MPI Applications for Peak Performance

MPI is now widely accepted as a standard for message-passing parallel computing libraries. Both applications and important benchmarks are being ported from other message-passing libraries to MPI. In most cases the translation is fairly straightforward and preserves the semantics of the original program. On the other hand, MPI's more advanced features offer many opportunities for increasing the performance of parallel applications, and a straightforward translation of an existing program may not take advantage of them. New parallel applications are also being written in MPI, and an understanding of the performance-critical issues in message-passing programs, along with an explanation of how to address them using MPI, can help the application programmer deliver a greater fraction of the hardware's peak performance to an application. This tutorial discusses performance-critical issues in message-passing programs, explains how to examine the performance of an application using MPI-oriented tools, and shows how the features of MPI can be used to attain peak application performance.

The Presentation

The talk, Tuning MPI Applications for Peak Performance, is available in several forms:

A short version of this talk is also available in HTML form. Also available are individual parts of the tutorial:

  • Introduction to MPI Message Passing
  • Message Protocols
  • Example of Message Blocking
  • Performance Modeling
  • MPI Datatypes and Performance
  • Tuning for MPI Protocols
The Programs

The programs used in this tutorial are available in Tutorial examples. Part 3 of these examples (in the directory src3) may be used to characterize an MPI implementation. A gzip'ed tar file containing the programs is available.