In applications where communication between processes is expensive relative to computation, it often helps to overlap the two. MPI's nonblocking routines make this straightforward.
For example, in a 2-D finite difference mesh, moving data needed for
the boundaries can be done at the same time as computation on the
interior.
    MPI_Irecv( ... each ghost edge ... );
    MPI_Isend( ... data for each ghost edge ... );
    ... compute on interior
    while (still some uncompleted requests) {
        MPI_Waitany( ... requests ... );
        if (request is a receive)
            ... compute on that edge ...
    }
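Note that we call MPI_Waitany several times on the same array of
requests. This works because a completed request is set to
MPI_REQUEST_NULL, which the wait and test routines accept as a valid
request and ignore. Once every request in the array is null,
MPI_Waitany returns immediately with the index set to MPI_UNDEFINED,
which tells the loop that nothing is left to wait for.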