There is an MPI-to-channel layer that selects the protocol based on message size. This could even be a macro, using a global (or device-specific?) value.
The check_incoming routine remains about the same, with unexpected
message processing doing this:
    save information on the incoming message (tag, lrank, len, context)
    based on device/packet type, save additional information
        (e.g., data for eager send, send_id for get/rendezvous. Careful
        of rendezvous for Ssend).
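The bookkeeping described above might be sketched in C as follows. This is only an illustration under stated assumptions: the names PktKind, SavedMsg, and save_unexpected are hypothetical and do not come from the design itself, and the Ssend caveat is not modeled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet kinds; the real device may distinguish more. */
typedef enum { PKT_SHORT, PKT_EAGER, PKT_RNDV } PktKind;

/* Hypothetical record for one unexpected message. */
typedef struct {
    int     tag, lrank, context;  /* envelope, always saved */
    size_t  len;                  /* message length in bytes */
    PktKind kind;
    void   *data;     /* owned copy of the payload (short/eager), else NULL */
    long    send_id;  /* sender's handle (get/rendezvous), else -1 */
} SavedMsg;

/* Fill in *m for an incoming message. Returns 0 on success and -1 if
   the payload copy cannot be allocated. */
int save_unexpected(SavedMsg *m, int tag, int lrank, int context,
                    size_t len, PktKind kind,
                    const void *payload, long send_id)
{
    m->tag = tag;
    m->lrank = lrank;
    m->context = context;
    m->len = len;
    m->kind = kind;
    m->data = NULL;
    m->send_id = -1;

    if (kind == PKT_RNDV) {
        /* Data has not been sent yet; remember how to ask for it. */
        m->send_id = send_id;
        return 0;
    }
    if (len > 0) {
        /* Short/eager: the data arrived with the packet, so copy it. */
        assert(payload != NULL);
        m->data = malloc(len);
        if (m->data == NULL)
            return -1;
        memcpy(m->data, payload, len);
    }
    return 0;
}
```

The caller owns m->data and releases it with free() once the message has been delivered to a matching receive.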
So, a send looks something like this:
send( ..., dest, ... )
{
    MPID_Device *dev = devset->dev[dest];
    if (len < dev->long_len)
        (*(dev->short->send))( ... );
    else if (len < dev->vlong_len)
        (*(dev->long->send))( ... );
    else
        (*(dev->vlong->send))( ... );
}
The single device case can use just the if-else list (dev is fixed, use MPID_dev), and the fixed protocol case can use the explicit routines. The shared-memory version will use a slightly different version.
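A compilable sketch of the size-based dispatch in the send outline is given here, under a few assumptions. Because short and long are reserved words in C, the protocol fields are renamed short_msg, long_msg, and vlong_msg. The MPID_Protocol and MPID_DevSet layouts, the device_send name, and the stub send routines are hypothetical; only the three-way comparison against long_len and vlong_len is taken from the outline.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical signature for a protocol's send entry point. */
typedef int (*send_fn)(const void *buf, size_t len, int dest);

typedef struct { send_fn send; } MPID_Protocol;

typedef struct {
    size_t long_len;    /* len <  long_len  -> short protocol */
    size_t vlong_len;   /* len <  vlong_len -> long protocol  */
    MPID_Protocol *short_msg;   /* "short" is a C keyword */
    MPID_Protocol *long_msg;    /* "long" is a C keyword  */
    MPID_Protocol *vlong_msg;
} MPID_Device;

typedef struct {
    int nranks;          /* number of entries in dev[] */
    MPID_Device **dev;   /* dev[dest] is the device that reaches dest */
} MPID_DevSet;

/* Stub protocols that only report which path was chosen. */
enum { SENT_SHORT = 1, SENT_LONG, SENT_VLONG };

int send_short(const void *b, size_t n, int d)
{ (void)b; (void)n; (void)d; return SENT_SHORT; }

int send_long(const void *b, size_t n, int d)
{ (void)b; (void)n; (void)d; return SENT_LONG; }

int send_vlong(const void *b, size_t n, int d)
{ (void)b; (void)n; (void)d; return SENT_VLONG; }

/* Pick the device for dest, then the protocol for len. */
int device_send(const MPID_DevSet *devset, const void *buf,
                size_t len, int dest)
{
    assert(dest >= 0 && dest < devset->nranks);
    MPID_Device *dev = devset->dev[dest];

    if (len < dev->long_len)
        return dev->short_msg->send(buf, len, dest);
    else if (len < dev->vlong_len)
        return dev->long_msg->send(buf, len, dest);
    else
        return dev->vlong_msg->send(buf, len, dest);
}
```

Note that both thresholds are exclusive: a message of exactly long_len bytes takes the long protocol, and one of exactly vlong_len bytes takes the vlong protocol.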
A receive is more complex, since MPI guarantees that a blocking receive on one communicator does not affect any other communication. So a multi-device blocking receive looks something like this:
Is message already available?
    Yes ->
        Call recv_unexpected code (see below)
    No ->
        (test has already added message to list of posted receives (for
        thread safety))
        (a nonblocking receive exits here)
        while (message not received)
            if (req->push) (req->push)( req )
            for each device
                check_device( (ndev > 1) ? non-blocking : blocking )
This assumes that, at least in the single device case, you can block in
check_device to await a message. If this is not true, then the
check_device routine should return without blocking.
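The wait loop in the "No" branch could be written roughly as follows. This is a sketch only: the Request and Device layouts, the wait_for_receive name, and the countdown stub device are assumptions made for illustration. The one rule taken from the outline is that check_device may block only when there is a single device.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Request Request;
struct Request {
    int  completed;               /* set once the message has arrived */
    void (*push)(Request *req);   /* optional progress hook, may be NULL */
};

typedef struct Device Device;
struct Device {
    /* Look for incoming messages; may only block if blocking != 0. */
    void (*check_device)(Device *dev, int blocking);
    void *state;                  /* device-private data */
};

/* Spin until req completes, giving every device a chance each pass. */
void wait_for_receive(Request *req, Device **devs, int ndev)
{
    assert(ndev >= 1);
    /* With several devices, blocking in one would starve the others. */
    int blocking = (ndev > 1) ? 0 : 1;

    while (!req->completed) {
        if (req->push)
            req->push(req);
        for (int i = 0; i < ndev; i++)
            devs[i]->check_device(devs[i], blocking);
    }
}

/* Stub device for experiments: completes a request after a fixed
   number of polls (polls_left == 0 means it never completes one). */
typedef struct {
    Request *req;
    int polls_left;
    int polls_seen;
    int last_blocking;
} CountdownState;

void countdown_check(Device *dev, int blocking)
{
    CountdownState *s = dev->state;
    s->polls_seen++;
    s->last_blocking = blocking;
    if (s->polls_left > 0 && --s->polls_left == 0)
        s->req->completed = 1;
}
```

A device that cannot safely block would simply ignore the blocking argument and return after one poll, which keeps the loop correct at the cost of spinning.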
In the case of finding an unexpected message,
recv_unexpected
    Save matched data (tag, source)
    Check for message data available (short or eager delivery)
    Save functions (test,wait,push)
    if (req->push) (req->push)( req )
Note that this does not necessarily complete the message; this allows
non-blocking receives that match an unexpected message to act as non-blocking
operations.
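The recv_unexpected steps might look like this in C. The RecvReq and QueuedMsg layouts, the completed and received fields, and the count_push stub are hypothetical. Silently truncating to the receive buffer is a simplification of this sketch; a real implementation would report truncation as an error.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct RecvReq RecvReq;
struct RecvReq {
    int    tag, source;     /* filled in from the matched message */
    void  *buf;             /* user receive buffer */
    size_t buflen;
    size_t received;        /* bytes delivered so far */
    int    completed;
    int  (*test)(RecvReq *);
    void (*wait)(RecvReq *);
    void (*push)(RecvReq *);
};

/* Hypothetical entry on the unexpected-message queue. */
typedef struct {
    int    tag, lrank;
    size_t len;
    const void *data;       /* non-NULL if short/eager data is here */
    int  (*test)(RecvReq *);
    void (*wait)(RecvReq *);
    void (*push)(RecvReq *);
} QueuedMsg;

void recv_unexpected(RecvReq *req, const QueuedMsg *msg)
{
    /* Save matched data (tag, source). */
    req->tag = msg->tag;
    req->source = msg->lrank;

    /* Deliver now if the data came with the packet. */
    if (msg->data != NULL) {
        size_t n = msg->len < req->buflen ? msg->len : req->buflen;
        if (n > 0)
            memcpy(req->buf, msg->data, n);
        req->received = n;
        req->completed = 1;
    }

    /* Save the functions that finish the transfer. */
    req->test = msg->test;
    req->wait = msg->wait;
    req->push = msg->push;

    /* Start the transfer, but do not wait for it. */
    if (req->push)
        req->push(req);
}

/* Stub push hook that only counts how often it is called. */
int push_calls = 0;
void count_push(RecvReq *req) { (void)req; push_calls++; }
```

In the rendezvous case the request is returned incomplete with its push, test, and wait entries set, so a non-blocking receive can return immediately and finish later.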
Remaining concerns:
If the lowest levels send a message with a non-blocking operation, is it ever safe to wait on it? Or do we have to spin in a test loop? Should the wait entry be defined ONLY when it can be safely called? For example, in a rendezvous system, the non-blocking operation will be started only in response to a direct request; could you wait then? Do we leave this to the device? Implementation of push?
MPI-2 dynamic process management.
The biggest issue here is the support of dynamically resized "global" lists. Mostly, the implementation needs to remember that "grank" is not a rank in MPI_COMM_WORLD but is the result of the lrank_to_grank mapping in the communicator.
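One way to picture this is a global table indexed by grank that grows as processes are added, with every lookup going through the communicator's lrank_to_grank array. The sketch below is an assumption-laden illustration: the Comm and GlobalTable layouts, the dev_of array, and the function names are hypothetical; only lrank_to_grank and grank are names from the text.

```c
#include <assert.h>
#include <stdlib.h>

/* Minimal communicator: maps local ranks to global ranks. */
typedef struct {
    int        np;               /* number of processes in the communicator */
    const int *lrank_to_grank;   /* np entries */
} Comm;

/* Hypothetical global list: one device index per global rank. */
typedef struct {
    int  size;     /* number of valid entries */
    int *dev_of;   /* dev_of[grank] = index of the device to use */
} GlobalTable;

/* Grow the table to new_size entries, setting new entries to fill.
   Returns 0 on success, -1 on allocation failure (table unchanged). */
int table_grow(GlobalTable *t, int new_size, int fill)
{
    if (new_size <= t->size)
        return 0;
    int *p = realloc(t->dev_of, (size_t)new_size * sizeof *p);
    if (p == NULL)
        return -1;
    for (int i = t->size; i < new_size; i++)
        p[i] = fill;
    t->dev_of = p;
    t->size = new_size;
    return 0;
}

/* Always translate lrank through the communicator first; grank may
   refer to a process that was never in MPI_COMM_WORLD. */
int device_for(const GlobalTable *t, const Comm *comm, int lrank)
{
    assert(lrank >= 0 && lrank < comm->np);
    int grank = comm->lrank_to_grank[lrank];
    assert(grank >= 0 && grank < t->size);
    return t->dev_of[grank];
}
```

Because realloc may move the array, code should not hold pointers into dev_of across a call that can add processes.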