Mpptest is a program that measures the performance of some of
the basic MPI message-passing routines in a variety of situations. In
addition to the classic ping-pong test, mpptest can measure performance with
many participating processes (exposing contention and scalability problems)
and can adaptively choose the message sizes in order to isolate sudden changes
in performance. Mpptest also includes a simple "halo" or
"ghost cell" exchange test; such tests are often more representative of real
performance in applications.
Mpptest was originally developed before the MPI standard, and
this is still reflected in some of the option names (e.g., -force
for ready-send). Mpptest is distributed with the mpich implementation of MPI, but
can be used with any MPI implementation. A version of mpptest
for any MPI implementation is available.
Mpptest strives to overcome many of the pitfalls in measuring
communication performance that are detailed in "How not
to Measure Communications Performance".
The design of mpptest is described in the paper "Reproducible
Measurements of MPI Performance Characteristics", presented at the 1999
PVM/MPI meeting.
Mpptest tests a single communication link by various methods. The tests are combinations of the following options:
Protocol: | |
-sync | Blocking sends/receives (default) |
-async | NonBlocking sends/receives |
-force | Ready-receiver (with a null message) |
-persistant | Persistent communication (only with MPI) |
-put | MPI_Put (only on systems that support it) |
-get | MPI_Get (only on systems that support it) |
-vector | Data is separated by constant stride (only with MPI, using UBs) |
-vectortype | Data is separated by constant stride (only with MPI, using MPI_Type_vector) |
Message data: | |
-cachesize n | Perform test so that cached data is NOT reused |
-vstride n | For -vector, set the stride between elements |
Message pattern: | |
-roundtrip | Roundtrip messages (default) |
-head | Head-to-head messages |
-halo | Halo Exchange (multiple head-to-head; limited options) |
Message test type: | |
(if not specified, only communication tests run) | |
-overlap | Overlap computation with communication (see -size) |
-overlapmsgsize nn | Size of messages to overlap with is nn bytes. |
-bisect | Bisection test (all processes participate) |
-bisectdist n | Distance between processes |
Message sizes: | |
-size start end stride | (default 0 1024 32) Messages of length (start + i*stride) for i=0,1,... until the length is greater than end. |
-sizelist n1,n2,... | Messages of length n1, n2, etc. are used. This overrides -size. |
-logscale | Messages of length 2**i are used. The -size argument may be used to set the limits. If -logscale is given, the default limits are from sizeof(int) to 128 k. |
-auto | Compute message sizes automatically (to create a smooth graph). Use the -size values for the lower and upper range. |
-autodx n | Minimum number of bytes between samples when using -auto |
-autorel d | Relative error tolerance when using -auto (0.02 by default) |
Detailed control of tests: | |
-quick | Shorthand for -autoavg -n_stable 5. This is a good choice for performing a relatively quick and accurate assessment of communication performance. |
-n_avg n | Number of times a test is run; the time is averaged over this number of tests |
-autoavg | Compute the number of times a message is sent automatically |
-tgoal d | Time that each test should take, in seconds. Use with -autoavg |
-rthresh d | Fractional threshold used to determine when minimum time has been found. The default is 0.05. |
-sample_reps n | Number of times a full test is run in order to find the minimum average time. The default is 30. |
-n_stable n | Number of full tests that must not change the minimum average value before mpptest will stop testing. By default, the value of -sample_reps is used (i.e., no early termination). |
-max_run_time n | Maximum number of seconds for all tests. |
-gop [ options ] | |
Collective Tests: | |
-dsum | reduction (double precision) |
-isum | reduction (integer) |
-sync | synchronization |
-colx | collect with known sizes |
-colxex | collect with known sizes, using the exchange algorithm |
-scatter | scatter |
-bcast | another name for -scatter |
Collective test control: | |
-pset n-m | processor set consisting of nodes n to m |
Output: | |
-cit | Generate data for CIt (default) |
-gnuplot | Generate data for GNUPLOT |
-givedy | Give the range of data measurements |
-fname filename | (default is stdout) (opened for append, not truncated) |
-noinfo | Do not generate plotter command lines or rate estimate |
-wx i n | windows in x: this window's number (i) and the total number of windows (n) |
-wy i n | windows in y: this window's number (i) and the total number of windows (n) |
-lastwindow | generate the wait/new page (always for 1 window) |
Pattern (Neighbor) choices: | |
-nbrring | neighbors are +/- distance |
-nbrdbl | neighbors are +/- 2**distance |
-nbrhc | neighbors are hypercube |
-nbrshift | neighbors are + distance (wrapped) |
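The message-size and early-termination rules described in the options above can be illustrated with a short sketch. This is not mpptest's own code: the function names (linear_sizes, logscale_sizes, min_average_time) are hypothetical, the lower logscale limit of 4 assumes sizeof(int) is 4 bytes, and the reading of -logscale as "all powers of two within the limits" and of -n_stable as "stop after that many consecutive full tests without a new minimum" is an interpretation of the descriptions given here. The -rthresh threshold is not modeled.

```python
from typing import Callable, List, Optional


def linear_sizes(start: int = 0, end: int = 1024, stride: int = 32) -> List[int]:
    """Lengths start + i*stride for i = 0, 1, ... while the length is <= end.

    Mirrors the description of -size (default 0 1024 32).
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    sizes = []
    length = start
    while length <= end:
        sizes.append(length)
        length += stride
    return sizes


def logscale_sizes(low: int = 4, high: int = 128 * 1024) -> List[int]:
    """Powers of two in [low, high], as an illustration of -logscale.

    The default low=4 is an assumption (sizeof(int) == 4).
    """
    sizes = []
    length = 1
    while length <= high:
        if length >= low:
            sizes.append(length)
        length *= 2
    return sizes


def min_average_time(run_full_test: Callable[[], float],
                     sample_reps: int = 30,
                     n_stable: Optional[int] = None) -> float:
    """Return the smallest average time seen over repeated full tests.

    run_full_test is a hypothetical callable returning one average time.
    At most sample_reps full tests are run; the loop stops early once
    n_stable consecutive tests fail to lower the minimum. With the default
    (n_stable equal to sample_reps) there is no early termination.
    """
    if n_stable is None:
        n_stable = sample_reps
    best = float("inf")
    unchanged = 0
    for _ in range(sample_reps):
        t = run_full_test()
        if t < best:
            best = t
            unchanged = 0
        else:
            unchanged += 1
            if unchanged >= n_stable:
                break
    return best
```

With the defaults, linear_sizes() gives the 33 lengths 0, 32, ..., 1024, and logscale_sizes() gives the 16 lengths 4, 8, ..., 131072. Taking the minimum of the averages rather than their mean follows the -sample_reps description above, which speaks of finding the minimum average time.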