MPICH Buglist for version 1.2.7p1
This file contains a list of known bugs for the MPICH implementation of MPI.
Patches exist for some of these and are indicated by a link to a patchfile.
The numbers refer to the bug report number assigned by our bug tracking system.
If you wish to submit a new bug report on MPICH, send mail to
mpi-bugs@mcs.anl.gov .
A non-HTML-table version of this file is available.
To use these patch files, download the file, place it in your mpich directory, and execute
patch -p0 < patchfile
It is important to download the patch, not cut and paste it from the display
of your web browser. On many browsers, right-clicking on the link will
download the file. The reason for this is that some patches may contain tabs,
and most browsers will not preserve the tabs when they display the
file; copying and pasting that text will cause the patch to fail because the
tabs won't be found.
If you have trouble with patch, you can look at the patchfile itself;
it is just the output of the Unix diff program applied to the old and
new versions of the file or files.
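As a minimal sketch of the point above, the following script builds a tiny patchfile with diff in a scratch directory and applies it with patch -p0. The file names (demo.c, demo.c.new, demo.patch) are made up for the illustration and are not part of MPICH; the script skips the apply step if patch is not installed.

```shell
#!/bin/sh
# Illustration only: every file name below is made up for the demo.
set -e
work=$(mktemp -d)
cd "$work"

# An "old" file and a corrected "new" file; note the tab before "return".
printf 'int main(void)\n{\n\treturn 1;\n}\n' > demo.c
printf 'int main(void)\n{\n\treturn 0;\n}\n' > demo.c.new

# A patchfile is just diff output. diff exits with status 1 when the
# files differ, so keep "set -e" from aborting the script here.
diff -u demo.c demo.c.new > demo.patch || true
rm demo.c.new

if command -v patch >/dev/null 2>&1; then
    # -p0 uses the file names exactly as they appear in the patchfile.
    patch -p0 < demo.patch
    patched=yes
else
    echo "patch is not installed; skipping the apply step"
fi
```

Reading demo.patch in an editor shows the tab inside the changed lines, which is the character that copying from a browser display tends to lose.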
| Bug Number | Device | Description | Patchfile |
|---|---|---|---|
Fortran Compilers
Some users have had difficulty when mixing C and Fortran compiler vendors.
For example, when using GNU gcc and the Portland Group pgf77, C programs would
fail to link. This is because MPICH 1.2.7 computes the definitions of some
Fortran types at MPI_Init time by using a small Fortran routine, and in some
cases this routine refers to symbols
that are defined only in the Fortran run time libraries. To fix this, you
need to add the Fortran libraries when building C programs. You can do this
with the -lib switch in configure:
configure -cc=gcc -fc=pgf77 -lib="-L/usr/local/pgi/linux86/lib -lpgftnrtl -lpgc" ...
The next release of MPICH will attempt to avoid the need for these libraries
when possible and will determine them for you otherwise.
If you are using a Fortran 90 compiler as your Fortran compiler (e.g.,
-fc=pgf90), you'll need a longer list of libraries. Look in the library
directory for your Fortran compiler for the libraries. For example,
configure -cc=gcc -fc=pgf90 -lib="-L/usr/local/pgi/linux86/lib -lpgf90 -lpgf90rtl -lpgftnrtl -lpgc" ...
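The -lib value in these examples is a -L directory followed by one -l entry per library. A small sketch of assembling it is shown below; make_lib_switch is a hypothetical helper, not something shipped with MPICH, and the directory and library names are simply the ones from the pgf90 example, so they must be adjusted to whatever your compiler installs. The script prints the configure command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: build the value for configure's -lib switch from a
# library directory and a list of library names (without the "lib" prefix).
make_lib_switch() {
    dir=$1
    shift
    out="-L$dir"
    for name in "$@"; do
        out="$out -l$name"
    done
    printf '%s\n' "$out"
}

# Directory and names taken from the pgf90 example; these are assumptions
# about one particular installation, not values that apply everywhere.
pgi_lib=/usr/local/pgi/linux86/lib
libs=$(make_lib_switch "$pgi_lib" pgf90 pgf90rtl pgftnrtl pgc)

# Show which libraries the directory really contains (nothing if absent).
ls "$pgi_lib"/lib* 2>/dev/null || true

# Print the configure command rather than running it.
echo "configure -cc=gcc -fc=pgf90 -lib=\"$libs\" ..."
```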
Redhat 7.0 Compilation Problems
Redhat 7.0 ships with a version of gcc that is a development version
and should not have been used by Redhat. See http://gcc.gnu.org/gcc-2.96.html
for GNU's discussion of this. Also see patch 5597.
LINUX TCP Performance
There is a bug in the implementation of TCP in some versions of Linux. This
has been documented by Josip Loncaric. His
description of the problem
and fix are available on the Web. He reports that as of Linux 2.4.3, this
TCP patch is no longer necessary.
Fortran 77 and Fortran 90/95
MPICH currently supports Fortran 77. There is some support for using Fortran
90/95, but this is still imperfect. There is a table
that shows some of the combinations of configure options that can be used to
make MPICH work with Fortran 90 and Fortran 95 compilers.
Failures in system includes on Solaris
Some users have had problems with the make failing when compiling the routines
in mpid/ch_p4 on Solaris. A typical output is
...
In file included from /usr/include/rpc/rpc.h:38,
from p4/include/p4.h:16,
from chdef.h:9,
from ../ch2/packets.h:350,
from mpiddev.h:15,
from adi2recv.c:9:
/usr/include/rpc/auth_des.h:58: field `adv_ctime' has incomplete type
...
This appears to be caused by the MPICH configure deciding to reject cc
as the C compiler (because it does not support function prototypes) and picking
gcc instead. Unfortunately, configure is still using cc
for the C preprocessor, and gets confused (this is actually a bug in autoconf;
it should not be using a C preprocessor to decide what the C compiler, whose
search paths might depend on various options, will find).
To fix this, add -cc=gcc to the configure line, reconfigure and
remake.
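To see beforehand whether your cc is likely to be rejected for this reason, you can try compiling a function prototype with it. The script below is only a sketch: accepts_prototypes is a hypothetical helper, and the test it performs is not the one MPICH's configure actually runs.

```shell
#!/bin/sh
# Hypothetical check, not MPICH's own configure test: does the given
# compiler accept an ANSI C function prototype?
accepts_prototypes() {
    compiler=$1
    tmpdir=$(mktemp -d) || return 2
    printf 'int add(int a, int b);\nint add(int a, int b) { return a + b; }\n' \
        > "$tmpdir/proto.c"
    # Compile only (-c); success means the prototype was understood.
    ( cd "$tmpdir" && "$compiler" -c proto.c ) >/dev/null 2>&1
    status=$?
    rm -rf "$tmpdir"
    return $status
}

for compiler in cc gcc; do
    if accepts_prototypes "$compiler"; then
        echo "$compiler: accepts prototypes"
    else
        echo "$compiler: rejected or not installed"
    fi
done
```

If cc is reported as rejected while gcc is accepted, passing -cc=gcc to configure as described above avoids the mixed cc/gcc situation.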
Unsolved Problems
errno=110 or connection timed out in Linux
This indicates that Linux has closed a TCP connection. This can happen if,
just when MPICH is trying to communicate a message, the network interconnect
becomes very busy. Other operating systems have less fragile TCP connections.
We are working on a fix for this, but a better one would be for Linux to
provide more robust TCP connections.
Problems with MPI-IO and NFS
The network file system (NFS) must be configured extremely carefully for
MPI-IO (and many other programs) to work correctly. Unfortunately, few systems
are so configured, and doing so can adversely impact performance. As a result,
programs using files on an NFS system may hang or produce incorrect results.
Note that this is, officially, a design feature of NFS; unless the
NFS system is configured with no attribute caching, any two
processes accessing the same file may produce incorrect results.
You can use the -file_system=ufs option of configure to build an MPICH that
supports only UFS (Unix File System); MPI-IO works correctly with UFS, XFS,
PIOFS, HFS, SFS, etc. (more precisely, those file systems that
correctly implement basic Unix I/O system calls, something that NFS does not
do).
Detailed instructions on setting up NFS can be found in the installation and
users manual.
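As a rough check of the attribute caching setting, the sketch below looks for the NFS mount option noac (no attribute caching) on each NFS mount. It assumes a Linux system that reports the option in /proc/mounts; has_noac is a hypothetical helper, and the manual remains the authority on the full set of required NFS settings.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a comma-separated mount option string
# contains the NFS "noac" (no attribute caching) option.
has_noac() {
    case ",$1," in
        *,noac,*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Assumption: Linux, where each /proc/mounts line reads
#   device mountpoint fstype options dump pass
if [ -r /proc/mounts ]; then
    while read -r device mountpoint fstype options rest; do
        case "$fstype" in
            nfs|nfs4)
                if has_noac "$options"; then
                    echo "$mountpoint: noac set"
                else
                    echo "$mountpoint: attribute caching enabled"
                fi
                ;;
        esac
    done < /proc/mounts
fi
```

A mount reported with attribute caching enabled is a candidate for the hangs and incorrect results described above.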
Problems not in MPICH
Some problems are caused by bugs in compilers rather than in MPICH. Some of these are:
- Intel icc on Redhat 8
- The Intel Version 7.1 compilers are not compatible with the version of
GLIBC used in Redhat version 8. For many applications, this does not cause
a problem, but for parts of MPICH (particularly ROMIO), this incompatibility
can cause errors during compilation. Larry Baker of USGS has found the following workaround; here are his notes:
The Intel compilers support an older version of glibc than the one used in Red Hat 8.0. I have modified their
substitute headers for <sys/types.h>, <bits/types.h>, and <sys/stat.h> (not needed to compile MPICH). There are just a
couple preprocessor conditionals that must be modified in <sys/stat.h> and <sys/types.h>; the Intel version of
<bits/types.h> is no longer needed:
# cd /opt/intel/compiler70/ia32/substitute_headers/sys
# ls
stat.h stat.h.original types.h types.h.original
# diff /usr/include/sys/stat.h stat.h
353c353
< #if defined __GNUC__ && __GNUC__ >= 2
---
> #if (defined __GNUC__ && __GNUC__ >= 2) || defined __ICC
# diff /usr/include/sys/types.h types.h
158c158
< #if !__GNUC_PREREQ (2, 7)
---
> #if !__GNUC_PREREQ (2, 7) || defined (__INTEL_COMPILER)
# cd /opt/intel/compiler70/ia32/substitute_headers/bits
# ls
types.h.original
- Solaris C
- Version SC3.0.1 will fail with
cg: assertion failed in file ../src/regman/regman_reporter.h at line 36
cg: add_edges_to_new_node -- unexpected edges
cg: 42 warnings, 1 errors
cc: cg failed for ad_read_coll.c
Removing -O from all of the Makefiles (particularly in ROMIO) may fix this.
- GNU gcc
- Version 2.8.x does not handle the command line argument -I./ correctly.
- Compaq Fortran
- Version 5.3-915 of f95 has a bug that causes it to fail with files that
have the extension .F (e.g., to be processed with the preprocessor). In this
case, specify the Fortran 90 compiler to be f90 instead of f95.
An update to Compaq Fortran 5.3 exists in Compaq's FTP
repository at ftp://ftp.compaq.com/pub/products/fortran/Tru64/.
Please see the file readme.txt in this directory.
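For the Solaris C item above, removing -O by hand from many Makefiles is tedious. The following is a sketch of one way to do it with sed; strip_opt_flag is a hypothetical helper, it only removes a standalone -O (flags such as -O2 are left alone), and it is demonstrated here on sample lines rather than on real Makefiles.

```shell
#!/bin/sh
# Hypothetical helper: remove a standalone -O flag from Makefile text on
# stdin. Flags such as -O2 are left alone; extend the pattern if needed.
strip_opt_flag() {
    sed -E 's/([[:space:]=])-O([[:space:]]|$)/\1\2/g'
}

# Demonstration on sample lines rather than on real Makefiles.
printf 'CFLAGS = -O -g\nOPTFLAGS=-O\n' | strip_opt_flag

# To apply it to a real tree, keep a backup of each file, for example:
#   for f in $(find . -name Makefile); do
#       cp "$f" "$f.bak" && strip_opt_flag < "$f.bak" > "$f"
#   done
```

Because sed edits the text in place on each line, leading tabs in Makefile rules are preserved.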
You may also want to check on patches for the system that you are running
on.
- HP patch database: Europe or US