From: snir@XXXXXXXXXX
To: Richard Treumann
cc: mpi-comments@XXXXXXXXXXXXX, mpi-core@XXXXXXXXXXX
Date: Mon, 3 Aug 1998 15:23:08 -0400
Subject: Re: MPI_ALLOC_MEM & no mem

The minimum change needed in the MPI standard to accommodate the issue that Dick is raising is to state that mpi_alloc_mem is associated by default with mpi_errors_return. Users will have to check the error code returned by the call before they use the pointer it returns. This is somewhat different from the practice for malloc in C, but is consistent with MPI practice.

If one wants to handle this issue with no changes in the standard, and no additional functions, then we can do the following:

(1) have a new error handler that is not fatal for mpi_alloc_mem, nor for any other mpi function where there is a consensus that a fatal error is the wrong choice.

(2) agree that a call to mpi_alloc_mem that fails does not cause any damage, except perhaps returning a null pointer and certainly returning a suitable error code.

We might like the new error handler to be the default, but the standard states that mpi_errors_are_fatal is the default. An implementation can, however, provide an argument to mpiexec in order to change the default, so no major pain here.

The major pain is lack of standardization, if each vendor starts deciding on its own which errors are nonfatal for which error handler. On the other hand, we could not agree in the forum meeting which errors are continuable and which are not, so is it realistic to do this by email?

The following might be realistic:

1. agree on a name for this new error handler (e.g., mpi_errors_are_handled).

2. agree on a minimum, and perhaps a maximum, list of errors that would be handled by this handler without being fatal, and agree on the process state upon return from an erroneous call.

3. hope (or fear) that customer pressure will lead vendors to standardize on the maximum list.

Marc Snir, Senior Manager
Scalable Parallel Systems, IBM T.J. Watson Research Center
http://www.ibm.research.com/people/s/snir
Tel: 914-945-3204 (8-862-3204) Fax: 914-945-4425 (8-862-4425)
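
The handler proposal above can be sketched in plain C, with no MPI dependency, to make the intended behavior concrete. This is only a model under stated assumptions: every name in it (sketch_alloc_mem, sketch_is_fatal, the SKETCH_* constants) is hypothetical and appears in no MPI standard, the "minimum list" holds a single out-of-memory class purely for illustration, and the limit parameter exists only to simulate memory exhaustion deterministically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the MPI standard. */
enum sketch_err { SKETCH_SUCCESS = 0, SKETCH_ERR_NO_MEM, SKETCH_ERR_OTHER };
enum sketch_handler { SKETCH_ERRORS_ARE_FATAL, SKETCH_ERRORS_ARE_HANDLED };

/* Assumed "minimum list": error classes the new handler must not treat as fatal. */
static const enum sketch_err minimum_nonfatal[] = { SKETCH_ERR_NO_MEM };

/* Returns 1 if the given handler would abort on this error, 0 if the call returns. */
int sketch_is_fatal(enum sketch_handler h, enum sketch_err e)
{
    if (e == SKETCH_SUCCESS)
        return 0;
    if (h == SKETCH_ERRORS_ARE_FATAL)
        return 1;
    for (size_t i = 0; i < sizeof minimum_nonfatal / sizeof minimum_nonfatal[0]; i++)
        if (minimum_nonfatal[i] == e)
            return 0;
    return 1; /* not on the agreed list, so still fatal */
}

/* Model of an allocation call that "does no damage" on failure:
 * it stores a null pointer in *out and returns an error code.
 * Requests larger than `limit` simulate memory exhaustion. */
enum sketch_err sketch_alloc_mem(size_t size, size_t limit, void **out)
{
    assert(out != NULL);
    *out = NULL;
    if (size > limit)
        return SKETCH_ERR_NO_MEM;
    *out = malloc(size ? size : 1); /* avoid the malloc(0) ambiguity */
    return *out ? SKETCH_SUCCESS : SKETCH_ERR_NO_MEM;
}
```

A caller under the "handled" handler would test the return code before touching the pointer, which is the same discipline required when mpi_alloc_mem is used with mpi_errors_return.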