[Beowulf] MPI ABI

Robert G. Brown rgb at phy.duke.edu
Tue Oct 11 06:09:00 PDT 2005


Patrick Geoffray writes:

> In retrospect, the choice of the MPI forum to not put constraints on the 
> implementation was a good one. Now that MPI is pervasive, it's 
> legitimate to care about an ABI, but it would certainly have been a 
> weight for the adoption of the standard back in the days.

It's worth remarking on MPI's overall history, as an interface developed
to BE an API and NOT an ABI, at the arm-twisting insistence of the US
government as it tired of spending literally millions to support
proprietary library interfaces on supercomputers only to have to spend
millions all over again porting when one supercomputer company would
either fold or release the Mark Umptillion New Version of their
interface.  The proprietary interfaces were just peachy keen for the
supercomputer vendors, of course, as they represented a very high cost
barrier to changing vendors.  After you've spent a year hand-coding your
CM5 to get decent efficiency, you aren't eager to go through the process
of porting to something else and wasting another year.

The vendors, of course, hated the push for a standard, as vendors often
do, but participated with greater or lesser enthusiasm in the
development of the MPI standard as a PROGRAMMING INTERFACE to hide the
details of the actual hardware-driven ABI.  Needless to say, you got as
much divergence at even the PI level as they thought they could safely
get away with inside something that could still legitimately be called a
"compliant MPI", so as to keep those barriers at least a BIT above zero
in height.

What we are seeing now is at least somewhat the same thing, only now due
to "competition" between MPIs.  If MPI were really a flat binary/library
level interface, so that absolutely the same compiled code with no
modification whatsoever worked at the binary level across all MPIs,
there would be little or no user-level barrier to moving from one MPI to
another.  This was of course the goal of the Government and the original
user base, only they viewed a recompile as a trivial barrier, where it
has become something more than that.  The decision as to which MPI to use
would then have to be based on real added value -- support for
debugging, superior performance, GUIs, supplied libraries -- and not on
the artificial barrier of "if I change MPIs I'll have to spend two weeks
working on annoying things to get my program to recompile again and then
to run it through a set of validation runs to be sure that nothing
borked".
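
To make the API-versus-ABI gap concrete, here is a minimal sketch in C.
The two status layouts below are hypothetical (a_status and b_status are
invented names, not any real MPI's headers); the only thing taken from
the standard is that a status exposes MPI_SOURCE, MPI_TAG and MPI_ERROR
as public fields.  The same source text compiles against either layout,
but a binary built against one reads the wrong bytes when handed the
other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical implementation A: bookkeeping fields come first. */
typedef struct {
    int count;
    int cancelled;
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
} a_status;

/* Hypothetical implementation B: public fields come first. */
typedef struct {
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
    int cancelled;
    int count;
} b_status;

/* API level: this source text compiles unchanged against either layout. */
#define STATUS_SOURCE(s) ((s).MPI_SOURCE)

/* ABI level: mimic a binary built against A's header that is handed a
 * status filled in by B's library.  It reads at A's offset for
 * MPI_SOURCE, which lands on B's MPI_ERROR field instead. */
int source_seen_by_a_binary(const b_status *from_b)
{
    int value;
    assert(from_b != NULL);
    memcpy(&value,
           (const unsigned char *)from_b + offsetof(a_status, MPI_SOURCE),
           sizeof value);
    return value;
}
```

With a b_status initialized as {7, 99, 3, 0, 1}, STATUS_SOURCE gives 7,
while source_seen_by_a_binary returns 3, the value sitting in B's
MPI_ERROR slot.  A recompile hides this; a binary swap does not.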

Unfortunately, it takes a really significant event for the government to
push a requirement for open systems library-level interoperability and
compatibility.  This is a huge shame -- it is one reason MS maintains a
stranglehold on so many institutions via its control over the Office
formats.  There is already a huge base of old Word docs that are nearly
impossible to recover because no modern software can open and read them;
ditto for many other WYSIWYG formats (which invariably used some sort of
proprietary binary format for representing the stored image).  Users
were "taught" to keep their version of Windows and Office up to date and
use it aggressively to keep their file base up to date or risk being
unable to access their own critical organizational data in five to ten
years' time.

The government is one of the few forces that could mandate a proper MPI
ABI at this point in time; indeed, it may be that their original mandate
for a PI by extension still holds at the BI (binary interface) level,
if anybody in the appropriate government body were informed of the
situation and persuaded
that the current state of affairs Costs the Government Money (as it
undoubtedly does).  It really should be possible to enforce at least a
lowest common denominator interface that permits vendors to release a
superset (so as not to squelch genuine innovation) with the restrictions
that:

  a) The core ABI itself cannot be altered; the superset must be a
STRICT superset that leaves the core alone.  An application compiled on
top of a compliant MPI with no calls to extension functions must run on
top of anybody's libraries and transparently recompile with anybody's
MPI to produce a binary that will STILL run on top of anybody's
libraries.

  b) The superset additions must be documented and approved by the MPI
governing body, which would do so only if there was clear evidence that
they were strictly necessary and that the same functionality could not
easily be achieved using the existing standard base or that there is a
clear long term need for the function to be added to the existing
standard base.

  c) The superset additions must be provisionally contributed to the
project and non-proprietary; that is, if anybody else wants to implement
the same features in their own MPIs they can, and if enough do they can
be added to the core ABI.

  d) In place of b and c, vendors should be encouraged to provide
SEPARATE LIBRARIES to support anything proprietary.  In fact, because of
symbol table issues, it is probably going to be difficult to achieve a
safe ABI unless superset additions are in a separate library anyway.
These would be library calls that cannot begin with e.g. mpi_... unless
approved by the governing board etc. as in b, and that run on TOP of the
core ABI, so that they can run on "anybody's" MPI libraries.  Those
MPI-based libraries they can sell, support, do whatever they like with,
and they can be a legitimate added value for doing business with the
vendor that may well lock a user into continuing to do business with the
>>VENDOR<< without locking them into using the vendor's >>MPI<<, even as
a platform for the vendor's own value-added library.
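
Item d can be sketched roughly as follows, in C.  Everything named here
is an assumption for illustration: core_abi is a toy stand-in for the
frozen core (a real extension library would simply call standard entry
points such as MPI_Comm_size and MPI_Allreduce), toy_core is a fake
single-process "implementation", and vendorx_ is an invented vendor
prefix.  The point is only that the extension lives outside the
reserved namespace and touches nothing but core calls.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `name` starts with the reserved prefix "mpi_" (compared
 * case-insensitively), else 0.  A separate vendor library would be
 * expected to export no such names. */
int uses_reserved_prefix(const char *name)
{
    static const char prefix[] = "mpi_";
    assert(name != NULL);
    for (size_t i = 0; prefix[i] != '\0'; i++) {
        if (name[i] == '\0' ||
            tolower((unsigned char)name[i]) != prefix[i])
            return 0;
    }
    return 1;
}

/* Hypothetical stand-in for the frozen core ABI: two entry points. */
typedef struct {
    int  (*comm_size)(void);       /* number of ranks */
    long (*allreduce_sum)(long);   /* sum of one value per rank */
} core_abi;

/* Hypothetical vendor extension.  It calls only core entry points, so
 * it would run on top of any library that provides that core. */
long vendorx_allreduce_mean(const core_abi *core, long local)
{
    assert(core != NULL);
    int n = core->comm_size();
    return n > 0 ? core->allreduce_sum(local) / n : 0;
}

/* Toy "implementation" of the core: pretends there are 4 ranks and
 * that every rank contributes the same value. */
static int  toy_comm_size(void)           { return 4; }
static long toy_allreduce_sum(long local) { return 4 * local; }
const core_abi toy_core = { toy_comm_size, toy_allreduce_sum };
```

Swapping toy_core for another table with the same two entry points
leaves vendorx_allreduce_mean untouched, which is the property item d
asks of a vendor's value-added library.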

That might be enough to ensure binary level code portability, a minimum
of government money wasted (throughout every single MPI-based
government-funded project) porting and re-porting projects between MPI
variants, and a minimum of government money wasted by a project being
locked into a proprietary MPI when there are noncommercial/open source
MPIs that would serve as well or better.

It could also serve as a template for all sorts of places the government
could take similar action -- a requirement to use e.g. ODF for all
government documents; a DOE requirement for SUS/POSIX compliance and
proper maintenance for all project-related code; an open source (or at
least open standard) requirement on all new code and projects.

    rgb
