[Beowulf] Compare and contrast MPI implementations

Douglas Eadline deadline at clustermonkey.net
Wed Dec 14 11:46:39 PST 2005


> There are at least 4 free MPI implementations:
>
> name      version  MPI standard  where
> mpich     1.2.7p1  1.2           http://www-unix.mcs.anl.gov/mpi/mpich/
> mpich2    1.0.3    2             http://www-unix.mcs.anl.gov/mpi/mpich2/
> lam/mpi   7.1.1    1.2, some 2   http://www.lam-mpi.org/
> open-mpi  1.0      2             http://www.open-mpi.org/
>
> What advantages and disadvantages do each of these have?  I looked
> around for a review article or something of that sort but didn't find
> anything.

It depends on how you define "advantage" (and consequently "disadvantage").
For instance, a particular all-to-all implementation may be an advantage to
one person, while the latency for 1000-byte messages may be what matters to
someone else. I think for most people there are two issues:

 * first, will it link/run with their application?
 * and second, how fast does it run their application?

The only way to truly answer this is to run the application.
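
A rough sketch of what "run the application" can look like in practice:
the Python below times the same job under several labelled commands and
lists them fastest first. The function names (time_command, compare) and
the demo commands are illustrative assumptions, not part of any MPI
package; the demo uses the Python interpreter itself as a stand-in so the
sketch runs anywhere.

```python
import subprocess
import sys
import time


def time_command(cmd, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs of cmd."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is not
        # mistaken for a fast one.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def compare(commands, repeats=3):
    """Time each labelled command; return (label, seconds) pairs, fastest first."""
    timings = {label: time_command(cmd, repeats) for label, cmd in commands.items()}
    return sorted(timings.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Stand-in commands (assumption): replace each list with the launch
    # line for the application under one MPI build.
    demo = {
        "short": [sys.executable, "-c", "pass"],
        "long": [sys.executable, "-c", "import time; time.sleep(0.2)"],
    }
    for label, seconds in compare(demo):
        print(f"{label:10s} {seconds:.3f} s")
```

For real use, each stand-in command would be replaced by whatever launches
the application under a given MPI build. Those launch lines differ between
implementations and are deliberately not shown here.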

I have in the past run the NAS suite with various MPIs, and the only
conclusion I can draw is that most MPIs that run over TCP/IP are
"about the same". There are times, though, when one works much better than
the others, and this performance changes as the MPI release/Linux
kernel/{fill-in-the-blank} changes.

As part of the BPS package, I have tried to make it easy to use various
MPI packages and therefore test different compiler/MPI combinations. You
can find more here:

http://www.clustermonkey.net//content/view/38/34/

The idea behind the suite was to see what happens when things change, i.e.,
did the new {fill-in-the-blank} help or hurt performance? This is the
same idea behind the new Cluster Monkey Benchmark Project
(which was the ClusterWorld Benchmarking Project, but has now
emerged from Cluster Monkey Land):

http://cmbp.clustermonkey.net/
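
The "help or hurt" question can be sketched as a comparison of two sets
of run times, one from before a change and one from after. This is only
an illustration of the idea, not how BPS or the benchmark project does
it; the function name classify_change, the 5% tolerance, and the example
timings are all made up.

```python
def classify_change(before, after, tolerance=0.05):
    """Label each benchmark 'helped', 'hurt', or 'same'.

    before/after map benchmark name -> run time in seconds (lower is
    better). A relative change within +/- tolerance counts as noise.
    Only benchmarks present in both runs are compared.
    """
    verdicts = {}
    for name in sorted(before.keys() & after.keys()):
        change = (after[name] - before[name]) / before[name]
        if change < -tolerance:
            verdicts[name] = "helped"
        elif change > tolerance:
            verdicts[name] = "hurt"
        else:
            verdicts[name] = "same"
    return verdicts


if __name__ == "__main__":
    # Invented timings for three hypothetical benchmark runs.
    old = {"cg": 10.0, "ft": 20.0, "mg": 5.0}
    new = {"cg": 8.0, "ft": 20.5, "mg": 6.0}
    for name, verdict in classify_change(old, new).items():
        print(f"{name}: {verdict}")
```

The tolerance matters on a shared or noisy network: without it, ordinary
run-to-run variation would be reported as a change.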

I plan on working on this more next month. Stay tuned.

> The cluster in question has only 100baseT, so support for
> Myrinet and other faster interconnects doesn't matter for us. It currently
> has an older version of mpich 1 installed, but that sees no use since the
> commonly used software is all PVM.  I want to run gromacs now, and they
> suggest using lam/mpi, but don't say *why* they suggest it.

And not all 100/1000 Ethernet is the same. What works well for one
chipset may not work well for another, and so on. Plus, there is
always some tuning and optimization that may help things.

They probably suggest LAM because it seems to work well, and they have
experience with it.

> Are the MPI 2 standards fully backwards compatible with code written to
> the MPI 1.2 standard?

As far as I know, yes, but some of the MPI jocks on this list can answer
better than I.



-- 
Doug













