[Beowulf] MPICH vs. OpenMPI
Hakon.Bugge at scali.com
Fri Apr 25 03:07:28 PDT 2008
At Wed, 23 Apr 2008 20:37:06 +0200, Jan Heichler <jan.heichler at gmx.net> wrote:
> From what I saw OpenMPI has several advantages:
> - better performance on MultiCore Systems
>   because of good shared-memory-implementation
A couple of months ago, I conducted a thorough
study of the intra-node performance of different MPIs
on Intel Woodcrest and Clovertown systems. I
systematically tested point-to-point performance
between processes on a) the same die on the same
socket (sdss), b) different dies on the same socket
(ddss) (not on Woodcrest, of course, which has a
single die per socket) and c) different dies on
different sockets (ddds). I also measured the
message rate using all 4 / 8 cores on the node.
The point-to-point benchmarks used were ping-ping
and ping-pong (Scali's 'bandwidth' and osu_latency + osu_bandwidth).
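To make the three mappings concrete, here is a minimal Python sketch that classifies a pair of cores as sdss, ddss or ddds. The (socket, die) tuples, the helper names and the two node layouts are assumptions made for illustration (dual-socket nodes with one die per socket for Woodcrest and two dies per socket for Clovertown); none of this is taken from the benchmark code used in the study.

```python
from itertools import combinations

def classify_pair(core_a, core_b):
    """Classify two cores, each given as a hypothetical (socket, die) tuple."""
    socket_a, die_a = core_a
    socket_b, die_b = core_b
    if socket_a != socket_b:
        return "ddds"  # different sockets always means different dies
    if die_a != die_b:
        return "ddss"  # same socket, different dies
    return "sdss"      # same socket, same die

def available_mappings(cores):
    """Return the sorted set of mappings reachable on a node layout."""
    return sorted({classify_pair(a, b) for a, b in combinations(cores, 2)})

# Assumed layouts: 2 sockets x 1 die x 2 cores, and 2 sockets x 2 dies x 2 cores.
woodcrest = [(s, 0) for s in range(2) for _ in range(2)]
clovertown = [(s, d) for s in range(2) for d in range(2) for _ in range(2)]

print("Woodcrest: ", available_mappings(woodcrest))
print("Clovertown:", available_mappings(clovertown))
```

Under these assumed layouts the ddss case never shows up on the 4-core node, which is why it can only be measured on Clovertown.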
I evaluated Scali MPI Connect 5.5 (SMC), SMC 5.6,
HP MPI 188.8.131.52, MVAPICH 0.9.9, MVAPICH2 0.9.8, Open MPI 1.1.1.
Of these, Open MPI was the slowest for all
benchmarks and all machines, up to 10 times slower than SMC 5.6.
Now, since Open MPI 1.1.1 is quite old, I just
redid the message rate measurement on an X5355
(Clovertown, 2.66GHz). With an 8-byte message size,
OpenMPI 1.2.2 achieves 5.5 million messages per
second, whereas SMC 5.6.2 reaches 16.9 million
messages per second (using all 8 cores on the node, i.e., 8 MPI processes).
Comparing OpenMPI 1.2.2 with SMC 5.6.1 on
ping-ping latency (usec) on an 8-byte payload yields:
mapping   OpenMPI   SMC
sdss      0.95      0.18
ddss      1.18      0.12
ddds      1.03      0.12
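
As a quick check of the ratios implied by the Open MPI 1.2.2 figures, the arithmetic can be sketched in Python as follows. The numbers are copied from this post; the variable and function names are made up for illustration, and nothing here is newly measured.

```python
# Ping-ping latency in usec for an 8-byte payload: (OpenMPI 1.2.2, SMC 5.6.1).
latency_usec = {
    "sdss": (0.95, 0.18),
    "ddss": (1.18, 0.12),
    "ddds": (1.03, 0.12),
}

# Message rate in millions of 8-byte messages per second on 8 cores:
# (OpenMPI 1.2.2, SMC 5.6.2).
message_rate_millions = (5.5, 16.9)

def latency_ratios(table):
    """Return the OpenMPI latency divided by the SMC latency for each mapping."""
    return {mapping: openmpi / smc for mapping, (openmpi, smc) in table.items()}

ratios = latency_ratios(latency_usec)
rate_ratio = message_rate_millions[1] / message_rate_millions[0]

for mapping, ratio in ratios.items():
    print(f"{mapping}: OpenMPI latency is {ratio:.1f}x the SMC latency")
print(f"message rate: SMC is {rate_ratio:.1f}x OpenMPI")
```

On these figures the latency gap is roughly 5x for sdss and between 8x and 10x for the two cross-die mappings, while the message rate gap is about 3x.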
So, Jan, I would be very curious to see any documentation of your claim above!
Disclaimer: I work for Scali and may be biased.