[Beowulf] IB DDR: mvapich2 vs mvapich performance
kus at free.net
Thu Apr 24 10:37:07 PDT 2008
In message from "Tom Elken" <tom.elken at qlogic.com> (Thu, 24 Apr 2008
>> I have up to 1453 MB/s for 4MB message size ... on the osu_bw test
>> w/Mellanox DDR IB
>> (Mellanox version of OFED-1.2, w/binary mvapich-0.9.9); OpenMPI
>> 1.2.2-1 gives even a
>> bit more (1470 MB/s - more exactly, 1469.753328, 1469.447179,
>> 1469.977840 for 3 consecutive test runs).
>> The SC'07 message of D.K. Panda
>> informs us of 1405 MB/s.
>> Is this throughput difference the result of the MPI-2 vs MPI-1
>> implementation, or should I believe that this difference (about 4% -
>> my mvapich's 1453 MB/s vs mvapich2's 1405 MB/s at SC'07) is not
>> significant - in the sense that it is simply due to some measurement
>> errors (inaccuracies)?
>The way to see if there is a real throughput difference between an
>MPI-2 implementation and an MPI-1 implementation is to measure it on
>your pair of machines.
:-) Of course - but I have a problem with mvapich2 (from the binary
OFED build). /etc/mpd.conf contains the same MPD_SECRETWORD on both
nodes, and MV2_DEFAULT_DAPL_PROVIDER=ib0 is set.
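For reference, a minimal sketch of that setup (the secret word is a
hypothetical placeholder; mpd aborts if the file is readable by anyone
but its owner):

    # /etc/mpd.conf - identical on both nodes, chmod 600
    MPD_SECRETWORD=some_secret_word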
When I try to run mpdboot, I see:

mpdboot -v -n 2 -f /where/is/mpihosts
mpdroot: perror msg: No such file or directory
running mpdallexit on <node1_shortname>
LAUNCHED mpd on <node1_shortname> via
RUNNING: mpd on <node1_shortname>
LAUNCHED mpd on <node2_FQDN> via <node1_shortname>
RUNNING: mpd on <node2_FQDN>
Apr 22 21:20:53 <node1_shortname> python2.4: mpdallexit:
mpd_uncaught_except_tb handling:
exceptions.TypeError: not all arguments converted during string formatting
/usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdlib.py 899 __init__
mpd_print(1,'forked process failed; status=' % status)
/usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdallexit.py 44 mpdallexit
/usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdallexit.py 59 ?
Apr 22 21:20:53 <node1_shortname> mpd: mpd starting; no mpdid yet
Apr 22 21:20:53 <node1_shortname> mpd: mpd has
Apr 22 21:21:01 <node1_shortname> kernel: ib0: multicast join failed
for ff12:601b:ffff:0000:0000:0001:ff22:e50d, status -22
Apr 22 21:21:33 <node1_shortname> last message repeated 2 times
... and I don't understand (even from the strace output :-)) which
file mpdboot/mpdroot wants :-(
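The TypeError, at least, looks like a small bug in mpdlib.py itself:
the format string handed to mpd_print contains no %s conversion
specifier, so Python cannot consume the '% status' argument. A minimal
sketch of the likely fix (the broken call is copied verbatim from the
traceback above):

    # mpdlib.py, __init__ (line 899 per the traceback):
    mpd_print(1,'forked process failed; status=' % status)     # raises TypeError
    # with the missing %s, the real diagnostic reaches the log instead:
    mpd_print(1,'forked process failed; status=%s' % status)

That would only restore the error message, of course - the underlying
'forked process failed' / 'No such file or directory' condition still
has to be tracked down.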
>Comparing your results to published results is difficult because
>all the variables need to be the same for the comparison to be valid.
>Variables like which of the following were used in the two tests:
>- Mellanox IB DDR adapter
>- PCIe interface type
>- CPU model and speed
>- PCIe chipset
>- OFED version, ...
>Certainly the MPI flavor and version are important, but they are not,
>in general, the most important of these factors.
>Note for example these two results on the OSU MVAPICH web pages:
>MVAPICH2 1-sided put throughput, measured with osu_bw:
>1405 MB/s: ConnectX DDR, PCIe x8, EM64T 2.33 GHz quad-core CPU
>1481 MB/s: MT25208 HCA silicon, PCIe x8, Intel Xeon 3.6 GHz, EM64T
>Both are DDR IB adapters. ConnectX is the newer silicon. But because
>of system differences, the older adapter is faster, in this case.
Thanks for this reference! I thought that with my older HCA hardware
(InfiniHost III Lx PCIe x8, MHGS18-XTC), older CPU/motherboard/...
(Opteron 246 / 2 GHz / ...), and older Linux, OFED, and
mvapich/mvapich2 versions, I would have to get a lower throughput
value ...
>> Mikhail Kuzminsky
>> Computer Assistance to Chemical Research Center
>> Zelinsky Institute of Organic Chemistry