[Beowulf] IB DDR: mvapich2 vs mvapich performance

Mikhail Kuzminsky kus at free.net
Thu Apr 24 10:37:07 PDT 2008

In message from "Tom Elken" <tom.elken at qlogic.com> (Thu, 24 Apr 2008 
09:31:16 -0700):
>> I have up to 1453 MB/s for a 4 MB message size ... on the osu_bw test 
>> w/Mellanox DDR IB (Mellanox version of OFED-1.2, w/binary 
>> mvapich-0.9.9); OpenMPI-1.2.2-1 gives even a bit more (1470 MB/s - 
>> more exactly, 1469.753328, 1469.447179, 1469.977840 for 3 subsequent 
>> test runs).
>> The SC'07 slides of D.K. Panda
>> http://mvapich.cse.ohio-state.edu/publications/sc07_mpich2_bof.pdf
>> report 1405 MB/s.
>> Is this throughput difference the result of the MPI-2 vs. MPI-1 
>> implementation, or should I believe that this difference (about 4%, 
>> my mvapich vs. mvapich2 at SC'07) is not significant - in the sense 
>> that it is simply due to measurement errors (inaccuracies)?
>The way to see if there is a real throughput difference between an 
>MPI-2 implementation and an MPI-1 implementation is to measure both on 
>your pair of machines.

:-) Of course - but I have a problem setting up mvapich2 (from the 
binary Mellanox/OFED-1.2).

When I try to run mpdboot, I get the following (/etc/mpd.conf contains 
the same MPD_SECRETWORD on both nodes):

mpdboot -v -n 2 -f /where/is/mpihosts
mpdroot: perror msg: No such file or directory
running mpdallexit on <node1_shortname>
LAUNCHED mpd on  <node1_shortname> via
RUNNING: mpd on <node1_shortname>
LAUNCHED mpd on <node2_FQDN> via <node1_shortname>
RUNNING: mpd on 

/var/log/messages contains:

Apr 22 21:20:53 <node1_shortname> python2.4: mpdallexit: 
mpd_uncaught_except_tb handling:  exceptions.TypeError: not all 
arguments converted during string formatting
/usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdlib.py  899  __init__ 
        mpd_print(1,'forked process failed; status=' % status) 
    /usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdallexit.py  44  mpdallexit 
        conSock = 
    /usr/mpi/gcc/mvapich2-0.9.8-12/bin/mpdallexit.py  59  ? 
Apr 22 21:20:53  <node1_shortname> mpd: mpd starting; no mpdid yet
Apr 22 21:20:53 <node1_shortname> mpd: mpd has 
mpdid=<node1_shortname>_40611 (port=40611)
Apr 22 21:21:01 <node1_shortname>  kernel: ib0: multicast join failed 
for ff12:601b:ffff:0000:0000:0001:ff22:e50d, status -22
Apr 22 21:21:33 c5ws7 last message repeated 2 times
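Incidentally, the TypeError in the traceback looks like a small bug in mpdlib.py itself: the message string passed to mpd_print has no %s placeholder, so the % operator fails. A minimal reproduction (the corrected line is my guess at the intent; I have not checked later mvapich2 releases):

```python
# Reproduces the "not all arguments converted" error from the mpdallexit log:
status = 1
try:
    msg = 'forked process failed; status=' % status   # no %s -> TypeError
except TypeError as e:
    print(e)   # not all arguments converted during string formatting

# What the mpdlib.py line presumably meant to do:
msg = 'forked process failed; status=%s' % status
print(msg)   # forked process failed; status=1
```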

... and I don't understand (even from the strace output :-)) which 
file mpdboot/mpdroot wants :-(
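For what it's worth, the mpd ring manager (which mvapich2 0.9.x inherits from MPICH2) looks for ~/.mpd.conf with mode 600 when started as an ordinary user, and /etc/mpd.conf when started as root; a missing or wrongly-permissioned secretword file is a common source of that perror. A minimal setup sketch - the secretword, hostnames and the /tmp path are placeholders, not values from this thread:

```shell
# Sketch of the MPD bootstrap files for a non-root user (placeholders only).
cat > "$HOME/.mpd.conf" <<'EOF'
MPD_SECRETWORD=changeme
EOF
chmod 600 "$HOME/.mpd.conf"   # mpd rejects the file unless it is mode 600

cat > /tmp/mpihosts <<'EOF'
node1
node2
EOF

# With mvapich2's bin directory on PATH, one would then run:
#   mpdboot -v -n 2 -f /tmp/mpihosts
#   mpdtrace     # lists the nodes in the ring if startup succeeded
```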

>Comparing your results to published results is difficult because 
>all the variables need to be the same for the comparison to be valid.
>Variables like which of the following were used in the two tests:
>- Mellanox IB DDR adapter
>- PCIe interface type
>- CPU model and speed
>- PCIe chipset
>- OFED version, ...
>Certainly the MPI flavor and version is important, but it is not, in
>general, the most important of these factors.
>Note for example these two results on the OSU MVAPICH web pages:
>MVAPICH2  1-sided put throughput, measured with osu_bw:
>1405 MB/s:  ConnectX DDR, PCIe x8, EM64T 2.33 GHz quad-core CPU
>1481 MB/s:  MT25208 HCA silicon, PCIe x8, Intel Xeon 3.6 GHz, EM64T
>Both are DDR IB adapters.  ConnectX is the newer silicon.  But because
>of system differences, the older adapter is faster, in this case.

Thanks for this reference! I had thought that with my older HCA 
hardware (InfiniHost III Lx PCIe x8 MHGS18-XTC), older CPU/mobo/... 
(Opteron 246 / 2 GHz / ...), and older Linux, OFED and 
mvapich/mvapich2 versions, I would necessarily obtain a lower 
throughput value ...


>> Mikhail Kuzminsky
>> Computer Assistance to Chemical Research Center
>> Zelinsky Institute of Organic Chemistry
>> Moscow
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) 
>> visit http://www.beowulf.org/mailman/listinfo/beowulf
