[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes


Martin Siegert siegert at sfu.ca
Sat Nov 14 16:43:27 PST 2009


Hi,

I am running into problems when sending large messages (about
180000000 doubles) over IB. A fairly trivial example program is attached.

# mpicc -g sendrecv.c
# mpiexec -machinefile m2 -n 2 ./a.out
id=1: calling irecv ...
id=0: calling isend ...
[[60322,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for wr_id 199132400 opcode 549755813  vendor error 105 qp_idx 3

This is with OpenMPI-1.3.3.
Does anybody know a solution to this problem?

If I use MPI_Allreduce instead of MPI_Isend/Irecv, the program just hangs
and never returns.
I asked on the Open MPI users list but got no response.

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid Site Lead
IT Services                                phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sendrecv.c
Type: text/x-c++src
Size: 1054 bytes
Desc: not available
Url : http://www.scyld.com/pipermail/beowulf/attachments/20091114/8a57ca20/sendrecv.bin

