[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes

Michael H. Frese Michael.Frese at NumerEx-LLC.com
Mon Nov 16 09:49:23 PST 2009


Martin,

Could it be that your MPI library was compiled using a small memory 
model?  The 180 million doubles sounds suspiciously close to a 2 GB 
addressing limit.
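
For reference, the arithmetic behind that guess can be checked with a few lines of C. This is only a sketch: the helper names (message_bytes, fits_in_int32, max_count_int32) are made up for illustration and are not part of any MPI library, and it assumes 8-byte doubles and that the limit in question is a signed 32-bit byte count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, for illustration only (not an MPI API). */

/* Size in bytes of a message of `count` elements of `elem_size` bytes each.
   The product is computed in 64 bits so it does not wrap for these sizes. */
uint64_t message_bytes(uint64_t count, uint64_t elem_size)
{
    return count * elem_size;
}

/* Nonzero if `bytes` can be represented in a signed 32-bit integer,
   i.e. it stays under the 2 GB limit (INT32_MAX = 2147483647). */
int fits_in_int32(uint64_t bytes)
{
    return bytes <= (uint64_t)INT32_MAX;
}

/* Largest element count whose total byte size still fits in a signed
   32-bit integer, for elements of `elem_size` bytes. */
uint64_t max_count_int32(uint64_t elem_size)
{
    assert(elem_size != 0);
    return (uint64_t)INT32_MAX / elem_size;
}
```

With a count of 180000000 and an element size of 8, message_bytes gives 1440000000 bytes, which is still below the signed 32-bit maximum of 2147483647. The largest count of 8-byte elements that fits is 268435455. So the message is the same order of magnitude as the 2 GB limit without actually exceeding it, and a byte count on its own would not overflow at this size.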

This issue came up on the list recently under the topic "Fortran 
Array size question."


Mike

At 05:43 PM 11/14/2009, Martin Siegert wrote:
>Hi,
>
>I am running into problems when sending large messages (about
>180000000 doubles) over IB. A fairly trivial example program is attached.
>
># mpicc -g sendrecv.c
># mpiexec -machinefile m2 -n 2 ./a.out
>id=1: calling irecv ...
>id=0: calling isend ...
>[[60322,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 
>error polling LP CQ with status LOCAL LENGTH ERROR status number 1 
>for wr_id 199132400 opcode 549755813  vendor error 105 qp_idx 3
>
>This is with OpenMPI-1.3.3.
>Does anybody know a solution to this problem?
>
>If I use MPI_Allreduce instead of MPI_Isend/Irecv, the program just hangs
>and never returns.
>I asked on the openmpi users list but got no response ...
>
>Cheers,
>Martin
>
>--
>Martin Siegert
>Head, Research Computing
>WestGrid Site Lead
>IT Services                                phone: 778 782-4691
>Simon Fraser University                    fax:   778 782-4242
>Burnaby, British Columbia                  email: siegert at sfu.ca
>Canada  V5A 1S6
>
