[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes


Martin Siegert siegert at sfu.ca
Mon Nov 16 13:01:02 PST 2009


On Sun, Nov 15, 2009 at 02:29:13PM -0800, Michael Di Domenico wrote:
> You might want to ask on the linux-rdma list (formerly openfabrics). It's
> been a while since I looked at IB error messages, but what
> stack/version are you running?

This is under Scientific Linux 5.3, a RHEL 5.3 clone that ships with
OFED-1.3.2, which is admittedly quite old. Unfortunately, upgrading it
is a major forklift, so I must be sure that this is really the problem
first. I'll run a few tests on a couple of nodes ...

Thanks!

- Martin

> On Sat, Nov 14, 2009 at 4:43 PM, Martin Siegert <siegert at sfu.ca> wrote:
> > Hi,
> >
> > I am running into problems when sending large messages (about
> > 180000000 doubles) over IB. A fairly trivial example program is attached.
> >
> > # mpicc -g sendrecv.c
> > # mpiexec -machinefile m2 -n 2 ./a.out
> > id=1: calling irecv ...
> > id=0: calling isend ...
> > [[60322,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for wr_id 199132400 opcode 549755813  vendor error 105 qp_idx 3
> >
> > This is with OpenMPI-1.3.3.
> > Does anybody know a solution to this problem?
> >
> > If I use MPI_Allreduce instead of MPI_Isend/Irecv, the program just hangs
> > and never returns.
> > I asked on the openmpi users list but got no response ...
> >
> > Cheers,
> > Martin
> >
> > --
> > Martin Siegert
> > Head, Research Computing
> > WestGrid Site Lead
> > IT Services                                phone: 778 782-4691
> > Simon Fraser University                    fax:   778 782-4242
> > Burnaby, British Columbia                  email: siegert at sfu.ca
> > Canada  V5A 1S6
