[Beowulf] MPI_Isend/Irecv failure for IB and large message sizes

Martin Siegert siegert at sfu.ca
Mon Nov 16 15:27:57 PST 2009


Hi,

On Mon, Nov 16, 2009 at 04:55:51PM -0500, Gus Correa wrote:
> Hi Martin
>
> We didn't know which compiler you used.
> So what Michael sent you ("mmodel=memory_model")
> is the Intel compiler flag syntax.
> (PGI uses the same syntax, IIRR.)

Now that was really stupid: I am using gcc-4.3.2 and even looked up
the correct syntax for the memory model, but nevertheless pasted the
Intel syntax into my configure script ... sorry.

> Gcc/gfortran use "-mcmodel=memory_model" for x86_64 architecture.
> I only used this with Intel ifort, hence I am not sure,
> but "medium" should work fine for large data/not-so-large program
> in gcc/gfortran.
> The "large" model doesn't seem to be implemented by gcc (4.1.2)
> anyway.
> (Maybe it is there in newer gcc versions.)
> The darn thing is that gcc says "medium" doesn't support building
> shared libraries,
> hence you may need to build OpenMPI static libraries instead,
> I would guess.
> (Again, check this if you have a newer gcc version.)
> Here's an excerpt of my gcc (4.1.2) man page:
>
>
>        -mcmodel=small
>            Generate code for the small code model: the program and its
>            symbols must be linked in the lower 2 GB of the address space.
>            Pointers are 64 bits.  Programs can be statically or
>            dynamically linked.  This is the default code model.
>
>        -mcmodel=kernel
>            Generate code for the kernel code model.  The kernel runs in
>            the negative 2 GB of the address space.  This model has to be
>            used for Linux kernel code.
>
>        -mcmodel=medium
>            Generate code for the medium model: The program is linked in
>            the lower 2 GB of the address space but symbols can be located
>            anywhere in the address space.  Programs can be statically or
>            dynamically linked, but building of shared libraries are not
>            supported with the medium model.
>
>        -mcmodel=large
>            Generate code for the large model: This model makes no
>            assumptions about addresses and sizes of sections.  Currently
>            GCC does not implement this model.

I recompiled openmpi with -mcmodel=medium and -mcmodel=large. The program
still fails. The error message changes, however:

id=1: calling irecv ...
id=0: calling isend ...
mlx4: local QP operation err (QPN 340052, WQE index 0, vendor syndrome 70, opcode = 5e)
[[55365,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 error polling LP CQ with status LOCAL QP OPERATION ERROR status number 2 for wr_id 282498416 opcode 11046  vendor error 112 qp_idx 3

(strerror(112) is "Host is down", which is certainly not correct).
This now points to system libraries - libmlx4. Am I correct in assuming that
this is either an OFED problem or OpenMPI exceeding some buffers in OFED
libraries without checking?

> If you are using OpenMPI, "ompi-info -config"
> will tell the flags used to compile it.
> Mine is 1.3.2 and has no explicit mcmodel flag,
> which according to the gcc man page should default to "small".

Are you - in fact, is anybody - able to run my test program? I am
hoping that there is some stupid misconfiguration on the cluster
that can be fixed easily, without reinstalling/recompiling all
apps ...

> Are you using 16GB per process or for the whole set of processes?

I am running the two processes on different nodes (and nothing else
on the nodes), thus each process has the full 16GB available.
>
> I hope this helps,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------

Thanks!

- Martin

> Martin Siegert wrote:
>> Hi Michael,
>>
>> On Mon, Nov 16, 2009 at 10:49:23AM -0700, Michael H. Frese wrote:
>>> Martin,
>>>
>>> Could it be that your MPI library was compiled using a small memory 
>>> model?  The 180 million doubles sounds suspiciously close to a 2 GB 
>>> addressing limit.
>>>
>>> This issue came up on the list recently under the topic "Fortran Array 
>>> size question."
>>>
>>>
>>> Mike
>>
>> I am running MPI applications that use more than 16GB of memory - I do not 
>> believe that this is the problem. Also -mmodel=large
>> does not appear to be a valid argument for gcc under x86_64:
>> gcc -DNDEBUG -g -fPIC -mmodel=large   conftest.c  >&5
>> cc1: error: unrecognized command line option "-mmodel=large"
>>
>> - Martin
>>
>>> At 05:43 PM 11/14/2009, Martin Siegert wrote:
>>>> Hi,
>>>>
>>>> I am running into problems when sending large messages (about
>>>> 180000000 doubles) over IB. A fairly trivial example program is attached.
>>>>
>>>> # mpicc -g sendrecv.c
>>>> # mpiexec -machinefile m2 -n 2 ./a.out
>>>> id=1: calling irecv ...
>>>> id=0: calling isend ...
>>>> [[60322,1],1][btl_openib_component.c:2951:handle_wc] from b1 to: b2 
>>>> error polling LP CQ with status LOCAL LENGTH ERROR status number 1 for 
>>>> wr_id 199132400 opcode 549755813  vendor error 105 qp_idx 3
>>>>
>>>> This is with OpenMPI-1.3.3.
>>>> Does anybody know a solution to this problem?
>>>>
>>>> If I use MPI_Allreduce instead of MPI_Isend/Irecv, the program just hangs
>>>> and never returns.
>>>> I asked on the openmpi users list but got no response ...
>>>>
>>>> Cheers,
>>>> Martin
>>>>
>>>> --
>>>> Martin Siegert
>>>> Head, Research Computing
>>>> WestGrid Site Lead
>>>> IT Services                                phone: 778 782-4691
>>>> Simon Fraser University                    fax:   778 782-4242
>>>> Burnaby, British Columbia                  email: siegert at sfu.ca
>>>> Canada  V5A 1S6
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf

-- 
Martin Siegert
Head, Research Computing
WestGrid Site Lead
IT Services                                phone: 778 782-4691
Simon Fraser University                    fax:   778 782-4242
Burnaby, British Columbia                  email: siegert at sfu.ca
Canada  V5A 1S6


