[Beowulf] Update on mpi problem

Ashley Pittman apittman at concurrent-thinking.com
Thu Jul 10 04:44:26 PDT 2008


On Wed, 2008-07-09 at 23:58 -0400, Joe Landman wrote:
> Ok ... thought this would be interesting for some folks.  As a reminder, 
> using Open-MPI 1.2.6 for a customer code, seeing different behavior than 
> in the past.  Scratching my head over it (seemingly non-deterministic).
> 
> I tried using '--mca btl ^sm' (turn off shared memory usage) on the 
> non-infiniband machine, and ... it runs.  Repeatedly.  To completion.

See, I told you that would be a worthwhile test.

> It looks like this is an MPI stack issue of some sort.  I'll ping the 
> Open-MPI list and see what they think.

That doesn't necessarily follow. If you are posting your sends before
your receives then you are relying on unexpected-message buffering
within the MPI library. How much of this is available is up to the
library, not the standard, so I think it's possible that Open MPI is
being MPI compliant in both cases.
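
To illustrate the point, here is a rough toy model in plain Python. It
is not MPI code and not how any MPI library is implemented; the names
exchange, buffer_slots and n_messages are made up for the sketch. Two
"ranks" each send all their messages before receiving any, and a
bounded queue stands in for the library's unexpected-message buffer.

```python
import queue
import threading


def exchange(buffer_slots, n_messages, timeout=0.5):
    """Toy model: two ranks each send n_messages, then receive n_messages.

    buffer_slots (must be >= 1) is an assumed stand-in for how many
    unexpected messages the library will hold per receiver.
    Returns True if both ranks finish, False if either one stalls.
    """
    # inbox[r] holds messages sent to rank r that it has not yet received.
    inbox = [queue.Queue(maxsize=buffer_slots) for _ in range(2)]
    done = [False, False]

    def rank(me):
        peer = 1 - me
        try:
            # Sends first: each "send" blocks once the peer's buffer is full.
            for i in range(n_messages):
                inbox[peer].put((me, i), timeout=timeout)
            # Receives only after every send has been accepted.
            for _ in range(n_messages):
                inbox[me].get(timeout=timeout)
            done[me] = True
        except (queue.Full, queue.Empty):
            # A timeout here models a hang in the real program.
            pass

    threads = [threading.Thread(target=rank, args=(r,)) for r in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return all(done)


if __name__ == "__main__":
    print(exchange(buffer_slots=4, n_messages=2))  # True: buffer absorbs the sends
    print(exchange(buffer_slots=1, n_messages=2))  # False: both ranks stall sending
```

The same send-then-receive pattern completes or hangs depending only
on how much buffering happens to be available, which is why a change
of transport can change the behaviour without either case being a bug
in the library.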

Ashley Pittman.



