[Beowulf] An annoying MPI problem
landman at scalableinformatics.com
Thu Jul 10 08:02:12 PDT 2008
Lombard, David N wrote:
>> I'll try all the usual things (reduce the optimization level, etc).
>> Sage words of advice (and clue sticks) welcome.
> Not trying to sound like an ad...
> The currently shipping Intel Trace Collector and Analyzer (7.1) includes
> message correctness checking. An option is available that adds a
> library to an Intel MPI build that checks messages during the run.
> You can then view any errors it found in the Intel Trace Analyzer.
> This may reveal a problem that has only just started to trip the
> code up. I certainly have welts from those; I suspect others do too.
Actually, Intel MPI and its related tools are among the things we want
to try. The user may be open to that (especially if it is less painful
than the alternative).
We now have reliable, functional non-sm/non-ib based execution on
multiple machines. A new code drop is coming, so we have to wait on
that. Once we have it, we'll be doing more testing.
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 866 888 3112
cell : +1 734 612 4615