[Beowulf] Weird problem with mpp-dyna


Joe Landman landman at scalableinformatics.com
Wed Mar 14 11:54:53 PDT 2007


Joshua Baker-LePain wrote:

>> Do you use a statically linked binary or did you relink it with your
>> mpich?
> 
> Agh.  I forgot to mention this little wrinkle.  LSTC software 
> distribution is... interesting.

Yup.  Caused us a lot of fun at some customer sites.

>  For mpp-dyna, they ship dynamically 
> linked binaries compiled against a specific version of LAM/MPI (7.0.3 in 
> this case).

Yup.  Very hard to come by, that particular build.  Very hard.

> They also provide the matching pre-compiled LAM/MPI 
> libraries on their site. For a fun little wrinkle, RHEL/CentOS ships 
> LAM/MPI 7.0.6. However, the spec file in their RPM does *not* include 
> the --enable-shared flag.  IOW, the OS vendor's LAM/MPI package has no 
> .so files.

I rebuilt this (the LAM) for our customer.  Works nicely now.

> 
> It seems like it'd be worth re-compiling the centos lam RPM to include 
> the shared libraries and run against those to see if it helps.

Try running ldd against mpp-dyna-big-long-name.
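
If it helps to script that check across nodes, here is a minimal sketch in Python that runs ldd and reports which LAM/MPI shared libraries the binary resolves to, and which are missing. The helper names (parse_ldd, mpi_libs, run_ldd) are made up for illustration, and matching on "lam" or "mpi" in the library name is only a rough heuristic, not anything LSTC or LAM documents.

```python
import subprocess
import sys


def parse_ldd(output):
    """Map each library name in ldd output to its resolved path.

    A value of None means ldd reported "not found"; an empty string
    means a virtual object with no path (e.g. the vdso).
    """
    libs = {}
    for line in output.splitlines():
        line = line.strip()
        if "=>" not in line:
            continue  # skip the dynamic loader line and anything unrecognized
        name, _, target = line.partition("=>")
        name = name.strip()
        target = target.strip()
        if target.startswith("not found"):
            libs[name] = None
        else:
            # drop the trailing "(0x...)" load address
            libs[name] = target.split("(")[0].strip()
    return libs


def mpi_libs(libs):
    """Keep only entries that look like LAM/MPI libraries (name heuristic)."""
    return {k: v for k, v in libs.items() if "lam" in k or "mpi" in k}


def run_ldd(binary):
    """Run ldd on a binary and return its stdout as text."""
    result = subprocess.run(["ldd", binary], capture_output=True,
                            text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # usage: python check_ldd.py ./mpp-dyna-big-long-name
    for name, path in sorted(mpi_libs(parse_ldd(run_ldd(sys.argv[1]))).items()):
        print(name, "->", path if path is not None else "NOT FOUND")
```

Anything reported as NOT FOUND, or resolving to a LAM tree other than the one you lamboot from, would be worth a closer look.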

> 
>> We have run lstc ls-dyna mpp970 and mpp971 across more than 16 nodes
>> without any issues on Scyld CW4 which is also centos 4 based.
> 
> We can run straight structural sims across as many nodes/CPUs as we've 
> tried, and ditto for straight thermal sims.  It's just on coupled 
> structural/thermal sims that this issue crops up.  That, to me, rather 
> points to a bug in dyna itself.  But the fact that the bug manifests 
> itself (at least in part) by the MPI job trying to talk to a different 
> network interface than was 'lamboot'ed is what is throwing me off a bit.



-- 

Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615



