[Beowulf] /lib/tls/libc.so.6 or libc-2.3.5.so vmadump errors

scunacc scunacc at yahoo.com
Mon Feb 20 08:31:34 PST 2006


Dear folks,

I am trying to establish a Clustermatic 5 (CM5) setup on a custom-built
2.6.9 kernel backported to a stock Mandriva 2006 build (with all of the
latest patches applied as of Saturday).

The headnode kernel and the CM5 host utils boot without any problem.

However, the slaves *intermittently* fail to copy the libs over properly,
and I get:

vmadump: mmap failed: /lib/tls/libc.so.6

or

vmadump: mmap failed: /lib/tls/libc-2.3.5.so

Now, the first is a symlink to the other. 
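
For anyone wanting to double-check the symlink relationship on their own
node image, here is a minimal Python sketch that walks a symlink chain one
hop at a time. The helper name resolve_chain is just an illustration, not
part of CM5 or bproc, and the path in the usage line is the one from the
errors above.

```python
import os


def resolve_chain(path):
    """Follow symlinks one hop at a time; return every path visited, in order."""
    chain = [path]
    seen = {path}
    while os.path.islink(path):
        target = os.readlink(path)
        # Relative link targets are resolved against the link's own directory.
        if not os.path.isabs(target):
            target = os.path.join(os.path.dirname(path), target)
        path = os.path.normpath(target)
        if path in seen:
            break  # symlink loop, stop rather than spin forever
        seen.add(path)
        chain.append(path)
    return chain


if __name__ == "__main__":
    # Path taken from the vmadump errors quoted above.
    for hop in resolve_chain("/lib/tls/libc.so.6"):
        print(hop)
```

If the last hop printed is not the libc you expect, the link has been
repointed somewhere along the way.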

Also, strace on a simple binary (e.g. mkdir) shows that it is indeed
trying to load *that* version of the C lib first.

I've messed around with taking that out of the path and linking
libc.so.6 to various other libc*so*'s in /usr/lib or /lib, with the same
results. It will sometimes boot, sometimes not.

This looks like a random library ordering issue, or perhaps a timing
issue where something being called in the C lib is causing vmadump to
burp.

When it does happen, though, it happens in the node_up stage.

*Sometimes* the nodes will boot OK.
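
To put a number on "intermittently", one option is to tally the failure
lines across boot attempts. A rough Python sketch follows. It assumes the
node boot output has been captured to a text file and contains the error
lines exactly as quoted above; the function name and the idea of passing
the log file on the command line are my own placeholders.

```python
import re
import sys
from collections import Counter

# Matches the error text quoted above and captures the library path.
MMAP_FAILED = re.compile(r"vmadump: mmap failed: (\S+)")


def count_mmap_failures(lines):
    """Return a Counter mapping each library path to its number of mmap failures."""
    counts = Counter()
    for line in lines:
        match = MMAP_FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Usage (hypothetical): python count_failures.py <captured-boot-log>
    with open(sys.argv[1]) as log:
        for lib, n in count_mmap_failures(log).most_common():
            print(n, lib)
```

Comparing the counts for the symlink name and the real file name over a
few dozen boots might hint at whether this is ordering or timing.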

----

Note: I have a happily running CM5 setup on several other machines with
FC4 as the core OS and basically the same custom CM5 kernel on top -
it's something funky with the M2006 C libraries AFAICS. Threading
perhaps? Not sure. 

I have other reasons for going with M2006.

I didn't fancy backporting the basic bproc code to a 2.6.12* or 2.6.15
kernel, so I simply used the 2.6.9 kernel from CM5, custom rebuilt the
same way as on the FC4 clusters.

Do let me know if you have any ideas.

Thanks!

Kind regards

Derek Jones.





