Bizarre problems when adding a PPC machine...

John Nelson john at computation.com
Sun Jan 13 21:57:27 PST 2002


Hi all,

I really hate to bother the mailing list, but this one has me somewhat 
stumped.  I have a four-node cluster comprising Linux machines and one 
PPC machine.  The Linux machines have been adequately tested and play 
well together.  The PPC machine is another matter.  When I include the 
PPC machine (a Mac 8500 running YellowDog Linux) in my cluster, things 
fall apart.  Here's what appears on the console after running a simple 
test on my "root" node:


[john at adenine examples]$ ./mpirun -np 4 simpleio
p2_9722:  p4_error: Could not allocate memory for commandline args: 553648128
bm_list_24602: (4.056938) Listener: Unable to interrupt client pid=24601.
Connection failed for reason: : Connection refused
p1_1962:  p4_error: net_recv read:  probable EOF on socket: 1
[john at adenine examples]$ Connection failed for reason: : Connection refused
p3_1283:  p4_error: net_recv read:  probable EOF on socket: 1
bm_list_24602: (4.076335) Listener: Unable to interrupt client pid=24601.
Connection failed for reason: : Connection refused
Connection failed for reason: : Connection refused
Broken pipe
Connection failed for reason: : Connection refused
Connection failed for reason: : Connection refused
Connection failed for reason: : Connection refused
Broken pipe
Connection failed for reason: : Connection refused
Broken pipe
bm_list_24602:  p4_error: net_recv read:  probable EOF on socket: 1
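
One detail in the first p4_error line may be worth a look: 553648128 is 
0x21000000 in hex, which is the small number 33 with its four bytes 
reversed.  The sketch below only checks that arithmetic.  It assumes the 
value is a 32-bit length field; whether it was actually sent between 
hosts of different byte order is a guess on my part and is not confirmed 
by anything in the log.  The helper name swap32 is made up for this 
illustration.

```python
import struct


def swap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    # Pack as big-endian, then reinterpret the same four bytes as little-endian.
    return struct.unpack("<I", struct.pack(">I", value))[0]


# Value printed in the "Could not allocate memory for commandline args" line.
reported = 553648128

print(hex(reported))     # hex form of the reported value
print(swap32(reported))  # the same four bytes read in the opposite byte order
```

If the swapped value looks like a plausible argument length while the 
reported one does not, a byte-order mismatch between the two 
architectures would be one possible explanation to rule out.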


"Connection refused" is a strange message, because RSH seems to be 
working well, as do other networking applications.  I imagine that one 
reason could be MPICH version differences between the two 
architectures.  These are the versions of the RPM libraries installed:

    PPC: mpich-1.2.0-1a
    Linux: mpich-1.2.0-12

But I also compiled and installed MPICH from source on both classes of 
machines.

Any ideas?  It's probably something simple, but being a Beowulf newbie, 
it's beyond me right now.

-- John

-- 
_________________________________________________________

John T. Nelson
President       |      Computation.com Inc.
mail:           |      john at computation.com
company:        |      http://www.computation.com/
journal:        |      http://www.computation.org/

_________________________________________________________




