Bizarre problems when adding a PPC machine...
John Nelson john at computation.com
Sun Jan 13 21:57:27 PST 2002
Hi all,

I really hate to bother the mailing list but this one has me somewhat stumped. I have a four node cluster comprising Linux machines and one PPC machine. The Linux machines have been adequately tested and play well together. That PPC machine is another matter.

When I include the PPC machine (a Mac 8500 running YellowDog Linux) in my network cluster... well, things fall apart. Here's what appears on the console after running a simple test on my "root" node:

    [john at adenine examples]$ ./mpirun -np 4 simpleio
    p2_9722: p4_error: Could not allocate memory for commandline args: 553648128
    bm_list_24602: (4.056938) Listener: Unable to interrupt client pid=24601.
    Connection failed for reason: : Connection refused
    p1_1962: p4_error: net_recv read: probable EOF on socket: 1
    [john at adenine examples]$ Connection failed for reason: : Connection refused
    p3_1283: p4_error: net_recv read: probable EOF on socket: 1
    bm_list_24602: (4.076335) Listener: Unable to interrupt client pid=24601.
    Connection failed for reason: : Connection refused
    Connection failed for reason: : Connection refused
    Broken pipe
    Connection failed for reason: : Connection refused
    Connection failed for reason: : Connection refused
    Connection failed for reason: : Connection refused
    Broken pipe
    Connection failed for reason: : Connection refused
    Broken pipe
    bm_list_24602: p4_error: net_recv read: probable EOF on socket: 1

"Connection refused" is a strange, strange message because RSH seems to be working well, as do other networking applications.

I imagine that one reason could be MPICH version differences between the different architectures. These are the versions of the RPM libraries installed:

    PPC:   mpich-1.2.0-1a
    Linux: mpich-1.2.0-12

But I also compiled and installed the source code on both classes of machines.

Any ideas? It's probably something simple, but being a Beowulf newbie, it's beyond me right now.

-- John

--
_________________________________________________________
John T. Nelson
President | Computation.com Inc.
mail:     | john at computation.com
company:  | http://www.computation.com/
journal:  | http://www.computation.org/
_________________________________________________________
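One detail in the first p4_error line may be worth a look: the value 553648128 is 0x21000000 in hexadecimal, which is the number 33 with its four bytes reversed. The sketch below only illustrates that arithmetic. It assumes, and the post does not confirm this, that the argument length was written as a 32-bit integer by a little-endian x86 node and read without conversion on the big-endian PPC node; the helper name `reinterpret_le_as_be` is made up for illustration and is not part of MPICH.

```python
import struct


def reinterpret_le_as_be(value: int) -> int:
    """Pack a 32-bit unsigned int little-endian, then read the same
    four bytes back as big-endian (i.e. reverse its byte order)."""
    return struct.unpack(">I", struct.pack("<I", value))[0]


# Value taken from the p4_error line in the console output.
reported = 553648128

print(hex(reported))                   # 0x21000000
print(reinterpret_le_as_be(reported))  # 33
print(reinterpret_le_as_be(33))        # 553648128
```

If that assumption holds, a plausible command-line length of 33 bytes would look like a request for roughly 528 MB on the PPC side, which would fit a mixed-architecture build that was not configured for heterogeneous byte order. That is a guess from the number alone, not a confirmed diagnosis.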
More information about the Beowulf mailing list
