[Beowulf] (no subject)

maqsood at chep.pu.edu.pk maqsood at chep.pu.edu.pk
Wed May 4 20:08:44 PDT 2005


I am trying to set up MPICH-1.2.6 on a 32-node (dual-CPU) cluster. I
installed MPICH under /usr/local/mpich-1.2.6 and followed this
procedure:
./configure --with-device=ch_p4
make

/util/machine.LINUX consists of:
node1:2
node2:2
.
.
node32:2
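
The elided entries follow the same hostname:count pattern. As a minimal sketch, a list of this shape could be printed with a small shell loop, assuming the hosts really are named node1 through node32 with two processes each. The helper name gen_machines is hypothetical and is not part of MPICH.

```shell
#!/bin/sh
# Hypothetical helper (not part of MPICH): print one "hostname:count"
# line per node. Assumes hosts are named node1..nodeN, as in the
# listing above.
gen_machines() {
    count=$1   # number of nodes
    cpus=$2    # processes to start per node
    i=1
    while [ "$i" -le "$count" ]; do
        echo "node$i:$cpus"
        i=$((i + 1))
    done
}

# Print the 32 dual-CPU entries to standard output.
gen_machines 32 2
```

The output can be redirected into whichever machines file mpirun actually reads on the installation.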

When I run
mpirun -v -nolocal -np 1 cpi
it gives the following output:

running /usr/local/mpich-1.2.6/bin/cpi on 1 LINUX ch_p4 processors
Created /usr/local/mpich-1.2.6/bin/PI1529
Process 0 of 1 on node1.chep.pu.edu
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 0.000666

But when I run it on two or more nodes, it gives the following error:

mpirun -v -nolocal -np 2 cpi
running /usr/local/mpich-1.2.6/bin/cpi on 2 LINUX ch_p4 processors
Created /usr/local/mpich-1.2.6/bin/PI1371
rm_3765:  p4_error: rm_start: net_conn_to_listener failed: 32908
p0_5964:  p4_error: Child process exited while making connection to remote process on node2: 0

SSH, NFS, and NIS are working fine.

Please help me solve this problem.

Maqsood Ahmed
Assistant Professor
Centre for High Energy Physics
University of the Punjab




