[Beowulf] (no subject)

maqsood at chep.pu.edu.pk maqsood at chep.pu.edu.pk
Wed May 4 20:08:44 PDT 2005

I am trying to set up MPICH-1.2.6 on a 32-node (dual-CPU) cluster. I
installed it under /usr/local/mpich-1.2.6 and configured it with:
./configure --with-device=ch_p4

/util/machine.LINUX consists of:

When I run
mpirun -v -nolocal -np 1 cpi
it gives the following output:

running /usr/local/mpich-1.2.6/bin/cpi on 1 LINUX ch_p4 processors
Created /usr/local/mpich-1.2.6/bin/PI1529
Process 0 of 1 on node1.chep.pu.edu
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 0.000666

But when I run it on two or more nodes, it gives the following error:

mpirun -v -nolocal -np 2 cpi
running /usr/local/mpich-1.2.6/bin/cpi on 2 LINUX ch_p4 processors
Created /usr/local/mpich-1.2.6/bin/PI1371
rm_3765:  p4_error: rm_start: net_conn_to_listener failed: 32908
p0_5964:  p4_error: Child process exited while making connection to remote process on node2: 0

ssh, NFS and NIS are all working fine.
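
A minimal sketch of how the ssh part of that statement could be checked for every node. It assumes the machines file has one "host" or "host:ncpus" entry per line with '#' comments; the function names machine_hosts and check_hosts are hypothetical helpers, not part of MPICH.

```shell
#!/bin/sh
# Hypothetical helper: print the host names listed in a machines file.
# Assumes one "host" or "host:ncpus" entry per line; '#' starts a comment.
machine_hosts() {
    sed -e 's/#.*//' -e 's/:.*//' -e 's/[[:space:]]//g' "$1" | grep -v '^$'
}

# Hypothetical helper: run a trivial command on each host over ssh.
# BatchMode=yes makes ssh fail instead of prompting for a password,
# so a host that still needs interactive login shows up as FAILED.
check_hosts() {
    for h in $(machine_hosts "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true 2>/dev/null; then
            echo "$h: ok"
        else
            echo "$h: FAILED"
        fi
    done
}
```

It would be called as check_hosts followed by the path to the machines file. This only confirms non-interactive ssh login from the node where mpirun is started. It does not test whether a process on the remote node can open a connection back to that node, which appears to be the step reported in the net_conn_to_listener error.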

Please help me solve this problem.

Maqsood Ahmed
Assistant Professor
Centre for High Energy Physics
University of the Punjab

