[Beowulf] errors while testing machines

akhtar Rasool akhtar_samo at yahoo.com
Thu Dec 9 01:26:13 PST 2004


 

After extracting MPICH in /usr/local, I ran the following steps:


 

1- tcsh

2- ./configure --with-comm=shared --prefix=/usr/local

3- make

4- make install

5- util/tstmachines

Step 5 failed with the following error:

Errors while trying to run rsh 192.168.0.25 -n /bin/ls /usr/local/mpich/mpich-1.2.5.2/mpichfoo
Unexpected response from 192.168.0.25:

> /bin/ls: /usr/local/mpich/mpich-1.2.5.2/mpichfoo: No such file or directory

The ls test failed on some machines.

This usually means that you do not have a common filesystem on all of the machines in your machines list; MPICH requires this for mpirun (it is possible to handle this in a procgroup file; see the ...)

Other possible problems include:

The remote shell command rsh does not allow you to run ls.
    See the documentation about remote shell and rhosts.

You have a common filesystem, but with inconsistent names.
    See the documentation on the automounter fix.

1 error were encountered while testing the machines list for LINUX

only these machines seem to be available

host1
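
The failing check can be repeated by hand for every entry in the machines list. The following is only a minimal sketch and not part of MPICH: the function name check_shared_path and the RSH variable are made up for illustration, and it assumes the machines file has one host per line, optionally followed by ":n" for a processor count, with "#" starting a comment line.

```shell
#!/bin/sh
# Sketch only: check_shared_path and RSH are illustrative names, not MPICH tools.
# Usage: check_shared_path MACHINES_FILE PATH
check_shared_path() {
    machines_file=$1
    path=$2
    status=0
    while IFS= read -r entry; do
        # Skip blank lines and comment lines.
        case $entry in
            ''|'#'*) continue ;;
        esac
        host=${entry%%:*}   # drop an optional ":n" processor count
        # Same form of command as in the error message above;
        # stdin is redirected so the remote shell cannot consume the host list.
        if ${RSH:-rsh} "$host" -n /bin/ls "$path" </dev/null >/dev/null 2>&1; then
            echo "$host: ok"
        else
            echo "$host: cannot list $path"
            status=1
        fi
    done < "$machines_file"
    return $status
}
```

For example, "check_shared_path machines.LINUX /usr/local/mpich/mpich-1.2.5.2" would report each host on which that directory (taken from the error message) is not visible. Where machines.LINUX lives depends on the installation.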

 


 

 

    

This is only a two-node cluster: host1 is the server, on which MPICH is being installed, and 192.168.0.25 is the client.

rsh logs in freely on both nodes.

On the server side, the file machines.LINUX contains:

-192.168.0.25

-host1

Kindly help

   

 

Akhtar


		