[Beowulf] MPICH fault handling

Vinodh gvinodh1980 at yahoo.co.in
Sat Oct 30 00:38:35 PDT 2004


Hello,

I have set up a four-node Beowulf cluster using MPICH.

While testing, I started the mpd daemons on all the nodes
from the master with mpdboot, then unplugged one slave
node from the LAN. When I then tried to run a program
with mpiexec, the master node did not recognise that
one of the nodes had failed.

I then checked the archives at www.beowulf.org; the last
discussion of MPI node failure was in January 2003.

So I would like to know whether there have been any
updates to MPI fault handling since then.

What can I do if:
1. a slave node fails?
2. the master node fails?


		


