MPI dies

Tony Skjellum tony at MPI-Softtech.Com
Fri Sep 15 17:11:32 PDT 2000


Victor, while the MPI standard doesn't support this model, at least
MPI/Pro and MPICH tend to keep working in master/slave mode when a slave
node dies.

A potential model for this is to build a configuration like this:

Split MPI_COMM_WORLD into an array of sub-communicators, with the master
as node zero and the Kth slave as node 1 of the Kth communicator.

Use these communicators for doing point-to-point messaging between the
master and the slaves.

DO NOT DO COLLECTIVE COMMUNICATION between slaves.

FYI, you can do the same just using MPI_COMM_WORLD, but this seems safer.

Given these caveats, one believes that the application will survive the
death of one of the slaves.
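
The reissue scheme Victor describes needs only a little bookkeeping on the
master. A sketch in plain C, with no MPI calls: the struct and function
names are hypothetical, time is passed in as seconds (MPI_Wtime would do),
and the `alive` array is assumed to be indexed by worker rank.

```c
#include <assert.h>
#include <stddef.h>

#define NO_WORKER (-1)

enum block_state { BLOCK_PENDING, BLOCK_ISSUED, BLOCK_DONE };

struct block {
    enum block_state state;
    int worker;         /* worker holding the block, or NO_WORKER */
    double issued_at;   /* time of issue, in seconds */
};

/* Hand a block to a worker and remember when it went out. */
void issue_block(struct block *b, int worker, double now)
{
    b->state = BLOCK_ISSUED;
    b->worker = worker;
    b->issued_at = now;
}

/* Record a result. Returns 1 if accepted, 0 if the block is not currently
   issued to this worker (for example, a late answer from a worker that
   was already presumed dead). */
int complete_block(struct block *b, int worker)
{
    if (b->state != BLOCK_ISSUED || b->worker != worker)
        return 0;
    b->state = BLOCK_DONE;
    return 1;
}

/* Put every overdue block back in the pending pool and mark its worker
   as presumed dead. Returns the number of blocks requeued. */
size_t requeue_overdue(struct block *blocks, size_t nblocks,
                       int *alive, double now, double timeout)
{
    size_t i, requeued = 0;

    for (i = 0; i < nblocks; i++) {
        struct block *b = &blocks[i];

        if (b->state == BLOCK_ISSUED && now - b->issued_at > timeout) {
            assert(b->worker != NO_WORKER);  /* issued blocks have an owner */
            alive[b->worker] = 0;            /* assume the node has failed */
            b->state = BLOCK_PENDING;
            b->worker = NO_WORKER;
            requeued++;
        }
    }
    return requeued;
}
```

The master would call requeue_overdue periodically between polls for
results, and hand pending blocks only to workers still marked alive.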

However, the behavior of MPI_Finalize() is ambiguous.  MPI 1.x clarified
Finalize as a barrier, so a hang could occur there, depending on how that
barrier is implemented, but you should be able to run up to that point.

Note, all of this describes the behavior of two specific implementations,
but these implementations have bent to the demands of users (maybe it
started this way by accident, but people like the behavior).  We don't know
what LAM does, or whether it can work that way.

We have customers who operate under circumstances analogous to those
described above, and continue to compute for days/weeks even though slaves
die.  If the master dies, of course it is all over.

Tony

Anthony Skjellum, PhD, President (tony at mpi-softtech.com) 
MPI Software Technology, Inc., Ste. 33, 101 S. Lafayette, Starkville, MS 39759
+1-(662)320-4300 x15; FAX: +1-(662)320-4301; http://www.mpi-softtech.com
"Best-of-breed Software for Beowulf and Easy-to-Own Commercial Clusters."

On Fri, 15 Sep 2000, Victor Karyo wrote:

> Is there a technique to handle node failure?  Shortly, I'll be working on an
> algorithm that is naturally parallel and divided into coarse-grain "blocks".
> I want to use a master/worker scheme.  The master is to be set to reissue
> blocks if the block doesn't return from the worker fast enough, on the
> assumption the node has failed.  I know I can't rejoin a node after it
> fails, but if the node fails will the whole app die?
> 
> Also, is there a way to detect the number of nodes other than at 
> initialization, so I can tell if a node has died?
> 
> (I plan on using MPI/Pro on a RH6.2 8-way single-proc Intel cluster with
> 100 Mbps switched Ethernet.)
> 
> Thanks
> Victor Karyo.
> 
> 
> 
> 
> There are some efforts to build fault-tolerant MPIs, but standard
> MPI-1.x is supposed to kill the parallel application if a node dies,
> or else the underlying system must transparently handle the fault.
> 
> 
> On Thu, 14 Sep 2000, Horatio B. Bogbindero wrote:
> 
>  >
>  > what happens if a node in MPI dies? is the entire computation lost?
>  >
>  >
>  > ---------------------
>  > william.s.yu at ieee.org
>  >
>  > _______________________________________________
>  > Beowulf mailing list
>  > Beowulf at beowulf.org
>  > http://www.beowulf.org/mailman/listinfo/beowulf
>  >
> 




