Fault tolerance and MPI
Tony Skjellum tony at MPI-Softtech.Com
Mon Feb 5 06:49:12 PST 2001
You can see our initial paper on this subject at
http://www.mpi-softtech.com/publications/mpift-paper-dsm2001.pdf

It contains references to other known works in this area.

-Tony

Anthony Skjellum, PhD, President (tony at mpi-softtech.com)
MPI Software Technology, Inc., Ste. 33, 101 S. Lafayette, Starkville, MS 39759
+1-(662)320-4300 x15; FAX: +1-(662)320-4301; http://www.mpi-softtech.com
"Best-of-breed Software for Beowulf and Easy-to-Own Commercial Clusters."

On Mon, 5 Feb 2001 Carl_Notfors at vdgc.com.sg wrote:

> Our computational model is quite simple. We have a master node and a
> number of slave nodes. All communication is between the master and the
> slaves, i.e., no internode communication, so all communication is done with
> MPI_Send and MPI_Recv (we are using LAM/MPI).
>
> The problem with MPI is that there is no fault tolerance: if a slave node
> "dies", the whole process goes down. According to the LAM documentation it
> should be possible to achieve some fault tolerance, but we have as yet not
> tried this.
>
> Is there anyone who has got this working? Is there fault tolerance in any
> other MPI implementations? Would it be better to use PVM if you want fault
> tolerance?
>
> Carl
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
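
For the master/slave model described in the quoted message, the master-side bookkeeping needed to survive a dead slave can be sketched in plain C with no MPI calls at all. Everything below (WorkerTable, table_init, table_pick_idle, table_assign, table_complete, table_mark_dead, MAX_WORKERS, NO_TASK) is a hypothetical illustration and is not taken from LAM/MPI, from the paper linked above, or from the original poster's code. It also assumes that the master can somehow learn that a slave has failed, for example from an error code returned by MPI_Send or MPI_Recv once the MPI_ERRORS_RETURN error handler is installed. Whether a given MPI implementation can keep running after such an error is implementation-dependent and is not shown here.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WORKERS 64   /* hypothetical upper bound on slave count */
#define NO_TASK (-1)     /* task ids are assumed to be non-negative */

/* Hypothetical master-side table: one entry per slave. */
typedef struct {
    int  nworkers;
    bool alive[MAX_WORKERS];   /* false once slave w is known to be dead */
    int  task_of[MAX_WORKERS]; /* task held by slave w, or NO_TASK       */
} WorkerTable;

/* Start with every slave alive and idle. */
void table_init(WorkerTable *t, int nworkers) {
    assert(nworkers >= 0 && nworkers <= MAX_WORKERS);
    t->nworkers = nworkers;
    for (int w = 0; w < nworkers; ++w) {
        t->alive[w] = true;
        t->task_of[w] = NO_TASK;
    }
}

/* Return the lowest-numbered live, idle slave, or -1 if there is none. */
int table_pick_idle(const WorkerTable *t) {
    for (int w = 0; w < t->nworkers; ++w)
        if (t->alive[w] && t->task_of[w] == NO_TASK)
            return w;
    return -1;
}

/* Record that a task was sent to a live, idle slave. */
void table_assign(WorkerTable *t, int w, int task) {
    assert(w >= 0 && w < t->nworkers && t->alive[w]);
    assert(t->task_of[w] == NO_TASK);
    t->task_of[w] = task;
}

/* Record that slave w returned its result and is idle again. */
void table_complete(WorkerTable *t, int w) {
    assert(w >= 0 && w < t->nworkers);
    t->task_of[w] = NO_TASK;
}

/* Mark slave w as dead. Returns the task it was holding so the
 * caller can hand it to another slave, or NO_TASK if it was idle. */
int table_mark_dead(WorkerTable *t, int w) {
    assert(w >= 0 && w < t->nworkers);
    int orphan = t->task_of[w];
    t->alive[w] = false;
    t->task_of[w] = NO_TASK;
    return orphan;
}
```

In a real master loop, a failed send or receive to slave w would be followed by table_mark_dead(&t, w), and any task it returns would go back to whatever table_pick_idle(&t) offers next. Because the slaves never talk to each other in this model, the table is the only state that has to be repaired when one of them disappears.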
