What happens with a failed node? (Scyld)

Tony Stocker akostocker at hotmail.com
Wed Feb 6 09:48:36 PST 2002

Hi All,

Quick question.  What happens if a compute node fails or loses its network 
connectivity while processing something (non-parallelized)?  How long does 
it take the host node to realize something is wrong?  What does the host 
node do then?  Does it send mail reporting which node went down and what was 
running on it at the time?

What about if the node was running a parallelized program that is also 
being run by other elements of the cluster?  What are the node-fault 
procedures/setup in that case?

Thanks very much,

Tony Stocker

