[Beowulf] using watchdog timers to reboot a hung system automagically: Good idea or bad?

Greg Lindahl lindahl at pbm.com
Fri Oct 23 14:08:41 PDT 2009


On Fri, Oct 23, 2009 at 09:01:28AM -0700, ed in 92626 wrote:

> You could also do something at the system level to prevent it. If the system
> boots and the previous_uptime is less than one hour, shut down the system.
> The WD timer will not wake it up.

You have 2 power failures 15 minutes apart. Your entire cluster shuts
down.

It's turtles, all the way down.
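
To make the failure mode concrete, here is a minimal sketch of the quoted
rule applied to a whole cluster. The function names, the 64-node cluster
and the 30-day uptime are hypothetical, chosen only for illustration; the
one-hour threshold comes from the quoted proposal. It assumes every node
loses power at the same moment, so every node records the same previous
uptime.

```python
MIN_UPTIME_SECONDS = 3600  # one-hour threshold from the quoted proposal


def should_stay_down(previous_uptime_seconds, threshold=MIN_UPTIME_SECONDS):
    """Quoted rule: if the previous uptime was too short, shut down."""
    return previous_uptime_seconds < threshold


def surviving_nodes(node_count, uptimes_at_each_power_loss,
                    threshold=MIN_UPTIME_SECONDS):
    """Count nodes still running after a series of cluster-wide power losses.

    Assumption: all nodes see the same outages, so each reboot reports the
    same previous uptime on every node.
    """
    alive = node_count
    for previous_uptime in uptimes_at_each_power_loss:
        if alive == 0:
            break  # nothing left to reboot
        if should_stay_down(previous_uptime, threshold):
            alive = 0  # every node applies the rule and powers off
    return alive


if __name__ == "__main__":
    thirty_days = 30 * 24 * 3600
    fifteen_minutes = 15 * 60
    # First outage follows a long uptime; the second comes 15 minutes later.
    print(surviving_nodes(64, [thirty_days, fifteen_minutes]))
```

The rule cannot tell a node that keeps hanging from a node that was
healthy but lost power twice, so the second outage takes down all 64
nodes.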

-- greg





