[Beowulf] using watchdog timers to reboot a hung system automagically: Good idea or bad?
Greg Lindahl lindahl at pbm.com
Fri Oct 23 14:08:41 PDT 2009
On Fri, Oct 23, 2009 at 09:01:28AM -0700, ed in 92626 wrote:

> You could also do something at the system level to prevent it. If the system
> boots and the previous_uptime is less than one hour shut down the system.
> The WD timer will not wake it up.

You have 2 power failures 15 minutes apart. Your entire cluster shuts down.

It's turtles, all the way down.

-- greg
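
For illustration only, here is a minimal sketch of the boot-time check from the quoted suggestion, run against the two-power-failures case from the reply. The function names, the 8-node cluster size, and the idea that each node has some record of its previous uptime are assumptions made for this example; neither message specifies an implementation.

```python
# Hypothetical sketch of the boot-time rule proposed in the quoted message.
# How previous_uptime is recorded (e.g. a heartbeat file) is an assumption
# and is not modelled here.

ONE_HOUR_S = 3600  # threshold taken from the quoted suggestion


def should_halt_on_boot(previous_uptime_s: float,
                        threshold_s: float = ONE_HOUR_S) -> bool:
    """Return True if the previous boot lasted less than the threshold."""
    return previous_uptime_s < threshold_s


def nodes_halting(previous_uptimes_s: list[float]) -> int:
    """Count the nodes that would shut themselves down under this rule."""
    return sum(should_halt_on_boot(u) for u in previous_uptimes_s)


# Scenario from the reply: power comes back, then fails again 15 minutes
# later. Every node records a previous uptime of about 900 seconds.
cluster = [15 * 60] * 8  # 8 nodes, an arbitrary illustrative size
print(nodes_halting(cluster), "of", len(cluster), "nodes halt")
```

Under these assumptions the rule cannot tell a node that keeps hanging from a node that lost power twice, so every node in the example halts.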
