[Beowulf] Re: [Linux-HA] Couldn't get watchdog to work

Alex Vrenios alex at DSRLab.com
Tue Dec 28 09:14:16 PST 2004


> -----Original Message-----
> Paul Chen wrote:
> > Both nodes did restart 
> > heartbeat but none of them reboot or shut down. Am I doing 
> > something wrong?
> >
> Alan Robertson wrote:
> The watchdog timer will only kill the system if heartbeat goes insane.
> It didn't.  So, the watchdog timer is happy.
> 
> At this point in time, the watchdog timer is not a 
> replacement for a STONITH device.
>
Which is exactly what I am looking into (the STONITH device)...

I see two solutions, one hardware and one software. The hardware solution
looks expensive, but I believe the software solution will help Mr. Chen
(above), and would appreciate comments.

I would have my "backup" system execute a command as part of its attempts to
assume the identity, responsibilities and resources of the "primary" system.
The command is run from backup, as follows:

   root@backup> ssh root@primary shutdown -h now

This will not work in all cases (the primary must still be reachable and
able to run commands over ssh), but it should work in cases like the above.
A hardware solution is more general, but it doesn't hurt to run this command
in any case.
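
As a rough sketch only, the command could be wrapped so that the takeover
script does not hang when the primary is unreachable. The names
fence_primary, PRIMARY_HOST and FENCE_SSH are made up for illustration;
BatchMode and ConnectTimeout are standard OpenSSH options, and key-based
root login from backup to primary is assumed to be set up already.

```shell
#!/bin/sh
# Hypothetical wrapper around "ssh root@primary shutdown -h now".
# PRIMARY_HOST and FENCE_SSH are illustrative names, not part of heartbeat.
PRIMARY_HOST="${PRIMARY_HOST:-primary}"
FENCE_SSH="${FENCE_SSH:-ssh}"

fence_primary() {
    # BatchMode=yes fails instead of waiting at a password prompt;
    # ConnectTimeout=5 bounds the wait if the primary does not answer.
    "$FENCE_SSH" -o BatchMode=yes -o ConnectTimeout=5 \
        "root@${PRIMARY_HOST}" 'shutdown -h now'
}

# Example use from the backup's takeover script (left commented out here):
# if fence_primary; then
#     echo "shutdown request sent to ${PRIMARY_HOST}"
# else
#     echo "could not ask ${PRIMARY_HOST} to shut down" >&2
# fi
```

A non-zero exit status only says the request could not be confirmed, not
that the primary is down, so this remains a best-effort step and not a
replacement for a real STONITH device.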

Alex Vrenios
DSRLab




