[Beowulf] Monitoring crashing machines

Robert G. Brown rgb at phy.duke.edu
Tue Sep 9 11:19:24 PDT 2008


On Tue, 9 Sep 2008, Carsten Aulbert wrote:

> My question now, is there a cute little way to gather all the console
> outputs of > 1000 nodes? The nodes don't have physical serial cables
> attached to them - nor do we want to use many concentrators to achieve
> this - but the off-the-shelf Supermicro boxes all have an IPMI card
> installed and SoL works quite ok.

Syslog-ng?  Popping a USB flash disk on them to use as an alternative
log location (if the kernel doesn't actively lock up on the disk error)?
Booting from a USB flash image or diskless, so that a disk crash is just
a disk crash?

    rgb
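
A rough way to check the syslog route before committing to it: a forwarding rule such as "*.* @loghost" sends plain UDP datagrams, so a throwaway collector on the loghost can show whether kernel-facility messages arrive at all. The sketch below is only an illustration, not a replacement for syslog-ng or rsyslog. The function names and port 5140 are assumptions made for this example (the standard syslog port 514 needs root).

```python
import re
import socket

FACILITY_KERNEL = 0  # RFC 3164 facility code used for kernel messages

_PRI = re.compile(r"^<(\d{1,3})>")


def parse_pri(line):
    """Split a syslog line into (facility, severity, rest).

    Returns None when the line has no leading <PRI> field.
    """
    match = _PRI.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    # PRI encodes facility * 8 + severity.
    return pri // 8, pri % 8, line[match.end():]


def is_kernel_message(line):
    """True when the line carries the kernel facility."""
    parsed = parse_pri(line)
    return parsed is not None and parsed[0] == FACILITY_KERNEL


def open_collector(bind_addr="0.0.0.0", port=5140):
    """Open a UDP socket for forwarded syslog (port is an assumed value)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def receive_one(sock):
    """Block for one datagram and return (sender_ip, text)."""
    data, (host, _port) = sock.recvfrom(8192)
    return host, data.decode("utf-8", errors="replace")


if __name__ == "__main__":
    collector = open_collector()
    while True:
        host, text = receive_one(collector)
        if is_kernel_message(text):
            print(host, text)
```

If a node logs a kernel event (the USB keyboard case, say) and nothing shows up here, the gap is on the sending side rather than in the loghost configuration.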

>
> Initially, conserver.com looked nice and we also found an IPMI interface
> for it, but that comes with two downsides: (1) it blocks IPMI access (I
> have yet to find out if a secondary user can use SoL when another user
> is using this already, but I doubt it) and (2) it simply does not catch
> messages appearing in dmesg (simple ones like plugging in a USB
> keyboard), but that may be a configuration problem on our side.
>
> Also we tried (r)syslog but somehow this does not get all the messages
> either, even when using something like *.* @loghost.
>
> For the time being we are experimenting with using "script" in many
> "screen" environments, which should be able to monitor ipmitool's SoL
> output, but somehow that strikes me as inefficient as well.
>
> So, my question boils down to: How do people solve this problem?
>
> Thanks a lot
>
> Cheers
>
> Carsten
>
>
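
The "script" in "screen" arrangement described above amounts to running ipmitool on a pseudo-terminal and saving whatever it prints. A minimal sketch of the same idea in one process per node follows. It assumes ipmitool's lanplus interface and its -E option (password taken from the IPMI_PASSWORD environment variable); the helper names, host name, and log path are hypothetical, and this has not been tried at the 1000-node scale discussed here.

```python
import os
import pty
import subprocess


def sol_argv(host, user):
    """Build an ipmitool Serial-over-LAN command line for one node.

    -E tells ipmitool to read the password from IPMI_PASSWORD, so it
    does not appear in the process list.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E",
            "sol", "activate"]


def capture(argv, log_path):
    """Run argv on a pseudo-terminal and append its output to log_path.

    Returns the child's exit status once it closes the terminal.
    """
    master, slave = pty.openpty()
    proc = subprocess.Popen(argv, stdin=slave, stdout=slave, stderr=slave,
                            close_fds=True)
    os.close(slave)  # the parent keeps only the master end
    with open(log_path, "ab") as log:
        while True:
            try:
                chunk = os.read(master, 4096)
            except OSError:
                # Linux reports EIO once the child side is closed.
                break
            if not chunk:
                break
            log.write(chunk)
            log.flush()  # keep the file current in case the node dies
    os.close(master)
    return proc.wait()


if __name__ == "__main__":
    # Hypothetical node name, user, and log location.
    capture(sol_argv("node0001", "admin"), "/var/log/sol/node0001.log")
```

This does nothing about the first downside in the original message: each capture still holds that node's SoL session for as long as it runs.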

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977


