[Beowulf] Monitoring crashing machines

Perry E. Metzger perry at piermont.com
Tue Sep 9 04:45:46 PDT 2008


Carsten Aulbert <carsten.aulbert at aei.mpg.de> writes:
> For the time being we are experimenting with using "script" in many
> "screen" environments, which should be able to monitor ipmitool's SoL
> output, but somehow that strikes me as inefficient as well.

First, script+screen is almost never the right tool for this -- use
expect instead. It's the Swiss Army chainsaw of sysadmin tools.

Second, I suspect you can use ipmish or a similar tool to simply dump
the console output to stdout and redirect it to a file.
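
A minimal sketch of the "redirect it to a file" idea, adding a timestamp
to each line so a crash can be placed in time afterwards. It assumes the
console is reachable through some command that writes to stdout; the
ipmitool invocation in the comment (host name, user, password file, log
path) is purely illustrative, and ipmitool may insist on a terminal on
stdin, which this sketch does not provide.

```python
import subprocess
import time


def log_console(cmd, log_path):
    """Run cmd and append each line of its output to log_path, timestamped."""
    with open(log_path, "a", encoding="utf-8") as log:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error messages in the same log
        )
        for raw in proc.stdout:
            # Console output is not guaranteed to be valid UTF-8.
            line = raw.decode("utf-8", errors="replace").rstrip("\r\n")
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {line}\n")
            log.flush()  # a crash log is only useful if it reaches disk
        proc.stdout.close()
        return proc.wait()


# Hypothetical usage (names are placeholders, not taken from this thread):
# log_console(
#     ["ipmitool", "-I", "lanplus", "-H", "node01-bmc", "-U", "admin",
#      "-f", "/etc/ipmi.pass", "sol", "activate"],
#     "/var/log/consoles/node01.log",
# )
```

One such process per node is cheap, and the per-line flush means the last
messages before a hang should already be in the file.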

> Initially, conserver.com looked nice and we also found an IPMI interface
> for it, but that comes with two downsides: (1) it blocks IPMI access (I
> have yet to find out if a secondary user can use SoL when another user
> is using this already, but I doubt it)

The whole point of conserver is that *conserver* allows multiple users
to connect at once. They connect to conserver, not to the machine via
IPMI, and conserver multiplexes the one IPMI connection among them.
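
To illustrate the multiplexing idea only (this is not conserver's code,
and the class and method names are made up for the example): one reader
owns the single console connection and hands every chunk it receives to
whoever is attached at that moment.

```python
import threading


class ConsoleFanout:
    """Share one console byte stream among any number of attached viewers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._viewers = []

    def attach(self, callback):
        """Register a callable that receives each chunk of console output."""
        with self._lock:
            self._viewers.append(callback)

    def detach(self, callback):
        """Stop delivering output to a previously attached callable."""
        with self._lock:
            self._viewers.remove(callback)

    def publish(self, data):
        """Deliver one chunk read from the single console connection."""
        with self._lock:
            viewers = list(self._viewers)  # copy so callbacks run unlocked
        for deliver in viewers:
            deliver(data)
```

The single connection to the BMC is held by the code that calls
publish(); viewers come and go without ever touching IPMI themselves.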

> and (2) it simply does not catch messages appearing in dmesg (simple
> ones like plugging in a USB keyboard), but that may be a
> configuration problem on our side.

Probably.

-- 
Perry E. Metzger		perry at piermont.com
