[Beowulf] Monitoring crashing machines

Tue Sep 9 05:29:15 PDT 2008

Carsten Aulbert wrote:
> Hi all,
>
> I would tend to guess this problem is fairly common and many solutions
> are already in place, so I would like to enquire about your solutions
> to the problem:
>
> In our large cluster we have certain nodes going down with I/O hard disk
> errors. We have some suspicion about the causes but would like to
> investigate this further. However, the log files don't show much, if
> anything at all (which is understandable given that the log files reside
> on disk and we are hitting disk I/O errors). The console does show
> some interesting messages, but it cannot scroll back far enough.
>
> My question now, is there a cute little way to gather all the console
> outputs of > 1000 nodes? The nodes don't have physical serial cables
> attached to them - nor do we want to use many concentrators to achieve
> this - but the off-the-shelf Supermicro boxes all have an IPMI card
> installed and SoL works quite ok.
>
> Initially, conserver.com looked nice and we also found an IPMI interface
> for it, but that comes with two downsides: (1) it blocks IPMI access (I
> have yet to find out if a secondary user can use SoL when another user
> is using this already, but I doubt it) and (2) it simply does not catch
> messages appearing in dmesg (simple ones like plugging in a USB
> keyboard), but that may be a configuration problem on our side.
>
> Also we tried (r)syslog but somehow this does not get all the messages
> either, even when using something like *.* @loghost.
>
> For the time being we are experimenting with using "script" in many
> "screen" environments, which should be able to monitor ipmitool's SoL
> output, but somehow that strikes me as inefficient as well.
>
> So, my question boils down to: How do people solve this problem?
>
> Thanks a lot
>
> Cheers
>
> Carsten
>
>   
We use conserver here at SiCortex, but it doesn't talk to node consoles
directly.  Instead, we've written a kind of intermediary between
conserver and the real console access.  The situation isn't exactly
parallel, but if you wind up writing your own "intermediary", the
structure and code might be useful.

Node Linux -> custom char device driver -> scan chain hardware
-> embedded uClinux board-level microprocessor
-> "scan daemon", which concentrates the terminals from 27 nodes
-> TCP/IP socket -> x86 service processor
-> "scconserver", which speaks the idiosyncratic terminal protocol on one
   side and demultiplexes the consoles into individual TCP sockets
-> conserver, which does all the usual conserver stuff.

This works well enough at the 972-node scale.
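
To make the demultiplexing step concrete, here is a minimal Python sketch.
It is not the scconserver code or its protocol: the frame layout (one byte
of node id, two bytes of payload length, then the payload) and the names
ConsoleDemux and frame are assumptions made up for illustration, and the
per-node TCP sockets are replaced by in-memory buffers.

```python
import struct
from collections import defaultdict

# Hypothetical frame header: node id (1 byte) + payload length (2 bytes, big-endian).
# The real scan-daemon protocol is different; this only illustrates the idea.
HEADER = struct.Struct("!BH")


def frame(node: int, payload: bytes) -> bytes:
    """Wrap one chunk of console output from `node` in the hypothetical frame."""
    return HEADER.pack(node, len(payload)) + payload


class ConsoleDemux:
    """Split one concentrated byte stream into per-node console buffers."""

    def __init__(self) -> None:
        self._pending = b""                      # bytes not yet forming a full frame
        self.consoles = defaultdict(bytearray)   # node id -> console output so far

    def feed(self, data: bytes) -> None:
        """Accept an arbitrary chunk from the socket; frames may be split across chunks."""
        self._pending += data
        while len(self._pending) >= HEADER.size:
            node, length = HEADER.unpack_from(self._pending)
            end = HEADER.size + length
            if len(self._pending) < end:
                break                            # wait for the rest of this frame
            self.consoles[node] += self._pending[HEADER.size:end]
            self._pending = self._pending[end:]
```

In a real intermediary each entry of consoles would be a listening TCP
socket that conserver connects to, rather than a buffer.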

In your situation, the intermediary could export IPMI sockets, which it
would multiplex onto its own connection to the real IPMI access on the
node.
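
A rough Python sketch of that idea, under stated assumptions: one process
owns the single real console session for a node (for example the output of
an ipmitool SoL session) and hands every line to any number of readers, so
a logger and an interactive user do not compete for the one session. The
class name ConsoleFanout and its methods are hypothetical; nothing here
speaks IPMI. It also keeps a bounded scrollback, so a late reader still
sees the last messages before a crash.

```python
from collections import deque
from typing import Callable, List


class ConsoleFanout:
    """Fan one console byte stream out to many subscribers, keeping scrollback."""

    def __init__(self, scrollback_lines: int = 1000) -> None:
        self._scrollback = deque(maxlen=scrollback_lines)  # most recent complete lines
        self._subscribers: List[Callable[[bytes], None]] = []
        self._partial = b""                                # trailing text without newline

    def subscribe(self, sink: Callable[[bytes], None]) -> None:
        """Replay the scrollback to a new reader, then deliver live lines to it."""
        for line in self._scrollback:
            sink(line)
        self._subscribers.append(sink)

    def publish(self, data: bytes) -> None:
        """Call with each chunk read from the single real console session."""
        self._partial += data
        *lines, self._partial = self._partial.split(b"\n")
        for line in lines:
            line += b"\n"
            self._scrollback.append(line)
            for sink in self._subscribers:
                sink(line)
```

A sink can be anything callable: a log file's write method on the log
host, or a function that forwards the line to a client socket.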

We used libevent to write scconserver, which makes all the book-keeping
for a zillion connections fairly straightforward.  If you head this way,
you might get some benefit from
http://downloads.sicortex.com/distfiles/sicortex-scconserver-5.0.0.9.50831.tbz2
All open source.
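
For readers who have not used an event library: the pattern is one loop
that waits on all sockets at once and runs a callback for whichever ones
are readable. The sketch below shows that pattern with Python's standard
selectors module as a stand-in; it is not libevent and not taken from
scconserver, and the helper names watch and pump_once are made up for
illustration.

```python
import selectors
import socket


def watch(sel: selectors.BaseSelector, sock: socket.socket, name: str, log: list) -> None:
    """Register `sock`; when it is readable, append (name, data) to `log`."""

    def on_readable() -> None:
        data = sock.recv(4096)
        if data:
            log.append((name, data))
        else:
            # Peer closed the connection: stop watching and release the socket.
            sel.unregister(sock)
            sock.close()

    sock.setblocking(False)
    sel.register(sock, selectors.EVENT_READ, on_readable)


def pump_once(sel: selectors.BaseSelector, timeout: float = 1.0) -> int:
    """Run one iteration of the event loop; return how many sockets were ready."""
    ready = sel.select(timeout)
    for key, _mask in ready:
        key.data()  # the callback stored at registration time
    return len(ready)
```

The per-connection state lives in the callback's closure, which is what
keeps the book-keeping small even with many connections; a real server
would call pump_once in a loop.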

Regarding dmesg vs console, which messages reach the console depends on
the node's logging settings, which I don't know much about.

-- 
-Larry / Sector IX