[Beowulf] Re: RRDtools graphs of temp from IPMI
d.love at liverpool.ac.uk
Tue Nov 11 06:41:06 PST 2008
Chris Samuel <csamuel at vpac.org> writes:
> The reason it worries about high load is that we
> used to see processes hang trying to read from the
> IPMI device, but haven't seen that with more recent
How recent? We've seen similar trouble on Supermicros with a SuSE 10.3
(126.96.36.199) kernel, hence doing it out-of-band, as I just posted.
(Sorry I basically duplicated the in-band one of yours.) The trouble
involves the kipmi0 kernel thread going CPU-bound, and sometimes a huge
load average from failed ipmitool instances hanging around.
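For what it's worth, here is a minimal sketch of one way to stop stuck ipmitool instances from piling up when polling out-of-band: run each query with a hard timeout and skip the sample if it does not return. This is an illustration under assumptions, not the script I posted. The helper names (run_with_timeout, oob_temperature_cmd), the 10 second default and the choice of the "lan" interface are all hypothetical; the ipmitool options used (-I, -H, -U, and -E to take the password from the IPMI_PASSWORD environment variable) are standard ones.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=10.0):
    """Run cmd and return its stdout, or None if it fails or times out.

    subprocess.run() kills the child when the timeout expires, so a
    stuck out-of-band query does not linger and add to the load average.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The child has already been killed by subprocess.run().
        return None
    except OSError:
        # For example, the binary is not installed on this host.
        return None
    if result.returncode != 0:
        return None
    return result.stdout


def oob_temperature_cmd(bmc_host, user, interface="lan"):
    """Build an out-of-band ipmitool command line (hypothetical helper).

    -E makes ipmitool read the password from IPMI_PASSWORD, which keeps
    it out of the process list.
    """
    return [
        "ipmitool", "-I", interface, "-H", bmc_host, "-U", user, "-E",
        "sdr", "type", "Temperature",
    ]
```

A poller would call run_with_timeout(oob_temperature_cmd("node01-bmc", "monitor")) and simply not update the RRD for that interval when it gets None back (the host and user names here are made up). Note the timeout only helps where the process can actually be killed: an in-band reader stuck in uninterruptible sleep on the IPMI device will not go away on a kill, which is another argument for polling over the LAN.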
By the way, the IPMI temperature sensors don't work on our
H8DCE-HTE/AOC-IPMI20-E Supermicros, although lm_sensors does work. Does
anyone know a fix for that?