[Beowulf] Re: recommendation on crash cart for a cluster room:fullcluster KVM is not an option I suppose?
Rahul Nabar rpnabar at gmail.com
Fri Oct 9 10:17:59 PDT 2009
- Previous message: [Beowulf] Re: recommendation on crash cart for a cluster room:fullcluster KVM is not an option I suppose?
- Next message: [Beowulf] Re: recommendation on crash cart for a cluster room:fullcluster KVM is not an option I suppose?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Thu, Oct 8, 2009 at 5:55 PM, Greg Lindahl <lindahl at pbm.com> wrote:
>
> 1) Console logging. Your machine just crashed. No clue in
> /var/log/messages. "I wonder if it printed something on the console?"
> Answer: ipmi and conman (available in an rpm in Red Hat distros).

I was "planning" on using kdump and a crash kernel for that. Note the
emphasis on "planning": I never got that working correctly. I got started
on kdump+kexec when I hit exactly the same "node crashes for unknown
reasons and I have no output" problem. Maybe IPMI gives you the same
functionality.

Interesting point for me though: what are the pros and cons of IPMI
console logging versus kdump in such crash scenarios? Are they
competitors? Is one better / easier than the other?

> 2) Monitoring. Temp, fan speeds, power supply state, events. Answers
> the "why is the little red light on the front of the case lit?"
> question. You can get some of this via other software (lm_sensors),
> but I find ipmitool to suck less, and ipmitool accurately answers the
> red light question -- lm_sensors can only guess.

I see. Yes, you read me correctly: I was putting full faith in lm_sensors
to do this. Currently I have lm_sensors feeding temperatures to my nagios
monitoring setup, and it has been working fine. But I didn't grasp the
practical point about lm_sensors sucking more than IPMI.

That's interesting again: aren't they taking data from the same bus or
counters? Or is this because the sensor details tend to be proprietary,
so lm_sensors lags behind the vendor implementations of IPMI? Because if
open-source IPMI is also trying to log sensor stats, it's in competition
with open-source lm_sensors (not to say it is bad or unheard of for
multiple open-source projects to get the same thing done!)

--
Rahul
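For the nagios setup mentioned above, here is a rough sketch of how the pipe-separated text printed by `ipmitool sensor` might be turned into a Nagios-style state. The sample text, the function names, and the status-to-state mapping are all illustrative assumptions, not anything taken from this thread; real output and status abbreviations vary by BMC, vendor, and ipmitool version.

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Illustrative sample only (an assumption): real `ipmitool sensor` output
# differs between BMCs and vendors.
SAMPLE = """\
CPU Temp   | 45.000  | degrees C | ok     | na      | na      | na       | 85.000 | 90.000 | 95.000
FAN 1      | 400.000 | RPM       | cr     | 300.000 | 500.000 | 1500.000 | na     | na     | na
PS Status  | 0x1     | discrete  | 0x0100 | na      | na      | na       | na     | na     | na
"""


def parse_sensors(text):
    """Return (name, reading, unit, status) for each numeric sensor row."""
    rows = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 4:
            continue  # not a sensor row
        name, value, unit, status = fields[:4]
        try:
            reading = float(value)
        except ValueError:
            continue  # skip discrete sensors and "na" readings
        rows.append((name, reading, unit, status))
    return rows


def status_to_state(status):
    """Map a threshold status string to a Nagios state (assumed mapping)."""
    status = status.lower()
    if status == "ok":
        return OK
    if status.endswith("nc"):               # non-critical threshold crossed
        return WARNING
    if status.endswith(("cr", "nr")):       # critical or non-recoverable
        return CRITICAL
    return UNKNOWN


def nagios_state(rows):
    """Return the worst state across all rows; UNKNOWN if there are none."""
    if not rows:
        return UNKNOWN
    states = [status_to_state(status) for _, _, _, status in rows]
    for state in (CRITICAL, WARNING, UNKNOWN):
        if state in states:
            return state
    return OK


if __name__ == "__main__":
    rows = parse_sensors(SAMPLE)
    for name, reading, unit, status in rows:
        print(f"{name}: {reading} {unit} [{status}]")
    raise SystemExit(nagios_state(rows))
```

In a real check the text would come from running `ipmitool sensor` on the node or against its BMC instead of from the hard-coded sample. Keeping the parsing separate from the command invocation makes it easy to test against captured output.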
