[Beowulf] recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?
landman at scalableinformatics.com
Sat Oct 3 10:13:58 PDT 2009
Rahul Nabar wrote:
> On Sat, Oct 3, 2009 at 11:54 AM, Joe Landman
> <landman at scalableinformatics.com> wrote:
>> If I were building a cluster of anything more than 4 machines (not racks,
>> machines), I would be insisting upon IPMI 2.0 with a working SOL and kvm
>> over IP capability built in.
> Thanks for those tips Joe. I am already convinced by all the posts on
> the list that IPMI is a must. No other way. All you guys seem pretty
> unanimous about that much!
>> For the 250-300 machine system you are looking at, you *want* IPMI 2.0 with
>> KVM over IP. You *want* switched remotely accessible PDUs, for those times
>> when IPMI itself gets wedged (rarer these days, but it does still happen).
>> IMO you *want* this IPMI on a separate network. You *want* a serial
>> concentrator type system to provide a redundant path in the event of an IPMI
>> failure. Problems don't go away just because IPMI stopped working. You
>> *need* an inexpensive crash cart that just works, and plugs into your PDUs.
> I see, thanks for disabusing me of my notion of "ipmi" as one
> monolithic all-or-none creature. From what you write (and my online
> reading) it seems there are several discrete parts:
> IPMI 2.0
> switched remotely accessible PDUs
> "serial concentrator type system"
> Correct me if I am wrong but these are all "options" and varying
> vendors and implementations will offer parts or all or none of these?
> Or is it that when one says "IPMI 2" it includes all these features. I
IPMI 2.0 includes
* local power control (on-off switch in software)
* system sensor inspection
It *may* contain kvm over IP (the clusters we build do).
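As a rough illustration of the two baseline features above (plus SOL), here is a minimal sketch that builds ipmitool command lines for a remote BMC. It assumes ipmitool is installed and the BMC is reachable over the lanplus (IPMI 2.0) interface. The host name, user name, and the action labels are made up for the example, not anything specific to a vendor.

```python
def ipmi_command(host, user, action):
    """Build an ipmitool argv list for a remote BMC (IPMI 2.0 / lanplus)."""
    # Action labels on the left are hypothetical shorthand for this sketch;
    # the subcommands on the right are standard ipmitool subcommands.
    actions = {
        "power-status": ["chassis", "power", "status"],
        "power-cycle": ["chassis", "power", "cycle"],
        "sensors": ["sdr", "list"],
        "sol": ["sol", "activate"],
    }
    if action not in actions:
        raise ValueError("unknown action: %s" % action)
    # -E makes ipmitool read the password from the IPMI_PASSWORD
    # environment variable instead of taking it on the command line.
    base = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E"]
    return base + actions[action]


if __name__ == "__main__":
    # "node042-bmc" and "admin" are placeholder values.
    print(" ".join(ipmi_command("node042-bmc", "admin", "power-status")))
```

The resulting list can be handed to subprocess.run() as is. Whether kvm over IP is available is a separate question that depends on the BMC, and ipmitool does not cover it.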
> did read online but these implementations seem vendor specific so it's
> hard to translate jargon across vendors. E.g., for Dell they are called
> DRACs, etc.
IPMI 2.0 at minimum is a must. DRAC has levels which also provide kvm
over IP, though at additional cost.
> Finally, what's a "serial concentrator"? Isn't that the same as the
> SOL that Skylar was explaining to me? Or is that something different?
Something different. A serial concentrator is a machine you can ssh
into that provides N serial ports. It is different from the IPMI SOL
capability. It is a second, non-IPMI management channel. For large
systems, I'd recommend multiple administrative paths ...
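To make the "multiple administrative paths" idea concrete, here is a minimal sketch of trying each path in order of preference and falling back to the next one when a path is wedged. The path names and the probe functions are hypothetical placeholders. A real probe might run an IPMI status query, ssh to the concentrator, or query the PDU.

```python
def first_working_path(paths):
    """Return the name of the first path whose probe succeeds, else None.

    paths is an ordered list of (name, probe) pairs, where probe is a
    zero-argument callable returning True when that path is usable.
    """
    for name, probe in paths:
        try:
            if probe():
                return name
        except OSError:
            # Treat a network or I/O failure as "this path is down"
            # and move on to the next one.
            continue
    return None


if __name__ == "__main__":
    # Stub probes for illustration only: pretend IPMI is wedged.
    paths = [
        ("ipmi", lambda: False),
        ("serial-concentrator", lambda: True),
        ("switched-pdu", lambda: True),
    ]
    print(first_working_path(paths))
```

The ordering is the design choice here: the least disruptive path comes first, and the switched PDU (a hard power cut) comes last.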
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web : http://scalableinformatics.com
phone: +1 734 786 8423 x121
fax : +1 866 888 3112
cell : +1 734 612 4615