[Beowulf] recommendation on crash cart for a cluster room: full cluster KVM is not an option I suppose?
Mark Hahn hahn at mcmaster.ca
Sat Oct 3 14:01:42 PDT 2009
> monolithic all-or-none creature. From what you write (and my online
> reading) it seems there are several discrete parts:
>
> IMPI 2.0
> switched remotely accessible PDUs
> "serial concentrator type system "

I think Joe was going a bit belt-and-suspenders-and-suspenders here.

ipmi normally provides out-of-band access to the system's I2C bus
(which lets one power on/off, reset, and read the sensors.) it also
normally provides some form of console access: usually this is by
serial redirection (serial output can be redirected through the BMC
and onto the net). independent of this (but usually also provided)
is a bios feature which scrapes the video character array onto serial,
thus giving access to bios output (and also technically independent
but also provided is lan->bmc->serial->bios "keyboard" input.)

some people also configure systems with network-aware PDUs (power bars):
APC is a common provider of these, and they provide a backup if IPMI
doesn't work for some reason (network problems, hung BMC, etc).
I do not personally think they are worthwhile because I rarely see
IPMI problems - admittedly perhaps due to the fairly narrow range of
parts my organization has.

smart PDUs sometimes also provide power monitoring, which might be useful,
though I would actually prefer to see IPMI merely provide current sensors
via I2C (in addition to volts). (having both socket power and motherboard
power might be amusing, though, since you could calculate your PSU's
efficiency - potentially even its load-efficiency curve. most vendors
now quote 92-93% efficiency, but it's unclear what load range that's for...)

finally, I think Joe is advocating another layer of backups - serial
concentrators that would connect to the console serial port on each node
to collect output if IPMI SOL isn't working. this is perhaps a matter
of taste, but I don't find this terribly useful. I thought it would be
for my first cluster, but never actually set it up. but again, that's
because IPMI works well in my experience.

I think Joe's right in the sense that you _don't_ want a cluster without
working power control, and working post/console redirection is pretty
valuable as well. both become more critical with larger cluster sizes,
mainly because the chances grow of hitting a problem where you need
power/reset/console control. whether you need backup systems past IPMI
is unclear - depends on whether your IPMI works well.

> Correct me if I am wrong but these are all "options" and varying
> vendors and implementations will offer parts or all or none of these?
> Or is it that when one says "IPMI 2" it includes all these features. I

I interpreted Joe as saying that you need IPMI2 (remote power/reset/console)
as well as backup mechanisms for IPMI failures.

> hard to translate jargon across vendors. e.g. for Dell they are called
> DRAC's etc.

vendors provide IPMI features, usually with added proprietary nonsense.
sometimes they sacrifice parts of IPMI in favor of the proprietary crap...

> Finally, what's a "serial concentrator"? Isn't that the same as the
> SOL that Skylar was explaining to me? Or is that something different
> too?

a network-accessible box into which many serial ports plug. some let you
transform a serial port into a syslog stream, for instance.
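
to make the three IPMI functions above concrete (power control, sensor
readout, serial-over-LAN console), here is a minimal Python sketch that
only builds ipmitool command lines and prints them; it does not contact
any BMC. it assumes the ipmitool CLI with the lanplus interface. the host
name node01-bmc, the user admin and the password file path are hypothetical
placeholders, not anything from this thread.

```python
import shlex
from typing import List

# chassis power subcommands that ipmitool accepts
POWER_ACTIONS = {"status", "on", "off", "cycle", "reset"}


def ipmi_base(host: str, user: str = "admin",
              password_file: str = "/etc/ipmi.pass") -> List[str]:
    """Common prefix: IPMI 2.0 over LAN (lanplus) to the node's BMC.

    user and password_file are hypothetical defaults for illustration.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-f", password_file]


def power_cmd(host: str, action: str) -> List[str]:
    """Out-of-band power control (status/on/off/cycle/reset)."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return ipmi_base(host) + ["chassis", "power", action]


def sensor_cmd(host: str) -> List[str]:
    """Read the sensors the BMC exposes (temperatures, volts, fans)."""
    return ipmi_base(host) + ["sensor"]


def console_cmd(host: str) -> List[str]:
    """Attach to the serial-over-LAN (SOL) console."""
    return ipmi_base(host) + ["sol", "activate"]


if __name__ == "__main__":
    host = "node01-bmc"  # hypothetical BMC host name
    for cmd in (power_cmd(host, "status"),
                sensor_cmd(host),
                console_cmd(host)):
        print(shlex.join(cmd))
```

a smart PDU or a serial concentrator would only come into play if one of
these commands fails against a node, which is the backup role discussed
above.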
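
the PSU efficiency idea mentioned above (socket power from a smart PDU
versus motherboard-side power from volts and amps) can be sketched as
follows. this is only an illustration under stated assumptions: the
function names are made up, the per-rail current readings are exactly the
sensors wished for above and may not exist on a given board, and the rated
wattage and any sample numbers you feed in are your own.

```python
from typing import Iterable, List, Tuple


def dc_power(rails: Iterable[Tuple[float, float]]) -> float:
    """Motherboard-side power: sum of volts * amps over the supply rails."""
    return sum(volts * amps for volts, amps in rails)


def psu_efficiency(dc_watts: float, ac_watts: float) -> float:
    """Efficiency = DC power delivered / AC power drawn at the socket."""
    if ac_watts <= 0:
        raise ValueError("AC power must be positive")
    return dc_watts / ac_watts


def efficiency_curve(samples: Iterable[Tuple[float, float]],
                     rated_watts: float) -> List[Tuple[float, float]]:
    """Turn (dc_watts, ac_watts) samples into (load fraction, efficiency)
    points, sorted by load, where load fraction = dc_watts / rated_watts."""
    if rated_watts <= 0:
        raise ValueError("rated power must be positive")
    points = [(dc / rated_watts, psu_efficiency(dc, ac))
              for dc, ac in samples]
    return sorted(points)
```

collecting such samples at idle and under load would show what load range
a vendor's quoted 92-93% figure actually applies to.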
