[Beowulf] How do people keep track of computers in your cluster(s)?
kilian at stanford.edu
Sun Oct 21 12:14:30 PDT 2007
On Sunday 21 October 2007 07:29:49 Carsten Aulbert wrote:
> However, I would like to have something where we have something like a
> large table about the hardware in question. In there information like
> * vendor
> * serial number
> * MAC addresses (eth0, eth1,..., IPMI, RAID,...)
> * maybe even firmware versions and serial numbers of exchangeable
> internal hardware (hard disks)
> * basically all physical information of the box
We're using Dell OpenManage for this purpose. It's obviously limited to
Dell hardware, but it gathers all of this information and makes it
available from a central place if you use their IT Assistant software.
It lets you batch-upgrade firmware and BIOSes, collect SNMP traps and
send email alerts, run various reports, and monitor basic performance
metrics too.
> another table should hold the current setup, i.e. a mapping between the
> hardware and the "logical" setup, e.g.
> Hardware box number #1234 from above table has in the current setup the
> * hostname
> * IP addresses
> * running services
Hostnames and IP addresses are available through OMSA (OpenManage Server
Administrator) too, but running services are not.
> And finally, another table where special problems, like memory errors
> and the like can be entered.
SNMP and BMC logs are reported to IT Assistant, so you get instant
notification of hardware errors.
I'm not sure if that fits your environment, but if you own Dell hardware,
OpenManage is definitely worth a look.
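
If you are not on Dell hardware, the three tables you describe could also
be kept by hand. Below is a minimal sketch, assuming SQLite through
Python's standard sqlite3 module. Every table, column, and function name
in it is made up for illustration and has nothing to do with how
OpenManage stores its data; the sample values are placeholders too.

```python
import sqlite3

# Hypothetical schema: names are illustrative only, not taken from
# OpenManage or any other product.
SCHEMA = """
CREATE TABLE hardware (
    box_id     INTEGER PRIMARY KEY,
    vendor     TEXT NOT NULL,
    serial     TEXT NOT NULL UNIQUE
);
CREATE TABLE mac_address (
    box_id     INTEGER NOT NULL REFERENCES hardware(box_id),
    interface  TEXT NOT NULL,          -- e.g. eth0, eth1, IPMI
    mac        TEXT NOT NULL UNIQUE
);
CREATE TABLE logical_setup (
    box_id     INTEGER NOT NULL REFERENCES hardware(box_id),
    hostname   TEXT NOT NULL,
    ip_address TEXT NOT NULL,
    services   TEXT                    -- comma-separated, for brevity
);
CREATE TABLE problem_log (
    box_id     INTEGER NOT NULL REFERENCES hardware(box_id),
    noted_on   TEXT NOT NULL,          -- ISO 8601 date
    note       TEXT NOT NULL
);
"""


def open_inventory(path=":memory:"):
    """Open (or create) the inventory database and create the tables."""
    conn = sqlite3.connect(path)
    # SQLite only enforces REFERENCES when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def problems_for_host(conn, hostname):
    """Return (serial, noted_on, note) rows for the box behind a hostname."""
    return conn.execute(
        """SELECT h.serial, p.noted_on, p.note
           FROM logical_setup AS s
           JOIN hardware AS h ON h.box_id = s.box_id
           JOIN problem_log AS p ON p.box_id = s.box_id
           WHERE s.hostname = ?
           ORDER BY p.noted_on""",
        (hostname,),
    ).fetchall()


if __name__ == "__main__":
    conn = open_inventory()
    # Placeholder data for box #1234.
    conn.execute("INSERT INTO hardware VALUES (1234, 'ExampleVendor', 'SN-0001')")
    conn.execute("INSERT INTO mac_address VALUES (1234, 'eth0', '00:11:22:33:44:55')")
    conn.execute("INSERT INTO logical_setup VALUES (1234, 'node01', '10.0.0.1', 'sshd')")
    conn.execute("INSERT INTO problem_log VALUES (1234, '2007-10-21', 'memory error')")
    print(problems_for_host(conn, "node01"))
```

Keeping the physical box (hardware, mac_address) separate from its
current role (logical_setup) means a node can be renamed or re-addressed
without losing its problem history, since that history hangs off the box
number rather than the hostname.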