[Beowulf] 512 nodes Myrinet cluster Challenges

Bill Broadley bill at cse.ucdavis.edu
Tue May 2 14:02:10 PDT 2006

Mark Hahn said:
> moving it, stripped them out as I didn't need them.  (I _do_ always require
> net-IPMI on anything newly purchased.)  I've added more nodes to the cluster

Net-IPMI on all hardware?  Why?  Running a second (or third) network adds
a nontrivial amount of complexity, cabling, and cost.  What do you figure
you pay extra for the nodes (many vendors charge to add IPMI: Sun, Tyan,
Supermicro, etc.), cables, switches, and so on?  As a data point, the IPMI
card for an x2100 I bought recently was $150.

Collecting fan speeds and temperatures in-band seems reasonable; after
all, much of the data you want to collect isn't available via IPMI
anyway (CPU utilization, memory, disk I/O, etc.).
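As a rough sketch of what in-band collection can look like, assuming a
Linux node whose kernel exposes sensors through the hwmon sysfs
interface (temp*_input in millidegrees Celsius, fan*_input in RPM).  The
function name read_hwmon and the "chip/sensor" key format are made up
for illustration, and which sensors show up depends on the board and
the loaded drivers:

```python
import glob
import os


def read_hwmon(root="/sys/class/hwmon"):
    """Return {"hwmonN/sensor": value} for temperatures (deg C) and fans (RPM).

    Assumes the standard Linux hwmon sysfs layout; `root` is a parameter
    only so the function can be pointed at a different directory.
    """
    readings = {}
    for chip in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        chip_name = os.path.basename(chip)
        paths = (glob.glob(os.path.join(chip, "temp*_input"))
                 + glob.glob(os.path.join(chip, "fan*_input")))
        for path in sorted(paths):
            try:
                with open(path) as f:
                    raw = int(f.read().strip())
            except (OSError, ValueError):
                continue  # sensor present but not readable; skip it
            label = os.path.basename(path)[:-len("_input")]
            key = chip_name + "/" + label
            # Temperatures are reported in millidegrees C, fans in RPM.
            readings[key] = raw / 1000.0 if label.startswith("temp") else raw
    return readings


if __name__ == "__main__":
    for name, value in sorted(read_hwmon().items()):
        print(name, value)
```

The same cron job or daemon that ships these readings can also ship CPU,
memory, and disk I/O numbers, which is the point of doing it in-band.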

Upgrading a 208V 3-phase PDU to a switched PDU seems to cost on the
order of $30 per node at list price.  As a side benefit, you get
easy-to-query load per phase.  The management network ends up being just
one network cable per PDU (usually 2-3 per rack).
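To put the two per-node figures side by side at the scale in the subject
line, here is a back-of-the-envelope sketch.  It assumes the $150 x2100
IPMI card price and the roughly $30 per node switched-PDU premium apply
uniformly across 512 nodes, which they may not, and it leaves out the
cables and switches either option needs:

```python
NODES = 512          # cluster size from the thread subject
IPMI_PER_NODE = 150  # USD, single x2100 data point (assumed uniform)
PDU_PER_NODE = 30    # USD, approximate list-price premium for switched PDUs


def total_cost(per_node, nodes=NODES):
    """Per-node add-on cost scaled to the whole cluster, in USD."""
    return per_node * nodes


ipmi_total = total_cost(IPMI_PER_NODE)
pdu_total = total_cost(PDU_PER_NODE)

print("IPMI cards:    $%d" % ipmi_total)               # $76800
print("Switched PDUs: $%d" % pdu_total)                # $15360
print("Difference:    $%d" % (ipmi_total - pdu_total)) # $61440
```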

After dealing with a few clusters where the PDUs sat in the airflow,
blocking both it and physical access to parts of the nodes, I now
specify the zero-U variety that sits outside the airflow.

Bill Broadley
Computational Science and Engineering
UC Davis
