[Beowulf] 512 nodes Myrinet cluster Challenges
mwill at penguincomputing.com
Tue May 2 14:11:26 PDT 2006
IPMI nowadays comes for free on the mainboard, and if you don't want to
run a separate infrastructure for the lightweight control traffic, you
don't even need to add ports, cables, or switches. In the case of a
Scyld Beowulf cluster the compute nodes are on their own private network
switch anyway, so it is most convenient to reboot them through that
primary cluster interface.
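A reboot over that primary interface can be driven with IPMI-over-LAN, for
example via ipmitool. A minimal sketch; the node names (n01..n03) and the
BMC credentials are placeholder assumptions, not details from the post:

```shell
# Power-cycle each compute node through its BMC on the private
# cluster network (assumed hostnames and credentials).
for node in n01 n02 n03; do
    ipmitool -I lanplus -H "$node" -U admin -P changeme chassis power cycle
done
```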
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
On Behalf Of Bill Broadley
Sent: Tuesday, May 02, 2006 2:02 PM
To: Mark Hahn
Cc: beowulf at beowulf.org
Subject: Re: [Beowulf] 512 nodes Myrinet cluster Challenges
Mark Hahn said:
> moving it, stripped them out as I didn't need them. (I _do_ always
> require net-IPMI on anything newly purchased.) I've added more nodes
> to the cluster
Net-IPMI on all hardware? Why? Running a second (or third) network isn't
a trivial amount of additional complexity, cabling, or cost. What do you
figure you pay extra on the nodes (many vendors charge to add IPMI: Sun,
Tyan, Supermicro, etc.), cables, switches, etc.? As a data point, on an
X2100 I bought recently the IPMI card was $150.
Collecting fan speeds and temperatures in-band seems reasonable; after
all, much of the data you want to collect isn't available via IPMI
anyway (CPU utilization, memory, disk I/O, etc.).
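Those in-band metrics can be pulled straight from the Linux procfs with no
IPMI involvement at all. A minimal sketch, assuming a standard Linux
/proc layout; the function names are illustrative only:

```python
# Gather in-band node metrics (CPU load, memory) from /proc,
# as an alternative to out-of-band IPMI sensor queries.

def read_loadavg(text):
    """Parse /proc/loadavg text: 1-, 5-, and 15-minute load averages."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)

def read_meminfo(text):
    """Parse /proc/meminfo text into a dict of integer kB values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        info[key] = int(rest.split()[0])  # first field is the kB value
    return info

if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        print("load averages:", read_loadavg(f.read()))
    with open("/proc/meminfo") as f:
        mem = read_meminfo(f.read())
        print("MemFree (MB):", mem["MemFree"] // 1024)
```

A monitoring daemon (or Ganglia, which was common on clusters of this era)
does essentially this on each node and ships the results to a collector.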
Upgrading a 208V 3-phase PDU to a switched PDU seems to cost on the
order of $30 per node, list price. As a side benefit you get
easy-to-query load per phase. The management network ends up being just
one network cable per PDU (usually 2-3 per rack).
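Querying that per-phase load is typically an SNMP walk against the PDU. A
hedged sketch: the OID below is from APC's PowerNet MIB
(rPDULoadStatusLoad, reported in tenths of amps per phase), and both the
OID and the community string are assumptions; check your vendor's MIB.

```shell
# Read the load on each phase of a switched PDU over SNMP
# (assumed hostname "pdu-rack1" and community "public").
snmpwalk -v1 -c public pdu-rack1 .1.3.6.1.4.1.318.1.1.12.2.3.1.1.2
```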
After dealing with a few clusters whose PDUs sat in the airflow,
blocking both the airflow and physical access to parts of the node, I
now specify the zero-U variety that mounts outside the airflow.
Computational Science and Engineering