[Beowulf] 512 nodes Myrinet cluster Challenges

Mark Hahn hahn at physics.mcmaster.ca
Sun Apr 30 09:42:57 PDT 2006


> > By the way, the idea of rolling-your-own hardware on a large cluster, and
> > planning on having a small technical team, makes me shiver in horror.  If
> > you go that route, you better have *lots* of experience in clusters, and
> > make very good decisions about cluster components and management methods.
> > If you don't, your users will suffer mightily, which means you will suffer
> > mightily too.

I believe that overstates the case significantly.

some clusters are just plain easy.  it's entirely possible to buy a
significant number of conservative compute nodes, toss them onto a generic
switch or two, and run the whole thing for a couple years without any real
effort.  I did it, and while I have a lot of experience, I didn't apply any
deep voodoo for the cluster I'm thinking of.  it started out with a good 
solid login/file/boot server (4U, 6x scsi, dual-xeon 2.4, 1G ram), a single
48pt 100bt (1G up) switch, and 48 dual-xeon nodes (diskful but not
disk-booting).  it was a delight to install, maintain and manage.
I originally built it with APC controllable PDUs, but in the process of 
moving it, stripped them out as I didn't need them.  (I _do_ always require
net-IPMI on anything newly purchased.)  I've added more nodes to the cluster
since then - dual-opteron nodes and a couple GE switches.

> For clusters with more than perhaps 16 nodes, or EVEN 32 if you're
> feeling masochistic and inclined to heartache:

with all respect to rgb, I don't think size is a primary factor in cluster 
building/maintaining/etc effort.  certainly it does eventually become a
concern, but that's primarily a statistical result of MTBF/nnodes.  it's
quite possible to choose hardware to maximize MTBF and minimize configuration risk.
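
The MTBF/nnodes point can be put in numbers with a rough sketch. It assumes independent nodes with a constant failure rate, so failure rates simply add; the 50,000-hour per-node MTBF and the function name are hypothetical placeholders, not figures from this thread.

```python
def cluster_mtbf_hours(node_mtbf_hours: float, n_nodes: int) -> float:
    """Expected time between node failures anywhere in the cluster.

    Assumes independent nodes with a constant failure rate, so the
    failure rates add and the cluster-wide MTBF is node MTBF / n_nodes.
    """
    if n_nodes <= 0:
        raise ValueError("n_nodes must be positive")
    return node_mtbf_hours / n_nodes


# Hypothetical per-node MTBF; an assumption for illustration only.
NODE_MTBF_HOURS = 50_000.0

for n in (32, 48, 768):
    hours = cluster_mtbf_hours(NODE_MTBF_HOURS, n)
    print(f"{n:4d} nodes: a node failure roughly every {hours / 24:.1f} days")
```

Under that assumed figure, a 48-node cluster sees a failure about every 43 days, while a 768-node cluster sees one in under 3 days. Doubling the per-node MTBF through hardware choice doubles both intervals, which is why component selection matters more as the node count grows.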

in the cluster above, I chose a chassis (AIC) which has a large centrifugal
blower, rather than a bunch of 40mm axial/muffin fans.  a much larger cluster
I'm working on now (768 nodes) has 14 40mm muffin fans in each node!  while
I know I can rely on the vendor (HP) to replace failures promptly and without
complaint, there's an interesting side-effect: power dissipation.  12 of the
fans, pointing at the CPUs, are actually paired inline, and each pair is rated
to dissipate up to 20W.  so a node that idles at 210W and draws 265W under full
load can easily consume 340W if the fans are ramped up.  ouch!

this is probably the most significant size-dependent factor for me.  if
you're doing your own 32-node cluster, it's pretty easy to manage the
cooling.  the difference between dissipating 300 and 400W per node is less than
a ton of chiller capacity.  scraping up 10-20 additional tons of capacity
is quite a different proposition.
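
The cooling arithmetic above can be checked with a short sketch. It assumes the standard conversion of 1 ton of refrigeration = 12,000 BTU/h (about 3.517 kW) and treats all extra node power as heat the chiller must remove; the function name is made up for illustration.

```python
# 1 ton of refrigeration = 12,000 BTU/h, roughly 3516.85 W of heat removal.
WATTS_PER_TON = 3516.85


def extra_cooling_tons(n_nodes: int, extra_watts_per_node: float) -> float:
    """Additional chiller capacity (tons) for a per-node power increase.

    Assumes every extra watt drawn by a node ends up as heat in the room.
    """
    return n_nodes * extra_watts_per_node / WATTS_PER_TON


# 32 nodes going from 300W to 400W each
print(f"32 nodes, +100W each:  {extra_cooling_tons(32, 100):.2f} tons")
# 768 nodes going from 265W (full load) to 340W (fans ramped up)
print(f"768 nodes, +75W each:  {extra_cooling_tons(768, 75):.2f} tons")
# 768 nodes with the same 100W swing as the small cluster
print(f"768 nodes, +100W each: {extra_cooling_tons(768, 100):.2f} tons")
```

With these inputs the 32-node case stays under one ton, while the 768-node cases land at roughly 16 and 22 tons, in line with the "10-20 additional tons" estimate.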

regards, mark hahn.




