[Beowulf] RE: Capitalization Rates - How often should you replace a cluster? (resent - 1st sending wasn't posted ).

Mark Hahn hahn at mcmaster.ca
Wed Jan 21 13:24:13 PST 2009

>> I guess our premises are quite different.
> Yes, they are. Hence your statement about "obviously burning money"
> was a bit of an over-generalization ;)

well, it was in answer to the OP, who seemed to be asking about HPC
clusters, not (apparently) the search engine web services you do
(very large BW costs, and apparent need to be located in expensive surrounds,
and yet with fairly cheap nodes.)

one useful thing came out of this thread, which is that in a lot of 
places, power is dramatically more expensive than in others (ie, >4x).

so whether fast replacement is money-burning actually depends both on
how high energy costs are and on whether new hardware is significantly more 
power-efficient.  I guess I'm still a little surprised at your situation - 
how has your node-power vs search-performance figure changed recently?
flops/W "obviously" has, but I would have guessed search stuff to be 
disk or memory-dominated.  (both of which have certainly improved over
the past few years, but much more gradually, no?)
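
as a rough sketch of that tradeoff: the payback time of a replacement node is its purchase price divided by the yearly power bill it saves.  everything below is an assumption for illustration (function names, wattages, node price, electricity prices, the optional cooling overhead factor); none of it is a figure from this thread.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_energy_cost(watts, price_per_kwh, overhead=1.0):
    """Yearly electricity cost of a load that draws `watts` continuously.

    `overhead` is a hypothetical multiplier for cooling/distribution losses
    (1.0 means only the node's own draw is counted).
    """
    return watts / 1000.0 * HOURS_PER_YEAR * price_per_kwh * overhead


def payback_years(old_watts, new_watts, new_node_price, price_per_kwh, overhead=1.0):
    """Years until the power savings cover the price of the new node.

    Assumes old and new nodes do the same amount of work, so the only
    benefit counted is reduced power draw. Returns inf if nothing is saved.
    """
    saved_watts = old_watts - new_watts
    if saved_watts <= 0:
        return float("inf")
    return new_node_price / annual_energy_cost(saved_watts, price_per_kwh, overhead)


if __name__ == "__main__":
    # Hypothetical numbers: a 300 W node replaced by a 200 W node costing 2000,
    # at a cheap and a 4x more expensive electricity price.
    for price in (0.05, 0.20):
        years = payback_years(300, 200, 2000, price)
        print(f"{price:.2f}/kWh -> payback in {years:.1f} years")
```

with a fixed purchase price, the payback time scales inversely with the electricity price, so a 4x price difference between sites means a 4x difference in how soon replacement pays for itself.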

> There are HPC systems that want UPSes, namely ones that do
> mission-critical things, like weather forecasts, and spooky work.
> Most university HPC systems don't fall into that category. Most

it may be of interest that we see 1-2 unplanned power outages/year
for most of our sites.  though admittedly we also have a couple of smaller
sites where it's more like 30/year.

> my cluster looks pretty much like an HPC cluster, albeit one with
> only gbit ethernet and lots of local disk.

ah!  so it appears that disks have improved their active and idle power 
figures by a factor of ~2 over the past year or so, after having remained
fairly steady before that.  is that right?  I'm only talking about what
I think of as the mass market: 3.5" sata.  or do you use 2.5" laptopy disks?
still only ~10W/disk - I would expect MTBF/AFR/warranty-based considerations
to drive your replacement cycle more...
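
to put the ~10W/disk figure in perspective, here is a minimal sketch comparing the yearly energy a disk uses with the number of failures a fleet should expect per year.  the fleet size and the AFR value are made-up assumptions, not measurements, and the helper names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def disk_energy_kwh_per_year(watts):
    """Energy used in a year by a disk drawing `watts` continuously."""
    return watts * HOURS_PER_YEAR / 1000.0


def expected_failures_per_year(n_disks, afr):
    """Expected disk failures per year for a fleet with annualized failure rate `afr`."""
    return n_disks * afr


if __name__ == "__main__":
    # ~10 W/disk is the figure from the text above; halving it saves 5 W.
    saved_kwh = disk_energy_kwh_per_year(10) - disk_energy_kwh_per_year(5)
    print(f"energy saved per disk per year: {saved_kwh:.1f} kWh")

    # Hypothetical fleet: 1000 disks with an assumed 3% AFR.
    print(f"expected failures per year: {expected_failures_per_year(1000, 0.03):.0f}")
```

multiplying the saved kWh by a local electricity price gives the per-disk saving to weigh against the cost of a replacement disk and the cost of handling failures.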
