[Beowulf] power usage, Intel 5160 vs. AMD 2216

Bruce Allen ballen at gravity.phys.uwm.edu
Fri Jul 13 01:41:40 PDT 2007


Hi Mark,

>> It's my experience that when building large clusters, issues of space, 
>> power and cooling are often harder and more time-consuming to resolve 
>> than actually getting the cluster itself purchased, commissioned, and 
>> operating. For
>
> that is somewhat perplexing, since the space/power/cooling issues aren't 
> really _that_ complicated.  I think it's one of those areas where too much
> choice leads to harder decisions.

Over the past three years, I've been closely involved in the construction 
of two large cluster rooms (400kW and 500kW).  One uses conventional 
raised-floor air cooling and the other uses water-cooled racks.  In both 
cases, they were institutional construction projects, and took 
substantially more than a year.

In the past, I have done remodelling efforts at the 40kW level. This, I 
agree, is something that can be done in a matter of a couple of months. 
But when scaled up by a factor of ten it's substantially more difficult.

> perhaps it also reflects the fact that we're still not really 
> comfortable with the state of affairs - for instance, vendors still 
> advocate blade servers, which if fully populated are basically 
> uncoolable (~24 KW/rack!).

With good design (water cooled racks) you can cool a 24kW system.

>> example I've recently taken up a new position in Hannover Germany where as 
>> part of my start-up package the MPG is building a cluster room (450 square 
>> meters floor space, 500kW cooling, 800kW UPS, with the option to double 
>> cooling/power in four years).
>
> those numbers seem strange to me - unless I've botched conversion, the 
> cluster I sit next to is about 4.7 KW/sq-m.

In my case, it turns out that the most cost-effective systems are LOW 
density ones.  So the room has been designed to accommodate these, with a 
lower kW/m^2 value.  And the physical space itself was 'free' so it's nice 
to have some elbow room.
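
As a rough check on the density comparison, here is a small sketch that 
uses only the figures quoted in this thread (450 square meters, 500kW 
cooling, the option to double, and the 4.7 kW/sq-m figure for Mark's 
room).  The variable and function names are just for illustration.

```python
# Figures quoted in this thread for the Hannover cluster room.
floor_area_m2 = 450.0                 # floor space
cooling_kw = 500.0                    # initial cooling capacity
doubled_cooling_kw = 2 * cooling_kw   # option to double in four years

# Density quoted for the other cluster room in this thread.
other_room_kw_per_m2 = 4.7


def density_kw_per_m2(power_kw, area_m2):
    """Average power density over the whole floor area."""
    return power_kw / area_m2


initial_density = density_kw_per_m2(cooling_kw, floor_area_m2)
doubled_density = density_kw_per_m2(doubled_cooling_kw, floor_area_m2)

print(f"initial: {initial_density:.2f} kW/m^2")
print(f"doubled: {doubled_density:.2f} kW/m^2")
print(f"other room is {other_room_kw_per_m2 / initial_density:.1f}x denser")
```

So the room averages a bit over 1 kW/m^2 at first, and a bit over 
2 kW/m^2 if the cooling is doubled, still well under the 4.7 figure.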

> (such a large UPS seems strange too - did they choose it based on poor 
> quality line power? we have none of our compute hardware on UPS, and 
> don't have problems,

I have had power-related problems in the past, and have found that the 
lower maintenance needs and higher reliability of UPS backed systems are 
worth it.

The room is being designed with a 20-year lifetime.  When amortized over 
that time (at least 6 or 7 clusters) the one-time cost of the UPS is 
negligible.

> since modern PDUs seem to ride out the typical 1-second glitch without 
> much trouble...)

That's interesting.  Where does the PDU store 1 second of power?

>> end of this year. So total design and construction time is 2.3 years. In

> that's a bit extreme, I think.  our room was a bare-slab reno and took a 
> bit over a year.

That's fast!

> another one of our sites was built from scratch and took about 1.5.

Fair enough.  From my experience this 1.5 years from scratch is probably 
typical.

>> construction time will be about 0.5 years.  The cost of the cluster room is 
>> about equal to the cost of the initial cluster that will go into it.  But 
>> the

> strange.  I'm pretty sure the cost ratio we see is more like 4:1 for the 
> from-scratch site (and closer to 10:1 for renos.)

When I built a 40kW system (renovation) the ratio was 10:1.  In other 
words the room renovation cost was 10% of the cluster cost.  In that case 
the construction was done by a University 'Physical Plant' as a 
remodelling effort.  No management or controls.

When I built a 400kW system (bare room) the cost ratio was about 2:1.  In 
that case the construction was done by a state agency, and there were a 
lot of management and process requirement costs. This did result in a 
higher-quality room but I think that it probably doubled the construction 
costs.

The room that I am building now would *also* be about 2:1, except for the 
fact that we are using liquid-cooled racks (we have a very low ceiling). 
This effectively doubles the cost, hence the 1:1 ratio.

Fortunately, over the 20 year lifetime that the room will be in use, I 
expect that at least six generations of cluster will go into it.  So in 
the end the ratio should be at least 6:1.
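
The arithmetic behind that estimate can be sketched as follows.  It 
assumes (my simplification, for illustration only) that each cluster 
generation costs about the same as the first one, and it works in 
normalized units where the initial cluster costs 1.

```python
# Normalized costs: the initial cluster costs 1 unit, and the room
# costs about the same (the 1:1 ratio described above).
room_cost = 1.0
cluster_cost = 1.0   # assumption: every generation costs about this much

room_lifetime_years = 20
generations = 6      # "at least six generations of cluster"


def lifetime_cost_ratio(n_generations, per_cluster_cost, one_time_room_cost):
    """Total cluster spending divided by the one-time room cost."""
    return n_generations * per_cluster_cost / one_time_room_cost


initial_ratio = lifetime_cost_ratio(1, cluster_cost, room_cost)
final_ratio = lifetime_cost_ratio(generations, cluster_cost, room_cost)
years_per_generation = room_lifetime_years / generations

print(f"ratio after first cluster: {initial_ratio:.0f}:1")
print(f"ratio over room lifetime:  {final_ratio:.0f}:1")
print(f"years per generation:      {years_per_generation:.1f}")
```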

Cheers,
 	Bruce


