[Beowulf] Re: MS Cray
Robert G. Brown
rgb at phy.duke.edu
Thu Sep 18 07:22:19 PDT 2008
On Wed, 17 Sep 2008, Gus Correa wrote:
> After I configured it with eight dual-slot quad-core Xeon E5472 (3.0GHz)
> compute nodes,
> 2GB/core RAM, IPMI, 12-port DDR IB switch (their smallest),
> MS Windows installed, with one year standard 9-5 support, and onsite
> the price was over $82k.
> It sounds pricey to me, for an 8 node cluster.
> Storage or viz node choices, 24-port IB to connect to other enclosures, etc,
> are even more expensive.
Again, excellently well put. This is literally the bottom line. What
we are really talking about is form factor and who does what. People
usually are pretty careful with their money, at least within their range
of knowledge. When bladed systems first started coming out -- which was
many years ago at this point -- I did a bit of an on-list CBA of them
and concluded that there was a price premium of something like a factor
of 2 for them, compared to the price of an equivalent stack of
rackmounted nodes, more like 3 compared to a shelf full of tower units.
I asked, "Why would anyone pay that?"
The answer (given by Greg, IIRC, though it was long ago and I could be
mistaken about both the person and the answer) was:
* Cluster people with significant constraints on space, power, or AC.
* Businesses that want a turnkey system, typically for HA
applications, that is compact and "easy" to support.
And that is fair enough, actually. Some places one literally has a
closet to put one's cluster in, and if one populates the closet with a
rack one has to work within the current limits on processor (core)
density per U in rackmount configuration. At the time that was
typically two, and a blade chassis could achieve something like 2-3x the
processor density per U. Space renovation can be very expensive, and
sometimes there IS no space for a room full of racks at ANY cost, and
shelved towers are even lower density, as they waste a whole lot of space.
It sounds like these systems are following similar rules and economics
today. At $10K/node, it sounds like it isn't really 3x as expensive as
a rackmount cluster (with similar power) any more -- perhaps 2x as
expensive or even a bit less. If one counts it as a "64-core cluster"
it sounds a lot better than an "8-node cluster", and it also frees one
to at least consider a 16-node solution with single quad-core processors
to reach the same core count. That might well be cheaper (although
networking costs might offset the savings), and splitting the cores
across more nodes provides a lot more network bandwidth per core, so it
might still be a good idea either way.
Still, having a 64-core cluster that fits in a few U or in a box I could
put in the corner of my desk is definitely appealing, even though the
price is for me exorbitant. A final question remains, though -- it
isn't just size (volume) -- it is infrastructure in general. How much
power does this puppy require? How much AC?
Not at all an idle question. Penguin's rackmount dual-quad nodes come
with 600 W power supplies! If an 8-core node draws a not unreasonable
400W under load, 8 nodes in a bladed chassis will ALSO draw at least 3
KW! Ouch! That's hot!
With this guesstimate we have clearly exceeded the capacity of a standard
20 amp 120VAC circuit, the kind one MIGHT actually find in an office.
Even a 200W/node box (1600W total) is beyond the safe continuous load of
a 15 amp circuit.
Big problem. My office actually doesn't have a plug or service capable
of providing 3 KW at any voltage. And it might well need 4 KW if node
draw under load is closer to 500W!
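The circuit arithmetic here is easy to check. A rough sketch, using the node wattages guessed above and the conventional rule of thumb that continuous loads should stay under 80% of a breaker's rating:

```python
# Back-of-envelope check of office circuit capacity vs. cluster draw.
# Node wattages are the rough guesses from the discussion, not specs.

def circuit_watts(volts, amps, continuous_derate=0.8):
    """Usable watts from a branch circuit; continuous loads are
    conventionally limited to 80% of the breaker rating."""
    return volts * amps * continuous_derate

nodes = 8
for watts_per_node in (200, 400, 500):
    load = nodes * watts_per_node
    for amps in (15, 20):
        cap = circuit_watts(120, amps)
        verdict = "fits" if load <= cap else "EXCEEDS"
        print(f"{nodes} nodes @ {watts_per_node} W = {load} W "
              f"on {amps} A / 120 VAC ({cap:.0f} W usable): {verdict}")
```

Even the optimistic 200W/node case only just squeezes onto a 20 A circuit, and the 400-500W/node cases blow past anything a typical office wall plug can deliver.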
Then there is the AC problem. My office isn't too big -- maybe 60-70 sf
-- and of course it doesn't have its own thermostat or climate control
-- it is temperature controlled by a large building-scale system that
generates conditioned airflow to and between all the rooms on my hall.
It stays perfectly comfortable with a variable load of perhaps 400-700 W
of power produced within its walls -- that would be me (100W), lights
(100W), a desktop (150W), a laptop (50W), and a bit of this and that,
including a few students from time to time at 100W each and THEIR
laptops as well, come to think of it. So it could have as much as a KW
peak heat load, but it's a big building, there is some "ballast" and
averaging from neighboring offices, and it doesn't really get hot even
when we party in there.
Now throw that extra 3 KW in. Suddenly the MINIMUM heat load is over 3x
the former PEAK load. My office has no thermostat -- I just have to live
with
what the building delivers. Room temperature soars. rgb has to start
going to work in shorts and sandals, no shirt, and installs a keg
refrigerator to hold the beer essential to life that must be piped
directly to mouth to replace the sweat and dull the pain of working
under such extremely hot conditions. Only for a short while, of course
-- cluster dies a horrible death, as it isn't designed to run at 110F.
And then along comes winter, and Duke, foolishly assuming that it is
cold outside and people need HEAT and not COLD delivered to their
offices, shuts off the building chiller and turns on the steam.
Obviously I wouldn't need the steam, but EVEN IN THE WINTERTIME I won't
be able to get rid of 3500-4500 watts. I'd need a TWO-ton (well, I
might get away with a 1.5 ton) AC unit of my very own to remove this
heat and dump it -- somewhere -- and I'd have to control its thermostat.
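The tonnage estimate follows from the standard conversions (1 W ≈ 3.412 BTU/h; 1 ton of cooling = 12,000 BTU/h). A quick sketch:

```python
# Convert a steady heat load in watts to required AC capacity in tons.
BTU_PER_HOUR_PER_WATT = 3.412   # standard conversion factor
BTU_PER_HOUR_PER_TON = 12_000   # definition of a "ton" of cooling

def tons_needed(watts):
    """Cooling tonnage required to remove a continuous heat load."""
    return watts * BTU_PER_HOUR_PER_WATT / BTU_PER_HOUR_PER_TON

for load in (3500, 4500):
    print(f"{load} W -> {tons_needed(load):.2f} tons of cooling")
```

3500 W is almost exactly one ton, and 4500 W comes to roughly 1.3 tons, so a 1.5 ton unit covers the range with a little margin and a 2 ton unit is comfortably conservative.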
Fine. One can, actually, buy 1 ton to 1.5 ton AC units that sit over in
a corner and vent heat into e.g. the space inside a drop ceiling (where
eventually building AC/climate control sucks it up and re-air-conditions
it away). They draw a few hundred watts of their own, need a place to
dump water from condensation (or a human has to hand-empty their
condensation pan), and of course they are loud and annoying, but they
work, when somebody doesn't kick the drain tube out so that they leak
all over the floor, or the ceiling space gets so hot that hot air backs
up into the room anyway.
So, to run one of these "office clusters" in the REAL world, it sounds
like I'd probably need the following:
a) An additional 4 KW-capable circuit in my office. Concrete walls,
I'm guessing fully subscribed service, let's be optimistic and add just
$1K to get it. That leaves me my existing 20 A service to run the AC
on; tight but doable -- or of course I can get a couple of high-power
circuits while I'm at it, one for the system and one for its AC.
b) A 1.5 ton AC as described. There goes 1 m^2 of floor space, hmmm,
where the hell am I going to get THAT? A filing cabinet must go, or
students will just have to sit on it. The unit costs perhaps $3-5K,
depending on new or used and condition etc, and it still needs a bit of
installation beyond just buying it from a catalog. Oops, and my office
doesn't, actually, have drop ceilings. Need to vent it outdoors, need
hole in wall for vent, need at least $1K to pay the nice man to put it
there and make it all pretty. Double oops, that needs outside wall,
guess I have to lose a bookshelf instead of a filing cabinet, stack books
to ceiling on top of filing cabinets that remain.
c) A drain. Another hole. No bathroom nearby, I guess we'll have to
just drill a different wall and dump. Hmmm, I wonder if local codes --
or Duke -- will let me do that? A perpetual drip into an interior
"chimney" airvent next to the stairwell has nowhere to go. One out the
other window, where will that water go? Must check, but either way
another $500 for sure (and I'll have to move completely out of the
office while they chop holes and generate all that dust, brick and
concrete walls, very difficult to drill).
So, after budgeting around $90K -- or even more -- for actually
installing my cute little 64-core system, after moving out of my office
while power and AC are retrofitted, after moving back in and clearing
room on top of a filing cabinet or on a corner of my desk for the blade
chassis, I crank the AC up to max (that's what it will run on nearly all
the time, max) and power up my little sucker. Its cooling
fans -- capable of moving that 3+ KW out of its small volume -- kick on
and add to the soothing, um, "purr" of the AC, and I realize that I'm
going to be living at 40 dB or more of steady-state background noise
forever, maybe even 50 by the time my other hardware kicks in. Students
have to speak loudly to overcome the noise, I take to wearing
noise-cancellation headphones all the time even when I'm not listening
to music in my room just to protect my hearing. But I've got a hell of
a cluster, and it is in my office, and I snuck that keg refrigerator
into the budget by arguing that the cluster was "liquid cooled"!
OR, since we already HAVE a lovely server/cluster room downstairs, with
a large AC unit built right in that keeps the room COLD, and lots of
power-poles just waiting to deliver power and trivially rewireable to
deliver e.g. 208 VAC at 30A and no humans who have to listen to the
60-80 dB background roar except for brief intervals doing service, I
COULD just buy a stack of 1U sufficient to give me 64 cores with
2GB/core and high speed networking, in the cheapest (per core) rackmount
form factor. I don't even have to buy rack space, as we have that in
abundance. Pricing out reasonably comparable 1U nodes from e.g. Penguin,
it looks like they would cost "around" $6000 each with the usual service
etc. -- around $750/core instead of $1100/core (including the
infrastructure mods). The cluster costs me a bit over half as much --
$50K, say --
although I'm not actually pricing it out and if I shopped harder it
might be even less, or a bit more if I forgot something in my estimate.
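A sketch of that per-core comparison, using the rough numbers above. The infrastructure subtotal is my own sum of the retrofit guesses ($1K wiring, ~$4K AC, $1K venting, $500 drain), and the ~$1100/core quoted above is a rounder figure than what these list prices give:

```python
# Rough cost-per-core comparison: bladed "office cluster" plus the
# office retrofits vs. a stack of 1U rackmount nodes in an existing
# server room. All dollar figures are estimates from the discussion.

cores = 64

blade_system = 82_000                  # "over $82k" for the 8-node blade config
infra = 1_000 + 4_000 + 1_000 + 500    # wiring, AC unit (midpoint), vent, drain
blade_per_core = (blade_system + infra) / cores

rack_node_price = 6_000                # "around" $6000 per comparable 1U node
rack_nodes = 8                         # 8 dual-quad 1U nodes = 64 cores
rack_per_core = rack_node_price * rack_nodes / cores

print(f"blade option: ${blade_per_core:,.0f}/core")
print(f"rackmount:    ${rack_per_core:,.0f}/core")

savings = (blade_system + infra) - rack_node_price * rack_nodes
extra_cores = int(savings // (rack_node_price / 8))  # 8 cores per 1U node
print(f"~${savings:,} saved buys roughly {extra_cores} more cores")
```

The savings fund at least another 48 cores of rackmount capacity, which is the "112 cores total" figure below.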
Now, I don't have to listen to it. I don't have to rewire my room and
give up a table, desk space, etc. I have network access in my office
(obviously) -- multiple ports, actually. If I feel really ambitious I
spend a little of the money I save arranging to pipe one of the ports
directly into my cluster down there so I can make my desktop a kind of
"head node" for the cluster, although in practice I've never done that,
why bother? GigE access on our standard LAN seems adequate, somehow.
With the money I save I buy, lessee, at least 48 more cores! Wow, I get
my computations done in a lot less time! Wow, with 112 3 GHz cores
total, I actually have capacity to spare, and can even loan cores out to
my buddies who are CPU starved, which inspires the University to pay the
nontrivial cost of providing power and AC to all of these CPUs! The
cost savings of MY TIME -- well, it's hard to compute, but take the
salary I would need to be paid while all that work takes well over half
again as long to complete, and feed even that savings on my meager
salary back into still more nodes: the break-even point on the CBA
exercise would probably leave me with WELL OVER 2x the 64-core capacity
of the "cute" desktop cluster model.
This little exercise in the realities of infrastructure planning exposes
the fallacy of the "desktop cluster" in MOST office environments,
including research miniclusters in a lot of University settings. There
exist spaces -- perhaps big labs, with their own dedicated climate
control and lots of power -- where one could indeed plug right in and
run, but your typical office or cubicle is not one of them. Those same
spaces have room for racks, of course, if they have room for a high
density blade chassis.
If you already have, or commit to building, an infrastructure space with
rack room, real AC, real power, you have to look SERIOUSLY at whether
you want to pay the price premium for small-form factor solutions. But
that premium is a lot smaller than it was eight or so years ago, and
there ARE places with that proverbial broom closet or office that is the
ONLY place one can put a cluster. For them, even with the relatively
minor renovations needed to handle 3-4 KW in a small space, it might
well be worth it.
Robert G. Brown Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977