[Beowulf] followup on 1000-node Caltech cluster
sdm900 at gmail.com
Mon Jun 20 08:02:35 PDT 2005
> how much airflow (CFM) do you see from tiles in the front of your racks?
> for 8kW, I'd expect maybe 800-900 CFM, or around 2 modestly-performing
> perforated tiles. I'm reasonably happy with the tiles in my new
> about 600 CFM apiece, and placed 2/rack.
> the rule-of-thumb is 1 ton (3.5 KW) per tile, but that assumes
> older, lower-flow tiles, I think.
I'm not sure how much air flow comes through the tiles. The tiles
have air ducts cut into them (it's a really old machine room), which
effectively leaves 1/3 of each tile completely open (with a slight
grill over it). There are two tiles' worth of "slots" in front of
each rack of compute nodes, and about a 5 degree Celsius difference
between the top and bottom of the racks.
> how long are your rows?
3 rows of 11 racks (well, give or take a rack on each row).
> obviously the hot air will want to rise, but I suppose enough velocity
> will make it go where you want. my machineroom is "half-ducted" as
> downflow chillers, 16" raised floor acting as a cold air plenum, but
> open space above for hot/return air. it's a bit of a risk - it would
> offer a lot more control to have a suspended ceiling close to the top
> of the racks, with the supra-ceiling space acting as a return plenum.
About a 60cm raised floor and about the same amount of head room.
The new system (SGI Altix Bx2 - a large beowulf-style cluster) has
chilled-water radiators(?) in the back of each rack, which prevent
the hot air from the compute nodes from hitting the room. It works
very, very well. The air coming out the back of the racks is actually
colder than the intake air.
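The water flow such rear-door heat exchangers need can be estimated from q = m_dot * cp * dT (a sketch; the 20 kW rack load and 5 C water temperature rise are assumed figures, cp of water is the standard 4.186 kJ/kg.K):

```python
# Chilled-water flow needed to absorb a rack's heat load: q = m_dot * cp * dT
CP_WATER = 4.186  # specific heat of water, kJ/(kg*K)

def water_flow_lpm(kw, delta_t_c=5.0):
    """Litres/minute of water needed to absorb kw at a delta_t_c rise."""
    kg_per_s = kw / (CP_WATER * delta_t_c)  # mass flow from q = m_dot*cp*dT
    return kg_per_s * 60.0                  # ~1 kg of water per litre

print(round(water_flow_lpm(20.0)))  # ~57 L/min for an assumed 20 kW rack
```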
> that sounds surprisingly slow. our older machineroom had only
> about 30KW in it, but it was fairly small. when cooling was lost,
> we went up >15 C in <5 minutes.
The machine room is very large.
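Room size matters because the air itself is the only buffer once cooling is lost; a rough time-to-temperature estimate (a sketch; the 500 m^3 room volume and 30 kW load are assumed to match the quoted experience, and it ignores the thermal mass of walls and equipment):

```python
# Time for room air to rise dT degrees once cooling is lost:
# t = (volume * rho_air * cp_air * dT) / heat_load
RHO_AIR = 1.2    # density of air, kg/m^3
CP_AIR = 1005.0  # specific heat of air, J/(kg*K)

def seconds_to_rise(volume_m3, load_kw, delta_t_c):
    """Seconds for the room air to heat delta_t_c with no cooling."""
    return volume_m3 * RHO_AIR * CP_AIR * delta_t_c / (load_kw * 1000.0)

# An assumed small ~500 m^3 room with 30 kW heats 15 C in about 5 minutes,
# consistent with the quoted experience; a much larger room buys more time.
print(seconds_to_rise(500, 30, 15) / 60)  # ~5 minutes
```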
> interestingly, there's no real point to keeping up compute nodes
> via UPS
> unless you also have the chillers+blowers on UPS or automatic
> in fact, all of our new machines (~6K cpus across 4 large clusters)
> UPS-less compute nodes.
Agreed. Only servers and disks on UPS.
Having chillers on UPS can make some real sense. The new system
generates about 400kW of heat (along with about 100kW from other
equipment)... lose the chilled doors at the back of the racks and you
could be in serious trouble. You need to shut those nodes down in a
hurry.
I think power/air conditioning is a major concern for even small
linux clusters. I've seen some real disasters, even just from putting
a small 20-node cluster into an under-spec machine room. Power the
nodes on and the power circuits were happy... until the nodes started
generating heat (on boot) and the air conditioner kicked in... which
tripped the whole machine room. UPSes started squealing, servers
started crashing... RAID trays started going down. It took the group
several weeks to fully recover.
Dr Stuart Midgley
Industry Uptake Program Leader - iVEC
26 Dick Perry Avenue, Technology Park
Kensington WA 6151
Phone: +61 8 6436 8545
Fax: +61 8 6436 8555
Email: stuart.midgley at ivec.org