disadvantages of a linux cluster
Robert G. Brown
rgb at phy.duke.edu
Tue Nov 12 05:59:35 PST 2002
On Tue, 12 Nov 2002, Guy Coates wrote:
> > Just for the record, how much did this cluster cost? Or at least, how
> > much does a 3U with 24 blades cost?
> Can't comment on how much we paid I'm afraid, but list price for an 800 MHz
> PIII blade is $1,249 (you have to figure in the cost of the chassis as
> well, the price of which does not seem to be on rlx's website).
> Performance wise, linpack pulls about ~550 Mflops on a single blade.
> (dmesg reports 1592.52 BogoMIPS per CPU). Disk IO isn't great (ide disks)
> but there are two of them per blade so we use RAID-0 which helps. The
> network is only 100BaseT which is not good if you run MPI (we don't) but
> there are 3 interfaces per card which allows us to run a slightly strange
> network topology in order to move large datafiles onto the blades in a
> sensible amount of time.
Can one elect to omit one or both disks? Ethernet interface(s)? Disks
and network interfaces can both be passive power consumers and a linux
image for nodes occupies at most a couple of GB and as little as a few
tens of MB. These blades sound like they were born to run diskless in a
compute farm for EP tasks (where one wouldn't really need multiple
network interfaces if the task was scaled to run for times much longer
than startup and data collection), and if one could knock a few hundred
off the price per blade, would come in at order of $1/MHz, very
comparable to lintel.
One does have to worry a bit about getting off the beaten COTS track --
single source vendor, price to be negotiated with same, single source of
service -- but Myrinet sets a long precedent and it does sound like a
> >So you don't seem to be getting a lot more MHz/Watt,
> Probably not too surprising, as the CPU and disks are going to be the
> major power draw, and they are bog-standard PC parts, same as in every
> other Lintel cluster.
Ah, I thought you had the Transmeta blade. Well, that simplifies
guesstimating and comparing P6-family performance and addresses at least
part of the COTS issue as well:-) One wonders why they can't sell the
blades with 1.4 GHz PIII's? They must be using the low voltage/power
chips, although I find it a bit hard to get full technical specs from
their website. Looks like they really want to sell "standard" blades
without a lot of configuration choices (not unreasonably).
> The blades are very easy to manage. There are no user-serviceable parts on
> a blade. So if a disk or CPU dies we pull the whole blade and replace with
> a new one. Whether this is a good thing or not depends on how well you
> get on with your vendor and the T&Cs of your service agreement:). RLX have
> some nifty blade management software which we use to provision OSs, look
> at hardware health, get serial consoles etc.
Do the NICs do PXE? Can they run diskless? RLX is clearly selling
webfarms and server farms with beowulfery an important but secondary
sales target, but they might consider selling a stripped blade
engineered as a pure compute node. It would save a bunch of power, too
-- disks and unnecessary NICs draw wattage, probably close to half the
total draw of the boards. At (say) 500-600W per 3U your power/heat
problem is fairly significantly reduced, and the lifetime of the boards
at a lower operating temperature extended.
BTW, the chassis price from their website was around $3K, making a fully
populated 24 blade unit with ~$1500 boards about $40K, about $2 per
aggregate P6 MHz. A dual Xeon with two 2.4 GHz P4's is about $2000
(depending on configuration -- probably with a lot more memory) or a bit
less than $1/MHz (although it also underperforms a PIII relative to
clock in some applications, including mine).
If RLX could find a way of selling stripped, bigger memory but lower
power compute blades for <$1000 they'd be very cost competitive. Of
course, I suppose that this is a matter of dickering with them;-)
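For anyone who wants to rerun the price arithmetic above with their own quotes, here is a minimal Python sketch. It uses only the rough figures from this thread (~$3K chassis, ~$1500 or a hypothetical $1000 per blade, ~$2000 for a dual 2.4 GHz Xeon); the function and variable names are made up for illustration, and $/MHz is of course only a crude proxy for real application performance.

```python
# Rough $/MHz comparison using the approximate 2002 prices quoted in this
# thread. Names are illustrative; plug in real quotes before deciding anything.

def cost_per_mhz(total_cost_usd, cpu_count, mhz_per_cpu):
    """Return dollars per aggregate MHz for a system."""
    return total_cost_usd / (cpu_count * mhz_per_cpu)

CHASSIS_USD = 3000        # ~$3K chassis price from the vendor website
BLADES_PER_CHASSIS = 24
BLADE_MHZ = 800

# Fully populated chassis with ~$1500 boards ("about $40K").
blade_total = CHASSIS_USD + BLADES_PER_CHASSIS * 1500
blade_ratio = cost_per_mhz(blade_total, BLADES_PER_CHASSIS, BLADE_MHZ)

# Hypothetical stripped compute blade at $1000 (an assumption, not a real
# product); the chassis cost is still included.
stripped_total = CHASSIS_USD + BLADES_PER_CHASSIS * 1000
stripped_ratio = cost_per_mhz(stripped_total, BLADES_PER_CHASSIS, BLADE_MHZ)

# Dual Xeon node: ~$2000 for two 2.4 GHz P4-class CPUs.
xeon_ratio = cost_per_mhz(2000, cpu_count=2, mhz_per_cpu=2400)

print(f"RLX blades @ $1500:  ${blade_ratio:.2f}/MHz")
print(f"RLX blades @ $1000:  ${stripped_ratio:.2f}/MHz")
print(f"Dual 2.4 GHz Xeon:   ${xeon_ratio:.2f}/MHz")
```

With these inputs the chassis overhead keeps even a $1000 blade above $1/MHz, so the per-blade price is not the whole story.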
Thanks, this has been very informative. I was never convinced by the
"power density" argument the last time blade computers were brought up
on the list, as the issue isn't power per chip, it is power per MHz and
given roughly constant switching power requirements at a given VLSI
scale, one doesn't expect a tremendous difference in power consumption
between 3 800 MHz P6's and one 2.4 GHz P6. A fully loaded 40U rack
(allowing 4U of space at the top for patch panels or switches) can hold
12 3U boxes, or 10+ kW of power. That is, umm, HOT -- a 4 ton A/C with
massive airflow can just be attached to the front of the rack, thank you.
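The rack arithmetic above can be sketched in a few lines of Python. The per-chassis wattages are assumptions: 850 W is simply a value picked to be consistent with "10+ kW" across 12 chassis, and 550 W is the midpoint of the 500-600W guess for stripped blades; neither is a measured number. The only hard constant is that one ton of refrigeration is 12,000 BTU/h, or about 3.517 kW.

```python
# Back-of-envelope rack heat load. Wattages passed in are assumptions,
# not vendor specifications.

WATTS_PER_TON = 3517  # 1 ton of refrigeration = 12,000 BTU/h, ~3.517 kW

def chassis_per_rack(rack_u=40, reserved_u=4, chassis_u=3):
    """Whole chassis that fit after reserving space for panels/switches."""
    return (rack_u - reserved_u) // chassis_u

def rack_heat_load(watts_per_chassis, n_chassis):
    """Return (total watts, tons of cooling needed to remove that heat)."""
    watts = watts_per_chassis * n_chassis
    return watts, watts / WATTS_PER_TON

n = chassis_per_rack()  # 12 chassis of 3U in a 40U rack with 4U reserved

for label, w in [("full blades (assumed 850 W/chassis)", 850),
                 ("stripped blades (assumed 550 W/chassis)", 550)]:
    watts, tons = rack_heat_load(w, n)
    print(f"{label}: {watts / 1000:.1f} kW, {tons:.1f} tons of cooling")
```

This counts only the heat the rack itself dissipates; a real machine-room estimate would add margin for other equipment and for the A/C running below its rated capacity.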
I am a lot more likely to be convinced by ease of installation and
management issues, FLOPS rack densities, and long term reliability. It
looks like your cluster does quite well on the first ones, and the last
one remains to be proven in application.
> Guy Coates
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu