[Beowulf] Themes for a talk on beowulf clustering

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Mon Mar 4 16:06:19 PST 2013


I think the change in scale over the past 10-15 years is interesting, and especially the changes in architecture that result from this.

Going from 8-16 processors to 1000s is a big change.  Bisection bandwidth on your comm fabric becomes an issue.  How do you boot?  8 processors can be booted sequentially or simultaneously from a single server.  For 1000 you need a "better way".  How do you feed files to/from a 1000-processor cluster?

Issues with checkpoint/restart/reliability.  We had a project here at JPL looking at replacing the big 70 meter dishes with an array of, say, 100 6-12 meter dishes.  Replacing the single custom box with lots of commodity things (6-12 meter antennas are stamped out by the hundreds).  Very Beowulf'y in concept.

Turns out that cryocoolers (needed to keep the receiver at a nice toasty 4 kelvin) aren't really a mass-produced item, and at the observed failure rates, you'd have a hard time keeping enough of them working to do what you needed.  A failure rate of once a month (or something.. I don't know what the actual rates are) on the 70m antenna means you can have a spare and swap it in, and then you basically have a month to fix the broken one.  With 100 antennas and a cryocooler MTBF of 0.5 years, you'll have 4 broken coolers at any given time.
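
For anyone who wants to play with the numbers, here is a rough sketch of the steady-state arithmetic.  The 100 antennas and 0.5 year MTBF are the figures above; the repair turnaround times are my assumption (the count of broken units depends heavily on it), and the function name is just for illustration.  It uses the usual unavailability ratio MTTR / (MTBF + MTTR) and treats the failures as independent.

```python
def expected_broken(n_units, mtbf_years, repair_years):
    """Average number of units down at any moment in steady state.

    Each unit is unavailable for the fraction MTTR / (MTBF + MTTR)
    of the time, assuming independent failures and one repair per failure.
    """
    unavailability = repair_years / (mtbf_years + repair_years)
    return n_units * unavailability


N_ANTENNAS = 100   # from the post
MTBF_YEARS = 0.5   # from the post

# Repair turnaround times are assumed values, not measured ones.
for label, repair_years in [("1 week", 1 / 52), ("1 month", 1 / 12)]:
    broken = expected_broken(N_ANTENNAS, MTBF_YEARS, repair_years)
    print(f"repair time {label}: about {broken:.1f} coolers down on average")
```

Under these assumptions, roughly 4 coolers down at once corresponds to about a one-week repair turnaround; if each repair takes a month, the average is closer to 14.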

Another theme: the practical differences in experience between assembling a toy cluster of 4-8 processors and simulating it with VM instances on a single machine.  What you learn from the former that you don't get from the latter (the importance of labeling cables, for instance).



Jim Lux

From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Andrew Holway
Sent: Sunday, March 03, 2013 12:24 AM
To: Bewoulf
Subject: [Beowulf] Themes for a talk on beowulf clustering

Hello all,

I am giving a talk on Beowulf clustering to a local LUG and was wondering if you had some interesting themes that I could talk about.

ta for now.

Andrew