[Beowulf] The Walmart Compute Node?
jimlux at jpl.nasa.gov
Fri Nov 9 10:40:04 PST 2007
Quoting "Jeffrey B. Layton" <laytonjb at charter.net>, on Sat 10 Nov 2007
08:49:01 AM PST:
> andrew holway wrote:
>> Sod all this tin pot stuff.
>> Buying all this crap, sticking it in a rack and stringing it together
>> with wire ain't difficult. Making the damn software work is the tricky part.
>> Get loads of ram, vmware-server and BINGO! you have a cluster!
> But this isn't a cluster - it's enterprise masturbation. We're talking about
> HPC, not running payrolls on a server. It's all about performance.
> Running a bunch of VM instances on a server is not really HPC to me.
> Of course, it's a GREAT way to learn and I know a bunch of people
> who use it for testing and development.
And, in fact, I contend that the grubby aspects of stringing
wires, making netboot or sneakernet distribution work, and so forth
are exactly what future cluster builders desperately need practice with.
If your interest is parallel algorithm design, then multiple VMs is a
If your interest is understanding the practicalities of cluster
engineering, then a stack of 50 very cheap boxes might be a suitable
playground for learning by ordeal.
Suppose you have a class in cluster engineering with, say, 20-30 students.
You make up groups of 2-3 bodies (so they can learn social skills, if
nothing else), and give each group a crate with 8 boxes with freshly
wiped disks plus one head node and a box full of power cords, network
cables, VGA cables, keyboards, all thrown in there by last semester's
groups. There will, of course, be 9 power cords in some crates and 7 in others.
Have them build up a cluster and run some trivial demo. There will be
much learning, just getting a bootable image on all 8 machines (some
might go the PXEboot route, some might sneakernet).
Then, tell them they have to gang all 80 machines into two clusters,
each one with 40 machines, and install a new OS. Hand configuration
management (CM) and sneakernet will be painful. Then, have them swap 20 of
the machines between clusters and do the same again. CM by hand and sneakernet
is even MORE painful. Heck, they can start to understand the
differences between parallelism on machines and parallelism in bodies
(ok, Bob, you put the boot CD in machines 1-5, Fred, you do 6-10, Ann,
11-15, etc.). If they're all running from one networked file server,
they'll also learn empirically why you don't want them all to boot
from the network simultaneously.
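For a rough feel of the simultaneous-boot problem, here's a back-of-the-envelope sketch in Python. The numbers (200 MB boot image, 100 Mbit/s server link) are made up for illustration, and the model assumes the nodes split the server's link evenly, which real networks only approximate.

```python
def shared_link_seconds(nodes, image_mb, link_mbit_per_s):
    """Seconds until the last node finishes, if all nodes pull the same-sized
    image at once and split the server's link evenly (a simplifying assumption)."""
    total_megabits = nodes * image_mb * 8  # megabytes -> megabits
    return total_megabits / link_mbit_per_s

# Illustrative numbers only: 200 MB image, 100 Mbit/s server link.
one_node = shared_link_seconds(1, 200, 100)
all_forty = shared_link_seconds(40, 200, 100)
print(f"1 node alone: {one_node:.0f} s; 40 nodes at once: {all_forty:.0f} s each")
```

Under those assumptions a lone node is done in 16 seconds, while 40 nodes booting together each wait over ten minutes, and a boot loader may well give up before then.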
If you've given them cheap 5-port switches, they'll also get to learn
about multi-tier networking topologies.
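To see why 5-port switches force multiple tiers, here's a small Python sketch that counts switches per tier for a plain tree. It assumes every switch below the root gives up one port to an uplink and that the root can use all its ports for downlinks; the function name and the tree layout are mine, just one simple way to wire it.

```python
import math

def switch_tiers(hosts, ports=5):
    """Switch count per tier (leaf tier first) for a simple tree of
    identical switches. Assumes one uplink port per non-root switch."""
    if ports < 3:
        raise ValueError("need at least 3 ports to build a tree")
    tiers = []
    links = hosts  # things that need a port on the next tier up
    while links > ports:
        # Non-root switches: one port is the uplink, the rest face down.
        count = math.ceil(links / (ports - 1))
        tiers.append(count)
        links = count
    tiers.append(1)  # root switch: everything left fits on one box
    return tiers

tiers = switch_tiers(40)
print(tiers, "->", sum(tiers), "switches in", len(tiers), "tiers")
```

For a 40-machine cluster this gives [10, 3, 1]: 14 switches in three tiers, with all traffic between leaf groups squeezing through single uplinks. A crate's worth of 9 boxes already needs two tiers.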
That'll larn 'em....
(If anyone decides to do this, let me know... I'd love to watch.. I'll
even bring suitable frosty beverages for the spectators)