[Beowulf] Heterogeneous, intermittent beowulf cluster administration

Lux, Jim (337C) james.p.lux at jpl.nasa.gov
Fri Sep 27 06:45:53 PDT 2013


I'll be contrarian here, having done this kind of thing too long ago to
admit.

As Skylar and Gavin point out, it's a hideously inefficient way to do most
high performance computing tasks, both in terms of "wall clock time to
complete an amount of work" and in terms of "dollars spent on sysadmin
tasks, electricity, etc."  OTOH, it's a wonderful way to get experience
with *why* serious computing benefits from a more homogeneous system.

And you'll learn an incredible amount about configuration management in a
heterogeneous environment, as well as the details of transitioning the
machines back and forth between "cluster mode" and "user mode".

You'll learn a lot about network connectivity and the lack thereof.


And, if you have some suitable embarrassingly parallel tasks, you can get
them done.

I would start easy.. Set up a system where you have a "queue" of "work to
be done" on some master machine. It could be as simple as a series of
tar/zip files that contain all the information needed to run a job.

Your worker bee nodes go and fetch one of the "jobs" from the list,
explode the archive, run it, and shove an archive back to the master when
done.  A shell script on each worker can take care of fetching the data
from the server and storing the results.
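
To make the fetch/explode/run/return cycle concrete, here is a minimal
sketch in Python rather than shell. Everything named in it is an
assumption for illustration, not what I actually ran: the queue is a
directory that all workers can see (say, an exported share), each job is
a .tar.gz holding a run.sh, and a worker "claims" a job by renaming it
into a claimed directory so two workers don't grab the same one.

```python
import os
import subprocess
import tarfile
import tempfile
from pathlib import Path


def claim_job(queue_dir: Path, claimed_dir: Path):
    """Claim one job archive by renaming it; return its new path, or None."""
    for candidate in sorted(queue_dir.glob("*.tar.gz")):
        target = claimed_dir / candidate.name
        try:
            # Assumes both directories are on the same filesystem, where
            # rename is atomic; a network share may weaken that guarantee.
            os.rename(candidate, target)
            return target
        except FileNotFoundError:
            continue  # another worker claimed it first
    return None


def run_job(archive: Path, results_dir: Path) -> Path:
    """Unpack the job, run its run.sh, and pack the work dir as the result."""
    with tempfile.TemporaryDirectory() as work:
        with tarfile.open(archive) as tar:
            tar.extractall(work)  # only unpack archives you generated yourself
        with open(Path(work) / "stdout.txt", "w") as out:
            # run.sh is a hypothetical convention: the job's entry point.
            subprocess.run(["sh", "run.sh"], cwd=work, stdout=out,
                           stderr=subprocess.STDOUT, check=False)
        result = results_dir / archive.name.replace(".tar.gz", ".result.tar.gz")
        with tarfile.open(result, "w:gz") as tar:
            for item in Path(work).iterdir():
                tar.add(item, arcname=item.name)
    return result


def work_until_empty(queue_dir: Path, claimed_dir: Path, results_dir: Path):
    """Keep claiming and running jobs until the queue directory is empty."""
    done = []
    while (job := claim_job(queue_dir, claimed_dir)) is not None:
        done.append(run_job(job, results_dir))
    return done
```

A worker would call work_until_empty() on the three directories when its
user goes home. Jobs that were claimed but never produced a result (the
user came back and rebooted, say) stay in the claimed directory, so the
master can spot them and put them back in the queue.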

This is pretty easy to set up, and if you have EP work (I was running
multiple cases of the antenna modeling code NEC) it's easy to write a
program to generate the "job files" with systematic variations of the
parameters. Run times for my jobs were in the "hours" range.  I prevailed
on users to start my task when they went home for the night on their
worker bee computers.
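
Generating the job files is the easy part. The sketch below is not my
NEC generator, just an illustration of the idea: it takes a dictionary
of parameter names and value lists, walks every combination, and writes
one .tar.gz per case. The params.txt layout, the run.sh entry point and
the my_solver command are all made-up placeholders; substitute whatever
input deck and command line your code really needs.

```python
import io
import itertools
import tarfile
from pathlib import Path


def add_text(tar, name, text, mode=0o644):
    """Add an in-memory text file to an open tar archive."""
    data = text.encode()
    info = tarfile.TarInfo(name)
    info.size = len(data)
    info.mode = mode
    tar.addfile(info, io.BytesIO(data))


def generate_jobs(queue_dir, sweep, command="my_solver params.txt"):
    """Write one job archive per combination of the swept parameters.

    sweep maps a parameter name to the list of values to try.
    command is a placeholder for the real solver invocation.
    """
    queue_dir = Path(queue_dir)
    queue_dir.mkdir(parents=True, exist_ok=True)
    names = sorted(sweep)  # fixed order so job numbering is repeatable
    paths = []
    combos = itertools.product(*(sweep[k] for k in names))
    for n, values in enumerate(combos):
        params = "".join(f"{k}={v}\n" for k, v in zip(names, values))
        path = queue_dir / f"job_{n:04d}.tar.gz"
        with tarfile.open(path, "w:gz") as tar:
            add_text(tar, "params.txt", params)
            add_text(tar, "run.sh", f"#!/bin/sh\n{command}\n", mode=0o755)
        paths.append(path)
    return paths
```

For example, generate_jobs("queue", {"height_m": [1, 2], "freq_mhz": [7,
14, 28]}) writes six archives, one per height/frequency pair. With run
times in the hours range, the overhead of packing and unpacking an
archive per case is lost in the noise.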


Is there an off the shelf solution to this?  I doubt it. It's such a bad
idea in general.  You could look through the literature (and the mailing
list archives) for "cluster of workstations" which is what you are doing.
Maybe even some of the stonesoupercomputer work might be relevant.





On 9/26/13 4:58 PM, "Skylar Thompson" <skylar.thompson at gmail.com> wrote:

>On 09/26/2013 06:25 AM, Gavin W. Burris wrote:
>> Hi, Ivan.
>> 
>> I'm a nay-sayer in this kind of scenario.  I believe your staff time,
>> and the time of your lab users, is too valuable to spend on
>> dual-classing desktop lab machines.
>
>I'm with Gavin here - hardware has gotten too cheap for this to be
>viable in most cases. Furthermore, too many research/computational jobs
>benefit from environments distinct from ideal desktops (whether that's
>core count, RAM, interconnect, or operating system) that you'll probably
>only be satisfying a small subset of the job requests you receive.
>
>Skylar