[Beowulf] Setting up a new Beowulf cluster

Robert G. Brown rgb at phy.duke.edu
Fri Feb 8 08:11:31 PST 2008


On Thu, 7 Feb 2008, Berkley Starks wrote:

> Hello all,
>
> I've been a computer user for the past several years working in different
> areas of the IT world.  I've recently been commissioned by my university to
> set up the first operating Beowulf Cluster.
>
> I am moderately familiar with the Linux OS, having run it for the past
> several years using the Debian, Ubuntu, Fedora Core, and Mandriva
> distros.
>
> With setting up this new cluster I would like any advice possible on what OS
> to use, how to set it up, and any other pertinent information that I might
> need.

This question has been answered on-list in detail a few zillion times.
I'd suggest consulting (in rough order):

   a) The list archives (now that you're a member you can get to them,
although they are digested and googleable for the most part anyway).

   b) Google.  For example, there is a lovely howto here:

     http://www.linux.org/docs/ldp/howto/Parallel-Processing-HOWTO.html

that is remarkably current and a good quick place to start.

   c) Feel free to browse my free online book here:

     http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book.php

I'm working on making it paper-printable via lulu, but I need time I
don't have and so that project languishes a bit.  You "can" get a paper
copy there if you want, but it is pretty much what is on the free
website including the holes.

> Oh, and the cluster will be used for computational physics.  I am a physics
> major making it for the physics department here.  It will need to be able to
> use C++ and Fortran at a bare minimum.

C, C++ and Fortran are all no problem.  The more important questions
are:

   a) How coupled are the parallel tasks?  That is, do you want a cluster
that can run N independent jobs on N independent nodes (where the jobs
don't communicate with each other at all), or do you want a cluster
where the N nodes all do work on a common task as part of one massive
parallel program?  If the former, you're in luck and cluster design is
easy and the cluster purchase will be cheap.

   b) If they are coupled, are the tasks "tightly coupled" so each
subtask can only advance a little bit before communications are required
in order to take the next step?  "Synchronous" so all steps have to be
completed on all nodes before any can advance?  Are the messages really
big (bandwidth limited) or tiny and frequent (latency limited)?

If the answer to any of these latter questions is "yes", post a detailed
description of the tasks (as best you can) to get some advice on choosing
a network, as that's the design parameter that is largely controlled by
the answers.
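
To make the latency/bandwidth distinction concrete, here is a rough
sketch of the usual first-order model, where the time to deliver one
message is roughly latency plus size divided by bandwidth.  The function
names and the network numbers (50 microseconds, 100 MB/s) are
assumptions made up for illustration, not measurements of any real
interconnect, and real networks are messier than this.

```python
# First-order message cost model: time = latency + size / bandwidth.
# All names and numbers here are illustrative assumptions.

def message_time(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time in seconds to deliver one message."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def crossover_size(latency_s, bandwidth_bytes_per_s):
    """Message size in bytes at which transfer time equals latency."""
    return latency_s * bandwidth_bytes_per_s


def classify(size_bytes, latency_s, bandwidth_bytes_per_s):
    """Label a message by which term dominates its delivery time."""
    if size_bytes < crossover_size(latency_s, bandwidth_bytes_per_s):
        return "latency limited"
    return "bandwidth limited"


# Hypothetical network: 50 microsecond latency, 100 MB/s bandwidth.
LATENCY = 50e-6
BANDWIDTH = 100e6

if __name__ == "__main__":
    for size in (64, 1_000_000):
        t = message_time(size, LATENCY, BANDWIDTH)
        label = classify(size, LATENCY, BANDWIDTH)
        print(f"{size} bytes: {t:.6f} s, {label}")
```

With these made-up numbers the crossover sits near 5 KB: messages much
smaller than that are dominated by latency, much larger ones by
bandwidth.  Plugging in your own typical message sizes and the quoted
figures for a candidate network gives a crude first guess at which one
matters for your code.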

    rgb

>
> Thanks again
>

-- 
Robert G. Brown                            Phone(cell): 1-919-280-8443
Duke University Physics Dept, Box 90305
Durham, N.C. 27708-0305
Web: http://www.phy.duke.edu/~rgb
Book of Lilith Website: http://www.phy.duke.edu/~rgb/Lilith/Lilith.php
Lulu Bookstore: http://stores.lulu.com/store.php?fAcctID=877977


