[Beowulf] cluster building advice?
gus at ldeo.columbia.edu
Mon Sep 17 13:03:03 PDT 2012
On 09/16/2012 05:52 PM, Jeffrey Rossiter wrote:
> Hello everyone!
> I am getting started on a cluster building project at my university. We
> just replaced all of our lab machines so I am going to be using the old
> machines to rebuild our cluster. The intention is for the system to be
> used for scientific computation. I am trying to decide on a Linux
> distribution to use. Does it matter all that much? Any advice would be
> greatly appreciated. Book suggestions would help too. I am waiting to
> receive Building Clustered Linux Systems
> by Robert W. Lucke
> but my advisor for the project is concerned that it may be out of date
> for what we are doing. Please share your ideas. Thanks!
> -Jeffrey Rossiter
If yours is a computer science project to learn about
clusters, the list archives are worth searching.
On the other hand, if the goal is just to deploy the
cluster quickly and use it for scientific computation,
you can simply use Rocks [although some people may frown on it],
and be up and running in a very short time.
They have decent documentation on the hardware requirements
and how to set up the cluster.
It will use/require a specific Linux distribution, tied to the
Rocks version. [The current Rocks 6.0 uses CentOS 6.2,
replaceable by RHEL 6.2 or Scientific Linux 6.2, IIRR.]
Most system administration tasks are handled [and sometimes
must be handled exclusively] by their "rocks" command,
which some people like, some don't.
Douglas Eadline already pointed out Cluster Monkey,
and there is also Robert G. Brown's [2004 ?] book.
Other things to think about, since you're cannibalizing
old lab machines:
1. How homogeneous is the hardware? All x86 or all x86_64,
how much memory and disk capacity,
what type of network adapter [100BASE-T Ethernet,
Gigabit Ethernet; InfiniBand is unlikely
if the machines are old]?
The more homogeneous the machines are,
the easier it is to cluster them.
2. Network switch
[which depends on the network adapters in your machines]
Do you have an [Ethernet/GigE, other] switch to connect the machines?
Even a SOHO-type switch may work, although with poor performance.
3. Cluster deployment/maintenance [if you don't want to use Rocks]
4. Job scheduler to use:
There are others, mostly commercial.
5. MPI [if you're doing parallel processing - most likely]
Open MPI [Ethernet, GigE, InfiniBand, Myrinet, etc]
MPICH2 [Ethernet/GigE, ...]
MVAPICH2 [for InfiniBand]
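On point 1 above, a quick way to see how homogeneous the old machines are is to write down each node's architecture, memory and network adapter, then group the nodes by that profile. The Python sketch below is only an illustration: the host names, the inventory values and the helper names are all made up, and in practice you would fill the list in by hand or from whatever each machine reports.

```python
from collections import defaultdict

# Hypothetical inventory, for illustration only. Real values would be
# gathered from each machine (e.g. by hand, or from /proc on Linux).
NODES = [
    {"host": "node01", "arch": "x86_64", "mem_gb": 4, "nic": "GigE"},
    {"host": "node02", "arch": "x86_64", "mem_gb": 4, "nic": "GigE"},
    {"host": "node03", "arch": "x86_64", "mem_gb": 2, "nic": "GigE"},
    {"host": "node04", "arch": "x86", "mem_gb": 2, "nic": "100Mbit"},
]


def group_by_hardware(nodes):
    """Group host names by their (arch, mem_gb, nic) hardware profile."""
    groups = defaultdict(list)
    for node in nodes:
        profile = (node["arch"], node["mem_gb"], node["nic"])
        groups[profile].append(node["host"])
    return dict(groups)


def is_homogeneous(nodes):
    """True when every node shares one hardware profile (or the list is empty)."""
    return len(group_by_hardware(nodes)) <= 1


if __name__ == "__main__":
    groups = group_by_hardware(NODES)
    # Print the largest group first; it is the natural core of the cluster.
    for profile, hosts in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        arch, mem_gb, nic = profile
        print(f"{arch:8s} {mem_gb:3d} GB {nic:8s} -> {', '.join(hosts)}")
    print("homogeneous:", is_homogeneous(NODES))
```

The largest group is usually the sensible set of compute nodes; the odd machines out can serve as a head node, a file server, or spares.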
I hope this helps,