[Beowulf] Some beginner's questions on cluster setup

P.R. romero619 at hotmail.com
Wed Jul 8 11:59:22 PDT 2009


Hi,
I'm new to the list and also to cluster technology in general.
I'm planning on building a small 20+ node cluster, and I have some basic
questions.
We're planning on running 5-6 motherboards with quad-core AMD 3.0GHz
Phenoms, and 4GB of RAM per node.
Off the bat, does this sound like a reasonable setup?

My first question is about node file systems and operating systems:
I'd like to go with a diskless setup, preferably using an NFS root for each
node.
However, based on some of the testing I've done, running the nodes off of the
NFS share(s) has turned out to be rather slow and quirky.
Our master node will be running on a completely different hardware setup
than the slaves, so I *believe* it will be more complicated and tedious
to set up and update the NFS roots for all of the nodes (since it's not simply a
matter of 'cloning' the master's setup and config).
Is there any truth to this, or am I way off?

Can anyone provide any general advice or feedback on how best to set up a
diskless node?


The alternative that I was considering was using (4GB?) USB flash drives to
host a full-blown, local OS install on each node...
Q: does anyone have experience running a node off of a USB flash drive?
If so, what are some of the pros/cons/issues associated with this type of
setup?


My next questions are about network setup.
Each motherboard has an integrated gigabit NIC.

Q: should we be running 2 gigabit NICs per motherboard instead of one?
Is there a 'rule-of-thumb' when it comes to sizing the network requirements?
(e.g., 'one NIC per 1-2 processor cores'...)


Also, we were planning on plugging EVERYTHING into one big (unmanaged)
gigabit switch.
However, I read somewhere on the net that another cluster was physically
separating NFS and MPI traffic on two separate gigabit switches.
Any thoughts as to whether we should implement two switches, or should we be
OK with only one switch?


Notes:
The application we'll be running is NOAA's WAVEWATCH III, in case anyone has
any experience with it.
It will generate a fair amount of NFS traffic (each node must read a common
set of data at periodic intervals),
and I *believe* that the MPI traffic is not extremely heavy or constant
(i.e., nodes do large amounts of independent processing before sending
results back to the master).
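
For what it's worth, a rough way to frame the NFS side of the question is
sketched below in Python. It assumes that the NFS server's single gigabit
link is the bottleneck and is shared evenly by all nodes. The 500 MB dataset
size and the 70% usable-bandwidth figure are made-up placeholders, not
measurements from this hardware or from the application, and the function
name is just for illustration.

```python
# Back-of-envelope estimate only. All numbers are placeholder assumptions,
# not measurements from the cluster or from the application.

def shared_read_seconds(nodes, dataset_mb, server_link_mbit=1000.0, efficiency=0.7):
    """Estimate the time for every node to read the same dataset from one
    NFS server.

    Assumes the server's single network link is the bottleneck, that it is
    shared evenly between nodes, and that only `efficiency` of the nominal
    line rate is usable for payload.
    """
    usable_mb_per_s = server_link_mbit / 8.0 * efficiency  # megabits -> megabytes
    total_mb = nodes * dataset_mb  # each node pulls its own copy over the wire
    return total_mb / usable_mb_per_s


if __name__ == "__main__":
    for nodes in (5, 6):
        t = shared_read_seconds(nodes, dataset_mb=500.0)
        print(f"{nodes} nodes x 500 MB: about {t:.0f} s per read cycle")
```

If an estimate like this comes out small compared with the interval between
reads, a single switch and a single NIC per node look less worrying; if not,
the server's link is the first thing to look at.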


I'd appreciate any help or feedback anyone would be willing and able to offer...

Thanks,
P.Romero



