Questions and Sanity Check
keithu at parl.clemson.edu
Tue Feb 27 06:48:59 PST 2001
I would use larger hard drives. The incremental cost from 10GB to 30GB
should be pretty small and you may one day appreciate that space if you
use something like PVFS. I would also consider a gigabit uplink to the
head node if you are going to use Scyld. Having a faster link to the
head node drastically improved our cluster's boot time.
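
A rough back-of-envelope sketch of why the uplink matters at boot. It assumes every compute node pulls its boot image from the head node at the same time, so the single head-node link is the bottleneck. The 128-node count is from the proposal below; the 10 MB image size and the 80% link efficiency are hypothetical figures chosen for illustration, not measurements.

```python
def transfer_seconds(image_mb, nodes, link_mbps, efficiency=0.8):
    """Seconds to push image_mb megabytes to each of `nodes` nodes
    through a single head-node link of link_mbps megabits per second.

    efficiency is an assumed fraction of the raw link rate that is
    actually usable (hypothetical value, not a measured one).
    """
    total_megabits = image_mb * 8 * nodes
    return total_megabits / (link_mbps * efficiency)


NODES = 128      # node count from the proposal
IMAGE_MB = 10    # hypothetical per-node boot image size

fast_ethernet = transfer_seconds(IMAGE_MB, NODES, link_mbps=100)
gigabit = transfer_seconds(IMAGE_MB, NODES, link_mbps=1000)

print(f"100 Mb/s uplink: {fast_ethernet:.1f} s")
print(f"1 Gb/s uplink:   {gigabit:.1f} s")
```

Under these assumptions the transfer time scales inversely with the uplink rate, so the absolute numbers matter less than the ratio between the two cases.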
On 26 Feb 2001, Ray Jones wrote:
> I'm involved in a group putting together a proposal for building a
> Beowulf for our research lab (www.merl.com). We feel like we've
> achieved a reasonable level of confidence in our design, but I wanted
> to run it past the list for a final sanity check, as well as tack on a
> few questions that we still need to answer.
> Hardware configuration questions:
> Our proposed system:
> 128 nodes - 64 RS-1200 servers from Racksaver
> Each node:
> 1 GHz AMD Thunderbird
> 10 GB IDE hard drive
> 512 MB CS2 memory
> Intel Etherexpress 8460 NIC
> D-Link DES-6000 w/ 8 6003 16-port blades
> OS: Scyld (most likely)
> I realize that it's an ill-formed question, but does anyone see
> anything horribly wrong with the above?
> Fuzzier, cost of ownership questions:
> We have about 10 researchers that would be interested in using the
> system. They fall almost exclusively into two categories:
> - Matlab users
> - Users with embarrassingly parallel problems (tree search, graphics
> rendering, ...)
> For the Matlab users, we plan to use Matlab*p (aka MITMATLAB, aka
> Parallel Problems Server) to provide them access to the system. The
> others will probably receive a bit of an introduction to MPI and a bit
> of handholding while they get used to running parallel batch jobs.
> How much they'll need is one of the questions below.
> Open questions for anyone with experience with supporting multiple
> user access to Beowulf systems. I realize most of these are even more
> vague than my question above, but any input (no matter how anecdotal)
> would be helpful.
> 1- How much scheduling will we have to do? Will we see a graceful
> degradation of the system if multiple users ignore each other and run
> their jobs simultaneously? How will this affect things like Matlab*p
> and ScaLAPACK?
> 2- How many people are we going to need to dedicate to the software
> side of maintaining the cluster and helping researchers solve their
> problems, given that most of them are either doing batch parallelism
> or using tools (Matlab*p) that just make things magically happen? Is
> it going to be a full-time job to support 10 researchers that don't want
> to learn parallel programming?
> Specific questions:
> From playing with our test system running Scyld, it looks like the
> root node is a compute node as well, and so should be made homogeneous
> with the cluster. However, I have not seen this stated explicitly
> anywhere. Is this the case?
> Does anyone have any comments on the Racksaver RS-1200 compute node,
> in the 2-Athlon in 1U configuration (or even the 2-Pentium in 1U
> config)? We like the node we have for testing, but wonder what life
> with 64 of them will be like.
> Ray Jones
Keith Underwood Parallel Architecture Research Lab (PARL)
keithu at parl.clemson.edu Clemson University