[Beowulf] Cluster Diagram of 500 PC

Robert G. Brown rgb at phy.duke.edu
Sun Jul 8 15:18:39 PDT 2007


On Fri, 6 Jul 2007, Mostafiz wrote:

>  Dear Sir,
>
>  We want to set up a cluster of 500 PCs in the following configuration:
>  Intel Dual Core 1.88 GHz, 2 MB cache
>  4 GB DDR-2 RAM
>  2 X 80 GB
>
>  How do we connect these computers, and how many should be defined as masters?
>  How many switches do we need to connect them?
>  How should power connections be provided?
>  How do we start and stop all nodes from a remote computer?
>  How do we ensure fault-tolerant network connectivity?
>  We want to use Windows XP or Windows 2003 as the OS. If performance is better, CentOS or Red Hat Linux (RHL) may be selected instead.
>  Please advise us and help me by providing a network diagram of the system.

Dear Mostafiz,

First, there is a free online book on cluster engineering here:

   http://www.phy.duke.edu/~rgb/Beowulf/beowulf_book.php

that you might want to read over to get yourself familiar with the
concepts and design constraints.

Second, the standard answer to all of your questions above is:  "First
think about your application(s), THEN engineer your cluster."  In other
words, don't pick your node hardware first.  The correct way to engineer
a cost-benefit optimal cluster that does the most work for the least
up-front cost and long term administrative expense is to think about the
computations you want to perform, and then decide on the network and
compute hardware simultaneously.

Once the application mix is understood, most of your decisions above are
pretty obvious.  If your application is "real parallel software" using
MPI and with a large communication requirement between nodes, you will
need to invest much more heavily in network than in nodes -- fewer nodes
and an expensive network will get more work done than more nodes and a
cheap network, as your nodes will sit idle waiting on the network.  If
your application is a lot of very simple tasks that can run completely
independently and don't need a lot of access to disk or other nodes and
aren't all linear algebra (and hence memory intensive) then you want to
minimize investment in the network and maximize the compute capacity of
your nodes (which might or might not be achieved with Intel CPUs,
depending on the code).
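
To make the trade-off above concrete, here is a minimal sketch of a toy
cost model in Python.  Everything in it is an assumption made for
illustration: the function names, the "compute, then one exchange that
cannot overlap" model, and all of the numbers (node speed, bandwidth,
latency, message size).  It is not a benchmark of any real hardware;
you would plug in figures measured from your own application.

```python
def step_time(work_flops, nodes, node_flops,
              msg_bytes, bandwidth_bytes_per_s, latency_s):
    """Toy wall-clock estimate for one step of a parallel job.

    Assumes perfect load balance and that each step does its compute,
    then one message exchange that cannot overlap with the compute.
    """
    compute = work_flops / (nodes * node_flops)
    # No data to exchange means no network cost at all in this model.
    communicate = (latency_s + msg_bytes / bandwidth_bytes_per_s) if msg_bytes else 0.0
    return compute + communicate


# Two hypothetical designs at roughly the same budget (made-up numbers):
# many nodes on a cheap network vs. half as many nodes on a fast one.
CHEAP_NET = dict(nodes=500, node_flops=4e9,
                 bandwidth_bytes_per_s=1.25e7, latency_s=1e-4)
FAST_NET = dict(nodes=250, node_flops=4e9,
                bandwidth_bytes_per_s=1.25e9, latency_s=1e-5)


def compare(work_flops, msg_bytes):
    """Return (cheap_network_time, fast_network_time) for one step."""
    return (step_time(work_flops, msg_bytes=msg_bytes, **CHEAP_NET),
            step_time(work_flops, msg_bytes=msg_bytes, **FAST_NET))


if __name__ == "__main__":
    # Tightly coupled job: a large exchange between nodes every step.
    print("tightly coupled:", compare(1e12, 1e8))
    # Independent tasks: no communication between nodes at all.
    print("independent:    ", compare(1e12, 0))
```

With these made-up inputs the fast-network design wins for the tightly
coupled job and the many-node design wins for the independent tasks,
which is the whole point: the ranking depends on the application, not
on the hardware alone.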

The main thing is to COMPLETELY UNDERSTAND the problem(s) you wish to
run and how they will fit onto the hardware you will use before making
any definitive choices regarding that hardware.

Incidentally, I would personally strongly advise you against building a
Windows cluster.  First of all, it will cost vastly more money as you
add several hundred dollars in completely unnecessary software cost per
node -- for 500 nodes at $200 each for XP-Pro that's an extra $100,000
right there, and even at $20 per node $10,000 is still far too much.
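
The arithmetic behind those figures is just nodes times price per node.
As a small sketch (the function name is hypothetical; the prices are
the ones quoted above, plus zero for a free OS):

```python
def license_cost(nodes, price_per_node):
    """Total up-front OS licensing cost in dollars."""
    return nodes * price_per_node


# Per-node prices: the XP-Pro figure, a discounted figure, and a free OS.
for price in (200, 20, 0):
    print(f"${price}/node x 500 nodes = ${license_cost(500, price):,}")
```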

Linux is free and is VASTLY more efficient.  It is also far more
flexible regarding cluster configuration.  Finally, nearly everyone on
this list runs and builds Linux clusters, for good reasons.  I
occasionally have to deal with Windows in SMALLISH client/server LAN
environments, and believe me, it is nightmarish compared to Linux.

Good luck.  You can probably find consultants on this list to help you
with the above design process at a fairly reasonable price, at least if
your project isn't classified (so that you can't actually TELL us what
you're going to use the cluster for...;-).  If it is classified -- which
I imagine isn't that unlikely -- you'll simply have to develop the local
expertise to answer the right questions on your own to guide your design
process.  My book should help -- and the list will generally answer
sufficiently specific questions that HAVE an answer...

   rgb

>
>  Regards.
>
>  Mostafiz.
> Lt Col
> IT directorate
> Bangladesh
> Dhaka-1206
> Tel: 880-2-8752348
> Cell: 880-1711431880

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
