[Beowulf] What is a "proper" machine count for a cluster


Mark Hahn hahn at mcmaster.ca
Thu Mar 15 09:22:12 PDT 2007


> Now comes the pitch.  The "sweet spot" in cluster nodes at the moment seems 
> to be a dual socket, dual core Opteron or Intel machine with 2 gigs of RAM 
> per core, so each box is a 4-way SMP
> (I will be flayed alive by the list for such cavalier numbers).

merely scourged ;)

I agree with the config you mention for many purposes.  I would almost
certainly consider using 4-core chips for a serial cluster, though I don't
know offhand whether dual-socket would be more cost-effective than single.

it's also worth pointing out that the first fork in the decision tree is 
really whether to use desktop or server components.  with servers, the 
sweet spot does tend towards 2-socket.  there are many attractive things 
about this approach: dual-gigabit is frequently built in, components are 
more robust (ECC memory, for instance), and you get important manageability 
features such as IPMI.

OTOH, there is some attraction to going cheap (arguably more beowulfy!)
by using desktop-grade single-socket boards/chips.  if you take this
approach, you're guided towards some other config details - you probably
won't get ECC, for instance (that saves some money, but the loss of error
correction should be weighed).  to be cost-effective, you also need to
choose a cheap chassis - desktop boxes piled on wire racks are reasonable,
though there are cheap 1U chassis as well.  such an approach is
cost-sensitive, so you want to configure with no more ram than necessary
and quite possibly no local disk.

> Now, such a box is likely to have oodles (scientific term) more processing 
> power than four pentiums.

just some blue-sky numbers: a 5-year-old machine will have individual 
performance figures which are about 1/4 of today's hardware.  that includes
interconnect (gigabit was uncommon then), memory bandwidth, peak flops,
on-chip cache, etc.  any real app will probably not pin _all_ the components,
but the fractions are multiplicative, so the delivered performance is some 
weighted _product_ of the factors.  I'd expect somewhere between 5:1 and 20:1.
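
to make the "weighted product" idea concrete, here is a rough sketch in
Python.  it is only one possible reading of the paragraph above: the post
does not say how the weights enter, so the exponent form (ratio ** weight),
the function name delivered_speedup, and the weight values are all
assumptions chosen for illustration.  only the ~4x per-component ratio comes
from the text.

```python
import math

# Per-component speedup of current hardware over a 5-year-old machine.
# The post's blue-sky figure is roughly 4x for each component.
RATIOS = {
    "interconnect": 4.0,
    "memory_bandwidth": 4.0,
    "peak_flops": 4.0,
    "cache": 4.0,
}

# Hypothetical weights (NOT from the post): how strongly one imaginary
# application depends on each component, 0 = not at all, 1 = fully bound.
WEIGHTS = {
    "interconnect": 0.25,
    "memory_bandwidth": 0.5,
    "peak_flops": 1.0,
    "cache": 0.0,
}


def delivered_speedup(ratios, weights):
    """Weighted product of per-component speedups: prod(ratio ** weight).

    Components missing from `weights` are treated as weight 0, i.e. they
    contribute a factor of 1 and do not affect the result.
    """
    return math.prod(ratios[name] ** weights.get(name, 0.0) for name in ratios)


if __name__ == "__main__":
    # 4 ** (0.25 + 0.5 + 1.0 + 0.0) = 4 ** 1.75, about 11.3
    print(f"estimated speedup: {delivered_speedup(RATIOS, WEIGHTS):.1f}:1")
```

under this model, with every ratio at 4x, the 5:1 to 20:1 range corresponds
to weights summing to roughly 1.16 and 2.16 respectively; an app that
leaned fully on all four components would come out at 4^4 = 256:1, which is
why no real app is expected to pin everything at once.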


