about fast interconnects and SCI in particular
PFENNIGER Daniel
Daniel.Pfenniger@obs.unige.ch
Mon, 14 Jun 1999 13:03:51 -0400
On 14-Jun-99 at 18:23, Florent Calvayrac (fcalvay@aviion.univ-lemans.fr) wrote:
>
> I have been involved since October 1998 in the definition, fund raising
> and purchase of a cluster for computational physics purposes,
> and we are about to take a final decision on the nature of the cluster
> (processors and communications hardware).
>
> I had already asked the following question on comp.parallel last year,
> and got various and interesting answers, but am still in trouble :
>
> -----------------------------------------
> Considering a given total budget (around $100,000), is it better to spend
> nearly all of it on ultrafast communications hardware (say Myrinet or
> SCI) and then buy 16 CPUs, or to buy only a Fast Ethernet switch and 32
> or 64 (with SMP) faster processors?
>
> Since several users will be using the system, the needs for communications
> can not be estimated accurately.
>
> ------------------------------------------
> I include a summary of the most informative answers at the end of this
> posting.
>
...
We had a similar decision to make a year ago. Because the ratio of fine
grain to coarse grain computations was also not well defined, we spent
2/3 of the budget on a 66-node PII cluster with switched Fast Ethernet.
After experimenting for 6 months we can now see what we want to
enhance (in decreasing order of likelihood):
1) the number of processors (all the boards are dual)
2) the node RAM
3) the network (via channel bonding)
4) the hard disks
5) other features
Since component costs have meanwhile decreased at the standard rate
while our practical experience has increased, we can now evaluate much
better which parameter we want to double with the remaining funding.
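
As a rough illustration of the staged-purchase argument, here is a minimal
sketch. It assumes, purely for illustration, that the price of a given
capability halves every 18 months; the "standard rate" is not quantified
here, so the halving time and the function name are hypothetical.

```python
def purchasing_power(budget, months_waited, halving_months=18.0):
    """Capability bought after waiting, in units of day-one money.

    halving_months is an ASSUMED price-halving time, not a measured figure.
    """
    return budget * 2.0 ** (months_waited / halving_months)


reserve = 1.0 / 3.0   # fraction of the budget kept back after the first purchase
gain = purchasing_power(reserve, 6) / reserve   # effect of waiting 6 months
print(f"the reserve buys about {gain:.2f} times as much after 6 months")
```

Under that assumption the gain from waiting is modest; the larger benefit
is knowing from experience which component is actually the bottleneck.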
I would say that fine grain parallel problems are still best run
on traditional supercomputers, but a lot of applications that used to be
run on supercomputers can now be run just as well on Beowulfs for a
fraction of the cost.
Finally, the more simultaneous users are allowed, the
less one should invest in the network, for the obvious reason that
the different concurrent applications are independent of each
other. An expensive network is justified only if one must run
a single application across all the nodes at once.
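
The point about concurrent users can be sketched with a toy model. It
assumes that the nodes are split evenly among independent jobs and that,
in the worst case, every pair of nodes inside one job exchanges messages;
both assumptions and the function names are illustrative only.

```python
def nodes_per_job(total_nodes, concurrent_jobs):
    """Nodes available to each job when the cluster is split evenly."""
    return total_nodes // concurrent_jobs


def communicating_pairs(nodes):
    """Worst-case (all-to-all) number of node pairs inside one job."""
    return nodes * (nodes - 1) // 2


TOTAL_NODES = 66   # size of the cluster described above

for jobs in (1, 2, 6, 11):
    width = nodes_per_job(TOTAL_NODES, jobs)
    pairs = communicating_pairs(width)
    print(f"{jobs:2d} job(s): {width:2d} nodes each, {pairs:4d} pairs per job")
```

In this model the number of communicating pairs per job falls roughly
with the square of the number of concurrent jobs, which is why a shared
cluster puts far less pressure on the network than one job using all nodes.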
Daniel Pfenniger
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Daniel Pfenniger | Daniel.Pfenniger@obs.unige.ch
Geneva Observatory, University of Geneva | tel: +41 (22) 755 2611
CH-1290 Sauverny, Switzerland | fax: +41 (22) 755 3983
__________________________________________________________________________