[Beowulf] Altix vs. Beowulf

Vincent Diepeveen diep at xs4all.nl
Wed Apr 20 04:02:08 PDT 2005

My experience is that when Altix has NUMAlink3, the latencies even at 64
processors are about 3-4 us, as you can see in the report of Prof. Aad v/d
Steen, who tested it for the Dutch government.

Any cluster gives that too for a far smaller price.

The only advantage of altix is that it has a shared memory model. 

So if you use shared memory on up to about 16 CPUs, it's a fine machine.

Above that I would definitely consider getting a cluster with a good network.

Please note that several network cards also have shared memory built in on
the card. Quadrics, for example, has a shmem library at a really low cost for 8

If you buy an 8-node Quadrics network for $13k and 8 quad Opterons for $9000,
then for far under $100k you have yourself a 32-processor machine, which
you can very cheaply upgrade to dual core at the start of 2006 too, so you
always run with the latest processors.
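
As a rough check on that arithmetic, here is a minimal sketch. It assumes
the $9000 is the price of each quad Opteron node (the text does not say
whether it is per node or in total), and the names used are only for
illustration.

```python
# Rough cost sketch for an 8-node Quadrics network plus 8 quad Opteron nodes.
# ASSUMPTION: $9000 is the price of ONE quad Opteron node, not of all eight.
NETWORK_COST = 13_000   # 8-node Quadrics network, in dollars
NODE_COST = 9_000       # one quad Opteron node, in dollars (assumed per node)
NODES = 8
CPUS_PER_NODE = 4       # a quad has 4 processors


def cluster_cost(nodes=NODES, node_cost=NODE_COST, network_cost=NETWORK_COST):
    """Total hardware cost: the network plus all nodes."""
    return network_cost + nodes * node_cost


total = cluster_cost()
processors = NODES * CPUS_PER_NODE
cost_per_processor = total / processors

print(f"total: ${total}, processors: {processors}, "
      f"per processor: ${cost_per_processor:.2f}")
```

Under that assumption the total comes to $85k for 32 processors, which is
under $100k as stated. If the $9000 covers all eight nodes, the total would
be much lower still.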

You won't find anything faster than that for the average integer program.

To get Diep fast on an Itanium2, I needed to do 24 hours of PGO
(profile-guided optimization). Even then my integer program had the same
speed as on an Opteron at 1.3 GHz, a K7 at 2.0 GHz, or a P4 at 3.2 GHz
(Prescott core).

The majority of jobs will probably run very nicely within 1 node on a quad,
and when using Quadrics the freak users can speed things up a lot using
shmem, whereas when I tested it at MPI speeds it was very fast too. Faster
than Myri, for example.

For really low latency between only a very few nodes, Dolphin seems to have
an excellent network.

For reasonable latency and cheap network cards, Myri is a good choice.

Any of those 3 brands will get you roughly around or above 1 gigabyte a

I hope you realize that, despite all kinds of fairy tales, the difficult
routing system in the Altix 3000 has a few disadvantages, and I wonder
whether at each node (which is a dual Itanium2; a brick is 4 CPUs) you can
PRACTICALLY get stream bandwidth to it (considering the SHub limitations).

Reads can be done simultaneously, but I suppose that you want to write a
lot, and that can't be done simultaneously.

In Diep, reads are most important. The previous hub version, as you can read
on the SGI homepage, has a theoretical limit of 680 MB/s (that's to a quad;
Altix uses duals).

Scheduling is a big problem on Altix when running several jobs
simultaneously on it, because different routers connect to the same brick
(4 processors).

It's basically a good machine if you have the money and if you want to run
on up to about 16 CPUs and need shared memory without using MPI calls, but
using for example the C commands:

Above 16 CPUs I'd go for a cluster.

At 03:32 PM 4/15/2005 -0400, Peng Li wrote:
>We might get a grant for a new cluster/HPC server in a few months. The
>cluster/server will be mostly used to run MPI, PVM programs. We have a
>home-built 16-node P3 cluster running Rocks 3.3.0 and an Aspen cluster
>running Redhat 9.0. We are not quite satisfied with the build quality and
>reliability of the current clusters.
>Any experience with SGI Altix? Does Altix have real advantages over
>Beowulf? Does anyone have any recommendations for good cluster vendors?