networking options

Richard Walsh rbw at ahpcrc.org
Fri Sep 27 07:38:46 PDT 2002


Patrick Geoffray wrote:

> Many people liked the T3E much more because of its software (parallel
> execution environment, compiler, tools, etc.) than its hardware. The
> exception in software was its MPI implementation, performing very poorly
> compared to its SHMEM interface. 

  Mmm ... well now Patrick, you are making MPI on the T3E look much 
  worse than it really is ...

  And SHMEM is a high bar to jump, no?  I just ran PMB on a T3E/1200 
  (1088 PEs) for you ... MPI ping-pong latency for messages under 512 
  bytes is less than 11 usecs (under 4 usecs for 1 byte) and bandwidth 
  exceeds 300 MB/sec for large messages ... in MPI.  Not exactly pedestrian 
  ... Our folks have not even bothered to convert to SHMEM because their 
  direct MPI compiles work fine (less than 10% communication cost at 
  1000 PEs).  I am sure there are codes that could use SHMEM's still lower
  latencies, but for our stuff T3E + 3D Torus + MPI is plenty good.
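
  (In case anyone wants to reproduce those numbers on their own machine:
  below is a minimal ping-pong sketch in the spirit of what PMB measures,
  not the actual PMB source. The 512-byte message size and repetition
  count are just illustrative; run it on exactly 2 ranks.)

  /* Minimal MPI ping-pong sketch: rank 0 and rank 1 bounce a message
   * back and forth.  Half the round-trip time approximates one-way
   * latency; bytes divided by one-way time approximates bandwidth.
   * Run with exactly 2 ranks, e.g. "mpirun -np 2 ./pingpong".       */
  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(int argc, char **argv)
  {
      const int reps = 1000;          /* repetitions per message size    */
      const int nbytes = 512;         /* message size; vary to sweep     */
      char *buf = malloc(nbytes);
      int rank, i;
      MPI_Status st;
      double t0, t;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      memset(buf, 0, nbytes);

      MPI_Barrier(MPI_COMM_WORLD);
      t0 = MPI_Wtime();
      for (i = 0; i < reps; i++) {
          if (rank == 0) {
              MPI_Send(buf, nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(buf, nbytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD, &st);
          } else if (rank == 1) {
              MPI_Recv(buf, nbytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &st);
              MPI_Send(buf, nbytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
          }
      }
      t = (MPI_Wtime() - t0) / (2.0 * reps);   /* one-way time, seconds */

      if (rank == 0)
          printf("%d bytes: %.2f usec one-way, %.2f MB/s\n",
                 nbytes, t * 1e6, nbytes / t / 1e6);

      free(buf);
      MPI_Finalize();
      return 0;
  }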

  The T3E is/was a uniquely successful combination of hardware and 
  software design. That systems have been sold even this year, almost 
  ten years after its original offering, is a testimony to this. I 
  would say it is a success because of:

  1. True single system image and full-featured operating environment.
  2. High reliability (>95% utilization of 1088 PEs/year).
  3. High bandwidth low-latency network.
  4. High scalability.

  Yes ... the processor is now slow ... primarily because SGI
  decided not to support the follow-on product (you can decide
  why) ... but imagine a similarly upgraded T3G(?) package with a 
  1000 MHz Alpha 21364 processor in each node.

  But that was then and this is now ...

> 
> The link bandwidth needs to be much more than the I/O bandwidth to be
> comfortable in a Torus (shared links). It will certainly happen in the
> future, but as the PCI bandwidth is increasing quickly (500 MB/s PCI, 1
> GB/s PCI-X, 2-4 GB/s PCI Express), it will be technologically hard (and
> expensive) to increase the link bandwidth X times as fast, IMHO.

  Yes, PCI 64/66 removed the PCI bus as the bottleneck in SCI Torus 
  interconnects. A PCI-X 64/100 running at, say, 75% efficiency offers:

   8 * 100 * .75  ==  ~600 MB/sec 

  I believe that on some motherboards, current SCI cards can deliver 70% of 
  this (~400 MB/sec).
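
  (Same back-of-the-envelope estimate as a few lines of C, purely for
  illustration; the bus width, clock, and efficiency figures are just
  the assumptions above, not measurements.)

  /* Back-of-the-envelope PCI-X bandwidth estimate using the numbers
   * quoted above: 64-bit (8-byte) bus, 100 MHz clock, assumed bus and
   * card efficiencies.  Purely illustrative.                         */
  #include <stdio.h>

  int main(void)
  {
      double bus_bytes = 8.0;     /* 64-bit PCI-X bus width in bytes     */
      double clock_mhz = 100.0;   /* PCI-X 64/100 clock                  */
      double bus_eff   = 0.75;    /* assumed usable fraction of the bus  */
      double card_eff  = 0.70;    /* fraction an SCI card might deliver  */

      double bus_mb  = bus_bytes * clock_mhz * bus_eff;   /* ~600 MB/s  */
      double card_mb = bus_mb * card_eff;                 /* ~420 MB/s  */

      printf("bus:  ~%.0f MB/sec\ncard: ~%.0f MB/sec\n", bus_mb, card_mb);
      return 0;
  }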

  rbw
#---------------------------------------------------
# Richard Walsh
# Project Manager, Cluster Computing, Computational
#                  Chemistry and Finance
# netASPx, Inc.
# 1200 Washington Ave. So.
# Minneapolis, MN 55415
# VOX:    612-337-3467
# FAX:    612-337-3400
# EMAIL:  rbw at networkcs.com, richard.walsh at netaspx.com
#         rbw at ahpcrc.org
#
#---------------------------------------------------
# "What you can do, or dream you can, begin it;
#  Boldness has genius, power, and magic in it."
#                                  -Goethe
#---------------------------------------------------
