[Beowulf] Correct networking solution for 16-core nodes

Kevin Ball kball at pathscale.com
Fri Aug 4 10:25:07 PDT 2006


Gilad,

> 
> There was a nice debate on message rate: how important this factor is
> when you want to make a decision, what the real application needs are,
> and whether it is just marketing propaganda. For sure, the message rate
> numbers listed on Greg's web site regarding other interconnects are
> wrong.
> 
> I would take a look at the new cluster at the Tokyo Institute of
> Technology. The servers there are "fat nodes" too.

I took a look at the performance results posted for the TITECH cluster
(you can find them at
http://www.gsic.titech.ac.jp/%7eccwww/tgc/bm/index.html).

While they don't report anything that directly measures message rate,
they do report HPCC numbers, and their random ring latencies show that
interconnect latency does not scale well with added cores per node.
They report (over 648 nodes) the following times:

         Random Ring Latency
1ppn     13.64 usec
2ppn     23.91 usec
4ppn     44.21 usec
8ppn     74.77 usec
16ppn    131.5 usec
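
  A quick calculation makes the trend in those numbers easier to see.
The little C snippet below is only illustrative; it hard-codes the
figures quoted above and prints how the random ring latency grows
relative to the 1 ppn case:

#include <stdio.h>

/* Random ring latencies (usec) reported for the TITECH cluster,
   indexed by ranks per node (ppn). */
int main(void)
{
    const int    ppn[] = { 1, 2, 4, 8, 16 };
    const double lat[] = { 13.64, 23.91, 44.21, 74.77, 131.5 };
    const int    n     = sizeof(ppn) / sizeof(ppn[0]);

    printf("%5s %14s %12s\n", "ppn", "latency (usec)", "vs. 1 ppn");
    for (int i = 0; i < n; i++)
        printf("%5d %14.2f %11.2fx\n", ppn[i], lat[i], lat[i] / lat[0]);
    return 0;
}

  The latency grows by 1.7x to 1.85x every time the ranks per node
double, ending up nearly 10x the 1 ppn figure at 16 ppn, which is what
I mean by "does not scale well".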

  One downside of using this cluster (particularly with the currently
reported results) to analyze interconnect-related questions is that its
architecture oversubscribes the top tier of the switching network by a
factor of 5:1.  Despite this, many of the reported results use the whole
cluster.  Seeing results on sub-clusters of 120 compute nodes with full
bisection bandwidth would probably be more interesting.
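
  As a back-of-the-envelope illustration of what that 5:1 figure means
(the per-node link rate below is a made-up placeholder, not TITECH's
actual number): if every node tries to send across the top tier at
once, each node's share of the bisection is at most 1/5 of its
injection bandwidth.

#include <stdio.h>

int main(void)
{
    /* Placeholder injection bandwidth per node; NOT the actual
       TITECH link rate, just a number to plug in. */
    const double link_gb_s = 1.0;   /* GB/s per node */
    const double oversub   = 5.0;   /* 5:1 oversubscription at the top tier */

    /* Worst case: all traffic crosses the oversubscribed top tier. */
    printf("bisection share per node: %.2f GB/s of %.2f GB/s injection\n",
           link_gb_s / oversub, link_gb_s);
    return 0;
}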

  It is also interesting to look at their results within a node.  They
report on the various NAS parallel benchmarks, and from both those and
the HPCC results it appears to me that within-node scaling works well up
to 4 ranks/node, flattens out or starts falling over at 8 ranks/node,
and really starts to hurt at 16.  It looks to me like 8 ranks/node is
probably the most anyone running HPC applications should be looking for,
and that 4 is probably the sweet spot, especially if the interconnect
can handle it.
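
  If anyone wants to poke at this on their own machines (say on a
smaller sub-cluster with full bisection bandwidth, as suggested above),
a simplified ring latency micro-benchmark along the lines of what HPCC
measures might look like the sketch below.  To be clear, this is not
the HPCC code (it uses the natural rank order rather than a random
permutation), but it shows the idea: each rank exchanges a small
message with its ring neighbours and the worst per-iteration time is
reported.  Run it with varying numbers of ranks per node to see the
effect discussed above.

#include <mpi.h>
#include <stdio.h>

/* Simplified ring latency sketch (natural ring order, not the randomly
   permuted ring HPCC uses).  Each rank repeatedly exchanges an 8-byte
   message with its ring neighbours; rank 0 reports the worst average
   time per iteration across all ranks. */
int main(int argc, char **argv)
{
    int rank, size, iters = 1000;
    double t0, t1, local, maxt;
    char sendbuf[8] = {0}, recvbuf[8];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        /* send to the right neighbour, receive from the left */
        MPI_Sendrecv(sendbuf, 8, MPI_BYTE, right, 0,
                     recvbuf, 8, MPI_BYTE, left, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    t1 = MPI_Wtime();

    local = (t1 - t0) / iters * 1e6;   /* usec per iteration on this rank */
    MPI_Reduce(&local, &maxt, 1, MPI_DOUBLE, MPI_MAX, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("ring latency (max over ranks): %.2f usec\n", maxt);

    MPI_Finalize();
    return 0;
}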

-Kevin


> 
> Gilad.
> 