[Beowulf] New HPCC results and the Myri viewpoint

Wed Jul 20 22:37:27 PDT 2005

Hi Stuart,

Stuart Midgley wrote:
> Actually, I tend to disagree with your comment here. The curve tells
> you one of the characteristics of the network, which is VERY useful in
> evaluating a network before you expend time/effort testing your code on
> it (assuming you know your code well). On its own (without lots of
> other micro benchmarks) I agree that it is useless.

Yes, Keith noted that as well: it is useful for evaluating the receive
rate of an N-to-1 pattern. I meant that it is useless for optimizing the
send side in this case.

> In my own experience, I tend to find that most codes are not latency
> sensitive (that is, QsNetII, Infinipath, Myricom etc are effectively
> the same, in a latency sense, to most codes)... until they try and
> scale to the 1000's of cpu's. All of a sudden simple things like
> barriers and synchronisation etc can become expensive on networks with
> higher latencies. Things that the software writer wasn't expecting
> start to dominate their code. Hence, the ping-pong latencies and ring
> latencies are useful in giving you an idea of how well the larger codes
> will scale.

In my experience, the main source of delay at synchronization points
as the number of nodes increases is jitter between computation phases:
one node is late entering the collective and delays the whole
sub-tree. The other source is contention in the fabric, especially at
1000's of nodes, which ring latency tests don't really exercise.
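
To make the jitter point concrete, here is a toy sketch, not a
measurement. It assumes each node finishes its computation phase at a
uniformly random offset in [0, jitter_us] and that the collective can
only complete once the slowest node has entered. The function name, the
uniform distribution and the 50 us figure are all made up for
illustration; real jitter distributions and collective algorithms are
more complicated, and fabric contention is not modelled at all.

```python
import random


def mean_jitter_penalty(num_nodes, iterations=200, jitter_us=50.0, seed=0):
    """Average time (in us) a typical node idles waiting for the slowest one.

    Hypothetical model: every node enters the collective at a random
    offset in [0, jitter_us]; the collective completes only after the
    last node has arrived. Network latency is deliberately left out.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    total = 0.0
    for _ in range(iterations):
        arrivals = [rng.uniform(0.0, jitter_us) for _ in range(num_nodes)]
        # Idle time of the average node = slowest arrival - mean arrival.
        total += max(arrivals) - sum(arrivals) / num_nodes
    return total / iterations


if __name__ == "__main__":
    for n in (2, 16, 128, 1024):
        print(f"{n:5d} nodes: {mean_jitter_penalty(n):6.2f} us lost to jitter")
```

Under these assumptions the idle time grows with the node count and
levels off near jitter_us / 2, whatever the wire latency is. Whether
that or the network latency dominates on a real machine depends on how
large the jitter actually is.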

Ring latencies are a step in the right direction, but they are still
quite analytic IMHO.

Patrick
-- 

Patrick Geoffray
Myricom, Inc.
http://www.myri.com