[Beowulf] Infiniband modular switches


Gilad Shainer Shainer at mellanox.com
Thu Jun 26 17:16:52 PDT 2008


 
Patrick Geoffray wrote:

> > There are cases where adaptive routing will show a benefit, and this is
> > why we see the IB vendors add adaptive routing support as well. But in
> > general, the average effective bandwidth is much much higher than the
> > 40% you claim.
> 
> Have a look at the slides 17 and 19 of the following set of 
> slides (and slides 21 and 22 to illustrate my point above):
> http://www.openib.org/archives/spring2007sonoma/Monday%20April%2030/Leininger-Seager-Adaptive-Routing-OFA-Sonoma-2007-v03.pdf
> 

Not only was I there, I also had conversations afterwards. It is hardly
a "fair" comparison when the injection rate and network capacity
parameters differ. You could just as well take 10 Mb and inject it into
a 10 Gb/s network to show the same thing, and you can always construct a
traffic pattern that shows what you want to show, but that proves
nothing. I am not in favor of static routing only or adaptive routing
only; having both options is the most flexible solution.


> Hoefler et al. have shown an average effective bisection of 
> ~40% on Infiniband (OMNeT simulations) in a paper submitted 
> to Cluster2008. In a paper to be presented at Hot 
> Interconnects this year, I have measured the effective 
> bisection (SendRecv on random pairs) on a 512-node Myri-10G 
> cluster (single enclosure, 32-port crossbars) under various 
> routing implementations. Below is the link to pretty graphs 
> with static and probing adaptive routing:
> http://patrick.geoffray.googlepages.com/staticvsadaptiverouting
> 
> You can see that the worst case static routing goes quickly 
> below 40%, but the average eventually goes there as well.
> 

So what is your proof point here? I am sure you will find many cases
where static routing does better (definitely on other interconnects)
and cases where adaptive routing does.
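
For readers following the argument, the "effective bisection on random pairs" figure being debated can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not the OMNeT simulation or the Myri-10G measurement cited above: it assumes a two-level full-bisection fat tree, static routing that picks the spine from the destination id, and a simple bandwidth model in which each flow gets 1/(load of its most congested link). All function and parameter names are hypothetical.

```python
import random
from collections import Counter

def effective_bisection(num_leaves=8, hosts_per_leaf=8, trials=200, seed=0):
    """Mean per-flow bandwidth fraction for random host pairs (toy model).

    Assumed topology: num_leaves leaf switches, each with hosts_per_leaf
    hosts and one uplink to each of hosts_per_leaf spine switches.
    Assumed static routing: spine index = destination id % hosts_per_leaf.
    """
    rng = random.Random(seed)
    n = num_leaves * hosts_per_leaf
    half = n // 2
    total = 0.0
    for _ in range(trials):
        hosts = list(range(n))
        rng.shuffle(hosts)
        # Random pairing; each pair exchanges traffic in both directions.
        flows = []
        for a, b in zip(hosts[:half], hosts[half:]):
            flows.append((a, b))
            flows.append((b, a))
        load = Counter()
        paths = []
        for src, dst in flows:
            s_leaf = src // hosts_per_leaf
            d_leaf = dst // hosts_per_leaf
            if s_leaf == d_leaf:
                paths.append([])  # stays inside the leaf crossbar
                continue
            spine = dst % hosts_per_leaf  # fixed by destination (static)
            links = [("up", s_leaf, spine), ("down", spine, d_leaf)]
            for link in links:
                load[link] += 1
            paths.append(links)
        # Simplified model: a flow is limited by its most loaded link.
        fractions = [1.0 / max((load[l] for l in p), default=1) for p in paths]
        total += sum(fractions) / len(fractions)
    return total / trials

if __name__ == "__main__":
    print(f"effective bisection fraction: {effective_bisection():.2f}")
```

The result depends heavily on the chosen topology, routing function, and traffic pattern, which is the point of contention in this thread; changing any of the assumed parameters changes the number.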


> > There are some vendors that use only the 24 port switches to build
> > very large scale clusters - 3000 nodes and above, without any
> > oversubscription, and they find it more cost effective. Using single
> > enclosures is easier, but the cables are not expensive and you can use
> 
> Price of cables usually depends on the length (copper and 
> fiber). Using small switches at the edges allows the use of very 
> short cables to the hosts (in-rack) but you still have to use the 
> same number of longer 
> cables to connect to the spine. With a single enclosure, you 
> may need longer cables to reach the hosts (different rack), 
> but you don't need cables to the spine as they are on the 
> switch backplane (and PCB is free). Short cables may not be 
> expensive, but they are not free. Furthermore, physical 
> cables are much less reliable than wire on PCB, and they take 
> more space, more power.
> 


Again, case by case. You can build a large cluster with very short
cables. Some vendors find that better and some prefer to use large
switches - the largest one is the 3456 port switch from Sun - used in
the #4 system on the Top500 (TACC).
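
The switch and cable counts behind this trade-off can be worked out with the standard sizing formulas for a non-oversubscribed fat tree. The sketch below is illustrative only: it assumes a full-bisection fat tree in which every switch has the same port count and half of each lower-level switch's ports face upward. The function name is hypothetical, and nothing here says how any particular vendor's product is built internally.

```python
def fat_tree_size(ports, levels):
    """Sizes of a full-bisection fat tree built from identical switches.

    Returns (hosts, switches, inter_switch_links). Assumes half of each
    switch's ports face down and half face up, except at the top level.
    """
    half = ports // 2
    hosts = 2 * half ** levels
    switches = (2 * levels - 1) * half ** (levels - 1)
    # Without oversubscription, each level boundary carries one link per host.
    inter_switch_links = (levels - 1) * hosts
    return hosts, switches, inter_switch_links

if __name__ == "__main__":
    for levels in (2, 3):
        hosts, switches, links = fat_tree_size(24, levels)
        print(f"{levels} levels of 24-port switches: {hosts} hosts, "
              f"{switches} switches, {links} inter-switch links")
    # 2 levels: 288 hosts, 36 switches, 288 inter-switch links
    # 3 levels: 3456 hosts, 720 switches, 6912 inter-switch links
```

Under these assumptions, three levels of 24-port switches top out at 3456 hosts, the same as the port count of the large switch mentioned above. The inter-switch links are the ones that are either external cables (small edge switches) or backplane traces (single enclosure).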






