[Beowulf] Infiniband modular switches

Ramiro Alba Queipo raq at cttc.upc.edu
Fri Jun 13 08:19:28 PDT 2008

On Thu, 2008-06-12 at 10:36 -0400, Joe Landman wrote:
> Ramiro Alba Queipo wrote:
> > Hello everybody:
> > 
> > We are about to build an HPC cluster with an InfiniBand network, starting
> > with 22 dual-socket nodes with AMD quad-core processors; in a year or
> > so we will have about 120 nodes. We will be using InfiniBand both
> > for computation and for storage.
> Hi Ramiro:
>    You may experience some contention issues in this case if your code 
> is very latency sensitive, and you do lots of IO.

Our software is an in-house CFD code using MPI (LAM until now, Open MPI
from now on), but our solvers are neither very latency sensitive nor do
they do much IO at the moment, so I think it is a sensible decision for now.
What do you think?

> > The question is that we need a modular solution and we are having 3
> > candidates:
> > 
> > a) Voltaire Grid Director SDR or DDR 288 ports (9988 or 2012 models)->
> > seems very good and well supported, but very expensive.
> > 
> > b) Qlogic SilverStorm 9120 (144 ports) -> no price and support
> > information yet
> > 
> > c) Flextronics 10U 144 Port Modular -> very good price but little
> > support => risky option?
> The Flextronics units are Mellanox IP/chips inside (as are, I believe, 
> many/most of the others).  That is, the risk is low from a "will it 
> work" view.  Flextronics is an ODM, so they may not provide the levels 
> of support around the system that you might get with Voltaire et al.
> Do you want/need a 1:1 architecture (e.g. all ports are the same number 
> of switch hops from each other), or are you able/willing to look into 
> oversubscribed links?  Part of this has to do with your traffic 
> patterns, your code requirements on latency, and your storage bandwidth.

I do not know many details, but our current solvers are adapted to
live with high latencies. With InfiniBand we expect to scale better,
and we expect to run jobs on 500 or more cores (8 cores/node), though
not right now. In fact, we are doing some tests on the MareNostrum
supercomputer in Barcelona with about 1000 cores.
What worries me is whether, in the mid term, we will be limited by a
solution based on 24-port switches joined by, say, 4 ports (not a
fat-tree topology, which wastes a lot more ports), losing
latency and bandwidth that we will need when we have 130 ports at
the end of next year.
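
As a rough way to quantify this concern, here is a minimal Python sketch of
the oversubscription on one edge switch. It assumes the 24-port switch and
4 inter-switch ports mentioned above; the function name is hypothetical and
the model ignores routing and actual traffic patterns.

```python
def oversubscription(ports_per_switch: int, uplink_ports: int) -> float:
    """Ratio of host-facing ports to inter-switch ports on one edge switch.

    A value of 1.0 means every host port has a matching uplink port
    (non-blocking for this switch); larger values mean hosts share uplinks.
    """
    if not 0 < uplink_ports < ports_per_switch:
        raise ValueError("uplink_ports must be between 1 and ports_per_switch - 1")
    host_ports = ports_per_switch - uplink_ports
    return host_ports / uplink_ports


if __name__ == "__main__":
    # Assumed case from the text: 24-port switch, 4 ports used as uplinks.
    print(oversubscription(24, 4))   # 20 hosts share 4 links
    # Fat-tree style edge switch: half the ports go up.
    print(oversubscription(24, 12))  # 12 hosts, 12 uplinks
```

With 4 uplinks, 20 hosts share 4 links (5:1), while the fat-tree split of
12 down and 12 up gives 1:1 at the cost of half the ports.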

By the way:

a) How many hops does a Flextronics 10U 144 Port Modular make?
b) And the others?
c) How much latency do I lose at each hop? (In the case of Voltaire
switches: ISR 9024 - 24 ports: 140 ns; ISR 2004 - 96 ports: 420 ns)
d) Each port I use to connect one switch to another adds
its bandwidth to the total (20 Gb/s * 4 = 80 Gb/s when using 4 ports to
The alternatives are:

a) Start with a good 24-port switch and grow, losing latency and
b) Buy a 48- or 96-port switch, spending more money to have more ports at full
c) Use the Flextronics 10U 144 Port Modular solution, which will allow us
to scale well over a couple of years
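
To compare these alternatives against building a non-blocking fabric by hand,
here is a minimal sketch that counts the 24-port switches a two-level fat
tree would need. It assumes each edge switch uses half its ports for hosts
and half for uplinks, and it gives a lower bound that ignores cabling
constraints. The function name is hypothetical.

```python
import math
from typing import Tuple


def fat_tree_switch_count(hosts: int, radix: int = 24) -> Tuple[int, int]:
    """Return (edge_switches, spine_switches) for a two-level fat tree.

    Each edge switch connects radix/2 hosts and radix/2 uplinks; spine
    switches use all their ports for links to edge switches.
    """
    half = radix // 2
    if hosts < 1 or hosts > half * radix:
        raise ValueError("host count not supported by a two-level fat tree")
    if hosts <= radix:
        return 1, 0  # a single switch is enough, no spine needed
    edges = math.ceil(hosts / half)
    spines = math.ceil(edges * half / radix)
    return edges, spines


if __name__ == "__main__":
    for n in (22, 130, 144):
        edges, spines = fat_tree_switch_count(n)
        print(n, "hosts:", edges, "edge +", spines, "spine switches")
```

For 130 hosts this gives 11 edge and 6 spine switches, and for 144 hosts
12 edge and 6 spine, so a full fat tree from 24-port units needs a similar
number of switch elements to what a 144-port modular chassis would hold if
it were built the same way.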

Thanks for your answer

