[Beowulf] Connecting two 24-port IB edge switches to core switch:extra switch hop overhead
iioleynik at gmail.com
Wed Feb 11 07:44:23 PST 2009
Thanks for your reply. As I explained in my original email, a 48-port IB
switch would be ideal because the jobs on these 36 nodes will mostly be run
locally within the 36-node complex. However, a 48-port IB switch is too
expensive, which is why I am considering alternative cost-effective
solutions. I think the major pattern of the load will be a bunch of 32-64 CPU
jobs at maximum, i.e. each of them can fit into 4-8 nodes. These jobs are MPI
applications; therefore, they require the best bandwidth-latency
On Tue, Feb 10, 2009 at 3:55 PM, Nifty Tom Mitchell
<niftyompi at niftyegg.com> wrote:
> With 36 hosts the best connectivity between the 36 hosts will be with a
> single 48 port switch and use many of the extra ports to link to the
> It is not insane to cost out and plan a fabric design with two, three or
> 24 port switches including cables. Three times 24 (or more) switch
> designs can make it clear what value a big switch brings to your game.
> What we do not know is the job mix and the demands that mix will place
> on the fabric. If the job mix is 99% 32 host jobs that are bandwidth
> and latency limited then the big switch may quickly show its value.
> If the mix is lots of 23 or less host jobs then 24 port switch solutions
> will behave nearly ideally.
> Having looked at a lot of university clusters, lots of 23 or less host
> jobs seems like a common workload. Thus a pair of 24 port switches
> will be fine with the right job scheduling.
> My gut is that two 24 port switches that share five links, have
> 18 hosts per switch, and have the last two links connected to your
> existing fabric will operate quite well.
> One important IB cluster design point is the cable link lengths at fast
> rates. Smaller switches can be located to reduce host to switch and switch
> to switch link lengths. Also for fast link speeds watch: bend radius,
> cable quality and other cable management issues, they matter.
> On Tue, Feb 10, 2009 at 12:01:26AM -0500, Ivan Oleynik wrote:
> > It would be nice to have non-blocking communication within the entire
> > system but the critical part is the 36-node complex to be connected to
> > the main cluster.
> > On Mon, Feb 9, 2009 at 1:33 AM, Gilad Shainer
> > <Shainer at mellanox.com> wrote:
> > Do you plan to have full non-blocking communication between the next
> > systems and the core switch?
> > __________________________________________________________________
> > From: beowulf-bounces at beowulf.org
> > [mailto:beowulf-bounces at beowulf.org] On Behalf Of Ivan Oleynik
> > Sent: Sunday, February 08, 2009 8:20 PM
> > To: beowulf at beowulf.org
> > Subject: [Beowulf] Connecting two 24-port IB edge switches to core
> > switch:extra switch hop overhead
> > I am purchasing a 36-node cluster that will be integrated into an
> > already existing system. I am exploring the possibility of using two
> > 24-port 4X IB edge switches in a core/leaf design that have a maximum
> > capability of 960 Gb/s (DDR)/480 Gb/s (SDR). They would be connected
> > to the main Qlogic Silverstorm switch.
> > I would appreciate receiving some info regarding the communication
> > overhead incurred by this setup. I am trying to minimize the cost of
> > communication hardware. It looks like buying a single 48-port switch
> > is a really expensive option.
> > Thanks,
> > Ivan
> T o m M i t c h e l l
> Found me a new hat, now what?
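
For reference, the arithmetic behind the two-switch layout suggested in the quoted reply (18 hosts per switch, five shared links, two links to the existing fabric) can be sketched roughly as below. This is only a back-of-the-envelope check, not a fabric design tool. The function names are made up for illustration, the split of the two uplinks as one per switch is an assumption, and the 10/20 Gb/s per-direction 4X signalling rates for SDR/DDR are the figures assumed to be behind the 480/960 Gb numbers in the original post.

```python
# Rough port-budget check for two 24-port IB edge switches.
# All names here are hypothetical; the numbers come from the thread.

PORTS_PER_SWITCH = 24
HOSTS_PER_SWITCH = 18      # 36 hosts split evenly over two switches
INTER_SWITCH_LINKS = 5     # links shared between the two edge switches
UPLINKS_TOTAL = 2          # links to the existing fabric (assumed 1 per switch)


def port_budget(ports, hosts, isl, uplinks_total, switches=2):
    """Return used and free ports on each edge switch."""
    uplinks_per_switch = uplinks_total // switches
    used = hosts + isl + uplinks_per_switch
    return {"used": used, "free": ports - used}


def oversubscription(hosts, isl):
    """Hosts per inter-switch link: worst case when every host on one
    switch talks to a host on the other switch at the same time."""
    return hosts / isl


def aggregate_gbps(ports, link_gbps):
    """Aggregate switch capacity, counting both directions of each port."""
    return ports * link_gbps * 2


if __name__ == "__main__":
    print(port_budget(PORTS_PER_SWITCH, HOSTS_PER_SWITCH,
                      INTER_SWITCH_LINKS, UPLINKS_TOTAL))
    print(oversubscription(HOSTS_PER_SWITCH, INTER_SWITCH_LINKS))
    print(aggregate_gbps(PORTS_PER_SWITCH, 20))  # 4X DDR, assumed 20 Gb/s
    print(aggregate_gbps(PORTS_PER_SWITCH, 10))  # 4X SDR, assumed 10 Gb/s
```

Under these assumptions every port on both switches is used, and cross-switch traffic is oversubscribed by 18:5. That only matters for jobs that span both switches, so a scheduler that keeps 4-8 node jobs on a single switch would avoid the extra hop for most of the expected load.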