[Beowulf] Re NiftyOMPI Tom Mitchell <niftyompi at niftyegg.com>

Greg Keller Greg at keller.net
Wed Aug 5 08:41:23 PDT 2009


A 3rd option: upgrade your chassis to 288 ports.  The beauty of
SS/Qlogic switches is that they all use the same components.  The
chassis and backplane are relatively dumb and cheap.  You can re-use
your spine switches and leaf switches.  You don't even need to add the
additional spine switches if 2:1 blocking is OK.
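
As a rough check on that 2:1 figure, here is a minimal sketch.  It
assumes (my assumption, not something taken from a datasheet) that the
spine modules carried over from the 144-port chassis provide
non-blocking capacity for exactly 144 external ports, and that
blocking is simply external ports divided by that capacity.  The
function name is made up for illustration.

```python
def chassis_blocking(external_ports, spine_capacity_ports):
    """Blocking ratio as external ports per port of spine capacity.

    Assumption: spine capacity is expressed as the number of external
    ports the installed spines can serve without blocking.
    """
    return external_ports / spine_capacity_ports


# Existing 144-port chassis with its full set of spines.
current = chassis_blocking(144, 144)

# 288-port chassis populated only with the spines from the old chassis.
upgraded = chassis_blocking(288, 144)

print(f"144-port chassis, original spines: {current:.0f}:1")
print(f"288-port chassis, reused spines:   {upgraded:.0f}:1")
```

Adding the remaining spine modules later would bring the second
figure back down toward 1:1 under the same assumption.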

Be very careful which ports you use to link the switches together if
you do try to splice 2 chassis together.  SMs can have trouble
mapping many configurations, and you're probably best off dedicating
whole line cards as either "Uplink" or "Compute" (don't mix and match
on one card), if I recall the layouts correctly.  With these
"multi-tiered" switches the SM sometimes can't figure out which way is
up if you mix the ports.

A 4th Option:  36 Port QDR + DDR
Also note that the QDR switches are based on 36 port chips and not a  
huge price jump (per port), so with a "Hybrid" cable for the uplinks,  
you may be able to purchase the newer technology and block the heck  
out of it.  So adding 48 additional nodes could be as easy as:

Disconnect 48 nodes for uplinks from the core switch
Connect 4 x 36 port QDR with 12 uplinks to each
Connect 48 old, and 48 new nodes to the 36 port QDR "edge"
This leaves you with 96 nodes on each side of a 48 port
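
The port counts in the steps above can be sanity-checked with a short
sketch.  It counts links only and ignores the QDR vs DDR rate
difference on the uplinks, so treat the blocking figure as a link
ratio rather than a bandwidth ratio.  The function and key names are
made up for illustration.

```python
def edge_layout(switches, ports_per_switch, uplinks_per_switch):
    """Port budget for a set of identical edge switches.

    Every port on an edge switch that is not an uplink is assumed to
    be available for a compute node.
    """
    node_ports_per_switch = ports_per_switch - uplinks_per_switch
    return {
        "core_ports_used": switches * uplinks_per_switch,
        "node_ports": switches * node_ports_per_switch,
        "blocking": node_ports_per_switch / uplinks_per_switch,
    }


# 4 x 36-port QDR edge switches with 12 uplinks each, as listed above.
layout = edge_layout(switches=4, ports_per_switch=36, uplinks_per_switch=12)

print("core ports used for uplinks:", layout["core_ports_used"])
print("node ports on the edge:     ", layout["node_ports"])
print("link-count blocking:         %.0f:1" % layout["blocking"])
```

The 48 core ports freed by moving 48 nodes become the uplinks, and the
96 edge ports hold the 48 relocated nodes plus the 48 new ones.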

Option 3 is the cleanest, and generally my favorite if you can get a
chassis for a reasonable price.


> Date: Mon, 3 Aug 2009 22:29:50 -0700
> From: NiftyOMPI Tom Mitchell <niftyompi at niftyegg.com>
> Subject: Re: [Beowulf] Fabric design consideration
> To: "Smith, Brian" <brs at admin.usf.edu>
> Cc: "beowulf at beowulf.org" <beowulf at beowulf.org>
> Message-ID:
> 	<88815dc10908032229n35dc509clba0b1a52ab6af8f1 at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
> On Thu, Jul 30, 2009 at 8:18 AM, Smith, Brian<brs at admin.usf.edu>  
> wrote:
>> Hi, All,
>> I've been re-evaluating our existing InfiniBand fabric design for  
>> our HPC systems since I've been tasked with determining how we will  
>> add more systems in the future as more and more researchers opt to  
>> add capacity to our central system.  We've already gotten to the  
>> point where we've used up all available ports on the 144 port  
>> SilverStorm 9120 chassis that we have and we need to expand  
>> capacity.  One option that we've been floating around -- that I'm  
>> not particularly fond of, btw -- is to purchase a second chassis  
>> and link them together over 24 ports, two per spine.  While a good  
>> deal of our workload would be ok with 5:1 blocking and 6 hops (3  
>> across each chassis), I've determined that, for the money, we're  
>> definitely not getting the best solution.
>> The plan that I've put together involves using the SilverStorm as  
>> the core in a spine-leaf design.  We'll go ahead and purchase a  
>> batch of 24 port QDR switches, two for each rack, to connect our  
>> 156 existing nodes (with up to 50 additional on the way).  Each  
>> leaf will have 6 links back to the spine for 3:1 blocking and 5  
>> hops (2 for the leafs, 3 for the spine).  This will allow us to  
>> scale the fabric out to 432 total nodes before having to purchase  
>> another spine switch.  At that point, half of the six uplinks will  
>> go to the first spine, half to the second.  In theory, it looks  
>> like we can scale this design -- with future plans to migrate to a  
>> 288 port chassis -- to quite a large number of nodes.  Also, just  
>> to address this up front, we have a very generic workload, with a  
>> mix of md, abinitio, cfd, fem, blast, rf, etc.
>> If the good folks on this list would be kind enough to give me your  
>> input regarding these options or possibly propose a third (or  
>> fourth) option, I'd very much appreciate it.
>> Brian Smith
> I think the hop count is a smaller design issue than cable length for
> QDR.  Cable length and the physical layout of hosts in the machine
> room may prove to be the critical issue in your design.  Also, since
> routing is static, some seemingly obvious assumptions about routing,
> links, cross sectional bandwidth and blocking can be non-obvious.
> Also less obvious to a group like this is your storage, job mix and
> batch system.
> For example, in a single rack with a pair of QDR 24 port switches, you
> might wish to have two or three links connecting those 24 port
> switches directly at QDR rates.  Then the remaining three or four
> links would connect (DDR?) back to the 144 switch.  If the batch
> system was 'rack aware', jobs that could run on a single rack would,
> and jobs that had ranks scattered about would see a lightly loaded
> central switch.
> Adding QDR to the mix as you scale out to 400+ nodes using newer
> multi-core processor nodes could be fun.
> When you knock on vendor doors ask about optical links...  QDR optical
> links may let you reach beyond some classic fabric layouts as your
> machine room and cpu core count grows.
> -- 
>        NiftyOMPI
>        T o m   M i t c h e l l
> ------------------------------
> Message: 2
> Date: Tue, 4 Aug 2009 14:48:21 -0400
> From: Brock Palen <brockp at umich.edu>
> Subject: [Beowulf] force factory rest of sfs7000 (topspin 120)
> To: Bewoulf <beowulf at beowulf.org>
> Message-ID: <635DE2F6-3A2C-4A58-91F1-072288667650 at umich.edu>
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
> We have a cisco sfs7000 (maybe still under support waiting on cisco)
> also known as a topspin 120, IB switch.
> We cannot login with the password we (thought) had it set to. I have
> looked online and find little tonight about forcing the switch back to
> factory defaults without a login.
> Serial console works fine, just can't login.  We can screw in firmware
> a little by stopping boot, just don't know what to do from there.  If
> anyone has directions how to force sfs7000 to factory defaults, or
> password recovery help would be great.
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> brockp at umich.edu
> (734)936-1985
> ------------------------------
> _______________________________________________
> Beowulf mailing list
> Beowulf at beowulf.org
> http://www.beowulf.org/mailman/listinfo/beowulf
> End of Beowulf Digest, Vol 66, Issue 3
> **************************************
