[Beowulf] Questions about upgrading InfiniBand

Prentice Bisbal prentice at ias.edu
Wed Apr 18 12:02:26 PDT 2012


Aggregation spine? Can you tell me more about that? Can you give me a
part/model number?

Prentice 

On 04/18/2012 11:22 AM, Andrew Howard wrote:
> I would talk to Mellanox about your options for switch topology. We
> opted not to go with the single 648-port FDR director switch, but
> instead use top-of-rack leaf switches (the 36-port guys) and then an
> aggregation spine to connect those. It performs beautifully. It also
> means we don't have to worry about buying longer (more expensive)
> cables to run to the director switch; we can buy the shorter cables to
> run to the rack switch and then only have to buy a few 10 m cables to
> run to the spine.
>
> --
> Andrew Howard
> HPC Systems Engineer
> Purdue University
> (765) 889-2523
>
>
>
> On Wed, Apr 18, 2012 at 11:05 AM, Prentice Bisbal <prentice at ias.edu
> <mailto:prentice at ias.edu>> wrote:
>
>     Beowulfers,
>
>     I'm planning on adding some upgrades to my existing cluster, which has
>     66 compute nodes plus the head node. Networking consists of a Cisco
>     7012 IB switch with 6 out of 12 line cards installed, giving me a
>     capacity of 72 DDR ports, expandable to 144, and two 40-port ethernet
>     switches that have only six extra ports between them.
>
>     I'd like to add a Lustre filesystem (over InfiniBand)  to my cluster,
>     and then begin adding/replacing nodes in the cluster. Obviously, I'll
>     need to increase capacity of both my IB and ethernet networks. The
>     questions I have are about upgrading my InfiniBand.
>
>     1. It looks like QLogic is out of the InfiniBand business. Is Mellanox
>     the only game in town these days?
>
>     2. Due to the size of my cluster, it looks like buying just a
>     core/enterprise IB switch with capacity for ~100 ports is the best
>     option (I don't expect my cluster to go much bigger than this in the
>     next 4-5 years).  Based on those criteria, it looks like the Mellanox
>     IS5100 is my only option. Am I overlooking other options?
>
>     http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49
>
>     3. In my searching yesterday, I didn't find any FDR core/enterprise
>     switches with > 36 ports, other than the Mellanox SX6536. At 648 ports,
>     the SX6536 is too big for my needs. I've got to be overlooking other
>     products, right?
>
>     http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49
>
>     4. Adding an additional line card to my existing switch looks like it
>     will cost me only ~$5,000, and give me the additional capacity I'll
>     need for the next 1-2 years. I'm thinking it makes sense to do that,
>     and wait for affordable FDR switches to come out with the port count
>     I'm looking for instead of upgrading to QDR right now, and start
>     buying hardware with FDR HCAs in preparation for that.  Please feel
>     free to agree/disagree. This brings me to my next question...
>
>     5. FDR and QDR should be backwards compatible with my existing DDR
>     hardware, but how exactly does that work? If I have, say, an FDR switch
>     with a mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow
>     down to the lowest common denominator, or will the slowdown depend only
>     on the two nodes involved in the communication? When I googled for an
>     answer, all I found were marketing documents that guaranteed backwards
>     compatibility but didn't go to this level of detail. I searched the
>     standard spec (v1.2.1) and didn't find an obvious answer to this
>     question.
>
>     6. I see some Mellanox docs saying their FDR switches are compliant
>     with v1.3 of the standard, but the latest version available for
>     download is 1.2.1. I take it the final version of 1.3 hasn't been
>     ratified yet. Is that correct?
>
>     --
>     Prentice
>
>     _______________________________________________
>     Beowulf mailing list, Beowulf at beowulf.org
>     <mailto:Beowulf at beowulf.org> sponsored by Penguin Computing
>     To change your subscription (digest mode or unsubscribe) visit
>     http://www.beowulf.org/mailman/listinfo/beowulf
>
>
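
The leaf-and-spine layout described in the quoted reply (36-port
top-of-rack leaf switches connected through a spine) can be sized with
simple arithmetic. The sketch below is only an illustration, not the
configuration used at Purdue: the function name size_leaf_spine, the
even 18/18 split of leaf ports between hosts and uplinks (a
non-blocking layout), and the use of 36-port switches for the spine are
all assumptions. The ~100-host figure comes from the original post.

```python
import math

def size_leaf_spine(hosts, switch_ports=36, host_ports_per_leaf=18):
    """Estimate switch and cable counts for a two-tier leaf/spine fabric.

    Hypothetical helper for illustration only. Assumes every switch
    (leaf and spine) has `switch_ports` ports and that each leaf
    dedicates `host_ports_per_leaf` ports to hosts and the rest to
    spine uplinks. An 18/18 split on a 36-port switch is non-blocking.
    """
    if hosts <= 0:
        raise ValueError("hosts must be positive")
    if not 0 < host_ports_per_leaf < switch_ports:
        raise ValueError("host_ports_per_leaf must leave room for uplinks")

    uplinks_per_leaf = switch_ports - host_ports_per_leaf
    # Enough leaves to give every host a port.
    leaves = math.ceil(hosts / host_ports_per_leaf)
    # Every uplink terminates on a spine port.
    total_uplinks = leaves * uplinks_per_leaf
    spines = math.ceil(total_uplinks / switch_ports)

    return {
        "leaves": leaves,
        "spines": spines,
        "uplinks_per_leaf": uplinks_per_leaf,
        "leaf_to_spine_cables": total_uplinks,
    }

if __name__ == "__main__":
    # ~100 hosts is the target capacity mentioned in the original post.
    print(size_leaf_spine(100))
```

Under these assumptions, only the leaf-to-spine cables need to be long;
host cables stay within the rack. Allowing some oversubscription (for
example, 24 host ports per leaf) reduces the number of leaves and
uplink cables at the cost of bisection bandwidth.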


