[Beowulf] Questions about upgrading InfiniBand

Prentice Bisbal prentice at ias.edu
Wed Apr 18 11:59:25 PDT 2012


Gilad,

Thanks for the quick, helpful responses. See my in-line comments below.

On 04/18/2012 11:27 AM, Gilad Shainer wrote:
>> Beowulfers,
>>
>> I'm planning on adding some upgrades to my existing cluster, which has
>> 66 compute nodes plus the head node. Networking consists of a Cisco
>> 7012 IB switch with 6 out of 12 line cards installed, giving me a capacity of 72
>> DDR ports, expandable to 144, and two 40-port ethernet switches that have
>> only six extra ports between them.
>>
>> I'd like to add a Lustre filesystem (over InfiniBand)  to my cluster, and then
>> begin adding/replacing nodes in the cluster. Obviously, I'll need to increase
>> capacity of both my IB and ethernet networks. The questions I have are
>> about upgrading my InfiniBand.
>>
>> 1. It looks like QLogic is out of the InfiniBand business. Is Mellanox the only
>> game in town these days?
> Intel bought the QLogic InfiniBand business, so this is a second option.

I searched both the QLogic and Intel websites for 'InfiniBand', and
neither returned any hits yesterday. It makes sense that you can't find
any IB info on QLogic's site anymore. Today, I was able to find the link
for Intel TrueScale InfiniBand products. Intel did a good job of
hiding/burying the link under "More Products" on their Products
pull-down menu. No idea why I couldn't find it by searching yesterday.
Typo in search box, maybe?

>
>> 2. Due to the size of my cluster, it looks like buying just a core/enterprise IB
>> switch with capacity for ~100 ports is the best option (I don't expect my
>> cluster to go much bigger than this in the next 4-5 years).  Based on that
>> criterion, it looks like the Mellanox
>> IS5100 is my only option. Am I overlooking other options?
> You can also take 36-port switches and a few more cables, and build the desired network size (for example, in a fat-tree topology). It is easy to do and might be more cost-effective. If you need help designing the topology (which port connects to which port), I can send you a description. With this option, you can also do any kind of oversubscription if you want to.

I was looking into a fat-tree topology yesterday. Considering the number
of additional switches needed, and the cabling costs, I'm not sure it
will really be cost effective. Just to stay at the same capacity I'm at
now, 72 ports, I'd need to buy 6 switches + cables.
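
That count can be checked with a little arithmetic. Below is a minimal sketch in Python, assuming a two-level fat tree built from identical 36-port switches, where each leaf switch splits its ports between hosts and uplinks. The function name and parameters are made up for illustration; this is not a vendor sizing tool.

```python
import math

def two_level_fat_tree(hosts, radix=36, oversub=1):
    """Estimate switch and cable counts for a two-level fat tree.

    hosts   -- number of host-facing ports required
    radix   -- ports per switch (36 is an assumption taken from the thread)
    oversub -- down:up ratio on each leaf switch (1 means non-blocking)
    """
    # Split each leaf's ports into uplinks and host-facing downlinks.
    up = radix // (oversub + 1)
    down = radix - up
    leaves = math.ceil(hosts / down)
    # Every leaf uplink lands on a spine port.
    uplinks = leaves * up
    spines = math.ceil(uplinks / radix)
    return {
        "leaves": leaves,
        "spines": spines,
        "switches": leaves + spines,
        "inter_switch_cables": uplinks,
        "host_ports": leaves * down,
    }

print(two_level_fat_tree(72))             # non-blocking: 4 leaves + 2 spines = 6 switches
print(two_level_fat_tree(72, oversub=2))  # 2:1 oversubscribed: 3 leaves + 1 spine = 4 switches
```

Under these assumptions the non-blocking case needs 6 switches and 72 inter-switch cables for 72 host ports, which matches the figure above. Accepting 2:1 oversubscription drops that to 4 switches and 36 inter-switch cables.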

>
>> http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=71&menu_section=49
>>
>> 3. In my searching yesterday, I didn't find any FDR core/enterprise switches
>> with > 36 ports, other than the Mellanox SX6536. At 648 ports, the SX6536 is
>> too big for my needs. I've got to be overlooking other products, right?
>>
>> http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=122&menu_section=49
> More options are coming out now. The 324-port version will be available in a week, and the 216-port version a few weeks after. The 108-port version will be released before the summer.

That's in my timeframe, so I'll keep an eye on the Mellanox website.

>
>> 4. Adding an additional line card to my existing switch looks like it will cost
>> me only ~$5,000, and give me the additional capacity I'll need for the next
>> 1-2 years. I'm thinking it makes sense to do that, and wait for affordable FDR
>> switches to come out with the port count I'm looking for instead of
>> upgrading to QDR right now, and start buying hardware with FDR HCAs in
>> preparation for that.  Please feel free to agree/disagree. This brings me to my
>> next question...
> Depends on what you want to build. You can take FDR today and build 2:1 oversubscription to get "QDR" throughput, and this will be cheaper than using QDR switches. In any case, if you need any help on the negotiation side, let me know.

Thanks for the offer. If I decide to buy new switches instead of
expanding my DDR switch, I'll e-mail you off-list.
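
To put rough numbers on the 2:1 FDR suggestion, here is a small sketch. It assumes nominal 4x link rates (DDR and QDR with 8b/10b encoding, FDR at 14.0625 Gb/s per lane with 64b/66b encoding) and the worst case where every host on a leaf switch sends off-switch at the same time. The table and function names are illustrative only, and real application throughput will be lower than these link-level figures.

```python
# Nominal 4x link data rates in Gb/s after encoding overhead (assumed values).
DATA_RATE_GBPS = {
    "DDR": 4 * 5.0 * 8 / 10,        # 8b/10b encoding
    "QDR": 4 * 10.0 * 8 / 10,       # 8b/10b encoding
    "FDR": 4 * 14.0625 * 64 / 66,   # 64b/66b encoding
}

def worst_case_per_host(speed, oversub=1):
    """Per-host bandwidth to the spine when all hosts on a leaf send at once."""
    return DATA_RATE_GBPS[speed] / oversub

for speed, oversub in [("DDR", 1), ("QDR", 1), ("FDR", 2), ("FDR", 1)]:
    rate = worst_case_per_host(speed, oversub)
    print(f"{speed} at {oversub}:1 -> {rate:.1f} Gb/s per host")
```

Under these assumptions, FDR at 2:1 gives about 27 Gb/s per host in the worst case, compared with 32 Gb/s for non-blocking QDR, while traffic that stays within one leaf switch still sees the full FDR rate.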

>
>> 5. FDR and QDR should be backwards compatible with my existing DDR
>> hardware, but how exactly does that work? If I have, say, an FDR switch with a
>> mixture of FDR, QDR, and DDR HCAs, will the whole fabric slow down to the
>> lowest-common denominator, or will the slow-down be based on the two
>> nodes involved in the communication only? When I googled for an answer,
>> all I found were marketing documents that guaranteed backwards
>> compatibility, but didn't go to this level of detail. I searched the standard
>> spec (v1.2.1), and didn't find an obvious answer to this question.
> You can mix and match anything on the InfiniBand side. You can connect SDR, DDR, QDR and FDR and it all will work. When you do that, a direct connection between 2 ports will run at the highest speed both ports support. So if you have an FDR port connected directly to an FDR port, it will run FDR. If you have a DDR port connected directly to an FDR port, that connection will run DDR. In your case, part of the fabric will run FDR, part will run DDR.

That's what I suspected. Thanks for the confirmation.
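
The rule Gilad describes can be sketched in a few lines. This is only a toy model, assuming each link settles on the slower of its two ports and that a path through a single switch is limited by its slowest link. It ignores link width and anything the subnet manager might do, and the names are hypothetical.

```python
# InfiniBand speed generations from slowest to fastest.
ORDER = ["SDR", "DDR", "QDR", "FDR"]

def link_speed(port_a, port_b):
    """A direct link runs at the slower of its two ports."""
    return min(port_a, port_b, key=ORDER.index)

def path_speed(src_hca, dst_hca, switch="FDR"):
    """HCA -> switch -> HCA: the path is limited by its slowest link."""
    first_hop = link_speed(src_hca, switch)
    second_hop = link_speed(switch, dst_hca)
    return min(first_hop, second_hop, key=ORDER.index)

print(path_speed("FDR", "FDR"))  # FDR
print(path_speed("DDR", "FDR"))  # DDR
print(path_speed("QDR", "DDR"))  # DDR
```

In this model a DDR node only slows down the conversations it takes part in. Two FDR nodes on the same FDR switch still talk at FDR.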
>
>  
>> 6. I see some Mellanox docs saying their FDR switches are compliant with
>> v1.3 of the standard, but the latest version available for download is 1.2.1. I
>> take it the final version of 1.3 hasn't been ratified yet. Is that correct?
>
> 1.3 is the IBTA spec that includes FDR and EDR. The spec is completed, but not on the web site yet.

Ditto.

--
Prentice


