[Beowulf] Multiple IB networks in one cluster

Prentice Bisbal prentice.bisbal at rutgers.edu
Tue Feb 4 07:30:01 PST 2014


On 02/01/2014 11:17 AM, atchley tds.net wrote:
> On Fri, Jan 31, 2014 at 11:27 AM, Prentice Bisbal 
> <prentice.bisbal at rutgers.edu <mailto:prentice.bisbal at rutgers.edu>> wrote:
>
>     Alex,
>
>
>     On 01/30/2014 07:15 PM, Alex Chekholko wrote:
>
>         Hi Prentice,
>
>         Today, IB probably means Mellanox, so why not get their
>         pre-sales engineer to draw you up a fabric configuration for
>         your intended use case?
>
>
>     Because I've learned that sales people will tell you anything is
>     possible with their equipment if it means a sale.
>     I posted my question to this list instead of talking to Mellanox
>     specifically to get real-world, unbiased information.
>
>
>         Certainly you can have a fabric where each host has two links, and
>         then you segregate the different types of traffic on the different
>         links.  But what would that accomplish if they're using the same
>         fabric?
>
>
>     Doesn't IB use cross-bar switches? If so, the bandwidth between
>     one pair of communicating hosts should not be affected by
>     communication between another pair of communicating hosts.
>
>
> The cross-bar switch only guarantees non-blocking if the two ports are 
> on the same line card (i.e. using the same crossbar). Once you start 
> traversing multiple crossbars, you are sharing links and can 
> experience congestion.

Scott, you're right. I wasn't thinking when I made that earlier 
statement. As soon as I read your reply, I facepalmed. D'oh!
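
For anyone following along, here is a rough back-of-the-envelope sketch
of how that sharing plays out in a two-tier fat tree. The leaf-switch
port splits below are made-up examples, and the link rate is the usual
4x FDR figure (56 Gb/s signaling less 64/66b encoding overhead), so
treat the output as illustrative only:

# Back-of-the-envelope: how inter-switch links get shared in a two-tier fabric.
# All numbers below are assumptions for illustration, not recommendations.

FDR_DATA_GBITS = 56.0 * 64 / 66       # 4x FDR: 56 Gb/s signaling, 64/66b encoding
FDR_DATA_GBYTES = FDR_DATA_GBITS / 8  # roughly 6.8 GB/s per link, one direction

def uplink_share(nodes_per_leaf, uplinks_per_leaf):
    """Worst case: every node on a leaf streams to nodes on other leaves,
    so all of that traffic has to cross the leaf's uplinks."""
    oversub = float(nodes_per_leaf) / uplinks_per_leaf
    per_node = FDR_DATA_GBYTES * uplinks_per_leaf / nodes_per_leaf
    return oversub, per_node

for down, up in [(18, 18), (24, 12), (30, 6)]:
    oversub, per_node = uplink_share(down, up)
    print("%d nodes/leaf, %d uplinks: %.1f:1 oversubscribed, "
          "~%.1f GB/s per node off-leaf" % (down, up, oversub, per_node))

The point is just that once traffic leaves the leaf switch, the per-node
share of bandwidth drops with the oversubscription ratio, which is why a
non-blocking (1:1) tree costs so much more in switches and cables.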

>
>         Certainly you can have totally separate fabrics and each host
>         could have links to one or more of those.
>
>         If this were Ethernet, you'd be comparing separate networks
>         vs. multiple interfaces on the same network vs. bonded
>         interfaces on the same network.  Not all the concepts
>         translate directly, the main one being the default network
>         layout; Mellanox will suggest a strict fat tree.
>
>         Furthermore, your question really just comes down to performance.
>         Leave IB out of it.  You're asking: is an interconnect with
>         such and such throughput and latency sufficient for my
>         heterogeneous workload comprised of bulk data transfers and
>         small messages?  Only you can answer that.
>
>
>     This question does not "come down to performance", and this
>     question is specifically about IB, so there's no way to leave IB
>     out of it.
>
>     This is really a business/economics question as much as it's about
>     performance: Is it possible to saturate FDR IB, and if so, how
>     often does it happen? How much will it cost for a larger or second
>     IB switch and double the number of cables to make this happen? And
>     how hard will it be to set up? Will the increased TCO be justified
>     by the increase in performance? How can I measure the increase in
>     performance? How can I measure, in real-time, the load on my IB
>     fabric, and collect that data to see if the investment paid off?
>
>
> Generally (lots of hand waving), HPC does not saturate the fabric for
> IPC unless it is a many-to-one (e.g. a collective). Where lots of
> bandwidth makes the most difference is for I/O. Distributed file 
> systems probably put the most bandwidth load on the system.
> Scott
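
Following up on my own question about measuring the load on the fabric
in real time: below is a minimal sketch of the sort of thing I have in
mind, using perfquery from infiniband-diags to sample a port's extended
counters twice and report the rate in between. I'm assuming the dotted
"PortXmitData:....<value>" output format that perfquery -x prints here,
and that PortXmitData/PortRcvData tick in 4-byte units, so adjust the
parsing for your own setup:

#!/usr/bin/env python
# Rough sketch: sample a port's extended traffic counters twice via
# "perfquery -x <lid> <port>" (infiniband-diags) and print the rate.
# Assumes PortXmitData/PortRcvData are counted in 4-byte (32-bit) words.
import re
import subprocess
import sys
import time

def read_counters(lid, port):
    out = subprocess.check_output(["perfquery", "-x", str(lid), str(port)],
                                  universal_newlines=True)
    counters = {}
    for name in ("PortXmitData", "PortRcvData"):
        match = re.search(r"%s:\.*(\d+)" % name, out)
        if match:
            counters[name] = int(match.group(1))
    return counters

def main(lid, port, interval=5.0):
    before = read_counters(lid, port)
    time.sleep(interval)
    after = read_counters(lid, port)
    for name in sorted(before):
        words = after[name] - before[name]
        mbytes_per_s = words * 4 / interval / 1e6   # 4 bytes per count
        print("%-12s %8.1f MB/s" % (name, mbytes_per_s))

if __name__ == "__main__":
    main(int(sys.argv[1]), int(sys.argv[2]))

Run that against the switch ports your storage servers hang off of, dump
the samples into something like Ganglia or Graphite, and you have the
before/after data to judge whether a second fabric or a bigger switch
actually paid for itself.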


