[Beowulf] 10G networking?

Robert Horton robh at dongle.org.uk
Fri Jan 23 02:32:00 PST 2015


On Fri, 2015-01-23 at 01:36 -0500, Mark Hahn wrote:
> Hi all,
> I'd appreciate any comments about the state of 10G as a reasonable
> cluster network.  Have you done any recent work on 10G performance?
> 
> https://lwn.net/Articles/629155/

I had a go at using RoCE (which uses the OFED stack) with some Mellanox
NICs a year or so ago. I don't have any actual numbers, unfortunately,
but it's certainly possible to get "decent" performance, at least with
modest node counts, although I think Infiniband will perform better.
There are a couple of issues, though:

- The configuration, particularly at the switch end, is fairly esoteric.
Mellanox do have some pretty good documentation for their cards but if
you've got a switch from a different manufacturer you may have to play
around a bit. It's certainly rather more involved than getting
Infiniband working.

- To get decent performance (at least on latency) you need fairly
high-end HCAs, a switch which supports the DCB stuff (I think?) and
(Q)SFP+ transceivers / cables, the cost of which is in the same range
as Infiniband.

There are a couple of advantages to 10GE, in that there are ASICs with
higher port counts and it's easy to integrate with an existing Ethernet
network, but for MPI performance I think Infiniband is still the way to
go.

It occurred to me the other day that it's about time we had something
better than 1GE for commodity networking. It's good news that switch
costs are coming down, but I've yet to see a server with an onboard
10GBASE-T adaptor (although I have seen some with SFP+ 10G).

Rob


