[Beowulf] ethernet bonding performance comparison "802.3ad" vs Adaptive Load Balancing
siegert at sfu.ca
Wed Sep 17 21:25:36 PDT 2008
On Wed, Sep 17, 2008 at 07:51:01PM -0500, Rahul Nabar wrote:
> On Wed, Sep 17, 2008 at 7:23 PM, Eric Thibodeau <kyron at neuralbs.com> wrote:
> > Rahul Nabar wrote:
> > On Wed, Sep 17, 2008 at 4:05 PM, Eric Thibodeau <kyron at neuralbs.com> wrote:
> > Well, I don't have "bondable" hardware so I'm really interested in how you
> > technically manage this one at the end.
> The more I do this, the more I get this uneasy feeling that this
> hasn't been done much before? :) Not too many guides that bond for
> bandwidth-aggregation. None at all for strict peer-to-peer
> bandwidth-aggregation. Am I trying to do the impossible?
> Most people seem to use bonding for fault tolerance or a one-to-many
> communication pipe. I really need more anecdotes and comments from
> other guys who successfully use bonding.
It is my understanding that 802.3ad forbids what you want to do,
namely running a single stream over more than one link: the standard
requires that all packets of a stream be delivered in order.
It is my impression that the standard was not written with HPC in
mind: it addresses the scenario of running many streams over a few
links, i.e., load balancing (and HA).
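To illustrate why a hash-based distribution keeps a single stream on one
link, here is a minimal Python sketch. It assumes a simplified layer-2
style hash (last byte of the source MAC XOR last byte of the destination
MAC, modulo the number of links). That hash, the function name hash_link
and the MAC addresses are assumptions made up for illustration, not the
exact function any particular switch or driver uses.

```python
from collections import Counter


def hash_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Pick a link from the MAC pair (simplified, assumed hash)."""
    src_last = int(src_mac.split(":")[-1], 16)
    dst_last = int(dst_mac.split(":")[-1], 16)
    return (src_last ^ dst_last) % n_links


N_LINKS = 2
HEAD = "00:16:3e:00:00:01"  # hypothetical MAC addresses throughout

# One stream between two peers: every packet hashes to the same link.
single = Counter(
    hash_link(HEAD, "00:16:3e:00:00:02", N_LINKS) for _ in range(1000)
)

# Streams from one host to eight different peers: the flows spread out.
peers = [f"00:16:3e:00:00:{i:02x}" for i in range(2, 10)]
many = Counter(hash_link(HEAD, peer, N_LINKS) for peer in peers)

print("single stream, packets per link:", dict(single))
print("eight streams, flows per link:", dict(many))
```

Under these assumptions the single stream never uses the second link,
while the eight streams split evenly across both: the many-streams case
is where this kind of policy helps.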
This does not mean that you cannot do what you want: you need to use
round-robin mode (which AFAIK is still the default under Linux;
easy to test with crossover cables).
- round-robin mode violates ("is an extension to") the 802.3ad
standard because it does not guarantee in-order delivery.
In my experience this is irrelevant in a cluster environment:
often a single switch, no multiple hops, no routers - out of
order delivery is very rare and has very little impact on
performance when using round-robin mode (we have done silly tests
like one host with 4 GigE interfaces, one with 3 and still got close
- most switch vendors do not support round-robin mode - the only one
that I know of that does is Extreme (please correct me!).
You can get around that problem by using a separate switch for each
leg, but that requires that each host has the same number of interfaces
for that bonded network. E.g., you cannot have a host with a single
10GigE card and another host with 4 1GigE cards.
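The interface-count constraint can be sketched with a toy model in
Python. It assumes that leg i of every host is wired to its own switch i,
and that a packet sent on a leg for which the receiver has no interface
is simply lost. The function name round_robin_delivery and the packet
counts are hypothetical, chosen only for illustration.

```python
from itertools import cycle


def round_robin_delivery(sender_legs: int, receiver_legs: int, n_packets: int):
    """Send n_packets round-robin over the sender's legs.

    Toy model: leg i only reaches the receiver if the receiver also has
    an interface on switch i. Returns (delivered, lost).
    """
    delivered = lost = 0
    legs = cycle(range(sender_legs))
    for _ in range(n_packets):
        leg = next(legs)
        if leg < receiver_legs:
            delivered += 1
        else:
            lost += 1
    return delivered, lost


print("4 legs -> 4 legs:", round_robin_delivery(4, 4, 1200))
print("4 legs -> 3 legs:", round_robin_delivery(4, 3, 1200))
```

In this model the matched pair delivers everything, while the 4-to-3
pair loses every packet sent on the fourth leg, which is why the
per-leg-switch workaround needs equal interface counts.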
Head, Research Computing
Simon Fraser University, Burnaby, British Columbia, Canada