[Beowulf] ethernet bonding performance comparison "802.3ad" vs Adaptive Load Balancing

Martin Siegert siegert at sfu.ca
Thu Sep 18 13:30:35 PDT 2008

On Thu, Sep 18, 2008 at 02:08:24PM -0500, Rahul Nabar wrote:
> On Wed, Sep 17, 2008 at 11:25 PM, Martin Siegert <siegert at sfu.ca> wrote:
> > It is my understanding that 802.3ad forbids what you want to do:
> > running a single stream over more than one link; 802.3ad requires
> > that all packets are guaranteed to be delivered in order.
> Yes, you are probably right, Martin. I didn't know much about
> 802.3ad at all until very recently. But the iffy point seems to be the
> definition of a "single stream": does it mean "all traffic from a
> computer", "all traffic from one process", or "all traffic from one
> protocol"?

That ought to be configurable on the switch. But even the setting
"from one protocol" would not allow you to get more than 1Gbit/s
throughput for ssh, netpipe, MPI, etc., between two hosts since
it's always just a single protocol (port no.) that is involved.
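To illustrate why any per-stream policy caps one connection at the speed
of one link, here is a minimal Python sketch. The hash is a simplified
XOR-style stand-in, not the exact formula of any particular switch or
driver; pick_link, links_used, the addresses and the port numbers are
all made up for the example.

```python
import ipaddress


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links):
    """Map a flow to a link index (simplified, hypothetical hash policy)."""
    ip_bits = (int(ipaddress.ip_address(src_ip))
               ^ int(ipaddress.ip_address(dst_ip))) & 0xFFFF
    return ((src_port ^ dst_port) ^ ip_bits) % n_links


def links_used(packets, n_links):
    """Return the set of link indices that a list of packets ends up on."""
    return {pick_link(*p, n_links) for p in packets}


# One ssh-like connection: every packet carries the same addresses and ports,
# so every packet hashes to the same link.
single_stream = [("10.0.0.1", "10.0.0.2", 40000, 22)] * 1000
one_flow_links = links_used(single_stream, 2)

# Many connections that differ only in source port can spread over both links.
many_streams = [("10.0.0.1", "10.0.0.2", 40000 + i, 22) for i in range(1000)]
many_flow_links = links_used(many_streams, 2)

print(len(one_flow_links), len(many_flow_links))
```

However fine-grained the definition of a stream is, a benchmark that
opens one connection between two hosts always lands in the first case.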
> I guess that's the crucial question for my need right now.
> > This does not mean that you cannot do what you want: you need to use
> > round-robin mode (which AFAIK is still the default under Linux;
> > easy to test with crossover cables).
> Sure; mode_rr is always around. [I didn't get the test with the
> crossover cable though, sorry; could you explain, perhaps? I'm not at
> all a networking guy.]

Connect two hosts with two GigE interfaces each back-to-back using
crossover cables (not through a switch). Configure both to use
round-robin mode. You should get very close to 2Gbit/s.
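Before measuring, it is worth confirming that the bond really runs in
round-robin mode. The Linux bonding driver reports its state in
/proc/net/bonding/<bond name>; the Python sketch below parses that kind
of text. The sample is abbreviated and illustrative, and the names
bond0, eth0 and eth1 are assumptions about your setup.

```python
def parse_bond_status(text):
    """Return (mode, [slave interface names]) from bonding status text."""
    mode = None
    slaves = []
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or line without "key: value" shape
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            mode = value
        elif key == "Slave Interface":
            slaves.append(value)
    return mode, slaves


# Abbreviated, illustrative sample; on a real host you would read the text
# from /proc/net/bonding/bond0 instead (assuming the bond is named bond0).
SAMPLE = """\
Bonding Mode: load balancing (round-robin)
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: up
"""

mode, slaves = parse_bond_status(SAMPLE)
print(mode, slaves)
```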

> > - most switch vendors do not support round-robin mode - the only one
> >  that I know of that does is Extreme (please correct me!).
> That brings me to the other question, though: does mode_rr "need"
> switch support? (Ah! Perhaps you mean I'll get asymmetry: transmit-side
> load balancing but no receive-side load balancing, because the
> switch will insist on sending all packets to a machine over a single
> port. Unless my machine answered ARP requests from different machines
> with alternating MACs of its cards and thus fooled the switch. But I
> guess that's what mode=6 does! So I'm not even sure how that is
> different from mode_rr. I'm sorry, I'm confused again!) Aren't many of
> the modes designed to operate in spite of an ignorant switch? I'm never
> sure which ones, though!

Consider the case of host A with a 10GigE interface and host B
with 4 GigE interfaces (configured in round-robin mode) connected through
a switch. Run netperf from B to A and you should get close to 4Gbit/s
(depending on the quality of the hardware). In this case the switch just
forwards all packets that come in through the 4 link-aggregated ports
to the one 10GigE port (it can't really do anything else). However,
when you run netperf from A to B there is a single stream coming from
the 10GigE port. This stream will be sent to only one of the 1GigE
ports. Hence you get only 1Gbit/s, unless your switch supports
round-robin mode on link aggregates. Even worse: this is still the
case when host A and B both have just two GigE interfaces. The
sending host will use both interfaces in round-robin mode, but the
switch still sees this as a single stream (same host (MAC address),
same protocol) and forwards the packets to only one of the two ports
of the receiving host.
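The cases above can be summarized in a small back-of-the-envelope model,
sketched here in Python. It gives idealized upper bounds only and
ignores protocol overhead and hardware quality; the function
single_stream_gbits and its parameters are made up for illustration.

```python
def single_stream_gbits(tx_links, tx_speed, rx_links, rx_speed,
                        switch_round_robin=False):
    """Idealized upper bound (Gbit/s) for one stream through a switch.

    The sender is assumed to bond its links in round-robin mode, so it can
    offer the sum of its link speeds. The switch forwards one stream over a
    single receiver port unless it supports round-robin on link aggregates.
    """
    offered = tx_links * tx_speed
    usable_rx_links = rx_links if switch_round_robin else 1
    deliverable = usable_rx_links * rx_speed
    return min(offered, deliverable)


# Host B (4 x 1GigE, round-robin) sending to host A (1 x 10GigE)
b_to_a = single_stream_gbits(4, 1, 1, 10)

# Host A (1 x 10GigE) sending to host B (4 x 1GigE)
a_to_b = single_stream_gbits(1, 10, 4, 1)

# Both hosts with 2 x 1GigE, without and with round-robin on the switch
two_by_two = single_stream_gbits(2, 1, 2, 1)
two_by_two_rr = single_stream_gbits(2, 1, 2, 1, switch_round_robin=True)

print(b_to_a, a_to_b, two_by_two, two_by_two_rr)
```

In this model the limit is always the receiving side of the switch,
which is why round-robin on the hosts alone does not help.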

> >  You can get around that problem by using a separate switch for each
> >  leg, but that requires that each host has the same number of interfaces
> >  for that bonded network. E.g., you cannot have a host with a single
> >  10GigE card and another host with 4 1GigE cards.
> >
> Ah! True. I do have two switches here and each of my nodes has two
> eth cards. So I guess I could do that too.

Yes, this sounds like a good idea ... initially ... until you get a 
host with a 10GigE card that you want to connect to the same network.
Also: try to netboot a host over such a network ...

