[Beowulf] Channel bonding, again
kilian at stanford.edu
Mon Oct 15 10:24:06 PDT 2007
On Sunday 14 October 2007 09:19:33 am Carsten Aulbert wrote:
> I don't know off the top of my head which versions we used, but our
> problem with LACP was that the switch (mostly ProCurve 2900, though I
> think the Cisco 4948 behaved similarly; I would need to cross-check)
> used only one of the two or four available links to the node for a
> single connection. A node could thus handle two different 1 Gb/s
> connections at the same time, reaching almost 2 Gb/s in total, but we
> never saw a single connection use all the available bandwidth.
> That was the reason our student came up with this VLAN trick.
Indeed, with a trunked LACP link, a single connection will only go over
one link. But you can have as many wire-speed transfers running at the
same time as you have links in the trunk. I guess it all depends on what
you need.
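Why a single flow is pinned to one link can be sketched with a toy
transmit hash: LACP-style bonding picks the egress link by hashing flow
identifiers, so every packet of one TCP connection lands on the same
link. A minimal sketch (a simplified stand-in, not the exact Linux
`xmit_hash_policy=layer3+4` formula):

```python
# Toy transmit hash: maps a flow's 4-tuple to one of n_links.
# Simplified illustration, not the actual kernel bonding formula.
def link_for_flow(src_ip: str, dst_ip: str,
                  src_port: int, dst_port: int, n_links: int) -> int:
    # All packets of a given connection share the same 4-tuple,
    # so they always hash to the same link.
    return hash((src_ip, dst_ip, src_port, dst_port)) % n_links

# Two different connections may land on different links and together
# fill ~2 Gb/s, but one connection can never exceed one link's speed.
```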
We're using LACP to aggregate links between our users and our cluster:
                   trunk              trunk
user -- | switch | ====== | firewall | ===== cluster
So in that setup, each user's individual connection is limited by their
own NIC (and often disk I/O), which is at most GigE. The point is to let
more than one user transfer data at Gb/s speed at the same time.
I guess that in our case the VLAN trick couldn't really work since, if I
understood correctly, the switch has to be the receiving end for the
aggregation to work. For instance, the trunking host can send data over
several links, but it can only receive on one, because the switch can't
load-balance and has to pick a single interface/VLAN to send data
through. Is that right?
I'm quite surprised balance-tlb could crash a node too, but I didn't try
it myself.
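For reference, the bonding modes mentioned in this thread are chosen
when the Linux bonding driver is loaded. A minimal sketch (file path,
interface names, and option values are illustrative; adjust for your
distribution):

```shell
# /etc/modprobe.d/bonding.conf -- illustrative, paths vary by distro.
#   mode=802.3ad     : LACP trunking; needs switch support;
#                      one flow is hashed to one link.
#   mode=balance-tlb : outgoing load balancing only; no switch support.
#   mode=balance-alb : also balances incoming traffic via ARP negotiation.
# options bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4

# Or load the driver by hand and enslave the NICs (ifenslave package):
modprobe bonding mode=802.3ad miimon=100 xmit_hash_policy=layer3+4
ifconfig bond0 up
ifenslave bond0 eth0 eth1
```

With `xmit_hash_policy=layer3+4` the driver hashes on IP addresses and
ports, so distinct connections spread across the slaves while any single
connection stays on one of them.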