> trunking, link aggregation, channel bonding, are all the same.  i think
> the best method is to use 2 switches. the problem that i've seen on both
> cisco and intel switches is that the recv streams are not divided between
> the 2 recv'ing interfaces.  so don't expect to see ~180 mbits/sec
> between a point-to-point sender/receiver application if you're using the
> switch's trunking capability,  you'll need multiple streams to get that.

Sorry to disappoint you, but I got this working. My BayNetworks 350-24T
sends packets in round-robin fashion and I was able to get more than
180 Mbit/s (measured with ttcp) with two 3C905C cards in each node.
One Cisco switch (which I have used but don't administer) gave me strange
results at first. ifconfig showed the same number of Tx packets for my 2
bonded NICs, but very unbalanced Rx packets. Empirically (by looking at
the LEDs on the back of the cards) I found that normally only one link
was used, and the second was used only when the first was busy. However,
after I talked to the admin, I got "normal" results from ifconfig, with
very similar Rx packet counts on both NICs. I haven't asked him again,
but I guess that he changed something in the switch config (as the
traffic didn't change). I'm not sure whether this changes the bandwidth
results, though.
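
If you want numbers instead of watching LEDs, the same counters that
ifconfig prints can be read from /proc/net/dev on Linux. Below is a
minimal sketch, not a finished tool: the interface names eth0 and eth1,
the sample counters and the helper names parse_net_dev and balance_ratio
are all made up for illustration. It parses the text and reports how
evenly the Rx and Tx packet counts are split between the two bonded NICs.

```python
def parse_net_dev(text):
    """Return {iface: (rx_packets, tx_packets)} from /proc/net/dev-style text."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 10:
            continue  # not a complete counter line
        # Receive columns come first: bytes, packets, ... (8 columns),
        # then transmit: bytes, packets, ...
        counters[name.strip()] = (int(fields[1]), int(fields[9]))
    return counters


def balance_ratio(a, b):
    """Smaller count divided by larger one; 1.0 means perfectly balanced."""
    if max(a, b) == 0:
        return 1.0
    return min(a, b) / max(a, b)


# Hypothetical sample mimicking the unbalanced case: equal Tx, lopsided Rx.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
  eth0: 1500000   10000    0    0    0     0          0         0  1500000   10000    0    0    0     0       0          0
  eth1:  300000    2000    0    0    0     0          0         0  1500000   10000    0    0    0     0       0          0
"""

if __name__ == "__main__":
    stats = parse_net_dev(SAMPLE)
    rx0, tx0 = stats["eth0"]
    rx1, tx1 = stats["eth1"]
    print("Rx balance:", balance_ratio(rx0, rx1))
    print("Tx balance:", balance_ratio(tx0, tx1))
```

On a real node you would pass in the contents of /proc/net/dev instead
of SAMPLE, preferably as the difference between two readings taken
around a ttcp run, so that old traffic does not hide the current
behaviour.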

> but if you can wire your cluster so that interface0 goes to one switch and
> interface1 goes to another switch then you could get good point-to-point
> bandwidth.

Using 2 different switches is one approach, but it solves only one class
of problems. For example, if you need 3 NICs per node, you need 3
switches, and so on; the wiring might become problematic. On the other
hand, there are applications in which only one node (let's call it the
master) has to send/receive large amounts of data, while the compute
nodes handle amounts of data which "fit" into normal (single-NIC)
bandwidth. This problem is easily solved with a switch which supports
channel-bonding/trunking/link-aggregation: the master connects to it
through bonded links, while the compute nodes use just 1 NIC each.
(This applies just as well to an NFS server.)
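
To make the difference between the two switch behaviours concrete, here
is a toy simulation. It is only a sketch under assumptions: the
"per-source" policy stands in for a switch that picks the outgoing link
from the sender's address, which is my guess at why a single stream
stays on one link; the function names and the frame counts are
invented. Frames are represented only by the id of the node that sent
them, and the two links are the master's bonded links.

```python
from collections import Counter
from itertools import cycle


def round_robin(frames, n_links):
    """Send each frame down the next link in turn, whoever sent it."""
    usage = Counter()
    links = cycle(range(n_links))
    for _ in frames:
        usage[next(links)] += 1
    return usage


def per_source(frames, n_links):
    """Pick the link from the sender id (stand-in for an address hash)."""
    usage = Counter()
    for src in frames:
        usage[src % n_links] += 1
    return usage


# One compute node sending 1000 frames to the master (a single stream).
one_sender = [1] * 1000
# Eight compute nodes sending 125 frames each (many streams).
many_senders = [node for node in range(8) for _ in range(125)]

if __name__ == "__main__":
    for label, frames in (("one sender", one_sender),
                          ("eight senders", many_senders)):
        print(label, "round-robin:", dict(round_robin(frames, 2)))
        print(label, "per-source: ", dict(per_source(frames, 2)))
```

With one sender, round-robin puts 500 frames on each link while the
per-source policy puts all 1000 on a single link. With eight senders
both policies end up at 500 per link, which is why the master (or NFS
server) case works even on a switch that does not split a single
stream.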


