Channel bonding: working combinations ?
Martin Siegert siegert at sfu.ca
Tue Jan 23 12:44:46 PST 2001
Hi Daniel,

I had none of your problems when using the DFE570Tx with the tulip driver
(see the other post). I actually never had to use ifenslave/ifconfig
manually, the configuration comes up reliably after rebooting or when
running "/etc/rc.d/init.d/network restart".

Hence I can only guess where your problems may be:

1. I trust that you have the line "alias bond0 bonding" in your
   /etc/conf.modules (or /etc/modules.conf, whatever you are using) file.
   (sounds stupid, but I made that mistake once).

2. You mentioned that you use eth0 for a different network. Is it using
   the same driver as the other cards? If it is: how do you tell which
   card your machine is recognizing as eth0? This happened to me over and
   over again: if you plug in a second NIC you cannot be sure that the new
   card will be eth1 - it may just as well be eth0 and the old card may
   come up as eth1, creating nothing but problems. The only way I found to
   figure this out is to run ping on the network that is connected to eth0
   and look which card has flashing lights (and then swap cards).

I hope this helps.

Cheers,
Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert at sfu.ca
Canada V5A 1S6
========================================================================

On Mon, Jan 22, 2001 at 08:58:55AM +0100, Pfenniger Daniel wrote:
>
> I am trying to install channel bonding on our cluster, but I meet a
> few problems that may interest people on the list.
>
> Linux kernel: 2.2.18 or 2.4.0, compiled with gcc 2.95.2, (RedHat 6.2)
> Motherboard: ASUS P2B-D (BX chipset)
> Procs: Pentium II 400 dual
> Ethernet cards: with the tulip chips DS21140 and DS21143. They work well
> when not bonded.
> Switches: 2 Foundry FastIron II
> Drivers: tulip.o, or old_tulip.o as modules supplied with the official kernel
> Documentation: in /usr/src/linux-2.2.18/Documentation/networking/bonding.txt
> (BTW this file is not provided in kernel 2.4.0)
>
> I have strictly followed the indications in bonding.txt
> Every card has a distinct IRQ.
>
> The first problem is that ifconfig bond0 does not find any hardware
> or IP address at boot or interactively (they are zero).
> I can persuade an hw address by giving it manually:
>
> ifconfig bond0 192.168.2.64 hw ether 00:40:05:A1:D9:09 up
>
> Here I don't know how to automatically force the hw address in the
> ifcfg-bond0 file.
>
> Incidentally there are a few different versions of ifenslave.c on the net
> with the same version number (v0.07 9/9/97 Donald Becker
> (becker at cesdis.gsfc.nasa.gov)).
> I have taken the version included with the bonding-0.2.tar.gz tarball.
>
> By manually starting channel bonding I get (eth0 is assigned to another
> network):
>
> bond0     Link encap:Ethernet  HWaddr 00:40:05:A1:D9:09
>           inet addr:192.168.2.64  Bcast:192.168.2.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>           RX packets:108 errors:38 dropped:0 overruns:0 frame:0
>           TX packets:6 errors:5 dropped:0 overruns:0 carrier:15
>           collisions:0 txqueuelen:0
>
> eth1      Link encap:Ethernet  HWaddr 00:40:05:A1:D9:09
>           inet addr:192.168.2.64  Bcast:192.168.2.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>           RX packets:108 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:100
>           Interrupt:18 Base address:0xb800
>
> eth2      Link encap:Ethernet  HWaddr 00:40:05:A1:D9:09
>           inet addr:192.168.2.64  Bcast:192.168.2.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>           RX packets:0 errors:38 dropped:0 overruns:0 frame:0
>           TX packets:0 errors:5 dropped:0 overruns:0 carrier:15
>           collisions:0 txqueuelen:100
>           Interrupt:17 Base address:0xb400
>
> Then a ping to another such bonded node may produce different things:
> - a complete freeze, reset required.
> - ping waits, ctrl-c stops it.
> - ping works, with almost double speed
>
> When ping works netperf -H node may either be almost twice as fast (175 Mb/s)
> as single channel communications (94 Mb/s), or much slower (10, 25 Mb/s),
> despite ping indicating improved communication time.
>
> In conclusion channel bonding with such a configuration appears unreliable.
>
> Since several messages have been posted on this list stating problems,
> as well as on the tulip list about tulip drivers, with the present channel
> bonding capability of the Linux kernel, it could be useful if people with
> working combinations of kernel (is 2.2.17 better), NIC/driver (which tulip
> version), etc, could share their detailed working specs.
> I am sure this would be much appreciated by those wanting to bond their Beowulf.
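
In the quoted ifconfig listing, the error counters on bond0 (38 RX errors, 5 TX errors, 15 carrier) are identical to those on eth2, while eth1 shows none, which suggests the bond's errors may all originate from one slave. A minimal sketch of how such a listing could be checked automatically is given below. It assumes the older net-tools ifconfig output layout shown in the message; the names SAMPLE, parse_counters and suspect_slaves are hypothetical helpers introduced only for this illustration, and the sample text is trimmed to the counter lines of the quoted listing.

```python
import re

# Counter lines copied from the quoted listing (other fields trimmed).
SAMPLE = """\
bond0     Link encap:Ethernet  HWaddr 00:40:05:A1:D9:09
          RX packets:108 errors:38 dropped:0 overruns:0 frame:0
          TX packets:6 errors:5 dropped:0 overruns:0 carrier:15
eth1      Link encap:Ethernet  HWaddr 00:40:05:A1:D9:09
          RX packets:108 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
eth2      Link encap:Ethernet  HWaddr 00:40:05:A1:D9:09
          RX packets:0 errors:38 dropped:0 overruns:0 frame:0
          TX packets:0 errors:5 dropped:0 overruns:0 carrier:15
"""


def parse_counters(text):
    """Map each interface name to its packet, error and carrier counters.

    Assumes the old net-tools format: an unindented "NAME Link encap:" line
    starts an interface, followed by indented "RX packets:" / "TX packets:" lines.
    """
    stats = {}
    current = None
    for line in text.splitlines():
        header = re.match(r"(\S+)\s+Link encap:", line)
        if header:
            current = header.group(1)
            stats[current] = {}
            continue
        if current is None:
            continue
        counters = re.search(r"(RX|TX) packets:(\d+) errors:(\d+)", line)
        if counters:
            direction = counters.group(1).lower()
            stats[current][direction + "_packets"] = int(counters.group(2))
            stats[current][direction + "_errors"] = int(counters.group(3))
        carrier = re.search(r"carrier:(\d+)", line)
        if carrier:
            stats[current]["carrier"] = int(carrier.group(1))
    return stats


def suspect_slaves(stats):
    """Return the ethN interfaces that report any RX, TX or carrier errors."""
    return sorted(
        name
        for name, c in stats.items()
        if name.startswith("eth")
        and (c.get("rx_errors", 0) or c.get("tx_errors", 0) or c.get("carrier", 0))
    )


if __name__ == "__main__":
    print(suspect_slaves(parse_counters(SAMPLE)))
```

On a live node the same functions could be fed the captured output of ifconfig instead of SAMPLE. A flagged slave is only a hint about where to look first (cable, switch port, duplex negotiation, or the card itself), not a diagnosis.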
More information about the Beowulf mailing list
