3x100Mbps bonding question
Laurent Itti itti at cco.caltech.edu
Wed Oct 4 00:31:06 PDT 2000
- Previous message: I'm really sorry for html
- Next message: ACM/IEEE TCCC CoRR Cluster Computing Archive: call for articles (published/unpublished)
Hi,

About a month ago I asked for some design tips on specifying a small 16-node cluster with a 3x100Mbps network. It's here! Thanks again to all of you who helped! So far, installation is going well except for some trouble with the Ethernet channel bonding.

- 3 Ethernet cards per box, all RTL8139.

- All eth0 go to one switch, all eth1 to another switch, all eth2 to a third switch. There is no connection between the switches.

- Somehow, even if I disable serial, parallel, sound, etc. in the BIOS, it insists on sharing IRQs among several devices while leaving plenty of low IRQs free (e.g., it puts 2x Ethernet on IRQ 11 and IDE+Ethernet+VGA on IRQ 15, but nothing on 3, 5, 7, etc.). It works fine, but naively I would think that assigning one IRQ per NIC would give better performance. Is that true? Any tips on how to achieve that? (Abit SE6 i815 motherboards with Award BIOS.)

- Configuring the 3 NICs on separate subnets works great (all eth0 on 192.168.0.x, all eth1 on 192.168.1.x, all eth2 on 192.168.2.x). However, the machines get ultra sluggish when doing massive simultaneous transfers on all 3 NICs. I guess that's probably because the window size is too small, so we get flooded with too many IRQs? Any tip on how to change that would be greatly appreciated!

- Bonding is not working so far ;-( I create an ifcfg-bond0 script with the IP and related info; then (I am using Mandrake 7.2beta3) I just configure the ifcfg-ethX scripts with SLAVE=yes, MASTER=bond0 (plus other stuff as per the documentation in the kernel tree), add aliases in /etc/modules.conf, and there we go. It all seems to enslave and bond fine. ifconfig gives the expected results (all have the same IP and MAC addresses, etc.). The only thing that seems strange is that I have 4 routes for my local subnet, with devices bond0, eth0, eth1, eth2 (in that order). "route del -net 192.168.0.0 dev eth0" did not work, so I assumed the duplicate routes were OK? I have no default route and no gateway in any of the routes (so far).

  When pinging another bonded machine, I can see the LEDs on my switches flashing in sequence (1 packet on eth0, then one on eth1, etc.). That looks like the round-robin distribution of packets that I read about. The only problem is that only eth0 replies: running tcpdump on bond0 while also watching the switches, I get a reply only when the switch connected to eth0 blinks. The other traffic I see is ARP requests, all for the eth0 MAC addresses, never for the other ones.

  Did I forget anything obvious? Maybe I should put all the MAC addresses in /etc/ethers? Or is there some kernel feature that could be compiled into the stock kernel and would cause that? Do I need to do any configuration on the switches?

  In my naive view (officially, I am a neuroscientist, so please forgive me), if I send out on eth1 a packet for a bond0 MAC address (equal to the corresponding eth0 MAC address), I would not expect switch 1 to know that the packet should in fact go to the eth1 port of the bonded machine I am trying to reach? (Yet the LEDs do blink in pairs, source and destination ports, so I must just be very confused on that issue.)

Any suggestion/comment appreciated, and I'll keep experimenting!

Thanks!

-- laurent
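For readers following along, here is a minimal sketch of the kind of configuration described above, following the Red Hat-style layout used in the kernel tree's bonding documentation. It is not the actual configuration from this cluster: the address, netmask, and file paths are assumptions chosen for illustration, and Mandrake 7.2beta3 may differ in detail.

```shell
# Hypothetical values throughout: the address, netmask, and paths are
# illustrative assumptions, not taken from the actual cluster.

# /etc/modules.conf would carry a line such as:
#   alias bond0 bonding

# /etc/sysconfig/network-scripts/ifcfg-bond0 (the master device)
DEVICE=bond0
IPADDR=192.168.0.1
NETMASK=255.255.255.0
NETWORK=192.168.0.0
BROADCAST=192.168.0.255
ONBOOT=yes
BOOTPROTO=none
USERCTL=no

# /etc/sysconfig/network-scripts/ifcfg-eth0 (repeat for eth1 and eth2,
# changing only DEVICE)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
ONBOOT=yes
BOOTPROTO=none
USERCTL=no
```

In this layout the slave files carry no IPADDR of their own; only bond0 holds the address.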
More information about the Beowulf mailing list
