[vortex] transmit timed out - IRQ conflict?

Martin Siegert siegert@sfu.ca
Thu, 21 Jun 2001 14:03:18 -0700


Hi there:

The problem I am having may be related to the "NETDEV WATCHDOG:
eth0: transmit timed out" problem reported earlier, but I still
don't know how to solve it.

This is a dual AMD box (kernel 2.4.5, otherwise RH7.1) with five 3Com NICs,
three of which are used in a channel-bonded configuration. I am using the
3c59x and bonding drivers that come with the 2.4.5 kernel.

# lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD]: Unknown device 700c (rev 11)
00:01.0 PCI bridge: Advanced Micro Devices [AMD]: Unknown device 700d
00:07.0 ISA bridge: Advanced Micro Devices [AMD]: Unknown device 7410 (rev 02)
00:07.1 IDE interface: Advanced Micro Devices [AMD]: Unknown device 7411 (rev 01)
00:07.3 Bridge: Advanced Micro Devices [AMD]: Unknown device 7413 (rev 01)
00:07.4 USB Controller: Advanced Micro Devices [AMD]: Unknown device 7414 (rev 07)
00:08.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:09.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:0a.0 Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 30)
00:0e.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
00:0f.0 Ethernet controller: 3Com Corporation 3c980-TX [Fast Etherlink XL Server Adapter] (rev 78)
00:10.0 Ethernet controller: 3Com Corporation 3c980-TX [Fast Etherlink XL Server Adapter] (rev 78)

# ifconfig
bond0     Link encap:Ethernet  HWaddr 00:50:04:9B:30:E1  
          inet addr:172.16.0.1  Bcast:172.16.0.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2680666 errors:0 dropped:2653 overruns:0 carrier:0
          collisions:0 txqueuelen:0 

eth0      Link encap:Ethernet  HWaddr 00:01:02:60:16:30  
          inet addr:142.58.1.232  Bcast:142.58.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:83986 errors:0 dropped:0 overruns:0 frame:0
          TX packets:49387 errors:0 dropped:0 overruns:0 carrier:0
          collisions:1350 txqueuelen:100 
          Interrupt:10 Base address:0x1400 

eth1      Link encap:Ethernet  HWaddr 00:50:04:9B:30:E1  
          inet addr:172.16.0.1  Bcast:172.16.0.255  Mask:255.255.0.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:915955 errors:0 dropped:0 overruns:0 frame:0
          TX packets:894440 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          Interrupt:5 Base address:0x1480 

eth2      Link encap:Ethernet  HWaddr 00:50:04:9B:30:E1  
          inet addr:172.16.0.1  Bcast:172.16.0.255  Mask:255.255.0.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:915333 errors:0 dropped:0 overruns:104 frame:0
          TX packets:893120 errors:14 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          Interrupt:3 Base address:0x1800 

eth3      Link encap:Ethernet  HWaddr 00:50:04:9B:30:E1  
          inet addr:172.16.0.1  Bcast:172.16.0.255  Mask:255.255.0.0
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:915329 errors:0 dropped:0 overruns:106 frame:0
          TX packets:893106 errors:13 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          Interrupt:3 Base address:0x1880 

eth4      Link encap:Ethernet  HWaddr 00:E0:81:03:0F:7D  
          inet addr:172.17.0.1  Bcast:172.17.0.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:27 errors:0 dropped:0 overruns:0 frame:0
          TX packets:27 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          Interrupt:11 Base address:0x1c00 

For some reason eth2 and eth3 are sharing IRQ 3 (eth1, eth2 and eth3
are channel-bonded). The problem occurs at high throughput (running netpipe
at block sizes of 2097149 bytes). The following errors appear on the console
and the connection freezes completely; I have to reboot the box to recover.

Jun 21 12:02:38 test2 kernel: NETDEV WATCHDOG: eth2: transmit timed out
Jun 21 12:02:38 test2 kernel: eth2: transmit timed out, tx_status 00 status e681.
Jun 21 12:02:38 test2 kernel:   diagnostics: net 0cd8 media 8880 dma 0000003a.
Jun 21 12:02:38 test2 kernel: eth2: Interrupt posted but not delivered -- IRQ blocked by another device?
Jun 21 12:02:38 test2 kernel:   Flags; bus-master 1, dirty 892903(7) current 892903(7) 
Jun 21 12:02:38 test2 kernel:   Transmit list 00000000 vs. df25e3c0.
Jun 21 12:02:38 test2 kernel:   0: @df25e200  length 8000005e status 0001005e
... (14 more lines of the same kind)
Jun 21 12:02:38 test2 kernel: NETDEV WATCHDOG: eth3: transmit timed out
Jun 21 12:02:38 test2 kernel: eth3: transmit timed out, tx_status 00 status e681.
Jun 21 12:02:38 test2 kernel:   diagnostics: net 0cc6 media 8880 dma 0000003a.
Jun 21 12:02:38 test2 kernel: eth3: Interrupt posted but not delivered -- IRQ blocked by another device?
Jun 21 12:02:38 test2 kernel:   Flags; bus-master 1, dirty 892902(6) current 892902(6)
Jun 21 12:02:38 test2 kernel:   Transmit list 00000000 vs. df25d380.
Jun 21 12:02:38 test2 kernel:   0: @df25d200  length 8000005e status 0001005e
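
For reference, the IRQ sharing can be read straight off the ifconfig output
above. Here is a minimal Python sketch of that check, assuming the ifconfig
layout shown in this message (interface name in column 0 followed by
"Link encap:", and an "Interrupt:N" field somewhere in the stanza). The
function name shared_irqs and the trimmed SAMPLE text are only illustrative;
they are not part of any tool.

```python
import re
from collections import defaultdict


def shared_irqs(ifconfig_text):
    """Return {irq: [interfaces]} for every IRQ used by more than one interface.

    Assumes ifconfig-style stanzas: the interface name starts in column 0
    and the stanza contains a field of the form "Interrupt:N".
    """
    by_irq = defaultdict(list)
    current = None
    for line in ifconfig_text.splitlines():
        # A new stanza begins with the interface name in column 0.
        start = re.match(r"^(\S+)\s+Link encap:", line)
        if start:
            current = start.group(1)
            continue
        irq = re.search(r"Interrupt:(\d+)", line)
        if irq and current is not None:
            by_irq[int(irq.group(1))].append(current)
    # Keep only the IRQs that more than one interface reports.
    return {n: names for n, names in by_irq.items() if len(names) > 1}


# Trimmed from the ifconfig output above (illustrative input only).
SAMPLE = """\
eth1      Link encap:Ethernet  HWaddr 00:50:04:9B:30:E1
          Interrupt:5 Base address:0x1480

eth2      Link encap:Ethernet  HWaddr 00:50:04:9B:30:E1
          Interrupt:3 Base address:0x1800

eth3      Link encap:Ethernet  HWaddr 00:50:04:9B:30:E1
          Interrupt:3 Base address:0x1880
"""

if __name__ == "__main__":
    print(shared_irqs(SAMPLE))  # {3: ['eth2', 'eth3']}
```

Interfaces without an "Interrupt:" field (bond0 here) are simply skipped, so
only the physical NICs are compared.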

Is there a way to force all five NICs to use different IRQs?
Or what else can I do to solve the problem?

Thanks for your help in advance!

Cheers,
Martin

========================================================================
Martin Siegert
Academic Computing Services                        phone: (604) 291-4691
Simon Fraser University                            fax:   (604) 291-4242
Burnaby, British Columbia                          email: siegert@sfu.ca
Canada  V5A 1S6
========================================================================