Confirmation of transmit timeout with 3c59x and v0.99F (2.1.120)

Ben Gertzfield che@debian.org
Tue Sep 8 19:14:55 1998


Today I compiled and installed 2.1.120 of the Linux kernel, which
comes with v0.99F of the 3c59x driver -- and suddenly my 3c59x card no
longer worked, reporting the following error and stalling on any TCP
sends/receives:

Sep  8 14:25:29 gilgamesh kernel: Socket destroy delayed (r=0 w=96) 
Sep  8 14:25:33 gilgamesh kernel: Socket destroy delayed (r=0 w=96) 
Sep  8 14:25:48 gilgamesh kernel: eth0: transmit timed out, tx_status 00 status e000. 
Sep  8 14:25:48 gilgamesh kernel:   Flags; bus-master 1, full 1; dirty 67 current 83. 
Sep  8 14:25:48 gilgamesh kernel:   Transmit list 00000000 vs. c0099240. 
Sep  8 14:25:48 gilgamesh kernel:   0: @c0099210  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   1: @c0099220  length 8000002a status 8000002a 
Sep  8 14:25:48 gilgamesh kernel:   2: @c0099230  length 8000002a status 8000002a 
Sep  8 14:25:48 gilgamesh kernel:   3: @c0099240  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   4: @c0099250  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   5: @c0099260  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   6: @c0099270  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   7: @c0099280  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   8: @c0099290  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   9: @c00992a0  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   10: @c00992b0  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   11: @c00992c0  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   12: @c00992d0  length 80000092 status 00000092 
Sep  8 14:25:48 gilgamesh kernel:   13: @c00992e0  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   14: @c00992f0  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel:   15: @c0099300  length 8000002a status 0000002a 
Sep  8 14:25:48 gilgamesh kernel: eth0: Resetting the Tx ring pointer. 
Sep  8 14:26:43 gilgamesh kernel: eth0: transmit timed out, tx_status 00 status e000. 
Sep  8 14:26:43 gilgamesh kernel:   Flags; bus-master 1, full 1; dirty 138 current 154. 
Sep  8 14:26:43 gilgamesh kernel:   Transmit list 00000000 vs. c00992b0. 
Sep  8 14:26:43 gilgamesh kernel:   0: @c0099210  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   1: @c0099220  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   2: @c0099230  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   3: @c0099240  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   4: @c0099250  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   5: @c0099260  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   6: @c0099270  length 80000092 status 00000092 
Sep  8 14:26:43 gilgamesh kernel:   7: @c0099280  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   8: @c0099290  length 8000002a status 8000002a 
Sep  8 14:26:43 gilgamesh kernel:   9: @c00992a0  length 8000002a status 8000002a 
Sep  8 14:26:43 gilgamesh kernel:   10: @c00992b0  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   11: @c00992c0  length 8000002a status 0000002a 
Sep  8 14:26:43 gilgamesh kernel:   12: @c00992d0  length 8000002a status 0000002a 

and so on forever. 
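
For what it's worth, the numbers in both dumps are consistent with a completely full 16-entry Tx ring. Here is a rough sketch of the arithmetic, not the driver's code: the ring size and descriptor spacing are read off the dump above, the function and macro names are made up for illustration, and I am assuming that "dirty" and "current" are free-running counters that map to a slot modulo the ring size.

```c
/* Values read off the dump above; the names are hypothetical. */
#define TX_RING_SIZE 16u          /* the first dump lists entries 0..15 */
#define DESC_SIZE    0x10u        /* spacing between the @ addresses */
#define RING_BASE    0xc0099210u  /* address printed for entry 0 */

/* Descriptors queued by the driver but not yet reclaimed.
 * Assumes both counters only ever count up. */
static unsigned int tx_pending(unsigned int current, unsigned int dirty)
{
    return current - dirty;
}

/* Address of the ring slot a counter value maps to,
 * assuming slot = counter modulo the ring size. */
static unsigned int tx_slot_addr(unsigned int counter)
{
    return RING_BASE + (counter % TX_RING_SIZE) * DESC_SIZE;
}
```

With the first dump, tx_pending(83, 67) is 16 (every slot in use) and tx_slot_addr(83) is 0xc0099240, the address on the "Transmit list" line. The second dump gives tx_pending(154, 138) = 16 and tx_slot_addr(154) = 0xc00992b0, which again matches. So in both cases the ring is full and nothing is being reclaimed.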

I've tried compiling with and without SMP, both as a module and
monolithically, and I've tried adding "options 3c59x debug=1
options=0" to force 10baseT.

None of this helped. The only thing that works is downgrading to v0.99E
of the driver (the version that comes with v2.1.119 of the Linux kernel).

I noticed that Piete Brooks also sent a report of this problem to the
list yesterday, but it hasn't been answered yet.

I looked at the diff between 0.99E and 0.99F and noticed a lot of
changes to the autosensing of the media type -- could this be the
problem? My card is 100baseT, but the network is on a non-auto-sensing
10baseT hub. I'm not sure if this means anything, however.

This is what the kernel reports about my card upon bootup with v0.99F:

Sep  8 14:25:06 gilgamesh kernel: eth0: 3Com 3c905 Boomerang 100baseTx at 0xfcc0, 00:10:4b:36:86:20, IRQ 15 
Sep  8 14:25:06 gilgamesh kernel:   8K word-wide RAM 3:5 Rx:Tx split, autoselect/MII interface. 
Sep  8 14:25:06 gilgamesh kernel:   MII transceiver found at address 24, status 7869. 
Sep  8 14:25:06 gilgamesh kernel:   Enabling bus-master transmits and whole-frame receives. 
Sep  8 14:25:06 gilgamesh kernel: 3c59x.c:v0.99F 8/7/98 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html 

and this is what it reports with v0.99E (which still works):

Sep  8 15:57:10 gilgamesh kernel: eth0: 3Com 3c905 Boomerang 100baseTx at 0xfcc0, 00:10:4b:36:86:20, IRQ 15 
Sep  8 15:57:10 gilgamesh kernel:   8K word-wide RAM 3:5 Rx:Tx split, autoselect/NWay Autonegotiation interface. 
Sep  8 15:57:10 gilgamesh kernel:   MII transceiver found at address 24, status 7869. 
Sep  8 15:57:10 gilgamesh kernel:   Enabling bus-master transmits and whole-frame receives. 
Sep  8 15:57:10 gilgamesh kernel: 3c59x.c:v0.99E 5/12/98 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html 

Notice the difference -- 0.99F reports an MII interface, and 0.99E
reports an NWay Autonegotiation interface.

What is the difference between "autoselect/NWay Autonegotiation" and
"autoselect/MII"? The former works; the latter does not.

The change in the code seems to be this, between 0.99E and 0.99F:

--- 3c59x.c     1998/08/07 20:09:10     1.1.1.1
+++ 3c59x.c     1998/09/08 16:23:02     1.1.1.2
@@ -918,7 +952,7 @@
                           config.u.ram_width ? "word" : "byte",
                           ram_split[config.u.ram_split],
                           config.u.autoselect ? "autoselect/" : "",
-                          config.u.xcvr ? "NWay Autonegotiation" :
+                          config.u.xcvr > XCVR_ExtMII ? "<invalid transceiver>" :
                           media_tbl[config.u.xcvr].name);
                vp->default_media = config.u.xcvr;
                vp->autoselect = config.u.autoselect;


Why is this change here? It definitely seems to break my system.
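
To make the effect of that hunk concrete, here is a minimal sketch of the two expressions side by side. It is not the driver's code: the table contents and the enum values are assumptions for illustration, and the only thing taken from my boot log is that the transceiver code read from my card maps to the name "MII". The function names are made up.

```c
/* Assumed transceiver codes; only XCVR_ExtMII appears in the diff,
 * and its value here is a guess. */
enum { XCVR_10baseT = 0, XCVR_MII = 6, XCVR_ExtMII = 9 };

/* Hypothetical stand-in for media_tbl[].name. */
static const char *const media_names[] = {
    "10baseT", "10Mbs AUI", "undefined", "10base2", "100baseTX",
    "100baseFX", "MII", "undefined", "Autonegotiate", "MII-External"
};

/* 0.99E expression: any nonzero code is printed as NWay. */
static const char *name_099E(unsigned int xcvr)
{
    return xcvr ? "NWay Autonegotiation" : media_names[xcvr];
}

/* 0.99F expression: table lookup with a range check. */
static const char *name_099F(unsigned int xcvr)
{
    return xcvr > XCVR_ExtMII ? "<invalid transceiver>" : media_names[xcvr];
}
```

If that reading is right, the same transceiver code prints as "NWay Autonegotiation" under 0.99E and as "MII" under 0.99F. This hunk on its own would then only change the boot message, since the assignment to vp->default_media is untouched, and the actual breakage may be elsewhere in the diff.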

Is there anything else that needs to be reported about my system?

Ben

-- 
Brought to you by the letters V and H and the number 10.
"What's your order? I can SuperSize that." -- TMBG
Debian GNU/Linux -- where do you want to go tomorrow? http://www.debian.org/
I'm on FurryMUCK as Che, and EFNet and YiffNet IRC as Che_Fox.