Bug/weakness in tulip driver handling of long ethernet frames

Knut Omang knuto@scali.no
Thu Feb 17 05:47:58 2000


Included is a description, which we think might be useful,
of a problem we have been seeing on certain
systems with tulip-compatible cards.

This particular card is reported as LiteOn LNE100TX (rev 32),
but similar situations have also been seen on systems with D-Link DFE530TX.


We could have formulated this as a patch, but there might be issues with
other versions of the chipset that we are not aware of, hence this
description.

We got the following stack trace from debugging a system running the 2.2.13 kernel
with the kdb patch:

static const char version[] = "tulip.c:v0.89H 5/23/98 becker@cesdis.gsfc.nasa.gov\n";


[1]kdb> bt
    EBP       EIP         Function(args)
0xc8739ea4 0xc0115b8c  panic+0xfc( 0xc01e9a40, 0xd000fa52, 0x7044, 0x7044, 0xcf8d5440)
0xc8739ec0 0xc0153834  skb_under_panic( 0xcc266620, 0x7044, 0xd000fa52, 0x1, 0xc06ff800)
0xc8739f04 0xd000fa4f  tulip_probe+0x3a03( 0xcf8d5360, 0xcf8dbbc0, 0x4000001, 0x4, 0x34)
0xc8739f68 0xd000f48b  tulip_probe+0x343f( 0x10, 0xcf8d5360, 0xc8739fc4, 0x200, 0xcf8dbbc0)
0xc8739f88 0xc010b13c  handle_IRQ_event+0x58( 0x10, 0xc8739fc4, 0xcf8dbbc0)
0xc8739fa8 0xc0110027  do_level_ioapic_IRQ+0x63( 0x10, 0xc8739fc4, 0x449c8788)
0xc8739fbc 0xc010b2c3  do_IRQ+0x3b( 0x449c8788, 0x45692b08, 0x200, 0x20, 0x6c00)
0x0 0xc010a248  common_interrupt
[1]kdb> md 0xc01e9a40
c01e9a40: 75706b73 766f3a74 203a7265 253a7025  skput:over: %p:%
c01e9a50: 75702064 64253a74 76656420 0073253a  d put:%d dev:%s.
c01e9a60: 75706b73 6e753a74 3a726564 3a702520  skput:under: %p:
c01e9a70: 70206425 253a7475 65642064 73253a76  %d put:%d dev:%s
c01e9a80: 00000000 00000000 00000000 00000000  ................
c01e9a90: 00000000 00000000 00000000 00000000  ................
c01e9aa0: 7774654e 696b726f 6220676e 65666675  Networking buffe
c01e9ab0: 69207372 7375206e 20202065 20202020  rs in use       
[1]kdb> md ssci_debug


skb_under_panic == skb_over_panic   (because the return address of the call
				    to panic from skb_over_panic is the next
				    instruction, which is actually the first
				    instruction of skb_under_panic)
tulip_probe+0x3a03 == tulip_rx+0x223
tulip_probe+0x343f == tulip_interrupt+0xb7

i.e. the driver fails in the call to skb_put

Below is a dump of some of the rx descriptors.
Valid descriptors start at 0xc06ff810 (each is 16 bytes long).
The 2.2.13 driver blows up when processing the descriptor at 0xc06ff940.

According to the tulip 21140 chipset documentation,
the status field value 0x704803c0 indicates a packet length of 0x3048
with the FF (Filtering Fail) bit set (bit 30)
and the CS and TL bits set (0xc0).

Neither the 2.2.13 kernel version (v0.89H) of the tulip driver nor the
2.2.14 kernel version (v0.91g) examines the FF (Filtering Fail), the TL
(Frame Too Long) or the CS (Collision Seen) bit (bits 30, 7 and 6 of the
RDES0 status word, respectively). That might be OK for the 2.2.14 case if
the specification were correct.

The specification says that the ES bit (bit 15), which is checked by the 2.2.14
driver, will be set if the CS bit is set. However, this is not the case in
the situation here: the status value 0x704803c0 has CS set and ES clear.

The 2.2.13 driver does not examine the mentioned error bits and tries to use
a 0x600 byte long skb for a frame it takes to be 0x7048 bytes long (the upper
16 bits of the status word, FF bit included), an error luckily caught by the
kernel with a panic.

In 2.2.14 this particular situation will probably be handled OK due to a
changed test:

line 2840		if ((status & 0x38008300) != 0x0300) {

which will cause the packet to be dropped when its size is 2048 bytes or
larger. But we are not convinced that this will handle cases where the
length is "ok" but one of the additional error bits is set (FF, CS).
Wouldn't it be better to check for FF, CS and TL explicitly? That way
the test in lines 2866 through 2873 might be unnecessary.

e.g.

if ((status & 0x780083c0) != 0x0300) {

instead?

For reference, the test in lines 2866 through 2873:

#ifndef final_version
			if (pkt_len > 1518) {
				printk(KERN_WARNING "%s: Bogus packet size of %d (%#x).\n",
					   dev->name, pkt_len, pkt_len);
				pkt_len = 1518;
				tp->stats.rx_length_errors++;
			}
#endif


[1]kdb> md 0xcf8d5360 
cf8d5360: cf8d5440 00000000 00000000 00000000  @T.O............
cf8d5370: 00000000 0000d000 00000010 00000001  .....P..........
cf8d5380: 00000001 00000000 00000000 00000000  ................
cf8d5390: 00000000 00000002 00000002 0000000b  ................
cf8d53a0: d000fd00 00000000 08e46ea3 08e46e8f  .}.P....#nd..nd.
cf8d53b0: 00001043 000005dc 000e0001 c06ff800  C...\........xo@
cf8d53c0: ffffffff 0000ffff f04800c0 0600cdb0  ..@.Hp0M..
cf8d53d0: cf8dbd80 00000001 00000000 00000000  .=.O............
[1]kdb> md cf8d5440
cf8d5440: 30687465 00000000 00000000 00000000  eth0............
cf8d5450: 00000000 00000000 00000000 00000000  ................
cf8d5460: 00000000 00000000 00000000 00000000  ................
cf8d5470: 00000000 00000000 00000000 cf8d5fc0  ............@_.O
cf8d5480: c0268f20 c0159fec c015a05c 00000000   .&@l..@\ .@....
cf8d5490: c02690c0 00000000 00000000 00000001  @.&@............
cf8d54a0: cf8d54a0 cf8d54a0 00000000 cf8d5360   T.O T.O....`S.O
cf8d54b0: 00000000 00000000 00000000 00000000  ................
[1]kdb> md 0xc06ff800
c06ff800: 00000000 00000000 00000000 00000000  ................
c06ff810: 004a0300 00000600 02b02010 006ff820  ..J...... 0. xo.
c06ff820: 00400320 00000600 0f8c6010 006ff830   .@......`..0xo.
c06ff830: 00400728 00000600 0f480810 006ff840  (.@.......H.@xo.
c06ff840: 00400728 00000600 0b27f810 006ff850  (.@......x'.Pxo.
c06ff850: 00400300 00000600 027b0810 006ff860  ..@.......{.`xo.
c06ff860: 00400728 00000600 08649010 006ff870  (.@.......d.pxo.
c06ff870: 00400728 00000600 05869010 006ff880  (.@..........xo.
[1]kdb> 
[1]kdb> md
c06ff880: 00400728 00000600 0eef0810 006ff890  (.@.......o..xo.
c06ff890: 00400728 00000600 023ab010 006ff8a0  (.@......0:. xo.
c06ff8a0: 00400728 00000600 0b213810 006ff8b0  (.@......8!.0xo.
c06ff8b0: 00400728 00000600 039d2010 006ff8c0  (.@...... ..@xo.
c06ff8c0: 00400728 00000600 0b212810 006ff8d0  (.@......(!.Pxo.
c06ff8d0: 00400728 00000600 0b863810 006ff8e0  (.@......8..`xo.
c06ff8e0: 00400728 00000600 056cf010 006ff8f0  (.@......pl.pxo.
c06ff8f0: 00400728 00000600 056ce010 006ff900  (.@......`l..yo.
[1]kdb> 
[1]kdb>md
c06ff900: 00400728 00000600 08648810 006ff910  (.@.......d..yo.
c06ff910: 00400728 00000600 01d2f810 006ff920  (.@......xR. yo.
c06ff920: 00400728 00000600 023aa010 006ff930  (.@...... :.0yo.
c06ff930: 00400728 00000600 0b213010 006ff940  (.@......0!.@yo.
c06ff940: 704803c0 00000600 027b1810 006ff950  @.Hp......{.Pyo.
c06ff950: 00460300 00000600 0f8c7810 006ff960  ..F......x..`yo.
c06ff960: 004a0300 00000600 0eef0010 006ff970  ..J.......o.pyo.
c06ff970: 004a0300 00000600 0b862810 006ff980  ..J......(...yo.


Regards,


Knut Omang, Ph.D.
Senior Software Architect, Scali AS Computer Systems /
Assistant Professor, Dep. of Informatics, University of Oslo, Norway
e-mail: knuto@scali.no / knuto@ifi.uio.no
Voice: +47 22 62 89 66 / +47 22 50 14 11 / +47 22 85 24 34
http://www.scali.com    Fax:   +47 22 62 89 51
http://www.ifi.uio.no:/~knuto/

