Bouncing tulip bug? Or failing hw?

phil@Stimpy.netroedge.com
Wed Apr 5 16:30:24 2000


Hello, I've got a firewall with a couple Tulips on kernel 2.2.14.  One
is connected (via crossover) to a 10Base-T bridge and the other is
connected to a (Cisco Catalyst 2924) 10/100 switch (sorry for wrapped
lines):

[...]
Apr  5 13:20:11 fw1-out kernel: tulip.c:v0.91g 7/16/99
 becker@cesdis.gsfc.nasa.gov 
Apr  5 13:20:11 fw1-out kernel: eth1: Digital DS21140 Tulip rev 34 at
 0x7400, 00:00:94:83:AA:F4, IRQ 5. 
Apr  5 13:20:11 fw1-out kernel: eth1:  EEPROM default media type
 10baseT. 
Apr  5 13:20:11 fw1-out kernel: eth1:  Index #0 - Media 10baseT (#0)
 described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth1:  Index #1 - Media 100baseTx (#3)
 described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth1:  Index #2 - Media 10baseT-FD
 (#4) described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth1:  Index #3 - Media 100baseTx-FD
 (#5) described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth1:  MII transceiver #1 config 3100
 status 786f advertising 01e1. 
Apr  5 13:20:11 fw1-out kernel: eth2: Digital DS21140 Tulip rev 34 at
 0x7000, 00:00:94:A4:34:36, IRQ 10. 
Apr  5 13:20:11 fw1-out kernel: eth2:  EEPROM default media type
 10baseT. 
Apr  5 13:20:11 fw1-out kernel: eth2:  Index #0 - Media 10baseT (#0)
 described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth2:  Index #1 - Media 100baseTx (#3)
 described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth2:  Index #2 - Media 10baseT-FD
 (#4) described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth2:  Index #3 - Media 100baseTx-FD
 (#5) described by a 21140 non-MII (0) block. 
Apr  5 13:20:11 fw1-out kernel: eth2:  MII transceiver #1 config 1000
 status 782d advertising 01e1. 
[...]
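
As a reading aid, the "MII transceiver" lines above can be decoded with a
short sketch like the one below.  It assumes that "config", "status" and
"advertising" are MII registers 0, 1 and 4 and that they follow the
standard IEEE 802.3 clause 22 bit layout; the table and function names
are made up for illustration and are not part of the driver.

```python
# Sketch: decode the MII words printed by the tulip probe lines.
# Assumption: "config" = register 0 (BMCR), "status" = register 1 (BMSR),
# "advertising" = register 4 (ANAR), standard IEEE 802.3 clause 22 layout.

BMCR_BITS = {  # register 0, printed as "config"
    15: "reset",
    14: "loopback",
    13: "speed-100",
    12: "autoneg-enable",
    11: "power-down",
    10: "isolate",
    9: "restart-autoneg",
    8: "full-duplex",
}

BMSR_BITS = {  # register 1, printed as "status"
    15: "100baseT4",
    14: "100baseTx-FD",
    13: "100baseTx",
    12: "10baseT-FD",
    11: "10baseT",
    6: "preamble-suppress",
    5: "autoneg-complete",
    4: "remote-fault",
    3: "autoneg-capable",
    2: "link-up",
    1: "jabber-detect",
    0: "extended-caps",
}

ANAR_BITS = {  # register 4, printed as "advertising" (selector bits ignored)
    9: "100baseT4",
    8: "100baseTx-FD",
    7: "100baseTx",
    6: "10baseT-FD",
    5: "10baseT",
}


def decode(value, names):
    """Return the names of all known bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True)
            if value & (1 << bit)]


if __name__ == "__main__":
    # Values copied from the eth1 and eth2 probe lines above.
    for label, value, table in [
        ("eth1 config", 0x3100, BMCR_BITS),
        ("eth1 status", 0x786F, BMSR_BITS),
        ("eth2 config", 0x1000, BMCR_BITS),
        ("eth2 status", 0x782D, BMSR_BITS),
        ("advertising", 0x01E1, ANAR_BITS),
    ]:
        print(f"{label} {value:04x}: {', '.join(decode(value, table))}")
```

Under those assumptions, both transceivers report link up with
autonegotiation complete at probe time.  Some status bits are latched, so
a single read is only a snapshot.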

The one connected to the switch (eth1) will occasionally (maybe once
every couple of days?) start 'bouncing' between 100 and 10 Mb:

Apr  5 13:01:33 fw1-out kernel: eth1: transmit timed out, switching to
 100baseTx-FD media. 
Apr  5 13:01:38 fw1-out kernel: eth1: 21140 transmit timed out, status
 fc6988c7, SIA ffffff1b ffffffff 1c09fdc0 fffffec8, resetting... 
Apr  5 13:01:38 fw1-out kernel: eth1: transmit timed out, switching to
 10baseT media. 
Apr  5 13:01:43 fw1-out kernel: eth1: 21140 transmit timed out, status
 fc6988c7, SIA ffffff0b ffffffff 1c09fdc0 fffffec8, resetting... 

These lines repeat over and over, and eth1 remains down.
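
To put a number on how often the bouncing happens, the media switches
can be counted from the syslog text.  The following is only a sketch:
the regular expression is written against the message wording shown
above, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the "switching to <media> media" messages shown above.  \s+ also
# tolerates the line wrapping seen in this mail.  The separate
# "21140 transmit timed out, status ..." lines do not match this pattern.
SWITCH_RE = re.compile(
    r"kernel: (eth\d+): transmit timed out, switching to\s+(\S+) media"
)


def count_media_switches(log_text):
    """Count (interface, media) switch messages found in log_text."""
    counts = Counter()
    for match in SWITCH_RE.finditer(log_text):
        counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "Apr  5 13:01:33 fw1-out kernel: eth1: transmit timed out, switching to\n"
        " 100baseTx-FD media. \n"
        "Apr  5 13:01:38 fw1-out kernel: eth1: 21140 transmit timed out, status\n"
        " fc6988c7, SIA ffffff1b ffffffff 1c09fdc0 fffffec8, resetting... \n"
        "Apr  5 13:01:38 fw1-out kernel: eth1: transmit timed out, switching to\n"
        " 10baseT media. \n"
    )
    for (iface, media), n in sorted(count_media_switches(sample).items()):
        print(f"{iface} -> {media}: {n}")
```

Run over a full log file, a steadily growing count for one interface
would show the flip-flop between the two media types described above.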

This condition can be (at least temporarily) fixed by doing a warm
restart of the machine (shutdown -r now).

Do you suppose this is a problem with the hardware failing?  Or a bug
in the driver?  Or could I possibly solve this by 'fixing' the ethernet
speed of the card at 100 or 10 Mb?  I read some discussion of specific
issues with Tulips and kernel 2.2.14.  Do you suppose this is an effect
of that?

The machine had been doing well for quite a while (at least several
weeks anyway), and I've begun to experience this only in the last week
or so, which leads me to believe that there is a hardware problem.

I was going to solve this by buying a new NIC (like an Intel one), but
I figured I'd fire off an email to the mailing list to see what you
folks think.

Thanks!


Phil

-- 
Philip Edelbrock -- IS Manager -- Edge Design, Corvallis, OR
   phil@netroedge.com -- http://www.netroedge.com/~phil
 PGP F16: 01 D2 FD 01 B5 46 F4 F0  3A 8B 9D 7E 14 7F FB 7A