[tulip] Re: eth0: transmit timed out

Jim McCloskey mcclosk@ling.ucsc.edu
Fri Oct 11 18:17:01 2002


I wrote:

|> Users experience impossibly slow response rates, and the message below
|> is repeated on the console and in the logs ad nauseam:
|>
|> Oct  7 21:30:34 localhost kernel: NETDEV WATCHDOG: eth0: transmit timed out
|> Oct  7 21:30:34 localhost kernel: eth0: Transmit timed out, status
|>     fc664010, CSR12 00000000, resetting...

Donald Becker wrote:
 
|> As I usually respond, the solution is
|>
|> mkdir /tmp/netdrivers/
|> cd /tmp/netdrivers/
|> ncftp ftp://ftp.scyld.com/pub/network/netdrivers.tgz
|> tar xfvz netdrivers.tgz
|> make
|> make install

Thank you very much for this help, and for all your work in these
areas.

I followed this advice:

Oct  9 14:35:42 localhost kernel: pci-scan.c:v1.10 7/13/2002  Donald Becker <becker@scyld.com> http://www.scyld.com/linux/drivers.html
Oct  9 14:35:42 localhost kernel: tulip.c:v0.95b 8/2/2002  Written by Donald Becker <becker@scyld.com>
Oct  9 14:35:42 localhost kernel:   http://www.scyld.com/network/tulip.html
Oct  9 14:35:42 localhost kernel: eth0: ADMtek Centaur-P rev 17 at 0xc881b000, 00:20:78:1F:1E:64, IRQ 10.
Oct  9 14:35:42 localhost kernel: eth0:  MII transceiver #1 config 3000 status 786d advertising 01e1.

and everything was fine for about 28 hours.  At that point, though,
net connectivity became unreliable:

Oct 10 18:40:31 localhost lpd[3623]: unable to get official name for remote machine panini.ucsc.edu

And by the following morning (by 8:00AM on October 11th) the interface
was still up, but the network was unreachable (the machine couldn't be
pinged and couldn't ping). Bringing the interface down and back up
again `solved' the problem (everything still seems to be fine, six
hours later).
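
The down/up workaround described above could be automated until the
underlying cause is found. The following is only a sketch under stated
assumptions: the function names are invented for illustration, the host
to ping is whatever the reader chooses, and the `ip link` commands stand
in for whatever the local system uses to bounce the interface (ifconfig
or ifdown/ifup on a system of this vintage). It hides the fault rather
than fixing it.

```python
import subprocess

def host_reachable(host, run=subprocess.run):
    """Return True if a single ping to `host` succeeds."""
    # -c 1: send one probe; -W 2: wait at most 2 seconds for the reply.
    result = run(["ping", "-c", "1", "-W", "2", host],
                 stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0

def bounce_if_unreachable(host, iface="eth0", run=subprocess.run):
    """Take `iface` down and up again if `host` cannot be pinged.

    Returns True if the interface was bounced, False otherwise.
    `run` is injectable so the logic can be exercised without root.
    """
    if host_reachable(host, run=run):
        return False
    # Assumed commands; substitute the local equivalent if needed.
    run(["ip", "link", "set", "dev", iface, "down"], check=True)
    run(["ip", "link", "set", "dev", iface, "up"], check=True)
    return True
```

Called from cron every few minutes with the address of a nearby router,
this would log nothing by itself, so it is worth recording each bounce
with a timestamp. Depending on the setup, the default route may also
need to be restored after the interface comes back up.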

I've been all through the logs, and unfortunately there is no trace
there of what the trouble might be or might have been (unlike the
timeout messages that came so thick and fast with the earlier version).

I feel stupid writing in with so little information, but with nothing
in the logs, I don't know what I could usefully report.
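
When the kernel logs nothing, one way to have something to report next
time is to record the interface counters at regular intervals and look
at which ones moved (or stopped moving) around the failure. The sketch
below assumes the usual 16-column /proc/net/dev layout (8 receive
counters followed by 8 transmit counters); the function and field names
are chosen for illustration only.

```python
import time

RX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "frame", "compressed", "multicast")
TX_FIELDS = ("bytes", "packets", "errs", "drop",
             "fifo", "colls", "carrier", "compressed")

def parse_net_dev(text, iface):
    """Return {'rx': {...}, 'tx': {...}} for `iface`, or None if absent."""
    for line in text.splitlines():
        # Data lines look like "  eth0: 123 4 ..." or "eth0:123 4 ...";
        # the two header lines contain no colon and are skipped.
        name, sep, rest = line.partition(":")
        if not sep or name.strip() != iface:
            continue
        values = [int(v) for v in rest.split()]
        if len(values) < 16:
            return None  # unexpected layout; do not guess
        return {"rx": dict(zip(RX_FIELDS, values[:8])),
                "tx": dict(zip(TX_FIELDS, values[8:16]))}
    return None

def snapshot(iface="eth0", path="/proc/net/dev"):
    """Return (timestamp, counters) for `iface` read from `path`."""
    with open(path) as f:
        counters = parse_net_dev(f.read(), iface)
    return time.strftime("%b %d %H:%M:%S"), counters
```

Appending the result of snapshot() to a file every minute or so would
show, for example, whether the transmit packet count stops advancing or
the error and carrier counts start climbing before connectivity is lost.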

Once again, this is kernel 2.4.18, hand-compiled from the downloaded
tarball.

tulip-diag shows:
----------------------------------------------------------------------
aptos# tulip-diag -maef
tulip-diag.c:v2.08 5/15/2001 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a ADMtek AL985 Centaur-P adapter at 0xd000.
ADMtek AL985 Centaur-P chip registers at 0xd000:
 0x00: fff98000 ffffffff ffffffff 07c83000 07c83200 fc264010 ff972117 ffffebff
 0x40: fffe0000 fff597f8 00000000 fffe0000 00000000 00000200 00000000 c40ffec8
 Extended registers:
 80: 00264010 03fe6bff a04c0005 ffffffff 00000000 07c83250 07c83020 ffe0f000
 a0: f0000000 1f782000 ffff641e 00000000 40000000 00000000 00000000 00000000
 c0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
 e0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 20000027
 Comet duplex is reported in the MII status registers.
 Transmit started, Receive started, half-duplex.
  The Rx process state is 'Waiting for packets'.
  The Tx process state is 'Waiting for Tx to finish'.
  The transmit threshold is 128.
  Comet MAC address registers 1f782000 ffff641e
  Comet multicast filter 0000000040000000.
EEPROM 256 words, 8 address bits.
  Ethernet MAC Station Address 00:20:78:1f:1e:64.
  Default connection type 'Autosense'.
  PCI IDs Vendor 1317 Device 0985  Subsystem 1317 0574
  PCI min_grant 255 max_latency 255.
  CSR18 power-up setting 0xa04c****.
 MII PHY found at address 1, status 0x786d.
 MII PHY found at address 2, status 0x786d.
 MII PHY found at address 3, status 0x786d.
 MII PHY found at address 4, status 0x786d.
----------------------------------------------------------------------
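
The status words in the output above can also be unpacked by hand. The
sketch below assumes the standard tulip CSR5 layout (Rx process state in
bits 19:17, Tx process state in bits 22:20) and the IEEE 802.3 MII
layouts for the status and advertising words; the function and table
names are made up for illustration. Only the two state names that
tulip-diag itself printed above are included, so any other state number
is returned without a name.

```python
def csr5_states(csr5):
    """Return (rx_state, tx_state) numbers from a CSR5 status word."""
    return (csr5 >> 17) & 0x7, (csr5 >> 20) & 0x7

# Names taken from the tulip-diag output above; other states are not
# named here rather than guessed.
RX_STATE_NAMES = {3: "Waiting for packets"}
TX_STATE_NAMES = {2: "Waiting for Tx to finish"}

# MII status register (register 1), assumed IEEE 802.3 layout.
MII_STATUS_BITS = [
    (0x8000, "100baseT4 capable"),
    (0x4000, "100baseTx-FD capable"),
    (0x2000, "100baseTx-HD capable"),
    (0x1000, "10baseT-FD capable"),
    (0x0800, "10baseT-HD capable"),
    (0x0040, "preamble suppression"),
    (0x0020, "autonegotiation complete"),
    (0x0010, "remote fault"),
    (0x0008, "autonegotiation capable"),
    (0x0004, "link up"),
    (0x0002, "jabber detected"),
    (0x0001, "extended registers"),
]

# MII advertising register (register 4), assumed IEEE 802.3 layout.
MII_ADVERTISE_BITS = [
    (0x0400, "pause"),
    (0x0200, "100baseT4"),
    (0x0100, "100baseTx-FD"),
    (0x0080, "100baseTx-HD"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT-HD"),
]

def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in table if value & mask]
```

Under those assumptions, the CSR5 value in the register dump (fc264010)
gives Rx state 3 and Tx state 2, which agrees with the two process-state
lines tulip-diag printed, while the value logged with the earlier
timeouts (fc664010) gives Rx state 3 and Tx state 6. The MII status 786d
decodes to link up with autonegotiation complete, and the advertising
word 01e1 to all four 10 and 100 Mbit modes.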

Any help or advice would be greatly appreciated,

Jim