[eepro100] eepro100 driver frequently dying in Linux 2.4.4

Tim Cutts tim.cutts@incyte.com
Fri, 1 Jun 2001 16:21:49 +0100


On Mon, May 21, 2001 at 11:08:07AM -0400, Donald Becker wrote:
> On Mon, 21 May 2001, Tim Cutts wrote:
> > On Mon, May 21, 2001 at 06:03:21AM -0400, Kallol Biswas wrote:
> > > Hello Tim,
> > >          Just grab any program to update eeprom and turn off the sleep mode
> > > bit. If you don't find any I can write one for you. But it will be my first
> > > linux utility and I am also under high stress chasing these last minute
> > > bugs, hope I myself don't cause  master abort and return -ve(all ffff) :).
> > 
> > So, anyone on the list know of such a program?  I don't know the
> > physical details of updating eeprom, so I'm loath to write such a program
> > myself!
> 
> I've updated the eepro100-diag.c program to show the sleep mode bit.
>     eepro100-diag.c:v2.04 5/21/2001
> See the announcement that I'll send out in a minute or two.
> 
> The diagnostic does include code to write the EEPROM.  The write code
> modifies the mostly-harmless RPL parameter field, and recomputes the checksum.
> If someone reports an EEPROM with the sleep mode bit set, I'll make the
> trivial code change to clear it with the '-G' option.

Interestingly, this program does not report that the sleep mode bit is
set, but I am still seeing the symptoms associated with that issue;
processes may hang in an uninterruptible wait state indefinitely under
high read load conditions.  Here's the output of eepro100-diag on one of
our STL2 motherboard machines, with the network interface taken down and
the eepro100 module removed from the kernel:

eepro100-diag.c:v2.04 5/21/2001 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0x5400.
i82557 chip registers at 0x5400:
  00000000 00000000 00000000 00080002 182541e1 00000600
  No interrupt sources are pending.
   The transmit unit state is 'Idle'.
   The receive unit state is 'Idle'.
  This status is unusual for an activated interface.
EEPROM contents, size 64x16:
    00: d000 b7b7 a017 0d1b 0000 0201 4701 0000
  0x08: 0000 0000 48e0 1229 8086 0040 0000 0000
      ...
  0x30: 012c 4000 400c 0000 0000 0000 0000 0000
  0x38: 0000 0000 0000 4030 0000 0000 0000 9f98
 The EEPROM checksum is correct.
Intel EtherExpress Pro 10/100 EEPROM contents:
  Station address 00:D0:B7:B7:17:A0.
  Board assembly 000000-000, Physical connectors present: RJ45
  Primary interface chip i82555 PHY #1.
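
For reference, the station address line can be reproduced from the first
three EEPROM words in the dump: each 16-bit word holds two address bytes,
low byte first. The Python sketch below shows that decoding, together with
the checksum rule assumed here for this card (the 16-bit sum of all EEPROM
words equals 0xBABA). The function names are illustrative and are not taken
from eepro100-diag.c. The checksum cannot be re-verified from this post
because the dump elides rows 0x10 through 0x28.

```python
def station_address(eeprom_words):
    """Decode the MAC address from the first three EEPROM words."""
    octets = []
    for word in eeprom_words[:3]:
        octets.append(word & 0xFF)         # low byte is the earlier octet
        octets.append((word >> 8) & 0xFF)  # high byte is the later octet
    return ":".join(f"{o:02X}" for o in octets)


def checksum_ok(eeprom_words, expected=0xBABA):
    """Assumed rule: the 16-bit sum of every EEPROM word equals 0xBABA."""
    return (sum(eeprom_words) & 0xFFFF) == expected


# First row of the dump above.
row0 = [0xD000, 0xB7B7, 0xA017, 0x0D1B, 0x0000, 0x0201, 0x4701, 0x0000]
print(station_address(row0))  # 00:D0:B7:B7:17:A0
```
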

lspci -vv reports:

00:03.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
        Subsystem: Intel Corporation: Unknown device 1229
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 8 min, 56 max, 66 set, cache line size 08
        Interrupt: pin A routed to IRQ 18
        Region 0: Memory at fb101000 (32-bit, non-prefetchable)
        Region 1: I/O ports at 5400
        Region 2: Memory at fb000000 (32-bit, non-prefetchable)
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- AuxPwr- DSI+ D1+ D2+ PME+
                Status: D0 PME-Enable- DSel=0 DScale=2 PME-

for this device.

So, it doesn't look like the sleep mode bit is the problem here.
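
A quick cross-check of the raw dump agrees with the diagnostic. The Python
sketch below assumes that the sleep mode flag is bit 1 (mask 0x0002) of
EEPROM word 0x0A; that location is an assumption and should be confirmed
against the driver source before relying on it. The helper name is
hypothetical.

```python
SLEEP_MODE_WORD = 0x0A    # assumed word index of the flag
SLEEP_MODE_MASK = 0x0002  # assumed bit mask of the flag


def sleep_mode_enabled(eeprom_words):
    """Return True if the assumed sleep mode bit is set."""
    return bool(eeprom_words[SLEEP_MODE_WORD] & SLEEP_MODE_MASK)


# Rows 0x00 and 0x08 of the dump above, enough to reach word 0x0A.
words = [
    0xD000, 0xB7B7, 0xA017, 0x0D1B, 0x0000, 0x0201, 0x4701, 0x0000,
    0x0000, 0x0000, 0x48E0, 0x1229, 0x8086, 0x0040, 0x0000, 0x0000,
]
print(sleep_mode_enabled(words))  # False
```
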

Incidentally, Donald's latest driver seems to behave better than any
other, but I am still seeing this problem even with that driver.

This is getting very frustrating - I have these four machines sitting
here virtually unusable because of this.  :-(  Should I just go out and
buy PCI network cards that actually work?  Any thoughts on possible
manufacturers to choose?

Tim.

-- 
Tim Cutts PhD                    Tel: +44 1223 454918
Incyte Genomics
Botanic House, 100 Hills Road, Cambridge, CB2 1FF, UK