[tulip] Question about RX-Drop errors and an apparent chipset lockup on the Phobos p430tx NIC.

Ben Greear greearb@candelatech.com
Sun Oct 20 16:44:01 2002


Donald Becker wrote:
> On Sun, 20 Oct 2002, Ben Greear wrote:
> 
>>Becker's driver dies almost immediately with these console errors:
> 
> 
> Which driver version are you using?  (0.95 or later is fine.)
> What is the detection message?

I just pulled it from your site, here is the version info:

Oct 20 12:44:20 demo2 kernel: tulip.c:v0.95 6/21/2002  Written by Donald Becker <becker@scyld.com>
Oct 20 12:44:20 demo2 kernel:   http://www.scyld.com/network/tulip.html
Oct 20 12:44:20 demo2 rc: Starting sshd:  succeeded
Oct 20 12:44:20 demo2 kernel: eth1: Digital DS21143-xD Tulip rev 65 at 0xce8bf000, 00:60:F5:07:03:D4, IRQ 11.
Oct 20 12:44:20 demo2 kernel: eth1:  EEPROM default media type Autosense.
Oct 20 12:44:20 demo2 kernel: eth1:  Index #0 - Media MII 100baseFx-FDX (#17) described by a 21140 non-MII (0) block.
Oct 20 12:44:20 demo2 kernel: eth1:  MII transceiver #1 config 3100 status 7829 advertising 01e1.
Oct 20 12:44:21 demo2 kernel: eth2: Digital DS21143-xD Tulip rev 65 at 0xce8c1000, 00:60:F5:07:03:D5, IRQ 10.
Oct 20 12:44:21 demo2 kernel: eth2:  EEPROM default media type Autosense.
Oct 20 12:44:21 demo2 kernel: eth2:  Index #0 - Media MII 100baseFx-FDX (#17) described by a 21140 non-MII (0) block.
Oct 20 12:44:21 demo2 kernel: eth2:  MII transceiver #1 config 3100 status 7829 advertising 01e1.
Oct 20 12:44:21 demo2 kernel: eth3: Digital DS21143-xD Tulip rev 65 at 0xce8cb000, 00:60:F5:07:03:D6, IRQ 7.
Oct 20 12:44:21 demo2 kernel: eth3:  EEPROM default media type Autosense.
Oct 20 12:44:21 demo2 kernel: eth3:  Index #0 - Media MII 100baseFx-FDX (#17) described by a 21140 non-MII (0) block.
Oct 20 12:44:21 demo2 kernel: eth3:  MII transceiver #1 config 3100 status 7829 advertising 01e1.
Oct 20 12:44:21 demo2 kernel: eth4: Digital DS21143-xD Tulip rev 65 at 0xce8cd000, 00:60:F5:07:03:D7, IRQ 5.
Oct 20 12:44:21 demo2 kernel: eth4:  EEPROM default media type Autosense.
Oct 20 12:44:21 demo2 kernel: eth4:  Index #0 - Media MII 100baseFx-FDX (#17) described by a 21140 non-MII (0) block.
Oct 20 12:44:21 demo2 kernel: eth4:  MII transceiver #1 config 3100 status 7829 advertising 01e1.
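
For anyone reading the "MII transceiver" lines above, the status and advertising words can be unpacked bit by bit. The following is only a rough sketch in Python, assuming the generic MII register layout (basic status register and autonegotiation advertisement register); the table names and the decode_bits helper are made up for illustration and are not part of either driver. On a fiber PHY the 100 Mb bits may not mean exactly what the generic names suggest.

```python
# Generic MII basic status register (BMSR) bits, per the standard layout.
BMSR_BITS = {
    0x0001: "extended-capability",
    0x0002: "jabber-detected",
    0x0004: "link-up",
    0x0008: "autoneg-capable",
    0x0010: "remote-fault",
    0x0020: "autoneg-complete",
    0x0800: "10baseT-HD",
    0x1000: "10baseT-FD",
    0x2000: "100baseTx-HD",
    0x4000: "100baseTx-FD",
    0x8000: "100baseT4",
}

# Generic MII autonegotiation advertisement register bits (abilities only;
# the low five bits are the protocol selector and are not decoded here).
ANAR_BITS = {
    0x0020: "10baseT-HD",
    0x0040: "10baseT-FD",
    0x0080: "100baseTx-HD",
    0x0100: "100baseTx-FD",
    0x0200: "100baseT4",
}


def decode_bits(value, table):
    """Return the names of all bits from `table` that are set in `value`."""
    return [name for mask, name in sorted(table.items()) if value & mask]


if __name__ == "__main__":
    # Values copied from the detection messages above.
    print("status 7829:     ", decode_bits(0x7829, BMSR_BITS))
    print("advertising 01e1:", decode_bits(0x01E1, ANAR_BITS))
```

Under those assumptions, status 7829 does not have the link bit set. That bit is latched low in the generic layout, so a single read at probe time may not reflect the current link state.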


Another thing I noticed: rmmod fails (hangs) for both the kernel tulip driver
and your own tulip driver.  Yours printed out something about freeing an invalid
resource with some hex numbers.  I'll capture that next time I see it.

It hangs whether or not traffic has ever been attempted across the ports.

> 
> 
>>eth[1-4]:  Too much work during an interrupt, csr5=0xf06d80c0
> 
> 
> The receiver is out of buffers, which usually means that the kernel has
> run out of skbuffs.  The kernel then spends a whole bunch of time and
> cache misses trying to deal with no skbuffs, resulting in no CPU cycles
> to keep up with the interrupt work.
> 
> This should not halt operation, although it will cause packet drops.
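
For reference, the csr5 value in the "Too much work" message can be decoded to check it against the explanation quoted above. This is a minimal sketch in Python, assuming the bit names used by the status_bits enum in the Linux tulip driver sources; decode_csr5 is a hypothetical helper written for this note, and the receive/transmit state fields are left as raw numbers rather than interpreted.

```python
# Interrupt status bits in CSR5, named after the tulip driver's status_bits enum.
CSR5_FLAGS = {
    0x00001: "TxIntr",
    0x00002: "TxDied",
    0x00004: "TxNoBuf",
    0x00008: "TxJabber",
    0x00010: "TPLnkPass",
    0x00020: "TxFIFOUnderflow",
    0x00040: "RxIntr",
    0x00080: "RxNoBuf",
    0x00100: "RxDied",
    0x00200: "RxJabber",
    0x00800: "TimerInt",
    0x01000: "TPLnkFail",
    0x02000: "SystemError",
    0x08000: "AbnormalIntr",
    0x10000: "NormalIntr",
}


def decode_csr5(value):
    """Split a CSR5 value into named flags plus the raw Rx/Tx state fields."""
    flags = [name for mask, name in sorted(CSR5_FLAGS.items()) if value & mask]
    return {
        "flags": flags,
        "rx_state": (value >> 17) & 0x7,  # bits 19:17, raw value only
        "tx_state": (value >> 20) & 0x7,  # bits 22:20, raw value only
    }


if __name__ == "__main__":
    # Value copied from the console error quoted above.
    print(decode_csr5(0xF06D80C0))
```

With these bit names, 0xf06d80c0 has RxNoBuf set alongside RxIntr and both summary bits, which is consistent with the receiver having run out of buffers.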

Hrm, is there any way to reserve a very large number of skbuffs to make
this case less likely to hit?

Also, I tried the de4x5 driver...it pukes all kinds of stuff and doesn't work
at all.  Not sure if this driver is maintained anymore though:

Oct 20 13:22:46 demo2 kernel: eth4: DC21143 at 0xdc00 (PCI bus 2, device 7), h/w address 00:60:f5:07:03:d7,
Oct 20 13:22:46 demo2 kernel: eth4: Using generic MII device control. If the board doesn't operate,
Oct 20 13:22:46 demo2 kernel: please mail the following dump to the author:
Oct 20 13:22:46 demo2 kernel:
Oct 20 13:22:46 demo2 kernel: MII device address: 1
Oct 20 13:22:46 demo2 kernel: MII CR:  3100
Oct 20 13:22:46 demo2 kernel: MII SR:  7809
Oct 20 13:22:46 demo2 kernel: MII ID0: 13
Oct 20 13:22:46 demo2 kernel: MII ID1: 78e1
Oct 20 13:22:46 demo2 kernel: MII ANA: 1e1
Oct 20 13:22:46 demo2 kernel: MII ANC: 0
Oct 20 13:22:46 demo2 kernel: MII 16:  84
Oct 20 13:22:46 demo2 kernel: MII 17:  100
Oct 20 13:22:46 demo2 kernel: MII 18:  0
Oct 20 13:22:46 demo2 kernel:
Oct 20 13:22:46 demo2 kernel:       and requires IRQ5 (provided by PCI BIOS).
Oct 20 13:22:46 demo2 kernel: de4x5.c:V0.546 2001/02/22 davies@maniac.ultranet.com
Oct 20 13:22:47 demo2 kernel: eth1: Bad media code [17] detected in SROM!
Oct 20 13:22:47 demo2 kernel: eth1: media is unconnected, link down or incompatible connection.
Oct 20 13:22:47 demo2 kernel: eth1: Bad media code [17] detected in SROM!
Oct 20 13:23:15 demo2 last message repeated 1404 times


> 
> 
>>eth1: Restarted Rx at 2874 / 2874
>>eth3: Restarted Rx at 296 / 296
> 
> 
> This is the restart message after we have some memory for the receiver.

I would be willing to donate one or two of these NICs to you if that
would help you make them work well.  Since the Dlink 570tx cannot be
found anymore, I'm running out of options for good 4-port NICs!

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>       <Ben_Greear AT excite.com>
President of Candela Technologies Inc      http://www.candelatech.com
ScryMUD:  http://scry.wanfear.com     http://scry.wanfear.com/~greear