[eepro100] auto-negotiation | force fixed

Frank Lenaerts <lenaerts.frank@pandora.be>
Wed Oct 30 20:11:00 2002


Hi,

I encountered a really strange situation today. I am not sure if this
is the right place to ask, but I'll give it a try (it deals with
eepro100, mii-diag output interpretation and auto-negotiation). The
setup looks as follows:

              0     1              0     1
--- switch -+- hostA -+- switch --- hostC
            | 0     1 |
            +- hostB -+

Each switch is a 3Com Office Connect (10/100Mbps). All hosts (A, B and
C) have 2 NICs: eth0 (indicated by 0) is a Myson MTD803 using the
fealnx driver, eth1 (indicated by 1) is an Intel EEPro 100 using the
eepro100 driver (delivered with the kernel source tree). [HostC
currently only uses eth0. HostA and hostC are used most of the
time. None of the hosts has a keyboard attached (console on serial
port).]

All hosts seem to report an error message on eth0 about 3 times a day:

- eth0: Transmit timed out, status 00000000, resetting...
  Rx ring d6ba2000:  80000000 80000000 80000000 80000000 80000000
  80000000 80000000 80000000 80000000 80000000 80000000 80000000
  Tx ring d6beb000:  0000 80000000 80000000 80000000 80000000 80000000

Although there is almost no information available about this error
message for the type of NIC I am using, it seems to occur when large
amounts of data are transmitted.

- eth1: card reports no resources

There is even less information available about this error message. The
only relevant thing I could find is
http://www.faqchest.com/linux/KERNEL/kern-00/kern-0010/kern-001097/kern00103110_16525.html

This error message seems to occur when an eth0 NIC is resetting, either
(a) on another host in the network (e.g. when eth0 on hostC resets,
there is a big chance (99 percent) that hostA and hostB report an error
on eth1) or (b) on the local host.

To me, this means that the error on eth1 only occurs because of the
error on eth0.

Now the interesting part (today's situation):

- I noticed that I could not reach any of the hosts anymore

- both of the switches were going mad (all LEDs flashing very, very
  fast)

- switched off both of the switches and switched them on again: no
  help

- switched off hostC and switched it on again: everything ok

After checking the logfiles, it seemed that:

- hostB had been quiet

- hostC had been quiet

- hostA had been trying to reset eth0 every 4 seconds for more than
  2 hours; after these 2 hours, eth1 started to show the error message

  To me, it looks like hostA had been killing both switches. However,
  to clean up the "broken" situation, I had switched off hostC (as I
  first thought this one was the guilty one).

Finally, I checked the interfaces (using myson-diag -m and
eepro100-diag -m -f). At first sight, everything seems ok, although
the eepro100 card seems to advertise Flow-control while the fealnx
(myson) does not. The vendor-specific registers for the Myson card,
however, show something strange: is this NIC working at 10baseT
instead of 100baseTx-FD!?

--- begin myson ---
myson-diag.c:v1.00 5/15/2001 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Myson MTD803 adapter at 0xb800.
 Station address 00:02:44:63:00:02.
  Receive mode is 0x80f48e61: Normal unicast and hashed multicast.
 This device appears to be active, so some registers will not be read.
 To see all register values use the '-f' flag.
  No interrupt sources are pending (0000).
 MII PHY #32 transceiver registers:
   3000 786d 0302 d000 41e1 45e1 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0602 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000.
 MII PHY #32 transceiver registers:
   3000 786d 0302 d000 41e1 45e1 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0602 0000 0000 0000 0000 0000 0000 0000
   0000 0000 0000 0000 0000 0000 0000 0000.
 Basic mode control register 0x3000: Auto-negotiation enabled.
 Basic mode status register 0x786d ... 786d.
   Link status: established.
   Capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Vendor ID is 00:c0:b4:--:--:--, model 0 rev. 0.
   Vendor/Part: ASIX (unknown type).
 I'm advertising 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT
   Advertising no additional info pages.
   IEEE 802.3 CSMA/CD protocol.
 Link partner capability is 45e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Negotiation  completed.
  TDK format vendor-specific registers 16..18 are 0x0602 0x0000 0x0000
      Link polarity is detected as normal.
     100baseTx Coding and scrambling is disabled!
      Auto-negotiation complete, 10Mbps half duplex.
      Rx link in fail state, PLL locked.
  10baseT loopback mode.
      No new link status events.
---  end myson  ---

--- begin eepro100 ---
eepro100-diag.c:v2.05 6/13/2001 Donald Becker (becker@scyld.com)
 http://www.scyld.com/diag/index.html
Index #1: Found a Intel i82557/8/9 EtherExpressPro100 adapter at 0xb400.
 MII PHY #1 transceiver registers:
  3000 782d 02a8 0154 05e1 45e1 0001 0000
  0000 0000 0000 0000 0000 0000 0000 0000
  0a03 0000 0001 0000 0000 0000 0000 0000
  0000 0000 0b20 0000 0010 0000 0000 0000.
 MII PHY #1 transceiver registers:
   3000 782d 02a8 0154 05e1 45e1 0001 0000
   0000 0000 0000 0000 0000 0000 0000 0000
   0a03 0000 0001 0000 0000 0000 0000 0000
   0000 0000 0b20 0000 0010 0000 0000 0000.
 Basic mode control register 0x3000: Auto-negotiation enabled.
 Basic mode status register 0x782d ... 782d.
   Link status: established.
   Capable of  100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Able to perform Auto-negotiation, negotiation complete.
 Vendor ID is 00:aa:00:--:--:--, model 21 rev. 4.
   No specific information is known about this transceiver type.
 I'm advertising 05e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT
   Advertising no additional info pages.
   IEEE 802.3 CSMA/CD protocol.
 Link partner capability is 45e1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   Negotiation  completed.
---  end eepro100  ---
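
As a rough cross-check of the two dumps above, the advertisement word
(MII register 4) and the link partner word (MII register 5) can be
decoded by hand. The sketch below assumes the standard IEEE 802.3
ability bit layout and priority order; the function names are made up
for illustration and are not part of myson-diag or eepro100-diag.

```python
# Ability bits shared by MII register 4 (advertisement) and register 5
# (link partner ability), listed from highest to lowest priority.
# Assumption: standard IEEE 802.3 auto-negotiation bit layout.
ABILITY_BITS = [
    (0x0100, "100baseTx-FD"),
    (0x0200, "100baseT4"),
    (0x0080, "100baseTx"),
    (0x0040, "10baseT-FD"),
    (0x0020, "10baseT"),
]
FLOW_CONTROL = 0x0400  # pause (flow control) capability bit


def decode_abilities(reg):
    """Return the names of the abilities set in an ability word."""
    names = [name for bit, name in ABILITY_BITS if reg & bit]
    if reg & FLOW_CONTROL:
        names.insert(0, "Flow-control")
    return names


def negotiated_mode(advertised, partner):
    """Return the highest-priority mode present in both words, or None."""
    common = advertised & partner
    for bit, name in ABILITY_BITS:
        if common & bit:
            return name
    return None


if __name__ == "__main__":
    # Register 4 / register 5 values copied from the dumps above.
    for label, adv, lpa in (("myson eth0", 0x41E1, 0x45E1),
                            ("eepro100 eth1", 0x05E1, 0x45E1)):
        print(label, decode_abilities(adv), "->", negotiated_mode(adv, lpa))
```

Going by registers 4 and 5 alone, both links would resolve to
100baseTx-FD, which is what makes the "10Mbps half duplex" line in the
Myson output look suspicious. One possible explanation (a guess, not
something the dumps confirm) is that the TDK-format decoding of
registers 16..18 does not apply to this transceiver.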

Because the two switches were going mad, I am thinking the problem has
to do with the auto-negotiation process when the card is reset.

Would it help if I forced each of the NICs to 100baseTx-FD (of
course, the problem with the Myson cards resetting would still be
there, but at least the switches and the Intel cards should be all
right)?
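
For reference, this is what "forced to 100baseTx-FD" would mean at the
level of the basic mode control register (MII register 0), which both
dumps currently show as 0x3000. The sketch assumes the standard MII
bit assignments; the helper names are hypothetical, and how the value
would actually be applied (driver option or diagnostic tool) is not
shown here.

```python
# Basic mode control register (MII register 0) bits.
# Assumption: standard MII bit assignments.
BMCR_SPEED100 = 0x2000  # speed select: 100 Mbps when set, 10 Mbps when clear
BMCR_ANENABLE = 0x1000  # auto-negotiation enable
BMCR_FULLDPLX = 0x0100  # full duplex when set, half duplex when clear


def describe_bmcr(value):
    """Describe the mode selected by a basic mode control register value."""
    if value & BMCR_ANENABLE:
        # With auto-negotiation on, the speed and duplex bits are not decisive.
        return "Auto-negotiation enabled"
    speed = "100" if value & BMCR_SPEED100 else "10"
    duplex = "full" if value & BMCR_FULLDPLX else "half"
    return "Forced %sMbps %s duplex" % (speed, duplex)


def forced_bmcr(speed100=True, full_duplex=True):
    """Build a control register value with auto-negotiation switched off."""
    value = 0
    if speed100:
        value |= BMCR_SPEED100
    if full_duplex:
        value |= BMCR_FULLDPLX
    return value


if __name__ == "__main__":
    print(describe_bmcr(0x3000))               # value from both dumps
    print(hex(forced_bmcr()), describe_bmcr(forced_bmcr()))
```

One caveat: if only the NIC is forced while the switch port keeps
auto-negotiating, the switch no longer sees negotiation from the NIC
and commonly falls back to half duplex, which gives a duplex mismatch.
Whether that applies to these particular switches is an assumption.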

-- 
lenaerts.frank@pandora.be

"Those who do not understand Unix are condemned to reinvent it, poorly."
-- Henry Spencer

