Autonegotiation endless loop

Donald Becker becker@cesdis.gsfc.nasa.gov
Fri Oct 22 17:25:51 1999


On Fri, 22 Oct 1999, William Montgomery wrote:
> On Fri, 22 Oct 1999, Donald Becker wrote:
> > On Fri, 22 Oct 1999, William Montgomery wrote:
> > 
> > > I am using Linux kernel version 2.0.37 with the tulip driver
> > > v0.91 dated 4/14/99.
> > > The problem comes about when I connect to a 100 Mb/s Ethernet hub.
> > > The autonegotiation seems successful, then something happens to
> > > cause a link status interrupt and the autonegotiation starts all
> > > over again; it loops forever in this manner.
> > > eth0: 21143 100baseTx-FD link beat good.
> > > eth0: 21143 link status interrupt 41e192cf, CSR5 f8668000, fffbffff.
> > 
> > OOHHHHhhh, you lost link beat.
> > What is the timestamp on these events?
..
> Since I produced the log with dmesg, no timestamps are present,
> however, I can extract the timestamps from /usr/adm/debug & 
> /usr/adm/messages if it would be helpful.  They appear to come in
> groupings no more than 1-2 seconds apart.

The timestamps would be useful.
I'm uncertain if we are seeing valid 100baseTx link beat, which then drops,
or if we are never establishing 100baseTx link beat.

The NWay spec says that once autonegotiation is complete the board must
switch to the negotiated speed in less than a second.  If no link beat is
detected within one second, the autonegotiation process should restart.
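
The restart rule above can be sketched as a small simulation.  This is only
an illustration of the timing logic, not the tulip driver's code: the
function name, the 100 ms polling interval and the list of link-beat samples
are all assumptions made for the example.

```python
def after_negotiation(link_beat_samples, timeout_ms=1000, poll_ms=100):
    """Decide what happens once autonegotiation has completed.

    link_beat_samples is a hypothetical sequence of booleans, one per
    poll, saying whether link beat was seen at the negotiated speed.
    Returns "link_up" if link beat appears before timeout_ms elapses,
    otherwise "restart" (negotiation must start over).
    """
    elapsed_ms = 0
    for beat in link_beat_samples:
        if elapsed_ms >= timeout_ms:
            break  # the one-second window has closed
        if beat:
            return "link_up"
        elapsed_ms += poll_ms
    return "restart"


if __name__ == "__main__":
    # Link beat seen on the third poll: the link stays up.
    print(after_negotiation([False, False, True]))
    # No link beat within the window: negotiation restarts, which is
    # the step that repeats forever in the log quoted above.
    print(after_negotiation([False] * 20))
```

If the link never produces stable link beat at the negotiated speed, every
pass through this check ends in "restart", which matches the endless loop
described in the report.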

[[ Some people consider this slightly flawed.  Negotiation takes place with
10baseT signals, which will work with low-grade cable.  Perhaps if you don't
get stable 100baseTx link beat after a few tries, you should advertise only
10baseT in an attempt to at least minimally work.  But then you might be
stuck at 10baseT after a transient problem is fixed. ]]
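
The fallback idea in the aside above might look roughly like the sketch
below.  It is hypothetical: the mode names, the threshold of three failed
attempts (standing in for "a few tries") and the choice to keep both 10baseT
duplex modes are assumptions for illustration, not anything the NWay spec or
the driver defines.

```python
# Hypothetical list of modes a board could advertise, best first.
ALL_MODES = ("100baseTx-FD", "100baseTx", "10baseT-FD", "10baseT")


def modes_to_advertise(failed_100_attempts, max_tries=3):
    """Pick the set of modes to advertise on the next negotiation.

    After max_tries negotiations that never reached stable 100baseTx
    link beat, advertise only the 10baseT modes so the link can at
    least work minimally.
    """
    if failed_100_attempts >= max_tries:
        return tuple(m for m in ALL_MODES if m.startswith("10baseT"))
    return ALL_MODES


if __name__ == "__main__":
    for failures in range(5):
        print(failures, modes_to_advertise(failures))
```

The drawback mentioned above shows up directly: nothing in this policy resets
the failure count, so the link would stay at 10baseT even after a transient
problem is fixed.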

> Also one more bit of info - the same chipset combination
> works just fine with a Pentium II 450MHz, I suspect some kind
> of timing problem.  Could it be that the PIII-500 writes some
> registers too fast?

Please confirm: the same board connected to the same switch works fine in a
slightly slower machine?  There are no timing loops in the driver, so that
might indicate a hardware problem (unrelated to the processor speed).

Donald Becker
Scyld Computing Corporation, and
USRA-CESDIS,   becker@cesdis.gsfc.nasa.gov