[tulip] Re: True on TRANSMIT ERROR TIMEOUT

Bogdan Costescu Bogdan.Costescu@IWR.Uni-Heidelberg.De
Wed, 14 Jun 2000 16:32:56 +0200 (CEST)


On Tue, 13 Jun 2000, Andrew Morton wrote:

> And there are many, many times when a wedged driver can be resurrected
> by a down/up or a rmmod/insmod.  This means that the driver _could_ have
> automatically recovered in tx_timeout, but it simply did not do so.

I beg to disagree. The above-mentioned operations do much more than
handle Tx timeouts: register/unregister IRQ, get/release memory, set up
(from scratch) the Tx and Rx buffers, media selection... The tx_timeout
routine should only recover from a Tx timeout! What you propose is
something like calling xxx_open() from tx_timeout... if I understand it
right.

> Can anyone suggest a reason why we _shouldn't_ simply reset the NIC to
> the utmost possible extent in tx_timeout?  Restart media selection,
> reinitialise ring buffers, etc, etc?

Simply because you need to know the exact state of the card in order to
save the relevant parts of it, reset and then load the card with
the previous values (of course, you don't need to re-load the parts
which created the problem). Think of the 3c59x driver: you might want to
save the Tx/Rx threshold related values, poll interval values and so on.
This is only efficient if you keep most of them at their default
(power-on) values, so that there is little to save and restore.
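
To make the save/reset/restore sequence concrete, here is a minimal sketch
in plain C against a simulated card. It is not taken from 3c59x or any real
driver: struct fake_nic, its fields, the power-on values and the function
names are all assumptions made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register file of a simulated NIC (illustrative names only). */
struct fake_nic {
    uint16_t tx_threshold;
    uint16_t rx_threshold;
    uint16_t poll_interval;
    int tx_enabled;
    int tx_wedged;          /* the state that caused the timeout */
};

/* Assumed power-on defaults; a full reset returns the card to these. */
static const struct fake_nic fake_power_on = { 128, 64, 10, 0, 0 };

static void fake_nic_reset(struct fake_nic *nic)
{
    *nic = fake_power_on;
}

/* Only the tuned values are saved; the wedged Tx state is deliberately
 * not part of this structure, so it cannot be restored by mistake. */
struct fake_saved_cfg {
    uint16_t tx_threshold;
    uint16_t rx_threshold;
    uint16_t poll_interval;
};

static void fake_tx_timeout(struct fake_nic *nic)
{
    struct fake_saved_cfg saved;

    assert(nic != NULL);

    /* 1. Save the relevant parts of the card state. */
    saved.tx_threshold = nic->tx_threshold;
    saved.rx_threshold = nic->rx_threshold;
    saved.poll_interval = nic->poll_interval;

    /* 2. Reset: everything goes back to power-on values. */
    fake_nic_reset(nic);

    /* 3. Reload the previous values, then re-enable the transmitter. */
    nic->tx_threshold = saved.tx_threshold;
    nic->rx_threshold = saved.rx_threshold;
    nic->poll_interval = saved.poll_interval;
    nic->tx_enabled = 1;
}
```

Every value that differs from its power-on default adds one more item to
the save and restore steps, which is why a full reset only stays cheap when
few of them have been changed.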
For media selection, xxx_timer should take care of this, there should be
no need for tx_timeout to handle media changes. When the transmission is
stopped because of a media change, the xxx_timer routine should take care
of the media state, then the tx_timeout routine should reset the
transmitter and everything should be working again. Also, adding media
related logic to
tx_timeout raises the problem of protecting the access to the media
related registers WRT xxx_timer...
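
The division of labour described above could be sketched roughly as
follows, again in plain C against a simulated device. The names sim_dev,
sim_timer and sim_tx_timeout are hypothetical stand-ins for a driver's
private data, its xxx_timer and its tx_timeout; no real driver is quoted.

```c
#include <assert.h>
#include <stddef.h>

enum sim_media { SIM_10BASET, SIM_100BASETX };

/* Simulated device state (illustrative only). */
struct sim_dev {
    enum sim_media media;   /* written by sim_timer() only */
    int tx_running;         /* transmitter state */
    int tx_resets;          /* how many times the transmitter was reset */
};

/* Stand-in for xxx_timer: the only routine that touches media state.
 * In this model a media change stops transmission. */
static void sim_timer(struct sim_dev *dev, enum sim_media detected)
{
    assert(dev != NULL);
    if (dev->media != detected) {
        dev->media = detected;
        dev->tx_running = 0;
    }
}

/* Stand-in for tx_timeout: resets the transmitter and nothing else.
 * It never reads or writes dev->media, so it cannot race with
 * sim_timer() over the media related state. */
static void sim_tx_timeout(struct sim_dev *dev)
{
    assert(dev != NULL);
    dev->tx_running = 1;
    dev->tx_resets++;
}
```

Because sim_tx_timeout() stays away from the media field, no extra
protection between the two routines is needed for it in this sketch.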

I don't think that flushing the queue (as somebody else in this thread
suggested) is a good idea, as you lose a lot of packets (usually 16). And
usually the buffers themselves have no relation to the
transmission/reception logic - you might need to restart the transmitter
and/or the DMA engine and maybe write the head buffer to it again -
that's all.
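
The difference between the two approaches can be shown with a small
simulated Tx ring in plain C. The cur_tx/dirty_tx counters follow a common
driver convention, but the structure, the ring size of 16 and the helper
names here are assumptions for illustration, not code from any driver.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_TX_RING_SIZE 16

/* Simulated Tx ring bookkeeping (illustrative only). */
struct sim_tx_ring {
    unsigned int cur_tx;    /* next slot the driver will fill */
    unsigned int dirty_tx;  /* oldest slot not yet sent by the hardware */
    unsigned int hw_head;   /* ring index the hardware resumes from */
    int hw_running;
};

/* Number of queued packets still waiting to be transmitted. */
static unsigned int sim_tx_pending(const struct sim_tx_ring *r)
{
    assert(r != NULL);
    assert(r->cur_tx - r->dirty_tx <= SIM_TX_RING_SIZE);
    return r->cur_tx - r->dirty_tx;
}

/* Restart only: point the hardware at the oldest unsent buffer again
 * and re-enable it. The queued packets stay in the ring. */
static void sim_tx_restart(struct sim_tx_ring *r)
{
    assert(r != NULL);
    r->hw_head = r->dirty_tx % SIM_TX_RING_SIZE;
    r->hw_running = 1;
}

/* Flush: throw away everything that was queued, then restart.
 * Returns the number of packets that were dropped. */
static unsigned int sim_tx_flush(struct sim_tx_ring *r)
{
    unsigned int dropped = sim_tx_pending(r);

    r->dirty_tx = r->cur_tx;
    r->hw_head = r->cur_tx % SIM_TX_RING_SIZE;
    r->hw_running = 1;
    return dropped;
}
```

With a full ring, sim_tx_flush() reports 16 dropped packets, while
sim_tx_restart() leaves all 16 pending and merely tells the hardware where
to pick up again.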

Now, coming back to the initial message about "correlated" reports of Tx
timeouts using some boards/drivers, I don't think this is true. Most of
the problems are in fact related to media selection (when autonegotiation
fails and the boards cannot transmit properly) and real timeouts (e.g.
caused by collisions) that are board specific. In this case, there is no
general rule that has to be applied - the hardware has to be driven "The
Right Way" (which might not even be documented).

Sincerely,

Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De