[eepro100] Transmitter Timeout -- addendum

Kallol Biswas kallol@bugula.fpk.hp.com
Tue, 01 Aug 2000 10:54:44 EDT


> 
> Hmmm, the documentation does state that the chip will examine the 'S' bit
> of the i+1 command, but this is only an optimization and should not impact
> correct operation.  If the suspend bit is set on the current i'th command
> (as it will be if the subsequent command is still being constructed), the
> chip will suspend.  It will then re-read the i+1'th command at the next
> RESUME command.


Section #6.5.3.2 CU_RESUME

 "If the CU is in active state it will check the validity of the S-bit
in the current and next action command. If the S-bit is cleared in the
current CB, it will proceed to the next CB in the list after executing the
current CB."

The other case, as I described earlier:
      The card prefetches the old next cmd word and goes to the suspended
state. The driver then updates that next cmd word and sends a CU_RESUME.
On receiving the resume, the card executes the stale prefetched cmd word
and stalls. We would not have this problem if every cmd word in the ring
were a TX command, since a prefetched command would then always be a TX cmd.
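
To make the sequence concrete, here is a toy model in C of the behaviour
I am describing. It is only a sketch of my reading of the problem, not
driver or firmware code: the names (sim_cu, sim_cu_finish, sim_cu_resume)
and the constant values are made up for illustration, and the assumption
that the CU acts on the latched word at resume is exactly the point under
discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; treat them as assumptions, not manual quotes. */
#define SIM_CMD_CONFIG  0x0002u
#define SIM_CMD_TX      0x0004u
#define SIM_CMD_MASK    0x0007u
#define SIM_S_BIT       0x4000u   /* "suspend after this command" */

/* Hypothetical model of the command unit (CU) state. */
struct sim_cu {
    int      suspended;   /* 1 once the CU has parked itself */
    uint16_t latched;     /* next cmd word as it looked at prefetch time */
};

/*
 * Model the CU finishing ring[cur]: it latches the cmd word of the next
 * slot, then suspends if the S-bit is set on the current slot.
 */
static void sim_cu_finish(struct sim_cu *cu, const uint16_t *ring,
                          unsigned cur, unsigned ring_len)
{
    assert(ring_len > 0 && cur < ring_len);
    cu->latched = ring[(cur + 1) % ring_len];
    cu->suspended = (ring[cur] & SIM_S_BIT) != 0;
}

/*
 * Model CU_RESUME under the assumption being debated: the CU acts on the
 * latched word rather than on what memory holds now. Returns the command
 * code the modelled CU would execute.
 */
static uint16_t sim_cu_resume(struct sim_cu *cu)
{
    cu->suspended = 0;
    return cu->latched & SIM_CMD_MASK;
}
```

In this model, if slot i+1 still holds an old TX cmd word when the CU
suspends on slot i, and the driver then writes a non-TX command into
slot i+1, sim_cu_resume() still reports TX. That is the stale execution
described above.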


I am not sure whether Intel's driver uses NOPs or keeps separate TX and
non-TX command rings, but I have seen some drivers use a NOP.
If I get some free time I will definitely read the existing drivers.



> 
> > cmd has solved the problem. Now our stress tests run for days without
> > any problem on 82559. 
> 
> Are you certain that you are not seeing the CU_RESUME command by-pass the
> next descriptor initialization that is still sitting in a write buffer?

The card prefetches only the next command word, not the parameters.

       +------------------------------------+
       |         CMD/STATUS word            |
       +------------------------------------+
       |         Link Address               |
       +------------------------------------+
       |         parameters                 |

That is, only the CMD/STATUS word is prefetched.
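
The layout in the diagram can be written as a C struct. This is a sketch
under assumptions: the struct and helper names are hypothetical, the
status/command split assumes a little-endian view of the 32-bit
CMD/STATUS word, and the parameter area is given an arbitrary size just
so that the struct is complete.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one command block, following the diagram. */
struct sim_cmd_block {
    uint16_t status;     /* low half of the CMD/STATUS word */
    uint16_t command;    /* high half: command code and S-bit */
    uint32_t link;       /* Link Address of the next command block */
    uint8_t  params[8];  /* command-specific parameters (size arbitrary) */
};

/*
 * Number of leading bytes covered by the prefetch as I understand it:
 * everything before the Link Address, i.e. the CMD/STATUS word only.
 */
static size_t sim_prefetched_bytes(void)
{
    return offsetof(struct sim_cmd_block, link);
}
```

With this layout, a write buffer that delays the parameters does not
matter for the case I am describing; only the first 4 bytes are seen
early.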


> 
> > > v1.06 of the driver seemed to handle the TX timeouts quicker than
> > > v1.09, but in v1.09 they were less frequent.  I tried to compile v1.10
> > > and experimental v1.11, but I got all types of compile errors and did
> > > not have the motivation to port them to v2.2.16 of the kernel after all
> > > my above failures.
> 
> It's not difficult: just read
>    http://www.scyld.com/network/updates.html
> 
> I converted my drivers to pci-scan and kern_compat.h at the request of Linus
> for no backward-compatible code in the driver.  It has turned out to be a
> big support problem -- the previous method of everything in a single *.c
> file is much easier for users.
> 
> > > I have NO IDEA what is causing these TX timeouts. . . if any of the
> > > gurus here would be as kind as to aid me in my efforts to figure this
> > > out, I would greatly appreciate it!  I will grant accounts on the
> > > troublesome machine if that will aid in trouble-shooting, and I will
> > > code whatever I can if anyone can give me a direction to go in. . . 
> 
> Please try v1.10 or v1.11.  It should fix the problem.
> 
> Donald Becker				becker@scyld.com
> Scyld Computing Corporation		http://www.scyld.com
> 410 Severn Ave. Suite 210		Beowulf Clusters / Linux Installations
> Annapolis MD 21403
> 
> 
> 


--
Phone: 973-443-7469
Telnet: 1-443-7469
www.kallolbiswas.com
kallol_biswas@hp.com