DnStall period

Donald Becker becker@scyld.com
Sat May 13 02:00:26 2000


On Sat, 13 May 2000, Andrew Morton wrote:

> My test setup of several weeks ago showed this nicely.  I had four
> machines on a hubbed 10bT LAN:
..
> Machine B: 3c575
> Machine C: 3c905B
...
> machine C is the test machine.
>
> With this setup I was seeing the wait_for_completion() function fall
> through after 2,000 loops several times a minute.  I was also frequently
> seeing the maxCollisions threshold exceeded.

The 3c905B is a Cyclone, and thus doesn't use the DownStall method of
queuing packets.  That's only used on the Boomerang.

Do you mean that you were measuring on the 3c575?
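
For readers following along, the bounded busy-wait Andrew describes can be
sketched roughly as below. This is a user-space model, not the driver's
code: the name wait_for_completion_sketch(), the CMD_IN_PROGRESS bit value,
and the fake_status() helper are all assumptions made for illustration. In
a real driver the status read would be a port read from the NIC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status bit meaning "command still in progress". */
#define CMD_IN_PROGRESS 0x1000u

/* Stand-in for a register read; a real driver would read the NIC's
 * status port here. */
typedef uint16_t (*read_status_fn)(void *ctx);

/* Poll until the busy bit clears or max_loops reads have been made.
 * Returns true on completion, false if the loop fell through. */
static bool wait_for_completion_sketch(read_status_fn read_status,
                                       void *ctx, int max_loops)
{
    assert(max_loops > 0);
    for (int i = 0; i < max_loops; i++) {
        if ((read_status(ctx) & CMD_IN_PROGRESS) == 0)
            return true;
    }
    return false;   /* fell through: caller should log and recover */
}

/* Fake NIC for illustration: reports busy for *remaining more reads. */
static uint16_t fake_status(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return CMD_IN_PROGRESS;
    }
    return 0;
}
```

Note that the loop count is a number of reads, not a time, so the same
count gives a different timeout on a faster or slower machine. If the NIC
is truly wedged, raising the count from 2,000 to 4,000 only delays the
fall-through.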

> In all cases, txfree has a very small value: 0x8, 0xc, etc.  This means
> that the Tx FIFO is almost full.
> 
> So, my theory:
> 
> - The NIC has started to transmit a packet.
> - The next packet in memory is, say, 64 bytes.
> - The NIC sees >32 bytes spare in the FIFO (but <64).

This doesn't happen on the Boomerang.  The TxFreeThreshold is set to 1536,
which assures us that a packet transfer will not start until there is room
in the FIFO.
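
As a rough model of that gating rule (a sketch only: the function name
may_start_download() is made up, the values are treated as plain byte
counts rather than the hardware's register encoding, and the use of >=
rather than > is an assumption):

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold value quoted in the message, treated here as bytes. */
#define TX_FREE_THRESHOLD_BYTES 1536u

/* Model of the gate: a download of the next packet may begin only when
 * the free FIFO space has reached the threshold. */
static bool may_start_download(uint32_t tx_free_bytes,
                               uint32_t threshold_bytes)
{
    return tx_free_bytes >= threshold_bytes;
}
```

Under this model, the txfree readings of 0x8 and 0xc reported above would
not permit a new packet transfer to start, so a partial 32-byte transfer
into a nearly full FIFO should not occur.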

> - The NIC transfers 32 bytes from main memory.
> - Collisions start happening, and force the NIC to
>   resend the current packet an arbitrary number of times.
> - During this process, we issue a DnStall.
> 
> The NIC simply has nowhere to go.  It can't honour the DnStall because
> it's halfway through processing a DPD.  It can't free up room in the
> FIFO because it has to hang onto the head packet for retransmission.
> 
> I would like to be able to query the NIC's current internal DMA address
> pointer.  Can't see a way of doing this.
> 
> I would like to know what DMA burst sizes the NIC is using.  Can't see
> any reference to this.  Is this a PCI thing?

Partially in the PCI config register, partially in registers added on the
3c905B.  It's not as orthogonal as on other chips.

> Bogdan,  I am at a loss to explain why increasing the loop count from
> 2,000 to 4,000 changed anything for you.   You're on switched 100bT,
> right?  You shouldn't be getting _any_ collisions (and when in full
> duplex mode the NIC doesn't even look for collisions).  So what's going
> on?
> 
> I suspect that you're mistaken and that upping the loop counter was not
> the source of your success.

I still suspect that something else is going on, and that slowing down the
machine, driver and network is now hiding the real problem.

Donald Becker				becker@scyld.com
Scyld Computing Corporation
410 Severn Ave. Suite 210
Annapolis MD 21403

