[realtek] strange problem ....

Paul Campbell paul@taniwha.com
Tue, 11 Jul 2000 23:51:47 -0700


I have a bunch of headless SMP systems with 8139s in them. I've been chasing
a weird phantom where a system will slow down for a while: pings take
many seconds but don't get lost, and you can log in (it's slow but works OK).
Then suddenly it comes back and it's OK.

I'm running 2.2.16 kernels with the 1.10 driver. For various reasons I can't go
to 2.3/2.4 at the moment.

I'm beginning to think this may be an SMP-related bug: maybe the driver's
input queue bookkeeping is getting screwed up and packets are just hanging
around until a following packet forces them in.

For example:
ping -i 10  10.100.0.17
PING 10.100.0.17 (10.100.0.17): 56 data bytes
64 bytes from 10.100.0.17: icmp_seq=0 ttl=255 time=1202.2 ms
64 bytes from 10.100.0.17: icmp_seq=1 ttl=255 time=6992.8 ms
64 bytes from 10.100.0.17: icmp_seq=5 ttl=255 time=0.3 ms
64 bytes from 10.100.0.17: icmp_seq=2 ttl=255 time=30000.4 ms
64 bytes from 10.100.0.17: icmp_seq=3 ttl=255 time=20000.4 ms
64 bytes from 10.100.0.17: icmp_seq=4 ttl=255 time=10000.4 ms
64 bytes from 10.100.0.17: icmp_seq=6 ttl=255 time=6000.2 ms
64 bytes from 10.100.0.17: icmp_seq=7 ttl=255 time=6993.2 ms
64 bytes from 10.100.0.17: icmp_seq=11 ttl=255 time=20.1 ms
64 bytes from 10.100.0.17: icmp_seq=8 ttl=255 time=30020.1 ms
64 bytes from 10.100.0.17: icmp_seq=9 ttl=255 time=20020.2 ms
64 bytes from 10.100.0.17: icmp_seq=10 ttl=255 time=10020.2 ms
64 bytes from 10.100.0.17: icmp_seq=12 ttl=255 time=7000.2 ms
64 bytes from 10.100.0.17: icmp_seq=13 ttl=255 time=6993.7 ms
64 bytes from 10.100.0.17: icmp_seq=17 ttl=255 time=0.3 ms 

The fact that some pings bypass others is completely mystifying to me
(but there is other network traffic to this box which may be perturbing the
above).
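
One way to read the trace above is to turn each round-trip time back into the
moment the reply showed up. The sketch below does that under one loud
assumption: that ping sent request n at exactly n * 10 seconds (from the
'-i 10' interval), which is not something the output itself proves. The
function names and the 0.1 s grouping tolerance are made up for illustration.

```python
import re

# The ping replies from the trace above, in the order they were printed.
PING_OUTPUT = """\
64 bytes from 10.100.0.17: icmp_seq=0 ttl=255 time=1202.2 ms
64 bytes from 10.100.0.17: icmp_seq=1 ttl=255 time=6992.8 ms
64 bytes from 10.100.0.17: icmp_seq=5 ttl=255 time=0.3 ms
64 bytes from 10.100.0.17: icmp_seq=2 ttl=255 time=30000.4 ms
64 bytes from 10.100.0.17: icmp_seq=3 ttl=255 time=20000.4 ms
64 bytes from 10.100.0.17: icmp_seq=4 ttl=255 time=10000.4 ms
64 bytes from 10.100.0.17: icmp_seq=6 ttl=255 time=6000.2 ms
64 bytes from 10.100.0.17: icmp_seq=7 ttl=255 time=6993.2 ms
64 bytes from 10.100.0.17: icmp_seq=11 ttl=255 time=20.1 ms
64 bytes from 10.100.0.17: icmp_seq=8 ttl=255 time=30020.1 ms
64 bytes from 10.100.0.17: icmp_seq=9 ttl=255 time=20020.2 ms
64 bytes from 10.100.0.17: icmp_seq=10 ttl=255 time=10020.2 ms
64 bytes from 10.100.0.17: icmp_seq=12 ttl=255 time=7000.2 ms
64 bytes from 10.100.0.17: icmp_seq=13 ttl=255 time=6993.7 ms
64 bytes from 10.100.0.17: icmp_seq=17 ttl=255 time=0.3 ms
"""

REPLY_RE = re.compile(r"icmp_seq=(\d+) .*time=([\d.]+) ms")


def reply_arrivals(text, interval_s=10.0):
    """Return (seq, arrival time in seconds) for each reply line.

    ASSUMPTION: request number seq was sent at seq * interval_s.
    """
    arrivals = []
    for line in text.splitlines():
        m = REPLY_RE.search(line)
        if m:
            seq = int(m.group(1))
            rtt_s = float(m.group(2)) / 1000.0
            arrivals.append((seq, seq * interval_s + rtt_s))
    return arrivals


def batches(arrivals, tol_s=0.1):
    """Group consecutive replies that arrived within tol_s of each other."""
    groups = []
    for seq, t in arrivals:
        if groups and abs(t - groups[-1][0]) <= tol_s:
            groups[-1][1].append(seq)
        else:
            groups.append((t, [seq]))
    return groups


for t, seqs in batches(reply_arrivals(PING_OUTPUT)):
    print(f"t={t:7.1f}s  seqs={seqs}")
```

Under that assumption, replies 5, 2, 3 and 4 all land at about t=50 s, and
replies 11, 8, 9 and 10 all land at about t=110 s. That would fit the idea
that packets sit somewhere until a later packet pushes them through, though
it does not say whether the stall is on receive or transmit.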

Another clue: pinging with a long packet seems to make it 'come right'.
For example, 'ping -s 16384  10.100.0.17' fixed the above problem.
    
I've noticed that the 8139too driver has a lot of SMP spinlocks in it, as
do other 2.2 net drivers (on the other hand, I know it's a 2.3 driver and
2.3 has finer-grained locking).

I'm kind of stuck and down to the wire on my project. Any clues or ideas
would be appreciated. (Should I wade in and toss spin locks in the right places?
Is there a known FIFO problem? Are there bugs fixed in 8139too that I can
fix in the 2.2 driver? etc.)

	many thanks in advance

	Paul Campbell
	paul@taniwha.com