[tulip] Question about RX-Drop errors and an apparent chipset lockup on the Phobos p430tx NIC.

Donald Becker becker@scyld.com
Sun Oct 20 21:08:00 2002


On Sun, 20 Oct 2002, Ben Greear wrote:

> > Which driver version are you using?  (0.95 or later is fine.)
> > What is the detection message?
> 
> I just pulled it from your site, here is the version info:
> 
> Oct 20 12:44:20 demo2 kernel: tulip.c:v0.95 6/21/2002  Written by Donald Becker <becker@scyld.com>

That version should recover correctly from no receive skbuffs.  The
recovery is based on the timer, so it won't necessarily be immediate.

> Another thing I noticed, rmmod fails (hangs) for both the kernel tulip and
> your own tulip nic.  Yours printed out something about freeing an invalid
> resource with some hex numbers.  I'll capture that next time I see it.

Make certain that _all_ of the references to the interfaces are
removed before you try removing the module.  That might involve killing
off processes such as 'dhcpcd' and 'pump'.  If you don't do this, the
later 2.4 kernels get confused. (Doh!)
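
A rough sketch of that sequence, for illustration only.  The function
name and the DRY_RUN switch are made up here, and it assumes 'killall',
'ifconfig' and 'rmmod' are installed.  Note that killall stops every
dhcpcd and pump on the machine, not just the ones bound to this
interface.

```shell
# unload_nic_module MODULE IFACE
# Hypothetical helper: drop the usual references to IFACE, then remove MODULE.
# Set DRY_RUN=1 to print the commands instead of running them.
unload_nic_module() {
    module=$1
    iface=$2
    run=${DRY_RUN:+echo}            # expands to "echo" only when DRY_RUN is set

    # DHCP clients hold a reference to the interface; stop them first.
    for daemon in dhcpcd pump; do
        $run killall "$daemon" 2>/dev/null   # an error here just means none were running
    done

    # Take the interface down, then remove the module.
    $run ifconfig "$iface" down
    $run rmmod "$module"
}
```

Try it with the dry run first, then for real as root:

   DRY_RUN=1 unload_nic_module tulip eth0
   unload_nic_module tulip eth0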

> > This should not halt operation, although it will cause packet drops.
> 
> Hrm, is there any way to reserve a very large number of skbuffs to make
> this case less likely to hit?

This has been covered before.  You can raise the number of Rx ring
entries, but that is evil.  ("I'll avoid the risk of running out of gas
by carrying around a 100 gallon gas tank.")  A better approach, one that
addresses the problem directly, is to change the amount of free memory
the kernel keeps around:

2.2 kernels
   echo 500 1000 2000 > /proc/sys/vm/freepages
and 2.4 kernels
   echo "100 500 200" > /proc/sys/vm/bdflush
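
It is worth recording the old values before overwriting them, so you can
put them back.  A minimal sketch follows; the helper name set_vm_tunable
is made up for illustration, and it assumes you are root and that the
file exists on your kernel.

```shell
# set_vm_tunable FILE VALUE...
# Hypothetical helper: show the current contents of FILE, write the new
# values, then show what the kernel actually accepted.
set_vm_tunable() {
    file=$1
    shift
    if [ ! -w "$file" ]; then
        echo "skip: $file is missing or not writable" >&2
        return 1
    fi
    echo "old: $(cat "$file")"
    echo "$*" > "$file"             # values are written space-separated
    echo "new: $(cat "$file")"
}
```

For example, on a 2.2 kernel:

   set_vm_tunable /proc/sys/vm/freepages 500 1000 2000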

-- 
Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Scyld Beowulf cluster system
Annapolis MD 21403			410-990-9993