strange behavior with 21143 cards

David Watson david@cs.ucsb.edu
Thu Aug 12 17:17:52 1999


I am working on a Linux cluster that is still being built, and I am having
some strange problems with the tulip cards in the machines.  I'm hoping
someone can help me out with a little advice (or a patch!).

Setup: 
tulip.c = v0.91g
cards = Kingston KNE100TX.  There are 2 of these PCI cards in each machine.
        /proc/pci reports DEC DC21142 (rev 65).  
        tulip driver reports Digital DS21143 Tulip rev 65
kernel = 2.2.5
arch = x86 (Pentium II), 2-processor SMP

Problem:
TCP connections tend to "lock up".  Interactive telnet and ssh connections
work fine, but ftp, scp, and http transfers lock up almost immediately.
I'm assuming that this has something to do with the larger packet sizes
generated by the file transfers.
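
One rough way to check the packet-size guess is to push payloads of
increasing size over fresh TCP connections and see which ones come back.
The Python sketch below assumes the peer runs a TCP echo service (for
example the inetd "echo" service on port 7); probe_sizes and recv_exact are
hypothetical helper names, not part of any existing tool.  TCP is a byte
stream, so payload size only approximates packet size: small payloads go
out as small segments, while payloads at or above the MSS produce full-size
frames.

```python
import socket


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or raise if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(min(remaining, 65536))
        if not chunk:
            raise ConnectionError("peer closed before echoing everything")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def probe_sizes(host, port, sizes, timeout=5.0):
    """Return {size: True/False}: did an echo of `size` bytes complete in time?

    Assumes an echo service is listening on (host, port).  Keep sizes modest
    (well below the socket buffer size), because the whole payload is sent
    before any of the echo is read back.
    """
    results = {}
    for size in sizes:
        payload = b"x" * size
        try:
            # A new connection per size, so one stalled stream does not
            # affect the next measurement.
            with socket.create_connection((host, port), timeout=timeout) as sock:
                sock.sendall(payload)
                results[size] = recv_exact(sock, size) == payload
        except OSError:
            # Covers timeouts, refused connections, and early closes.
            results[size] = False
    return results
```

If small sizes succeed and only sizes near or above 1460 bytes time out,
that would be consistent with the packet-size guess; it would not prove it.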

Once a connection locks up, no more data is transferred over the open TCP
stream, although new connections can be started.  netstat shows that the
connection is still ESTABLISHED on both sides.  One or both of the sides
will have data in its send queue for that connection, which never drains.
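
The netstat check can be scripted.  The Python sketch below is only an
illustration: it assumes the kernel exposes the usual /proc/net/tcp table
(state in the fourth column, tx_queue:rx_queue in the fifth, both in hex)
and a little-endian host such as x86.  The function names are hypothetical.

```python
import socket
import struct

TCP_ESTABLISHED = 0x01  # state code used in /proc/net/tcp


def decode_addr(hex_addr):
    """Turn '0100007F:0016' into '127.0.0.1:22' (little-endian host assumed)."""
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return f"{ip}:{int(port_hex, 16)}"


def stuck_candidates(lines):
    """Return (local, remote, send_queue_bytes) for each ESTABLISHED
    connection whose send queue is non-empty."""
    results = []
    for line in lines:
        fields = line.split()
        # Data rows start with a slot number such as '0:'; skip the header.
        if len(fields) < 5 or not fields[0].endswith(":"):
            continue
        state = int(fields[3], 16)
        tx_hex, _rx_hex = fields[4].split(":")
        tx_queue = int(tx_hex, 16)
        if state == TCP_ESTABLISHED and tx_queue > 0:
            results.append(
                (decode_addr(fields[1]), decode_addr(fields[2]), tx_queue)
            )
    return results


def read_proc_net_tcp(path="/proc/net/tcp"):
    """Read the kernel's TCP table; the path is the conventional one."""
    with open(path) as f:
        return f.readlines()
```

A single snapshot only shows that data is queued.  Calling
stuck_candidates(read_proc_net_tcp()) twice, a few seconds apart, and
keeping the connections whose queue size did not change gives a better hint
that a stream is really stuck.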

Replacing the cards with some 3Com and Intel cards we have seems to fix
the problem.  Also, these cards appear to function correctly when there is
only one in each box.

Does this sound familiar to anyone?  At this point we are looking to get
the cards replaced (with a different brand) by the vendor, but since we
have 100+ cards involved here, this isn't the most attractive option.

Please respond to my email address, david@cs.ucsb.edu, as I'm not
subscribed to this list.

Thanks.


-david