Lite-On croaks under high load (0.90)

Corbett J. Klempay cklempay@acm.jhu.edu
Thu Dec 3 15:22:25 1998


Below I've attached a message from one of our members who is trying to
bring up our new Beowulf.  He has things mostly working, but the driver
for the Lite-On cards (there are two in the master node) chokes under
high load, something we have seen consistently to date.  He can restart
networking and get it running again, but the next load spike kills it
once more.  This sounds driver-related; is it?  Any help would be
*greatly* appreciated.

------------------------------------------------------------------------------
Corbett J. Klempay			         Quote of the Week:
http://www2.acm.jhu.edu/~cklempay  "Advice is what we ask for when we
				    already know the answer but wish we
				    didn't." 

PGP Fingerprint: 7DA2 DB6E 7F5E 8973 A8E7  347B 2429 7728 76C2 BEA1
------------------------------------------------------------------------------

---------- Forwarded message ----------
Date: Thu Dec  3 15:22:25 1998
From: Scott J. Lipcon <slipcon@acm.jhu.edu>
To: Justin Reaves <jreaves@chimera.acm.jhu.edu>,
     Corbett J. Klempay <cklempay@chimera.acm.jhu.edu>
Subject: status...

I've got it semi-working: ARP tables, forwarding, etc.  Inner hosts can
ping outer hosts, and vice versa, through the master.  Unfortunately, the
master's networking still stops every once in a while, usually after a
period of high load (e.g., flood ping, then try to telnet somewhere).
Networking can be brought completely down and back up again, including
all routes and ARP entries, and that restores it to working.
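
As a stopgap while the cause is tracked down, the manual down/up cycle described above could be scripted. What follows is only a minimal sketch under stated assumptions: it assumes the standard net-tools commands (`ifconfig`, `route`) and `ping` are on the path and that the script runs as root. The function names, the interface name, and the addresses are hypothetical and would need to match the master node's real setup. It works around the hang; it does not fix the driver.

```python
import subprocess


def recovery_commands(iface, routes=()):
    """Build the commands for a full down/up cycle of one interface.

    `routes` is a sequence of (network, netmask) pairs to re-add once the
    interface is back up. All values are supplied by the caller.
    """
    cmds = [
        ["ifconfig", iface, "down"],
        ["ifconfig", iface, "up"],
    ]
    for net, mask in routes:
        cmds.append(["route", "add", "-net", net, "netmask", mask, "dev", iface])
    return cmds


def link_alive(host, run=subprocess.run):
    """Return True if a single ping to `host` succeeds."""
    result = run(
        ["ping", "-c", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_and_recover(host, iface, routes=(), run=subprocess.run):
    """Ping once; if that fails, cycle the interface and restore routes.

    Returns True if the recovery commands were run, False otherwise.
    `run` is injectable so the logic can be exercised without root.
    """
    if link_alive(host, run):
        return False
    for cmd in recovery_commands(iface, routes):
        run(cmd, check=False)
    return True
```

A call such as `check_and_recover("192.168.1.2", "eth1", [("192.168.1.0", "255.255.255.0")])` could be run periodically from cron; the host, interface, and network here are placeholders, not values from the actual cluster.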

I think it's got to be a driver bug.  Corbett, did you ever talk to Don or
the linux-tulip list?

Scott