Transmit timeouts in v1.06 (v1.09l is worse??)

Andreas Vierengel andreas@Vierengel.de
Tue Sep 7 13:00:41 1999



Mark Hagger wrote:
> 
> Hi,
> 
> I've been running kernel 2.2.5 under Red Hat 5.2 on a parallel cluster of
> machines.  Unfortunately, under high network, CPU and disk I/O load I kept
> getting repeated Transmit timeout messages from the eepro100 driver (v1.06).
> These effectively left the machine hung, and I typically had to power off to
> reboot it.
> 
> I've tried replacing v1.06 with the latest version, v1.09l, but if anything this
> was worse: under the same load conditions the machines now crash fatally.
> I got a kernel oops once, but it didn't appear in the syslog so I wasn't able to
> process it.  Other than that, the machine typically locks solid (blank screen as
> well, sadly), and I couldn't do anything except power off.
> 
> Is anyone out there having any success with these eepro100 cards?  I see a number
> of people getting similar problems on machines with high network traffic.
> 
> Unfortunately this is somewhat killing my parallel application: as I update my
> code to get better network throughput, I am able to crash/hang the machines
> quicker...

:-)
But seriously, I had the same problems under high load, even with a 2.0.x kernel.
I switched back to v1.05 and everything is working again. Maybe that will work for you, too?

--Andy