[eepro100] EEpro100, Red Hat 7.1, wait_for_cmd_done timeout errors

Alexander Tarkhov alex@poplar.ru
Fri Feb 14 09:08:14 2003


David,

An overview would be like this:
There is a work-around that works in some cases -- an upgraded Scyld driver
or the driver from the chip vendor.
There is not much of an explanation, and even less of a resolution, for this
problem.
The problem seems to originate so deep in the chip design
and architecture that it's just
not worth explaining on a list like this - something about the hardware being
ready/not ready at unpredictable times.
(God, what happened to my English?)

But the good thing about your case is that you are the first to report
multiple failures ("...some of the nodes...").
What we have seen before were just isolated, unlucky specimen...
(Can't find the plural for this word in the dictionary; could someone
please tell me?)
So you are probably experiencing a real incompatibility between the driver
version and the kernel version,
which means a driver upgrade might help...

Good luck!
Alexander

David B. Ritch wrote:

 >There were a couple threads in January on this subject.  Was there ever
 >a resolution to this issue?
 >
 >I'm seeing some similar problems with the onboard NIC on a Tyan 2720
 >motherboard in a small cluster, and they're really strange.  A couple of
 >weeks ago, the problems stopped for no apparent reason.  Then we shipped
 >the system to a client site, and they came back.
 >
 >Pretty regularly, some of the nodes lose their ethernet after being up
 >for 90 minutes.  We are using the driver from Scyld, and sleep mode is
 >turned off.
 >
 >Since shipping the system seems to have triggered it again, I'm a little
 >suspicious of cables and connectors...
 >
 >Thanks,
 >
 >dbr
 >