[vortex] Re: 3c905 errors:
Steven Timm
timm@fnal.gov
Mon Oct 13 10:54:01 2003
On Fri, 10 Oct 2003, Bogdan Costescu wrote:
> On Thu, 9 Oct 2003, Steven Timm wrote:
>
> > Up to this point I haven't been able to observe one of these errors as
> > it happens, but over the course of this week there has been an average
> > of two or three of them per node in a 240-node cluster.
>
> I always advise against forcing the media type, but would it be possible
> to force it for one or more computers to 100Mbit, full-duplex? According
> to what you say above, it should be apparent in just a few days (if not
> sooner) what the outcome is. Please keep in mind to force both sides of
> the connection: the network card (with 'mii-diag -F') and the switch port.
It was forced initially, when we installed the nodes, on both sides.
The effect was that a number of network connections just ground
to a halt: the link light went out and stayed out, and the link could
only be recovered with a power cycle and a switch reset.
Steve
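
For reference, the NIC side of the forcing discussed above can be
sketched as follows. This is a minimal Python sketch under stated
assumptions, not the exact commands used on the cluster: the interface
name eth0 and the helper force_media_cmd are hypothetical, and the
media names are assumed to be the ones mii-diag's -F option accepts.
The switch port has to be forced separately, through the switch's own
configuration.

```python
# Assumed media names for mii-diag's -F (fixed speed) option.
MEDIA = {"100baseTx-FD", "100baseTx-HD", "10baseT-FD", "10baseT-HD"}


def force_media_cmd(iface="eth0", media="100baseTx-FD"):
    """Build (but do not run) the mii-diag command that forces a media type.

    iface and the default media are illustrative; adjust per node.
    """
    if media not in MEDIA:
        raise ValueError(f"unknown media type: {media}")
    return ["mii-diag", "-F", media, iface]


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run as root.
    print(" ".join(force_media_cmd()))
```

The helper only builds the command; running it needs root on the node,
and the matching switch port must be set to the same speed and duplex,
otherwise a duplex mismatch is likely.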
>
> > Any advice on what might be happening?
>
> A completely blind shot: are the network cables all coming from the same
> manufacturer?
Initially they were; a few have been replaced. But every time
we've seen one of these errors, we've checked the physical
layer and usually don't see problems.
Steve
>
> --
> Bogdan Costescu
>
> IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
> Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
> Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
> E-mail: Bogdan.Costescu@IWR.Uni-Heidelberg.De
>
>