[realtek] rtl8139_tx_interrupt [8139too] problem in Linux cluster

Narisara Thongboonchoo nthongbo@cgrer.uiowa.edu
Wed Sep 18 11:10:01 2002


Dear Donald and Realtek users,

All 4 of my nodes have gone down, one at a time and at random. I increased the
memory from 512 MB to 1 GB, but the nodes still go down with the same error
message. I also changed the network driver to Netgear, but that did not solve
the problem. Could you offer any suggestions?

Regards,

Narisara

On Sat, 14 Sep 2002, Donald Becker wrote:

> On Thu, 5 Sep 2002, Narisara Thongboonchoo wrote:
>
> > I am having trouble with a 4-node Linux cluster when running a program
> > with MPI and ssh. I cannot finish my job because one of the 4 nodes
> > keeps dying at random.
>
> The same node, or different nodes?
>   If it's the same node every time, you shouldn't be looking for a
>   software fix.
>
> > The job was killed because there was no route to that machine.  I'm not
> > sure why it happened, but I found error messages about
> > rtl8139_tx_interrupt & rtl8139_interrupt. Is it possible that network
> > communication causes this problem? If so, could you give me any
> > suggestions?
>
> If this isn't a memory problem, then it's a device driver problem.  No
> user-level software should be able to cause this type of kernel error.
>
> > Call Trace: [<e098e308>] rtl8139_tx_interrupt [8139too] 0x128
> > [<e098e91a>] rtl8139_interrupt [8139too] 0xba
> > [<c0109c7a>] handle_IRQ_event [kernel] 0x3a
> > [<c0109df8>] do_IRQ [kernel] 0x68
> >
> > Code: ff 50 14 8b 00 29 32 c0 83 e0 d7 83 c8 04 5a a9 03 00 00
> > <0> kernel panic: Aiee, killing interrupt handler!
>
>
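
To compare the traces collected from the different nodes, a small script
along the following lines might help. It is only a sketch: the names
parse_call_trace and modules_in_trace are made up for illustration, and the
pattern assumes every entry looks like the ones in the trace quoted above
(address, symbol, module, offset). It pulls the entries out of a saved log
and counts how many belong to each module, so it is easy to see whether
8139too shows up in every crash.

```python
import re
from collections import Counter

# Matches entries such as "[<e098e308>] rtl8139_tx_interrupt [8139too] 0x128".
# Assumption: every entry follows this address/symbol/module/offset layout.
FRAME_RE = re.compile(
    r"\[<([0-9a-fA-F]+)>\]\s+(\S+)\s+\[([^\]]+)\]\s+(0x[0-9a-fA-F]+)"
)


def parse_call_trace(text):
    """Return a list of (address, symbol, module, offset) tuples found in text."""
    return [
        (int(addr, 16), symbol, module, int(offset, 16))
        for addr, symbol, module, offset in FRAME_RE.findall(text)
    ]


def modules_in_trace(frames):
    """Count how many trace entries belong to each module."""
    return Counter(module for _, _, module, _ in frames)


# The trace from this thread, used here as sample input.
SAMPLE = """\
Call Trace: [<e098e308>] rtl8139_tx_interrupt [8139too] 0x128
[<e098e91a>] rtl8139_interrupt [8139too] 0xba
[<c0109c7a>] handle_IRQ_event [kernel] 0x3a
[<c0109df8>] do_IRQ [kernel] 0x68
"""

if __name__ == "__main__":
    frames = parse_call_trace(SAMPLE)
    for addr, symbol, module, offset in frames:
        print(f"{addr:#010x}  {module:8s} {symbol}+{offset:#x}")
    print(modules_in_trace(frames))
```

Running it over the log saved from each node would show whether the panic
always passes through the same driver functions.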

-- 

                                       ^---^
*********************************    >( . . )<   Meaw..Meaw
Narisara  Thongboonchoo                ..x..
326 Hawkeye Drive                    .   @  .
Iowa City, IA 52246, USA           . .     . .
Tel&Fax : (319) 353-4797 (home)      . .  x  . .
        : (319) 335-2063 (Office)   .m.  |  .m.
          (319) 335-3335 (Lab)        .  |  .
********************************     .*  |  *.
                                    *....|....*