[tulip] many tulip problems with a cogent 4port nic

David Flynn Dave@keston.u-net.com
Wed, 25 Jul 2001 12:31:58 +0100


Sorry guys if this is a repeat; it seems that the mail server may have
screwed something up ...

----- Original Message -----
From: "Donald Becker" <becker@scyld.com>
To: <Dave@keston.u-net.com>
Cc: <tulip@scyld.com>
Sent: Tuesday, July 17, 2001 1:14 PM
Subject: Re: [tulip] many tulip problems with a cogent 4port nic

Sorry it's been a while, I've been on holiday for a week ...

OK, I (eventually) managed to compile your pci-scan and tulip drivers on
kernel 2.4.5 without getting unresolved symbols, using the following
commands:

gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -O2 \
    -fomit-frame-pointer -fno-strict-aliasing -pipe \
    -mpreferred-stack-boundary=2 -march=i586 -DMODULE -DMODVERSIONS \
    -include /usr/src/linux/include/linux/modversions.h -c tulip.c

gcc -D__KERNEL__ -I/usr/src/linux/include -Wall -Wstrict-prototypes -O2 \
    -fomit-frame-pointer -fno-strict-aliasing -pipe \
    -mpreferred-stack-boundary=2 -march=i586 -DMODULE -DMODVERSIONS \
    -include /usr/src/linux/include/linux/modversions.h -c pci-scan.c


> On Sat, 14 Jul 2001 Dave@keston.u-net.com wrote:
>
> >      I seem to be in the unfortunate position of owning a motherboard
> > with an AMI BIOS, and a 4 port NIC.  The NIC is a Cogent 4port thingy,
> > with four DEC 21040 chips and the 21050 (I think) PCI-PCI bridge. (I
> > am sorry I cannot be more specific but I can't gain access to the card,
> > and am working from memory on what it is.)
>
> There are several Cogent (now Adaptec) 4 port cards.
> 4*21040
^^^^ it's this one

> 4*21140
> 4*21143 w/ SYM
> 4*21143 w/ MII
>
> > Firstly, I am running Linux kernel 2.4.5, and using the tulip driver
> > built with the kernel sources.
> > It would seem that I have a problem with the probing of the card, i.e.
> > the card has one (S?)ROM and the probing is being done in the wrong
> > order (apparently a BIOS problem, yes?)
>
> I haven't checked the 2.4 scan code -- the scan order could be a Linux
> issue.

OK, I've done some playing with this. Using the 'old' 2.2.19 tulip driver,
the card is only detected correctly with reverse_probe=1, and it then seems
to work fine (I've tested it with eth0 and eth3 both running
simultaneously, and it doesn't crash).

Using the new tulip test driver under 2.2.19, the probing is done in the
wrong order, and the following is output:

tulip.c:v0.92w 7/9/2001 Written by Donald Becker <becker@scyld.com>
  http://...yadda
eth0: Digital DC21040 Tulip rev 36 at 0xc281bc00, EEPROM not present, 00:4C:69:6E:75:79 IRQ 11
eth1: Digital DC21040 Tulip rev 36 at 0xc281d800, EEPROM not present, 00:4C:69:6E:75:79 IRQ 11
eth2: Digital DC21040 Tulip rev 36 at 0xc281f400, EEPROM not present, 00:4C:69:6E:75:79 IRQ 11
eth3: Digital DC21040 Tulip rev 36 at 0xc2821000, 00:00:92:92:39:58 IRQ 3

If I try to bring up any interface, the system hangs.
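
The placeholder address reported for eth0-eth2 can be read byte by byte. A
minimal Python sketch (the function name mac_to_ascii is made up for
illustration) that renders the bytes as characters:

```python
def mac_to_ascii(mac: str) -> str:
    """Render each MAC byte as its ASCII character, or \\xNN if unprintable."""
    out = []
    for part in mac.split(":"):
        value = int(part, 16)
        # Printable ASCII is 0x20..0x7e; show everything else as an escape.
        out.append(chr(value) if 0x20 <= value <= 0x7E else f"\\x{value:02x}")
    return "".join(out)


placeholder = mac_to_ascii("00:4C:69:6E:75:79")  # address shown for eth0-eth2
real = mac_to_ascii("00:00:92:92:39:58")         # address shown for eth3
print(placeholder)
print(real)
```

The first address decodes to a NUL byte followed by "Linuy", that is, "Linux"
with the last byte one higher, which looks like a driver-generated
placeholder rather than an address read from the card.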

Likewise for 2.4.5: the driver does exactly the same. However, under 2.4.5,
if you try to remove the module, the following errors occur:


Trying to free nonexistent resource <c2821000-c282107f>
Trying to free nonexistent resource <c281f400-c281f47f>
Trying to free nonexistent resource <c281d800-c281d87f>
Trying to free nonexistent resource <c281bc00-c281bc7f>

This does not happen under 2.2.19.
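
As a sanity check on the messages above, each range can be compared with the
128-byte regions the card advertises. A small Python sketch, assuming the
<start-end> bounds are inclusive hex addresses (the helper name
resource_ranges is hypothetical):

```python
import re

MESSAGES = """\
Trying to free nonexistent resource <c2821000-c282107f>
Trying to free nonexistent resource <c281f400-c281f47f>
Trying to free nonexistent resource <c281d800-c281d87f>
Trying to free nonexistent resource <c281bc00-c281bc7f>
"""


def resource_ranges(log: str) -> list[tuple[int, int]]:
    """Return (start, length) for every <start-end> pair found in the log."""
    ranges = []
    for start_hex, end_hex in re.findall(r"<([0-9a-f]+)-([0-9a-f]+)>", log):
        start, end = int(start_hex, 16), int(end_hex, 16)
        ranges.append((start, end - start + 1))  # inclusive end, so add 1
    return ranges


ranges = resource_ranges(MESSAGES)
for start, length in ranges:
    print(f"{start:#x}: {length} bytes")
```

Each range works out to 128 bytes and starts at one of the addresses the
driver printed for eth0-eth3, listed in reverse order.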

>
> > here is the output from dmesg with respect to the tulip driver:
> > Linux Tulip driver version 0.9.15-pre2 (May 16, 2001)
> > eth0: Digital DC21040 Tulip rev 36 at 0xfc80, EEPROM not present, 00:4C:69:6E:75:79, IRQ 11.
> ...
> > eth3: Digital DC21040 Tulip rev 36 at 0xf800, 00:00:92:92:39:58, IRQ 3.
>
> > The first three eth ports have incorrect MAC addresses, based on
> > \0LINUX (this is why I am guessing the probing is in the wrong order),
> > and am I also correct in thinking the IRQs are wrong?
>
> A common x86 BIOS bug is failing to correctly assign the IRQs to the
> devices on the other side of a PCI bus bridge.  Few cards use bus
> bridges, so this problem was around for years.
>
> I worked around the IRQ problem in my drivers by copying the IRQ setting
> from the first device.
>
> > here is an output from # ./lspci -vv
> > 00:0d.0 Class 0604: 1011:0001 (rev 02)
> >         Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
> >         I/O behind bridge: 0000f000-0000ffff
> >         Memory behind bridge: ff900000-ff9fffff
> > 01:04.0 Class 0200: 1011:0002 (rev 24)
> >         Interrupt: pin A routed to IRQ 3
> >         Region 0: I/O ports at f800 [size=128]
> >         Region 1: Memory at ff9ff000 (32-bit, non-prefetchable) [size=128]
> >
> > 01:05.0 Class 0200: 1011:0002 (rev 24)
> >         Interrupt: pin A routed to IRQ 9
> >         Region 0: I/O ports at f880 [size=128]
> >         Region 1: Memory at ff9ff400 (32-bit, non-prefetchable) [size=128]
> >
> > 01:06.0 Class 0200: 1011:0002 (rev 24)
> >         Interrupt: pin A routed to IRQ 10
> >         Region 0: I/O ports at fc00 [size=128]
> >         Region 1: Memory at ff9ff800 (32-bit, non-prefetchable) [size=128]
> >
> > 01:07.0 Class 0200: 1011:0002 (rev 24)
> >         Interrupt: pin A routed to IRQ 11
> >         Region 0: I/O ports at fc80 [size=128]
> >         Region 1: Memory at ff9ffc00 (32-bit, non-prefetchable) [size=128]
>
>
> > I was under the impression that the drivers now didn't have a problem
> > with reverse probing; the options in your latest sources appear to be
> > deprecated, and don't exist anywhere else.
>
> My driver has explicit code to back-copy the media information to
> previously discovered chips.

which isn't working in this case, as it hasn't detected any media
information yet (reverse_probe is needed in older drivers).
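
To illustrate why the probe order matters here, below is a toy Python model,
not the driver's actual code. It assumes a scheme in which a port without its
own EEPROM inherits the settings of the most recently probed port that had
one; the names Port and probe are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Port:
    slot: int                # PCI device number behind the bridge
    eeprom: Optional[str]    # settings read from the EEPROM, if one is present


def probe(ports: list[Port]) -> list[Optional[str]]:
    """Probe ports in the given order; EEPROM-less ports inherit earlier settings."""
    last_seen: Optional[str] = None
    results = []
    for port in ports:
        if port.eeprom is not None:
            last_seen = port.eeprom   # remember for the ports probed later
        results.append(last_seen)     # None means nothing to inherit yet
    return results


# Assumption from the logs: only the port in slot 4 (I/O 0xf800, IRQ 3)
# reports a real-looking MAC address, so it is modelled as the EEPROM port.
board = [Port(4, "card settings"), Port(5, None), Port(6, None), Port(7, None)]

forward = probe(board)                   # EEPROM port probed first
backward = probe(list(reversed(board)))  # EEPROM port probed last
print(forward)
print(backward)
```

In the second ordering the first three ports have nothing to inherit, which
is consistent with the three "EEPROM not present" ports in the output above.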

> > However, I am under the impression that your drivers are incompatible
> > with the 2.4.x kernels (I can't get them to compile, and you say
> > something about it on your site).
>
> The versions in the 'test' directory mostly work with 2.4; however, I do
> not test with the 2.4 kernel.

see above

>
> > I have tried the de4x5 driver (in the kernel tree) and it doesn't seem
> > to want to play ball either; however, it does seemingly get the IRQs
> > right.
>
> > The final thing, which may be related, is the fact that with either
> > driver, tulip or de4x5 in the kernel tree, I can bring up any of the
> > four interfaces, only one of which works; however, if I try to bring up
> > two interfaces the whole system locks up without any messages of any
> > sort.
>
> Uhmmm, what makes you believe that the de4x5 driver has the interrupt
> assignment "right"?

On this issue I am only speculating, but you can probably answer this
better:

Under 2.2.19 (and 2.4), a cat of /proc/pci reveals:

  Bus  1, device   7, function  0:
    Ethernet controller: DEC DC21040 (rev 36).
      Medium devsel.  Fast back-to-back capable.  IRQ 11.  Master Capable.  Latency=66.
      I/O at 0xfc80 [0xfc81].
      Non-prefetchable 32 bit memory at 0xff9ffc00 [0xff9ffc00].
  Bus  1, device   6, function  0:
    Ethernet controller: DEC DC21040 (rev 36).
      Medium devsel.  Fast back-to-back capable.  IRQ 10.  Master Capable.  Latency=66.
      I/O at 0xfc00 [0xfc01].
      Non-prefetchable 32 bit memory at 0xff9ff800 [0xff9ff800].
  Bus  1, device   5, function  0:
    Ethernet controller: DEC DC21040 (rev 36).
      Medium devsel.  Fast back-to-back capable.  IRQ 9.  Master Capable.  Latency=66.
      I/O at 0xf880 [0xf881].
      Non-prefetchable 32 bit memory at 0xff9ff400 [0xff9ff400].
  Bus  1, device   4, function  0:
    Ethernet controller: DEC DC21040 (rev 36).
      Medium devsel.  Fast back-to-back capable.  IRQ 3.  Master Capable.  Latency=66.
      I/O at 0xf800 [0xf801].
      Non-prefetchable 32 bit memory at 0xff9ff000 [0xff9ff000].

Note the IRQs: your driver sets them all to IRQ 3, whereas de4x5 sets them
to the above values. Which is correct? (Note that it is only with the old
tulip driver that I can get more than 2 interfaces up at any one time.)
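
For comparing the two drivers, the per-device IRQs can be pulled out of the
listing mechanically. A rough Python sketch, assuming the /proc/pci text
layout shown above (the function name irq_table is made up, and only two of
the four entries are reproduced to keep it short):

```python
import re

PROC_PCI = """\
  Bus  1, device   7, function  0:
    Ethernet controller: DEC DC21040 (rev 36).
      Medium devsel.  Fast back-to-back capable.  IRQ 11.  Master Capable.  Latency=66.
      I/O at 0xfc80 [0xfc81].
  Bus  1, device   4, function  0:
    Ethernet controller: DEC DC21040 (rev 36).
      Medium devsel.  Fast back-to-back capable.  IRQ 3.  Master Capable.  Latency=66.
      I/O at 0xf800 [0xf801].
"""

ENTRY = re.compile(
    r"Bus\s+(\d+), device\s+(\d+), function\s+(\d+):"  # device header line
    r".*?IRQ (\d+)\."                                   # first IRQ after it
    r".*?I/O at (0x[0-9a-f]+)",                         # first I/O base after it
    re.S,  # let .*? run across line breaks, including wrapped lines
)


def irq_table(text: str) -> dict[str, tuple[int, int]]:
    """Map an I/O base address to (PCI device number, IRQ)."""
    table = {}
    for _bus, dev, _fn, irq, io_base in ENTRY.findall(text):
        table[io_base] = (int(dev), int(irq))
    return table


table = irq_table(PROC_PCI)
for io_base, (dev, irq) in table.items():
    print(f"I/O {io_base}: device {dev}, IRQ {irq}")
```

Keying the table by I/O base makes it easy to line the entries up with the
"at 0xfc80" and "at 0xf800" addresses in the 0.9.15-pre2 dmesg lines quoted
earlier.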


>
> Please try my driver with 2.2 and 2.4, and send a report.
>

That is my report. Basically, under either kernel, the test version of the
tulip driver (0.92w) does not work, and crashes when you try to bring up an
interface.

Is there anything anyone can suggest? (Any horrid kludges to force the
reverse probe, similar to the old tulip driver?)

If there are any further details people want, please ask, and I'll get them
for you.

Donald, what are the major changes between the tulip driver that's in the
2.2.19 kernel and the test one? Would it be worth trying to port the 2.2.19
driver to 2.4 as a temporary solution, since it seems to work under 2.2.19
whereas nothing else does? Or is it worth finding out and fixing the actual
problem?

Thanks,

Dave
>
> Donald Becker becker@scyld.com