[tulip] PCI reverse

Donald Becker becker@scyld.com
Tue Dec 18 19:45:01 2001


On Wed, 19 Dec 2001, David Flynn wrote:

> The pci_reverse patch is only a fix for machines with retarded BIOSes.  The
> problem is caused by the BIOS looking at devices on the other side of a PCI
> bridge in the wrong order.
> 
> Symptoms of this include, when the card is detected and configured by the
> driver, the first three ports will have an EEPROM not found error, and
> incorrect MAC addresses (by incorrect, i mean that it will look like this:

The v0.92/v0.93 driver will go back and correct the bogus
driver-generated station addresses when it does find the EEPROM.
But a bogus station address isn't a fatal problem -- it won't crash the
machine.  It's a bad IRQ assignment that is crashing the machine.

> Now the fix for this is to reverse the order devices on the other side of a
> PCI bridge are put into a list in the kernel. that is all the patch does.
> This then allows the driver to detect the first port as it should do (the
> one with the EEPROM), and all will almost work.  Note, the tulip driver
> attempts to fix the problem itself in 2.2.x kernels, and it is successful.
> However, things changed in 2.4 and this no longer works.

The driver must also do one additional thing: it corrects a related x86 BIOS
bug that assigns different IRQs behind the bus bridge, when the bus
bridge actually puts all interrupts on a single IRQ.

For this correction to work the driver must identify the primary
interface, which does get the correct IRQ assignment, and copy that IRQ
to the other interfaces behind the bus bridge.

The code at line 769 that does this is:

#if defined(__i386__)		/* Patch up x86 BIOS bug. */
		if (last_irq)
			irq = last_irq;
#endif
	}
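
As a rough illustration of why probe order matters for this fix-up, here
is a minimal standalone sketch, not the driver's actual code.  The
struct port_info type and the fixup_irqs() helper are hypothetical, and
the sketch assumes the primary interface is the port with the EEPROM.

```c
#include <stddef.h>

/* Hypothetical stand-in for one port behind the bridge; not a driver type. */
struct port_info {
	int has_eeprom;	/* assumption: nonzero marks the primary interface */
	int bios_irq;	/* IRQ the BIOS reported for this port */
	int irq;	/* IRQ the driver will actually use */
};

/*
 * Walk the ports in probe order.  Once a primary interface has been seen,
 * reuse its IRQ for every later port, mirroring the last_irq idea above.
 */
static void fixup_irqs(struct port_info *ports, size_t n)
{
	int last_irq = 0;	/* 0 means "no primary seen yet" */
	size_t i;

	for (i = 0; i < n; i++) {
		if (ports[i].has_eeprom) {
			/* The primary keeps the IRQ the BIOS gave it. */
			ports[i].irq = ports[i].bios_irq;
			last_irq = ports[i].irq;
		} else {
			/* Later ports inherit the primary's IRQ if one is known. */
			ports[i].irq = last_irq ? last_irq : ports[i].bios_irq;
		}
	}
}
```

If the primary port is probed last, the earlier ports never see a
nonzero last_irq and keep whatever the BIOS assigned, which is why the
detection order matters.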

> i will say it one more time, the patch only fixes problems with RETARDED
> BIOSES that have a slightly incorrect interpretation of the PCI (i think)
> specs.

Specifically, the fatal bug is BIOSes that only half understand bus
bridges.  Since everyone started out with the same PCI BIOS code, and
very few PCI cards have bus bridges, the bug has remained unfixed for
years.

> Now why i can not use the tulip driver (i last tried it when 2.4.5 was
> around) i don't know.  it just hangs the system when you try and bring two
> interfaces up.

It hangs when there is an interrupt that the device drivers cannot clear.

Note: The kernel could actually detect and report this case, but it
would add some slight operational overhead.  ("Hey, I've called these
interrupt handlers 10000 times and the interrupt still isn't cleared.")
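
The kind of check described in the note could look something like the
sketch below.  This is only an illustration, not kernel code: struct
irq_watch and irq_note_result() are hypothetical names, and it assumes
the dispatch loop can tell whether any handler claimed the interrupt.

```c
#define STUCK_IRQ_LIMIT 10000	/* threshold taken from the example in the note */

/* Hypothetical per-IRQ bookkeeping; not a kernel structure. */
struct irq_watch {
	unsigned long unhandled;	/* consecutive dispatches nobody claimed */
};

/*
 * Call once per interrupt dispatch.  'handled' is nonzero if some handler
 * claimed the interrupt (an assumption of this sketch).  Returns 1 when
 * the line has gone unclaimed STUCK_IRQ_LIMIT times in a row and should
 * be reported as stuck, 0 otherwise.
 */
static int irq_note_result(struct irq_watch *w, int handled)
{
	if (handled) {
		w->unhandled = 0;	/* progress was made; start counting again */
		return 0;
	}
	if (++w->unhandled >= STUCK_IRQ_LIMIT) {
		w->unhandled = 0;	/* report once per run of failures */
		return 1;
	}
	return 0;
}
```

The overhead is one counter update and a compare on every interrupt,
which is the slight operational cost mentioned above.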


Donald Becker				becker@scyld.com
Scyld Computing Corporation		http://www.scyld.com
410 Severn Ave. Suite 210		Second Generation Beowulf Clusters
Annapolis MD 21403			410-990-9993