[Beowulf] substantial RX packet drops during Pallas over e1000 (Rocks 4.1)
hahn at physics.mcmaster.ca
Tue May 30 13:03:51 PDT 2006
> Does anyone have any thoughts as to why these dropped packets would only
> appear under PCIe?
bios support for pcie is clearly different, and there are certainly
some kernel differences as well. I wonder if you were using MSI in one
case, for instance (for which kernel support is in flux, I think).
but really, are you sure the nic chip really is identical save
for the PCI interface? and does the bios/kernel configure it
the same otherwise?
I wonder whether there might be a sort of memory starvation issue,
where under high packet rates, the pcie manages to starve the CPU
more, and hold off packet processing. ('dropped' means that the
irq-time handling happened, I think, but the next stage in the stack,
which happens under a different context, didn't have enough time.)
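to watch the drop counter alongside the packet count while Pallas runs, something like this rough sketch could be sampled before and after a run. it assumes the usual /proc/net/dev column order (rx bytes, packets, errs, drop, ...); rx_stats is a made-up helper name, not anything from the e1000 driver, and the interface name is whatever yours is.

```python
def rx_stats(text, device):
    """Pull rx packet and drop counters for one interface out of
    /proc/net/dev-style text. Returns None if the interface is absent.
    Assumption: columns after 'name:' are rx bytes, packets, errs, drop, ..."""
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == device:
            values = rest.split()
            return {"packets": int(values[1]), "drop": int(values[3])}
    return None
```

feed it open("/proc/net/dev").read() twice and subtract to get drops per run rather than drops since boot.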
under pcie, did you see a proportionate number of irqs counted in
/proc/interrupts? did the ksoftirqd threads behave differently?
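for the irq comparison, a minimal sketch along these lines would total the per-cpu counts for the nic from /proc/interrupts, so the pci and pcie runs can be compared against their packet counts. it assumes the common layout (a header row of CPUn columns, then "irq: counts... type names"); irq_counts is a hypothetical helper and "eth0" in the usage is only an example name.

```python
def irq_counts(text, device):
    """Sum per-CPU interrupt counts over every /proc/interrupts line
    that lists `device` among its handler names.
    Assumption: first line is the CPU header, one count column per CPU."""
    lines = text.splitlines()
    ncpu = len(lines[0].split())
    totals = [0] * ncpu
    for line in lines[1:]:
        fields = line.split()
        counts = fields[1:1 + ncpu]
        # Shared irqs list several names separated by commas.
        names = [f.strip(",") for f in fields[1 + ncpu:]]
        if (len(counts) == ncpu
                and all(c.isdigit() for c in counts)
                and device in names):
            totals = [t + int(c) for t, c in zip(totals, counts)]
    return totals
```

sum(irq_counts(open("/proc/interrupts").read(), "eth0")) taken before and after a run gives the irq delta to set against the rx packet delta.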
finally, I wonder whether anything changes if you run a uniprocessor
kernel (and/or disable irq migration, etc.)
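one way to keep the nic's irq on a single cpu is to write a cpu bitmask to /proc/irq/N/smp_affinity (and stop any irq balancing daemon so it is not rewritten). the sketch below only builds the hex mask; smp_affinity_mask is an illustrative name, and the irq number has to come from /proc/interrupts on your box.

```python
def smp_affinity_mask(cpus):
    """Build the hex bitmask string expected by /proc/irq/N/smp_affinity:
    bit n set means the irq may be delivered to CPU n."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")
```

for example, smp_affinity_mask([0]) gives the mask that pins an irq to cpu0; writing it requires root.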