[Beowulf] substantial RX packet drops during Pallas over e1000 (Rocks 4.1)

Mark Hahn hahn at physics.mcmaster.ca
Tue May 30 13:03:51 PDT 2006


> Does anyone have any thoughts as to why these dropped packets would only 
> appear under PCIe?

bios support for pcie is clearly different, and there are certainly
some kernel differences as well.  I wonder if you were using MSI in one
case, for instance (for which kernel support is in flux, I think).

but really, are you sure the nic chip is identical save
for the PCI interface?  and does the bios/kernel configure it
the same otherwise?

I wonder whether there might be a sort of memory starvation issue,
where under high packet rates, the pcie nic manages to starve the CPU
more, and so holds off packet processing.  ('dropped' means that the
irq-time handling happened, I think, but the next stage in the stack,
which runs in a different context, didn't get enough time.)

under pcie, did you see a proportionate number of irqs counted in
/proc/interrupts?  did the ksoftirqd threads behave differently?
finally, I wonder whether anything changes if you run a uniprocessor
kernel (and/or disable irq migration, etc.)
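
to make the irq-vs-drop comparison concrete, here is a rough sketch in
python, not a tested tool.  it assumes the interface is called eth0, that
/proc/interrupts has one column per cpu followed by the controller type and
device names, and that /proc/net/dev uses the usual receive column order
(bytes, packets, errs, drop, ...).  the function names are made up for
illustration.

```python
def irq_counts(interrupts_text, iface):
    """Return per-CPU interrupt counts for the IRQ line that lists iface."""
    lines = interrupts_text.splitlines()
    if not lines:
        return None
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        counts = fields[1:1 + ncpu]
        # trailing fields: controller type, then device names (maybe comma-separated)
        owners = " ".join(fields[1 + ncpu:]).replace(",", " ").split()
        if len(counts) == ncpu and all(c.isdigit() for c in counts):
            if iface in owners:
                return [int(c) for c in counts]
    return None


def rx_stats(net_dev_text, iface):
    """Return received-packet and rx-drop counters for iface."""
    for line in net_dev_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name.strip() == iface:
            f = rest.split()
            # receive columns: bytes packets errs drop ...
            return {"packets": int(f[1]), "drop": int(f[3])}
    return None


def snapshot(iface="eth0"):
    """Read both /proc files once (Linux only); iface name is an assumption."""
    with open("/proc/interrupts") as f:
        irqs = irq_counts(f.read(), iface)
    with open("/proc/net/dev") as f:
        rx = rx_stats(f.read(), iface)
    return irqs, rx
```

calling snapshot() before and after a Pallas run on each kind of node and
subtracting the two gives irqs per cpu, packets and drops for that run, so
the irq-to-packet ratio can be compared between the two setups.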



