[Beowulf] Help with inconsistent network performance
lindahl at pbm.com
Tue Dec 18 21:50:53 PST 2007
On Tue, Dec 18, 2007 at 09:05:41PM -0500, Patrick Geoffray wrote:
> No, it just means the NIC supports it.
Well, then how about ethtool -S? That looks like an actual count of
flow control events, so a nonzero count of rx flow control events
means the switch must support it in some fashion.
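
As a rough sketch of how to pull those counters out of ethtool -S
output: the counter names are driver-specific, so the substrings matched
below are an assumption, and the sample text is illustrative rather than
captured from a real NIC.

```python
import re

# Substrings that commonly show up in pause-frame counter names.
# Exact names vary by driver, so this list is an assumption.
FLOW_CONTROL_HINTS = ("flow_control", "pause", "xon", "xoff")


def flow_control_counters(ethtool_stats_text):
    """Return {counter_name: value} for counters that look flow-control related."""
    counters = {}
    for line in ethtool_stats_text.splitlines():
        # Statistic lines look like "     some_counter: 123".
        match = re.match(r"\s*(\S+):\s*(\d+)\s*$", line)
        if not match:
            continue
        name, value = match.group(1), int(match.group(2))
        if any(hint in name.lower() for hint in FLOW_CONTROL_HINTS):
            counters[name] = value
    return counters


# Illustrative sample only, not output from a real machine.
SAMPLE = """NIC statistics:
     rx_packets: 1048576
     tx_packets: 2097152
     rx_flow_control_xon: 12
     rx_flow_control_xoff: 12
     tx_flow_control_xon: 0
     tx_flow_control_xoff: 0
"""

if __name__ == "__main__":
    for name, value in flow_control_counters(SAMPLE).items():
        print(f"{name}: {value}")
```

Feeding it the real output of ethtool -S for an interface, sampled
before and after a test run, would show whether the rx counters are
actually incrementing.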
> For RX hardware flow-control, you need enough buffer space to keep one
> full frame plus the latency on the longest wire, for every port. It is a
> bit more expensive to do with 10GigE, because you need faster memory and
> more of it. Some recent 10GigE chips use a shared SRAM buffer that is
> not big enough for the worst case with 9K packets:
Well, we know it can be done perfectly: it's done in InfiniBand
switches, and in that other 10 gig non-ethernet switch, what's it called?
Oh yeah, Myrinet. They do it, too.
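
To put rough numbers on the buffer requirement quoted above, here is a
back-of-the-envelope sketch. It assumes about 5 ns/m of propagation
delay, counts the round trip on the wire (the pause has to reach the
sender while data is still arriving), and ignores the sender's reaction
time, so treat the result as a lower bound. The function names and the
24-port, 100 m example are hypothetical.

```python
# Assumed signal propagation delay in copper or fiber, roughly 2/3 c.
PROPAGATION_NS_PER_M = 5


def pause_headroom_bytes(link_bps, max_frame_bytes, cable_m):
    """Per-port buffer: one full frame plus the bytes in flight on the wire."""
    # Round trip: the pause travels to the sender while data keeps arriving.
    round_trip_ns = 2 * cable_m * PROPAGATION_NS_PER_M
    bits_in_flight = link_bps * round_trip_ns  # still scaled by 1e9 (ns)
    # Ceiling division converts bit-nanoseconds/second to whole bytes.
    bytes_in_flight = -(-bits_in_flight // (8 * 10**9))
    return max_frame_bytes + bytes_in_flight


def switch_buffer_bytes(ports, link_bps, max_frame_bytes, cable_m):
    """Worst case: every port needs its own headroom at the same time."""
    return ports * pause_headroom_bytes(link_bps, max_frame_bytes, cable_m)


if __name__ == "__main__":
    ten_gig = 10 * 10**9
    per_port = pause_headroom_bytes(ten_gig, 9000, 100)
    total = switch_buffer_bytes(24, ten_gig, 9000, 100)
    print(f"per port: {per_port} bytes, 24 ports: {total} bytes")
```

In this model the jumbo frame dominates; the wire term only matters on
long cables or faster links.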
> Flow-control is not for everyone, and that's why it is often turned off
> by default. When a sender is paused, it will stop sending anything,
> including packets for different destinations. Dropping packets is
> expensive to recover but it keeps things moving.
Can Myrinet even disable flow control? Odd that Ethernet is any
different; dropping any packets is an utter disaster for TCP.