[Beowulf] Help with inconsistent network performance
patrick at myri.com
Tue Dec 18 15:21:35 PST 2007
Hi Joe, Brendan
Joe Landman wrote:
>> Since it is a full duplex switched network, there should not be any
>> collisions happening. Since the image is less than 1 MB total, I don't
> There could be blocking ... if one unit grabs the single network pipe
> of the display node while another node tries to send data, then the
> late node will back off (well with TCP it will) in a pre-determined manner.
It definitely looks like natural switch contention (N->1 pattern).
However, TCP's reaction will depend on how the switch itself handles
contention. If the hardware flow-control is turned off, packets will be
dropped in the switch, and TCP will quickly shrink its send window: big
hiccup. If the hardware flow-control is turned on, the sender NICs will
be paused and (hopefully) no packets are dropped. TCP will not be aware
of the backpressure and the send window may even increase a bit because
of the pausing delay: no big hiccup.
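
To make the difference concrete, here is a rough toy model of the N->1
case, not a measurement of any real switch. Everything in it is an
assumption made for illustration: the function name simulate_incast, the
one-packet-per-tick link rate, and the buffer size. It only counts packets
lost at the congested output port; it does not model TCP's window or
retransmissions.

```python
def simulate_incast(n_senders, packets_each, buffer_slots, flow_control):
    """Toy N->1 switch port. Returns (dropped_packets, ticks_elapsed).

    Hypothetical model: each sender offers one packet per tick, the
    output port drains one packet per tick, and the port buffer holds
    at most buffer_slots packets.
    """
    remaining = [packets_each] * n_senders
    total = n_senders * packets_each
    queue = dropped = delivered = ticks = 0

    while delivered + dropped < total:
        ticks += 1
        for i in range(n_senders):
            if remaining[i] == 0:
                continue
            if queue < buffer_slots:
                # Room in the output buffer: accept the packet.
                queue += 1
                remaining[i] -= 1
            elif flow_control:
                # Flow control on: the sender is paused and keeps the
                # packet, so it is delayed rather than lost.
                pass
            else:
                # Flow control off: the switch drops the packet.
                dropped += 1
                remaining[i] -= 1
        if queue > 0:
            # The single link to the display node drains one packet.
            queue -= 1
            delivered += 1
    return dropped, ticks


if __name__ == "__main__":
    for fc in (False, True):
        drops, ticks = simulate_incast(8, 100, 16, flow_control=fc)
        print(f"flow_control={fc}: dropped={drops} ticks={ticks}")
```

With flow control off, the drops are what TCP would see as loss and answer
by shrinking its send window. With it on, the same burst just takes longer
to drain and nothing is lost.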
I don't know about the hardware flow-control implementation in the
Procurve 2848, and it may just be off by default, as on most Ethernet
switches. FWIW, there was no working hardware flow-control on the 10GigE
Procurve switch that I have played with, even when turned on.