[Beowulf] Help with inconsistent network performance

Mark Hahn hahn at mcmaster.ca
Wed Dec 19 06:09:37 PST 2007


>> the first message should take <50 us.  the broadcast to 5 nodes should
>> take 2-3 more 50 us times.  so at about 200 us, all the slaves will start
>> the DOS attack on the viewer node's nic...
>
> I am not sure why you compare this to a DOS attack.  The same amount of data
> (and roughly the same amount of packets) should be arriving at the viewer
> node.  Yes it is stressing the switch more, but this switch should be able
> to handle much more traffic than this.

it's the _timing_ of the data.  using bcast, you attempt to cause the 
render nodes to, as simultaneously as possible, saturate their own
links, and therefore (N-1)-times oversaturate the viewer link.
it's exactly what you'd do if you wanted to provoke the switch to see
how it deals with congestion.
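
to put rough numbers on that, here is a back-of-envelope sketch. the 1 Gb/s link rate is my assumption (the thread doesn't state it), 1MB is taken as 10^6 bytes, and the function name and dict keys are made up for illustration. it ignores framing overhead and anything TCP does, so treat it as an upper bound on how bad the burst can get, not a model of the real switch.

```python
def incast_estimate(n_senders, total_bytes, link_bps):
    """Estimate the burst seen by one receiver port when n_senders
    each send total_bytes / n_senders at full line rate at the same time."""
    per_sender_bytes = total_bytes / n_senders
    # time each sender needs to push its share at line rate
    burst_seconds = per_sender_bytes * 8 / link_bps
    # aggregate rate arriving at the switch for the receiver's port
    offered_bps = n_senders * link_bps
    # excess over what the receiver link can carry, in units of link rate
    oversubscription = (offered_bps - link_bps) / link_bps
    # bytes the receiver port can forward while the burst is arriving
    drained_bytes = link_bps * burst_seconds / 8
    # what is left must sit in switch buffers or be dropped
    backlog_bytes = total_bytes - drained_bytes
    # best case time to deliver everything over the single receiver link
    drain_seconds = total_bytes * 8 / link_bps
    return {
        "burst_seconds": burst_seconds,
        "oversubscription": oversubscription,
        "backlog_bytes": backlog_bytes,
        "drain_seconds": drain_seconds,
    }


# assumed values: 5 render nodes, 1 MB frame, 1 Gb/s links
for key, value in incast_estimate(5, 1_000_000, 1e9).items():
    print(key, value)
```

with those assumed values the viewer port is oversubscribed by a factor of N-1 = 4 for the length of the burst, and about 4/5 of the frame has to be queued somewhere or dropped.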

some form of credit or backpressure-based flow control would solve 
the problem entirely, but ethernet doesn't have that.  pause frames
might well solve the problem, but since they're not universally implemented,
I would guess they don't work that well.  normal TCP flow-control 
(switch drops packet(s), sender eventually notices lack of ack, etc)
will work, but is probably too aggressive in backing off.  do you happen
to know which TCP congestion algorithm your kernel is using?  (cubic?  it's
probably listed in the boot messages or in /proc/sys/net/ipv4/.)  it's hard to 
find a TCP congestion algorithm that handles both lan and wan rates 
sensibly...
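
a quick way to check, assuming a linux kernel recent enough to expose the tcp_congestion_control and tcp_available_congestion_control entries under /proc/sys/net/ipv4/ (the helper name read_sysctl is just for illustration). on a box without those files it returns None rather than failing.

```python
from pathlib import Path

# directory mentioned above; the two entry names below are the standard
# linux sysctl files, assumed to be present on the kernel in question
PROC_IPV4 = Path("/proc/sys/net/ipv4")


def read_sysctl(name, base=PROC_IPV4):
    """Return the whitespace-separated values in base/name,
    or None if the entry is missing or unreadable."""
    try:
        return (Path(base) / name).read_text().split()
    except OSError:
        return None


# algorithm currently in use (e.g. a single name such as cubic or reno)
print("in use:   ", read_sysctl("tcp_congestion_control"))
# algorithms the running kernel has loaded and can switch to
print("available:", read_sysctl("tcp_available_congestion_control"))
```

run it on a render node and on the viewer node; they don't have to agree, and the sender's setting is the one that governs backoff.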

> 1MB / N -- then the hiccups must be coming from the final gather and not the
> broadcast.

yes, that's the part I'm calling the DOS ;)


