[Beowulf] Help with inconsistent network performance

Mark Hahn hahn at mcmaster.ca
Tue Dec 18 21:55:51 PST 2007


> I guess I figured that the data is relatively small compared to the
> bandwidth,

I agree, in principle.  and relatively small compared to the amount of ram
in the switch as well.

> whereas the latency for ethernet is relatively high.  I also

not _that_ high, though.  with a little tuning (coalesce parameters),
I think 30-40 us half-rtt is pretty common, even over a normal
tcp stack.  yes, that's 2+ 1.5k packets, but it's not _that_ much
compared to 1M images.

>>> To make sure there was not an issue with the MPI broadcast, I did one test
>>> run with 5 nodes only sending back 4 bytes of data each.  The result was
>>> a RTT of less than 0.3 ms.
>>
>> isn't that kind of high?  a single ping-pong latency should be ~50 us -
>> maybe I'm underestimating the latency of the broadcast itself.
>
>
> This is quite a bit more than a single ping-pong. The viewer sends to the
> master node (rank 0), and then the master node broadcasts to all other
> nodes, and then all nodes send back to the viewer node.  I don't know if
> this still seems high?

the first message should take <50 us.  the broadcast to 5 nodes should 
take 2-3 more 50 us times.  so at about 200 us, all the slaves will start
the DOS attack on the viewer node's nic...
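
the arithmetic above can be sketched roughly as follows.  this is only a
back-of-the-envelope model, not a measurement: it assumes every hop costs
the same 50 us, that the broadcast is a binomial tree over 6 ranks (master
plus 5 others), and the function names are made up for illustration.

```python
def bcast_rounds(num_ranks: int) -> int:
    """Rounds needed by a binomial-tree broadcast among num_ranks ranks
    (root included): ceil(log2(num_ranks)), computed with integers."""
    return (num_ranks - 1).bit_length()


def time_until_replies_start_us(num_ranks: int, hop_us: float = 50.0) -> float:
    """Estimated time before the slaves start replying to the viewer:
    one viewer -> master message plus the broadcast rounds.
    hop_us = 50 is the assumed per-message latency from the text."""
    return hop_us * (1 + bcast_rounds(num_ranks))


# Assumed setup: master + 5 other nodes = 6 ranks, i.e. 3 broadcast rounds.
estimate_us = time_until_replies_start_us(6)
```

with these assumptions the estimate comes out at 200 us, which leaves
roughly 100 us of the observed <0.3 ms for the replies themselves.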

> But the bcast is always just sending 4 bytes (a single integer), and as

no, afaik no mpi implementations actually utilize the eth-level bcast,
but rather implement bcast as a tree of (uni) sends.
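
to illustrate what "a tree of sends" means, here is a minimal sketch of a
binomial-tree schedule rooted at rank 0.  it is not taken from any
particular mpi implementation (real libraries pick among several
algorithms), and binomial_bcast_schedule is a hypothetical name.

```python
def binomial_bcast_schedule(num_ranks: int) -> list[list[tuple[int, int]]]:
    """Return the rounds of a binomial-tree broadcast from rank 0.

    Each round is a list of (src, dst) point-to-point sends. In every
    round, each rank that already holds the data forwards it to one
    rank that does not, so the number of holders doubles per round.
    """
    rounds = []
    have = 1  # ranks 0 .. have-1 currently hold the data
    while have < num_ranks:
        sends = [(src, src + have)
                 for src in range(have)
                 if src + have < num_ranks]
        rounds.append(sends)
        have *= 2
    return rounds


# Assumed setup: 6 ranks (master plus 5 others).
schedule = binomial_bcast_schedule(6)
```

for 6 ranks this gives three rounds of unicast sends, with every non-root
rank receiving exactly once.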
