[Beowulf] interconnect and compiler ?

Thu Jan 29 18:05:07 PST 2009

On Thu, Jan 29, 2009 at 07:22:10PM -0500, Mark Hahn wrote:

> interesting.  the latency numbers I could find were 1.29 (HTX) vs 1.7.
> did the latency improve, as well as the overhead?  also, what changed
> the overhead?

Yes, the DDR PCIe adaptor improved both latency and overhead.
One big item was that the chip-to-host link doubled in speed
(8 bit to 16 bit PCIe), to be similar to what the HTX chip had all
along. The chip-to-host link speed shows up as an overhead on the
sending side.

> I'll bite: suppose I run large MPI jobs (say, 1k rank)
> and have 8 cores/node and 1 nic/node.  under what circumstances
> would a node be primarily worried about message rate, rather than latency?

Well, let's say that you're doing a stencil computation on a 3D grid,
diagonals included. Then each cycle each core needs to send to 26
neighbors, and then receive from 26 neighbors. Even if you have
fat-ish nodes (~ 8 cores) and a clever layout of the cores onto the 3D
grid, that's a lot of off-node messages in a row. How quickly the NIC
can push them out back to back is message rate, not latency.
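
To put a rough number on that, here is a minimal Python sketch that counts
off-node sends per node per cycle for the 26-neighbor stencil. The assumptions
are mine, not measurements: one grid cell per core, the cores of a node laid
out as a contiguous block, a grid large enough that edges don't matter, and one
message per neighbor with no aggregation. The function name off_node_sends and
the block shapes are hypothetical, chosen only for illustration.

```python
from itertools import product


def off_node_sends(block=(2, 2, 2)):
    """Count sends per cycle that leave a node for a 26-neighbor 3D stencil.

    Assumes one cell per core, the node's cores forming a contiguous block
    of the given shape, and one message per (cell, neighbor) pair.
    """
    # All 26 neighbor offsets: every combination of -1/0/+1 except (0, 0, 0).
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    total = 0
    for cell in product(*(range(n) for n in block)):
        for d in offsets:
            nbr = tuple(c + o for c, o in zip(cell, d))
            # A neighbor outside the block lives on another node.
            if not all(0 <= x < n for x, n in zip(nbr, block)):
                total += 1
    return total


if __name__ == "__main__":
    for shape in [(2, 2, 2), (8, 1, 1)]:
        sends = off_node_sends(shape)
        cores = shape[0] * shape[1] * shape[2]
        print(shape, "off-node sends per node:", sends,
              "per core:", sends / cores)
```

Under these assumptions a 2x2x2 layout of 8 cores gives 152 off-node sends per
node per cycle (19 per core), and a naive 8x1x1 layout gives 194. Either way
the single NIC has well over a hundred small messages to issue back to back,
plus the matching receives.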

-- greg