[Beowulf] interconnect and compiler ?
lindahl at pbm.com
Fri Jan 30 14:36:36 PST 2009
On Fri, Jan 30, 2009 at 05:03:42PM -0500, Patrick Geoffray wrote:
> There are tons of variables. The one I keep thinking about is PIO
> sending for larger message sizes than usual. If the data is in cache
> (a reasonable assumption on the send side), it can remove a lot of load
> from the memory bus compared to DMA. If your code is memory-bandwidth
> bound (aren't they all on multi-core?), then you get a speedup.
I think that is an effect on the send side, but people (including you)
who've looked at PIO sends with interconnects which can do either PIO
or DMA tell me that their measured DMA threshold is pretty low.
Someone at QLogic can comment on what their software strategy is on
their DDR hardware, if they wish to talk about it.
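To make "DMA threshold" concrete, here is a toy cost model, not anyone's
actual driver logic. It assumes PIO costs only per-byte time and DMA costs a
fixed setup time plus per-byte time. The names dma_setup_s, pio_bw and dma_bw
are hypothetical parameters, and the model deliberately ignores the
memory-bus load effect Patrick describes, which would push the threshold up.

```python
def dma_crossover_bytes(dma_setup_s, pio_bw, dma_bw):
    """Message size (bytes) above which DMA beats PIO in a toy model.

    Assumed costs (illustrative only):
      pio_time(n) = n / pio_bw
      dma_time(n) = dma_setup_s + n / dma_bw
    """
    if dma_bw <= pio_bw:
        # DMA never catches up if its streaming rate is not higher.
        return float("inf")
    return dma_setup_s / (1.0 / pio_bw - 1.0 / dma_bw)


def choose_send_path(nbytes, threshold):
    """Pick PIO below the threshold, DMA at or above it."""
    return "PIO" if nbytes < threshold else "DMA"
```

In this model a small setup cost or a large DMA bandwidth advantage gives a
low crossover, which is consistent with a low measured threshold.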
> That's a warping of the (old and getting older) LogP model :-)
Well, yes. Modern interconnects look less and less like what the Berkeley
guys had in mind when they wrote that paper. Still, more pipelining
with shorter stages can only be better.
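A rough way to see the pipelining point is with an idealized pipeline
model. This is not the LogP model itself. It assumes a fixed amount of work
per chunk split evenly across the stages, with no per-stage overhead; the
function and parameter names are made up for illustration.

```python
def pipelined_time(n_chunks, n_stages, total_work_per_chunk):
    """Time to push n_chunks through an ideal pipeline.

    Each chunk needs total_work_per_chunk of processing, split evenly over
    n_stages. The first chunk takes n_stages stage-times to emerge, and each
    later chunk emerges one stage-time after the previous one.
    """
    stage_time = total_work_per_chunk / n_stages
    return (n_stages + n_chunks - 1) * stage_time


# Same work per chunk, cut into more and shorter stages.
for stages in (1, 2, 4):
    print(stages, pipelined_time(8, stages, 4.0))
```

Under these assumptions the total time is work * (1 + (n_chunks - 1) /
n_stages), which only goes down as the stage count goes up. Real hardware
adds a cost per stage, so the gain flattens out eventually.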
BTW, I agree with your comments on VLs, that's pretty much the same
rant I give on the topic. Since it's rare to have 2 implementations of
the same interconnect switch, it will be interesting to see what
implementation of VL QLogic puts in their IB switch chips, and how it
compares to the Mellanox chips. Fibre Channel is much more evil to do
well than IB, so we may be pleasantly surprised.