[Beowulf] Performance characterising a HPC application
greg.lindahl at qlogic.com
Mon Mar 26 14:35:15 PDT 2007
On Mon, Mar 26, 2007 at 03:58:59PM -0500, Richard Walsh wrote:
> but of course aggregation is a legitimate optimization technique
> because not all message patterns are of the Gups variety just as not
> all memory references are absent locality.
This is true, although I would call it more of an "ease of use" issue.
Everyone in the MPI arena already knows they're supposed to send
messages that are as big as possible, so it's fairly rare to find a
high-performance MPI code that sees an improvement from message
aggregation. In an ideal world the programmer wouldn't have to think
explicitly about aggregation; it would just happen. But today's
codes don't assume that.
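To make "aggregation" concrete, here is a minimal sketch in Python of what doing it by hand looks like: small payloads bound for the same destination are buffered and handed to the transport as one larger message. Everything in it is an assumption for illustration. The AggregatingSender class, the send_fn hook and the byte threshold are hypothetical names, not part of MPI or of any interconnect library.

```python
from typing import Callable, Dict, List


class AggregatingSender:
    """Batch small payloads per destination and send each batch as one message."""

    def __init__(self, send_fn: Callable[[int, bytes], None], threshold: int = 4096):
        # send_fn is a hypothetical transport hook, e.g. a wrapper around an MPI send.
        self.send_fn = send_fn
        # Flush a destination once this many bytes are pending for it (assumed policy).
        self.threshold = threshold
        self.pending: Dict[int, List[bytes]] = {}
        self.pending_bytes: Dict[int, int] = {}

    def send(self, dest: int, payload: bytes) -> None:
        """Queue a payload; flush the destination when the threshold is reached."""
        self.pending.setdefault(dest, []).append(payload)
        self.pending_bytes[dest] = self.pending_bytes.get(dest, 0) + len(payload)
        if self.pending_bytes[dest] >= self.threshold:
            self.flush(dest)

    def flush(self, dest: int) -> None:
        """Send everything queued for one destination as a single message."""
        chunks = self.pending.pop(dest, [])
        self.pending_bytes.pop(dest, None)
        if chunks:
            self.send_fn(dest, b"".join(chunks))

    def flush_all(self) -> None:
        """Send whatever is still queued, for every destination."""
        for dest in list(self.pending):
            self.flush(dest)


if __name__ == "__main__":
    wire = []  # stands in for the network: records each message actually sent
    sender = AggregatingSender(lambda dest, data: wire.append((dest, data)), threshold=8)
    for _ in range(4):
        sender.send(1, b"abc")
    sender.flush_all()
    print(len(wire), "messages on the wire for 4 application-level sends")
```

A real code would also need framing so the receiver can split the batch back into individual payloads, and it pays for the batching with extra latency on whatever sits in the buffer waiting for a flush.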
I only referred to GUPS because it's a widely available microbenchmark
that is not gamed by this optimization. Message rate and streaming
bandwidth benchmarks, on the other hand, are completely wrecked by it.