[Beowulf] bizarre scaling behavior on a Nehalem
tom.elken at qlogic.com
Thu Aug 13 16:37:07 PDT 2009
> On Behalf Of Christian Bell
> On Aug 12, 2009, at 11:14 AM, Bill Broadley wrote:
> > Is it really necessary for dynamic arrays
> > to be substantially slower than static?
> Yes -- with pointers, the compiler assumes (by default) that the
> pointers can alias each other, which can prevent aggressive
> optimizations that are otherwise possible with arrays.
> I remember stacking half a dozen pragmas over a
> 3-line loop on a Cray C compiler years ago to ensure that accesses
> were suitably optimized (or in this case, vectorized).
To add some details to what Christian says: the HPC Challenge version of STREAM uses dynamic arrays and is hard to optimize. I don't know what's best with current compiler versions, Bill, but you could try some of the following flags with your program. They were used in past HPCC submissions:
PathScale 2.2.1 on Opteron:
Base OPT flags: -O3 -OPT:Ofast:fold_reassociate=0
STREAMFLAGS=-O3 -OPT:Ofast:fold_reassociate=0 -OPT:alias=restrict:align_unsafe=on -CG:movnti=1
Intel C/C++ Compiler 10.1 on Harpertown CPUs:
Base OPT flags: -O2 -xT -ansi-alias -ip -i-static
Intel recently used
Intel C/C++ Compiler 11.0.081 on Nehalem CPUs:
-O2 -xSSE4.2 -ansi-alias -ip
and got good STREAM results in their HPCC submission on their Endeavor cluster.
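
As I understand it, a flag like -OPT:alias=restrict makes a no-overlap promise for the whole compilation, while C99's restrict qualifier lets you make the same promise per pointer in the source. Here is a minimal sketch of a STREAM-style triad on malloc'ed arrays, assuming a C99 compiler. The names triad and triad_demo are hypothetical, made up for illustration, and are not taken from STREAM or HPCC. Whether this actually closes the gap to static arrays depends on the compiler and version, so measure it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* STREAM-style triad on heap-allocated arrays. The restrict qualifiers
 * (C99) promise that a, b and c never overlap, so the compiler does not
 * have to assume that a store to a[i] may change a later b[i] or c[i]. */
static void triad(double *restrict a, const double *restrict b,
                  const double *restrict c, double scalar, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + scalar * c[i];
}

/* Hypothetical driver: allocates three dynamic arrays, runs the triad
 * once, and returns a[n-1], or -1.0 if an allocation fails. */
static double triad_demo(size_t n)
{
    assert(n > 0);
    double *a = malloc(n * sizeof *a);
    double *b = malloc(n * sizeof *b);
    double *c = malloc(n * sizeof *c);
    double last = -1.0;

    if (a && b && c) {
        for (size_t i = 0; i < n; i++) {
            b[i] = 1.0;
            c[i] = 2.0;
        }
        triad(a, b, c, 3.0, n);
        last = a[n - 1];
    }
    free(a);   /* free(NULL) is a no-op, so this is safe on failure too */
    free(b);
    free(c);
    return last;
}
```

Calling triad_demo from your own main with a large n and timing the triad call is enough to compare builds with and without the qualifiers. Note that passing overlapping arrays to a restrict-qualified function is undefined behavior, so the promise has to be true.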
> . . christian