[Beowulf] Re: vectors vs. loops

Joachim Worringen joachim at ccrl-nece.de
Tue May 3 03:48:33 PDT 2005

Robert G. Brown wrote:
> However, most code doesn't vectorize too well (even, as you say, with
> directives), so people would end up getting 25 MFLOPs out of 300 MFLOPs
> possible -- faster than a desktop, sure, but using a multimillion dollar
> machine to get a factor of MAYBE 10 in speedup compared to (at the time)
> $5-10K machines.  In the meantime, I'm sure that there were people who
> had code that DID vectorize well pulling their hair because of all those
> 100 hour accounts that basically wasted 90% of the resource.

This general statement is just wrong. Many scientific codes *do* vectorize well, 
and when they do, you do not get <10% of peak as you indicated, but typically 30 
to 50% or even more (see e.g. Leonid Oliker's recent papers on this topic) for 
the *overall application* (the vectorized loop alone runs close to 100% of 
peak). This is a significant difference.
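
As a rough illustration of why the whole-application figure lands well below 
the loop figure, here is a minimal sketch in C. The function name and the 
sample inputs in the note below are assumptions chosen for illustration, not 
measurements from any machine.

```c
#include <assert.h>

/*
 * Hypothetical helper: sustained fraction of peak for a whole application.
 *
 *   vec_fraction      share of the floating-point work done in vector loops
 *   vec_efficiency    fraction of peak reached inside those loops
 *   scalar_efficiency fraction of peak reached by the remaining code
 *
 * Time per unit of work is the sum of the vector part and the scalar part,
 * each divided by its own rate; the sustained fraction is the inverse.
 */
double sustained_fraction_of_peak(double vec_fraction,
                                  double vec_efficiency,
                                  double scalar_efficiency)
{
    assert(vec_fraction >= 0.0 && vec_fraction <= 1.0);
    assert(vec_efficiency > 0.0 && vec_efficiency <= 1.0);
    assert(scalar_efficiency > 0.0 && scalar_efficiency <= 1.0);

    double time = vec_fraction / vec_efficiency
                + (1.0 - vec_fraction) / scalar_efficiency;
    return 1.0 / time;
}
```

With assumed inputs of 80% of the work vectorized, 95% of peak inside the 
vector loops and 10% of peak for the scalar remainder, 
sustained_fraction_of_peak(0.8, 0.95, 0.1) comes out at about 0.35, so a small 
scalar remainder is enough to pull the overall figure into the 30 to 50% range.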

Another 'legend' comes up in your statement: a single node of a vector machine 
does not cost "multimillion" dollars. For suitable applications, the price 
factor is quite close to the performance factor.

> My own code doesn't vectorize too well because it isn't heavy on linear
> algebra and the loops are over 3 dimensional lattices where
> nearest-neighbor sums CANNOT be local in a single memory stream and

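Real vector architectures have very efficient scatter/gather memory operations 
and support indirect addressing efficiently as well, so accesses that do not 
form a single contiguous memory stream do not by themselves prevent 
vectorization.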
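
To make the access pattern concrete, here is a minimal sketch in C of a 
nearest-neighbor sum on a periodic 3D lattice written with indirect 
addressing. All names and the direction-major layout of the neighbor table 
are assumptions for illustration, not taken from the code under discussion. 
The inner loop over sites has independent iterations and reads the field 
through an index array, which is the kind of loop a vectorizing compiler may 
be able to map onto gather loads.

```c
#include <assert.h>
#include <stddef.h>

/* Linear index of site (x, y, z) on an n*n*n lattice with periodic wrap. */
static size_t site_index(size_t x, size_t y, size_t z, size_t n)
{
    return ((x % n) * n + (y % n)) * n + (z % n);
}

/*
 * Fill nbr[d * nsites + s] with the index of neighbor d (d = 0..5) of
 * site s. The table is direction-major so that, for a fixed direction,
 * the indices for consecutive sites are contiguous in memory.
 */
void build_neighbor_table(size_t n, size_t *nbr)
{
    assert(n > 0 && nbr != NULL);
    size_t nsites = n * n * n;
    for (size_t x = 0; x < n; ++x)
        for (size_t y = 0; y < n; ++y)
            for (size_t z = 0; z < n; ++z) {
                size_t s = site_index(x, y, z, n);
                nbr[0 * nsites + s] = site_index(x + 1, y, z, n);
                nbr[1 * nsites + s] = site_index(x + n - 1, y, z, n);
                nbr[2 * nsites + s] = site_index(x, y + 1, z, n);
                nbr[3 * nsites + s] = site_index(x, y + n - 1, z, n);
                nbr[4 * nsites + s] = site_index(x, y, z + 1, n);
                nbr[5 * nsites + s] = site_index(x, y, z + n - 1, n);
            }
}

/* out[s] = sum of field over the six nearest neighbors of site s. */
void neighbor_sum(size_t nsites, const size_t *nbr,
                  const double *field, double *out)
{
    for (size_t s = 0; s < nsites; ++s)
        out[s] = 0.0;

    for (size_t d = 0; d < 6; ++d) {
        const size_t *idx = nbr + d * nsites;
        /* Long loop over sites: each field[idx[s]] is an indirect load. */
        for (size_t s = 0; s < nsites; ++s)
            out[s] += field[idx[s]];
    }
}
```

The neighbor table is built once and reused for every sweep over the lattice, 
so its construction cost does not sit in the inner loop.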


Joachim Worringen - NEC C&C research lab St.Augustin
fon +49-2241-9252.20 - fax .99 - http://www.ccrl-nece.de
