[Beowulf] Re: vectors vs. loops

Josip Loncaric josip at lanl.gov
Wed May 4 08:19:35 PDT 2005

Vincent Diepeveen wrote:
> You shift the bandwidth problem of the expensive network in that case to
> the processor itself.

That may work for games, but not for everyone.  A common operation like

C = A + B

is very fast when A, B, and C are small enough to fit in the cache
simultaneously.  In scientific computing, however, each of these
vectors can be 1 GB (per CPU!), and the operation becomes
memory-bandwidth bound.  Today's memory bandwidth cannot sustain full
CPU speed on a problem like this.
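
To put rough numbers on this, here is a minimal sketch of the arithmetic.
It assumes 8-byte double-precision elements and counts only the two loads
and one store per addition (real hardware may move more, e.g. write-allocate
traffic on the store).  The function names and the bandwidth figure in the
usage lines are illustrative assumptions, not measurements.

```python
def vector_add(a, b):
    """Element-wise C = A + B: exactly one addition per element."""
    return [x + y for x, y in zip(a, b)]


def bytes_per_flop(element_bytes=8):
    """Memory traffic per addition, assuming no cache reuse.

    Each C[i] = A[i] + B[i] loads A[i] and B[i] and stores C[i],
    i.e. three element transfers for a single floating-point operation.
    """
    return 3 * element_bytes


def bandwidth_bound_flops(mem_bandwidth_bytes_per_s, element_bytes=8):
    """Upper limit on additions per second set by memory bandwidth alone."""
    return mem_bandwidth_bytes_per_s / bytes_per_flop(element_bytes)


if __name__ == "__main__":
    # Hypothetical sustained memory bandwidth of 6.4 GB/s (an assumption
    # chosen for illustration only).
    limit = bandwidth_bound_flops(6.4e9)
    print(f"bytes moved per addition: {bytes_per_flop()}")
    print(f"bandwidth-bound limit: {limit / 1e9:.2f} GFLOP/s")
```

With 24 bytes moved per addition, any CPU whose peak rate exceeds the
memory bandwidth divided by 24 will sit idle waiting for data once the
vectors no longer fit in cache.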

A fact of life in scientific computing, e.g. CFD, is that the workload 
resembles "C=A+B".  People try to get better reuse of data in cache, but 
there is only so much that an algorithm will allow.  Thus, memory (and 
network) bandwidths remain the main bottleneck.

