Chipsets and memory speed...
Robert G. Brown
rgb at phy.duke.edu
Tue Dec 5 05:26:27 PST 2000
On Tue, 5 Dec 2000, Ken wrote:
> Slashdot just had a pointer to an article about memory. The article
> indicates that latency is the bottleneck with memory. You might try
> searching http://www.slasdot.org to see if you can find the article.
> Very detailed memory discussion.
You also might check out lmbench and the various tools one can use to
measure memory latencies and bandwidths.
The effective "speed" of memory as seen by the CPU depends on a variety of
things -- memory access pattern, speed (clock) of the memory itself,
whether or not it is interleaved, the kind of memory, the width of the
memory bus, the size and speed and organization of the intermediary
caching layers (L1 and L2).
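To see how much the access pattern alone matters, here is a toy sketch of a
direct-mapped cache that counts misses for three patterns. Everything in it
is an assumption made for illustration: the cache geometry (256 lines of 8
words), the array sizes, and the names miss_rate and patterns. It models no
particular CPU and ignores associativity, prefetch, and L2.

```python
import random


def miss_rate(addresses, cache_lines=256, words_per_line=8):
    """Fraction of accesses that miss in a toy direct-mapped cache."""
    tags = [None] * cache_lines        # which memory line each slot holds
    misses = 0
    total = 0
    for addr in addresses:
        line = addr // words_per_line  # memory line containing this word
        slot = line % cache_lines      # direct-mapped: exactly one candidate slot
        if tags[slot] != line:
            misses += 1
            tags[slot] = line          # fetch the line, evicting the old occupant
        total += 1
    return misses / total if total else 0.0


def patterns(n_words=1 << 16, passes=4, small=1024, seed=0):
    """Three hypothetical access patterns over word addresses."""
    rng = random.Random(seed)
    n_access = n_words * passes
    return {
        # Working set of 1024 words fits inside the 2048-word toy cache.
        "small working set": [i % small for i in range(n_access)],
        # Repeated sequential passes over an array far larger than the cache.
        "streaming": [i % n_words for i in range(n_access)],
        # Uniformly random word addresses over the same large array.
        "random": [rng.randrange(n_words) for _ in range(n_access)],
    }


if __name__ == "__main__":
    for name, addrs in patterns().items():
        print(f"{name:18s} miss rate = {miss_rate(addrs):.4f}")
```

In this model the small working set almost never misses, the streaming pass
misses exactly once per 8-word line, and random access misses nearly every
time. The streaming misses are fewer, but every line still has to come from
main memory, which is why that case ends up limited by memory bandwidth.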
The one thing to remember before running out and buying a system with
superfast, incredibly expensive memory is that your mileage will vary
tremendously with the application. The return on a substantial
investment in a better memory subsystem can range from no visible
speedup (or a tiny 5% or so) to a doubling of overall performance (or
better). If your application's memory access pattern is such that
"cache works", you'll see little or no improvement with better memory,
because the cache already hides the memory's relatively slow speed from
the CPU. If your application's memory access pattern is either a
streaming pass through very large blocks of memory (e.g. vector
operations on large vectors and matrices) or completely random access
to a large block of memory (so that cache lookahead/prediction
algorithms cannot work), then the speedup with faster memory is directly
visible and significant.
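A back-of-the-envelope model shows why the range is so wide. Treat the
average access time as a hit-rate-weighted mix of cache and memory latency,
then compare it before and after a memory upgrade. The latencies below (5 ns
cache, 100 ns memory improved to 50 ns) and the function names are
hypothetical, chosen only to illustrate the arithmetic. The model also covers
memory access time only, so a real application that spends part of its time
computing would gain less.

```python
def effective_access_time(hit_rate, t_cache_ns, t_mem_ns):
    """Average access time: hits are served by cache, misses by main memory."""
    return hit_rate * t_cache_ns + (1.0 - hit_rate) * t_mem_ns


def memory_speedup(hit_rate, t_cache_ns, t_mem_old_ns, t_mem_new_ns):
    """Ratio of old to new average access time for a given cache hit rate."""
    old = effective_access_time(hit_rate, t_cache_ns, t_mem_old_ns)
    new = effective_access_time(hit_rate, t_cache_ns, t_mem_new_ns)
    return old / new


if __name__ == "__main__":
    # Hypothetical latencies: 5 ns cache, memory going from 100 ns to 50 ns.
    for hit_rate in (0.995, 0.99, 0.9, 0.5, 0.0):
        s = memory_speedup(hit_rate, 5.0, 100.0, 50.0)
        print(f"hit rate {hit_rate:5.3f}: speedup {s:.2f}x")
```

With these made-up numbers, a 99.5% hit rate yields roughly a 5% gain from
memory that is twice as fast, while a pattern that never hits in cache gets
the full factor of two.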
It is best to benchmark your application on a box with the "better"
memory to be sure that the cost/benefit of the transition is
advantageous. Also remember that sometimes a mere reorganization of
your code can yield a lot of the benefit of faster memory without the
cost -- ATLAS, for example, seeks to hide memory speed behind the
caching subsystem for linear algebra operations by tuning algorithm,
stride, and blocking so that the operations run out of cache.
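As a rough illustration of what "blocking" means, here is a sketch of a
tiled matrix multiply next to the plain triple loop. This is not ATLAS code
and not how ATLAS picks its parameters; the function names and the default
block size of 32 are assumptions for the example. In an interpreted language
the loop overhead hides any cache benefit, so read it for the loop structure,
which carries over directly to C or Fortran.

```python
def matmul_naive(a, b, n):
    """Plain triple loop: C = A * B for n x n matrices (lists of rows)."""
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c


def matmul_blocked(a, b, n, block=32):
    """Same product computed tile by tile.

    Each (ii, kk, jj) step touches only a block x block tile of A, B and C,
    so with a suitable block size the three tiles can stay cache resident
    while they are reused.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, block):
        i_end = min(ii + block, n)
        for kk in range(0, n, block):
            k_end = min(kk + block, n)
            for jj in range(0, n, block):
                j_end = min(jj + block, n)
                # Multiply the A tile (ii, kk) by the B tile (kk, jj)
                # and accumulate into the C tile (ii, jj).
                for i in range(ii, i_end):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, k_end):
                        aik, row_b = row_a[k], b[k]
                        for j in range(jj, j_end):
                            row_c[j] += aik * row_b[j]
    return c


if __name__ == "__main__":
    n = 6
    a = [[i + j for j in range(n)] for i in range(n)]
    b = [[i * j + 1 for j in range(n)] for i in range(n)]
    print(matmul_blocked(a, b, n, block=4) == matmul_naive(a, b, n))
```

The block size is the knob to benchmark: the right value depends on the
cache size and organization of the particular machine.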
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu