<br><br><div class="gmail_quote">2008/12/5 Robert G. Brown <span dir="ltr">&lt;<a href="mailto:rgb@phy.duke.edu">rgb@phy.duke.edu</a>&gt;</span><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

On Fri, 5 Dec 2008, Eugen Leitl wrote:<br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<br>

(Well, duh).<br>

</blockquote>

<br>

Good article, though, thanks.<br>

<br>

Of course the same could have been written (and probably was) back when<br>

dual processors came out sharing a single memory bus, and for every<br>

generation since.  The memory lag has been around forever -- multicores<br>

simply widen the gap out of step with Moore's Law (again).<br>

<br>

Intel and/or AMD people on list -- any words you want to say about a<br>

"road map" or other plan to deal with this?  In the context of ordinary<br>

PCs the marginal benefit of additional cores after (say) four seems<br>

minimal as most desktop users don't need all that much parallelism --<br>

enough to manage multimedia decoding in parallel with the OS base<br>

function in parallel with "user activity". Higher numbers of cores seem<br>

to be primarily of interest to H[A,PC] users -- stacks of VMs or server<br>

daemons, large scale parallel numerical computation. </blockquote><div><br>Data mining is useful in both the commercial and scientific worlds and is very data-intensive, so I think this issue will be addressed, or at least someone (Sun, for example) will build more balanced processors for data-intensive applications, though at several times the price.<br>

 </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> In both of these<br>

general arenas increasing cores/processor/memory channel beyond a<br>

critical limit that I think we're already at simply ensures that a<br>

significant number of your cores will be idling as they wait for<br>

memory access at any given time...<br>

<br>

   rgb<div><div></div><div class="Wj3C7c"><br>

<br>

<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<br>

<a href="http://www.spectrum.ieee.org/nov08/6912" target="_blank">http://www.spectrum.ieee.org/nov08/6912</a><br>

<br>

Multicore Is Bad News For Supercomputers<br>

<br>

By Samuel K. Moore<br>

<br>

Image: Sandia<br>

<br>

Trouble Ahead: More cores per chip will slow some programs [red] unless<br>

there's a big boost in memory bandwidth [yellow]<br>

<br>

With no other way to improve the performance of processors further, chip<br>

makers have staked their future on putting more and more processor cores on<br>

the same chip. Engineers at Sandia National Laboratories, in New Mexico, have<br>

simulated future high-performance computers containing the 8-core, 16-core,<br>

and 32-core microprocessors that chip makers say are the future of the<br>

industry. The results are distressing. Because of limited memory bandwidth<br>

and memory-management schemes that are poorly suited to supercomputers, the<br>

performance of these machines would level off or even decline with more<br>

cores. The performance is especially bad for informatics<br>

applications—data-intensive programs that are increasingly crucial to the<br>

labs' national security function.<br>

<br>

High-performance computing has historically focused on solving differential<br>

equations describing physical systems, such as Earth's atmosphere or a<br>

hydrogen bomb's fission trigger. These systems lend themselves to being<br>

divided up into grids, so the physical system can, to a degree, be mapped to<br>

the physical location of processors or processor cores, thus minimizing<br>

delays in moving data.<br>

<br>

But an increasing number of important science and engineering problems—not to<br>

mention national security problems—are of a different sort. These fall under<br>

the general category of informatics and include calculating what happens to a<br>

transportation network during a natural disaster and searching for patterns<br>

that predict terrorist attacks or nuclear proliferation failures. These<br>

operations often require sifting through enormous databases of information.<br>

<br>

For informatics, more cores doesn't mean better performance [see red line in<br>

"Trouble Ahead"], according to Sandia's simulation. "After about 8 cores,<br>

there's no improvement," says James Peery, director of computation,<br>

computers, information, and mathematics at Sandia. "At 16 cores, it looks<br>

like 2." Over the past year, the Sandia team has discussed the results widely<br>

with chip makers, supercomputer designers, and users of high-performance<br>

computers. Unless computer architects find a solution, Peery and others<br>

expect that supercomputer programmers will either turn off the extra cores or<br>

use them for something ancillary to the main problem.<br>

<br>

At the heart of the trouble is the so-called memory wall—the growing<br>

disparity between how fast a CPU can operate on data and how fast it can get<br>

the data it needs. Although the number of cores per processor is increasing,<br>

the number of connections from the chip to the rest of the computer is not.<br>

So keeping all the cores fed with data is a problem. In informatics<br>

applications, the problem is worse, explains Richard C. Murphy, a senior<br>

member of the technical staff at Sandia, because there is no physical<br>

relationship between what a processor may be working on and where the next<br>

set of data it needs may reside. Instead of being in the cache of the core<br>

next door, the data may be on a DRAM chip in a rack 20 meters away and need<br>

to leave the chip, pass through one or more routers and optical fibers, and<br>

find its way onto the processor.<br>

<br>

In an effort to get things back on track, this year the U.S. Department of<br>

Energy formed the Institute for Advanced Architectures and Algorithms.<br>

Located at Sandia and at Oak Ridge National Laboratory, in Tennessee, the<br>

institute's work will be to figure out what high-performance computer<br>

architectures will be needed five to 10 years from now and help steer the<br>

industry in that direction.<br>

<br>

"The key to solving this bottleneck is tighter, and maybe smarter,<br>

integration of memory and processors," says Peery. For its part, Sandia is<br>

exploring the impact of stacking memory chips atop processors to improve<br>

memory bandwidth.<br>

<br>

The results, in simulation at least, are promising [see yellow line in<br>

"Trouble Ahead"].<br>

<br>

_______________________________________________<br>

Beowulf mailing list, <a href="mailto:Beowulf@beowulf.org" target="_blank">Beowulf@beowulf.org</a><br>

To change your subscription (digest mode or unsubscribe) visit <a href="http://www.beowulf.org/mailman/listinfo/beowulf" target="_blank">http://www.beowulf.org/mailman/listinfo/beowulf</a><br>

<br>

</blockquote>

<br></div></div><font color="#888888">

Robert G. Brown                        <a href="http://www.phy.duke.edu/%7Ergb/" target="_blank">http://www.phy.duke.edu/~rgb/</a><br>

Duke University Dept. of Physics, Box 90305<br>

Durham, N.C. 27708-0305<br>

Phone: 1-919-660-2567  Fax: 919-660-2525     email: <a href="mailto:rgb@phy.duke.edu" target="_blank">rgb@phy.duke.edu</a><br>

<br>

</font><br>

<br></blockquote></div><br>