
[Beowulf] Teraflop chip hints at the future


Eugen Leitl eugen at leitl.org
Wed Feb 14 10:15:28 PST 2007


On Wed, Feb 14, 2007 at 09:51:21AM -0800, Jim Lux wrote:

> I'm not sure you could put any processor (except maybe something like 
> a microcontroller) into a DRAM design and keep the densities 
> up.  There are all sorts of things that might bite you.. aside from 

IBM has just announced at the ISSCC a 1-transistor eDRAM
substitute for the 6T-SRAM cell used in caches. (Others
demonstrated 1T-SRAM years ago; AMD has Z-RAM, Intel has
Floating Body Cells, T-RAM doesn't need a capacitor, etc.
Embedded RAM is reasonably common in network processors, IIRC.)

http://www.heise.de/newsticker/meldung/85295

It's 45 nm SOI (starting 2008), with 1.5 ns access (SRAM does 0.8..1 ns),
and is supposed to be far more dissipation-friendly. In theory, a
1T cell gives you six times the capacity of a 6T-SRAM cache in the
same area: at least 12 MBytes, and possibly up to 48 MBytes (the
dual-core Power6 has 8 MBytes of on-die cache).
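
A back-of-envelope sketch of that capacity arithmetic, in Python. It assumes cell area scales with transistor count, which ignores sense amps, refresh logic and other array overhead, so treat it as an upper bound. The function name is made up for illustration, and the 2 MByte starting point is only my guess at where a 12 MByte lower figure could come from; the 8 MByte figure is the Power6 cache mentioned above.

```python
# Rough estimate: capacity gained by swapping 6T-SRAM cells for 1T cells
# in the same die area. Assumption: area per bit is proportional to the
# number of transistors per cell (optimistic, ignores array overhead).

def edram_capacity_mbytes(sram_mbytes, sram_transistors=6, edram_transistors=1):
    """Return the capacity that fits in the area of an SRAM cache of sram_mbytes."""
    return sram_mbytes * sram_transistors / edram_transistors

# 2 MBytes is an assumed small cache; 8 MBytes is the Power6 figure.
for sram in (2, 8):
    print(f"{sram} MBytes SRAM -> {edram_capacity_mbytes(sram):.0f} MBytes eDRAM")
```

The real gain would be smaller, since a 1T cell plus its overhead is not exactly one sixth of a 6T cell.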

> thermal issues, I suspect that the number of mask layers, etc. is 
> fairly small for DRAM.  The actual materials on the chip (doping 
> levels, etc.) may not allow for a reasonably performing processor 
> with reasonable feature sizes and thermal properties.  Getting the 
> heat away from the junction is a big deal.
> 
> I think DRAMs are built with a maximum of 4 layers of interconnect 
> with vias, while processors have a lot more layers and a much more 
> sophisticated interconnect structure.

The processes above are compatible with CPU processes, so there's some
hope that the piggybacking in Terascale doesn't have to be forever.
 
> Each and every switch has some non-zero power associated with 
> changing state. Sure, the core swings smaller voltages and energies, 
> but a DRAM cell is a lot smaller than a flipflop or half-adder in the 
> CPU, and only one is changing at a time, as opposed to thousands.

On the horizon, there's MRAM, which can also do logic with a little
extension to each cell (a kind of nonvolatile FPGA). It's not
hugely fast, but it's static and very low power.
 
> A big advantage of integrating CPU and memory, though, is that you 
> don't have to "go offchip" which saves a huge amount in 
> drivers/receivers, etc.   Of course, this is why everyone is looking 

Yes, this is a major advantage. No pads either, apart from a few
high-speed serial links.

> to integrated photonics and/or real high speed serial 
> interconnects.  The I/O buffer might consume a hundred or thousand 
> times more power than the onchip logic driving it.  Trading some more 
> logic inside to serialize and deserialize, and do adaptive 
> equalization, in exchange for fewer "wires out of the chip" is a good deal.
> 
> Then, there's the speed of light problem.  Put two chips 10cm apart 

Increasing density through true 3D integration is a very good way
to reduce the average distance. Stacking computation modules
on a 3D lattice also minimizes dead space. Of course, with
current cooling you won't get more than a few tens of MW out of
a wastepaper-basket-sized volume before the cluster goes China syndrome.

> on a board, and the round trip time (say for address to get there and 
> data to get back) is going to be in the nanoseconds area, even if the 
> chip itself were infinitely fast.

The mammalian CNS has a signalling speed limit of about 120 m/s, yet
it can process pretty complex stimuli in a few tens of ms.
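
To put rough numbers on both cases, here is a small Python sketch. The 0.5 velocity factor for a board trace and the 10 cm neural path length are my assumptions for illustration, not measured values; the 10 cm chip spacing and the 120 m/s limit are the figures from the thread.

```python
# Propagation-delay estimates for a board trace and for a nerve fibre.

C = 299_792_458.0  # speed of light in vacuum, m/s

def round_trip_ns(distance_m, velocity_factor=0.5):
    """Round-trip time in ns over a trace of the given one-way length.

    velocity_factor is the signal speed as a fraction of c; 0.5 is an
    assumed ballpark for a PCB trace.
    """
    return 2 * distance_m / (C * velocity_factor) * 1e9

def one_way_ms(distance_m, speed_m_per_s):
    """One-way travel time in ms at a fixed signalling speed."""
    return distance_m / speed_m_per_s * 1e3

print(f"chip to chip, 10 cm: {round_trip_ns(0.10):.2f} ns round trip")
print(f"nerve, 10 cm at 120 m/s: {one_way_ms(0.10, 120.0):.2f} ms one way")
```

Under these assumptions the board round trip comes out a bit over a nanosecond, and a 10 cm neural hop takes just under a millisecond, so a few tens of ms leaves room for only a few tens of such hops in sequence.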

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820            http://www.ativel.com
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE


