[Beowulf] core diameter is not really a limit

Mon Jun 17 14:49:25 PDT 2013

On Mon, 17 Jun 2013, Eugen Leitl wrote:
<interesting dreams of nanocomputers elided - Charles Stross's novels are 
entertaining extrapolations based on this kind of thing...>

>> everywhere.  OTOH, the idea of putting processors into memory has always
>> made a lot of sense to me, though it certainly changes the programming
>> model.  (even in OO, functional models, there is a "self" in the program...)
>
> Memory has been creeping into the CPU for some time. Parallella e.g.
> has embedded memory in the DSP cores on-die.

well, they have a tiny bit of memory per core - essentially software-managed,
globally-addressed cache-per-core.  it shouldn't make you sit up and go
"Hmm!".  I think it's more interesting to ponder the fact that there have 
always been some small experiments with putting (highly data-parallel)
processing onto the dram chip itself.  I mean, dram is fundamental: chips 
will be planar for a long time, therefore density demands a 2D storage array.
so opening a row reads out a few Kb at once.  why not perform some data-parallel
operations row-wise, on the dram chip itself?  you've got the row there anyway.
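
to make the row-wise idea concrete, here is a toy python sketch.  everything
in it is an assumption for illustration: the 8 Kb row, the 64-bit transfer
width, the ToyDram class and its popcount_in_dram operation are made up and
don't describe any real dram part.  it only counts how many bits would cross
the chip boundary in each case.

```python
# Toy model only: ROW_BITS, WORD_BITS and ToyDram are hypothetical
# illustrations, not a description of any real dram part.
ROW_BITS = 8192    # assumed row size ("a few Kb")
WORD_BITS = 64     # assumed size of one small result sent to the cpu


class ToyDram:
    def __init__(self, rows):
        self.rows = [0] * rows   # each row is one integer holding ROW_BITS bits
        self.bus_bits = 0        # bits that have crossed the chip boundary

    def read_row_over_bus(self, r):
        # conventional path: the whole row is shipped out to the cpu
        self.bus_bits += ROW_BITS
        return self.rows[r]

    def popcount_in_dram(self, r):
        # hypothetical in-dram operation: the row is already open on the chip,
        # so only the small result has to cross the bus
        self.bus_bits += WORD_BITS
        return bin(self.rows[r]).count("1")


def demo():
    d = ToyDram(4)
    d.rows[2] = (1 << 100) | 0xFF           # 9 bits set

    on_cpu = bin(d.read_row_over_bus(2)).count("1")
    cpu_traffic = d.bus_bits

    d.bus_bits = 0
    in_dram = d.popcount_in_dram(2)
    pim_traffic = d.bus_bits

    return on_cpu, in_dram, cpu_traffic, pim_traffic
```

both paths give the same answer; in this toy accounting the in-dram path
moves 64 bits instead of 8192.  real parts would have their own constraints,
so treat the ratio as a property of the assumed numbers, nothing more.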

> Hybrid memory cube is
> about putting memory on top of your CPU.

this is just a slight power optimization: drive shorter wires.
I'm looking forward to 2.5D integration, but it's evolutionary...

> is mixing memory/CPU, even though that is currently problematic in
> the current fabrication processes.

I'm not sure how much blame can be attributed to the nature of processes
specialized to cpu vs dram.  at one time this was obvious: cpus on a fast but
high-leakage process being almost the perfect opposite of low-leakage dram.

but leakage has been a cpu issue for a long time now.  there even appears
to be some interesting convergence, with 3d/finfet transistor tech being 
used for dram arrays.  my guess is that preferences for, say, doping levels
or oxide thickness do *not* form permanently conflicting fab constraints.

> The next step is something like
> a cellular FPGA,

yeah, no.  I don't actually think things will go in that direction, at least 
not for a long time, mainstream-wise.  but will we see systems that look like 
big grids of dimm-like pieces?  yes: processor-in-memory, not merely memory
organs supporting a distant, separate processor "brain"...

in some sense, the real question is how much of your system state is active
at any time.  computers are traditionally based on the assumption that most 
data is passively stored most of the time, and that we occasionally take out
some bits, mutate them, possibly store new versions.  Eugen is talking about
more of a stream-processing model, where there is limited passive state - 
i.e., other than the state interlock between pipeline/cellular stages.  I think
we'll continue to have lots of passive, non-dynamic state, so our
architectures will still be based on random access to big arrays.
(dram, disk, flash, whatever.)

this also seems to be anthropomorphically comfortable...
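
the passive-vs-streaming distinction above can be sketched in a few lines of
python.  the function names (random_access_update, stage, pipeline) are made
up for illustration; this is a toy contrast of the two programming models,
not a claim about how any hardware would implement either one.

```python
# Toy contrast of the two models; all names here are hypothetical.

def random_access_update(store, index, fn):
    # passive model: the big array just sits there; we take out one item,
    # mutate it, and store the new version back
    store[index] = fn(store[index])
    return store


def stage(fn, upstream):
    # streaming model: a stage holds only the item currently in flight
    for item in upstream:
        yield fn(item)


def pipeline(source, *fns):
    # chain the stages; the only state is whatever sits between them
    stream = iter(source)
    for fn in fns:
        stream = stage(fn, stream)
    return stream
```

for example, random_access_update(list(range(5)), 3, lambda x: x * 10)
touches one element and leaves the rest passive, while
list(pipeline(range(4), lambda x: x + 1, lambda x: x * 2)) pushes every item
through every stage and keeps nothing behind.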

regards, mark hahn.