[Beowulf] The GPU power envelope (was difference between accelerators)
hahn at mcmaster.ca
Fri Mar 15 19:25:29 PDT 2013
> I think what you've got here is basically the idea that "things that are
> closer, consume less power and cost less because you don't have the
> "interface cost".
everyone loves wide, fast channels - they just don't want to pay
for the power to drive them. there seems to be no problem creating
quite wide channels on current chips, even mounting them on interposers
that have massive trace density, as long as you can avoid or minimize
long traces on the motherboard (and sockets and stubs).
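to put rough numbers on "the power to drive them": a minimal sketch using the
standard switching-power relation (activity * C * V^2 * f per wire). the
capacitance, voltage, width and toggle-rate values are assumptions picked for
illustration, not measurements of any real part, and the function names are
made up here.

```python
def wire_power_watts(capacitance_f, voltage_v, toggle_rate_hz, activity=0.5):
    """Dynamic power to charge and discharge one wire: activity * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * toggle_rate_hz


def channel_power_watts(width_bits, capacitance_f, voltage_v, toggle_rate_hz,
                        activity=0.5):
    """Total dynamic power for a parallel channel of width_bits identical wires."""
    return width_bits * wire_power_watts(
        capacitance_f, voltage_v, toggle_rate_hz, activity
    )


# Assumed, illustrative values only: a short on-package/interposer trace is
# taken to load the driver with far less capacitance than a long motherboard
# trace with sockets and stubs.
WIDTH_BITS = 1024
VOLTAGE_V = 1.0
TOGGLE_RATE_HZ = 1e9
SHORT_TRACE_F = 0.2e-12   # assumed capacitance per wire, short trace
LONG_TRACE_F = 5e-12      # assumed capacitance per wire, long trace

on_package = channel_power_watts(WIDTH_BITS, SHORT_TRACE_F, VOLTAGE_V, TOGGLE_RATE_HZ)
motherboard = channel_power_watts(WIDTH_BITS, LONG_TRACE_F, VOLTAGE_V, TOGGLE_RATE_HZ)

print(f"on-package channel:  {on_package:.4f} W")
print(f"motherboard channel: {motherboard:.4f} W")
print(f"ratio: {motherboard / on_package:.1f}x")
```

with these assumed values the ratio is just the ratio of the capacitances:
width, voltage and clock cancel out, which is why shortening the traces
matters more than narrowing the channel.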
> A FPU sitting on the bus with the integer ALU inside the chip has minimum
> overhead.. No going on and off chip and the associated level shifting, no
> charging and discharging of the transmission lines, etc.
perhaps more importantly, fab shrinks are providing too many transistors.
look at a die shot these days, and you see that the cores are tiny little
islands surrounded by a vast sea of non-core. in a sense, this is a major
failure by chip architects to come up with ways to use transistors, other
than fairly boring ways like "extra bonus cache" and "now with a free GPU!".
(not that gpu architects are all that clever either, since stamping out
one basic ALU-like unit a couple thousand times is the state of the art ;)
the point is two-fold: gpus have already come onto the processor chip,
and the industry is already mass-producing 2.5D and 3D stacking/interposer
arrangements. the conjunction is interesting for HPC, since it brings
huge bandwidth and "array computing" speed into the package.
> This is why people are VERY interested in on chip optical transmitters and
> receivers (e.g. Things like VCSELs and APDs). You could envision a
> processor with an array of transmitters and receivers to create point to
> point links to other processors that are within the field of view. Only
> one "change of media"
free-space interconnect would be very cool, especially because it would
make big computers structurally interesting again. imagine building a
system out of truncated cones that, when tiled into a sphere, leave an
empty inner space across which each cone talks to all the others.
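a back-of-envelope sketch of that geometry, under loud assumptions: every cone
gets an equal share of the sphere's 4*pi steradians (circular cones cannot
tile a sphere exactly, so this is only an equal-area approximation), and
"talks to all the others" means one point-to-point link per pair. the function
names are hypothetical.

```python
import math


def all_to_all_links(n_cones):
    """Number of distinct point-to-point pairs among n_cones nodes."""
    return n_cones * (n_cones - 1) // 2


def cone_half_angle_deg(n_cones):
    """Half-angle of a cone whose spherical cap covers 1/n_cones of the sphere.

    A cap of half-angle theta subtends 2*pi*(1 - cos(theta)) steradians;
    setting that equal to 4*pi/n_cones gives cos(theta) = 1 - 2/n_cones.
    """
    cos_theta = 1.0 - 2.0 / n_cones
    return math.degrees(math.acos(cos_theta))


for n in (2, 16, 64, 256):
    print(f"{n:4d} cones: {all_to_all_links(n):6d} links, "
          f"half-angle ~{cone_half_angle_deg(n):.1f} deg")
```

the pair count grows quadratically while each cone's angular share shrinks, so
the emitters and receivers facing the cavity would have to resolve ever finer
angles as the machine grows.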
I wonder how long it'll be before a common computer maintenance task is
to dust the interconnect cavity ;)