[Beowulf] Nvidia's quantum leap in 28 nm
diep at xs4all.nl
Sun Mar 25 11:47:00 PDT 2012
It's been some 12 years since a genius visited me. His expertise
being the same as Einstein's, it's not much of a question what his
research topics were.
Though not deep into computer hardware, he told me that for massive
computing, getting just above the 1 GHz border would prove to be a
big barrier, as electrons basically move at around 1/3 of lightspeed,
which translates to 1.3 GHz in metals like aluminium. In copper, so
he said, that barrier might be a tad higher than in aluminium, yet
even then the power needed for such speeds would prove to be massive.
At that moment Intel's marketing department was shouting out loud
that their P4s would clock 10 GHz by 2010.
Well, the P4 never got there and we got into the megacore count game.
Now AMD needs 4 PEs to do double precision, so their core count
of 1536 actually wasn't more than the 5000 series with 1600. Their
new 7970 GPU with 2048 PEs has the double precision equivalent in
core count of 512 compute cores.
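To make that arithmetic explicit, here is a minimal sketch. It only assumes what is stated above, namely that 4 PEs are needed per double precision operation; the function name and the labels are made up for illustration.

```python
# Sketch of the "double precision equivalent core count" arithmetic.
# Assumption taken from the post: 4 PEs are needed per double precision op.
# The function name and the labels below are hypothetical.

def dp_equivalent_cores(processing_elements: int, pes_per_dp_op: int = 4) -> int:
    """Return how many double-precision 'cores' a PE count is worth."""
    return processing_elements // pes_per_dp_op

parts = [
    ("5000 series", 1600),
    ("previous generation", 1536),
    ("7970", 2048),
]

for label, pes in parts:
    print(f"{label}: {pes} PEs -> {dp_equivalent_cores(pes)} DP-equivalent cores")

# Going from 1536 to 2048 PEs is a factor 1.33, far away from a doubling.
print(f"7970 vs previous generation: factor {2048 / 1536:.2f}")
```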
Actually the 7970 mostly profits from a 100 MHz higher frequency,
with some overclocked cards boosting to 1 GHz, and it gets impressive
game scores. As for gpgpu, of course, moving from 1536 cores to 2048
is an interesting improvement, yet far away from a doubling. The 7970
is said to have around 4.31B transistors
(see http://www.anandtech.com/show/5261/amd-radeon-hd-7970-review )
Fermi, Nvidia's 40 nm GPU which currently gets used in HPC, has 3
billion transistors.
Here at home I have a few Tesla 2075s with 448 cores, each producing
a tad more than 0.5 Tflop, which was a big improvement over the
previous generation.
The Nvidia Fermi, on the other hand, in the form of the GTX 560
clocks 1.644 GHz and the 580 clocks 1.544 GHz.
For gpgpu this is on the risky side, as getting far over that 1 GHz
seems to be a problem. The Teslas are therefore clocked
NVIDIA KEPLER 2012
The new kid on the block from Nvidia is the Kepler. It's built in
28 nm process technology, just like AMD's 7970.
Now I'm not gonna redo a review for games; there are great sites for
Over here we are interested in the implications for beowulf systems
of course; I read that as HPC implications.
Let's look at the facts and then speculate what they mean for HPC:
I'm still trying to fully understand the differences, yet it seems as
if Nvidia clocked the cores back to 1 GHz. That should make it easier
to release a GPU for gpgpu as well. In the meantime the core count
went up to 1536.
The chip itself has 3.5 billion transistors, just 500M more than
Fermi, while the process is a factor 2.04 smaller (in area). That
means it will consume less juice, and a lot less juice.
Benchmarks at anandtech confirm this.
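For those wondering where the factor 2.04 comes from, here is a small sketch. It assumes ideal scaling, where area per transistor goes with the square of the feature size (40 nm to 28 nm); real processes only approximate this, and the function names are made up for illustration.

```python
# Sketch of the process shrink arithmetic between Fermi (40 nm) and
# Kepler (28 nm). Assumption: ideal scaling, area ~ (feature size)^2.
# Real processes only approximate this; the names here are hypothetical.

def area_scaling(old_nm: float, new_nm: float) -> float:
    """Ideal factor by which the area per transistor shrinks."""
    return (old_nm / new_nm) ** 2

def transistor_growth(old_billion: float, new_billion: float) -> float:
    """Factor by which the transistor count grew."""
    return new_billion / old_billion

shrink = area_scaling(40, 28)          # about 2.04
growth = transistor_growth(3.0, 3.5)   # about 1.17

print(f"area per transistor shrinks by a factor {shrink:.2f}")
print(f"transistor count grows by a factor {growth:.2f}")
# Under this idealized model the die area changes by growth / shrink.
print(f"idealized relative die area: {growth / shrink:.2f}")
```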
Now that's a MASSIVE quantum leap: basically a factor 3 in the number
of cores available to HPC.
In addition, the memory bus is 256 bits wide, versus 384 bits for
Fermi. This should make it easier to release 2 GPUs on a single card.
Whether Nvidia has those plans for gpgpu Teslas we can only
speculate about; as the chip eats less juice, it sure fits the power
envelope this time. So where the gamer kids can surely expect a
690 GPU, for HPC we of course cheer if Nvidia manages to
improve to 1.5 - 1.7 Tflop for their new GPU, with the option to move
to 3 - 3.4 Tflop double precision for a 2-GPU Tesla card.
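One way to arrive at a range like that, as a rough sketch: take the 0.5 Tflop of the 448-core Tesla and scale it either by the round factor 3 or by the plain core ratio 1536/448. This assumes throughput per core stays about the same, which is speculation, and the names below are made up for illustration.

```python
# Sketch of the speculative Tflop projection for a Kepler-based Tesla.
# Assumptions: 0.5 Tflop for the 448-core Fermi Tesla (from the post),
# 1536 cores for Kepler (from the post), and equal throughput per core,
# which is pure speculation. All names here are hypothetical.

FERMI_TESLA_CORES = 448
FERMI_TESLA_TFLOP = 0.5
KEPLER_CORES = 1536

def projected_tflop(base_tflop: float, scale: float, gpus: int = 1) -> float:
    """Scale a baseline Tflop figure by a core factor and a GPU count."""
    return base_tflop * scale * gpus

core_ratio = KEPLER_CORES / FERMI_TESLA_CORES   # about 3.43

low = projected_tflop(FERMI_TESLA_TFLOP, 3.0)           # round factor 3
high = projected_tflop(FERMI_TESLA_TFLOP, core_ratio)   # plain core ratio

print(f"single GPU: {low:.1f} - {high:.1f} Tflop")
print(f"dual GPU:   {projected_tflop(FERMI_TESLA_TFLOP, 3.0, gpus=2):.1f} - "
      f"{projected_tflop(FERMI_TESLA_TFLOP, core_ratio, gpus=2):.1f} Tflop")
```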
Note that some might argue that the 680 has less double precision
capability than the 580. However, for the Tesla this doesn't matter,
as what happens for gamer cards is that they disable some
transistors; so the Tesla GPU will be the exact same chip as the
just with the double precision enabled. The same thing was the case
with Fermi, so it's logical to expect that to happen with Kepler as
well.
Seems like Intel can also scrap their current Corner project, as they
have a new goal, namely 4 Tflop, rather than a 1 Tflop manycore :)
As for Nvidia, releasing a new chip that's a factor 3 the power of
your previous one for gpgpu sure is a big quantum leap!