[Beowulf] DARPA issues 20 MUSD grant to nVidia to go from 1 GFLOPS/Watt to 75 GFLOPS/Watt

Mon Dec 17 17:30:03 PST 2012

On Dec 17, 2012, at 11:23 PM, Mark Hahn wrote:

>> "todays 1 gflop/watt" ?
>
> press releases always put the new shiny thing in the best light.
> they're probably thinking of a conventional compute node,
> (say, 32 cores, 2.3 GHz, 4 flops/cycle, or 16c and 8 f/c -
> either way totalling 294 Gflops for 300W or less.)

For a fair comparison you have to add motherboard power losses, since
of course those are all included in the figures for the GPU cards.

As for the Gflops it delivers, let's do a more realistic calculation.
AVX does have multiply-add, yet I doubt you can issue, on average,
another multiply-add every clock in a sustained manner on Sandy Bridge,
if we compare it with Nehalem.

Note that CPUs tend to have just one execution unit that can issue
multiplications, and historically they have always had big problems
issuing another one every clock. That is another reason why the
manycores beat the CPUs by such a margin: in the end it doesn't matter
whether you do matrix multiplications or run FFTs for prime numbers,
it's about the multiplication speed the chip can deliver, as that
determines how fast your code can run on that chip.

http://ark.intel.com/products/64596/Intel-Xeon-Processor-E5-2690-20M-Cache-2_90-GHz-8_00-GTs-Intel-QPI

That's the fastest I could find. It's a 2.9 GHz CPU.

So in terms of Gflops the CPU delivers:

2.9 GHz * 1 multiplication per clock * 4 doubles per vector * 8 cores =
92.8 Gflops
That is for a $2057 tray price at introduction.
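
A quick sketch of this arithmetic in Python. The function name and
parameter names are only for illustration, and it uses the assumption
made above (one vector multiplication per clock per core), not any
vendor peak figure.

```python
def peak_gflops(clock_ghz, vector_ops_per_clock, doubles_per_vector, cores):
    """Estimated Gflops, assuming each core issues `vector_ops_per_clock`
    vector operations per clock on `doubles_per_vector` doubles each."""
    return clock_ghz * vector_ops_per_clock * doubles_per_vector * cores

# Xeon E5-2690 numbers used in this post:
# 2.9 GHz, 1 multiplication per clock, 4 doubles per AVX vector, 8 cores.
xeon_gflops = peak_gflops(2.9, 1, 4, 8)
print(f"{xeon_gflops:.1f} Gflops")
```

Changing vector_ops_per_clock is the easy way to see how much the
result depends on that one assumption.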

So I wonder where you got that 294 Gflops from.

Now in terms of Gflops/watt that's 92.8 / 135 W TDP = 0.69 Gflops/watt
for the $2k Xeon.

About one order of magnitude less than the K20.
That's why Intel created the Xeon Phi, of course.
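
The comparison can be checked with a few lines of Python. This is only
a sketch: the helper name is made up, the Xeon number is the estimate
from this post, and the GPU numbers are the K20X figures quoted in this
thread (nearly 1.4 Tflop, 235 W TDP), not measurements.

```python
def gflops_per_watt(gflops, tdp_watts):
    """Efficiency from an estimated Gflops figure and the TDP in watts."""
    return gflops / tdp_watts

xeon = gflops_per_watt(92.8, 135)     # estimate from this post, 135 W TDP
k20x = gflops_per_watt(1400.0, 235)   # figures quoted in this thread

print(f"Xeon E5-2690: {xeon:.2f} Gflops/W")
print(f"K20X:         {k20x:.2f} Gflops/W")
print(f"ratio:        {k20x / xeon:.1f}x")
```

Both inputs are TDP-based, so the ratio says nothing about power drawn
by the rest of the node.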

>
>> The K20X delivers 1.4 Tflop nearly.
>> If i google it's 235 watt TDP.
>>
>> 1.4 Tflop / 235 =  6 gflops/watt
>
> debatable whether we can honestly claim that's shipping.
> K10 is .78 Gflops DP/W or 17.2 SP.  I wonder of the 75 goal
> is merely a 4.4x improvement....
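
The improvement factors implied by the K10 figures quoted above can be
worked out as follows. A minimal sketch: the variable names are mine,
and the 75 Gflops/W goal is taken from the subject line.

```python
GOAL_GFLOPS_PER_WATT = 75.0   # target from the subject line

k10_dp = 0.78   # K10 double precision, Gflops/W, as quoted above
k10_sp = 17.2   # K10 single precision, Gflops/W, as quoted above

sp_factor = GOAL_GFLOPS_PER_WATT / k10_sp
dp_factor = GOAL_GFLOPS_PER_WATT / k10_dp

print(f"needed over K10 single precision: {sp_factor:.1f}x")
print(f"needed over K10 double precision: {dp_factor:.1f}x")
```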

" The PERFECT program
will leverage anticipated industry fabrication geometry advances to
7 nm."

In theory, 7 nm gives a factor (28/7)^2 = 16 boost over 28 nm. So what
I derive from the article is that the goal refers to double precision.
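
A small sketch of the scaling arithmetic, assuming the factor 16 means
ideal area scaling, (old/new)^2. That is my reading of "in theory";
real processes deliver less. The function name is hypothetical.

```python
def ideal_area_scaling(old_nm, new_nm):
    """Idealized gain from a linear shrink: (old / new) squared."""
    return (old_nm / new_nm) ** 2

shrink = ideal_area_scaling(28, 7)   # 28 nm -> 7 nm
goal = 75.0                          # Gflops/W target from the subject line
baseline_needed = goal / shrink      # what 28 nm hardware would have to deliver

print(f"ideal shrink factor: {shrink:.0f}x")
print(f"needed at 28 nm:     {baseline_needed:.2f} Gflops/W")
```

Under that assumption the shrink alone would leave a requirement of a
few Gflops/W at 28 nm, which is in the range of the double precision
figures discussed in this thread rather than the single precision ones.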

> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin  
> Computing
> To change your subscription (digest mode or unsubscribe) visit  
> http://www.beowulf.org/mailman/listinfo/beowulf