[Beowulf] cpu's versus gpu's - was Intel buys QLogic InfiniBand business

Vincent Diepeveen diep at xs4all.nl
Mon Jan 23 17:44:10 PST 2012


In hardware, CPUs cannot beat manycore performance at the same cost;
CPUs have an exponential cost structure, for example to maintain
cache coherency.

This has many implications, for example on size and scale.
If you produce a 1000 mm^2 CPU, it is extremely expensive, with really
low yields, whereas a 1000 mm^2 manycore is not a problem at all: cores
that do not work you can just turn off. There is no coherency.

So if you produce bigger CPUs, the price per square millimeter goes up;
with manycores it scales nearly linearly.
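
The yield argument above can be made concrete with a small sketch. It
assumes the simple Poisson yield model (yield = exp(-area * defect
density)), a made-up defect density of 0.002 defects per mm^2 and a
made-up core size of 10 mm^2. None of these numbers come from a real fab,
the function names are only illustrative, and defects in shared logic
outside the cores are ignored.

```python
import math

# Assumed defect density, in defects per mm^2 (illustrative, not real fab data).
DEFECTS_PER_MM2 = 0.002
# Assumed area of one small core on the manycore die, in mm^2 (illustrative).
CORE_AREA_MM2 = 10.0


def poisson_yield(area_mm2, defects_per_mm2=DEFECTS_PER_MM2):
    """Probability that a region of the given area contains no defect."""
    return math.exp(-area_mm2 * defects_per_mm2)


def monolithic_cost_factor(die_area_mm2):
    """Silicon paid per usable mm^2 when a single defect kills the whole die."""
    return 1.0 / poisson_yield(die_area_mm2)


def manycore_cost_factor(core_area_mm2=CORE_AREA_MM2):
    """Silicon paid per usable mm^2 when a defect only disables one core."""
    return 1.0 / poisson_yield(core_area_mm2)


if __name__ == "__main__":
    for area in (100, 250, 500, 1000):
        print(f"{area:5d} mm^2  monolithic x{monolithic_cost_factor(area):.2f}"
              f"  manycore x{manycore_cost_factor():.2f}")
```

Under these assumptions the monolithic factor grows exponentially with
die area (about 7.4 at 1000 mm^2), while the manycore factor stays near
1.02 whatever the die size, so total cost grows roughly linearly with area.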

If I remember correctly, in 2007 an NCSA director had already put the
implication of this reality in his slides, assuming that by 2010 NCSA
would build supercomputers exclusively using manycores.

Note that manycores are not ideal for chess. They can, however, be used
for the majority of the system time that gets burned in HPC, as the
majority of HPC needs throughput rather than low latency.

Comparing Blue Gene machines with GPUs makes perfect sense of course,
as the latency on them is also total crap.

I see the Blue Gene system as a genius move from IBM, starting an
evolution, moving away from huge, expensive CPUs of which you produce
just a handful, in a totally outdated process technology, with extremely
bad yields and with a million dollars of startup costs, which at today's
factories would by now be approaching 20 million dollars of startup costs
just to print a single batch of processors.

IBM, developing POWER8, will have a serious problem with newer-generation
factories. Every batch they print, every mistake it has, DANG, 20 million
dollars gone.

This concept of using simple CPUs, not yet that massively produced, has
obviously now evolved into the GPU, which is one totally mass-produced
cheap chip that integrates all those tiny cores into one processor, which
is way cheaper.
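
As a rough illustration of why volume matters here, the sketch below
spreads the 20 million dollar startup figure mentioned above over the
number of chips sold. The two volumes (10,000 and 10,000,000 chips) are
hypothetical, chosen only to contrast a niche processor with a
mass-market one, and per-chip manufacturing cost is left out entirely.

```python
# Startup cost per batch, taken from the 20 million dollar figure in the
# text above.
STARTUP_COST_USD = 20_000_000


def startup_cost_per_chip(chips_sold, startup_cost_usd=STARTUP_COST_USD):
    """Fixed startup cost carried by each chip at a given sales volume."""
    if chips_sold <= 0:
        raise ValueError("chips_sold must be positive")
    return startup_cost_usd / chips_sold


if __name__ == "__main__":
    # Hypothetical volumes: a niche HPC processor versus a consumer GPU.
    for label, volume in (("niche CPU", 10_000), ("mass-market GPU", 10_000_000)):
        print(f"{label:16s} {volume:>10,d} chips -> "
              f"${startup_cost_per_chip(volume):,.2f} startup cost per chip")
```

With these assumed volumes the fixed cost is 2000 dollars per chip for
the niche part and 2 dollars per chip for the mass-market part.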

What's the price of a Blue Gene system per teraflop?

It's 500 euro for a 1 teraflop double precision Radeon HD7970...



On Jan 24, 2012, at 2:19 AM, Ellis H. Wilson III wrote:

> On 01/23/2012 07:40 PM, Vincent Diepeveen wrote:
>>>> On Jan 24, 2012, at 12:02 AM, Joshua mora acosta wrote:
>>>> Nanosecond latency of QPI using 2 rings versus something that has a
>>>> latency up to factor 1000 slower
>>>> with the pci-e as the slowest delaying factor.
>>>>
>>>> Doing cache coherency over that forget it.
>>>
>>> Hear that Shai F?  Stop work on vSMP now, cause Vincent says it can't
>>> work!!!
>>>
>>> More seriously, with this acquisition, I could see serious contention
>>> for ScaleMP.  SoC type stuff, using IB between many nodes, in
>>> smaller boxen.
>>
>> That would be some BlueGene type machine you speak about that intel
>> would produce with a low power SoC.
>>
>> This is where, at this point, the bluegene type machines simply can't
>> compete with the tiny processors
>> that get produced by the dozens of millions.
>
> For...chess?  ;D
>
>> "The tiny processors have won"
>>      Linus Thorvalds
>
> *Torvalds, and if Linux (or any well-supported kernel/OS for that
> matter) currently had data structures designed for extremely high
> parallelism on a single MoBo (i.e. 100s to 10,000s of cores) then I
> would agree with this statement.  As I currently see it, all we can
> really say is that someday, probably, perhaps even hopefully:
>
> "The tiny processors will win."
>
> That's after we work out all the nasty nuances involved with designing
> new data structures for OSes that can handle that number of cores, and
> probably design new applications that can use these new OS features.
> And no, GPU support in Linux doesn't count as this already having been
> done.  We just farm out very specific code to run on those things.  If
> somebody has an example of a full-blown, usable OS running on a GPU
> ALONE, I would stand (very interestingly) corrected.
>
>> Intel has themselves a second law of Moore. You can google for it.
>
> Thanks, for a moment there, I almost used AskJeeves.
>
>> A good example of massproduced processors are gpu's.
>
> Was waiting for the hook.  Inevitable really.  I think if we were
> discussing the efficacy and quality of resultant bread from various
> bread machines versus the numerous methods for making bread by hand
> somehow, someway, a GPU would make better bread.  Might be a wholesome
> cyber-loaf of artisan wheat, but nonetheless, it would be better in
> every way.
>
> Best,
>
> ellis


