[Beowulf] Intel Quad-Core or AMD Opteron

Li, Bo libo at buaa.edu.cn
Fri Aug 24 07:57:31 PDT 2007


Intel will soon have CSI and an on-die memory controller, following what AMD has done for a few years.
HT or CSI will help us build machines based on NUMA or similar architectures.
With current memory technologies, I can't find any way around the "memory wall", and a 4-core processor can eat all the memory bandwidth in some cases. With NUMA we can get a machine that works like several current machines connected by a fast on-board interconnect. Imagine a supercomputer on the desktop, and what's next?
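To make the bandwidth argument concrete, here is a minimal back-of-the-envelope sketch in Python. The function names and the figures (10 GB/s per memory controller, 4 sockets, 4 cores per socket) are hypothetical, chosen only for illustration and not measurements of any real system. It also assumes that in the NUMA case every core only touches memory local to its own socket, which real workloads rarely achieve.

```python
def shared_controller_per_core(controller_gb_s, sockets, cores_per_socket):
    """Bandwidth per core when ONE memory controller serves every core."""
    total_cores = sockets * cores_per_socket
    return controller_gb_s / total_cores


def numa_per_core(controller_gb_s, cores_per_socket):
    """Bandwidth per core when each socket has its own controller.

    Assumption: all accesses are local, so only the cores on one
    socket compete for that socket's controller.
    """
    return controller_gb_s / cores_per_socket


# Hypothetical figures, for illustration only.
CONTROLLER_GB_S = 10.0
SOCKETS = 4
CORES_PER_SOCKET = 4

shared = shared_controller_per_core(CONTROLLER_GB_S, SOCKETS, CORES_PER_SOCKET)
numa = numa_per_core(CONTROLLER_GB_S, CORES_PER_SOCKET)

print(f"shared controller: {shared} GB/s per core")
print(f"NUMA, local only:  {numa} GB/s per core")
```

Under these assumptions the gap grows linearly with the socket count, which is why adding cores behind a single controller runs into the memory wall sooner.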
Many-core processors are coming; how do we power Beowulf with them? I think it is a very interesting topic.
Power 6 is a really strange processor to me. It uses an in-order architecture. I am looking forward to seeing any detailed evaluation of it.
Regards,
Li, Bo
----- Original Message ----- 
From: "Vincent Diepeveen" <diep at xs4all.nl>
To: "Toon Knapen" <toon.knapen at fft.be>
Cc: <beowulf at beowulf.org>; "Robert G. Brown" <rgb at phy.duke.edu>
Sent: Friday, August 24, 2007 9:37 PM
Subject: Re: [Beowulf] Intel Quad-Core or AMD Opteron


> Even worse,
> 
> Doesn't Intel's SSE2 code in the Intel primitives by default have an 'if
> then else' so that on Opteron it runs without using SIMD?
> 
> But apart from that, SIMD on the old K8 is very slow compared to Core 2,
> though not by a factor of 2. For well-optimized code, Barcelona should, I
> guess, have a SIMD IPC up to 40+% higher than Core 2.
> 
> So the only 2 questions are when they release it and especially at *what*
> price for the 4-socket mainboards.
> 
> A 16-core Barcelona machine with 4 DDR2 memory controllers might be a very
> mighty system for all kinds of applications that need shared memory to
> scale well.
> 
> If the Barcelona core is released within a few months from now, AMD will
> have a huge lead over Intel with respect to 4-core CPUs, it seems to me.
> 
> Personally, I feel Intel's choice of a CPU design using tiny L1 caches is,
> from a performance viewpoint, a catastrophic one. If there is just ONE
> competitor for an Intel chip that manages to clock a CPU at nearly the
> same clock as Intel and with the same number of cores, then Intel
> usually gets totally outperformed. Now that Intel and AMD produce
> their CPUs on the same type of machines, it seems to me
> that AMD will in general outperform Intel.
> 
> Comparing the 2006 Core 2 with a 2003 release is not a very fair
> comparison IMHO.
> 
> We can definitely conclude that Intel managed to produce their new
> generation CPU (Core 2) more than 1 year sooner than AMD did, using a
> simple trick, namely gluing 2 dual-core chips together.
> 
> In the meantime I keep wondering more and more about Intel not having an
> equivalent on the market for AMD's HyperTransport.
> 
> For the high end, when buying multi-socket nodes, it is hard to see Intel
> as an alternative to Barcelona-driven machines, as it doesn't have any
> form of load balancing, thanks to having just 1 memory controller for all
> cores.
> 
> Most interesting for scientists might be buying a few nodes with some
> double-rail network, each node being a 4-socket quad-core AMD machine,
> initially at perhaps 2 GHz. Then at the end of 2008 you can
> upgrade the CPUs to 3+ GHz.
> 
> If you also put a lot of RAM into such an AMD machine, then
> such a node of course also totally annihilates Power6, even before Power6
> goes into production, at a fraction of the price of a Power6
> node.
> 
> The advantage of using 4-socket machines for a cluster/supercomputer is
> obviously that the network costs form a smaller part of the total
> solution, while keeping the total number of nodes limited.
> 
> For a few nodes you could arguably use 8-socket solutions, not to scale up
> to more cores, as most software can't handle such bad memory latencies,
> but because you might even outgun Power6 in terms of total memory per
> node.
> 
> What is the amount of RAM that Power6 supports versus the 8-socket AMD
> solutions?
> 
> Best Regards,
> Vincent
> 
> 
> 
> On Fri, 24 Aug 2007, Toon Knapen wrote:
> 
>> > I understand that, when comparing Quad-Core Xeons with Opterons,
>> > people focus on the scalability issues of the different multi-core
>> > architectures, but we've run some benchmarks on both and the thing
>> > that surprised me the most at the time was that if your application
>> > makes much use of the functions provided by Intel Math Kernel Library,
>> > a single Xeon core (e.g. Clovertown) can be up to twice as fast as a
>> > single Opteron core.
>>
>>
>> You are comparing Intel MKL on Xeon with what exactly on Opteron? Intel
>> MKL on Opteron is certainly not optimal. I hope you compared to GotoBLAS
>> on Opteron.
>>
>> t
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>>
> 