[Beowulf] Intel Quad-Core or AMD Opteron

Li, Bo libo at buaa.edu.cn
Fri Aug 24 07:57:31 PDT 2007

Intel will soon have CSI and an on-die memory controller, following what AMD has done for a few years.
HT or CSI will help us build machines based on NUMA or similar architectures. 
With current memory technologies, I can't find any way around the "memory wall", and a 4-core processor can consume all the available memory bandwidth in some cases. With NUMA we can get a machine that works like several current machines connected by a fast on-board interconnect. Imagine a supercomputer on the desktop. What's next?
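
To make the bandwidth argument concrete, here is a rough back-of-the-envelope sketch. The figures are hypothetical placeholders, not measurements of any real chip, and the function names are only for illustration. It assumes every core streams from memory at once and that NUMA accesses stay local to each socket, which is the best case.

```python
def per_core_bandwidth(total_gb_s, cores):
    """Bandwidth each core sees when all cores stream through ONE shared memory controller."""
    if cores < 1:
        raise ValueError("cores must be >= 1")
    return total_gb_s / cores


def numa_aggregate_bandwidth(per_socket_gb_s, sockets):
    """Best-case total bandwidth when each socket has its own controller and accesses stay local."""
    if sockets < 1:
        raise ValueError("sockets must be >= 1")
    return per_socket_gb_s * sockets


# Hypothetical numbers, chosen only to show the shape of the argument.
SHARED_CONTROLLER_GB_S = 8.0   # assumed bandwidth of a single shared controller
PER_SOCKET_GB_S = 8.0          # assumed bandwidth of each on-die controller

if __name__ == "__main__":
    # Four cores behind one controller split the same bandwidth.
    print("shared, per core:", per_core_bandwidth(SHARED_CONTROLLER_GB_S, 4), "GB/s")
    # Four sockets, each with a local controller, add their bandwidth together.
    print("NUMA, aggregate :", numa_aggregate_bandwidth(PER_SOCKET_GB_S, 4), "GB/s")
```

Remote accesses across the interconnect would lower the NUMA figure, so the second number is an upper bound under these assumptions.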
Many-core processors are coming. How do we power a Beowulf with them? I think it is a very interesting topic.
Power 6 is a really strange processor to me. It uses an in-order architecture. I am looking forward to seeing a detailed evaluation of it.
Li, Bo
----- Original Message ----- 
From: "Vincent Diepeveen" <diep at xs4all.nl>
To: "Toon Knapen" <toon.knapen at fft.be>
Cc: <beowulf at beowulf.org>; "Robert G. Brown" <rgb at phy.duke.edu>
Sent: Friday, August 24, 2007 9:37 PM
Subject: Re: [Beowulf] Intel Quad-Core or AMD Opteron

> Even worse,
> doesn't intel's SSE2 code in the intel primitives by default have an 'if
> then else' so that on opteron it runs without using SIMD?
> But apart from that, SIMD on the old K8 is very slow compared to core2,
> though not by a factor of 2. For well-optimized code, Barcelona should
> have a SIMD IPC up to 40+% higher than core2, I guess.
> So the only 2 questions are when they release it and especially at *what*
> price the 4 socket mainboards will sell.
> A 16 core barcelona machine with 4 DDR2 memory controllers might be a very
> mighty system for all kinds of applications that need shared memory to
> scale well.
> If AMD releases the Barcelona core within a few months from now, it will
> have a huge lead over intel with respect to 4 core cpu's, as it seems to me.
> Personally, I feel intel's choice of a CPU design with tiny L1 caches
> is a catastrophic one from a performance viewpoint. If there is just ONE
> competitor that manages to clock a cpu at nearly the same clock as
> intel, with the same number of cores, then intel usually gets totally
> outperformed. Now that intel & AMD produce their cpu's on the same type
> of machines, it seems to me that AMD will in general outperform intel.
> Comparing the 2006 core2 with a 2003 release is not a very fair
> comparison IMHO.
> We can definitely conclude that intel managed to produce their new
> generation cpu (core2) more than 1 year sooner than AMD did, using a
> simple trick, namely gluing 2 dual core chips together.
> In the meantime I keep wondering more and more about intel not having an
> equivalent on the market for AMD's hypertransport.
> For the high end, when buying multiple socket nodes, it is hard to see
> intel as an alternative to barcelona core driven machines, as it doesn't
> have any form of load balancing, because it has just 1 memory controller
> for all cores.
> Most interesting for scientists might be buying a few nodes with some
> double rail network, each node consisting of a 4 socket AMD quad-core
> machine. Initially perhaps at 2 GHz. Then at the end of 2008 you can
> upgrade the cpu's to 3+ GHz.
> When you also put a lot of RAM into such an AMD machine, then
> such a node of course also totally annihilates power6, even before power6
> gets taken into production, at a fraction of the price of a power6
> node.
> The advantage of using 4 socket machines for a cluster/supercomputer is
> obviously that the network costs form a smaller part of the total
> solution, while keeping the total number of nodes limited.
> For a few nodes you could arguably use 8 socket solutions, not to scale up
> to more cores, as most software can't handle such bad memory latencies,
> but because you might even outgun power6 in terms of total memory per
> node.
> What is the amount of RAM that power6 supports versus the 8 socket AMD
> solutions?
> Best Regards,
> Vincent
> On Fri, 24 Aug 2007, Toon Knapen wrote:
>> > I understand that, when comparing Quad-Core Xeons with Opterons,
>> > people focus on the scalability issues of the different multi core
>> > architectures, but we've run some benchmarks on both and the thing
>> > that at the time surprised me the most was that if your application
>> > makes much use of the functions provided by Intel Math Kernel Library,
>> > a single Xeon core (e.g. Clovertown) can be up to twice as fast as a
>> > single Opteron core.
>> You are comparing Intel MKL on Xeon with what exactly on Opteron? Intel
>> MKL on Opteron is certainly not optimal. I hope you compared to GotoBLAS
>> on Opteron.
>> t
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org
>> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
