[Beowulf] quad-core SPECfp2006: where are 4 FP results/cycle?

Mikhail Kuzminsky kus at free.net
Fri Oct 12 11:36:24 PDT 2007

I found the first SPECfp2006 results for an AMD quad-core (Opteron 
2347, 1.9 GHz) at www.spec.org, submitted by IBM: 11.2/10.7 for 
peak/base. I'll discuss one core only, i.e. the results obtained 
w/Autoparallel=NO.

Let me look at the other x86-64 microarchitecture w/the same 4*64 bit 
FP results per cycle, i.e. Intel Core. At a close frequency (1.86 GHz, 
Xeon 5120) we find close performance (10.9/10.7 for the Bull SAS 
NovaScale R460, for example). 

Let me now set aside the differences in cache sizes and memory 
throughput between the AMD Barcelona and Intel Core 
microarchitectures, and their influence on performance. Then I may 
say that, in some sense, the "efficiency" (in the sense of 
performance, OK, SPECfp2006 performance, per 1 Hz) of the two 
microarchitectures is close.
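
As a rough sketch of that comparison, the scores quoted above can be divided by the clock rate. The dictionary layout and the helper name `score_per_ghz` are illustrative assumptions; the numbers are the ones given in this message, and cache and memory differences are ignored, as stated above.

```python
# SPECfp2006 scores and clock rates as quoted in this message.
# The structure and names here are illustrative, not from any SPEC tool.
results = {
    "Opteron 2347 (Barcelona)": {"ghz": 1.90, "peak": 11.2, "base": 10.7},
    "Xeon 5120 (Core)": {"ghz": 1.86, "peak": 10.9, "base": 10.7},
}


def score_per_ghz(score, ghz):
    """SPECfp2006 score normalized by clock frequency in GHz."""
    return score / ghz


for name, r in results.items():
    peak = score_per_ghz(r["peak"], r["ghz"])
    base = score_per_ghz(r["base"], r["ghz"])
    print(f"{name}: peak/GHz = {peak:.2f}, base/GHz = {base:.2f}")
```

Both chips land between roughly 5.6 and 5.9 points per GHz, which is the sense in which their per-clock efficiency is close.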

But if I compare these with SPECfp2006 results for an x86-64 
microarchitecture w/2*64 bit FP results per cycle, the previous 
Opteron generation, I see a strange (IMHO) result. For the Opteron 
2222SE/3 GHz, AMD's SPECfp2006 values are 15.2/14.3. But the Xeon 
5160, w/4 FP results per cycle at the same 3.0 GHz, gives very close 
values: 15.6/15.4! This means that 2 additional FP results per cycle 
in the microarchitecture give only about a 7% performance increase :-(
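
The arithmetic behind that figure can be sketched as follows. The helper name `relative_gain` is an illustrative assumption; the scores are those quoted above. The "about 7%" corresponds to the base values (closer to 7.7%), while the peak values differ by less than 3%.

```python
def relative_gain(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# Both at 3.0 GHz, scores as quoted in this message.
opteron_2222se = {"fp_per_cycle": 2, "peak": 15.2, "base": 14.3}
xeon_5160 = {"fp_per_cycle": 4, "peak": 15.6, "base": 15.4}

peak_gain = relative_gain(xeon_5160["peak"], opteron_2222se["peak"])
base_gain = relative_gain(xeon_5160["base"], opteron_2222se["base"])
# Increase in FP results per cycle on paper, for comparison.
width_gain = relative_gain(xeon_5160["fp_per_cycle"],
                           opteron_2222se["fp_per_cycle"])

print(f"peak gain:  {peak_gain:.1f}%")
print(f"base gain:  {base_gain:.1f}%")
print(f"FP results per cycle gain: {width_gain:.0f}%")
```

So a 100% increase in FP results per cycle shows up as a single-digit percentage in the measured scores.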

The question is: should we wait for better results from upcoming 
versions of optimizing compilers? Or is this the reality, that 2 
additional FP results per cycle give (on average) a relatively small 
performance increase?

Mikhail Kuzminsky
Zelinsky Institute of Organic Chemistry

