[Beowulf] quad-core SPECfp2006: where are 4 FP results/cycle?
kus at free.net
Fri Oct 12 11:36:24 PDT 2007
I found the first SPECfp2006 results for an AMD quad-core (Opteron
2347, 1.9 GHz) at www.spec.org, obtained by IBM: 11.2/10.7 for
peak/base. I'll talk about a single core only, i.e. about results
with Autoparallel=NO.
Let me look at the other x86-64 microarchitecture with the same four
64-bit FP results per cycle, i.e. Intel Core. At a close frequency
(1.86 GHz, Xeon 5120) we find close performance (10.9/10.7, for
example on the Bull SAS NovaScale R460).
Let me set aside for now the differences in cache sizes and memory
throughput between the AMD Barcelona and Intel Core
microarchitectures, and their influence on performance. Then I may
say that the "efficiency" of the two microarchitectures, in the sense
of SPECfp2006 performance per GHz of clock frequency, is close.
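As a quick check of this per-GHz comparison, here is a small Python
sketch. It uses only the scores and clock frequencies quoted above;
the function name per_ghz and the table layout are just illustrative
choices, not anything taken from SPEC.

```python
# SPECfp2006 scores and clocks quoted in this post:
# name -> (clock in GHz, peak score, base score)
scores = {
    "Opteron 2347 (Barcelona)": (1.9, 11.2, 10.7),
    "Xeon 5120 (Core)": (1.86, 10.9, 10.7),
}


def per_ghz(ghz, peak, base):
    """Return (peak, base) SPECfp2006 scores divided by the clock in GHz."""
    return peak / ghz, base / ghz


for name, (ghz, peak, base) in scores.items():
    p, b = per_ghz(ghz, peak, base)
    print(f"{name}: peak/GHz = {p:.2f}, base/GHz = {b:.2f}")
```

With these inputs both chips land between roughly 5.6 and 5.9 points
per GHz, which is what I mean by "close".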
But if I compare these with SPECfp2006 results for an x86-64
microarchitecture with two 64-bit FP results per cycle, i.e. the
previous Opteron generation, I see a strange (IMHO) result. For the
Opteron 2222SE at 3 GHz, AMD's SPECfp2006 values are 15.2/14.3. But
the Xeon 5160, with 4 FP results per cycle at the same 3.0 GHz, gives
very close values: 15.6/15.4!
This means that 2 additional FP results per cycle in the
microarchitecture give only about 8% more base performance (and under
3% more peak performance) :-(
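The arithmetic behind that comparison can be sketched in a few lines
of Python. The scores are the ones quoted above; the helper name
gain_percent is only an illustrative choice.

```python
def gain_percent(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# SPECfp2006 scores at 3.0 GHz, as quoted in this post.
opteron_2222se = {"peak": 15.2, "base": 14.3}  # 2 FP results per cycle
xeon_5160 = {"peak": 15.6, "base": 15.4}       # 4 FP results per cycle

for kind in ("peak", "base"):
    g = gain_percent(xeon_5160[kind], opteron_2222se[kind])
    print(f"{kind}: Xeon 5160 is {g:.1f}% ahead of Opteron 2222SE")
```

This gives about 2.6% for peak and about 7.7% for base, so the gain
is in single digits either way.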
The question is: should we wait for better results from upcoming
versions of the optimizing compilers? Or is this the reality, that 2
additional FP results per cycle give (on average) only a relatively
small performance increase?
Zelinsky Institute of Organic Chemistry