[Beowulf] Athlon64 / Opteron test

Lombard, David N david.n.lombard at intel.com
Fri May 14 13:28:49 PDT 2004


From: Joe Landman; Friday, May 14, 2004 10:33 AM
[deletia]
>   I see you are using an older (ancient) kernel for the e325.  I wonder
> if you have all of your memory physically attached to one versus two
> processors.  Lots of folks see slowdowns of some sort when they don't
> set up banking.

While I don't know about these specific results, I was at MSC.Software when Opteron benchmarks were being performed on AMD-provided systems.  MSC, AMD, and IBM all spent a lot of time and effort on the benchmarks, with AMD providing hardware, OS, and support to ensure the results were as good as possible.  The kernel was specifically NUMA-capable, as provided by AMD, with memory arranged per AMD's specifications, after much experimentation by all involved to ensure the best result.  I can only assume that IBM and AMD worked equally well on the benchmarks presented on the page.

>   Measurements we have done on non-engineering codes using the GCC,
> PGI, and other compilers have shown that Xeons and Opterons are
> generally similar for 32-bit codes, to within a few percent.  When you
> recompile with the -m64 option you get some 5-30% advantage.

Over the many years I worked at MSC, specifically concerned with Nastran performance, I saw many disconnects between performance on non-CAE applications and Nastran.  Through the combined efforts of MSC.Software's internal experts, like Joe Griffin, and vendor optimization experts, MSC.Nastran usually did a fairly good job of achieving maximal performance -- that was certainly my personal goal.  For the record, one particular platform actually achieved about 98% of peak theoretical performance on a dense vector-matrix operation.  Note that I wrote *usually*: MSC.Nastran demands a lot from a computer.  It could also highlight weaknesses, especially those related to "machine balance".  One relatively unknown, but quite notorious, example from the '90s showed an order-of-magnitude difference between CPU and elapsed time, due to a customer's specific benchmarking requirements and an OEM's poor I/O performance.

>   As you said, YMMV.  It's also quite easy to misconfigure these
> machines, and I have seen this in a number of benchmarks.  The memory
> configuration can (drastically) impact performance.  I might also
> suggest updating to a modern kernel, one with at least some NUMA-aware
> functionality.  2.4.18/2.4.19 was used in the RedHat GinGin64 series.
> This was not a good OS platform for benchmarking.

As mentioned above, AMD, IBM, and MSC worked long and hard on the benchmarks, all parties sharing the goal of the best possible showing.  While I'm in no position to guarantee that such naïve problems did not occur, especially as I am no longer at MSC.Software, too many experts were too interested in the results for that to be likely.

-- 
David N. Lombard
 
My comments represent my opinions, not those of Intel Corporation.
