[Beowulf] Benchmark between Dell Poweredge 1950 And 1435
bill at cse.ucdavis.edu
Mon Mar 12 13:40:42 PDT 2007
Renato S. Silva wrote:
> What is the best compiler for AMD processors ?
For which language (Java, C, C++, F90, or something else)?
Using which flags? Just -O? Flags can make a big difference. Do you spend
time searching the optimization flag space? With or without profiling?
For which application?
I'm sure I could find a counterexample for any compiler. I've seen plenty
of data points strongly in one direction or another, but nothing particularly
recent. At one point the Portland Group compiler was substantially behind, but
I suspect things have improved since then. I've definitely used codes that
recommend Portland's compiler because it's known to be correct for that code.
I'd try to look at benchmark results in your application area if available.
SPEC publishes numbers that include the compiler version and flags, which can
provide a data point, but the real-world relevance is likely to be low (unless
you run SPEC all day long).
I'm quite happy with the PathScale compiler, but I've not heard anything
bad about the competition either. Sun wins some benchmarks, as does Intel
(even on AMD), and I suspect the rest of the competition does as well.
I've done a fair amount of testing with memory systems and I've found
PathScale to be pretty impressive on large static arrays: with or without
OpenMP, STREAM-like benchmarks scale really well (to 2 or 4 sockets).
I was quite depressed to see around half the performance when I used dynamic
arrays (allocated with C++ new or malloc). Then I tested g++, gcc, Intel's
compiler, and I believe at least one other, and none of them did any better,
so I can't really complain.
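
For anyone who wants to try the same comparison, here is a minimal sketch of
what I mean by a STREAM-like test on static versus dynamically allocated
arrays. It is not the actual benchmark code: the array size, repeat count, and
all names (TRIAD_N, TRIAD_REPS, static_triad_gbs, dynamic_triad_gbs, to_gbs)
are assumptions made up for illustration, and it is single-threaded, so it
uses clock() for timing.

```c
/* Sketch only: STREAM-like triad on static vs. malloc'ed arrays.
 * TRIAD_N, TRIAD_REPS and all function names are illustrative assumptions. */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define TRIAD_N    (1u << 21)  /* 2M doubles = 16 MiB per array (assumed size;
                                  raise it if your caches are larger) */
#define TRIAD_REPS 10

/* Static arrays: size and address are fixed at compile/link time. */
static double sa[TRIAD_N], sb[TRIAD_N], sc[TRIAD_N];

/* Convert the best time for one pass into GB/s. One triad pass is counted
 * as 3 arrays of traffic (2 reads + 1 write), as STREAM does. */
double to_gbs(size_t n, double best_sec)
{
    if (best_sec <= 0.0)
        return -1.0;                     /* pass was too fast to time */
    return 3.0 * sizeof(double) * (double)n / best_sec / 1e9;
}

/* Triad written directly against the static arrays. */
double static_triad_gbs(int reps)
{
    assert(reps > 0);
    for (size_t i = 0; i < TRIAD_N; i++) { sa[i] = 0.0; sb[i] = 1.0; sc[i] = 2.0; }

    double best = -1.0;
    for (int r = 0; r < reps; r++) {
        clock_t t0 = clock();
        for (size_t i = 0; i < TRIAD_N; i++)
            sa[i] = sb[i] + 3.0 * sc[i];
        double dt = (double)(clock() - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || dt < best) best = dt;
    }
    assert(sa[TRIAD_N - 1] == 7.0);      /* 1 + 3*2; also keeps the result live */
    return to_gbs(TRIAD_N, best);
}

/* Same loop, but through pointers returned by malloc. The time spent in
 * malloc and free is outside the timed region. */
double dynamic_triad_gbs(size_t n, int reps)
{
    assert(n > 0 && reps > 0);
    double *da = malloc(n * sizeof *da);
    double *db = malloc(n * sizeof *db);
    double *dc = malloc(n * sizeof *dc);
    if (!da || !db || !dc) { free(da); free(db); free(dc); return -1.0; }
    for (size_t i = 0; i < n; i++) { da[i] = 0.0; db[i] = 1.0; dc[i] = 2.0; }

    double best = -1.0;
    for (int r = 0; r < reps; r++) {
        clock_t t0 = clock();
        for (size_t i = 0; i < n; i++)
            da[i] = db[i] + 3.0 * dc[i];
        double dt = (double)(clock() - t0) / CLOCKS_PER_SEC;
        if (best < 0.0 || dt < best) best = dt;
    }
    assert(da[n - 1] == 7.0);
    double result = to_gbs(n, best);
    free(da); free(db); free(dc);
    return result;
}

#ifdef TRIAD_DEMO_MAIN
int main(void)
{
    printf("static : %.2f GB/s\n", static_triad_gbs(TRIAD_REPS));
    printf("dynamic: %.2f GB/s\n", dynamic_triad_gbs(TRIAD_N, TRIAD_REPS));
    return 0;
}
#endif
```

Build it with your compiler of choice and -DTRIAD_DEMO_MAIN, then compare the
two numbers across compilers and flags. The two loops are deliberately
duplicated rather than shared through one function taking pointers, so that
the compiler sees the static arrays directly in one case and only malloc'ed
pointers in the other.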
If anyone knows how to get great memory performance on dynamically allocated
arrays, please post, i.e. STREAM-like bandwidth of around two-thirds of peak.
This seems like a hidden performance penalty that would affect a large number
of codes (any that dynamically allocate arrays). At least, I never learned
that dynamically allocating arrays could cause a huge performance hit (I'm
not including the time to malloc or free).