[Beowulf] Again about NUMA (numactl and taskset)
diep at xs4all.nl
Tue Jun 24 14:28:46 PDT 2008
What would interest me is how you get your information on how
instructions pair and which instruction sequences are weak on a given
processor.
For example, the by now old AMD K8 had the property that an integer
multiplication blocks all other execution units during the first and
last cycle of its latency.
Even so, this still kicks butt compared to Intel's implementation
of integer multiplication
(proof of this statement: most of GMP's functions are dominated by
integer multiplication, and the AMD K8 already
murders Core 2 as a result). Details like this are totally crucial to
know when building a compiler.
Did this information already sit in PathScale's database of "processor
If so, where did you get the knowledge?
On Jun 24, 2008, at 11:07 PM, Greg Lindahl wrote:
> On Tue, Jun 24, 2008 at 10:21:01PM +0200, Vincent Diepeveen wrote:
>> The PG compiler and especially pathscale compiler are doing rather
>> well at benchmarks,
>> especially that last, yet at our codes they're real ugly. Maybe they
>> do better for floating point
>> oriented workloads, which doesn't describe game tree search.
> There are certainly unusual codes out there, and PathScale has gotten
> a lot of examples sent in by customers, thanks to the "if we're slower
> than someone else, it's a bug" philosophy. This allowed us to improve
> the compiler on a lot of non-benchmark codes.
> In your case, I'd suggest that you use pathopt to search for better
> -- greg