[Beowulf] AMD64 results...
ctierney at HPTI.com
Thu Dec 16 08:12:21 PST 2004
On Wed, 2004-12-15 at 21:35, Mark Hahn wrote:
> > > They are all below. Executive summary is that the AMD barely beats
> > > (real) clock speed scaling compared to the P2 for stream. I suspect
> sure - stream is normally a dram test, not a CPU test.
> > Double registers only help if you need them. Most codes won't
> > automatically utilize native 64 bit ints or pointers to any
> > significant advantage.
> indeed, going 64b often costs a noticeable overhead in code size
> expansion and inflation of space to store pointers.
> the real appeal of x86-64 is that you get twice as many registers.
> yes, being able to actually use more than about 2.5 GB is nice,
> and important to some people. but almost any real code will take
> advantage of having twice as many registers (integer and SIMD).
> > or with a 2.6 Kernel (which is better about ensuring that pages and the
> > process acting on them are on the same cpu).
> don't forget to turn on node interleave in the bios, too.
Why? If you are planning to have a single process access
the memory of all of the nodes (cpus), then yes. If you are
running MPI jobs or multiple processes that stay local to their
own memory, then you don't: you want bank interleave on but node
interleave off. I have seen better performance for MPI jobs, 1 process
per cpu, with node interleave off.
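
For the "1 process per cpu" case, the point is that each process should
stay on the cpu whose memory it is using. A minimal sketch of pinning a
process to one cpu follows. It assumes Linux, where Python exposes
os.sched_setaffinity, and it assumes the kernel places newly touched
pages on the local node by default. The helper name pin_to_cpu is
made up for illustration; an MPI launcher or numactl would normally do
this for you.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU.

    Returns the resulting affinity set, or None on platforms that do
    not provide os.sched_setaffinity (it is Linux-only).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    # pid 0 means "the calling process".
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        # Only pick from CPUs this process is currently allowed to use.
        allowed = sorted(os.sched_getaffinity(0))
        print("pinned to", pin_to_cpu(allowed[0]))
    else:
        print("CPU affinity calls are not available on this platform")
```

Memory allocated and first written after the pin should then come from
the local node, provided node interleave is off in the bios.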
> > Kudos for the pathscale-1.4 compiler with -O3.
> ironically, icc -xW generates pretty good-for-opteron code,
> though of course, it's 32b. I haven't tried using icc to
> generate em64t/amd64 code.
> regards, mark hahn.