[Beowulf] performance tweaks and optimum memory configs for a Nehalem
Håkon Bugge h-bugge at online.no
Tue Aug 11 00:43:03 PDT 2009
On Aug 10, 2009, at 23:07, Tom Elken wrote:

> Summary:
> IBM, SGI and Platform have some comparisons on clusters with "SMT
> On" of running 1 rank for every core compared to running 2 ranks on
> every core. In general, on low core-counts, like up to 32, there is
> about an 8% advantage for running 2 ranks per core. At larger core
> counts, IBM published a pair of results on 64 cores where the
> 64-rank performance was equal to the 128-rank performance. Not all of
> these applications scale linearly, so on some of them you lose
> efficiency at 128 ranks compared to 64 ranks.
>
> Details: Results from this year are mostly on Nehalem:
> http://www.spec.org/mpi2007/results/res2009q3/ (IBM)
> http://www.spec.org/mpi2007/results/res2009q2/ (Platform)
> http://www.spec.org/mpi2007/results/res2009q1/ (SGI)
> (Intel has results with Turbo mode turned on and off
> in the q2 and q3 results, for a different comparison)
>
> Or you can pick out the Xeon 'X5570' and 'X5560' results from the
> list of all results:
> http://www.spec.org/mpi2007/results/mpi2007.html
>
> In the result index, when
> "Compute Threads Enabled" = 2x "Compute Cores Enabled", then you
> know SMT is turned on.
> In these cases, you can then check that when
> "MPI Ranks" = "Compute Threads Enabled" then you are running 2
> ranks per core.

Tom,

Thanks for the neatly compiled information above. I can just add that I have conducted a fairly detailed analysis of Nehalem compared to HarperTown in my paper "An evaluation of Intel's core i7 architecture using a comparative approach", presented at ISC'09. There, I look at different aspects of the memory hierarchy of the two processors. The benefits of hyperthreading on the 13 SPEC MPI2007 applications are also studied, although using only a single node, where the advantage is more pronounced.

Thanks, Håkon
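
For anyone scanning the result index by hand, the rule quoted above can be sketched as a small check. This is only an illustration: the function name and its three integer arguments are hypothetical, standing in for the "Compute Cores Enabled", "Compute Threads Enabled" and "MPI Ranks" columns, and it does not parse the SPEC pages themselves.

```python
def classify_result(cores, threads, ranks):
    """Apply the quoted rule to one row of the result index.

    cores, threads and ranks are assumed to be the integers from the
    "Compute Cores Enabled", "Compute Threads Enabled" and "MPI Ranks"
    columns (hypothetical argument names).
    """
    # SMT is on when there are twice as many threads as cores.
    smt_on = threads == 2 * cores
    if smt_on and ranks == threads:
        mode = "2 ranks per core"
    elif ranks == cores:
        mode = "1 rank per core"
    else:
        # Anything else (partially populated nodes etc.) is left unclassified.
        mode = "other"
    return smt_on, mode
```

For example, a 64-core run with SMT on would be called as classify_result(64, 128, 128) for the 128-rank case and classify_result(64, 128, 64) for the 64-rank case.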
