[Beowulf] performance tweaks and optimum memory configs for a Nehalem


Håkon Bugge h-bugge at online.no
Tue Aug 11 00:43:03 PDT 2009


On Aug 10, 2009, at 23:07 , Tom Elken wrote:
> Summary:
> IBM, SGI and Platform have some comparisons on clusters with "SMT  
> On" of running 1 rank for every core compared to running 2 ranks on  
> every core.  In general, on low core-counts, like up to 32 there is  
> about an 8% advantage for running 2 ranks per core.  At larger core  
> counts, IBM published a pair of results on 64 cores where the  
> 64-rank performance was equal to the 128-rank performance.  Not all of  
> these applications scale linearly, so on some of them you lose  
> efficiency at 128 ranks compared to 64 ranks.
>
> Details: Results from this year are mostly on Nehalem:
> http://www.spec.org/mpi2007/results/res2009q3/ (IBM)
> http://www.spec.org/mpi2007/results/res2009q2/ (Platform)
> http://www.spec.org/mpi2007/results/res2009q1/ (SGI)
>  (Intel has results with Turbo mode turned on and off
>    in the q2 and q3 results, for a different comparison)
>
> Or you can pick out the Xeon 'X5570' and 'X5560' results from the  
> list of all results:
> http://www.spec.org/mpi2007/results/mpi2007.html
>
> In the result index, when
> "Compute Threads Enabled" = 2x "Compute Cores Enabled", then you  
> know SMT is turned on.
> In these cases, you can then check that when
> "MPI Ranks" = "Compute Threads Enabled" then you are running 2  
> ranks per core.
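
The rule quoted above for reading the result index can be sketched in a few lines of Python. This is only an illustration: the function name and its parameters are hypothetical, the three numbers are assumed to have been copied by hand from the index columns, and the "1 rank per core" and fallback cases are my own reading rather than part of the quoted rule.

```python
def classify_result(cores_enabled: int, threads_enabled: int, mpi_ranks: int) -> str:
    """Classify one SPEC MPI2007 result row using the rule quoted above.

    Hypothetical helper: the arguments correspond to the index columns
    "Compute Cores Enabled", "Compute Threads Enabled" and "MPI Ranks".
    """
    # Quoted rule: threads == 2 x cores means SMT is turned on.
    if threads_enabled != 2 * cores_enabled:
        # Assumption: anything else is reported as "not indicated",
        # since the quoted rule only covers the 2x case.
        return "SMT not indicated"
    # Quoted rule: ranks == threads means 2 ranks per core.
    if mpi_ranks == threads_enabled:
        return "SMT on, 2 ranks per core"
    # Assumption: ranks == cores is read as 1 rank per core.
    if mpi_ranks == cores_enabled:
        return "SMT on, 1 rank per core"
    return "SMT on, other rank layout"


if __name__ == "__main__":
    # Made-up numbers for illustration, not taken from any published result.
    print(classify_result(cores_enabled=64, threads_enabled=128, mpi_ranks=128))
    print(classify_result(cores_enabled=64, threads_enabled=128, mpi_ranks=64))
```
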


Tom,

Thanks for the neatly compiled information above. I can add that I  
have conducted a fairly detailed analysis of Nehalem compared to  
Harpertown in my paper "An evaluation of Intel's Core i7 architecture  
using a comparative approach", presented at ISC'09. There, I look at  
different aspects of the memory hierarchy of the two processors. The  
benefits of hyperthreading on the 13 SPEC MPI2007 applications are  
also studied, although only on a single node, where the advantage is  
more pronounced.

Thanks,


Håkon


