[Beowulf] performance tweaks and optimum memory configs for a Nehalem
Tom Elken tom.elken at qlogic.com
Mon Aug 10 14:07:23 PDT 2009
> Well, as there are only 8 "real" cores, running a computationally
> intensive process across 16 should *definitely* do worse than across 8.

Not typically. At the SPEC website there are quite a few SPEC MPI2007 (which is an average across 13 HPC applications) results on Nehalem.

Summary:
IBM, SGI and Platform have some comparisons on clusters with "SMT On" of running 1 rank for every core compared to running 2 ranks on every core. In general, on low core-counts, like up to 32, there is about an 8% advantage for running 2 ranks per core. At larger core counts, IBM published a pair of results on 64 cores where the 64-rank performance was equal to the 128-rank performance. Not all of these applications scale linearly, so on some of them you lose efficiency at 128 ranks compared to 64 ranks.

Details:
Results from this year are mostly on Nehalem:
http://www.spec.org/mpi2007/results/res2009q3/ (IBM)
http://www.spec.org/mpi2007/results/res2009q2/ (Platform)
http://www.spec.org/mpi2007/results/res2009q1/ (SGI)
(Intel has results with Turbo mode turned on and off in the q2 and q3 results, for a different comparison)

Or you can pick out the Xeon 'X5570' and 'X5560' results from the list of all results:
http://www.spec.org/mpi2007/results/mpi2007.html

In the result index, when "Compute Threads Enabled" = 2x "Compute Cores Enabled", then you know SMT is turned on. In these cases, you can then check that when "MPI Ranks" = "Compute Threads Enabled" then you are running 2 ranks per core.

-Tom

> However, it's not so surprising that you're seeing peak performance with
> 2-4 threads. Nehalem can actually overclock itself when only some of the
> cores are busy -- it's called Turbo Mode. That *could* be what you're
> seeing.
>
> --
> Joshua Baker-LePain
> QB3 Shared Cluster Sysadmin
> UCSF
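
For anyone sorting through the result index by hand, the rule of thumb Tom gives for reading the "Compute Cores Enabled", "Compute Threads Enabled" and "MPI Ranks" columns could be sketched roughly as below. This is only an illustration: the function name and the sample numbers are made up, and the "1 rank per core" branch (ranks equal to cores) is an assumption inferred from the message rather than something it states.

```python
def classify_result(cores, threads, ranks):
    """Classify one result-index row (hypothetical helper, not a SPEC tool).

    cores   -- value of "Compute Cores Enabled"
    threads -- value of "Compute Threads Enabled"
    ranks   -- value of "MPI Ranks"
    """
    # Rule from the message: threads == 2 x cores means SMT is turned on.
    smt_on = threads == 2 * cores

    if smt_on and ranks == threads:
        # Rule from the message: ranks == threads (with SMT on) means 2 ranks per core.
        placement = "2 ranks per core"
    elif ranks == cores:
        # Assumption: ranks == cores is read as 1 rank per core.
        placement = "1 rank per core"
    else:
        placement = "other"

    return smt_on, placement


# Made-up rows shaped like the 64-core pair mentioned in the message.
print(classify_result(cores=64, threads=128, ranks=128))
print(classify_result(cores=64, threads=128, ranks=64))
```

The two sample calls correspond to the kind of 64-rank versus 128-rank comparison described in the summary; real rows would need the values copied from the SPEC result pages.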
