[Beowulf] performance tweaks and optimum memory configs for a Nehalem

Mark Hahn hahn at mcmaster.ca
Mon Aug 10 05:41:09 PDT 2009

> (a) I am seeing strange scaling behaviours with Nehalem cores. e.g. a
> specific DFT (Density Functional Theory) code we use is maxing out
> performance at 2 or 4 cpus instead of 8, i.e. runs on 8 cores are
> actually slower than on 2 and 4 cores (depending on setup)

this is on the machine which reports 16 cores, right?  I'm guessing
that the kernel is compiled without numa and/or ht support, so it
enumerates virtual cpus first.  that would mean that, when the machine is
otherwise idle, a 2-core job will get both virtual cores of the same
physical core, and that your 8c test is merely keeping the first socket busy.
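
One way to check how the logical cpus are enumerated is to read the
topology files that Linux exposes under sysfs. The following is a minimal
sketch, assuming a Linux host with /sys/devices/system/cpu/cpuN/topology
present; the function names (read_topology, group_by_physical_core,
one_cpu_per_core) are made up for illustration and are not part of any
tool mentioned in this thread.

```python
import glob
import os
import re
from collections import defaultdict


def read_topology(sysfs_root="/sys/devices/system/cpu"):
    """Return {logical_cpu: (package_id, core_id)} read from Linux sysfs."""
    topology = {}
    for path in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(path))
        topo_dir = os.path.join(path, "topology")
        if not match or not os.path.isdir(topo_dir):
            continue  # skip entries that carry no topology information
        with open(os.path.join(topo_dir, "physical_package_id")) as f:
            package_id = int(f.read())
        with open(os.path.join(topo_dir, "core_id")) as f:
            core_id = int(f.read())
        topology[int(match.group(1))] = (package_id, core_id)
    return topology


def group_by_physical_core(topology):
    """Group logical cpu numbers by the (package, core) pair they share."""
    groups = defaultdict(list)
    for cpu, key in sorted(topology.items()):
        groups[key].append(cpu)
    return dict(groups)


def one_cpu_per_core(topology):
    """Pick the lowest-numbered logical cpu from each physical core."""
    groups = group_by_physical_core(topology)
    return sorted(cpus[0] for cpus in groups.values())
```

Running print(group_by_physical_core(read_topology())) on the node in
question shows which logical cpus are hyperthread siblings. If cpu0 and
cpu1 land in the same group, a 2-process run placed on the first two
logical cpus is sharing one physical core.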

> other four cores could go to another job or stay empty. Question is
> with hyperthreading this compartmentalization is lost isn't it? So
> userA who got 4 cores could end up leeching on the other 4 cores too?
> Or am I wrong?

the kernel/scheduler is smart enough to do mostly the right thing WRT 
virtual cores.  when compiled properly...

>> It is possible that this is the result of not setting
>> processor affinity.
>> The Linux scheduler may not switch processes
>> across cores/processors efficiently.
> So let me double check my understanding. On this Nehalem if I set the
> processor affinity is that akin to disabling hyperthreading too? Or
> are these two independent concepts?

processor affinity just means restricting the set of cores a process 
can run on.  it's orthogonal to the question of choosing the _right_ cores,
and it does not disable hyperthreading.
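
To make the distinction concrete, here is a minimal sketch of setting
affinity from Python on Linux, using os.sched_getaffinity and
os.sched_setaffinity from the standard library. The helper name
pin_current_process is made up for illustration. Note that the call only
restricts where the process may run; deciding which cpu numbers to pass in
is a separate problem.

```python
import os


def pin_current_process(cpus):
    """Restrict the calling process to the given logical cpus (Linux only)."""
    allowed = os.sched_getaffinity(0)   # pid 0 means "the calling process"
    wanted = set(cpus) & allowed        # never ask for cpus we may not use
    if not wanted:
        raise ValueError("none of the requested cpus are available")
    os.sched_setaffinity(0, wanted)
    return os.sched_getaffinity(0)      # report the mask now in effect
```

For example, pin_current_process({0, 2, 4, 6}) limits the process to those
four logical cpus, whether or not they happen to be distinct physical cores.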
