[Beowulf] performance tweaks and optimum memory configs for a Nehalem
hahn at mcmaster.ca
Mon Aug 10 05:41:09 PDT 2009
> (a) I am seeing strange scaling behaviours with Nehalem cores. e.g. a
> specific DFT (Density Functional Theory) code we use is maxing out
> performance at 2, 4 cpus instead of 8. i.e. runs on 8 cores are
> actually slower than 2 and 4 cores (depending on setup)
this is on the machine which reports 16 cores, right? I'm guessing
that the kernel is compiled without numa and/or ht support, so it
enumerates virtual cpus first. that would mean that when the machine is
otherwise idle, a 2-cpu job will get the two virtual cores of the same
physical core, and that your 8-core test is merely keeping the first
socket busy.
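
one way to check the guess above is to look at how the kernel numbers the
logical cpus. a minimal sketch in python, assuming a linux box that exposes
physical_package_id and core_id under /sys/devices/system/cpu/cpuN/topology
(the function names are mine, not from any tool):

```python
import glob
import os
import re
from collections import defaultdict


def read_topology(sysfs_root="/sys/devices/system/cpu"):
    """Return {logical_cpu: (package_id, core_id)} as reported by Linux sysfs."""
    topology = {}
    for cpu_dir in glob.glob(os.path.join(sysfs_root, "cpu[0-9]*")):
        match = re.search(r"cpu(\d+)$", cpu_dir)
        topo_dir = os.path.join(cpu_dir, "topology")
        # Skip non-cpu entries and offline cpus that expose no topology.
        if not match or not os.path.isdir(topo_dir):
            continue
        with open(os.path.join(topo_dir, "physical_package_id")) as f:
            package = int(f.read())
        with open(os.path.join(topo_dir, "core_id")) as f:
            core = int(f.read())
        topology[int(match.group(1))] = (package, core)
    return topology


def group_by_core(topology):
    """Group logical cpu numbers by the physical (package, core) they share."""
    groups = defaultdict(list)
    for cpu, key in topology.items():
        groups[key].append(cpu)
    return {key: sorted(cpus) for key, cpus in sorted(groups.items())}


if __name__ == "__main__":
    for (package, core), cpus in group_by_core(read_topology()).items():
        print(f"socket {package} core {core}: logical cpus {cpus}")
```

if cpus 0 and 1 show up under the same socket and core, then an otherwise
idle 2-cpu job can end up on hyperthread siblings, and the first 8 logical
cpus all sit on one socket.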
> other four cores could go to another job or stay empty. Question is
> with hyperthreading this compartmentalization is lost isn't it? So
> userA who got 4 cores could end up leeching on the other 4 cores too?
> Or am I wrong?
the kernel/scheduler is smart enough to do mostly the right thing WRT
virtual cores. when compiled properly...
>> It is possible that this is the result of not setting processor
>> affinity. The Linux scheduler may not switch processes across
>> cores/processors efficiently.
> So let me double check my understanding. On this Nehalem if I set the
> processor affinity is that akin to disabling hyperthreading too? Or
> are these two independent concepts?
processor affinity just means restricting the set of cores a proc
can run on. it's orthogonal to the question of choosing the _right_ cores.
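
to make the distinction concrete, here is a small python sketch. it assumes
linux, where os.sched_setaffinity is available. one_cpu_per_core is a
hypothetical helper that does the "choosing" part, pin_current_process does
the "restricting" part, and the topology dict in the demo is made up to match
the sibling-first numbering guessed at earlier, not read from a real machine:

```python
import os


def one_cpu_per_core(topology):
    """Pick the lowest-numbered logical cpu on each physical core.

    topology maps logical cpu -> (package_id, core_id).
    """
    first = {}
    for cpu in sorted(topology):
        # setdefault keeps the first (lowest) cpu seen for each physical core.
        first.setdefault(topology[cpu], cpu)
    return sorted(first.values())


def pin_current_process(cpus):
    """Restrict the calling process to the given logical cpus (Linux only)."""
    os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
    return sorted(os.sched_getaffinity(0))


if __name__ == "__main__":
    # Hypothetical layout: 2 sockets x 2 cores x 2 threads, siblings adjacent.
    example = {
        0: (0, 0), 1: (0, 0), 2: (0, 1), 3: (0, 1),
        4: (1, 0), 5: (1, 0), 6: (1, 1), 7: (1, 1),
    }
    print("one logical cpu per physical core:", one_cpu_per_core(example))
```

affinity only enforces whatever set you hand it. pinning to cpus 0-3 in the
layout above would still put four threads on two physical cores, while the
set picked by one_cpu_per_core spreads them over four.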