[Beowulf] Again about NUMA (numactl and taskset)

Patrick Geoffray patrick at myri.com
Thu Jun 26 23:39:39 PDT 2008

Hi Håkon,

Håkon Bugge wrote:
> This is information we're using to optimize how pnt-to-pnt communication 
> is implemented. The code-base involved is fairly complicated and I do 
> not expect resource management systems to cope with it.

Why not? Knowing the resources it has to manage is its job. The
resource manager has more information than you, it does not have to
detect that information at runtime for each job, and it can manage core
allocation across jobs. You cannot expect the granularity of allocation
to stay at the node level as core counts increase.
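
To make core-level allocation across jobs concrete, here is a minimal
sketch in Python. It is a toy, not the API of any real resource manager:
the class name CoreAllocator, its methods, and the node name "n0" are all
hypothetical, and it ignores NUMA topology and only tracks which core ids
are free on each node.

```python
class CoreAllocator:
    """Toy per-node core bookkeeping (hypothetical, not a real scheduler API)."""

    def __init__(self, cores_per_node):
        # node name -> set of core ids that are currently free
        self.free = {node: set(range(n)) for node, n in cores_per_node.items()}
        # job id -> list of (node, [core ids]) grants held by that job
        self.grants = {}

    def allocate(self, job_id, node, ncores):
        """Grant ncores free cores on node to job_id; None if too few are free."""
        available = sorted(self.free[node])
        if len(available) < ncores:
            return None
        cores = available[:ncores]
        self.free[node] -= set(cores)
        self.grants.setdefault(job_id, []).append((node, cores))
        return cores

    def release(self, job_id):
        """Return every core held by job_id to the free pool."""
        for node, cores in self.grants.pop(job_id, []):
            self.free[node] |= set(cores)


if __name__ == "__main__":
    alloc = CoreAllocator({"n0": 8})
    print(alloc.allocate("jobA", "n0", 4))  # two jobs share one node...
    print(alloc.allocate("jobB", "n0", 4))  # ...on disjoint cores
```

The point is only that the allocator sees every job on the node, so it
can hand out disjoint core sets, which a single MPI job cannot do for
itself.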

If the MPI implementation does the spawning, it should definitely
support enforcing core affinity (most do, AFAIK). However, core affinity
should be dictated by the scheduler. Heck, the MPI implementation should
not do the spawning in the first place.
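
As a rough sketch of "the scheduler dictates, the launcher enforces", the
Python below pins a process to a core list before exec'ing the
application. The environment variable name SCHED_CORES and the helper
names are assumptions made up for illustration; no particular scheduler
or MPI library is known to use them. os.sched_setaffinity is a real
call, but it is only available on Linux. The core-list syntax ("0-3,8")
mirrors what taskset -c accepts.

```python
import os


def parse_core_list(spec):
    """Parse a core list such as "0-3,8" into a set of core ids."""
    cores = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cores.update(range(int(lo), int(hi) + 1))
        else:
            cores.add(int(part))
    return cores


def pin_to_cores(cores, pid=0):
    """Restrict pid (0 = calling process) to cores; False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):  # not available outside Linux
        return False
    os.sched_setaffinity(pid, cores)
    return True


def launch(argv, env_var="SCHED_CORES"):
    """Apply the scheduler-provided core list (hypothetical env var), then exec."""
    spec = os.environ.get(env_var)
    if spec:
        pin_to_cores(parse_core_list(spec))
    # The affinity mask is inherited across exec, so the application
    # starts already confined to the cores it was given.
    os.execvp(argv[0], argv)
```

With something like this, the application itself needs no affinity
logic; a wrapper would call launch(["./my_mpi_app"]) with the variable
set by whoever allocated the cores.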

Historically, resource managers have been pretty dumb. These days, there 
is enough competition in this domain to expect better.


