[Beowulf] VMC - Virtual Machine Console

Chris Samuel csamuel at vpac.org
Sat Feb 23 19:56:01 PST 2008


----- "Rayson Ho" <raysonlogin at gmail.com> wrote:

> I am working on adding processor affinity support for serial and
> parallel jobs for Grid Engine, and I am working with the OpenMPI
> developers to define an interface.

FWIW the Torque approach (currently in trunk in SVN) is to
not use CPU affinity but instead use the cpuset support
found in most modern Linux kernels.

So once you've created /dev/cpuset and mounted the cpuset
virtual filesystem on it with "mount -t cpuset - /dev/cpuset",
the new pbs_mom will automatically create (if it doesn't
already exist) a "torque" cpuset with all the CPUs in it.
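
For anyone who hasn't poked at cpusets by hand, the following is a
rough Python sketch of the kind of filesystem operations involved in
creating a top-level cpuset. It is not Torque's code. The function
name make_cpuset and its parameters are made up for illustration, and
it assumes the legacy file names ("cpus", "mems") exposed when the
filesystem is mounted with -t cpuset. The root directory is a
parameter so the sketch can be tried on a scratch directory without
root access.

```python
import os

def make_cpuset(root, name, cpus, mems="0"):
    """Create cpuset <root>/<name> and assign CPUs and memory nodes.

    cpus and mems use the kernel's list format, e.g. "0-7" or "0,2,4".
    The file names "cpus" and "mems" are an assumption (legacy layout).
    """
    path = os.path.join(root, name)
    # On a mounted cpuset filesystem, mkdir creates a new, empty cpuset.
    os.makedirs(path, exist_ok=True)
    # A cpuset needs both CPUs and memory nodes before tasks can join it.
    for filename, value in (("cpus", cpus), ("mems", mems)):
        with open(os.path.join(path, filename), "w") as f:
            f.write(value + "\n")
    return path
```

On an 8 core box the equivalent of what pbs_mom sets up would be
something like make_cpuset("/dev/cpuset", "torque", "0-7"), run as
root.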

It then creates job cpusets beneath that for each job
and a "vnode" (aka per-process) cpuset for each process
created.

So, on an 8 core box running a 4 CPU MPI job you'd
end up with:

/dev/cpuset/torque (8 cores)
/dev/cpuset/torque/1.cluster-m.foo.edu/ (4 cores)
/dev/cpuset/torque/1.cluster-m.foo.edu/1/ (1 core)
/dev/cpuset/torque/1.cluster-m.foo.edu/2/ (1 core)
/dev/cpuset/torque/1.cluster-m.foo.edu/3/ (1 core)
/dev/cpuset/torque/1.cluster-m.foo.edu/4/ (1 core)
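
The listing above can be reproduced with a small Python sketch. The
function name job_cpuset_layout is hypothetical, and the choice of
cores 0-3 for the job is purely an assumption for the example; which
cores a real job gets is up to pbs_mom.

```python
def job_cpuset_layout(job_id, cores, root="/dev/cpuset/torque"):
    """Return (path, cpu_list) pairs for a job cpuset and its vnode cpusets.

    cores is the list of core numbers given to the job. Each vnode
    cpuset gets exactly one of them, numbered from 1 as in the listing.
    """
    job_path = f"{root}/{job_id}"
    # The job cpuset holds every core assigned to the job.
    layout = [(job_path, ",".join(str(c) for c in cores))]
    # One single-core vnode cpuset per process, beneath the job cpuset.
    for vnode, core in enumerate(cores, start=1):
        layout.append((f"{job_path}/{vnode}", str(core)))
    return layout

# Assumed example: job 1.cluster-m.foo.edu running on cores 0-3.
for path, cpus in job_cpuset_layout("1.cluster-m.foo.edu", [0, 1, 2, 3]):
    print(f"{path}  cpus={cpus}")
```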

SMP processes would end up in the job set whereas
processes launched via PBS's TM API would end up
in their appropriate vnode set.

So if a user launches what they think is a single CPU
serial job, but the code actually detects how many cores
are in the system and uses all of them, it will no longer
affect other users' code on the system - their own job will
just take a hammering instead! :-)
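
To illustrate the difference from the code's point of view, here is a
minimal Python sketch (the function name usable_cpus is made up).
On Linux a process confined to a cpuset can only be scheduled on that
cpuset's CPUs, and its affinity mask reflects this, whereas
os.cpu_count() still reports every core in the box. Code that sizes
its thread pool from the latter is the kind that ends up hammering
itself.

```python
import os

def usable_cpus():
    """Number of CPUs this process may run on.

    Falls back to the system-wide count where the affinity call
    is not available (it is Linux-specific).
    """
    if hasattr(os, "sched_getaffinity"):
        # The affinity mask is limited to the CPUs of the enclosing cpuset.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1

print("CPUs in the system:", os.cpu_count())
print("CPUs usable here:  ", usable_cpus())
```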

cheers,
Chris
-- 
Christopher Samuel - (03) 9925 4751 - Systems Manager
 The Victorian Partnership for Advanced Computing
 P.O. Box 201, Carlton South, VIC 3053, Australia
VPAC is a not-for-profit Registered Research Agency


