[Beowulf] picking out a job scheduler
csamuel at vpac.org
Thu Jan 4 16:09:52 PST 2007
On Thursday 04 January 2007 22:16, Reuti wrote:
> Linda and PVM* need some kind of rsh/ssh between the nodes, and I
> didn't get a clue up to now to convince Linda to use the PBS TM of
Torque provides a pbsdsh command that uses the TM interface and acts like the
various DSH variants. What it doesn't appear to be able to do (which I've
just discovered) is run a command only once per node in the job. Hmm..
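
One possible workaround for the once-per-node limitation is to build the list of distinct hosts yourself before launching anything. The sketch below is only an illustration in Python: it assumes the job's node list is available in the file named by $PBS_NODEFILE, with one hostname per line and a host repeated once per allocated processor. The function names unique_nodes and read_job_nodes are made up for this example and are not part of Torque.

```python
import os


def unique_nodes(nodefile_lines):
    """Return each hostname once, in the order it first appears."""
    seen = set()
    ordered = []
    for line in nodefile_lines:
        host = line.strip()
        # Skip blank lines and hosts we have already recorded.
        if host and host not in seen:
            seen.add(host)
            ordered.append(host)
    return ordered


def read_job_nodes(path=None):
    """Read the node file (default: $PBS_NODEFILE) and de-duplicate it."""
    # Assumption: we are inside a job, so PBS_NODEFILE is set.
    path = path or os.environ["PBS_NODEFILE"]
    with open(path) as fh:
        return unique_nodes(fh)
```

The resulting list could then be handed to whatever launcher is in use, one invocation per host.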
> As you mentioned in your other post about keeping control of
> MPI processes, the similar thing to TM is the qrsh command in SGE,
> which will replace rsh/ssh and SGE is controlling this way these
> spawned processes on the nodes.
Sounds very similar to pbsdsh in the way it works.
> I'm also always looking in a cluster setup, without any common rsh/ssh
> between the nodes at all, where users could by accident start processes out
> of control of the queuing system on the nodes.
Exactly. What we do here is a hack in /etc/profile that checks for the
existence of $PBS_ENVIRONMENT and, if it is not set, kicks the user off with a
message saying that access to the node is only permitted if they have a job on
it. Ugly, but it
Newer versions of Torque have a PAM module contributed by Jim Prewett which
will check the user against the current list of Torque jobs on a node and
only permit access if they have a job on the node.
We prefer to only allow access via PBS jobs, which is why we still use our
hack, but the PAM module might be a handy backstop for us.
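
The decision made by the /etc/profile check described above can be sketched as follows. This is a Python illustration of the logic only, not the actual VPAC hack (which lives in a shell profile); the function names and the wording of the message are assumptions made for the example.

```python
def login_allowed(environ):
    """Permit the session only if PBS_ENVIRONMENT exists in the environment."""
    return "PBS_ENVIRONMENT" in environ


def check_login(environ):
    """Return (exit_status, message) for a new login shell."""
    if login_allowed(environ):
        return 0, ""
    # Hypothetical wording; the real message is not given in the post.
    return 1, "Access to this node is only permitted from within a PBS job."
```

For example, check_login(dict(os.environ)) (after import os) applies the check to the current session. Because the test relies on an environment variable, a user could set that variable by hand, which is one reason a PAM-level check may be useful as a backstop.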
Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia