[Beowulf] picking out a job scheduler

Reuti reuti at staff.uni-marburg.de
Fri Jan 5 04:58:28 PST 2007


On 05.01.2007 at 01:09, Chris Samuel wrote:

> On Thursday 04 January 2007 22:16, Reuti wrote:
>
>> Linda and PVM* need some kind of rsh/ssh between the nodes, and so
>> far I have not found a way to convince Linda to use the PBS TM
>> interface of Torque.
>
> Torque provides a pbsdsh command that uses the TM interface and acts
> like the various DSH variants.  What it doesn't appear to be able to
> do (which I've just discovered) is run only once per node in the job.
> Hmm..

You can run it once per node with the -n option. Simulating rsh would
simply mean mapping the hostname of the requested machine to an index
in the list of granted machines - no big deal. The bigger problem
seems to be that there is no real environment on the nodes where the
slave tasks are started, i.e. no environment variables are set.
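
The hostname-to-index mapping could look roughly like the sketch below.
It is an illustration only: the helper names node_index and
rsh_like_command are made up, and it assumes that the order of entries
in the job's node file ($PBS_NODEFILE) matches the index that
pbsdsh -n expects, which should be checked against your Torque version.

```python
def node_index(nodefile_lines, hostname):
    """Return the position of the first entry for hostname in the granted list.

    nodefile_lines: lines as read from the job's node file (assumed to be
    $PBS_NODEFILE, one entry per granted slot, hosts possibly repeated).
    Hosts are compared by short name so "node02" matches "node02.example.org".
    """
    wanted = hostname.split(".")[0]
    # Drop empty lines first so they do not shift the indices.
    entries = [line.strip() for line in nodefile_lines if line.strip()]
    for index, entry in enumerate(entries):
        if entry.split(".")[0] == wanted:
            return index
    raise LookupError("host %r is not part of this job" % hostname)


def rsh_like_command(nodefile_lines, hostname, command):
    """Build the argument list for an rsh-style call routed through pbsdsh.

    Assumption: "pbsdsh -n <index> program [args]" starts the program on
    the node with that index. The command is only built here, not run.
    """
    index = node_index(nodefile_lines, hostname)
    return ["pbsdsh", "-n", str(index)] + list(command)
```

A wrapper installed as "rsh" could read the node file, call
rsh_like_command() with the requested host, and exec the result. It
would not address the missing environment on the remote side.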

-- Reuti


>> As you mentioned in your other post about keeping control of MPI
>> processes, the SGE counterpart of TM is the qrsh command, which
>> replaces rsh/ssh; this way SGE keeps control of the processes
>> spawned on the nodes.
>
> Sounds very similar to pbsdsh in the way it works.
>
>> I also always aim for a cluster setup without any general rsh/ssh
>> between the nodes at all, since with it users could accidentally
>> start processes on the nodes outside the control of the queuing
>> system.
>
> Exactly.  What we do here is a hack in the /etc/profile that checks
> for the existence of $PBS_ENVIRONMENT and kicks them off with a
> message about only being permitted to access the node if you have a
> job on it.  Ugly, but it works.
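
The decision that such a login check makes can be sketched as follows.
The real hack lives in shell inside /etc/profile and is not shown in
this thread; the function names and the message text below are
hypothetical, and only the test for PBS_ENVIRONMENT comes from the
description above.

```python
def login_allowed(environ):
    """A shell is allowed only if it was started from within a PBS job.

    The only signal used is the presence of PBS_ENVIRONMENT, as in the
    description above; its value is not inspected.
    """
    return "PBS_ENVIRONMENT" in environ


def rejection_message(environ):
    """Return None for job shells, otherwise the text shown before logout.

    The wording of the message is made up for illustration.
    """
    if login_allowed(environ):
        return None
    return "Access to this node is only permitted through a PBS job."
```

A profile script would print the message and exit the shell whenever
rejection_message() returns text.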
>
> Newer versions of Torque have a PAM module contributed by Jim Prewett
> which will check the user against the current list of Torque jobs on
> a node and only permit access if they have a job on the node.
>
> We prefer to only allow access via a PBS job, which is why we still
> use our hack, but the PAM module might be a handy backstop for us.
>
> cheers!
> Chris
> -- 
>  Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
>  Victorian Partnership for Advanced Computing http://www.vpac.org/
>  Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia