[Beowulf] block running jobs individually on each node

Chris Dagdigian dag at sonsorol.org
Thu Apr 7 12:14:13 PDT 2005


Stopping people from gaming or bypassing the cluster scheduler is 
possible via various methods (sorry I can't help you out with a 
PBS-specific method!) but in the long run it is an arms race between you 
and the misbehaving users. You will probably never win it completely, 
and it will sap lots of your time and effort.

My take on this issue has always been that this is a policy issue, not a 
technology issue.

If you have a written policy that says "bypassing the scheduler results 
in an unfair allocation of shared resources that hurts your fellow 
users" then you have a framework for dealing with abusers. Typically 
this means a gentle note to the manager/advisor of the user. Further 
abuses result in user account suspension.

I know people who use both approaches. On some Grid Engine clusters, any 
user process running on a compute node that is not a child of the proper 
sge_shepherd daemon gets a "kill -9" signal sent to it. Users get the 
message quickly.
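
For illustration only, here is a minimal sketch of the "not a child of the 
shepherd" check described above. It is not how any particular site does 
it. The names Proc, has_ancestor, find_rogue_pids, min_uid and 
allowed_names are all hypothetical, the process table is passed in as a 
plain dict rather than read from the node, the min_uid cutoff of 1000 for 
"ordinary user" accounts is an assumption, and the actual kill step is 
deliberately left out.

```python
from typing import Dict, FrozenSet, List, NamedTuple


class Proc(NamedTuple):
    """One row of a (hypothetical) process-table snapshot."""
    ppid: int  # parent process ID
    uid: int   # owning user ID
    name: str  # executable name


def has_ancestor(pid: int, procs: Dict[int, Proc], ancestor_name: str) -> bool:
    """Return True if pid itself, or any ancestor of it, is named ancestor_name."""
    seen = set()
    # Walk up the parent chain; 'seen' guards against malformed cyclic tables.
    while pid in procs and pid not in seen:
        seen.add(pid)
        if procs[pid].name == ancestor_name:
            return True
        pid = procs[pid].ppid
    return False


def find_rogue_pids(
    procs: Dict[int, Proc],
    shepherd_name: str = "sge_shepherd",
    min_uid: int = 1000,  # assumption: ordinary user accounts start at UID 1000
    allowed_names: FrozenSet[str] = frozenset(),
) -> List[int]:
    """List user-owned PIDs that do not descend from the shepherd daemon."""
    return sorted(
        pid
        for pid, proc in procs.items()
        if proc.uid >= min_uid
        and proc.name not in allowed_names
        and not has_ancestor(pid, procs, shepherd_name)
    )


if __name__ == "__main__":
    # Made-up snapshot: one scheduled job and one job started over ssh.
    table = {
        1: Proc(0, 0, "init"),
        100: Proc(1, 0, "sge_execd"),
        200: Proc(100, 0, "sge_shepherd"),
        201: Proc(200, 1001, "job.sh"),
        202: Proc(201, 1001, "simulate"),
        300: Proc(1, 0, "sshd"),
        301: Proc(300, 1002, "sshd"),
        302: Proc(301, 1002, "bash"),
        303: Proc(302, 1002, "simulate"),
    }
    print(find_rogue_pids(table, allowed_names=frozenset({"sshd", "bash"})))
```

If users should still be able to ssh in to grab tmp files, something like 
the allowed_names set would have to exempt their login shells and copy 
tools. That exemption is exactly where the arms race starts, since a job 
can be renamed to look like an allowed program.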

In general though, I think the admins who deal with this problem as a 
policy issue are overall "happier" and have a better relationship with 
the user community as well.

Just my $.02


-Chris






jerry xu wrote:

> Hi, Dear All:
>    I am managing a simple 24-node Beowulf cluster, and I require that
> all jobs run through PBS. However, some undergraduate students in our
> lab keep ssh-ing to individual nodes in the cluster and running their
> jobs there, which makes it hard for me to manage resources and track
> the status of running programs. I remember there is a way to block
> people from running jobs outside the batch system while still
> allowing them to ssh to each node to grab some tmp files. But I just
> do not remember how to do it. Can anyone give some directions?
> 



