[Beowulf] assigning cores to queues with torque

Wed Mar 10 13:20:56 PST 2010

On Mon, 8 Mar 2010 10:39:08 -0500
Glen Beane <Glen.Beane at jax.org> wrote:

> 
> 
> 
> On 3/8/10 10:14 AM, "Micha Feigin" <michf at post.tau.ac.il> wrote:
> 
> I have a small local cluster in our lab that I'm trying to set up with minimum
> hassle to support both cpu and gpu processing, where only some of the nodes have
> a gpu and those have only two GPUs for four cores.
> 
> It is currently set up using torque from ubuntu (2.3.6) with the torque-supplied
> scheduler (I set it up with maui initially, but it was a bit of a pain for such a
> small cluster, so I switched).
> 
> This cluster is used by very few people in a very controlled environment, so I
> don't really need to protect users from each other; the queues are just a
> convenience to allow remote execution.
> 
> The problem:
> 
> I want to allow gpu related jobs to run only on the gpu equipped nodes (i.e. if there are more jobs than GPUs, the extra jobs will be queued), and I want to run other jobs on all nodes with either
> 1. a priority to use the gpu equipped nodes last
> 2. or better, use only two out of four cores on the gpu equipped nodes
> 
> As far as I can tell, though, I can't map nodes or cores to queues with torque
> (i.e. cpu queue uses 2 cores on gpu1, 2 cores on gpu2, all cores on everything else;
>       gpu queue uses 2 cores on gpu1, 2 cores on gpu2).
> 
> I also can't seem to set user-defined resources, so that I could define the gpu machines as having a gpu resource and schedule according to that.
> 
> Is it possible to achieve either of these with torque, or is there any other
> simple enough queue manager that can do this (preferably with a debian package
> of some kind, to simplify maintenance)? I only manage this cluster since no one
> else knows how to, and it's supposed to take as little of my time as possible.
> I'm looking for the simplest solution to implement, not the most versatile
> one.
> 
> 
> You can define a resource "gpu" in your TORQUE nodes file:
> 
> hostname np=4 gpu
> 
> and then users can request -l nodes=1:ppn=4:gpu to get assigned a node with a gpu, but to do anything more advanced you'll need Maui or Moab. You should try the maui users mailing list or the torque users mailing list to see if anyone else has some ideas.

Thanks, almost perfect. It would have been a complete solution if there were a
way to define how many such resources there are, as there are 4 cores and 2 GPUs
per node. It's good enough for now, though, as it works perfectly when asking for
nodes=1:ppn=2 to make sure that I don't get too many GPU jobs. This is a
cluster that is used by 3 people who are cooperating at the moment, so I can
waste the extra core for now to spare the man hours needed to set up maui.
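
For reference, a rough sketch of the workaround described above. It assumes
the "gpu" property from the nodes file is combined with ppn=2 in a single
request; the hostnames gpu1, gpu2 and node3 and the script name gpu_job.sh
are made up for illustration. It writes a sample nodes file to a temporary
path rather than the real server_priv/nodes, and it only prints the qsub
command instead of submitting anything:

```shell
#!/bin/sh
# Sketch only: hostnames and the job script name are hypothetical.
# The sample nodes file goes to a temp path, not the real TORQUE config.
nodes_file=$(mktemp)
cat > "$nodes_file" <<'EOF'
gpu1 np=4 gpu
gpu2 np=4 gpu
node3 np=4
EOF

# Count the nodes whose last field is the "gpu" property.
gpu_nodes=$(awk '$NF == "gpu"' "$nodes_file" | wc -l | tr -d ' ')

# Each GPU job asks for 2 of the 4 cores on a gpu node, so at most
# 4 / 2 = 2 such jobs fit on one node, matching its 2 GPUs.
cores_per_node=4
ppn=2
jobs_per_node=$((cores_per_node / ppn))
gpu_request="nodes=1:ppn=${ppn}:gpu"

echo "gpu nodes: $gpu_nodes, GPU jobs per node: $jobs_per_node"
# Printed rather than executed, since qsub needs a running TORQUE server.
echo "qsub -l $gpu_request gpu_job.sh"

rm -f "$nodes_file"
```

The arithmetic is the whole trick: the scheduler only counts cores, so
reserving half the cores per GPU job is what keeps the number of GPU jobs on
a node from exceeding its two GPUs.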