<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: Arial; font-size: 12pt; color: #000000'><br>Micha Feigin wrote:<br><br>>The problem:<br>><br>>I want to allow gpu related jobs to run only on the gpu<div>>equiped nodes (i.e more jobs then GPUs will be queued),</div><div>>I want to run other jobs on all nodes with either:</div><div>><br>>  1. a priority to use the gpu equiped nodes last<br>>  2. or better, use only two out of four cores on the gpu equiped nodes</div><div><br></div><div>In PBS Pro you would do the following (Torque may have something</div><div>similar):</div><div><br></div><div>1.  Create a custom resource called "ngpus" in the resourcedef</div><div>      file as in:</div><div><br></div><div>        ngpus<span class="Apple-tab-span" style="white-space:pre">       </span>type=long<span class="Apple-tab-span" style="white-space:pre">   </span>flag=nh</div><div><br></div><div>2.  This resource should then be set explicitly (via qmgr) on each node that</div><div>      includes GPUs, to the number of GPUs it has:</div><div><br></div><div>       set node compute-0-5 resources_available.ncpus = 8</div><div>       set node compute-0-5 resources_available.ngpus = 2</div><div><br></div><div>      Here I have set the number of cpus per node (8) explicitly to defeat</div><div>      hyper-threading, and the number of gpus to the actual number per node (2).  On the </div><div>      other nodes (say, one named compute-0-6) you might have:</div><div><br></div><div><div>       set node compute-0-6 resources_available.ncpus = 8</div><div>       set node compute-0-6 resources_available.ngpus = 0</div><div><br></div><div>       This indicates that there are no gpus to allocate on that node.</div></div><div><br></div><div>3.  You would then use the '-l select' option in your job file as follows:</div><div><br></div><div>      #PBS  -l select=4:ncpus=2:ngpus=2</div><div><br></div><div>      This requests 4 PBS resource chunks.  
Each includes 2 cpus and 2 gpus.</div><div>      Because the resource request is "chunked", the 2 cpus and 2 gpus of each chunk</div><div>      would be placed together on one physical node.  Because you marked some </div><div>      nodes as having 2 gpus and others as having 0 gpus, only the nodes</div><div>      that have gpus will get allocated.  Because ngpus is a consumable resource, as soon</div><div>      as 2 are allocated on a node the number available there drops to 0.   In total you would have</div><div>      asked for 4 chunks distributed to 4 physical nodes (because only one of these</div><div>      chunks can fit on a single node).  This also ensures a 1:1 mapping of cpus to </div><div>      gpus, although it does nothing about tying each cpu to a different socket. You</div><div>      would probably have to do that in the script with numactl.</div><div><br></div><div>There are other ways to approach this, such as tying physical nodes to queues, which you</div><div>might wish to do to set up a dedicated slice for GPU development.  You may also</div><div>be able to do this in PBS using the vnode abstraction.  There might be some </div><div>reason to have two production routing queues that map to slightly different parts</div><div>of the system.</div><div><br></div><div>I am not sure how this could be approximated in Torque, but perhaps this will give you</div><div>some leads.</div><div><br></div><div>rbw</div><div>_______________________________________________<br>Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing<br>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf<br></div></div></body></html>