PBS

Craig Tierney ctierney at hpti.com
Wed Jun 13 10:18:37 PDT 2001


We do a similar thing here at FSL.  We have
two sets of nodes (compsingle and compdual).
They are single- and dual-CPU nodes, and the dual
nodes have faster CPUs.  We do not want users
to have to worry about which nodes they are
submitting jobs to.

We do this with a custom scheduler.  You could
do it with a pipe queue as well if you
create an alternate router and have that
process do the work.

1) Set a property on every node, either switch1
   or switch2, depending on which switch it is attached to.
   Also, add the property comp to every node.

2) Turn off the default scheduler.

      set server scheduling = False

3) Write a small script (c-shell, perl) that, on
   every iteration:

   - Takes in all available jobs in the pipe queue (qstat -a)
   - Determines the number of nodes available (pbsnodes)
   - Decides where each job is going to go, switch1 or switch2 (qstat -f <jobid>)
       You can ignore jobs at this step if there are not
       enough resources available to run the job.
   - Rewrites the job's -lnodes line (qalter)
   - Runs the job in the execution queue (qrun)
   - Sleeps
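
The decision and dispatch part of step 3 might look roughly like the sketch below. It is in Python rather than c-shell or perl, and it is only an illustration, not our actual script. The function names, the best-fit policy (pick the switch with the fewest free nodes that still fits the job), and the idea of appending the switch property to a single-chunk request such as 4:comp are all assumptions. Parsing the qstat and pbsnodes output is left out because the format depends on your PBS version.

```python
import subprocess


def pick_switch(nodes_wanted, free_by_switch):
    """Return the switch property that can hold the job, or None if none can.

    Assumed policy: best fit, i.e. the switch with the fewest free nodes
    that is still large enough, so the bigger pool stays open for big jobs.
    """
    candidates = [(free, name) for name, free in free_by_switch.items()
                  if free >= nodes_wanted]
    if not candidates:
        return None  # not enough resources: ignore the job this iteration
    return min(candidates)[1]


def rewrite_nodes_spec(spec, switch):
    """Append the switch property to a single-chunk spec: '4:comp' -> '4:comp:switch1'."""
    return f"{spec}:{switch}"


def place_job(job_id, spec, free_by_switch, run=subprocess.run):
    """Pick a switch for one queued job, then qalter and qrun it.

    `spec` is the job's nodes request as read from `qstat -f <jobid>`;
    it is assumed to start with a node count and contain no '+'.
    """
    nodes_wanted = int(spec.split(":")[0])
    switch = pick_switch(nodes_wanted, free_by_switch)
    if switch is None:
        return None
    run(["qalter", "-l", "nodes=" + rewrite_nodes_spec(spec, switch), job_id],
        check=True)
    run(["qrun", job_id], check=True)
    free_by_switch[switch] -= nodes_wanted  # keep the local count current
    return switch
```

A real script would call place_job for each job found by qstat -a, refresh the free counts from pbsnodes, and then sleep before the next pass. The user running it needs enough PBS privilege to qalter and qrun other users' jobs.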

When users submit their jobs, they request the
generic 'comp' nodes rather than a specific switch.

Since you are using node properties to specify which nodes
are being used, only one execution queue is needed.
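
For the node properties in step 1, a small helper can generate the entries so the switch assignment lives in one place. This sketch assumes the OpenPBS-style nodes file (server_priv/nodes), where each line is a hostname followed by space-separated properties; check the format for your PBS version. The hostnames and the function name are made up for illustration.

```python
def nodes_file_lines(switch_of):
    """Build one nodes-file line per host: '<host> <switch property> comp'."""
    return [f"{host} {switch} comp" for host, switch in sorted(switch_of.items())]


# Hypothetical hosts; replace with the real hostname -> switch mapping.
switch_of = {"node01": "switch1", "node02": "switch1", "node17": "switch2"}
print("\n".join(nodes_file_lines(switch_of)))
```

Every node carries both its switch property and comp, which is what lets a single execution queue serve both switches.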

Sorry you didn't get an easy answer.  If you know perl,
ripping through the qstat and pbsnodes information shouldn't
be too difficult.

Craig



On Wed, Jun 13, 2001 at 12:36:59PM -0400, Joey Raheb wrote:
> I've posted this message up a couple of times already with no response, I'll
> try one more time.  I would like to try something new with PBS.  We will soon
> be adding more computers to our cluster and therefore will require another
> switch.  Since the bottleneck in the communication exists in the switch-switch
> communication (1000 MB/s duplex) we would like to run all parallel jobs within
> the same switch since bandwidth is near theoretical max within the switch.  My
> idea was to create two execution queues named switch1 and switch2 (each would
> only send jobs to nodes on each respective switch) and one route queue which
> would direct jobs to either switch1 or switch2 depending on the number of
> nodes requested and the free number of nodes available.  I thought there would
> be an easy way to do this, but I am finding that there might not be.  Maybe
> PBS is not the correct queueing system, does anybody have any suggestions or
> ideas of how I can implement the above??
> 
> Joey
> 

-- 
Craig Tierney (ctierney at hpti.com)
phone: 303-497-3112



