[Beowulf] scheduler policy design

Wed Apr 25 04:07:16 PDT 2007

On Wed, 25 Apr 2007, Toon Knapen wrote:

> Joe Landman wrote:
>
>> If we can assign a priority to the jobs, so that "short" jobs get a higher 
>> priority than longer jobs, and jobs priority decreases monotonically with 
>> run length, and we can safely checkpoint them, and migrate them (via a 
>> virtual container) to another node, or restart them on one node ... then we 
>> have something nice from a throughput view point.
>
> right on. This is also exactly what the scheduler in the OS is doing. This 
> approach thus just needs to be extrapolated to a whole cluster.
>
> Does anyone know of any projects underway that are trying to accomplish 
> exactly this ?

I believe that Condor does all or part of it.  It certainly does the
checkpointing and migration (subject to the code being instrumented and
compiled with their checkpointing library).  Outside of that it has a
dazzling array of policy options -- I expect that you can do what is
described above, or something even better.
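
To make the policy Joe describes concrete, here is a rough sketch in
Python of one way to read it: always run the job that has accumulated
the least run time, for one time slice, then put it back in the queue.
This is ordinary least-attained-service scheduling, swapped in purely as
an illustration; it is not Condor's policy language or API. The names
Job, schedule and quantum are made up for the example, and the requeue
step only stands in for a real checkpoint and migration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    # Hypothetical job record. Only `ran` takes part in ordering, so the
    # job with the least accumulated run time has the highest priority.
    ran: float
    name: str = field(compare=False)
    remaining: float = field(compare=False)


def schedule(jobs, quantum=1.0):
    """Return job names in completion order.

    Each pass runs the least-run job for at most one quantum. A job that
    is not finished is pushed back on the heap, which stands in for
    checkpointing it and restarting it later (possibly on another node).
    """
    heap = list(jobs)
    heapq.heapify(heap)
    finished = []
    while heap:
        job = heapq.heappop(heap)
        step = min(quantum, job.remaining)
        job.ran += step          # priority drops as run time accumulates
        job.remaining -= step
        if job.remaining > 0:
            heapq.heappush(heap, job)
        else:
            finished.append(job.name)
    return finished


if __name__ == "__main__":
    jobs = [Job(0.0, "long", 5.0), Job(0.0, "medium", 3.0), Job(0.0, "short", 1.0)]
    print(schedule(jobs))
```

With the three jobs above the short job completes first and the long
job last, even though the long job was submitted first. A real cluster
scheduler would also have to charge for the cost of each checkpoint,
which this sketch ignores.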

    rgb

>
> thanks,
>
> toon
>

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu