[Beowulf] scheduler policy design
Robert G. Brown
rgb at phy.duke.edu
Wed Apr 25 04:07:16 PDT 2007
On Wed, 25 Apr 2007, Toon Knapen wrote:
> Joe Landman wrote:
>> If we can assign a priority to the jobs, so that "short" jobs get a higher
>> priority than longer jobs, and job priority decreases monotonically with
>> run length, and we can safely checkpoint them, and migrate them (via a
>> virtual container) to another node, or restart them on one node ... then we
>> have something nice from a throughput viewpoint.
> Right on. This is also exactly what the scheduler in the OS is doing. This
> approach thus just needs to be extrapolated to a whole cluster.
> Does anyone know of any projects underway that are trying to accomplish
> exactly this?
I believe that Condor does all or part of it. It certainly does the
checkpointing and migration (subject to the code being instrumented and
compiled with their checkpointing library). Outside of that it has a
dazzling array of policy options -- I expect that you can do what is
described above or something even better.
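
To make the policy Joe describes concrete, here is a toy simulation, a
minimal sketch and not anything Condor actually runs. It assumes that
checkpoint and restart are free and always safe, that time advances in
unit steps, and that priority is simply the negative of the time a job has
already run. The names Job, priority and simulate are hypothetical and
exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    work: int     # total time units the job needs (hypothetical input)
    ran: int = 0  # time units already run; survives a checkpoint

    @property
    def done(self) -> bool:
        return self.ran >= self.work


def priority(job: Job) -> int:
    # Priority decreases monotonically with accumulated run length.
    return -job.ran


def simulate(jobs, nodes):
    """Run jobs on `nodes` slots; return [(name, finish_time), ...]."""
    pending = list(jobs)
    clock = 0
    finished = []
    while pending:
        # Highest priority first. The sort is stable, so ties keep
        # their previous relative order.
        pending.sort(key=priority, reverse=True)
        running = pending[:nodes]  # everything else sits checkpointed
        clock += 1
        for job in running:
            job.ran += 1
            if job.done:
                finished.append((job.name, clock))
        pending = [j for j in pending if not j.done]
    return finished


if __name__ == "__main__":
    demo = [Job("long", 6), Job("short", 2), Job("medium", 3)]
    for name, t in simulate(demo, nodes=1):
        print(name, "finished at t =", t)
```

On a single node the short job finishes first even though it was
submitted after the long one, and the node is never idle. A real cluster
would also have to pay for each checkpoint and migration, which this
sketch ignores.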
Robert G. Brown http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567 Fax: 919-660-2525 email:rgb at phy.duke.edu