[Beowulf] scheduler policy design
reuti at staff.uni-marburg.de
Thu Apr 26 02:18:48 PDT 2007
On 25.04.2007 at 17:15, Chris Dagdigian wrote:
> Mind you, they did not invest all this effort into wrapping SGE
> just to "hide complexity" from the users or even just to get
> backfill working efficiently. By rigidly controlling the syntax of
> the job submission commands they were able to squeeze a lot of
> value out of their workflows -- simple things like having a
> consistent and 100% uniform job naming scheme made processing the
> accounting logs, debugging and troubleshooting far more efficient.
We also use wrappers for job submission, but "hiding the
complexity" is our key point: it makes life easier for the students
and scientists. This way they can spend more time on their research
and concentrate on their work, instead of trying to understand every
possible parameter they could use for their jobs. We don't forbid
writing scripts - if they like to do it and prefer it, they can do
it. If a student has just learned how to write an input file for e.g.
Gaussian, they can concentrate on this task, and can be sure that the
submitted job has all necessary parameters for the (site specific)
> Implementing this stuff tends to be site specific or workflow
> specific. There is no easy one size fits all solution. Depends on
> your apps, your execution host OS and your scheduling system (and
> many other factors).
> People have all sorts of pie in the sky impressions as to how this
> stuff "should" work but their ideas tend to smash against the hard
> reality that very few applications can currently be seamlessly
> checkpointed, suspended, restarted and migrated without error. If
> you can't easily freeze an application and transparently move it
> to another node then all the fancy academic ideas about advanced
> reservation, backfill etc. all get real inefficient real fast in
> production computing environments.
Full agreement with both statements.
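
To illustrate the kind of submission wrapper discussed above, here is a
minimal sketch in Python. It only builds the qsub command line, it does
not submit anything. The naming scheme (user_app_inputname), the
parallel environment name "smp", the default runtime and the launcher
script "run_<app>.sh" are all assumptions made up for this example and
not taken from any real site; only the qsub options -N, -cwd, -l h_rt
and -pe are standard SGE options.

```python
import re
import shlex


def job_name(user, app, input_file):
    """Return a uniform job name of the form <user>_<app>_<input stem>.

    The scheme itself is hypothetical; the point is that every job
    submitted through the wrapper is named the same way.
    """
    # Strip the directory part and the last extension of the input file.
    stem = input_file.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    raw = f"{user}_{app}_{stem}"
    # Replace anything other than letters, digits and "_" by "_".
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)


def build_qsub(user, app, input_file, slots=1, runtime="24:00:00", pe="smp"):
    """Build (but do not run) a qsub command as a list of arguments.

    "pe" and "runtime" are site-specific assumptions; "run_<app>.sh" is
    a hypothetical launcher script provided by the site.
    """
    cmd = ["qsub", "-N", job_name(user, app, input_file),
           "-cwd", "-l", f"h_rt={runtime}"]
    if slots > 1:
        # Request a parallel environment only for multi-slot jobs.
        cmd += ["-pe", pe, str(slots)]
    cmd += [f"run_{app}.sh", input_file]
    return cmd


if __name__ == "__main__":
    # Show the command the user would otherwise have to type by hand.
    print(shlex.join(build_qsub("alice", "g03", "water.com", slots=4)))
```

The user only supplies the input file and, if needed, the number of
slots; everything else is filled in by the wrapper, which is also what
keeps the job names uniform for later processing of the accounting logs.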