job queue/farm out suggestions

Robert G. Brown rgb at phy.duke.edu
Tue Jan 9 04:08:43 PST 2001


On Mon, 8 Jan 2001, Bill Comisky wrote:

> hi all,
>
> I've been using some homegrown code for quite a while now to farm out
> independent jobs (no interprocess communication required) to the nodes of
> a cluster.  I was starting to add features and improve scalability, and I
> thought I would look around to see if there was something better elsewhere
> I could use instead, maybe with some simple front end scripts.
>
> I'd like to find out what other "embarrassingly parallel" cluster users use
> to farm jobs out.  My jobs are the fitness evaluations of the population
> of a single generation of a genetic algorithm optimization.

I've actually worked on (curiously enough) parallelizing the same
problem.  There is a short article on this on brahma
(www.phy.duke.edu/brahma).  I used PVM in a master/slave programming
paradigm -- this handles (or can be made to handle) the job control
and even the load balancing, but not the partitioning or shared-usage
problems.
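
Roughly, the master/slave loop looks like this (a quick, untested
sketch -- in perl over rsh rather than the actual PVM calls, just to
show the control flow; the node list, the "fitness_eval" program and
the genome/fitness file names are all made up, and the genome files
are assumed to be visible on the nodes, e.g. via NFS):

#!/usr/bin/perl -w
# Untested sketch: keep each node busy with one fitness evaluation at
# a time and hand out the next individual as soon as a node comes free.
use strict;

my @nodes = qw(node01 node02 node03 node04);   # free nodes (made up)
my @jobs  = (0 .. 99);                         # individuals still to evaluate
my %node_of;                                   # child pid => node it is using

while (@jobs || %node_of) {
    # Dispatch work while we have both a free node and a pending individual.
    while (@nodes && @jobs) {
        my $node = shift @nodes;
        my $ind  = shift @jobs;
        my $pid  = fork;
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: run the evaluator remotely; its stdout comes back
            # over rsh and lands in a local fitness.<ind> file.
            exec "rsh $node fitness_eval genome.$ind > fitness.$ind"
                or die "exec: $!";
        }
        $node_of{$pid} = $node;
    }
    # Block until some evaluation finishes, then recycle its node.
    my $done = wait;
    last if $done == -1;
    push @nodes, delete $node_of{$done};
}
# Every fitness.* file for this batch is now in place.

With PVM proper, spawn/send/receive calls would replace the fork/rsh
plumbing and the fitnesses could come back as messages instead of
files.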

However, I've also used scripts to accomplish the same kind of work.
I'd recommend using perl instead of /bin/sh.  If the genetic algorithm
you are using is written to be incredibly coarse grained (as you say,
embarrassingly parallel, with each generation managed via a single
round of job submissions), this approach is fine.  In my case the
granularity was not so favorable and I actually had to work a bit to
get decent parallel gain (the fitness evaluation didn't necessarily
take a long time wrt the time required to partition the work, farm it
out, and collect the results to ready the next generation).
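
For the really coarse grained case the script can be almost trivial --
something like the following (again untested, same made-up names):
fire off one rsh per individual, then block until the entire
generation is back, which is also how you know that a given set of
jobs is done.

#!/usr/bin/perl -w
# Untested sketch of the one-round-per-generation approach: start every
# fitness evaluation at once (dumb round-robin over the nodes, no load
# balancing or per-node limits) and wait for the whole batch.
use strict;

my @nodes   = qw(node01 node02 node03 node04);
my $popsize = 20;
my @pids;

for my $ind (0 .. $popsize - 1) {
    my $node = $nodes[$ind % @nodes];
    my $pid  = fork;
    die "fork: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: remote evaluation, output redirected back locally.
        exec "rsh $node fitness_eval genome.$ind > fitness.$ind"
            or die "exec: $!";
    }
    push @pids, $pid;
}

# The generation is finished exactly when every child has been reaped;
# only then do you collect the fitness.* files and breed the next one.
waitpid($_, 0) for @pids;

The cost of this simplicity is the serial work at both ends of every
generation (partitioning, farming out, collecting), which is exactly
where the parallel gain goes if the evaluations are short.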

Last suggestion -- consider Mosix and/or a tool like Condor for the
load balancing and partitioning.  You'll probably have to
do some code crafting to really build in all the features, especially
node failure tolerance.  I plan to work on this myself over the course
of the spring.

    rgb

>
> Some of the features I would like are:
>
> - some way to know when a particular set of jobs (one generation of
> individuals in my case) is done
>
> - a way to partition the cluster so that one user gets nodes 1:N, and
> another user gets nodes N+1:M.
>
> - allow multiple users to share the entire cluster, while limiting the
> runs started on each node to that node's memory/processor limitations.
>
> - tolerance of node failures (node goes down) or job failures (run hangs)
>
> - an open source/free solution would be preferred
>
> - process migration might be interesting to try out, but I don't think it
> is essential for what I'm doing.
>
> Right now I use some scripts and rsh, and I redirect the program input
> through rsh and either redirect the output through rsh or write remotely
> to an NFS mounted directory.  Do the queueing systems you use require NFS?
>
> I've started looking at GNU Queue; does anyone have experience with it?  Is
> there a way to tell if a set of jobs has been completed (would you have to
> collect PIDs and check the process list continually?).
>
> MOSIX sounds interesting, but it sounds like I would still need some
> queueing routine.  Meaning, if I have 1000 jobs and 20 processors, MOSIX
> won't wait for the first 20 to finish before starting another 20,
> right?  It just tries to balance all 1000 over the cluster.
>
> thanks for any input!
> bill
>
>

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
