[Beowulf] Re: scheduler and perl
David Mathog mathog at caltech.edu
Wed Aug 2 09:51:26 PDT 2006
- Previous message: [Beowulf] Mid-summer Monkey News
- Next message: [Beowulf] new release of GAMMA and MPI/GAMMA
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
"Xu, Jerry" wrote:
>
> Hi, I am maintaining a cluster while lots user uses perl to submit tons of jobs
> which seems to me like abusing the system.

The qsub in SGE, and probably others, allows a repeat count to be set with a single qsub. So if they are using perl to qsub 1000 jobs which correspond to i=1,1000, have them use just one qsub with the repeat count method instead.

Additionally, if they are setting up thousands of jobs, each of which runs for a very short time (< one second), it will be much better to have them submit scripts that run N of those processes in a chunk within a single qsub job rather than pushing each one through the queue system separately. That is, if you have 100 nodes and there are 1000 jobs to run, they might run 20 in each of 50 jobs (or some other similar mix). There is some overhead, and typically >= 1 second of wait time, built into most queue systems, and they work better when the jobs are "long" compared to these times.

I ran into both of these issues with my parallelblast implementation. SGE just couldn't start the jobs on the nodes fast enough, so it ended up using an outer SGE wrapper to start the "mother" job, which then used PVM to start the individual jobs on each node. That's sort of an odd application though, as it had to run in a certain way on all nodes at more or less the same time.

Other than that, you may want to have a users meeting where the various types of jobs run on the system are described by the people who run them, so that some rational load sharing policy can be worked out. Not so much "who gets the most time" - which is pure politics and good luck with that. Rather: "how not to run jobs in such a way that they hog the system for no good reason, keeping others from getting work done."

Regards,

David Mathog
mathog at caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
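As a rough illustration of the two suggestions in the message (one qsub with a repeat count, and several short runs chunked into each job), here is a minimal Python sketch. It assumes the "repeat count" refers to SGE array jobs (`qsub -t 1-N`, where each task sees its index in `SGE_TASK_ID`); the script name `run_chunk.sh` and both helper functions are hypothetical, and the code only builds the command line rather than submitting anything.

```python
def chunk_ranges(n_tasks, chunk_size):
    """Split task indices 1..n_tasks into inclusive (first, last) ranges.

    Each range is meant to be processed by one queue job, so that many
    sub-second runs share a single scheduler start-up.
    """
    if n_tasks < 1 or chunk_size < 1:
        raise ValueError("n_tasks and chunk_size must be positive")
    return [
        (first, min(first + chunk_size - 1, n_tasks))
        for first in range(1, n_tasks + 1, chunk_size)
    ]


def array_job_command(n_chunks, script):
    """Build a single SGE array-job submission instead of n_chunks qsubs.

    Assumption: the cluster runs SGE, whose qsub accepts -t first-last.
    """
    return ["qsub", "-t", f"1-{n_chunks}", script]


# The example from the message: 1000 short runs, 20 per job.
ranges = chunk_ranges(1000, 20)

# run_chunk.sh is a hypothetical wrapper that would read SGE_TASK_ID
# and loop over the matching (first, last) range.
command = array_job_command(len(ranges), "run_chunk.sh")

print(len(ranges), ranges[0], ranges[-1])
print(" ".join(command))
```

With these numbers the scheduler handles one submission of 50 tasks instead of 1000 separate qsub calls, and each task runs long enough that the queue's start-up overhead is small by comparison.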
