[Beowulf] scheduler and perl

Joe Landman landman at scalableinformatics.com
Tue Aug 1 14:48:58 PDT 2006


Hi Jerry:

Xu, Jerry wrote:
> Hi, Thanks, Joe.
>  I don't mean to "ban" anything immediately; I am just curious how often
> this happens in the HPC community.
> Perl/shell is a really strong tool. One example is to use a loop to submit a huge
> number of jobs, which puts a burden on the scheduler server;

That's what the scheduler is for, though.  Some can't handle large loads 
of jobs very well.  We have had no trouble with users/customers dumping 
thousands of jobs into LSF and SGE.  Other schedulers may or may not be 
able to handle this well.  I have had conversations with some folks who 
believe that one should never have more than 50 or so jobs in queue at 
any one time.  I don't agree with that, but they indicated that their 
queuing system breaks if they try.
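
For a scheduler that really does fall over under load, the submission loop 
can throttle itself instead of dumping everything at once.  A minimal 
sketch, assuming hypothetical submit() and count_queued() callables that 
wrap whatever your scheduler's submit and status commands are (the names 
and the 30 second poll interval are illustrative, not from any scheduler):

```python
import time


def submit_throttled(jobs, max_queued, submit, count_queued,
                     poll_seconds=30, sleep=time.sleep):
    """Submit every job, never letting more than max_queued sit in queue.

    submit(job) and count_queued() are hypothetical wrappers around the
    scheduler's own commands; they are passed in so nothing here depends
    on a particular scheduler.
    """
    submitted = 0
    for job in jobs:
        # Wait until the queue has room before adding another job.
        while count_queued() >= max_queued:
            sleep(poll_seconds)
        submit(job)
        submitted += 1
    return submitted
```

With a scheduler that copes with thousands of queued jobs, max_queued can 
simply be set very high, so the loop never waits.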

> another example is to have
> one job sit idle, frequently use system calls to check job status, and
> resubmit jobs again and again; 

Depends upon whether or not it runs in a scheduler bubble.  That is, 
under your scheduler, do your node allocations.  Then have your server 
thread handle distribution to client threads on the allocated nodes. 
This is fine if they implement it well.  mpiBLAST is a variant of this 
using MPI and an internal scheduler.  You can run it nicely in an 
existing larger resource manager.
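
To make the "server thread hands work to client threads" idea concrete, 
here is a rough sketch, not how mpiBLAST does it.  It assumes the 
scheduler has already allocated the nodes and that a hypothetical 
run_on_node(node, task) callable knows how to run one task on one of 
them; both names are made up for illustration.

```python
import queue
import threading


def distribute(tasks, nodes, run_on_node):
    """Hand each task to whichever allocated node is free next.

    nodes is the list the scheduler allocated to this job; run_on_node
    is a hypothetical callable that executes one task on one node.
    Returns {task: (node, result)}.
    """
    pending = queue.Queue()
    for task in tasks:
        pending.put(task)

    results = {}
    lock = threading.Lock()

    def client(node):
        # One client thread per allocated node pulls work until none is left.
        while True:
            try:
                task = pending.get_nowait()
            except queue.Empty:
                return
            outcome = run_on_node(node, task)
            with lock:
                results[task] = (node, outcome)

    workers = [threading.Thread(target=client, args=(node,)) for node in nodes]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

The point is that the scheduler sees one job holding a fixed allocation, 
and all the fine-grained dispatching stays inside that allocation.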

> a third example is to use system calls and ssh
> to each node to run stuff, bypassing the scheduler... 

Ok, this one isn't good.  You should see if they can be persuaded to 
work within the job scheduler via some method.  Otherwise it can be painful.

> It just drives me crazy
> sometimes. 
> 

Understood.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615


