PBS Scheduler

Ivan Oleynik oleynik at chuma.cas.usf.edu
Sun Sep 29 18:04:37 PDT 2002


Joseph,

Thanks very much for your reply. I made the following changes to the
xinetd.conf file:

instances 200 (original value: 60)
cps 50 30 (original values: 25 30)
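
To double-check that both settings were actually picked up, the defaults block can be read back with a short script. This is only a minimal sketch: the sample text is a hypothetical excerpt, it assumes the usual `attribute = value` syntax inside a `defaults { ... }` block, and `read_defaults` is an illustrative helper name, not part of xinetd or PBS.

```python
import re

# Hypothetical excerpt of /etc/xinetd.conf, assuming the usual
# "attribute = value" syntax inside a defaults block.
SAMPLE_CONF = """\
defaults
{
    instances = 200
    cps       = 50 30
}
"""

def read_defaults(text):
    """Return the attributes of the first defaults { ... } block as a dict."""
    match = re.search(r"defaults\s*\{(.*?)\}", text, re.DOTALL)
    if match is None:
        return {}
    settings = {}
    for line in match.group(1).splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings

settings = read_defaults(SAMPLE_CONF)
# cps takes two numbers: the connections per second allowed, then the
# number of seconds the service stays disabled once that rate is exceeded.
cps_rate, cps_wait = (int(n) for n in settings["cps"].split())
print("instances:", settings["instances"])
print("cps rate:", cps_rate, "wait:", cps_wait)
```

To check a real file, replace `SAMPLE_CONF` with the contents of the file on the head node.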

The first run of my test went well, with no problems reported by the
PBS scheduler. But during the second run the same problem appeared:
pbs_scheduler went down.

Do you think that the above parameters are not large enough to cope
with the NFS traffic from 36 processors simultaneously writing 300 MB
per processor?
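
For a sense of scale, the aggregate volume behind that question can be worked out directly. A rough sketch, assuming the per-processor file size quoted in the thread is in megabytes and that all 36 writers are active at once:

```python
# Figures from the thread; the file size is assumed to be in megabytes.
writers = 36            # processors writing at the same time
mb_per_writer = 300     # approximate file size per processor

total_mb = writers * mb_per_writer
print(f"aggregate write volume: {total_mb} MB (~{total_mb / 1024:.1f} GB)")
```

This only sizes the data volume; it says nothing about how many connections per second reach the head node, which is what the `instances` and `cps` limits govern.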

Ivan 

------------------------------------------------------------------------

On 27 Sep 2002, Joseph Landman wrote:

> On Fri, 2002-09-27 at 07:27, Ivan Oleynik wrote:
> > Hi,
> > 
> > I have a problem with the PBS scheduler: every time I run an
> > IO-intensive series of jobs, it goes down. As a result, the whole
> > PBS queue, including other jobs, becomes suspended.
> > 
> > I could not see any useful info in the sched_logs and server_logs
> > files except for uninformative messages:
> > 
> > 0001;PBS_Server;Svr;PBS_Server;Connection refused (111) in contact_sched,Could not contact Scheduler
> 
> This is actually quite informative.  What I have experienced in the past
> with PBS and heavy NFS loads is that the cluster head node runs out of
> tcp/udp slots as specified in the /etc/inetd.conf or /etc/xinetd.conf
> files.  Depending upon which one you use, you will need to bump those
> limits up a bit. 
> 
> > For this particular test I ran a bunch of mpich jobs requesting just 1
> > processor per job, and the number of submitted jobs was 6 times the
> > number of available nodes. Each job does intensive IO via NFS running
> > over Myrinet (writing files of ~300 MB each).
> 
> [...]
> 
> -- 
> Joseph Landman, Ph.D
> Scalable Informatics LLC
> email: landman at scalableinformatics.com
>   web: http://scalableinformatics.com
> phone: +1 734 612 4615
> 
> 
------------------------------------------------------------------------
Ivan I. Oleynik                       E-mail : oleynik at chuma.cas.usf.edu
Department of Physics
University of South Florida
4202 East Fowler Avenue                  Tel : (813) 974-8186
Tampa, Florida 33620-5700                Fax : (813) 974-5813
------------------------------------------------------------------------



