[Beowulf] Re: Grid Engine, Parallel Environment, Scheduling, Myrinet, and MPICH

William Burke wburke999 at msn.com
Wed Mar 23 12:42:39 PST 2005


I can't get a PE (parallel environment) to work on a 50-node Class II Beowulf. It has a front-end
Sunfire v40 (qmaster host) and 49 Sunfire v20s (execution hosts) running
Linux, configured to communicate over Myrinet using MPICH-GM version
1.26.14a.

 

These are the job classes the N1GE environment has to handle (a sketch of
how the stages are chained follows the list):

1.	Serial pre-processing jobs - average runtime 15 minutes.
2.	Parallel processing jobs fed by the pre-processing output - runtimes
of 1-6 hours.
3.	Serial post-processing jobs that run concurrently.
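For concreteness, this is roughly how the stages are chained with N1GE job
dependencies (the script and job names here are placeholders for
illustration, not my production names):

    # stage 1: serial pre-processing
    qsub -N pre preproc.sh
    # stage 2: parallel stage, held until the pre-processing job finishes
    qsub -N par -hold_jid pre -pe mpich-gm 1-4 parproc.sh
    # stage 3: serial post-processing, held until the parallel stage finishes
    qsub -N post -hold_jid par postproc.sh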

I have set up a PE called mpich-gm and a straightforward
FIFO scheduling scheme for testing. When I submit parallel jobs they hang
in the 'qw' (queued, waiting) state and are never scheduled. I am not sure
why the scheduler does not see the jobs I submit.
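For reference, the PE definition looks essentially like this (qconf -sp
mpich-gm output, partly from memory; the slot count is illustrative and the
start/stop script paths are simply where the template scripts live on my
install):

    pe_name            mpich-gm
    slots              196
    user_lists         NONE
    xuser_lists        NONE
    start_proc_args    /WEMS/grid/default/mpi/myrinet/startmpi.sh -catch_rsh $pe_hostfile
    stop_proc_args     /WEMS/grid/default/mpi/myrinet/stopmpi.sh
    allocation_rule    $fill_up
    control_slaves     TRUE
    job_is_first_task  FALSE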

 

I used the Myrinet MPICH template located in the
$SGE_ROOT/<sge_cell>/mpi/myrinet directory to configure the PE, and I
copied the sge_mpirun script to the $SGE_ROOT/<sge_cell>/bin directory. I
configured a Production.q queue that runs only parallel jobs. As a last
sanity check I ran a trace on the scheduler, submitted a simple parallel
job, and these are the results I got from the logs:

 

 

JOB RUN Window

[wems@wems examples]$ qsub -now y -pe mpich-gm 1-4 -b y hello++

Your job 277 ("hello++") has been submitted.

Waiting for immediate job to be scheduled.

 

Your qsub request could not be scheduled, try again later.

[wems@wems examples]$ qsub -pe mpich-gm 1-4 -b y hello++

Your job 278 ("hello++") has been submitted.

[wems@wems examples]$ qsub -pe mpich-gm 1-4 -b y hello++

Your job 279 ("hello++") has been submitted.

 

This is the second window, the SCHEDULER LOG:

[root@wems bin]# qconf -tsm

[root@wems bin]# qconf -tsm

[root@wems bin]# cat /WEMS/grid/default/common/schedd_runlog

Wed Mar 23 06:08:55 2005|-------------START-SCHEDULER-RUN-------------
Wed Mar 23 06:08:55 2005|queue instance "all.q@wems10.grid.wni.com" dropped because it is temporarily not available
Wed Mar 23 06:08:55 2005|queue instance "Production.q@wems10.grid.wni.com" dropped because it is temporarily not available
Wed Mar 23 06:08:55 2005|queues dropped because they are temporarily not available: all.q@wems10.grid.wni.com Production.q@wems10.grid.wni.com
Wed Mar 23 06:08:55 2005|no pending jobs to perform scheduling on
Wed Mar 23 06:08:55 2005|--------------STOP-SCHEDULER-RUN-------------
Wed Mar 23 06:11:37 2005|-------------START-SCHEDULER-RUN-------------
Wed Mar 23 06:11:37 2005|queue instance "all.q@wems10.grid.wni.com" dropped because it is temporarily not available
Wed Mar 23 06:11:37 2005|queue instance "Production.q@wems10.grid.wni.com" dropped because it is temporarily not available
Wed Mar 23 06:11:37 2005|queues dropped because they are temporarily not available: all.q@wems10.grid.wni.com Production.q@wems10.grid.wni.com
Wed Mar 23 06:11:37 2005|no pending jobs to perform scheduling on
Wed Mar 23 06:11:37 2005|--------------STOP-SCHEDULER-RUN-------------

[root@wems bin]# qstat

job-ID prior   name       user         state submit/start at     queue      slots ja-task-ID
---------------------------------------------------------------------------------------------
   279 0.55500 hello++    wems         qw    03/23/2005 06:11:43                1

[root@wems bin]#

 

BTW, the node wems10.grid.wni.com has known connectivity issues, and I have
not yet removed it from the cluster queues.
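In the meantime, the queue instances on that host can at least be disabled
so the scheduler stops considering them, along these lines:

    qmod -d all.q@wems10.grid.wni.com Production.q@wems10.grid.wni.com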

 

What causes N1GE to report "no pending jobs to perform scheduling on" in
the schedd_runlog even though there are free slots ready to take jobs?
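In case it points at something, these are the standard ways I know of to
make the scheduler state its reasons:

    # print the "scheduling info" lines for the pending job
    # (needs schedd_job_info set to true in the scheduler config, qconf -msconf)
    qstat -j 279

    # verify-only submission: reports whether the request could be scheduled at all
    qsub -w v -pe mpich-gm 1-4 -b y hello++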

I had no problem submitting serial jobs; only the parallel jobs end up
stuck like this. Are there N1GE/Myrinet issues that I am not aware of? FYI,
the same binary (hello++) runs with no problems from the command line.

Since I generally run scripts from qsub instead of binaries, I also created
a script to run the MPICH executable, but that yielded the same result.
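Roughly, the wrapper looks like this (names are illustrative; it assumes
the PE's startmpi.sh writes the machine file to $TMPDIR/machines and that
N1GE exports $NSLOTS, as the stock MPI template does):

    #!/bin/sh
    #$ -N hello_gm
    #$ -cwd
    #$ -pe mpich-gm 1-4
    # mpirun here is MPICH-GM's; slot count and machine file come from the PE
    mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./hello++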

 

I have an additional question about the queue.conf parameter
"subordinate_list". How should its value, as displayed by qconf -mq
<queue_name>, be read?

Example:

            subordinate_list     low_pri.q=5,small.q

 

Which queue has priority over the other based on the slots?
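My current reading, which I would like confirmed: the queue that carries
the subordinate_list is the superordinate one, so if the line above appears
in Production.q, then low_pri.q is suspended once 5 slots of Production.q
are occupied on a host, and small.q (with no threshold given) is suspended
only when Production.q is completely full there. Is that right?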

 

 

William Burke

Tellitec Solutions


