[Beowulf] [EXTERNAL] Re: PBS question

Ellis H. Wilson III ellis at ellisv3.com
Tue Oct 29 14:26:03 PDT 2019


On 10/29/19 4:49 PM, Lux, Jim (US 337K) via Beowulf wrote:
> True, there’s tons of info in qstat -f, however, doesn’t qstat stop 
> showing my job after it completes, though? Maybe there’s a switch that 
> retrieves “last data”?

Hi Jim,

I think you're looking for tracejob.  Without sufficient permissions you 
won't be able to read the accounting logs, but you should still get the 
info you need from the other logs it queries.

Here's real usage of it, albeit snipped extensively.  It shows memory 
and CPU usage at the end, though it won't say how many cores you used. 
IMHO that's something you design for.  If you find CPU time to be way 
lower than runtime, and your code scales out to the number of cores 
available, you can request fewer cores until your CPU time roughly 
approximates your runtime.
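
To make that comparison concrete, here is a rough Python sketch that pulls 
resources_used.cput and resources_used.walltime out of a tracejob record 
and reports CPU time as a fraction of walltime times cores.  The helper 
names (hms_to_seconds, cpu_utilization) and the cores parameter are my 
own, not anything Torque provides, and it assumes both times are printed 
as HH:MM:SS like in the trace in this message.

```python
import re


def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def cpu_utilization(record: str, cores: int = 1) -> float:
    """Return cput / (walltime * cores) for one tracejob record.

    `cores` is supplied by the caller because the resources_used
    fields do not say how many cores the job actually used.
    """
    # Collect every resources_used.<name>=<value> pair in the record.
    fields = dict(re.findall(r"resources_used\.(\w+)=(\S+)", record))
    cput = hms_to_seconds(fields["cput"])
    walltime = hms_to_seconds(fields["walltime"])
    return cput / (walltime * cores)


# Values copied from the Exit_status record of job 2100762.
record = (
    "Exit_status=0 resources_used.cput=00:00:11 "
    "resources_used.mem=1092436kb resources_used.vmem=2817552kb "
    "resources_used.walltime=00:01:14"
)

print(f"{cpu_utilization(record, cores=1):.2f}")  # 11 s / 74 s -> 0.15
```

A ratio well under 1.0 for the core count you requested suggests the job 
could get by with fewer cores, or that it spends most of its time waiting 
on something other than the CPU.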

ellisw at snip ~ $ sudo tracejob -n1 2100762.snip.panasas.com
/var/spool/torque/mom_logs/20191029: No matching job records located
/var/spool/torque/sched_logs/20191029: No such file or directory

Job: 2100762.snip.panasas.com

10/29/2019 16:33:32  S    enqueuing into route, state 1 hop 1
10/29/2019 16:33:32  S    dequeuing from route, state QUEUED
10/29/2019 16:33:32  S    enqueuing into eng, state 1 hop 1
10/29/2019 16:33:32  S    Job Queued at request of 
snip at snip.panasas.com, owner = snip at snip.panasas.com, job name = 
pr_one_run, queue = eng
10/29/2019 16:33:32  A    queue=route
10/29/2019 16:33:32  A    queue=eng
10/29/2019 17:16:03  S    Job Run at request of root at snip.panasas.com
10/29/2019 17:16:06  S    Not sending email: job requested no e-mail
10/29/2019 17:16:06  A    user=snip group=users jobname=pr_one_run
queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212
start=1572383766 owner=snip at snip.panasas.com exec_host=snip/0
Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.nodect=1
Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.walltime=02:00:00
10/29/2019 17:17:17  S    Not sending email: job requested no e-mail
10/29/2019 17:17:17  S    Exit_status=0 resources_used.cput=00:00:11 
resources_used.mem=1092436kb resources_used.vmem=2817552kb 
resources_used.walltime=00:01:14
10/29/2019 17:17:17  A    user=snip group=users jobname=pr_one_run
queue=eng ctime=1572381212 qtime=1572381212 etime=1572381212
start=1572383766 owner=snip at snip.panasas.com exec_host=snip/0
Resource_List.neednodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.nodect=1
Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr
Resource_List.walltime=02:00:00 session=21205 end=1572383837
Exit_status=0 resources_used.cput=00:00:11
resources_used.mem=1092436kb resources_used.vmem=2817552kb
resources_used.walltime=00:01:14
10/29/2019 17:17:18  S    dequeuing from eng, state COMPLETE
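
The accounting ("A") records above also carry epoch timestamps, so you 
can work out how long the job sat in the queue and how long it ran.  This 
is only a sketch: parse_accounting is a name I made up, and it simply 
splits on whitespace and takes whatever follows the first "=" in each 
token, which is an assumption about the record layout rather than a 
documented format.

```python
def parse_accounting(record: str) -> dict:
    """Split the key=value tokens of an accounting record into a dict."""
    fields = {}
    for token in record.split():
        if "=" in token:
            # Split on the first "=" only, so values such as
            # "1:freebsd_104_amd64:ppn=1:pfsr" stay in one piece.
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


# Fields copied from the final "A" record of job 2100762.
record = (
    "user=snip group=users jobname=pr_one_run queue=eng "
    "ctime=1572381212 qtime=1572381212 etime=1572381212 "
    "start=1572383766 "
    "Resource_List.nodes=1:freebsd_104_amd64:ppn=1:pfsr "
    "Resource_List.walltime=02:00:00 session=21205 end=1572383837 "
    "Exit_status=0"
)

fields = parse_accounting(record)
queue_wait = int(fields["start"]) - int(fields["qtime"])
run_time = int(fields["end"]) - int(fields["start"])

print(f"queued {queue_wait} s, ran {run_time} s")  # queued 2554 s, ran 71 s
```

The 2554 seconds of queue wait matches the gap between the 16:33:32 and 
17:16:06 entries in the trace.  Note that end minus start (71 s) is a few 
seconds off from resources_used.walltime (00:01:14) for this job, so 
treat the two as approximately, not exactly, the same measurement.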

Best,

ellis

-- 
Ellis H. Wilson III, Ph.D.
      www.ellisv3.com

