[Beowulf] Visualization toolkit to monitor scheduler performance

Mark Hahn hahn at mcmaster.ca
Wed Feb 17 10:52:20 PST 2010


> http://www.msi.umn.edu/~bropers/calhoun_december.png

we've done this kind of color-job band before, and found that 
it was difficult to read.  another approach is to show jobs 
as logical blocks, rather than cpus mapped directly to y-axis:

https://www.sharcnet.ca/dynamic_images/clusterJobsPlot.saw.png

admittedly, that's not terribly pretty.  and MPI implementations
that busy-wait make the %cpu report less useful than it might be.
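
to make the "logical blocks" idea concrete, here is a minimal sketch, not the 
code behind the plot above.  it assumes each job is a rectangle (width = 
runtime, height = cpu count) dropped into the lowest free vertical slot.  the 
Job fields and the layout_blocks name are hypothetical, chosen for illustration:

```python
from typing import Dict, List, NamedTuple, Tuple


class Job(NamedTuple):
    # hypothetical record layout, not taken from any scheduler's log format
    job_id: str
    start: float  # seconds since some epoch
    end: float
    ncpus: int


def layout_blocks(jobs: List[Job]) -> Dict[str, int]:
    """Return a y offset per job so that no two rectangles overlap.

    Each job occupies [start, end) on the x-axis and [y, y + ncpus)
    on the y-axis.  Jobs are placed first-fit in order of start time.
    """
    placed: List[Tuple[Job, int]] = []
    offsets: Dict[str, int] = {}
    for job in sorted(jobs, key=lambda j: (j.start, j.job_id)):
        # vertical spans already taken by jobs that overlap this one in time
        busy = sorted(
            (y, y + other.ncpus)
            for other, y in placed
            if other.start < job.end and job.start < other.end
        )
        y = 0
        for lo, hi in busy:
            if y + job.ncpus <= lo:
                break  # the gap below this span is tall enough
            y = max(y, hi)
        placed.append((job, y))
        offsets[job.job_id] = y
    return offsets
```

the offsets can then be handed to any plotting library as rectangle 
coordinates.  the y position no longer says which physical cpus a job ran on, 
which is the point: one job reads as one block.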

> We run torque with Moab and this is a result of parsing the torque
> logs. We are still going through and validating the code and adding

we run LSF, a home-grown scheduler, and Maui on ~21 clusters,
and feed job data into a central DB which permanently records all
history.  graphs like the one above (and others that show various usage
metrics by user/group/cluster/jobsize/jobtype) are derived from the DB.
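
as a rough illustration of deriving a usage metric from such a DB, here is a 
small sketch using sqlite3 from the python standard library.  the table name, 
column names and sample rows are all assumptions made up for the example; they 
are not the actual schema or data:

```python
import sqlite3

# columns a caller may group by; checked before being placed in the SQL text
GROUPABLE = {"username", "groupname", "cluster"}


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory DB with a hypothetical jobs table and fake rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (username TEXT, groupname TEXT, cluster TEXT,"
        " ncpus INTEGER, start_time INTEGER, end_time INTEGER)"
    )
    conn.executemany(
        "INSERT INTO jobs VALUES (?, ?, ?, ?, ?, ?)",
        [
            # made-up sample data, times in seconds
            ("alice", "physics", "saw", 8, 0, 3600),
            ("alice", "physics", "other", 4, 0, 7200),
            ("bob", "chem", "saw", 2, 0, 1800),
        ],
    )
    return conn


def cpu_hours_by(conn: sqlite3.Connection, column: str) -> dict:
    """Sum ncpus * runtime, in cpu-hours, grouped by one column."""
    if column not in GROUPABLE:
        raise ValueError(f"cannot group by {column!r}")
    query = (
        f"SELECT {column}, SUM(ncpus * (end_time - start_time)) / 3600.0"
        f" FROM jobs GROUP BY {column} ORDER BY {column}"
    )
    return dict(conn.execute(query).fetchall())
```

calling cpu_hours_by(make_demo_db(), "cluster") gives cpu-hours per cluster; 
the same query shape works for per-user or per-group totals.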

-mark hahn.


