[Beowulf] Cluster Metrics? (Upper management view)
Bill.Rankin at sas.com
Mon Aug 23 06:57:01 PDT 2010
Michael Di Domenico wrote:
> I think measuring a clusters success based on the number of jobs run
> or cpu's used is a bad measure of true success. I would be more
> inclined to consider a cluster a success by speaking with the people
> who use it and find out not only whether they can use it effectively
> and/or what new science having cluster is being enabled by them.
Bingo. In a former life I was director of a fairly large academic cluster facility. One of the things I always dreaded was writing up the monthly and quarterly (and annual) reports. These reports were basically used to justify our existence to the upper university administration. Here are some of the things I included:
First section was a summary of the computational usage of the cluster, broken down by research group. In our case this data was dumped from the SGE report system. Include monthly usage and year to date. If you haven't already, break down your users into groups based upon research area and/or application. Use Unix group IDs to identify them and create access groups within SGE or whatever workload manager you are using. A lot of this can be scripted and run under cron.
Put your pretty graphs in this section. :-) Include a summary/analysis section where you explain the data.
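As a rough sketch of the kind of script I mean, something like the following would total usage per Unix group from a colon-separated accounting dump. The field positions (group in field 2, wallclock seconds in field 13, slots in field 34, counting from zero) are assumptions about the SGE accounting format, so check them against accounting(5) on your own installation. The function names are made up for illustration.

```python
from collections import defaultdict

# Assumed 0-based field positions in a colon-separated accounting record.
# Verify these against the accounting(5) man page for your installation.
GROUP_FIELD = 2
WALLCLOCK_FIELD = 13
SLOTS_FIELD = 34


def core_hours_by_group(lines):
    """Sum wallclock * slots per Unix group, returned in core-hours."""
    totals = defaultdict(float)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(":")
        if len(fields) <= SLOTS_FIELD:
            continue  # skip short or malformed records
        try:
            seconds = float(fields[WALLCLOCK_FIELD])
            slots = int(fields[SLOTS_FIELD])
        except ValueError:
            continue  # skip records with non-numeric usage fields
        totals[fields[GROUP_FIELD]] += seconds * slots / 3600.0
    return dict(totals)


def format_report(totals):
    """Render a plain-text table, largest consumer first."""
    rows = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    out = ["%-16s %12s" % ("group", "core-hours")]
    for group, hours in rows:
        out.append("%-16s %12.1f" % (group, hours))
    return "\n".join(out)
```

Feed it the open accounting file (for example `format_report(core_hours_by_group(open(path)))`, where `path` is wherever your site keeps the accounting data) and run it from cron at the start of each month. The same totals can then drive the graphs.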
Second section was an overview of any new research groups that had started using the cluster. In the annual report this section covered all the research groups that had used the cluster. Here is where you want to make the case that the cluster is an important part of your organization's research infrastructure. Include budget/grant amounts for the new groups. List the PIs and their CVs as well as any new applications that you are supporting.
Third section was a summary of any cluster administration issues. Include outages (past and future), hardware/software installs and updates and any other issues.
Finally, the last section covered future growth. I included any meetings or presentations we had done, potentially new research groups we were talking to, and any new hardware or software we were procuring.
The quarterly and annual reports were basically concatenations of the monthlies (literally cut-n-paste). The annual report also tended to include other things like budgets, but that was a separate process.
As others have mentioned, the contents of the report really depend on your organization and what you are trying to show as well as your target audience. In my case, the reader list for the monthlies was fairly limited, with the quarterly and annual reports being more widely distributed. So the former tended to be short and terse while the latter were more detailed and complete.
Last piece of advice - for the raw data, script as much as you can. You'll be doing this often so it's worth the investment to automate. Also, don't do as I often did and leave all this until the last few days before it is due. Collect the information throughout the month and then it's just a matter of an afternoon's worth of editing rather than scrambling around to get, for example, all the PIs' CVs and grant information at the last moment. (1/2 :-)