<div dir="ltr">The <i>annus mirabilis</i> "exceeds expectations", that's beautiful :-)<div>Peter</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Nov 26, 2013 at 9:15 AM, Lux, Jim (337C) <span dir="ltr"><<a href="mailto:james.p.lux@jpl.nasa.gov" target="_blank">james.p.lux@jpl.nasa.gov</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5"><br>

On 11/25/13 4:11 PM, "Adam DeConinck" <<a href="mailto:ajdecon@ajdecon.org">ajdecon@ajdecon.org</a>> wrote:<br>

<br>


><br>

>> 4. I went to a BoF on ROI on HPC investment. All the presentations in<br>

>> the BoF frustrated me. Not because they were poorly done, but because<br>

>> they tried to measure the value of a cluster by number of papers<br>

>> published that used that HPC resource. I think that's a crappy, crappy<br>

>> metric, but haven't been able to come up with a better one myself yet. I<br>

>> was very vocal with my comments and criticisms of the presentations, so<br>

>> if any of the presenters are reading this now, I apologize for<br>

>> hi-jacking your BoF. Getting good ROI on a cluster is close to my heart,<br>

>> but is also difficult to quantify and measure. I hope I can be part of<br>

>> the discussion next year.<br>

>><br>

><br>

>Do you have any thoughts you can share on what alternative metrics<br>

>might look like, even if you can't think of one that's clearly better?<br>

><br>

>I have no horse in this race as I've been doing industry HPC for the<br>

>past few years, but I'm curious what good metrics for ROI on an academic<br>

>or lab cluster might be. Total number of papers? Number of<br>

>citations after an N-year time window? [shrug]<br>

><br>

>ROI measurement can sometimes be difficult even in an industrial or<br>

>commercial setting, especially if the HPC resource is used for R&D or<br>

>"engineering support" as opposed to something that feeds directly into<br>

>the product.<br>

><br>

>Cheers,<br>

>Adam<br>

<br>

</div></div>Definitely a challenge.<br>

<br>

Maybe we have webcams that look at all the users and we calculate<br>

percentage of time smiling while interacting with the cluster?<br>

<br>

ROI for "technology development" is a tough thing to calculate.  ROI, by<br>

its nature, is "money returned for money spent", and the return is<br>

somewhat intangible.<br>

<br>

All of these metrics require having a baseline so you can do a<br>

before/after comparison.  And realistically, there needs to be a fairly<br>

long averaging time on the metric.  Here's the annual paper output of a<br>

noted physicist.<br>

<br>

1901  1<br>

1902  2<br>

1903  1<br>

1904  1<br>

1905 25<br>

1906  6<br>

1907  8<br>

1908  4<br>

1909  5<br>

1910  6<br>

1911  8<br>

<br>

How would you evaluate the ROI of feeding him?  Started kind of slow, had<br>

a really good year, and likely received an "exceeds expectations" annual<br>

review. But that set a new bar, and now his supervisor is going to be<br>

hammering him. Dude, your output is slacking off, I think we need to put<br>

you on a performance improvement plan, and this year, you're going to be<br>

"does not meet" in your review.<br>

<br>

<br>

The other problem is that paper publishing (and the schedule thereof) is<br>

influenced by things other than availability of computational resources,<br>

so you need a very large sample so those influences average out.  For<br>

instance, lack of funds or permission to travel to a conference, combined<br>

with the recent fad of "you must present in person" will have an effect.<br>

The sequester and/or furlough will almost certainly manifest itself in any<br>

sort of time series counting publications.<br>

<br>

The other thing is that there is a long gestation period for some work.<br>

You might not have something "publishable", especially with the bias<br>

against publishing null or negative results. That doesn't mean that the<br>

HPC work wasn't useful, if it found a bunch of "ways not to go".<br>

<br>

There might also be an issue with the availability of workforce to grind out<br>

the papers.  At least at JPL, relatively few people work on a single job or<br>

task.  A more typical scenario is having 2 or 3 projects you work on<br>

simultaneously, along with half a dozen things you support. There is a<br>

tendency to spend one's time on the latest thing to go wrong, and in a "do<br>

more with less" environment, there's not a lot of down time in which to<br>

catch up.<br>

<br>

In an environment where short term results are more important (or, at<br>

least have more "gain" in the control loop) it's tough to push "getting<br>

published" higher up the priority list, since the personal ill effects of<br>

not publishing may be years down the road, compared to immediate ill<br>

effects of "the project will be cancelled if we don't make the deliverable<br>

this month".<br>

<br>

At JPL, it is easy to tell in which organizations the "papers published"<br>

metric is important in annual ranking and review: they're the ones with<br>

lots of papers.  That's not to say that other organizations don't do lots<br>

of publishable work, but if your annual review depends on something<br>

*other* than the metric, you're not going to spend your time publishing.<br>

<br>

Export controls also rear their ugly head.  A lot of interesting problems<br>

that can be attacked by HPC are the practical ones. But in a number of<br>

industries, once you move beyond pure research and theory (TRL 3), you get<br>

into an area where it is either competition sensitive or export<br>

controlled.    It is true that you could write a paper that is suitably<br>

expurgated and sanitized, but you still have to go through the export<br>

control/public release review process. And that's time-consuming too.<br>

<br>

<br>

The whole competition sensitive/proprietary rights/export controls aspect<br>

might be why some of you have commented on the gulf between what's<br>

presented on the show floor and what's presented in the talks.<br>

<div class="HOEnZb"><div class="h5"><br>


</div></div></blockquote></div><br></div>