[Beowulf] Re: failure trends in a large disk drive population
justin at cs.duke.edu
Wed Feb 21 15:50:41 PST 2007
>> How did they look for predictive models on the SMART data? It sounds
>> like they did a fairly linear data decomposition, looking for first
>> order correlations. Did they try to e.g. build a neural network on it,
>> or use fully multivariate methods (ordinary stats can handle it up to
>> 5-10 variables).
>> This is really an extension of David's questions below. It would be
>> very interesting to add variables to the problem (if possible) until the
>> observed correlations resolve (in sufficiently high dimensionality) into
>> something significantly predictive. That would be VERY useful.
> RGB, good idea, apply clustering/GA/MOGA analysis techniques to all of
> this data. Now the question is, will we ever get access to this data?
As mentioned in an earlier e-mail (I think) there were 4 SMART variables
whose values were strongly correlated with failure, and another 4-6 that
were weakly correlated with failure. However, of all the disks that
failed, less than half (around 45%) had ANY of the "strong" signals and
another 25% had some of the "weak" signals. This means that over a
third of disks that failed gave no appreciable warning. Therefore even
combining the variables would give no better than a 70% chance of
predicting failure.
To make things worse, many of the "weak" signals were found on a
significant number of disks. For example, among the disks that failed,
many had a large number of seek errors; however, over 70% of disks in the
fleet -- failed and working -- had a large number of seek errors.
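
A rough sketch of the arithmetic above, in Python. The 45%, 25%, and 70% figures are the approximate ones quoted in this message; the overall failure rate (`p_fail`) and the share of failed disks showing many seek errors (`p_signal_given_fail`) are not stated here and are purely hypothetical placeholders, so the printed numbers are illustrative only.

```python
def signal_coverage(strong_frac, weak_only_frac):
    """Best-case fraction of failed disks that showed any signal at all."""
    return strong_frac + weak_only_frac


def positive_predictive_value(p_signal_given_fail, p_signal_overall, p_fail):
    """Bayes' rule: P(fail | signal) = P(signal | fail) * P(fail) / P(signal)."""
    return p_signal_given_fail * p_fail / p_signal_overall


# Figures quoted in this message (approximate).
coverage = signal_coverage(strong_frac=0.45, weak_only_frac=0.25)
print(f"failures with any warning signal: at most {coverage:.0%}")

# HYPOTHETICAL inputs, not from the study: a 5% failure rate, and 90% of
# failed disks showing many seek errors. Only the 70% fleet-wide figure
# comes from the message.
p_fail = 0.05
ppv = positive_predictive_value(
    p_signal_given_fail=0.90,
    p_signal_overall=0.70,
    p_fail=p_fail,
)
print(f"P(fail | many seek errors): {ppv:.1%}")
print(f"lift over the base failure rate: {ppv / p_fail:.2f}x")
```

Whatever value is assumed for the share of failed disks with the signal, the lift over the base rate cannot exceed 1 / 0.70, about 1.43, because the signal appears on 70% of the whole fleet.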
About all I can say beyond what's in the paper is that we're aware of
the shortcomings of the existing work and possible paths forward. In
response, we are
Hello, this is the Google NDA bot. In our massive trawling of the
Internet and other data sources, I have detected a possible violation of
the Google NDA. This has been corrected. We now return you to your
regularly scheduled e-mail.
[ Continue ] [ I'm Feeling Confidential ]
So that's our master plan. Just don't tell anyone. :)
P.S. Unfortunately, I doubt that we'll be willing or able to release the
raw data behind the disk drive study.
Department of Computer Science, Duke University, Durham, NC 27708-0129
Email: justin at cs.duke.edu