[Beowulf] HD undetectable errors

Peter Kjellstrom cap at nsc.liu.se
Fri Aug 21 09:33:17 PDT 2009


On Friday 21 August 2009, Henning Fehrmann wrote:
> Hello,
>
> a typical rate for data not recovered in a read operation on a HD is
> 1 per 10^15 bit reads.

I think Seagate claims 10^15 _sectors_, but I may have misread it.
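
Either way, a quick back-of-the-envelope check (a rough sketch, taking the
1-per-10^15-bits figure quoted above at face value; the numbers here are
illustrative, not measurements):

    # Assumed: 1 unrecoverable read error per 1e15 bits, as quoted above.
    capacity_bits = 100e12 * 8       # 100 TByte expressed in bits
    error_rate = 1e-15               # unrecoverable errors per bit read
    expected_errors = capacity_bits * error_rate
    print(expected_errors)           # ~0.8, i.e. of the order of 1

So reading a full 100 TByte once does indeed give an expected error count
of order 1, matching the estimate below.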

> If one fills a 100TByte file server the probability of losing data
> is of the order of 1.
> Of course, one could circumvent this problem by using RAID5 or RAID6.
> Most of the controllers do not check the parity when they read data, and
> here the trouble begins.

Peter Kelemen at CERN has done some interesting things, like:
 http://cern.ch/Peter.Kelemen/talk/2007/kelemen-2007-C5-Silent_Corruptions.pdf

> I can't recall the rate for undetectable errors but this might be a few
> orders of magnitude smaller than 1 per 10^15 bit reads. However, given
> the fact that one deals nowadays with a few hundred TBytes of data, this
> might happen from time to time without being realized.
>
> One could lower the rate by forcing the RAID controller to check the
> parity information in a read process. Are there RAID controllers which
> are able to perform this?

Yes. But most won't, and it will hurt quite a lot performance-wise. I know, 
for example, that our IBM DS4700 with updated firmware can 
enable "verify-on-read".

> Another solution might be the use of file systems which have additional
> checksums for the blocks, like zfs or qfs. This even prevents data
> corruption due to undetected bit flips on the bus or the RAID
> controller.

This is, IMHO, probably a better approach. As you can read in the article I 
referenced above, the controller layer (and really any layer) adds yet another 
source of silent corruption, so the higher up the better.

Also, the file system has information about the data layout that it can use to 
do this more efficiently than lower layers.
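
As a rough illustration of what that looks like with zfs (assuming a pool
named "tank", which is just a placeholder): checksums are verified on every
read, and a scrub re-reads and verifies every allocated block.

    # "tank" is a placeholder pool name. Any checksum mismatches found by
    # the scrub show up in the CKSUM column of "zpool status".
    import subprocess
    subprocess.run(["zpool", "scrub", "tank"], check=True)
    subprocess.run(["zpool", "status", "tank"], check=True)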

> Does somebody know the size of the checksum and the rate of undetected
> errors for qfs?

Remember that you have to calculate this against the amount of corrupt data, 
not the total amount of data.
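
In other words, something like this rough sketch (assuming an ideal b-bit
checksum, which fletcher2 is not, so treat it as a best case):

    # Undetected rate ~= (rate of corrupt data) * (chance the checksum misses
    # it). For an ideal b-bit checksum the miss probability is about 2**-b.
    corruption_rate = 1e-20          # illustrative assumption, not a measured number
    b = 256                          # zfs: 256 bits of checksum per 512-byte block
    undetected_rate = corruption_rate * 2.0 ** -b
    print(undetected_rate)           # vanishingly small for an ideal checksum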

/Peter

> For zfs it is 256 bits per 512 bytes of data.
> One option is to compute the checksum with the fletcher2 algorithm.
> Does somebody know the rate of undetectable bit flips for such a
> setting?
>
> Are there any other file systems doing block-wise checksumming?
>
>
> Thank you,
> Henning Fehrmann