[Beowulf] Big storage

Ekechi Nwokah ekechi at alexa.com
Thu Sep 13 14:10:04 PDT 2007

DDN is really good at detecting and correcting data errors. I think
their controllers do a parity check on every read; not sure where in the
pipeline it occurs, off the top of my head. But it's one of the main
reasons they are doing so well at the national labs. 

How does the ZFS checksumming work? Checksum of what?

-- Ekechi

-----Original Message-----
From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
On Behalf Of Leif Nixon
Sent: Thursday, September 13, 2007 1:20 PM
To: beowulf at beowulf.org
Subject: Re: [Beowulf] Big storage

Loic Tortay <tortay at cc.in2p3.fr> writes:

> According to Bruce Allen:
>> This thread has been evolving, but I'd like to push it back a bit.
>> Earlier in the thread you pointed out the CERN study on silent data
>> corruption:
>> http://fuji.web.cern.ch/fuji/talk/2007/kelemen-2007-C5-Silent_Corruptions.pdf
> Actually, I was not the one who pointed out this study but I can't 
> remember who did.

That was me, actually. We both saw the presentation at the last HEPiX
meeting, though. (We have already established we were there. 8^) )

> We are not using fsprobe on our X4500.
> There are two reasons:
>  . ZFS has built-in error detection (through "zpool scrub") and we are
>    (maybe naively) relying on this to detect and correct data
>    which would be otherwise silent;

It *would* be interesting to see if the ZFS checksumming lives up to its

>  . due to some ZFS limitation (there are some :-) fsprobe does not
>    work reliably with ZFS.
> I'll try to be as concise as possible on the last point.
> In order to make sure that data are actually written to/read from disk
> and not from cache, fsprobe (optionally) uses Direct I/O (buffer cache
> bypass).
> Since Direct I/O is not supported by ZFS, you can't actually be
> certain that you're reading from disk and not from the cache (although
> you can get "some" guarantee that you actually write to the disk using
> "data synchronous" writes -- aka O_DSYNC or the "fsync()" family of
> POSIX functions).
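
[Editor's sketch] The write-side guarantee described above can be
illustrated roughly as follows. This is not fsprobe's code; the function
name write_and_verify and the file name probe.dat are hypothetical, and
the sketch assumes a POSIX-like system (O_DSYNC is simply skipped where
the platform does not define it). It writes a buffer with data-synchronous
semantics, calls fsync(), then reads the file back and compares SHA-256
digests. As the quoted text points out, without Direct I/O the read-back
may well be served from cache, so a match does not prove the on-disk copy
is intact.

```python
import hashlib
import os
import tempfile


def write_and_verify(path, payload):
    """Write payload synchronously, read it back, and compare digests.

    Hypothetical helper for illustration only; not part of fsprobe.
    """
    expected = hashlib.sha256(payload).hexdigest()

    # O_DSYNC asks that write() return only after the data has reached
    # stable storage. It is not defined on every platform, so fall back
    # to 0 (no extra flag) and rely on the fsync() call below.
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_DSYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(payload)
        while len(view) > 0:
            written = os.write(fd, view)  # may write fewer bytes than asked
            view = view[written:]
        os.fsync(fd)  # flush file data and metadata for this descriptor
    finally:
        os.close(fd)

    # Caveat: this read goes through the normal buffered path, so it may
    # be satisfied from cache rather than from the disk itself.
    with open(path, "rb") as handle:
        actual = hashlib.sha256(handle.read()).hexdigest()
    return expected == actual


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ok = write_and_verify(os.path.join(tmp, "probe.dat"), os.urandom(1 << 20))
        print("match" if ok else "MISMATCH")
```

A real probe would repeat this over time with known patterns and log any
mismatch; the single pass here only shows the shape of the check.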

I still think it would be interesting to see how often one gets data
corruption from other sources than disk errors (presuming ZFS is
perfect). Data corruption is data corruption even if it's from bad cache

I will try to get fsprobe deployed on as much of the Nordic LHC storage
as possible.

Leif Nixon                       -            Systems expert
National Supercomputer Centre    -      Linkoping University