[Beowulf] Big storage

Jeffrey B. Layton laytonjb at charter.net
Mon Aug 27 14:13:46 PDT 2007

Andrew Piskorski wrote:
> On Mon, Aug 27, 2007 at 05:25:51AM -0500, Bruce Allen wrote:
>> I did read Garth's comments. I believe that there are two types of 
>> possible problems:
>> (1) A sector or handful of sectors on a disk become unreadable
>> (2) An entire disk fails (all sectors become unreadable)
>> Problems of type (1) can be handled well by high quality raid 
>> implementations.
> I believe Garth's whole point is that your assumption above is often
> NOT true.  He also seemed to imply that this is a function of the
> interaction between the block-level RAID implementation and the file
> system, as his Panasas file system reputedly fixes this scary, "one
> small unrecoverable read during array rebuild kills your entire disk
> volume" failure mode.

I don't know about "fix" :)  But this is the whole idea behind object
based storage. With object based storage, a failure of a single block
does not fail the whole volume. A bad block will just cause the file
that used that block to be marked as bad. Then you restore that
particular file from backup. It's much easier to restore a file than a
whole volume. So you escape failing the whole volume, but you still
lose a file if this happens.
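
To make the difference in failure domain concrete, here is a toy
sketch. It is not how Panasas (or any real RAID) is implemented; the
function name, the block-to-file map, and the object_aware flag are all
made up for illustration. It only models "what do I have to restore
after one unreadable block shows up during a rebuild".

```python
def files_to_restore(block_to_file, bad_blocks, object_aware):
    """Toy model: which files must come back from backup after
    unreadable blocks are hit during a rebuild.

    block_to_file: dict mapping block number -> owning file (hypothetical layout)
    bad_blocks:    set of block numbers that could not be read
    object_aware:  True if the storage layer knows which file owns a block
    """
    if not bad_blocks:
        return set()
    if not object_aware:
        # Block-level RAID cannot tell which file owns the bad block,
        # so in the worst case the rebuild fails and the whole volume is lost.
        return set(block_to_file.values())
    # Object based storage can map the bad block back to its file
    # and mark only that file as bad.
    return {block_to_file[b] for b in bad_blocks if b in block_to_file}


layout = {0: "a.dat", 1: "a.dat", 2: "b.dat", 3: "c.dat"}
print(files_to_restore(layout, {2}, object_aware=True))
print(sorted(files_to_restore(layout, {2}, object_aware=False)))
```

The first call reports only the file that owned the bad block; the
second reports every file on the volume.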

> However, I do not really understand exactly which systems are subject
> to this risk of catastrophic failure and which are not, nor why.  If
> anyone has pointers to a more complete explanation, please do chime
> in...

Send Garth some email and see if he has any presentations he can
send you :)  His email address at CMU is on his web page.

But in general, any block based RAID is subject to this risk. In
some cases the risk can be very low. In other cases, it can be very
high. RAID-6 can help, but you pay a price in performance and
capacity. And as disks get bigger and people put more and more of
them together in a single RAID group, the probability of hitting an
unreadable sector during a rebuild goes up.
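
A rough back-of-the-envelope sketch of why size and disk count matter.
The assumptions here are mine, not Garth's: a single-parity (RAID-5
style) group where a rebuild has to read every surviving disk end to
end, independent bit errors, and an unrecoverable read error rate of 1
per 1e14 bits, which is a figure commonly quoted on drive spec sheets.
Real disks will differ, so treat the numbers as illustrative only.

```python
import math


def p_rebuild_hits_bad_sector(disk_bytes, surviving_disks, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error while
    reading every surviving disk in full during a rebuild.

    ure_per_bit is an assumed spec-sheet style error rate, not a measurement.
    """
    bits_read = disk_bytes * 8 * surviving_disks
    # 1 - (1 - p)^n, computed with log1p/expm1 to avoid rounding problems
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))


# (disk size in GB, number of surviving disks read during the rebuild)
for size_gb, n in [(250, 4), (500, 7), (1000, 7), (1000, 15)]:
    p = p_rebuild_hits_bad_sector(size_gb * 1e9, n)
    print(f"{n + 1} disks x {size_gb} GB: {p:.1%} chance of an unreadable sector")
```

Under these assumptions the probability grows with the total number of
bits that must be read, so doubling either the disk size or the group
width has the same effect.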


P.S. I hope I'm explaining things well enough. Garth is the man on this
subject, but he doesn't read this mailing list.
