[Beowulf] Surviving a double disk failure

Joe Landman landman at scalableinformatics.com
Sat Apr 11 19:38:16 PDT 2009


Stuart Midgley wrote:
> Thanks to all the responses, it has been interesting reading.  We have 
> started using raid6 on newer servers and will slowly get rid of our old 
> raid5 servers.
> 
> I found the comments about scrubbing very interesting.  What do people 
> do with their file systems?  We couldn't afford the reduced performance 

Software RAIDs (our DeltaV) are scrubbed once a week, and so are our 
hardware RAIDs.  Basically, errors can accumulate between scrubs. 
Scrubbing isn't perfect, and as Michael and others have pointed out, 
there can be bugs.  But honestly, I am of the opinion that several 
hours of reduced performance during a scrub are a heck of a lot 
better than dealing with down time due to an "event".

Scrubbing occurs in the background, and you can limit its impact.
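
As a rough illustration of limiting the impact, here is a minimal sketch 
that assumes a Linux md software RAID (the post does not say which RAID 
stack is in use, so that is an assumption).  It caps the per-array sync 
speed through the md sysfs file sync_speed_max, then requests a 
background "check" pass through sync_action.  The function name, the 
array name and the 20000 KB/s cap are hypothetical, and on a real system 
this needs root.

```python
from pathlib import Path


def start_md_scrub(md_name, max_kb_per_sec, sys_root="/sys"):
    """Cap the scrub bandwidth of one md array, then start a 'check' pass.

    Assumes the Linux md sysfs layout: /sys/block/<md_name>/md/.
    sys_root is a parameter only so the sketch can be tried on a scratch tree.
    """
    md_dir = Path(sys_root) / "block" / md_name / "md"

    # Per-array ceiling (KB/s) for resync/check traffic.
    (md_dir / "sync_speed_max").write_text(f"{max_kb_per_sec}\n")

    # Ask md to read and verify the whole array in the background.
    (md_dir / "sync_action").write_text("check\n")

    # Report what the array now says it is doing.
    return (md_dir / "sync_action").read_text().strip()


if __name__ == "__main__":
    # Hypothetical array name and cap; adjust for the real system.
    print(start_md_scrub("md0", 20000))
```

A lower cap stretches the scrub over more hours but leaves more 
bandwidth for production I/O, which is the trade-off described above.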

> and time for scrubbing.  We run our Lustre setup almost flat out all the 
> time.  We regularly do over a PB of io in a week (we often have our 
> total throughput at ~3GB/s for weeks on end).  We use lustre as our 
> scratch space so backups are not possible.  Nothing could get the data 
> off fast enough between us creating/using/deleting it.
> 
> Of course, the fact that we basically run at 95% full all the time is as 
> good as scrubbing :)

Not quite ...  Scrubbing is more of a structured test and repair pass. 
Ordinary I/O may leave coverage holes ... even at 95% capacity.
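
To put a rough number on the coverage-hole point, here is a 
back-of-the-envelope sketch.  It assumes block accesses are uniformly 
random and independent, which real workloads are not, so treat it as an 
illustration rather than a model of any particular file system.  Under 
that assumption the expected fraction of blocks never touched after k 
accesses to n blocks is (1 - 1/n)^k.  The function name is hypothetical.

```python
def untouched_fraction(n_blocks, n_accesses):
    """Expected fraction of blocks never hit by uniformly random accesses.

    Each access misses a given block with probability (1 - 1/n_blocks);
    independence across accesses is an assumption of this sketch.
    """
    return (1.0 - 1.0 / n_blocks) ** n_accesses


if __name__ == "__main__":
    n = 1_000_000
    # Compare I/O volumes equal to 1x, 3x and 10x the number of blocks.
    for multiple in (1, 3, 10):
        frac = untouched_fraction(n, multiple * n)
        print(f"{multiple}x the block count in I/O: {frac:.6f} untouched")
```

Even with I/O volume several times the block count, the random-access 
estimate leaves some blocks unread, whereas a scrub walks every block by 
construction.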



-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
        http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


