[Beowulf] RE: Storage - the end of RAID?

Ellis H. Wilson III ellis at runnersroll.com
Fri Oct 29 11:46:35 PDT 2010


On 10/29/10 14:06, Lux, Jim (337C) wrote:
>> -----Original Message-----
>> From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org] On Behalf Of Hearns, John
>> Sent: Friday, October 29, 2010 9:43 AM
>> To: beowulf at beowulf.org
>> Subject: [Beowulf] Storage - the end of RAID?
>>
>> Quite a perceptive article on ZDnet
>>
>> http://www.zdnet.com/blog/storage/the-end-of-raid/1154?tag=nl.e539
>>
>> Class, discuss.
>>
>
> Yes, indeed, his comments make sense.
>
> After all, the acronym was "Redundant Arrays of Inexpensive Disks"
>
> Granted, these implementations had useful side effects (e.g. improving read speed by sharing)
>
> The real question is whether drive reliability has improved commensurate with the drive capacity (that is, is the failure rate per drive basically constant, as opposed to the "bit error rate")
>
> RAID was designed to solve the "failed drive" problem, more than the "bad bit" problem.  And to do it using a less than "rate 1/2" code.. that is, rather than store 2 copies of your data, you could store, essentially, 11/8ths copies of your data (using a Hamming code to generate 3 syndrome bits for each 8 data bits for instance), thereby saving money.
>
> However, if drives get cheap, then using 2 copies (or 3) isn't a big deal.

Drives (of the commodity variety) are pretty darn cheap already.  I'd be
surprised if mirroring (RAID 1) isn't already the better solution today,
compared with RAID 2-6, rather than only at some point in the future.
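
To put rough numbers on the "cheap drives" argument, here is a minimal
sketch of the raw capacity each layout consumes per unit of usable
capacity.  It assumes equal-size disks in a single array; the function
names, the level labels, and the price are made up for illustration and
are not taken from the article.

```python
# Raw capacity consumed per unit of usable capacity, assuming equal-size
# disks in one array.  All names and the price below are illustrative.

def raw_per_usable(level: str, n_disks: int) -> float:
    """Return raw TB needed to provide 1 TB of usable space."""
    if level == "raid1":    # n-way mirror: every disk holds a full copy
        if n_disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return float(n_disks)
    if level == "raid10":   # stripe across two-way mirrored pairs
        if n_disks < 4 or n_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks, at least 4")
        return 2.0
    if level == "raid5":    # one disk's worth of parity
        if n_disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return n_disks / (n_disks - 1)
    if level == "raid6":    # two disks' worth of parity
        if n_disks < 4:
            raise ValueError("RAID 6 needs at least 4 disks")
        return n_disks / (n_disks - 2)
    raise ValueError(f"unknown level: {level}")


def cost_per_usable_tb(level: str, n_disks: int, price_per_raw_tb: float) -> float:
    """Scale the raw/usable ratio by a price per raw TB."""
    return raw_per_usable(level, n_disks) * price_per_raw_tb


if __name__ == "__main__":
    price = 75.0  # hypothetical price per raw TB, not a quoted figure
    for level, n in [("raid1", 2), ("raid10", 8), ("raid5", 8), ("raid6", 8)]:
        ratio = raw_per_usable(level, n)
        cost = cost_per_usable_tb(level, n, price)
        print(f"{level:7s} n={n}: {ratio:.2f}x raw, {cost:.2f} per usable TB")
```

Two-way mirroring always costs 2x raw, while parity costs n/(n-1) or
n/(n-2).  The cheaper the raw TB, the less that difference matters in
absolute terms.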

The major issue I see with the article is that the author calls RAID
"dead" when really he should be saying that RAID 2-6 is less preferable
to RAID 1 (though "dead" does make for a catchier article title).  RAID 0
will always be around to soften the bottleneck created by the gap in
performance between CPU and disk.  I would actually be surprised if, in
five years, it weren't common in big HPC to have CPU nodes talking to
I/O forwarding nodes with RAID 1 caches of SSDs in them, which in turn
talk to server nodes connected directly to LUNs (which also have RAID,
although I cannot say whether it would be 1/10/01/etc).  This setup
lessens the need for tons of expensive RAM at the client or forwarding
nodes, since SSD read latency is much closer to CPU speed than disk
latency is, and it fixes some of the canonical "durability" problems in
HPC.

Also, I think he would be hard-pressed to make a case against the
hybrid RAID varieties that combine 0 and 1 (RAID 10, RAID 01).  In those
setups, recovering from a failure is basically a straightforward copy,
and it can happen from/to multiple disks at once.  There is a slight
performance degradation, but nothing as serious as a parity-based
rebuild.
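
The difference between the two rebuild styles can be sketched in a few
lines.  This is a toy model on in-memory byte blocks, assuming
single-parity XOR (RAID 5 style) for the parity case; the helper names
are made up for illustration and no real RAID implementation is being
quoted.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_mirror(surviving_copy):
    """Mirror rebuild: copy the one surviving replica.

    Returns (rebuilt block, number of blocks read)."""
    return bytes(surviving_copy), 1


def rebuild_parity(surviving_blocks):
    """Single-parity rebuild: XOR every surviving block in the stripe,
    data and parity alike.

    Returns (rebuilt block, number of blocks read)."""
    return xor_blocks(surviving_blocks), len(surviving_blocks)


if __name__ == "__main__":
    stripe = [b"abcd", b"efgh", b"ijkl"]   # three data blocks in one stripe
    parity = xor_blocks(stripe)            # parity block for that stripe
    lost = stripe[1]                       # pretend the second disk died

    rebuilt, reads = rebuild_parity([stripe[0], stripe[2], parity])
    print("parity rebuild ok:", rebuilt == lost, "blocks read:", reads)

    rebuilt, reads = rebuild_mirror(lost)  # the mirror partner holds a copy
    print("mirror rebuild ok:", rebuilt == lost, "blocks read:", reads)
```

The mirror case reads one block per lost block and does no arithmetic,
whereas the parity case has to read from every surviving disk in the
stripe, so its cost grows with the width of the array.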

I personally do not see certain versions of RAID going away anytime soon 
- they are just too basic a concept for performance/redundancy to kill 
them off.

ellis


