[Beowulf] Software Raid

Vincent Diepeveen diep at xs4all.nl
Tue Dec 13 18:46:43 PST 2005


At 14:54 13-12-2005 -0800, Michael Will wrote:
>IMHO software raid outperforms hardware raid
>mostly in pure disk-benchmark situations, not so much when
>other applications run as well. 

Sure, that's why many nerds started to use it a few years ago.

>Maybe NFS fileservers are specialized enough that software
>raid performance is not as much off in production as it is
>in benchmarking.
>
>The remaining advantage of hardware is still hot-swapping
>failed drives without having to shut down the server.

Those same nerds do not take into account that if something as 
complex as a raid array suddenly gets handled in software instead 
of hardware, even the tiniest undiscovered bug in a file system 
will impact you.

Not everyone is using raid arrays, so not all situations have been 
debugged yet. And the support for this in software is not that old,
just a few years. Just fixing the tcp/ip protocol took, what was 
it, 20 years or so in unix?

And be sure that there are bugs. So doing a hardware XOR (or whatever) in
the RAM of the raid controller instead of in software is a huge advantage.

It reduces complexity of what software has to do, so it reduces the
chance that a bug will occur in the OS somewhere, causing you to lose
all your files.
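
For readers who have not looked at what that XOR actually does, here is a
minimal sketch of RAID5-style parity in Python. It only illustrates the
arithmetic; it is not how any particular controller or the Linux md driver
implements it, and the function names (xor_blocks, parity, rebuild) and the
sample stripe are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple per byte position across all blocks.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity(data_blocks):
    """Parity block for one stripe: the XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def rebuild(surviving_blocks, parity_block):
    """Recover the single missing data block from the survivors plus parity."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Hypothetical stripe spread over three data disks.
stripe = [b"disk0 AA", b"disk1 BB", b"disk2 CC"]
p = parity(stripe)

# Pretend disk 1 died and rebuild its block from the other two plus parity.
lost = 1
survivors = [block for i, block in enumerate(stripe) if i != lost]
recovered = rebuild(survivors, p)
assert recovered == stripe[lost]
```

The arithmetic itself is tiny. The point above is about everything around
it: whether that path runs in the controller or in the OS.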

Let's face it, much code written for linux uses a coding style 
that I used when I was 16 years old. It makes finding bugs
in that spaghetti code extremely difficult.

So sometimes they never get found.

So the real advantage for the people doing their raid5 in hardware
instead of software is this: those nerds mentioned above, all of
whom I knew personally, all lost all the data on their entire raid
arrays.

In this case, there were 3 people storing 900+ GB of data whose arrays 
crashed. We just started a p2p project to recover, world wide, what is 
left of that 1045GB of EGTB data.

The p2p project can be found at:

   http://kd.lab.nig.ac.jp/chess/tablebases-online/

It is led by Kirill Kryukov.

At this moment we have collected about 500GB of it from all kinds of
different sources.

>Michael 
>
>-----Original Message-----
>From: beowulf-bounces at beowulf.org [mailto:beowulf-bounces at beowulf.org]
>On Behalf Of Michael T. Prinkey
>Sent: Tuesday, December 13, 2005 1:17 PM
>To: Paul
>Cc: beowulf at beowulf.org
>Subject: Re: [Beowulf] Software Raid
>
>
>I can tell you from long experience that this is not true.  I have had
>software raid/NFS/SMB servers on our clients' LANs serving up terabytes
>of home directories to tens of workstations and hundreds of compute
>nodes.  
>Our experience is that (at least 3ware) hardware raid is significantly
>slower than software raid when using the same hardware.  In fact,
>using the 8-port 3ware SATA controller in JBOD mode with Software
>RAID5 was about six times faster than using hardware RAID5 on the same
>controller.
>
>The CPU needed to do the parity computation is easily supplied by a
>single processor system.  Most of our servers are not SMP unless they
>are also asked to do something else, like act as the head node to a
>cluster.
>
>There are different arguments about whether or not Linux NFS is ready to
>serve large numbers of simultaneous hosts to which I am not able to
>speak, but for 10-100 users with 100s of CPUs, software RAID/NFS/SMB
>under Linux seems to work just fine.
>
>Mike Prinkey
>
>On Mon, 12 Dec 2005, Paul wrote:
>
>> I read in a post somewhere that it was not possible to use a Linux 
>> software RAID configuration for shared file storage in a cluster. I 
>> know that it is possible to use software RAID on individual compute 
>> nodes but the post stated that software RAID would not properly 
>> support simultaneous accesses on a file server. Is this true?
>> 
>> Assuming that hardware RAID is required (or at least preferable) I was 
>> wondering if the built in RAID on some motherboards would be adequate 
>> or do we need to look into a dedicated piece of hardware. We will have 
>> about 10 - 12 cpus initially that will be connected with giganet 
>> network. We currently have about a terabyte of storage space and are 
>> planning to mount it using NFS in a RAID 5 configuration. Our 
>> applications for now will be database intensive bioinformatics apps. I 
>> would be very interested in any comments. Thanks
>> 
>> Paul Mc Kenna
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org To change your subscription 
>> (digest mode or unsubscribe) visit 
>> http://www.beowulf.org/mailman/listinfo/beowulf
>> 
>> 
>


