[Beowulf] Software RAID?
ekechi at alexa.com
Tue Nov 27 15:28:45 PST 2007
> -----Original Message-----
> From: Mark Hahn [mailto:hahn at mcmaster.ca]
> Sent: Monday, November 26, 2007 8:45 PM
> To: Ekechi Nwokah
> Cc: Beowulf Mailing List
> Subject: RE: [Beowulf] Software RAID?
> >> Of course there are a zillion things you didn't mention. How many
> >> drives did you want to use? What kind? (SAS? SATA?) If you want 16
> >> drives often you get hardware RAID hardware even if you don't use it.
> >> What config did you want?
> >> Raid-0? 1? 5? 6? Filesystem?
> > So let's say it's 16. But in theory it could be as high as 192. Using
> 16 drives in a system is reasonable. for much larger
> systems, I would go for more scalable building blocks
> (network connected, 10GE or IB.)
> > multiple JBOD cards that present the drives individually (as separate
> > LUNs, for lack of a better term), and use software RAID to do all the
> > things that a 3ware/Areca, etc. card would do across the total span of
> > drives:
> > RAID 0/1/5/6, etc., hotswap, SAS/SATA capability, etc.
> software raid on 192 drives in a single system sounds
> somewhat dubious.
> not impossible, by any means, but you'll wind up challenging
> the fact that commodity systems don't scale that high. I'm
> guessing you'll need to use a bunch of SAS channels for
Something like that ;-).
> > Right now, all the hardware cards start to precipitously drop in
> > performance under concurrent access, particularly read/write mixes.
> I would consider turning off queueing in the card, and
> leaving all the request sorting to the kernel. any idea how
> many requests a card typically takes upon itself to schedule?
> (afaik, disks themselves only ever sort fairly small
> numbers.) since you depend strongly on very effective
> sorting of requests, I'd lean towards trying to get the host
> to do all the sorting.
Agreed. In that case you could buy much cheaper JBOD cards without any
queueing at all. That's the point.
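
To make "request sorting" concrete, here is a toy sketch of a one-way elevator ordering of pending requests by starting sector. It is an illustration only, not the kernel's actual I/O scheduler; the function names, the sector numbers and the head position are all made up for the example.

```python
from typing import List


def elevator_order(pending: List[int], head: int) -> List[int]:
    """Order pending requests (starting sectors) as a one-way elevator sweep.

    Requests at or beyond the current head position are served first, in
    ascending order; the remaining ones follow, also in ascending order,
    after the head wraps back to the lowest sector.
    """
    ahead = sorted(s for s in pending if s >= head)
    behind = sorted(s for s in pending if s < head)
    return ahead + behind


def total_seek(order: List[int], head: int) -> int:
    """Sum of absolute head movements (in sectors) to serve `order` in sequence."""
    distance = 0
    for sector in order:
        distance += abs(sector - head)
        head = sector
    return distance


if __name__ == "__main__":
    # Hypothetical queue of requests and head position, chosen for illustration.
    pending = [900, 120, 450, 870, 130, 460]
    head = 400
    print("arrival order seek :", total_seek(pending, head))
    print("elevator order seek:", total_seek(elevator_order(pending, head), head))
```

The longer the queue visible to the sorter, the more seek distance it can save, which is the argument for letting the host, which sees requests for all the drives, do the sorting rather than a card with a small queue.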
> > I just haven't seen something like that and I was not aware that md
> > could achieve anything close to the performance of a hardware RAID
> > card across a reasonable number of drives (12+), let alone provide the
> > feature set.
> 12 disks is enough for MD to need some help - tweaking the
> stripe cache, for instance.
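
For a sense of what tweaking the stripe cache costs in memory, here is a rough sketch. It assumes the relationship given in the kernel's md documentation for RAID5 arrays (memory consumed = page size * number of disks * stripe_cache_size) and a 4 KiB page; the function name and the example values are hypothetical. On kernels that expose it, the value itself lives in /sys/block/mdX/md/stripe_cache_size.

```python
def stripe_cache_bytes(entries: int, nr_disks: int, page_size: int = 4096) -> int:
    """Estimate memory used by the md stripe cache.

    Assumes memory = page_size * nr_disks * entries, where `entries` is the
    stripe_cache_size setting. The 4096-byte default page size is an
    assumption; check the real value on the target machine.
    """
    if entries <= 0 or nr_disks <= 0 or page_size <= 0:
        raise ValueError("entries, nr_disks and page_size must be positive")
    return entries * nr_disks * page_size


if __name__ == "__main__":
    # Hypothetical settings for a 12-disk and a 16-disk array.
    for disks, entries in [(12, 256), (16, 8192)]:
        mib = stripe_cache_bytes(entries, disks) / (1024 * 1024)
        print(f"{disks} disks, stripe_cache_size={entries}: {mib:.0f} MiB")
```

The cost grows linearly with both the setting and the number of member disks, so a value that is cheap on a small array can be a noticeable chunk of RAM on a wide one.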
> can I ask what the actual application is?
Large scale data mining.
> regards, mark hahn.