[Beowulf] Software RAID?
Joe Landman landman at scalableinformatics.com
Thu Nov 22 06:16:52 PST 2007
Vincent Diepeveen wrote:
> Wait a minute, do I read that correctly,
>
> "you have to rescan the scsi bus".
>
> In short, first you spend really a lot of money to get SCSI
> drives in order to then get confronted with all the software raid issues.

No. SATA presents itself as SCSI in Linux.

[...]

> IMHO the interesting issue with raid is how to get a raid system where
> you can hotswap and which supports both raid10 as well as different
> types of raid5 (with one extra spare).

RAID10 is not that hard in software; the hotswap is harder. Cold-swap
*may* be possible (last I looked it should work, haven't tried it
recently on SATA). The issue is whether or not the driver drags the
kernel kicking and screaming into a kernel panic when a device is
removed ...

FWIW, I plugged a laptop SATA drive into one of our Pegasus boxen
(many-core workstation, think baby cluster), and without a reboot, or
any interaction on my part, it found the drive and mounted the file
systems. It was scary, but I did want to see what happens. It worked
w/o kernel panic.

The interrupt issue is painful, but the more cores you have, the more
pain you can stand there. The context switches (CSW) are harder. They
shouldn't be, but they *seem* to be a point of serialization in the
kernel. This is annoying ... high CSW rates turn a very fast machine
into a whimpering mass quite quickly.

> Speaking of that, how to save power with raid when it's not currently
> streaming. Are there hardware cards that let the drives idle when the raid
> array hardly gets used for i/o?

MAID or idle spin-down. Our units do this.

> I'm about to investigate how to cheaply build a huge raid array (with
> hotswap) for private purposes (chess EGTB generation, and I guess I'll
> require a TB or 4+ for that, and raid10 as the write load is also very
> high during generation).

1TB is not huge: 2 x 1TB disks in a RAID1. 4TB is not huge either. If
the data is important, use RAID6 with hot spares. RAID10 is more
expensive and a bit slower on RW, but faster on rebuilds.

Either way you can do this in 7-9 drives easily. With the right
motherboard (a Supermicro variant comes to mind), you can have 6 SATA
and 8 SAS ports (remember SAS does talk to / connect to SATA drives), so
you can have up to 14 devices attached to the MB. Couple this with a
deskside/rack mount type case that can handle this many, and you should
be fine.

>
> What solutions are there?
>
> Vincent
>
> On Wed, 21 Nov 2007, Joe Landman wrote:
>
>> Ekechi Nwokah wrote:
>>> Hi,
>>>
>>> Does anyone know of any software RAID solutions that come close to the
>>> performance of a commodity RAID card such as LSI/3ware/Areca for
>>> direct-attached drives?
>> For small numbers of drives, yes, the MD driver is superb with two
>> (well, really three) caveats.
>>
>> First: No hot swap. You can do a kind-of-cold swap (have to take the
>> mount offline, and can execute a few MD raw-disassemble, and then turn
>> the device off, swap, force Linux to rescan the scsi bus, mark the drive
>> as a hot spare, and force reassembly ... then remount). This may or may
>> not work, depending upon the linux driver for the SATA port. Some get
>> very unhappy if the drive goes away after it found it.
>>
>> Second (and third): Context switches (and interrupts) tend to quickly
>> swamp even fast systems with lots of processors. This is because the
>> SATA drivers on Linux, while good for basic SATA operations, may have a
>> few issues with multiple CSW needed for each transfer. You can drive a
>> fast system to become slow with a simple RAID0 across two drives. Run
>> bonnie++ on it (not IOzone, unless you want to measure memory cache).
>> Now imagine that system serving NFS requests. Additionally, the
>> interrupts driven by these hard IO operations also often drive the
>> system performance into the ground. We see 15-20k CSW and 20+k
>> interrupts under heavy load for a simple two drive RAID0 serving NFS
>> over gigabit.
>>
>> That is, it is not a bad idea, and it is possible to do it. But be
>> aware that you are going to need a fairly beefy machine (lots of RAM,
>> lots of cores) to handle the buffering and the interrupts. Can't help
>> much on the CSW's, you will just have to pay that price.
>>
>>> With the availability of multi-core chips and SSE instruction sets, it
>>> would seem to me that this is doable. Would be nice to not have to pay
>>> for those RAID cards if I don't have to. Just wondering if anything
>>> already exists.
>> The extra you pay for those RAID cards buys you hot swap, and if you
>> choose carefully, reasonable RAID engines. They aren't perfect, their
>> small random IO performance on large files leaves something to be
>> desired (as do all RAID controllers from what I can see, unless you want
>> to buy Bluearc or other units)
>>
>> If you do choose to go the MD route, check out which SATA drivers are
>> well performing (low CSW/interrupts), and focus upon them. There are a
>> few out there.
>>
>> Joe
>>
>>> Thanks,
>>> Ekechi
>>>
>>> _______________________________________________
>>> Beowulf mailing list, Beowulf at beowulf.org
>>> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>>
>> --
>> Joseph Landman, Ph.D
>> Founder and CEO
>> Scalable Informatics LLC,
>> email: landman at scalableinformatics.com
>> web  : http://www.scalableinformatics.com
>>        http://jackrabbit.scalableinformatics.com
>> phone: +1 734 786 8423
>> fax  : +1 866 888 3112
>> cell : +1 734 612 4615

--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
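
The "7-9 drives" estimate in the reply above can be checked with a little arithmetic. The sketch below is one plausible reading of that figure, not something stated in the message: it assumes 1TB drives, a 4TB usable target, two-way mirroring for RAID10, and a single hot spare. The function name `drives_needed` is made up for illustration.

```python
import math

def drives_needed(usable_tb, drive_tb, level, hot_spares=1):
    """Total drives for a usable-capacity target (illustrative sketch).

    Assumptions (not from the original post): RAID10 uses two-way
    mirrors, RAID6 spends two drives' worth of space on parity, and
    both layouts need at least two data drives.
    """
    data = max(math.ceil(usable_tb / drive_tb), 2)
    if level == "raid10":
        members = 2 * data      # every data drive has one mirror
    elif level == "raid6":
        members = data + 2      # two drives' worth of parity
    else:
        raise ValueError(f"unsupported level: {level}")
    return members + hot_spares

if __name__ == "__main__":
    for level in ("raid6", "raid10"):
        print(level, drives_needed(4, 1, level))
```

Under these assumptions RAID6 lands at the low end of the range and RAID10 at the high end, and both fit within the 14 ports mentioned for the motherboard.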
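
The context-switch and interrupt rates quoted in the thread (15-20k CSW and 20+k interrupts) are the kind of numbers you can sample yourself while a benchmark runs. This is a minimal sketch, assuming a Linux system where /proc/stat exposes the cumulative `ctxt` and `intr` counters; it is not the tooling used for the figures in the post, and the helper names are hypothetical.

```python
import time

def parse_counters(stat_text):
    """Return (context switches, interrupts) totals from /proc/stat text.

    The first number on the "intr" line is the sum of all interrupts
    serviced since boot; "ctxt" is the context-switch total.
    """
    counters = {}
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] in ("ctxt", "intr"):
            counters[fields[0]] = int(fields[1])
    return counters["ctxt"], counters["intr"]

def per_second(before, after, seconds):
    """Convert two cumulative samples into per-second rates."""
    return tuple((a - b) / seconds for b, a in zip(before, after))

def sample(seconds=1.0, path="/proc/stat"):
    """Read the counters twice, `seconds` apart, and return the rates."""
    with open(path) as f:
        before = parse_counters(f.read())
    time.sleep(seconds)
    with open(path) as f:
        after = parse_counters(f.read())
    return per_second(before, after, seconds)
```

Calling `sample(5.0)` while bonnie++ runs against the MD array, and again on an idle system, gives a rough before/after comparison for a given SATA driver. `vmstat 1` reports the same counters in its `cs` and `in` columns.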
