[Beowulf] RAID storage: Vendor vs. parts

Paul Nowoczynski pauln at psc.edu
Tue Feb 22 23:15:48 PST 2005


Alvin Oga wrote:

>hi ya steve
>
>On Tue, 22 Feb 2005, Steve Cousins wrote:
>
>  
>
>>That's what I'm shooting for. Anybody have good luck with volumes greater
>>than 2 TB with Linux?  I think LSI SCSI cards are needed (?) and the 2.6
>>kernel is needed with CONFIG_LBD=y.  Any hints or notes about doing this
>>would be greatly appreciated.  Google has not been much of a friend on
>>this, unfortunately. I'm guessing I'd run into NFS limits too.
>>    
>>
>
>for files/volumes over 2TB ... it's a question of libs, apps and kernel 
>	everything has to work ... which is not always the case
>
>  
>
We've got this working at PSC without too much pain, even with SCSI block
devices >2TB.  CONFIG_LBD is needed, but it doesn't solve all the problems
with large disks, especially if you have a single volume larger than 2TB.
The issue we ran into was that many disk-related apps like mdadm and
[s]fdisk don't support the BLKGETSIZE64 ioctl, so even though your kernel
is handling 64-bit device sizes, some of the tools you need are not.
There are also issues with disklabels for devices >2TB: the normal
DOS-style disklabel used by Linux doesn't support them, so you'll need a
kernel patch for the "plaintext" partition table written by Andries
Brouwer.  If you're interested in running this on 2.6 I can give you the
patch.

As far as cards go, I think the Adaptec U320 cards are better; I've seen
less SCSI timeout weirdness with them (though this could be related to our
disks).  Performance-wise the LSI and Adaptec cards are about the same: we
see ~400MB/sec when using both channels, even on a sub-PCI-X bus.  For a
couple hundred bucks a card, that's really good news.
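
If you want to sanity-check what a machine actually reports for a big
device, you can ask the kernel directly with the 64-bit ioctl instead of
trusting the userland tools.  A minimal sketch (the /dev/sda default is
just an example; pass whatever device you're testing):

    /* blkgetsize64.c - print a block device's size via BLKGETSIZE64,
     * which returns the size in bytes as a 64-bit value and so does not
     * wrap on devices larger than 2TB the way the old 32-bit BLKGETSIZE
     * (512-byte sector count in an unsigned long) can mislead tools. */
    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/fs.h>

    int main(int argc, char **argv)
    {
        const char *dev = (argc > 1) ? argv[1] : "/dev/sda";
        int fd = open(dev, O_RDONLY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        uint64_t bytes = 0;
        if (ioctl(fd, BLKGETSIZE64, &bytes) < 0) {
            perror("BLKGETSIZE64");
            close(fd);
            return 1;
        }

        printf("%s: %llu bytes (~%.2f TB)\n", dev,
               (unsigned long long)bytes, bytes / 1e12);
        close(fd);
        return 0;
    }

If a tool reports something much smaller than this for the same device,
it's probably still going through the 32-bit interface.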

--paul

>	i don't play much with 2.6 kernels other than on suse-9.x boxes
>  
>  
>
>>Also, am I being overly cautious about having a spare RAID controller on
>>hand?  How frequent do RAID controllers go bad compared to disks, power
>>supplies and fan modules?  I'd guess that it would be very infrequent.
>>    
>>
>
>it's always better to have spare parts ... ( part of my requirement ) if
>they expect the systems to be available 24x7 ... 
>
>	- more importantly, how long can they wait, when silly inexpensive
>	things die, before they get replaced
>
>	- dead fans are $2.00 - $15 each to replace, and they keep the disks cool
>
>	- power supplies are in the $50 range ... but if one bought an n+1
>	power supply then it's supposed to not be an issue anymore, though
>	you will still need to have a replacement handy
>
>	- raid controllers should NOT die, nor should the cpu, mem, mb,
>	nic, etc ... and it's not cheap to have these items floating
>	around as spare parts
>
>	- ethernet cables will go funky if random people have access
>	to the patch panels ... ( keep the fingers away )
>
>	- ups will go bonkers too
>
>	- what failure mode can one protect against and what will happen
>	if "it" dies 
>
>	- the best protection against downtime for users is to have a
>	warm-swap server which is updated hourly or daily ... 
>	( my preference - a 2nd identical or bigger-disk-capacity system )
>
>  
>
>>Looking back at my own experience I think I've had to return one out of 15
>>in the last eight years, and that was bad as soon as I bought it.
>>    
>>
>
>seems too high of a return rate ?? 1 out of 15 ??
>
>  
>
>>If this is too off-topic let me know and I'll move it elsewhere.
>>    
>>
>
>ditto here 
>
>a 24x7x365 uptime compute environment is fun/frustrating stuff on a
>tight budget
>
>c ya
>alvin
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>  
>



