[Beowulf] Building new cluster - estimate

Eric Thibodeau kyron at neuralbs.com
Wed Aug 6 18:07:01 PDT 2008


Matt Lawrence wrote:
> On Mon, 4 Aug 2008, Joe Landman wrote:
>
>> This mirrors our experience, though RHEL stability under intense 
>> loads is questionable IMO (talking about the kernel BTW).  We find 
>> that the missing drivers, the omitted drivers, the backported drivers 
>> along with some odd and often useless "features" (4k stacks anyone?) 
>> render the RHEL default kernels (and by definition the CentOS 
>> kernels) less useful for HPC and storage tasks than what we build.  
>> Our current standard is a 2.6.23.14 kernel which is rock solid under 
>> load.  Working on a 2.6.26 based version now (even though I am on 
>> vacation/holiday, I just updated it to 2.6.26.1 to address an 
>> observed crashing issue with the RDMA server)
>
> Since I plan to continue running CentOS, it sounds like building a 
> much later kernel rpm is the way I want to approach the problem.  Will 
> going to a much later kernel break any of the utilities?  Other 
> problems I can expect to see?
>
> What do you recommend for the kernel config?
>
>> Combine this with the small upper limit of ext3 partition sizes, the 
>> file size limits in ext3, the serialization in the journaling code 
>> (ext4 is extents based to help deal with this), ext3 just doesn't 
>> make much sense in a storage/HPC system (apart from possibly 
>> boot/root file system where performance is less critical).  Yeah I 
>> have seen studies from folks who had done 1E6 removes, file creates, 
>> and other things who claim xfs is slower than ext3.  Yeah, those are 
>> bad benchmarks in that they really don't touch on real end user use 
>> cases for the most part (apart from possible large scale mail servers 
>> and other things like that).
>
> I have never had any problems with ext3.  I had dinner with a friend 
> who is an expert Linux sysadmin who was warning me to stay away from 
> xfs.  He cited lots of fragmentation problems that routinely locked up 
> his systems. I am willing to be convinced otherwise, but he is a very 
> sharp fellow.
Check the kernel mailing list for XFS problems with RAID5 if you use 
mdadm... just a gentle suggestion ;)
>
> -- Matt
> It's not what I know that counts.
> It's what I can remember in time to use.
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit 
> http://www.beowulf.org/mailman/listinfo/beowulf



