[Beowulf] Building new cluster - estimate
landman at scalableinformatics.com
Tue Aug 5 19:43:47 PDT 2008
As a note: I was pointed to a recent lockup (double lock acquisition)
in XFS with NFS. I don't think I have seen this one in the wild myself.
Right now I am fighting an NFS over RDMA crash in 2.6.26 which seems
to have been cured in 2.6.26.1. The .2 release is almost out, so I will
test with that as well.
This said, our experience with xfs has been quite good (performance,
reliability, etc.). Some vendors' kernels (2.6.18 ahem!) have some issues
with xfs (and a bunch of other things), so we usually update them anyway.
Matt Lawrence wrote:
> On Wed, 6 Aug 2008, Chris Samuel wrote:
>> I suspect you're not doing a lot of disk I/O, we
>> found NFS servers using ext3 as a back end would
>> crumble under the weight of lots of writes as ext3
>> is single threaded through the journal daemon.
>> That means that you end up with all your NFS daemons
>> blocking on that, stalling everything else. :-(
> Could be. Given the long and sordid history of NFS, I prefer to not use
> it whenever there are practical alternatives. I'm also not a Solaris
> fanboy. So, a different mindset than a lot of unix sysadmins.
>> There have been occasional bugs in XFS in older kernel
>> releases, but then there have been bugs in other filesystems
> That could be it, he does spend a fair amount of time cleaning up
> systems that others have built.
>> Never had that problem here.
>> Does he know that he can use xfs_fsr to defragment
>> XFS filesystems online ?
> He certainly does. He was talking about using OpenNMS to determine the
> best time to run it. He had lots of good things to say about how easy
> it is to track through performance data with it.
>> Is he sure he's not hitting another kernel bug ?
> It wouldn't surprise me.
> This is someone who I trust enough that if he warns me of something, I
> make a real effort to doublecheck if it is currently a problem. It
> doesn't mean he is always right, just that I think the research effort
> is a really good idea.
> -- Matt
> It's not what I know that counts.
> It's what I can remember in time to use.
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615