[Beowulf] Re: how large of an installation have people used NFS with? would 300 mounts kill performance?
Greg at keller.net
Thu Sep 10 11:13:02 PDT 2009
On Sep 10, 2009, at 10:44 AM, Rahul Nabar wrote:
> On Wed, Sep 9, 2009 at 3:38 PM, Greg Keller <Greg at keller.net> wrote:
>> For example, lots of nodes reading and writing
>> different files in a generically staggered fashion,
> How do you enforce the staggering? Do people write staggered I/O codes
> themselves? Or can one alleviate this problem by scheduler settings?
Although there's probably a way to enforce it at the app level, or in
the scheduler, all of that would require specific knowledge of which
jobs (and nodes) are accessing which files, how, and when. I was
thinking that if it's largely embarrassingly parallel jobs that
start/stop independently and have somewhat randomized I/O, then there is
some natural staggering. If the app starts on all nodes simultaneously
and then they all start reading/writing the same files nearly
simultaneously, then staggering is probably impossible and a parallel
FS is worth investigating.
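
For the app-level route, here is a minimal sketch of one way to do it,
assuming each job can tolerate a short delay before its first big read
or write. The function names and the 30-second window are hypothetical,
not something from an existing tool. Each node hashes its own hostname
into a stable offset inside the window, so the nodes spread their I/O
start times without any coordination.

```python
import hashlib
import socket
import time


def stagger_delay(node_name: str, window_seconds: float) -> float:
    """Map a node name to a stable delay in [0, window_seconds)."""
    digest = hashlib.sha256(node_name.encode("utf-8")).digest()
    # Use 48 bits of the hash so the fraction is exact in a float
    # and always strictly less than 1.0.
    fraction = int.from_bytes(digest[:6], "big") / 2**48
    return fraction * window_seconds


def staggered_start(window_seconds: float = 30.0) -> float:
    """Sleep for this node's offset, then return the delay used."""
    delay = stagger_delay(socket.gethostname(), window_seconds)
    time.sleep(delay)
    return delay


# Hypothetical usage at the top of a job script, before the first
# large read or write against the NFS mount:
#     staggered_start(window_seconds=30.0)
```

This only spreads out the start of the I/O. It does nothing for an app
that needs every node to hit the same files in lockstep, which is the
case where a parallel FS is the better answer.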
>> Lustre or eventually pNFS if things get ugly. But not all NFS
>> servers are
>> created equal, and a solid purpose-built appliance may handle loads a
>> general-purpose Linux NFS server won't.
> Disk array connected to generic Linux server? Or standalone
> Fileserver? Recommendations?
> What exactly does a "solid purpose built appliance" offer that a
> Generic Linux server (well configured) connected to an array of disks
> does not offer?
Joe's post is spot on here. Don't let legend and lore scare you off:
NFS can do great things on current generic and special purpose servers
with the right config and software. There's nothing in your
configuration and usage summary that screams NFS killer to me. If you
use generic or special purpose servers, you can repurpose them as part
of a parallel FS if you need to.
Purpose-built *appliances* generally give you:
- Simple setup and admin GUI
- Replication and other fancy features HPCC doesn't normally care about
- Zero flexibility if you change course and head towards a parallel FS
- A singular support channel to complain to if things go badly (YMMV)
None of those matter to me more than the money they cost, so I buy
standard servers and run standard Linux NFS on internal RAID
controllers with no HA, and have occasional crashes and issues I can't
resolve cleanly. We are perpetually looking for a "next step" to get
better support/stability, but it's good enough for our 300- and
600-node systems at the moment.