[Beowulf] Re: how large of an installation have people used NFS with? would 300 mounts kill performance?

Thu Sep 10 08:44:41 PDT 2009

On Wed, Sep 9, 2009 at 3:38 PM, Greg Keller <Greg at keller.net> wrote:
>>
> "It all depends" -- Anonymous Cluster expert

Thanks Greg. And I hate that anonymous expert. He's the bane of my
current existence. I even get nightmares with his ghastly face in
them. :)

> I routinely run NFS with 300+ nodes, but "it all depends" on the
> applications' IO profiles.

About 50% of the projected runtime is an application with negligible
reads and writes (VASP). The other 50% goes to an app (DACAPO) which,
according to strace, spends about 10% of its runtime on I/O. Mostly
seeks. More reads than writes. Multiple small reads and writes. All
cores do I/O; there is no central master core.

> For example, lots of nodes reading and writing
> different files in a generically staggered fashion,

How do you enforce the staggering? Do people write staggered I/O codes
themselves? Or can one alleviate this problem with scheduler settings?
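
To make the question concrete, by "staggered I/O code" I mean something
like the sketch below. It is purely hypothetical: the names
stagger_delay, staggered_io, n_slots and slot_seconds are mine, not from
VASP, DACAPO or any scheduler, and I don't know whether anyone actually
does it this way. Each rank just waits a rank-dependent amount of time
before it starts its I/O phase.

```python
import time


def stagger_delay(rank: int, n_slots: int, slot_seconds: float) -> float:
    """Seconds a rank waits before its I/O phase.

    Ranks are spread round-robin over n_slots time slots, so at most
    about (number of ranks / n_slots) ranks start I/O at the same moment.
    """
    if n_slots < 1:
        raise ValueError("n_slots must be at least 1")
    return (rank % n_slots) * slot_seconds


def staggered_io(rank, n_slots, slot_seconds, do_io):
    """Sleep for this rank's slot offset, then run the I/O callable."""
    time.sleep(stagger_delay(rank, n_slots, slot_seconds))
    return do_io()
```

With n_slots=4 and slot_seconds=0.5, rank 5 would wait 0.5 s and rank 3
would wait 1.5 s. This only spreads out the start times; it does nothing
if the I/O phases themselves are long enough to overlap.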

> Lustre or eventually pNFS if things get ugly.  But not all NFS servers are
> created equal, and a solid purpose-built appliance may handle loads a
> general-purpose Linux NFS server won't.

Disk array connected to a generic Linux server? Or a standalone
file server? Recommendations?

What exactly does a "solid purpose built appliance" offer that a
well-configured generic Linux server connected to an array of disks
does not?

> The bottleneck is more likely the file server's NIC and/or its back-end
> storage performance.  If the file server is 1GbE attached then having a
> strong network won't help NFS all that much.  10GbE attached will keep up
> with a fair number of RAIDed disks on the back-end.  Load the NFS server up
> with a lot of RAM and you could keep a lot of nodes happy if they are
> reading a common set of files in parallel.

Yup; I'm going for at least 24 GB RAM and twin 10 GigE cards
connecting the file server to the switch.
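
A back-of-envelope check on the NIC point, as a rough sketch only. It
assumes all 300 clients hit the server at the same time, that both
10 GigE links can be driven in parallel (e.g. bonded, which I have not
verified for my setup), and it ignores protocol overhead and the disks
entirely, so the numbers are upper bounds. The function name
per_client_mb_s is just for illustration.

```python
def per_client_mb_s(link_gbit_s: float, n_links: int, n_clients: int) -> float:
    """Theoretical MB/s per client if the server's links are shared equally."""
    # Convert Gbit/s to decimal MB/s (1 Gbit/s = 125 MB/s), then sum the links.
    total_mb_s = link_gbit_s * 1000 / 8 * n_links
    return total_mb_s / n_clients


if __name__ == "__main__":
    for label, gbit, links in [("1 x 1GbE", 1, 1), ("2 x 10GbE", 10, 2)]:
        share = per_client_mb_s(gbit, links, 300)
        print(f"{label}: {share:.2f} MB/s per client")
```

That works out to roughly 0.42 MB/s per client on a single 1GbE link
versus roughly 8.33 MB/s per client on twin 10 GigE, if everyone does
I/O at once.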

-- 
Rahul