[Beowulf] how large of an installation have people used NFS with? would 300 mounts kill performance?
Mark Hahn hahn at mcmaster.ca
Wed Sep 9 22:11:38 PDT 2009
> Our new cluster aims to have around 300 compute nodes. I was wondering
> what is the largest setup people have tested NFS with? Any tips or

well, 300 is no problem at all. though if you're talking to a single
Gb-connected server, you can't hope for much BW per node...

> comments? There seems no way for me to say if it will scale well or
> not.

it's not too hard to figure out some order-of-magnitude bandwidth
requirements. how many nodes need access to a single namespace at once?
do jobs drop checkpoints of a known size periodically? faster/more ports
on a single NFS server get you fairly far (hundreds of MB/s), but you
can also aggregate across multiple NFS servers (if you don't need all the
IO in a single directory...)

> I have been warned of performance hits but how bad will they be?

NFS is fine at hundreds of nodes. nodes can generate a fairly high load
of, for instance, getattr calls, but that can be mitigated somewhat by
raising the acregmin mount option (the minimum time the client caches
file attributes).

> Infiniband is touted as a solution but the economics don't work out.

depends on how much bandwidth you need...

> Assume each of my compute nodes have gigabit ethernet AND I specify
> the switch such that it can handle full line capacity on all ports.

but why? your fileservers won't support saturating all nodes' links at
once, so why a full-bandwidth fabric? the fabric backbone only needs to
match the capacity of the storage (I'd guess 10G would be reasonable,
unless you really ramp up the number of fat fileservers.) or do you
mean the fabric is full-bandwidth to optimally support MPI?

> If not NFS then Lustre etc options do exist. But the more I read about

yes - I wouldn't resort to Lustre until it was clear that NFS wouldn't
do. Lustre does a great job of scaling content bandwidth and capacity
all within a single namespace. but NFS, even several instances, is
indeed a lot simpler...
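for what it's worth, the order-of-magnitude estimate mentioned above can be sketched in a few lines of Python. this is only a rough illustration, not a benchmark: the function names, the 0.9 link-efficiency factor, and the 100 MB checkpoint size are all assumptions made up for the example, and it pretends the server's network link is the only bottleneck (disks, NFS daemon threads and getattr traffic are ignored).

```python
def per_node_mb_s(server_link_gbit, nodes, efficiency=0.9):
    """Fair-share MB/s per node if every node hits one server at once.

    efficiency is an assumed fudge factor for protocol overhead,
    not a measured value.
    """
    link_mb_s = server_link_gbit * 1000 / 8 * efficiency
    return link_mb_s / nodes


def checkpoint_drain_seconds(nodes, checkpoint_mb, server_mb_s):
    """Seconds for all nodes to write one checkpoint each to one server."""
    return nodes * checkpoint_mb / server_mb_s


if __name__ == "__main__":
    nodes = 300
    for gbit in (1, 10):
        share = per_node_mb_s(gbit, nodes)
        print(f"{gbit:>2} Gb server link: ~{share:.2f} MB/s per node")

    # hypothetical workload: every node drops a 100 MB checkpoint
    server_mb_s = per_node_mb_s(1, 1)  # whole Gb link, same efficiency
    t = checkpoint_drain_seconds(nodes, 100, server_mb_s)
    print(f"300 x 100 MB checkpoints over 1 Gb: ~{t / 60:.1f} minutes")
```

plugging in your own node count, checkpoint size and server link speed should tell you fairly quickly whether one NFS server is plausible or whether you need faster ports or several servers.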
