[Beowulf] best archetecture / tradeoffs
dgs at gs.washington.edu
Tue Aug 30 16:56:32 PDT 2005
> > I'd heard people say that was a problem, but haven't found it so. what files
> > are inadequately cached by NFS and wind up causing noticeable traffic?
> I've not done a scientific study of this, but I did use "lsof" to find various
> files that were actively causing NFS traffic when I used to do NFS-root.
> Not large data traffic, but more status/attribute traffic. The bulk data would
> get cached, but NFS would keep checking if a file/directory had changed
> status. At least that was what I remember from several years ago.
> Maybe NFS-root has become less chatty since 2001 or so.
> How many nodes can you safely handle with NFS-root? I use a single
> Warewulf master on 192 nodes (KLAT2 + KASY0), and I know there are
> bigger installations. The design goal is to scale to a thousand+ nodes
> from a single boot/master machine. My NFS-root experience stopped
> at 64 nodes, the original configuration of KLAT2, and it worked fine
> as far as the load on the boot/master at that time, but it seemed to
> be close to the frailty limits of circa-2001 NFS.
One clever alternative to NFS-root that I've seen uses AFS instead, with a
replicated root volume mounted read-only on the clients. AFS' local
caching makes it fairly efficient, and replicating the volume means
the server is no longer a single point of failure. As long as one
server hosting the volume is available, the clients will keep running.
I've considered using AFS in place of NFS in clusters, but have been
discouraged by the difficulty of managing tokens and Kerberos
tickets. AFS-root is pretty slick, though.
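
To illustrate why replication removes the single point of failure, here is a
minimal sketch of a client falling back across read-only replicas. It is not
how the AFS cache manager is actually implemented; the server names, the
`fake_fetch` helper, and the "try each replica in order" policy are all
assumptions made for the example.

```python
def read_from_replicas(replicas, fetch):
    """Try each replica in order and return (server, data) from the
    first one that answers. Raise ConnectionError if none respond."""
    for server in replicas:
        try:
            return server, fetch(server)
        except ConnectionError:
            # This replica is unreachable; fall through to the next one.
            continue
    raise ConnectionError(f"all {len(replicas)} replicas unavailable")


# Hypothetical replica names and outage set, for illustration only.
REPLICAS = ["afs1", "afs2", "afs3"]
DOWN = {"afs1", "afs2"}


def fake_fetch(server):
    """Stand-in for a real read from a file server (assumption)."""
    if server in DOWN:
        raise ConnectionError(f"{server} is down")
    return b"root volume data"


if __name__ == "__main__":
    server, data = read_from_replicas(REPLICAS, fake_fetch)
    print(f"served by {server}: {len(data)} bytes")
```

With two of the three hypothetical servers down, the read still succeeds from
the remaining one; only when every replica is unreachable does the client see
an error.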
> > it seems like starting a new job would read little more than the user's
> > shell, some shared libraries, /etc/passwd and friends. I haven't tried to
> > collect traces, but they seem quite NFS-caching-friendly...
> Yes, the data side of those files is NFS-caching-friendly. But because
> NFS has no true coherency protocol, each node must periodically
> check on the attributes of each open NFS file looking for possible
> changes (even for readonly files). And, with all the important shared
> libraries already on the ramdisk, job startup in Warewulf can be very
> fast. If you find there is some rarely used, but large, shared library
> installed in your VNFS, you can add it to the excludes file, and it would
> be obtained over NFS rather than from the ramdisk (when in hybrid mode).
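
A rough way to see how the attribute-check traffic described above scales is
to assume that every node revalidates each of its open NFS files once per
attribute-cache timeout. That is a crude upper-bound model, not a measurement:
real clients revalidate on access and adapt their timeouts, and the
files-per-node figure below is a made-up number. On Linux the timeout for
regular files is bounded by the `acregmin`/`acregmax` mount options, which
typically default to 3 and 60 seconds.

```python
def getattr_rate(nodes, files_per_node, attr_timeout_s):
    """Estimate attribute checks per second arriving at the NFS server.

    Assumes each node revalidates every open file once per timeout
    window (a simplifying assumption, not measured behavior).
    """
    if attr_timeout_s <= 0:
        raise ValueError("attr_timeout_s must be positive")
    return nodes * files_per_node / attr_timeout_s


if __name__ == "__main__":
    FILES_PER_NODE = 50  # hypothetical number of open NFS files per node
    for nodes in (64, 192, 1000):
        for timeout in (3, 60):  # assumed short and long cache timeouts
            rate = getattr_rate(nodes, FILES_PER_NODE, timeout)
            print(f"{nodes:5d} nodes, {timeout:2d}s timeout: "
                  f"~{rate:,.0f} attribute checks/s")
```

Under this model the load grows linearly with node count, which is why keeping
the commonly used shared libraries on the ramdisk, so they are never looked up
over NFS at all, matters more as the cluster gets larger.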