[Beowulf] SATA II - PXE+NFS - diskless compute nodes

Mark Hahn hahn at physics.mcmaster.ca
Sat Dec 9 14:39:25 PST 2006


>> particular lightweight
>> compute node model, (PXE booting into RAM) and so does not run into the
>> typical nfs-root scalability issues.

I'm not sure I know what those would be.  do you mean that the kernel code
for nfs-root has inappropriate timeouts or lacks effective retries?

> At what node count does the nfs-root model start to break down?  Does anyone
> have any rough numbers with the number of clients you can support with a generic
> linux NFS server vs a dedicated NAS filer?

I think the answer depends mostly on your config.  for instance, if you
have a typical distro's incredibly baroque /etc/rc.d tree, then you'll
be generating tons of traffic even though NFS caches quite well.
but for HPC clustering, most of that is completely spurious - often
a cluster's nodes are all identical, so no extensive configurability
is necessary in modules, daemons, etc.

if there are scalability issues, they depend on saturating your NFS server
with traffic, but you control that amount.  on a somewhat neglected
cluster I have, kernel+initrd amount to 3779277 bytes, which seems quite
high, and probably limits the cluster to 10-ish nodes/second booting
(it has 100 nodes, but I've never timed the boot).  once a node has the
kernel+initrd, it reads some other files via NFS, but nothing much
(syslog binary+config, same for portmap, and sshd).  to me, the tradeoff
is transmitting via TFTP vs NFS.  I would strongly suspect that the latter
is more efficient and robust, so would prefer to minimize the kernel+initrd
size.
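
as a rough illustration of the arithmetic above, the sketch below works out
the server throughput implied by "10-ish nodes/second" for a 3779277-byte
kernel+initrd.  the function names and the smaller image size are hypothetical,
chosen only for illustration; real TFTP throughput depends on block size and
round-trip latency, which this ignores.

```python
# Back-of-envelope estimate for the TFTP stage of a PXE boot.
KERNEL_INITRD_BYTES = 3_779_277  # kernel+initrd size quoted in the post


def required_bytes_per_second(payload_bytes, nodes_per_second):
    """Server throughput needed to boot nodes at the given rate."""
    return payload_bytes * nodes_per_second


def sustainable_nodes_per_second(payload_bytes, server_bytes_per_second):
    """Boot rate a server can sustain at a given delivered throughput."""
    return server_bytes_per_second / payload_bytes


# Throughput implied by the "10-ish nodes/second" figure.
needed = required_bytes_per_second(KERNEL_INITRD_BYTES, 10)
print(f"10 nodes/s needs {needed / 1e6:.1f} MB/s ({needed * 8 / 1e6:.0f} Mbit/s)")

# ASSUMPTION: a hypothetical image of about half the size, same throughput.
HYPOTHETICAL_SMALL_IMAGE = 1_900_000
rate = sustainable_nodes_per_second(HYPOTHETICAL_SMALL_IMAGE, needed)
print(f"a {HYPOTHETICAL_SMALL_IMAGE} byte image would allow about {rate:.1f} nodes/s")
```

the point of the second calculation is only that, at a fixed server
throughput, the boot rate scales inversely with the kernel+initrd size.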

for what it's worth, I tcpdumped a node booting just now:
11428451 bytes in 14236 packets (40.8 seconds). 
that's a 2.6 kernel, myrinet support, syslog, ssh, 
queuing system written in perl, and home and scratch mounts.
with some effort, that could probably be 5MB or so.
it's also clear that separate servers could handle subsets of the
traffic in a large cluster (separate dhcp/tftp/syslog, separate
servers for nfs root vs other).
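
for a sense of scale, here is a small sketch that turns the tcpdump totals
above into per-node averages.  the server throughput and node count in the
second half are assumptions picked for illustration, not measurements, and
the result is only a lower bound since it ignores protocol overhead, latency
and contention.

```python
# Figures from the tcpdump of a single node booting, as quoted in the post.
CAPTURED_BYTES = 11_428_451
CAPTURED_PACKETS = 14_236
CAPTURE_SECONDS = 40.8

# Average load one booting node puts on the network.
avg_bytes_per_second = CAPTURED_BYTES / CAPTURE_SECONDS
avg_packet_bytes = CAPTURED_BYTES / CAPTURED_PACKETS
print(f"average rate per node: {avg_bytes_per_second / 1e3:.0f} kB/s")
print(f"average packet size:   {avg_packet_bytes:.0f} bytes")

# ASSUMPTIONS (not from the post): a server that can deliver 100 MB/s,
# and 100 nodes that all boot at the same time.
ASSUMED_SERVER_BYTES_PER_SECOND = 100e6
ASSUMED_NODE_COUNT = 100

total_bytes = CAPTURED_BYTES * ASSUMED_NODE_COUNT
min_seconds = total_bytes / ASSUMED_SERVER_BYTES_PER_SECOND
print(f"{ASSUMED_NODE_COUNT} nodes move {total_bytes / 1e9:.2f} GB, "
      f"at least {min_seconds:.1f} s at the assumed server rate")
```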

regards, mark hahn.


