[Beowulf] NFSv3 client hangs - tcp v/s udp.

Robert Millner rmillner at fiendly.org
Mon May 8 11:19:05 PDT 2006


Were there any console messages on your NFS server?

A problem I ran into last year on a 384-node cluster was that Linux has
a hard-coded limit of 20 TCP connections per nfsd thread.  This is a bare
constant buried in the kernel server code.

The console message on the server was:
nfsd: too many open TCP sockets, consider increasing the number of nfsd threads

On any specific nfsd, connection #21 would cause connection #1 to drop.
The client behind #1 would then reconnect, becoming the new #21 and
causing someone else to drop, and so on.
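
To make the churn described above concrete, here is a toy Python model.
It is only a sketch of the behaviour as I described it, not the kernel's
actual logic: it assumes the oldest connection is the one dropped and
that a dropped client reconnects right away.  The function name and
parameters are made up for illustration.

```python
from collections import deque

def simulate_reconnects(clients, limit, rounds):
    """Toy model: count dropped connections over a number of connect events.

    Assumptions (not taken from the kernel source): the oldest open
    connection is dropped once the limit is exceeded, and the dropped
    client queues up to reconnect.
    """
    open_conns = deque()             # oldest connection on the left
    pending = deque(range(clients))  # clients waiting to connect
    drops = 0
    for _ in range(rounds):
        if not pending:
            break                    # everyone is connected, nothing to do
        open_conns.append(pending.popleft())
        if len(open_conns) > limit:
            victim = open_conns.popleft()  # over the limit: drop the oldest
            drops += 1
            pending.append(victim)         # the victim will reconnect
    return drops

if __name__ == "__main__":
    print(simulate_reconnects(clients=20, limit=20, rounds=100))
    print(simulate_reconnects(clients=21, limit=20, rounds=100))
```

In this model 20 clients against a limit of 20 never see a drop, while a
single extra client keeps the whole set cycling through reconnects.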

If I recall correctly (and I may not, so check this yourself), each
mount results in its own TCP connection.  Size the number of nfsd
threads to be at least (nodes * mounts)/20, where mounts is the number
of NFS mounts per node.  I haven't really had problems with UDP on a
low-latency, well-behaved network.
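
As a worked example of the sizing rule above, a minimal Python sketch.
The function name and arguments are hypothetical, and the default of 20
connections per thread is the limit as I remember it, so verify it
against your kernel before relying on the numbers.

```python
def nfsd_threads_needed(nodes, mounts_per_node, conns_per_thread=20):
    """Estimate the nfsd thread count from (nodes * mounts) / 20, rounded up.

    Assumes one TCP connection per mount and a fixed per-thread
    connection limit (20 here, as recalled; check your kernel).
    """
    if conns_per_thread <= 0:
        raise ValueError("conns_per_thread must be positive")
    total_connections = nodes * mounts_per_node
    # Ceiling division so a partial group of connections still gets a thread.
    return -(-total_connections // conns_per_thread)

if __name__ == "__main__":
    # 384 nodes with one mount each, then with two mounts each
    print(nfsd_threads_needed(384, 1))
    print(nfsd_threads_needed(384, 2))
```

For the 384-node cluster this gives 20 threads with one mount per node
and 39 with two.  Treat the result as a floor rather than a target; the
thread count itself is the number given to rpc.nfsd at startup.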

Anyway, I'm not sure if this is germane to your actual problem, but it's
one to be aware of.

	Cheers,
	Rob




