[Beowulf] SATA II - PXE+NFS - diskless compute nodes

Simon Kelley simon at thekelleys.org.uk
Tue Dec 12 09:46:44 PST 2006


Joe Landman wrote:
>>> I would hazard that any DHCP/PXE type install server would struggle
>>> with 2000 requests (yes- you arrange the power switching and/or
>>> reboots to stagger at N second intervals).

> fwiw:  we use dnsmasq to serve dhcp and handle pxe booting.  It does a
> marvelous job of both, and is far easier to configure (e.g. it is less
> fussy) than dhcpd.

Joe, you might like to know that the next release of dnsmasq includes a
TFTP server so that it can do the whole job. The process model for the
TFTP implementation should be well suited to booting many nodes at once
because it multiplexes all the connections in a single process. My guess
is that this will work better than having inetd fork 2000 copies of tftpd,
which is what would happen with traditional TFTP servers.
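
To illustrate the single-process idea, here is a minimal Python sketch. It is not dnsmasq's code: the `Multiplexer` class, its `handle` method and the in-memory `files` dict are hypothetical names chosen for the example. Only the packet layouts (RRQ, DATA, ACK, 512-byte blocks) come from the TFTP spec, RFC 1350. Sockets, timeouts, retransmission and error packets are left out, so this shows only how one process can keep a small state entry per client instead of one forked process per client.

```python
import struct

BLOCK = 512               # data block size from RFC 1350
RRQ, DATA, ACK = 1, 3, 4  # TFTP opcodes from RFC 1350


class Multiplexer:
    """Track every in-flight transfer in one process, keyed by client address."""

    def __init__(self, files):
        self.files = files    # hypothetical in-memory store: name -> bytes
        self.transfers = {}   # client address -> (file contents, last block sent)

    def _data(self, addr, block):
        # Build DATA packet number `block` and remember that it was sent.
        content = self.transfers[addr][0]
        chunk = content[(block - 1) * BLOCK:block * BLOCK]
        self.transfers[addr] = (content, block)
        return struct.pack("!HH", DATA, block) + chunk

    def handle(self, addr, packet):
        """Return the reply to one incoming packet, or None if nothing is sent."""
        opcode = struct.unpack("!H", packet[:2])[0]
        if opcode == RRQ:
            name = packet[2:].split(b"\0")[0].decode("ascii")
            if name not in self.files:
                return None   # a real server would send an ERROR packet here
            self.transfers[addr] = (self.files[name], 0)
            return self._data(addr, 1)
        if opcode == ACK and addr in self.transfers:
            acked = struct.unpack("!H", packet[2:4])[0]
            content, last = self.transfers[addr]
            if acked != last:
                return None   # stale or duplicate ACK: ignore it
            if len(content) < acked * BLOCK:
                # The block just acknowledged was short, so the transfer is done.
                del self.transfers[addr]
                return None
            return self._data(addr, acked + 1)
        return None
```

In a real server the `handle` calls would be driven by a single loop reading datagrams from a UDP socket, so 2000 booting nodes cost 2000 dictionary entries rather than 2000 processes.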

If anyone on the list has a suitable test setup, I'd be very happy to do
some pre-release load testing.

For ultimate scalability, I guess the solution is to use multicast-TFTP.
I know that support for that is included in the PXE spec, but I've never
tried to implement it. Based on prior experience of PXE ROMs, the chance
of finding a sufficiently bug-free implementation of mtftp there must be
fairly low.

>> There are a few modifications you have to make to increase the number
>> of bootps before it fails.

> Likely with dhcpd, not sure how many dnsmasq can handle, but we have
> done 36 at a time to do system checking.  No problems with it.

Dnsmasq will handle DHCP for thousands of clients on reasonably meaty
hardware. The only rate-limiting step is a 2-3 second timeout while
newly-allocated addresses are "ping"ed to check that they are not in
use. That check is optional, and skipped automatically under heavy load,
so a large number of clients is no problem.
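
As a rough illustration of that behaviour, here is a small Python sketch of an allocator that probes a candidate address before handing it out and skips the probe when many requests are waiting. It is not dnsmasq's implementation: the `allocate` function, the `is_alive` callback standing in for the ICMP echo, and the `busy_threshold` value are all assumptions made for the example.

```python
def allocate(pool, leased, is_alive, pending, busy_threshold=100):
    """Return the first free address in pool, or None if the pool is exhausted.

    is_alive       -- callable standing in for the "ping" probe (hypothetical)
    pending        -- number of requests currently waiting
    busy_threshold -- hypothetical cut-off above which the probe is skipped
    """
    probe = pending < busy_threshold
    for addr in pool:
        if addr in leased:
            continue          # already handed out by this server
        if probe and is_alive(addr):
            continue          # something answers at this address; try the next
        leased.add(addr)
        return addr
    return None
```

Under these assumptions a lightly loaded server steps past an address that answers the probe, while a heavily loaded one hands out the first unleased address without waiting for the timeout. The price is that a conflict with a statically configured host would go undetected.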


Cheers,

Simon.





