redhat and pxe
Donald Becker becker at scyld.com
Wed Jan 29 11:49:29 PST 2003
On Wed, 29 Jan 2003, Mark Hahn wrote:

> anyway, the thing I observed is that my tftp server winds up receiving
> a corrupted filename. if dhcpd.conf says
>     option bootfile-name "/pxelinux.0";
> then the tftp server winds up receiving "/pxelinux.0\xff". (which looks
> like a y-umlaut iirc.) I think I tcpdumped both dialogs, and concluded
> that the fault is in the bios.

Hmmm, I would blame the DHCP server, or perhaps the "proxy PXE" (sic)
server. While the bug is in the client machine, this is a well-known
issue that has a trivial work-around. Any PXE server that doesn't
implement it has obviously not been tested in real life!

The TFTP protocol uses null-terminated file names, while the bootp/DHCP
options are length-specified strings without a trailing null. Put this
together with the end-of-options value of 255 (0xff), and you get the
result you observe.

The work-around is to put a '0' (pad) option immediately after the boot
file name option, which results in a boot file name that is both
correctly length-specified and null-terminated.

We have recently done work implementing the whole protocol suite
(bootp/DHCP, TFTP and PXE servers) along with implementing the clients.
Why the clients? We needed to simulate large-scale simultaneous boots.

I'm pretty much convinced that most other current implementations are
lacking or flawed. A cluster needs PXE boot services that
   are protocol-correct
   are reliable with hundreds of simultaneous clients
   can relate configuration errors to their source

PXE was obviously developed around a hacked-up DHCP server, with the
intent to re-use existing servers. But the details of the specification
don't isolate the functionality. The result is that the only correct way
to implement it is with unified servers: they must be built together.

> I've also only been able to get the nodes
> to boot off their builtin eepro100 interface, rather than the e1000
> (also-builtin - these are tyan s2720 nodes).

This is very common right now. The typical server-class board now has a
10/100 and a gigabit interface. Network booting, Wake-On-LAN and IPMI
management are implemented only on the 10/100 interface. But most users
want to operate using the gigabit interface.

Thus we have had to split our boot services from the operational
services, and go through a second discovery phase. Even though the node
has its IP configuration passed from the boot client, it must discard
that info and figure out which interface it should operate on.

-- 
Donald Becker                       becker at scyld.com
Scyld Computing Corporation         http://www.scyld.com
410 Severn Ave. Suite 210           Scyld Beowulf cluster system
Annapolis MD 21403                  410-990-9993
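
The filename corruption discussed in this message can be reproduced with a
small simulation. The sketch below is illustrative only: the option codes
(0 = pad, 67 = bootfile name, 255 = end) are the standard DHCP values, but
the function names and the model of the buggy client, one that ignores the
option length and reads the name as a C string, are assumptions made for the
example and are not taken from any real BIOS or server.

```python
PAD, BOOTFILE_NAME, END = 0, 67, 255  # standard DHCP option codes


def encode_options(bootfile: bytes, pad_after_bootfile: bool) -> bytes:
    """Build a DHCP options field holding only the boot file name (hypothetical helper)."""
    opts = bytes([BOOTFILE_NAME, len(bootfile)]) + bootfile
    if pad_after_bootfile:
        opts += bytes([PAD])          # the work-around: a '0' option after the name
    opts += bytes([END])              # end-of-options marker, 0xff
    return opts + b"\x00" * 8         # zero fill that normally follows the options


def buggy_client_filename(options: bytes) -> bytes:
    """Assumed client bug: locate option 67, then read up to the first null byte."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == END:
            break
        if code == PAD:
            i += 1
            continue
        length = options[i + 1]
        if code == BOOTFILE_NAME:
            start = i + 2
            # Ignores `length` and treats the data as a null-terminated string.
            return options[start:options.index(0, start)]
        i += 2 + length
    return b""


if __name__ == "__main__":
    print(buggy_client_filename(encode_options(b"/pxelinux.0", False)))
    print(buggy_client_filename(encode_options(b"/pxelinux.0", True)))
```

Without the pad option the simulated client picks up the 0xff end marker as
part of the name; with it, the name ends at the pad byte. A client that
honours the length field is unaffected either way, since the pad option is
simply skipped.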
