revamping our beowulf
Robert G. Brown rgb at phy.duke.edu
Fri Sep 6 06:19:04 PDT 2002
On Thu, 5 Sep 2002, Tintin J Marapao wrote:

> Hi All,
>
> In the lab I work in, we have a 10 node suse linux cluster. It's about 2-3
> years old and has started to act really really funky. We are planning to
> overhaul the whole thing, replacing the hard drive of the world node and
> installing a newer version of suse 8.0. Does anyone have any tips before I
> start thrashing it (aside from crossing my fingers?)
> I am actually more concerned about how I can go about cloning the nodes
> efficiently...with the least amount of anxiety

Convert to Red Hat, learn to use kickstart, PXE, DHCP and yum.

With kickstart, PXE, DHCP and yum, one can develop a node "image"
(kickstart recipe) and boot the nodes via PXE/DHCP, kickstart install
them to an identical configuration, and maintain them transparently with
yum.

If your nodes are too old for PXE support in the BIOS, you can
accomplish the same thing with a suitable boot floppy and no PXE. Boot
from (standard RH netboot) floppy, entering "ks" at the boot prompt.
Node gets identity and directions to KS file and install sources from
DHCP, installs itself, reboots into production with a terminating
"reboot" command in %post.

This approach has many lovely things about it. All nodes are identical.
All nodes are automagically maintained to REMAIN identical. All nodes
can be upgraded or reinstalled in about 30 minutes of your time from a
standing start at any time you wish. If the nodes support PXE, you
don't even have to be there and the time required might be as little as
ten or fifteen minutes.
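The "node gets identity and directions to KS file from DHCP" step can be sketched as below. This is only an illustration, not the configuration used on any particular cluster: it assumes an ISC dhcpd server, and the node names, MAC addresses, IP addresses, server address and kickstart path are all hypothetical. It also assumes the floppy case, where a bare "ks" at the boot prompt makes the installer take the kickstart location from the DHCP reply (`next-server` and `filename`). With PXE, `filename` would normally name the network boot loader instead, and the kickstart location would be passed another way.

```python
from typing import List, Tuple

# (hostname, MAC address, IP address) for each node; all values are made up.
Node = Tuple[str, str, str]


def host_stanza(name: str, mac: str, ip: str, server: str, ks_path: str) -> str:
    """Render one ISC dhcpd 'host' declaration for a single node."""
    lines = [
        f"host {name} {{",
        f"    hardware ethernet {mac};",   # ties the identity to the NIC
        f"    fixed-address {ip};",        # node always gets the same address
        f"    next-server {server};",      # assumed install/kickstart server
        f'    filename "{ks_path}";',      # assumed path to the shared ks file
        "}",
    ]
    return "\n".join(lines)


def render_hosts(nodes: List[Node], server: str, ks_path: str) -> str:
    """Render host declarations for every node, all sharing one ks file."""
    return "\n\n".join(
        host_stanza(name, mac, ip, server, ks_path) for name, mac, ip in nodes
    )


if __name__ == "__main__":
    # Hypothetical 10-node cluster on a private network.
    cluster = [
        (f"node{i:02d}", f"00:11:22:33:44:{i:02x}", f"192.168.1.{100 + i}")
        for i in range(1, 11)
    ]
    print(render_hosts(cluster, "192.168.1.1", "/kickstart/node.cfg"))
```

The output would be pasted into (or included from) dhcpd.conf. Because every node points at the same kickstart file, adding a node is one more tuple in the list, which is where the "all nodes are identical" property comes from.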
I put on dog-and-pony shows with our cluster from time to time, and one
amusing trick is to put an install boot floppy hacked to make ks the
timeout default (a bit of a pain, involving mkinitrd and so forth, but
not horribly difficult) into the drive of an idle node, punch reset,
wait until the floppy stops spinning and remove it, all the while
talking about how easy it is to install and maintain a cluster. Wave
one's arms a bit, talk about the importance of scalability in cluster
administration, and have the node finish its reinstallation back to
EXACTLY the same state it started from just as you finish the spiel (a
few minutes, depending on how loaded the install server is).

Some friends of mine on this list with Gbit ethernet and fast servers
have installed on the order of 60 nodes in about 10 minutes, without
even using PXE.

   rgb

>
> Any input is welcome :)
>
> Thanks,
>
> Tintin
>
> _______________________________________________
> Beowulf mailing list, Beowulf at beowulf.org
> To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>

-- 
Robert G. Brown                        http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu
