Node cloning

Felix Rauch rauch at inf.ethz.ch
Fri Apr 6 00:39:59 PDT 2001


On Thu, 5 Apr 2001, Robert G. Brown wrote:
[Copying /dev/hda to /dev/hd?]
> One of many possible problems, actually.  This approach to cloning
> makes me shudder -- things like the devices in /dev generally have to
> be built, not copied; there are issues with the boot blocks and bad block
> lists and the bad blocks themselves on both target and host.  raw
> devices are dangerous things to use as if they were flatfiles.

Unfortunately I'm not an expert in disk technology, so I might be
wrong here... but I thought that the bad block lists were maintained
by the disks themselves and were not visible to the OS.

In any case: We did not have any instability issues due to cloning in
the last few years.

[...]
> One reason I gave up cloning (after investing many months writing a
> first generation cloning tool for nodes (which booted a diskless
> configuration, formatted a local disk, and cloned itself onto the local
> disk) and started a second generation GUI-driven one) was that just
> cloning isn't enough.  There is all sorts of stuff that needs to be done
> to the clones to give them a unique identity (even something as simple
> as their own ssh keys), one needs to rerun lilo, it requires that you
> keep one "pristine" host to use as the master to clone or you have the
> very host configuration creep you set out to avoid.  Either way you end
> up inevitably having to upgrade all the nodes or install security or
> functionality updates.

Let me just add a few insights from our years of experience here:
- We use DHCP to assign (fixed) IP addresses to nodes. The only
  problem here is to get the list of all MAC addresses in the first
  place.
- We use the same SSH hostkey for all nodes in our cluster (not for
  the server and our personal workstations though).
- When we clone whole disks or whole partitions, we don't need to run
  lilo, fdisk or whatever. The disks are identical after the clone,
  including partition tables and boot sectors.
- An additional boot script called "personalize" personalizes the
  machines during the first boot-up. Based on the hostname the script
  mounts additional external disk drives, configures additional
  network interfaces etc.
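
A minimal sketch of the DHCP point above: once the list of MAC
addresses has been collected, the fixed-address entries can be generated
rather than typed by hand. This assumes ISC dhcpd's host declaration
syntax; the function name, the "node" hostname prefix, the starting IP
address, and the sample MAC addresses are all hypothetical and not taken
from our setup.

```python
import ipaddress


def dhcpd_host_entries(macs, first_ip="192.168.1.101", prefix="node"):
    """Build ISC dhcpd-style host declarations, one fixed IP per MAC.

    The hostname prefix and starting address are illustrative
    assumptions; adapt them to the cluster's own naming scheme.
    """
    base = ipaddress.IPv4Address(first_ip)
    entries = []
    for i, mac in enumerate(macs):
        name = f"{prefix}{i + 1:02d}"  # node01, node02, ...
        entries.append(
            f"host {name} {{\n"
            f"  hardware ethernet {mac.lower()};\n"
            f"  fixed-address {base + i};\n"
            f"}}"
        )
    return "\n".join(entries)


if __name__ == "__main__":
    # Hypothetical MAC addresses, in the order the nodes should be numbered.
    print(dhcpd_host_entries(["00:50:56:AA:BB:01", "00:50:56:AA:BB:02"]))
```

The output can be pasted into the DHCP server configuration. Because
each hostname and IP address is then tied to a MAC address, a first-boot
script such as "personalize" can key its per-node decisions on the
hostname.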

To conclude: If we want to update our cluster, then we update a master
machine, boot all machines into a small maintenance Linux via PXE, run
Dolly on all machines to clone them, reboot, done. There are no
post-cloning operations required, but as usual, YMMV.
  
Of course there might be better ways to install your cluster,
depending on your needs, configuration, experience, etc. For
(mostly) homogeneous mid-sized clusters (we have 16--24 nodes in our
clusters), cloning works well.

- Felix
-- 
Felix Rauch                      | Email: rauch at inf.ethz.ch
Institute for Computer Systems   | Homepage: http://www.cs.inf.ethz.ch/~rauch/
ETH Zentrum / RZ H18             | Phone: ++41 1 632 7489
CH - 8092 Zuerich / Switzerland  | Fax:   ++41 1 632 1307




