[Beowulf] Which distro for the cluster?

Thu Dec 28 09:24:47 PST 2006

On Wed, 27 Dec 2006, Chetoo Valux wrote:

> Dear all,
>
> As a Linux user I've worked with several distros as RedHat, SuSE, Debian and
> derivatives, and recently Gentoo.
>
> Now I face the challenge of building a HPC for scientific calculations, and
> I wonder which distro would suit me best. As a Gentoo user, I've recognised
> the power of customisation, optimisation and lightweight system, for
> instance my 4 years old laptop flies like a youngster, and some desktops
> too. So I thought about building the HPC nodes (8+1 master) with Gentoo ....
>
> But then it comes the administration and maintenance burden, which for me it
> should be the less, since my main task here is research ... so browsing the
> net I found Rocks Linux with plenty of clustering docs and administration
> tools & guidelines. I feel this should be the choice in my case, even if I
> sacrifice some computation efficiency.
>
> Any advice on this will be appreciated.

Sigh.  I already wrote up a nice offline reply to this once today; let
me do a shorter version with commentary for you as well.

I'd be interested in comments to the contrary, but I suspect that Gentoo
is pretty close to the worst possible choice for a cluster base.  Maybe
slackware is worse, I don't know.

I personally would suggest that you go with one of the mainstream,
reasonably well supported, package based distributions.  Centos, FC, RH,
SuSE, Debian/Ubuntu.  I myself favor RH derived, rpm-based,
yum-supported distros that can be installed by PXE/DHCP, kickstart, yum
from a repository server.  Installation of such a cluster on diskful
systems proceeds as follows:

   a) Set up a mirror (probably a tertiary mirror) of e.g. FC6 or Centos
4.  Choose the former if you want to run really current hardware and
like really up to date libraries, choose a more conservative distro like
Centos or RHEL if you want longer term support and less volatility.  If
Centos supports your hardware platforms it is a fine choice, but it
often won't run on very new chipsets and CPUs.

   b) Set up a DHCP/PXE server -- dhcpd, tftpd, etc -- to enable network
boot of the standard installation images for the distro of your choice.

   c) Develop a node kickstart file, and hotwire it into your
installation image.  This can be done per node architecture, or can be
done once and for all and then customized per node architecture with
smart runtime scripts (that's how Duke tends to do it, but then folks
here invented the scripts).  Set up dhcp/pxe so that a network boot
option triggers the appropriate kickstart install.  The %post script
should do all end-stage configuration.

   d) Boot each node once with keyboard and video (KV) attached, set the
BIOS to boot from the network, and record the MAC address of the boot
interface(s).  Enter
these into your dhcp file and other /etc places so that your nodes each
have a unique boot identity and name (like b01, b02, b03...).

   e) Boot each node a second time, and either select the kickstart
install from the netboot options, or boot the node with a toggle script
or a BIOS boot order that network boots the installer the first time
and thereafter boots normally from disk unless you REQUEST a
reinstall.  Once the node is installed, initiating a reinstall can be
done via grub, for example, without any need for a KV hookup to the
node.
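
To make the "toggle script" in e) concrete, here is a minimal sketch in
Python.  It assumes pxelinux is the network boot loader and that the
BIOS always tries the network first; the kernel/initrd names, the
kickstart URL and the function names are placeholders of mine, not
anything from the Duke scripts.  pxelinux looks for a per-node config
file named 01-<MAC with dashes>, so flipping one small file on the
server switches a node between "install" and "boot from local disk".

```python
from pathlib import Path

# Boot menu that runs the kickstart install (kernel and initrd names are
# assumed to be the distro's PXE images copied into the tftp root).
INSTALL_CFG = """default install
label install
  kernel vmlinuz
  append initrd=initrd.img ks={ks_url}
"""

# Boot menu that hands control back to the local disk.
LOCAL_CFG = """default local
label local
  localboot 0
"""


def pxe_config_name(mac: str) -> str:
    """Per-node file name pxelinux searches for: 01-<MAC, dashes, lowercase>."""
    return "01-" + mac.lower().replace(":", "-")


def set_boot_mode(cfg_dir: Path, mac: str, reinstall: bool, ks_url: str) -> Path:
    """Write the node's pxelinux config: install if requested, else local disk."""
    cfg_dir.mkdir(parents=True, exist_ok=True)
    path = cfg_dir / pxe_config_name(mac)
    path.write_text(INSTALL_CFG.format(ks_url=ks_url) if reinstall else LOCAL_CFG)
    return path
```

You would point cfg_dir at the pxelinux.cfg directory under your tftp
root and call set_boot_mode(..., reinstall=True, ...) to request a
reinstall.  Something on the server still has to flip the node back to
local boot once the install finishes; the sketch leaves that part out.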

Voila!  Instant cluster.  a-c are done once, d and e are the only steps
that require per-node action on your part.  If you have a friendly
vendor you may even be able to skip d if they are willing to preset the
BIOS according to your specs and label the network ports with the
connected MAC address.  In that case you install the cluster by editing
in each node's MAC address (on the server) and turning the node on.
Can't get MUCH easier than that, although there are some toolsets out
there that will just glomph the MAC address from the initial netboot and
do the tablework for you -- which may or may not work for you depending
on your network and how much control you want over the node name and
persistence of the node identity.
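
The "tablework" can be as small as the sketch below, which turns a
hand-kept table of (name, MAC, IP) into host entries for ISC dhcpd.  The
node names, addresses, boot server IP and the pxelinux.0 boot file are
made-up examples; the directives themselves (hardware ethernet,
fixed-address, next-server, filename) are standard dhcpd.conf syntax.
Treat it as an illustration, not as the tooling any site actually uses.

```python
import re

MAC_RE = re.compile(r"^([0-9a-f]{2}:){5}[0-9a-f]{2}$")


def host_stanza(name, mac, ip, next_server, bootfile="pxelinux.0"):
    """Return one dhcpd.conf host block binding a MAC to a fixed name and IP."""
    mac = mac.lower()
    if not MAC_RE.match(mac):
        raise ValueError(f"bad MAC address for {name}: {mac}")
    return (
        f"host {name} {{\n"
        f"    hardware ethernet {mac};\n"
        f"    fixed-address {ip};\n"
        f'    option host-name "{name}";\n'
        f"    next-server {next_server};\n"
        f'    filename "{bootfile}";\n'
        f"}}\n"
    )


def dhcp_hosts(nodes, next_server):
    """Join the host blocks for every (name, mac, ip) row in the node table."""
    return "\n".join(host_stanza(n, m, ip, next_server) for n, m, ip in nodes)


# Hypothetical node table: one row per node, MACs as recorded in step d.
NODES = [
    ("b01", "00:16:3e:aa:bb:01", "192.168.1.101"),
    ("b02", "00:16:3e:aa:bb:02", "192.168.1.102"),
]

if __name__ == "__main__":
    print(dhcp_hosts(NODES, next_server="192.168.1.1"))
```

Paste the output inside the relevant subnet or group declaration in
dhcpd.conf and restart dhcpd.  Keeping the table by hand is what gives
you control over which box is b01 and keeps that identity stable.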

Thereafter yum maintains the nodes automatically from the repo mirror --
you shouldn't have to "touch" the nodes as unique objects to be
administered except when they break at the hardware level, ever again.
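
For this to work each node has to point yum at your mirror rather than
at the public ones.  A rough sketch of generating the repo file follows;
the repo id, description and mirror URL are invented for the example,
while name, baseurl, enabled and gpgcheck are ordinary yum repo options.
The kickstart %post script is one natural place to drop the result into
/etc/yum.repos.d/, with a nightly cron job running "yum -y update" (my
assumption about how you would schedule it).

```python
import configparser
import io


def repo_file(repo_id, name, baseurl, gpgcheck=True):
    """Return the text of a yum .repo file with a single repository section."""
    cp = configparser.ConfigParser(interpolation=None)
    cp.optionxform = str  # keep option names exactly as written
    cp[repo_id] = {
        "name": name,
        "baseurl": baseurl,
        "enabled": "1",
        "gpgcheck": "1" if gpgcheck else "0",
    }
    buf = io.StringIO()
    cp.write(buf, space_around_delimiters=False)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical local mirror on the cluster's head node.
    print(repo_file("cluster-base", "Local mirror - base",
                    "http://192.168.1.1/mirror/os/"))
```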

An alternative for diskless nodes or nodes that don't actually boot from
their local disks but use them instead for scratch space is to use
warewulf with the distro of your choice.  You still have to do a lot of
the above, but warewulf "helps" you set up PXE and tftpd and so on, and
lets you use a single image per node architecture.  You still use yum or
apt to update those images (with a bit more chroot-y work to accomplish
it) but the core warewulf package helps you maintain node identity from
the single image and diskless nodes are arguably more stable -- disk is
one of the top two or three sources of hardware failure.

   HTH,

     rgb

>
> Chetoo.
>

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu