[Beowulf] where to start building ur own cluster distro

Tim Cutts tjrc at sanger.ac.uk
Wed Apr 16 09:18:36 PDT 2008

On 16 Apr 2008, at 4:11 pm, stephen mulcahy wrote:
> Jon Aquilina wrote:
>> desktop interface at least for the master node. then the others can  
>> pxe boot from the master. i know this becomes problematic the larger  
>> the cluster. for now i think starting like this would be good to  
>> have a distro for people such as myself who are quite new to  
>> clustering.
> If you're committed to rolling your own distro and joining the likes  
> of those listed at http://lwn.net/Distributions/ then I'd recommend  
> Debian as a good starting point.
> But I'd echo Jakob's comments - I don't see much value in building a  
> "cluster distribution". It would be far more valuable to document  
> how to install a standard distro on a cluster - either your KUbuntu,  
> standard Ubuntu or any distro of your choice (I keep meaning to do  
> this myself for work we've done with Debian in the past but the time  
> to do so keeps getting away from me).

I agree entirely.  Rather than doing the full-blown custom  
distribution thing, which is a huge amount of effort, what is somewhat  
easier is just to maintain your own local package repository for  
things which you want to maintain separately from the distribution.   
You can then use the distribution's normal tools to keep everything up  
to date, and you decide which bits you want to maintain for yourself,  
and which you want to leave to the upstream distro (as much as  
possible, in my opinion).  The way we do it here is as follows:

1)  The cluster's OS is plain ol' Debian.
2)  We have a standalone server which mirrors ftp.uk.debian.org
3)  On that same server we run the package "debarchiver", which is a  
pretty painless way of building your own Debian package repository.
4)  The /etc/apt/sources.list file on our cluster nodes contains three  
entries: the local mirror, security.debian.org, and our debarchiver  
repository.
5)  If I want a package that deviates from the normal Debian one (say,  
I want to backport something newer from lenny, or a custom package of  
local software) I can use standard Debian tools to build it  
(dpkg-buildpackage) and upload it into our local repository (e.g. dput  
sanger-etch some-fancy-thing.changes).
6)  The final piece of the puzzle is that we use cfengine to make sure  
the right packages are installed on the right machines at the right  
versions, and to maintain the configuration files for everything,  
including apt.
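
To make step 4 concrete, here is a minimal Python sketch of how a three-entry sources.list could be rendered, for example by whatever tool pushes the file out to the nodes. It is an illustration only: the two example.org hostnames, the "main" component and the use of "etch" as the suite for the local repository are assumptions, not the real configuration. Only security.debian.org is taken from the description above.

```python
# Sketch: render a three-entry /etc/apt/sources.list.
# ASSUMPTIONS: the example.org hostnames, the suite names and the
# component list are placeholders invented for illustration.

LOCAL_MIRROR = "http://mirror.example.org/debian"   # hypothetical local mirror
SECURITY = "http://security.debian.org/"            # named in the post
LOCAL_REPO = "http://repo.example.org/debian"       # hypothetical debarchiver repo


def render_sources_list(release="etch", components=("main",)):
    """Return the text of a sources.list with the three entries."""
    comps = " ".join(components)
    entries = [
        ("Local mirror of the Debian archive", LOCAL_MIRROR, release),
        ("Debian security updates", SECURITY, f"{release}/updates"),
        ("Locally built packages (debarchiver)", LOCAL_REPO, release),
    ]
    lines = []
    for comment, uri, suite in entries:
        lines.append(f"# {comment}")
        lines.append(f"deb {uri} {suite} {comps}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sources_list(), end="")
```

Listing the local repository alongside the mirror means the normal apt tools see locally built packages just like any others, which is the point of the approach described above.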

Of course, part of the reason all this works so well for us is that  
three members of the team are full Debian Developers, so it was a way  
of working we were already used to.

The nice thing is that in fact, we don't just use this infrastructure  
for the cluster, but actually use it for every single Debian system in  
the Institute, be it a cluster node, a desktop, a standalone server or  
a high availability failover cluster.


