[Beowulf] Configuration management tools/strategy

Wed Jan 9 12:46:17 PST 2013

On 01/06/2013 05:38 AM, Walid wrote:
> Dear All,
> 
> At work we are starting to evaluate Configuration management to be used
> to manage several diverse hpc clusters

We are currently managing 15 clusters with puppet and are very pleased
with it.  Puppet is one of the critical pieces that allows us to manage
15 clusters with 2 people.  It works well enough that most of the staff
time goes to improving the environment.

A few particularly useful features (above and beyond the standard
configuration management):
1) virtual users and tagging are awesome for allowing arbitrary sets
   of users to access arbitrary subsets of your servers/clusters.
   "User <| tag == FooResearchGroup |>" allows selecting all the users
   with that tag.
2) It integrates well with cobbler, allowing puppet to take control
   *BEFORE* the first reboot.
3) It's relatively distro agnostic, making it easy for us to support
   Linux clusters based on an arbitrary Linux distribution.
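
To make point 1 concrete, here is a minimal sketch of virtual users plus
tagging.  The user names, UIDs, tags and node names are all made up for
illustration and are not our actual manifests; the tags are written in
lowercase and quoted, which is the safe form across puppet versions.

```puppet
# Hypothetical accounts, for illustration only.
# The leading "@" declares a virtual resource: nothing is created on a
# node until that node realizes the resource.
@user { 'alice':
  ensure     => present,
  uid        => 2001,
  managehome => true,
  tag        => 'fooresearchgroup',
}

@user { 'bob':
  ensure     => present,
  uid        => 2002,
  managehome => true,
  tag        => 'fooresearchgroup',
}

@user { 'carol':
  ensure     => present,
  uid        => 2003,
  managehome => true,
  tag        => 'barresearchgroup',
}

# Hypothetical node: only users tagged fooresearchgroup get accounts here.
node 'login01.cluster-a.example.org' {
  User <| tag == 'fooresearchgroup' |>
}

# A node shared by both groups realizes the union of the two tags.
node 'login01.cluster-b.example.org' {
  User <| tag == 'fooresearchgroup' or tag == 'barresearchgroup' |>
}
```

The user definitions live in one place, and each cluster just picks the
tags it wants, so adding a person to a research group is a one-line
change that reaches every node realizing that tag.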

So basically take your favorite Linux distribution, add cobbler (or
similar), puppet, your favorite batch queue and environment modules
and you have most of the pieces to build an HPC cluster.

I would however suggest keeping any critical files (especially
/etc/puppet) in version control.  Puppet can be a bit cantankerous at
times and it can be very valuable to be able to revert to the last known
working state.