[Beowulf] cluster deployment and config management

Stu Midgley sdm900 at gmail.com
Mon Sep 4 23:43:56 PDT 2017


Interesting.  Ansible has come up a few times.

Our largest cluster is 2000 KNL nodes and we are looking towards 10k... so
it needs to scale well :)

On Tue, Sep 5, 2017 at 1:46 PM, Lachlan Musicman <datakid at gmail.com> wrote:

> On 5 September 2017 at 15:24, Stu Midgley <sdm900 at gmail.com> wrote:
>
>> Morning everyone
>>
>> I am in the process of redeveloping our cluster deployment and config
>> management environment and wondered what others are doing?
>>
>> First, everything we currently have is basically home-grown.
>>
>> Our cluster deployment is a system that I've developed over the years and
>> is pretty simple, if you know Bash and how PXE booting works.  It handles
>> everything from setting the correct parameters in the BIOS to ZFS RAM disks
>> for the OS and Lustre for state files (usually in /var), all in the initrd.
>>
>> We use it to boot cluster nodes, Lustre servers, misc servers and
>> desktops.
>>
>> We basically treat everything like a cluster.
>>
>> However... we do have a proliferation of images... and all need to be
>> kept up-to-date and managed.  Most of the changes from one image to the
>> next are config files.
>>
>> We don't have good config management (which might, hopefully, reduce
>> the number of images we need).  We tried Puppet, but it seems everyone
>> hates it.  Is it too complicated?  Not the right tool?
>>
>> I was thinking of using git for config files, dumping a list of RPMs,
>> dumping the active services from systemd and somehow munging all that
>> together in the initrd, i.e. git checkout the server to get config files
>> and systemctl enable/start the appropriate services etc.
>>
>> It started to get complicated.
>>
>> Any feedback/experiences appreciated.  What works well?  What doesn't?
>>
>
>
> We are a small installation, with manageable needs. In our first step up
> from where you are, we ended up on:
>
> - Katello/Foreman (Red Hat's product version is called Satellite) for
> management of software repositories, in discrete sets and slices. We
> started with Spacewalk but it is a little old and fusty and just isn't
> appropriate anymore.
> - git for config management of environment module files
> - Ansible for easy day to day management of servers
>
> We no longer manage configs as such, since there is a shared data store,
> and the Ansible/Katello mix means we can rebuild any server from scratch.
>
> Note that Ansible and Katello/Foreman can be integrated - we haven't gone
> that far yet. We are quite happy with the two being apart. That will
> change in the near future I think.
>
> Cheers
> L.
>
>
>
>
> ------
> "The antidote to apocalypticism is *apocalyptic civics*. Apocalyptic
> civics is the insistence that we cannot ignore the truth, nor should we
> panic about it. It is a shared consciousness that our institutions have
> failed and our ecosystem is collapsing, yet we are still here — and we are
> creative agents who can shape our destinies. Apocalyptic civics is the
> conviction that the only way out is through, and the only way through is
> together. "
>
> *Greg Bloom* @greggish
> https://twitter.com/greggish/status/873177525903609857
>
>
>
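
For what it's worth, the "git for config files + list of RPMs + list of
systemd services" idea quoted above could be sketched roughly as below.
This is only an illustration under stated assumptions, not the existing
deployment system: the one-branch-per-role layout, the repo path, the role
name and the package/service names are all hypothetical, and yum is assumed
only because the thread mentions RPMs. The code just builds the list of
commands an initrd hook might run; it does not execute anything.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class NodeManifest:
    """Desired state for one node role (hypothetical structure)."""
    packages: set[str] = field(default_factory=set)
    services: set[str] = field(default_factory=set)


def parse_list(text: str) -> set[str]:
    """Parse a one-name-per-line dump, skipping blank lines and '#' comments."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.add(line)
    return names


def plan(manifest: NodeManifest, installed: set[str], enabled: set[str],
         repo_dir: str, role: str) -> list[list[str]]:
    """Return the commands needed to bring a node to the desired state.

    Assumes each role's config files live on a git branch named after the
    role (an assumption made for this sketch).
    """
    # Always check out the role's config files first.
    cmds = [["git", "-C", repo_dir, "checkout", role]]

    # Install only the packages that are missing from the image.
    missing = sorted(manifest.packages - installed)
    if missing:
        cmds.append(["yum", "install", "-y", *missing])

    # Enable and start only the services that are not already enabled.
    for svc in sorted(manifest.services - enabled):
        cmds.append(["systemctl", "enable", "--now", svc])
    return cmds


if __name__ == "__main__":
    # Hypothetical dumps for a compute-node role.
    manifest = NodeManifest(
        packages=parse_list("lustre-client\n# comment\nzfs\n"),
        services=parse_list("sshd\nslurmd\n"),
    )
    for cmd in plan(manifest, installed={"zfs"}, enabled={"sshd"},
                    repo_dir="/etc/cluster-config", role="knl-compute"):
        print(" ".join(cmd))
```

Keeping the planning step separate from execution means the same dumps can
be diffed or dry-run on a workstation before anything touches a node, which
may help when the per-image differences are mostly config files.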


-- 
Dr Stuart Midgley
sdm900 at gmail.com