[Beowulf] anyone using SALT on your clusters?

Joe Landman landman at scalableinformatics.com
Fri Jun 28 13:56:46 PDT 2013


On 06/28/2013 04:45 AM, Jonathan Barber wrote:
> On 27 June 2013 23:53, Joe Landman <landman at scalableinformatics.com> wrote:
>
>     On 06/25/2013 04:55 PM, Paul English wrote:
>
>     [...]
>
>      >
>      > For shuffling data around and the equivalent of what Salt's original
>      > purpose in life was, we use prsync and pssh. They work very well.
>     We can
>
>     To this day we still prefer pdsh from LLNL.  It's (IMO) the best of the
>     lot, and we leverage it extensively in tiburon.
>
>
>      > use them with canned lists of hosts (generated by cf-engine for this
>      > purpose eg: all hosts of type X), with ssh keys etc. I suspect
>     they are
>      > slower and perhaps less "something" (scalable perhaps? we are a
>      > relatively small site in HPC terms) than Salt's ZMQ based
>     approach. But
>      > they've been good enough for what we've done so far.
>
>
> The problem with SSH based approaches is when you have failed nodes -
> normally they cause the entire command to hang until the attempted
> connection times out.

This isn't an issue for things like pdsh.  Also, there's a nifty utility
called whatsup that handles all this for you.  It makes determining what
is up, well, fairly painless.
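
For anyone rolling their own, here is a minimal Python sketch of the general technique: fan the command out in parallel and give every host its own connect timeout, so a dead node costs only its own timeout rather than stalling the whole run. This is an illustration under stated assumptions, not pdsh's implementation. The names `fan_out` and `ssh_runner` are hypothetical, and the timeout and worker values are arbitrary.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_runner(host, command, timeout):
    """Run `command` on `host` over ssh; return (exit_code, output)."""
    argv = [
        "ssh",
        "-o", "BatchMode=yes",               # never prompt for a password
        "-o", f"ConnectTimeout={timeout}",   # give up on unreachable hosts
        host,
        command,
    ]
    try:
        # Hard ceiling on the whole remote command; the factor 4 is arbitrary.
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout * 4)
    except subprocess.TimeoutExpired:
        return None, "timed out"
    return proc.returncode, proc.stdout.strip()


def fan_out(hosts, command, runner=ssh_runner, timeout=5, workers=32):
    """Run `command` on every host in parallel and collect per-host results."""
    def one(host):
        try:
            return host, runner(host, command, timeout)
        except Exception as exc:
            # A failing host is recorded, not allowed to abort the others.
            return host, (None, f"error: {exc}")

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(one, hosts))
```

Calling `fan_out(["node01", "node02"], "uptime")` (hypothetical host names) returns a dict keyed by host. Hosts that were unreachable show up with a non-zero or `None` exit code instead of holding up the rest.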

>
> The message based systems typically don't have this problem because they
> use a pub-sub model where the clients subscribe to hear the commands
> from the server. If the client is down, the server doesn't wait on them
>

This isn't so much of an issue.
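
For readers who haven't met the pub-sub model described in the quoted text, a toy in-memory sketch in Python follows. It is not Salt's or ZeroMQ's API. The `Publisher` class and its methods are hypothetical names, and a real message bus would replace the queues. The point is only that publishing returns immediately and never waits on a client that isn't subscribed.

```python
import queue


class Publisher:
    """Toy in-memory publisher; stands in for a real message bus."""

    def __init__(self):
        self.subscribers = {}  # client name -> inbox queue

    def subscribe(self, name):
        """Register a client and hand back the inbox it should read from."""
        inbox = queue.Queue()
        self.subscribers[name] = inbox
        return inbox

    def unsubscribe(self, name):
        """Drop a client, e.g. because the node went down."""
        self.subscribers.pop(name, None)

    def publish(self, command):
        """Enqueue `command` for current subscribers and return at once."""
        for inbox in self.subscribers.values():
            inbox.put(command)
        return len(self.subscribers)
```

A node that is down simply is not in `subscribers`, so `publish` neither blocks on it nor reports an error for it.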

> [snip]
>
>
>     Honestly configuration management is largely a moot point for
>     image/remote boot, and an annoying necessity for local boot/management.
>
>
> I would like to point out that configuration management isn't something
> you only need to use if you have webscale sites. IMHO it's useful
> whenever you need to manage some configuration - including the
> configuration of binary images. How do you manage the creation of that
> image if not with some kind of configuration management tool (even if
> your "tool" is a set of shell scripts)?

I think you are conflating too many things here.

1) image management:  This covers the OS configuration you are talking
about above.  Configuration management is usually associated with this to
some degree; this is where yum and many other tools come into play at a
low level.

2) configuration management atop a base image.  Many distros try to mix
these together, and often do a terrible job of it (.rpmsave files, anyone?).

>
> The configuration problem is independent of the system deployment mechanism.
>
> [snip]
>
>      > I would suggest some caution when approaching Salt - which we did
>     when
>      > we were considering what to do after chef. While Salt seems to be an
>      > exceptional approach to "do a bunch of things on a bunch of
>     hosts," AND
>      > it is in python (win!), it does seem that the configuration
>     management
>
>     ... some of us don't quite see language of implementation as a win or
>     loss, with a notable exception (java)
>
>      > part is an add-on and/or afterthought. Yes - configuration management
>      > does involve lots of doing lots of things on lots of hosts. But
>      > cf-engine is now in its third iteration of "what does that _mean_ in
>      > real terms - with tons of different configuration 'languages', files,
>      > daemons, restart services etc.. even only on Linux?"
>
>     A big chunk of what you write about is best handled by a monitoring
>     system rather than a configuration management system.
>
>
> Yes, monitoring is not the same as configuration.

I think you may not have grasped what I wrote.

>
>
>     If you could completely eliminate the "install OS, run configure scripts
>     on it" section of startup, would you?  This isn't a sales pitch, it's a
>     genuine question.
>
>
> I don't understand your question, how can you eliminate configuration?
> At some point you have to tell the system what it's supposed to do.

It's done once.  Then you don't have to install it again.  It's installed.
It's done.


-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/siflash
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615


