[Beowulf] anyone using SALT on your clusters?

Joe Landman landman at scalableinformatics.com
Tue Jul 2 07:54:14 PDT 2013


On 07/02/2013 10:28 AM, Mark Hahn wrote:

[...]

> in other words, to probe for whether a node can execute an op, you really
> have to try to execute the op.  which means you WILL STILL have to deal with
> semi-byzantine failures through timeouts, etc.

Precisely.  It's almost an expression of something akin to an
uncertainty principle: you *only* know it's alive at a certain point in
time, which may not be the point in time at which the administrative
action actually runs.  This uncertainty could represent something quite
painful, or it could be innocuous ... say a network delay of some sort.

The point being that pub-sub isn't specifically any better (or worse) 
than push-based (ssh).  It alters the failure modes from timeout to 
non-pull of messages, and if you think about this, this is *still* a 
timeout.  The question is, what have you gained in the process?
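
A minimal sketch of that point in Python: the only real probe is an
attempt at the op itself, bounded by a timeout.  The helper names
(try_op, ssh_argv), the host name and the timeout values are
assumptions made up for illustration; BatchMode and ConnectTimeout are
standard OpenSSH client options.

```python
import subprocess
import sys

def try_op(argv, timeout_s):
    """Attempt the operation itself and classify what happened.

    Returns ("ok" | "failed" | "timeout", stdout).  There is no separate
    liveness check: a node counts as usable only if the op completed.
    """
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this is raised.
        return ("timeout", "")
    status = "ok" if proc.returncode == 0 else "failed"
    return (status, proc.stdout.strip())

def ssh_argv(host, command, connect_timeout_s=5):
    """Build a non-interactive ssh command line (hypothetical helper)."""
    return ["ssh",
            "-o", "BatchMode=yes",  # never stop to prompt for a password
            "-o", f"ConnectTimeout={connect_timeout_s}",
            host, command]

if __name__ == "__main__":
    # Local stand-in for a remote op, so the sketch runs without a cluster.
    print(try_op([sys.executable, "-c", "print('up')"], timeout_s=10))
    # Against a real node (hypothetical host name):
    #   try_op(ssh_argv("n001", "uptime"), timeout_s=15)
```

A pub-sub agent that never pulls its message ends up in the same
"timeout" branch; only the place where the clock runs has moved.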

This is all about ROI.  Changing things for the sake of changing things 
(say to use a particular language) is a waste of time in most cases. 
Changing things because the change provides you concrete benefits in 
your operations, though with associated costs, makes sense when the 
benefits outweigh the costs.  You can use short term or long term 
optimization, or combinations.

ssh is quite good at what it does.  Pdsh wraps around ssh (or libssh),
and builds upon this.  So do xssh and many other ssh variants (I wrote
something called all.pl that did this many moons ago, prior to seeing
pdsh in action).
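
For a rough idea of what such a wrapper does, here is a sketch in
Python.  It is not how pdsh, xssh or all.pl are implemented; the
function names, host names, worker count and timeout are assumptions
chosen for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_on_host(host, command, timeout_s=30):
    """Run one command on one host over ssh; returncode None means timeout."""
    argv = ["ssh", "-o", "BatchMode=yes", host, command]
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
        return host, proc.returncode, proc.stdout
    except subprocess.TimeoutExpired:
        return host, None, ""

def fan_out(hosts, command, runner=run_on_host, max_workers=32):
    """Run the same command on every host in parallel.

    Returns {host: (returncode, stdout)}.  The runner is a parameter so
    the ssh call can be swapped out, e.g. for a dry run.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(lambda h: runner(h, command), hosts)
        return {host: (rc, out) for host, rc, out in results}

if __name__ == "__main__":
    # Dry run with a fake runner, so no ssh connection is made.
    fake = lambda host, command: (host, 0, f"{command}@{host}")
    print(fan_out(["n001", "n002"], "uptime", runner=fake))
```

With the default runner, fan_out(["n001", "n002"], "uptime") would
open one ssh session per host and report a timeout as a None return
code, which is where the failure handling discussed above has to live.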

Changing away from ssh and its derivatives makes sense only if there is
enough value in doing so.  Looking at the issues with salt, I just
don't see it.

One argument that is easy to make for salt, and that I didn't see
anyone make, is that it lets you lower your risk by removing the ssh
daemon.  But this is a low risk: apart from Ubuntu a few years ago, no
one generally ships an "at risk" ssh daemon or key generation system.

>
> which is why I use ssh to mass-admin.  let's be honest, handling timeouts
> is not magic.  ssh is also nicely decoupled, and has excellent ways to
> robustly express asymmetric trust, etc.

+10

>
> also, to me, integration is the devil's playground.  it's easy to pitch
> that integration will make life easier, but except in fairly specific
> conditions, it also leads to tighter coupling, fragility, inflexibility.

Tight coupling is very good for some things, but in tools it can lead
to bugs that are very hard to work around.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics, Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/siflash
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615

