[Beowulf] anyone using SALT on your clusters?

Jonathan Barber jonathan.barber at gmail.com
Mon Jul 1 06:09:29 PDT 2013


On 29 June 2013 06:07, Christopher Samuel <samuel at unimelb.edu.au> wrote:

> On 28/06/13 18:45, Jonathan Barber wrote:
>
> > The problem with SSH based approaches is when you have failed nodes
> > - normally they cause the entire command to hang until the attempted
> >  connection times out.
>
> xdsh in xCAT can handle that for you, passing the -v option tells it to
> use the nodes status as monitored to avoid down nodes.
>

That's interesting, I hadn't noticed that option before.

Looking at what it does, the argument causes xCAT to run "nmap -PE" (i.e.
an ICMP echo request to the host) before connecting. That only shows the
host answers pings, so the command will still hang if sshd blocks for some
reason (such as with my past NFS woes).
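
One way around a node that answers pings but whose sshd blocks is to put a hard deadline on each command rather than rely on a reachability check. The following is only a rough sketch of that idea, not what xdsh does: the helper names (run_with_deadline, ssh_argv, run_on_nodes) and the timeout values are made up for illustration, and it assumes OpenSSH's BatchMode and ConnectTimeout client options plus key-based logins.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_with_deadline(argv, deadline_s):
    """Run argv and return (status, stdout).

    status is "ok", "failed" (non-zero exit) or "timeout".
    subprocess.run kills the child when the deadline expires, so a
    session that hangs after connecting cannot block us forever.
    """
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=deadline_s
        )
    except subprocess.TimeoutExpired:
        return ("timeout", "")
    return ("ok" if proc.returncode == 0 else "failed", proc.stdout)


def ssh_argv(node, remote_cmd, connect_timeout_s=5):
    """Build an ssh command line (hypothetical helper).

    BatchMode=yes stops ssh from prompting for a password, and
    ConnectTimeout bounds only the connection attempt itself.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={connect_timeout_s}",
        node,
        remote_cmd,
    ]


def run_on_nodes(nodes, remote_cmd, deadline_s=30, build_argv=ssh_argv):
    """Run remote_cmd on every node in parallel, each with its own deadline."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        futures = {
            node: pool.submit(
                run_with_deadline, build_argv(node, remote_cmd), deadline_s
            )
            for node in nodes
        }
        return {node: fut.result() for node, fut in futures.items()}
```

Calling run_on_nodes(["node01", "node02"], "uptime") (node names are placeholders) returns a dict mapping each node to its status and output, so a wedged node shows up as "timeout" while the others finish normally. ConnectTimeout alone would not help in the NFS case, since the connection succeeds and the hang comes later; the overall deadline is what covers that.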

Cheers


> I might suggest to them an environment variable to enable that by
> default, rather than having to remember to add it.
>
> --
>   Christopher Samuel        Senior Systems Administrator
>   VLSCI - Victorian Life Sciences Computation Initiative
>   Email: samuel at unimelb.edu.au Phone: +61 (0)3 903 55545
>   http://www.vlsci.org.au/      http://twitter.com/vlsci
>



-- 
Jonathan Barber <jonathan.barber at gmail.com>