[Beowulf] Re: Rackable / SGI
tjrc at sanger.ac.uk
Sun Apr 5 14:00:44 PDT 2009
On 5 Apr 2009, at 4:00 pm, Jason Riedy wrote:
>> A similar situation exists in the node management space, where
>> existing solutions like CFengine were pretty much ignored by HPC
> Ha! Cfengine was pretty much ignored by *everyone*, including
> its author for quite some time. Promising (pun intended) the
> next great advance and not passing current maintenance to others
> loses users quickly.
Count one cfengine user here.
> Also, cfengine is (or was, when last I used it) designed to be a
> pull-based system that polls a configuration server.
It still is. Our machines run it once a day.
> The design
> was more focused on asynchronous updates, and I think most HPC
> folks would prefer a push model that updates everyone "at once."
I suppose we here don't mind the asynchronous nature, since we're
mostly running embarrassingly parallel, single-threaded jobs.
> Cfengine had a push system, but to me it didn't feel like a good
> fit with the rest.
It doesn't really have a push system. All it has is the ability to
trigger all the nodes to pull at once. Not something I use anyway.
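To illustrate the distinction, here is a minimal Python sketch. It is not cfengine code or configuration; the names ConfigServer, Node, scheduled_run and trigger_all are all hypothetical and exist only for this example. In both cases the node fetches the policy itself; the "push" merely tells every node to do its pull now rather than at its next scheduled run.

```python
class ConfigServer:
    """Hypothetical stand-in for a policy server holding the current version."""

    def __init__(self):
        self.version = 1


class Node:
    """Hypothetical managed machine that fetches policy from the server."""

    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.applied = 0  # version last applied on this node

    def pull(self):
        # The node fetches the policy itself; the server never sends
        # anything unprompted.
        self.applied = self.server.version


def scheduled_run(nodes, due):
    """Asynchronous model: only nodes whose scheduled run is due pull now."""
    for node in nodes:
        if node.name in due:
            node.pull()


def trigger_all(nodes):
    """'Push' as described above: ask every node to pull right away."""
    for node in nodes:
        node.pull()


server = ConfigServer()
nodes = [Node(f"node{i}", server) for i in range(4)]

server.version = 2  # the policy changes on the server

# Only two nodes happen to be due, so the others stay on the old version.
scheduled_run(nodes, due={"node0", "node1"})
stale = [n.name for n in nodes if n.applied != server.version]

# Triggering every node brings them all up to date at once.
trigger_all(nodes)
up_to_date = all(n.applied == server.version for n in nodes)
```

After the scheduled run, stale lists the nodes that have not yet picked up the change, which is the asynchronous behaviour we tolerate here; after trigger_all, every node has pulled the new version.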
> I'm more shocked that no one has written up using cfengine for
> managing laptops.
We use it for maintaining almost all of our Linux systems (2089
systems, according to last night's report I got from it). Not
laptops, but certainly Linux desktops.
I don't think cfengine is perfect, by any means, but it does what we
need for now. cfengine3 is going to be such a big change from
cfengine2 that I'm probably going to revisit the whole thing and see
if I want to change to something else.