[Beowulf] Clusters and Distro Lifespans


Stu Midgley sdm900 at gmail.com
Wed Jul 19 07:44:17 PDT 2006


> > ii) Would it be better to develop our own installation process for
> > clusters so that upgrades, in terms of distros, can be rolled out
> > easily?  I feel like i'm tied in some way to the supplier of our
> > cluster for upgrades.
>
> Hmmm.... you would be re-inventing this wheel, which has been
> re-invented many times.

Yes and each person who re-invents the wheel then knows how a wheel
works and isn't stuck using the same wheel regardless of its
limitations.  I'm a strong advocate of building your own install
method.  1) It's fun 2) it's easy 3) you learn a lot 4) you can
integrate your method into your own management style.

For example, we are able to do a rolling upgrade of our cluster without
downtime and without users even noticing.  We have it tightly coupled
with our queueing software and when a node becomes free it gets
re-installed.
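
A minimal sketch of the rolling re-install idea described above, not the
actual scripts used on our cluster.  Everything named here (Node,
nodes_ready_for_reinstall, rolling_upgrade_step, the reinstall callback)
is hypothetical; a real version would ask the queueing software which
nodes are free and would mark them offline through that software's own
tools.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    image: str              # image currently installed on the node
    running_jobs: int       # jobs the queueing system has placed on it
    offline: bool = False   # True while withheld from the queue


def nodes_ready_for_reinstall(nodes, target_image):
    """Return nodes that are free and still on an out-of-date image."""
    return [
        n for n in nodes
        if n.running_jobs == 0 and n.image != target_image and not n.offline
    ]


def rolling_upgrade_step(nodes, target_image, reinstall):
    """One pass of the rolling upgrade: re-install every free, old node.

    `reinstall` is a caller-supplied callable standing in for the site's
    real install mechanism (an assumption for this sketch).
    """
    upgraded = []
    for node in nodes_ready_for_reinstall(nodes, target_image):
        node.offline = True           # keep new jobs off the node
        reinstall(node, target_image)
        node.image = target_image
        node.offline = False          # hand the node back to the queue
        upgraded.append(node.name)
    return upgraded
```

Running rolling_upgrade_step periodically (for example from cron, or from
a hook that fires when a job ends) upgrades nodes as they become free, so
busy nodes are never interrupted and the whole cluster converges on the
new image over time.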

We also have our install process configured to allow booting different
distros/images, which is useful for booting diagnostic CD images etc.
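
One possible way to select a boot image per node, assuming the nodes
network-boot with PXELINUX (the boot mechanism is not stated above, so
this is an assumption).  PXELINUX looks in pxelinux.cfg/ for a file named
after the node's MAC address before falling back to "default", so writing
that file chooses what the node boots next.  The helper names, labels and
kernel paths are hypothetical.

```python
def pxelinux_config_name(mac):
    """File name PXELINUX looks up for a node: ARP type 01 followed by
    the MAC address, lowercase and dash-separated."""
    return "01-" + mac.lower().replace(":", "-")


def pxelinux_config(label, kernel, append=""):
    """Build a one-entry PXELINUX config that boots `kernel` by default."""
    lines = [f"DEFAULT {label}", f"LABEL {label}", f"  KERNEL {kernel}"]
    if append:
        lines.append(f"  APPEND {append}")
    return "\n".join(lines) + "\n"


# Example: point one node at a (hypothetical) diagnostic image.
name = pxelinux_config_name("00:1A:2B:3C:4D:5E")
text = pxelinux_config("diag", "images/diag/vmlinuz",
                       "initrd=images/diag/initrd.img")
```

Writing text to pxelinux.cfg/<name> on the TFTP server and rebooting the
node would boot the diagnostic image; removing the file returns the node
to the default entry.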


> > iii) Do people regularly upgrade their clusters in relation to
> > distros?  I guess this is like asking how long is a piece of string
> > because everyone's needs are different.
>
> Cluster upgrades are rare unless you are missing functionality or
> something is broken.  That is of course one opinion, some here do
> upgrades nightly.  From a purely production oriented viewpoint, where
> downtime == lost money for our customers, we usually advise against that.

I think rare is a strong word.  Infrequent may be better.  We
regularly apply patches and upgrades to the front end nodes (globally
connected) and infrequently (~ every 6 months) upgrade all the cluster
nodes in the rolling fashion mentioned above.

You can even do a kernel upgrade on the file servers/front end nodes
(which requires a reboot) without killing or disrupting jobs.  Having
complete control has a lot of benefits.


-- 
Dr Stuart Midgley
sdm900 at gmail.com


