Scyld and Red Hat 7

Robert G. Brown rgb at phy.duke.edu
Sun Feb 4 07:23:04 PST 2001


On Wed, 31 Jan 2001, Mike Davis wrote:

> For a production server, I'm in complete agreement with Martin. The most
> important thing that a research computer can do is continually compute
> research. Flippant as that might sound, it is the truth. While I have
> upgraded my desktop and some webservers to RH7, I have no overwhelming
> desire to upgrade our cluster for the reasons mentioned.

It's anecdotal, to be sure, but after the RH 5.x->RH 6.x upgrade in our
department all my compiled research binaries ran some 20% faster.  We
made back the one day of downtime in one week of production, and of
course there were other tremendous benefits in even slightly broken 6.0
compared to 5.2.  There were library issues associated with upgrades as
well back then and all of these arguments were advanced and debated.
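
The break-even arithmetic behind that "one day back in one week" claim can
be sketched as follows.  This is only a rough illustration: it assumes
"20% faster" means 1.2x the old throughput, that the cluster runs
continuously after the upgrade, and the function name is made up for the
example.

```python
def break_even_days(downtime_days, speedup):
    """Days of post-upgrade production needed to recover lost work.

    downtime_days: days of production lost to the upgrade.
    speedup: fractional throughput gain (0.20 for "20% faster").
    Work is measured in old-system days: each upgraded day yields
    (1 + speedup) old-days of work, i.e. `speedup` old-days of surplus.
    """
    if speedup <= 0:
        raise ValueError("without a speedup the downtime is never recovered")
    return downtime_days / speedup


if __name__ == "__main__":
    # One day of downtime, binaries running 20% faster afterwards.
    print(break_even_days(1.0, 0.20))
```

Under those assumptions one day of downtime is recovered after about five
days of production, which is consistent with "one week".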

The tension between stability and improvement is as old as code itself.
Most people find a happy medium that is reasonably economic -- they get
things stable and productive and then leave them alone until their
friends start to make fun of them and then they upgrade, grumbling all
the while, get things stable, and then leave them alone (iterate
indefinitely).  As long as they have smart and helpful friends who live
close enough to the bleeding edge that it eventually is stabilized, this
is probably just fine.

It can easily be carried to a fault, though, as my anecdote makes clear.
We'd ALL pretty much make fun of somebody still installing and running
5.2 on brand new hardware (and only buying peripheral hardware from the
limited list of supported devices from that time), wouldn't we?  There
are real improvements associated with upgrades, and at some point it
becomes clearly worth it to pay the "cost" of the upgrade (time, hassle,
money, instability, recompiling, and so forth, which is actually pretty
damn minimal for RH based systems with kickstart) to gain the benefits.

Piecemeal upgrade isn't a good answer either, at least not in the long
run (although it is essential for prototyping an organizational
upgrade).  It becomes increasingly difficult to manage an "island" of
obsoleted systems in a sea of current ones for a variety of reasons:
the rpm incompatibility between 6.2 and 7.0; the hassle of tracking two
different update lists to ensure that your overall operation remains
secure (a step often skipped, but then lots of operations just aren't
particularly secure); the "missing application" problem when something
you get used to on the one distro isn't on the other; and, in the case
of desktops, the lack of backwards compatibility in many of the X/gnome
improvements, which really screws things up if one shifts between
distributions with a common NFS mounted home.  It starts costing one
MORE time and MORE productivity to keep things heterogeneous than it
would to upgrade.  Homogeneity equals administrative scalability, and
this contributes to overall productivity too.

But you know all this -- I'm just trying to provide some perspective for
less experienced readers, so they don't get the impression that we're
linux-luddites of some sort who plan to be running 6.2 two years from
now...:-) So my point wasn't that everybody should stop everything and
upgrade to 7 NOW or that Scyld should do so right away -- it was that at
some point (the point where in the mind of the individual the costs and
benefits start to balance) one SHOULD upgrade, and that Scyld is likely
to do so when in their judgement that point is reached.

For what it's worth we haven't upgraded to 7.0 yet either, but at some
point in the not too distant future we will (possibly to 7.1 instead of
7.0).  We'll do it "all at once", by prototyping and thoroughly testing
a few archetypical systems, certifying a particular collection of rpm's
and updates that "works", and then using kickstart to simply convert
over all the department systems in a day (or at most two).  Not much
lost productivity, although we also won't be stupid enough to do this
right before some important event (like finals) when the systems HAVE
to be up, in case there are problems.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu







More information about the Beowulf mailing list