Scyld and Red Hat 7

Robert G. Brown rgb at phy.duke.edu
Thu Feb 1 04:35:09 PST 2001


On Thu, 1 Feb 2001, stig wrote:

> As long as the system includes the main libs, a kernel and the popular
> package managers (well RPM)  does it really matter what distribution it is
> based on?
>
> Would there be this discussion if they 'based' it on their own compilation
> of binaries instead of Red Hat's?

The reasons to periodically upgrade an operating system distribution
(theirs or anybody else's), and not just the kernel, are many and valid.
By the numbers:

  a) Improved compilers and support libraries.  This is probably the
number one reason to upgrade a whole distribution rather than just the
kernel.  Sure, you can upgrade compilers alone, and kernels alone, and
libraries alone, but at some point (especially for major revisions of
e.g. libc) you find that you have to rebuild everything anyway.  The
whole point of distributions, kickstart, and Yellow Dog's "yup" tool is
to make it easy to get from tested configuration to tested
configuration.  I've done systems management piecemeal and it is no fun
at all.

This is currently a highly nontrivial reason in my mind.  I'm in the
middle of fixing an extremely serious bug in the cpu-rate tool I've been
using to measure floating point performance on nodes and have uncovered
a rat's nest of weirdness somewhere in the gcc/linux interaction on 6.2
systems.  As in I can run the same benchmark code with the same
parameters and get two completely different timings, depending literally
on whether I set a parameter by a fallthrough default or "override" the
parameter to the exact same value on the command line.  Or change the
order of initialization statements.  Different by a factor of two -- not
a small difference.  This SEEMS to be fixed in RH 7.0 although I'm still
testing.
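
To make the kind of check described above concrete, here is a minimal
sketch in Python.  It is not the cpu-rate tool and does not reproduce
the gcc problem; the function names, the loop body, and the default size
are all hypothetical.  It only illustrates the test itself: reach the
same parameter value by a default and by an explicit override, time
both several times, and compare the medians.

```python
import statistics
import time


def time_kernel(n, repeats=5):
    """Return `repeats` wall-clock timings (seconds) of a simple
    floating-point loop of length n (a stand-in for a real benchmark)."""
    timings = []
    for _rep in range(repeats):
        x = 1.0
        start = time.perf_counter()
        for _i in range(n):
            x = x * 1.0000001 + 0.0000001
        timings.append(time.perf_counter() - start)
    return timings


def compare(default_n=200_000, override_n=None, repeats=5):
    """Time the kernel once via the fallthrough default and once via an
    explicit override, returning both medians and their ratio."""
    # Hypothetical "override" path: same value unless the caller sets one.
    n_override = default_n if override_n is None else override_n
    med_default = statistics.median(time_kernel(default_n, repeats))
    med_override = statistics.median(time_kernel(n_override, repeats))
    return med_default, med_override, med_override / med_default


if __name__ == "__main__":
    med_default, med_override, ratio = compare()
    print(f"default:  {med_default:.6f} s")
    print(f"override: {med_override:.6f} s")
    print(f"ratio:    {ratio:.2f}")
```

With identical parameter values the ratio should sit close to 1 apart
from ordinary timing noise; a ratio that stays near 2 across repeated
runs would be the sort of discrepancy described above.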

  b) Improved kernel.  For example, NFS is basically and maddeningly
broken in pre-2.2.18 kernels (but MAY be fixed in 2.2.18) -- I've actually
survived a server crash without having to reboot all my NFS clients
since upgrading my (non-scyld) cluster.  Yes, one can rebuild the kernel
by hand, but some of the scyld advantages (and other useful beowulf
stuff) interface directly with the kernel.  These days one sometimes has
to upgrade the base compiler to upgrade the kernel.  This is less
important to a scyld beowulf than to a more general purpose cluster
node, but scyld cannot remain stagnant at a given kernel revision
forever.

  c) Improved everything else.  This isn't too important to scyld but
again, even e.g. MPI marches along.  Bugs are fixed, optimizations are
tuned.  Scyld may not have to remain sync'd to RH's development cycle,
but it has to re-release its OWN distribution package periodically to
keep everything up to date and/or users will have to periodically
upgrade node or server packages piecemeal.

RH 7 has definitely got some problems, but 7.1beta comes out what,
today? and reportedly fixes a lot of those problems (as do the many
updates already released).  Since RH 7 has an incompatible RPM relative
to 6.2, the 6.2->7 upgrade requires a pretty serious commitment and lots
of folks are holding off until its problems diminish.

I therefore don't think that the issue is whether scyld should rebuild
on the 7.x distribution -- it is rather a question of when.  This is
thus a reasonable question to ask, although there is (as noted) less
pressure for them to do it immediately.  There is also the question of
how difficult it is to do the rebuild -- if the distribution is RPM
packaged, rebuilding really shouldn't take long at all; it is the
testing and stabilizing that takes the time.

   rgb

-- 
Robert G. Brown	                       http://www.phy.duke.edu/~rgb/
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at phy.duke.edu






