[Beowulf] hpl size problems

Robert G. Brown rgb at phy.duke.edu
Fri Sep 30 08:22:44 PDT 2005


Chris Samuel writes:

> On Thu, 29 Sep 2005 09:40 pm, Joe Landman wrote:
> 
>> The view that I tend to espouse is
>>
>>         [core stuff          ] --> installed  (should be really small)
>>         [application stuff] --> mounted (make it any size you require)
>>
>> I came to this conclusion rather quickly after being asked to supply
>> bioperl and a few others that are not easily craftable into RPMs across
>> a cluster.
> 
> I'll second that emotion, we've done a central Perl build in /usr/local for 
> Bioperl as well and it works pretty well.

<rant>

And if either of you, or the bioperl maintainers, had done a central
perl build into well-structured rpms, the other one wouldn't have had to
do their own build into /usr/local.  Right?

The whole point of /usr/local builds is that they are EVIL!  EVIL!
Especially if you're building for anything like a distributed
environment or want to maintain things in the future.  Everybody who
ever uses the tool has to build and install.  Everybody does it
differently.  This is not the Sufi Way.

Yes, learning to make RPMs out of a source is a PITA and will be
Annoying.  OTOH, the steps you go through are the ones that make the
result approximately portable, and portability is a desired trait of
anything you don't want to maintain by hand forever and be unable to
share.

The use of RPMs in the LSB isn't there "just" to give RH an advantage.
If it weren't RPMs it would HAVE to be some sort of packaging with
metadata, dependencies, encapsulated build instructions -- pretty much
everything that is in the RPM, although one might well want more or
might want something a little different (RPMs aren't perfect, as has
been pointed out in
this thread several times:-).  Ideally one would go right back to the
original sources and get proper rpm build support added right to the
primary source packaging -- many applications these days do, after all,
come with a spec file and many people out there donate the time required
to do an rpm build of each revision out there on at least one or two
architectures (and produce a src rpm in the process).

IF the LSB were more problem free and ever got traction across the
rpm-based distros, a well-structured source rpm would just rpmbuild
--rebuild across distributions.  This would be such a lovely thing.

Note, finally, that at least ONE way of building a binary-only rpm (one
that is commonly used to repackage commercial software so it can be
sanely distributed in a LAN or cluster environment) is to just skip the
build part.  Build and install the program however and wherever you
like, but pay attention to a) the dependencies; and b) the installed
files (installed according to the FHS, of course!).  Then build a
specfile that just plain takes the result and packages up the files --
so that rpmbuild doesn't do a make.  You can rpm package ANY set of
files, right, by brute force path if nothing else?  
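
As a rough sketch of what such a "no build" specfile looks like, here
is a small Python helper that renders one from a hand-made dependency
list and file list.  The package name, the staging directory, the file
paths and the helper itself are all made up for illustration, and the
License value is a placeholder; check the tags against what your rpm
version expects before relying on it.

```python
def binary_only_spec(name, version, requires, staged_root, files):
    """Render a spec file that packages already-built files.

    The spec deliberately has no %prep or %build section, so rpmbuild
    never runs make; it only copies the staged tree and lists the files.
    """
    lines = [
        f"Name: {name}",
        f"Version: {version}",
        "Release: 1",
        f"Summary: Prebuilt {name}, repackaged for local distribution",
        "License: Commercial",  # placeholder, use the real license
    ]
    # a) the dependencies you worked out by hand become explicit metadata
    lines += [f"Requires: {dep}" for dep in requires]
    lines += [
        "",
        "%description",
        f"Binary-only repackaging of {name}.",
        "",
        "%install",
        "mkdir -p %{buildroot}",
        # copy the already-installed tree from the staging area as-is
        f"cp -a {staged_root}/. %{{buildroot}}/",
        "",
        "%files",
    ]
    # b) the installed files, listed by brute-force path
    lines += sorted(files)
    return "\n".join(lines) + "\n"


# Hypothetical package: every name and path below is an assumption.
spec = binary_only_spec(
    name="exampletool",
    version="1.0",
    requires=["perl"],
    staged_root="/tmp/exampletool-stage",
    files=["/usr/bin/exampletool", "/usr/share/exampletool/data.db"],
)
print(spec)
```

Write the output to exampletool.spec and hand it to rpmbuild -bb; the
point is only that the dependencies and the file list end up as
metadata inside the package instead of in somebody's head.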

Why, you ask, should you build an RPM like this?  It's a lot of work and
takes time!  Wouldn't it just be simpler to build it into a tarball and
de-tar it onto all the hosts?

SURE it would, except for:

  a) The tarball won't know about the dependencies; the RPM does because
you explicitly told it what it needed to run.  

  b) The tarball contains no "other" simply extractable metadata.  The
rpm might.  

  c) The tarball DEFINITELY isn't portable, but is unaware of this.  It
might be an x86_64 binary, who knows?  No metadata, remember?

  d) The tarball won't run a %post for you to configure it per host.  So
you get to do that by hand, if any per host configuration is needed.

  e) YOU get to invent the equivalent of yum (and packages, and groups,
and automated updates as revision numbers are bumped) for getting the
right tarball to the right kind of system and updated in a timely way.

  f) If you had to hack the sources around to get them to work on e.g.
an x86_64, you have no simple way to apply the patches to the next
version of the program to come down the pipe, so you'll at BEST have to
patch by hand, at worst re-edit by hand.

  g) You can't share it with anyone; the BEST you can do is tell them to
go get the sources and build it from there.  Or perhaps share with them
your hacked sources.  And because tarballs are blissfully revision
unaware, with something like perl especially, if you get your revisions
out of kilter your install paths go to hell as well, so chances are your
hack will constantly need rehacking.

  h) The list goes on... how do you remove a tarball-installed set of
files safely (or at all!)?  update a tarball-installed set of files
safely or at all (requires removal, right, but maybe not removal of
EVERYTHING or you have to reconfigure)?  What if what you install is a
dependency of something else?  Your maintenance problem can grow
nonlinearly in time and new problems can emerge that cost you ten times
what it might have taken you to rpm package it JUST to get the
dependency resolution thing working both up and down stream.

  i) Finally, although the FHS permits the usage of /usr/local as you
describe, it is really very, very rare these days BECAUSE with rpms and
yum, management of /usr/local a la the 80's DOESN'T SCALE.  Really.
Even LOCALLY it almost never scales, globally it really really really
doesn't scale.  The FHS should have just plain old dumped /usr/local, in
my opinion.  That they didn't is PROBABLY to give developers a shadow of
/usr for development on one-off development hosts so that
--prefix=/usr/local -> --prefix=/usr does exactly the right thing
(whether autoconf'd or done by hand in a Makefile).
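
To make point h) concrete, here is a minimal Python sketch of the
bookkeeping a tarball install forces you to invent yourself: record a
manifest at install time so the same files can be removed later.  The
file names and helper functions are hypothetical, and this covers only
removal; dependencies, per-host configuration and updates are still
left entirely to you, where the rpm database keeps track of them.

```python
import io
import os
import tarfile
import tempfile


def make_tarball(files):
    """Build an in-memory tarball from {relative_path: bytes}.

    Stands in for the tarball produced by a hand build.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def install(tarball, prefix):
    """Extract regular files under prefix; return the list of paths written."""
    manifest = []
    with tarfile.open(fileobj=tarball, mode="r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            dest = os.path.join(prefix, member.name)
            os.makedirs(os.path.dirname(dest), exist_ok=True)
            with tar.extractfile(member) as src, open(dest, "wb") as out:
                out.write(src.read())
            manifest.append(dest)
    return manifest


def uninstall(manifest):
    """Remove exactly the files recorded at install time, nothing else."""
    for path in manifest:
        if os.path.exists(path):
            os.remove(path)


# Hypothetical package contents, installed into a throwaway prefix.
prefix = tempfile.mkdtemp()
tarball = make_tarball({
    "bin/exampletool": b"#!/bin/sh\n",
    "lib/exampletool.pm": b"1;\n",
})
manifest = install(tarball, prefix)
installed = [os.path.exists(p) for p in manifest]
uninstall(manifest)
remaining = [os.path.exists(p) for p in manifest]
```

Lose the manifest and you are back to guessing which files under the
prefix belonged to which install.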

SO yes, yes, yes.  It requires religious discipline and a certain amount
of faith to convert EVERYTHING into an rpm, but we do it here.  If a
package is going to go onto more than a single system, it goes into an
rpm, per architecture or noarch as appropriate, and is put into a repo
so that all future installation, maintenance, upgrade, removal, and
dependency resolution is done by invisible monkeys, often while
sysadmins sleep.  It costs you on the leading edge, but you are paid
back many times over on the far edge and usually your work is PORTABLE
and USEFUL TO OTHERS which is the engine that makes Linux itself work in
the first place.  

That is, in a FAIR assessment of your benefits, you'd include the
benefits you ALREADY receive when you use an rpm-based distribution
written and maintained by tens of thousands of hands.  What if everybody
still used /usr/local?  We'd all still be using Suns, that's what.  Or
have given up and be running Windows.

The tarball->/usr/local paradigm (as truly local builds or an NFS
mounted /usr/local) as a means of LAN management needs to die die die.
IMHO, of course.  If you invest the effort of making the RPM-thing work
out, it will come back to you.  If you get together a GROUP of people to
make the RPM thing work out for a larger package, it will come back to
everybody several times over, as well as earn a place in most distros.

</rant>

    rgb

