Installing Linux (without CD/floppies)

Robert G. Brown rgb at
Tue Feb 18 11:57:21 PST 2003

On 18 Feb 2003, Ashley Pittman wrote:

> It's not an assumption I've ever tested to be honest,  my time would
> probably be better spent refining the list of packages that get
> installed, and then wasted again in a few weeks time when I realise that
> in actual fact I do need a compiler on every node.  So, I'm just going
> to leave it as it is and if I ever am in a situation where I'm waiting for
> a node re-install then I will use the time to drink more coffee.

Definitely the best plan.  Although (as I'm pointing out in a footnote
in my latest revision of the online book:-) Beowulfery!! by scary
coincidence is an anagram of yow! Beerful! so you might consider beer
instead of coffee.  I mean, how could this be mere chance?

> I would use "apt-get" rather than "yum" by choice but I don't want to go
> anywhere near that argument :)

Especially not with me, since my good friend and colleague Seth Vidal is
the godfather of yum and just keeps making it smarter, faster, better.
I used to have a modest case of apt-envy because of the incredible
clunkiness of managing dependencies and the like with rpm's.  No more.

Anybody can trivially set up a package site to support rpm interactive
or kickstart installs and yum.  The last guy to try it needed a tiny bit
of help from the yum list (mostly because he was setting up RH 8 but
using the wrong yum source) and had everything working within a few
hours of starting.
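For concreteness, here is roughly what the setup looks like: export the
rpm directory over http, run yum's header generator over it (yum-arch in
the yum of that era; later yums use createrepo), and point each client at
it with a server stanza in /etc/yum.conf.  The repository id, hostname
and path below are all made up -- substitute your own:

```
# /etc/yum.conf on each client (hypothetical names throughout)
[main]
cachedir=/var/cache/yum

[dulug]
name=Dulug 7.3
baseurl=http://yum.example.edu/redhat/7.3/
```

On the server side, the only extra step is rerunning the header generator
whenever rpm's are added to the exported tree.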

At this point it is religious.  Observe.  I hear about "festival" which
might or might not have something to do with speech synthesis.  If it
is, I want it on my laptop so I can get it to read me stories out loud.

I start by checking to see if it is available.  This doesn't have to be
done as root, but why not?

rgb at lilith|T:2#yum list fest\*
Gathering package information from servers
Getting headers from: Dulug 7.3
Finding updated packages
Downloading needed headers
Looking in Available Packages:
Name                                     Arch       Version
festival                                 i386       1.4.2-3
festival-devel                           i386       1.4.2-3
Looking in Installed Packages:
Name                                     Arch       Version

so yes, it is available; no, it isn't already installed.  Is it what I
think it is?

rgb at lilith|T:3#yum info festival
Gathering package information from servers
Getting headers from: Dulug 7.3
Finding updated packages
Downloading needed headers
Looking in Available Packages:
Name   : festival
Arch   : i386
Version: 1.4.2
Release: 3
Size   : 69.82 MB
Group  : Applications/Multimedia
Summary: A free speech synthesizer
 Festival is a general multi-lingual speech synthesis system developed
at CSTR. It offers a full text to speech system with various APIs, as
well as an environment for development and research of speech synthesis
techniques. It is written in C++ with a Scheme-based command interpreter
for general control.

Yup, looks like it is indeed text to speech translation.  I'll be able
to make my laptop talk (within reason) if I install it.  So I decide to
go with it.

rgb at lilith|T:4#yum install festival festival-devel
Gathering package information from servers
Getting headers from: Dulug 7.3
Finding updated packages
Downloading needed headers
Resolving dependencies
Dependencies resolved
I will do the following:
[install: festival.i386]
[install: festival-devel.i386]
Is this ok [y/N]: y
Getting festival-1.4.2-3.i386.rpm
Calculating available disk space - this could take a bit
festival 100 % done 
festival-devel 100 % done 
Installed:  festival.i386 festival-devel.i386
Transaction(s) Complete
21.902user 10.330sys 4.0%, 0ib 0ob 0tx 0da 0to 0swp 13:10.38

and I'm done, where the relatively lengthy time is because this
transaction was bottlenecked by both DSL and wireless onto my laptop,
and festival is quite large (28 MB;-).

At this point maintenance is automatic.  If somebody installs an updated
festival-1.5.2-4.i386.rpm in the 7.3 install tree, my system will notice
this and automatically update it in a nightly cron script.  If I decide
that my beowulf needs talking nodes for some reason, I can add it to
their kickstart file and do a yum install on all the nodes.  I can
change my mind and do a "yum remove festival". I basically don't need to
use rpm at all except to build or rebuild rpm's or install rpm's that
aren't in the repository.  I can even LIST all the rpm's installed that
AREN'T in the repository to make sure to put them there before

Oh, yes, I love yum...:-)

> > Yeah, that's the rub;-) Once things are bleeding edge efficient and
> > scalable, it stops being worth it to screw around saving even
> > significant fractions of the little time things take, especially
> > unattended time.  Five minutes or ten minutes of network install time
> > per node is pretty irrelevant, as long as I don't have to wait around
> > for either one to complete and as long as both complete without incident
> > on request.
> It's only irrelevant sometimes, you say yourself that the COD project
> is aiming for "minutes" to do an install.  It all depends on how often

It's aiming for minutes only because that's realistic for moving OS
images around networks and then doing a boot.  We've talked some about
ways to speed it up that might work in some cases (e.g. pre-installing
or caching some of the images you expect to switch to) but they come
with problems (verifying that the images you left there are untouched
and match the ones you want to run, for example) and there will always
be worst-case situations where you have to transfer a GB image across
the network, decompress it, and boot into it.

> you see yourself re-installing, for most cluster people it can probably
> be classed as seldom.

Agreed, although in a CPS department several groups could be sharing a
cluster in such a way that required a COD-mediated reboot every few
hours, perhaps into sandbox images associated with particular classes or
research projects.  That's why we'd like it held to at >>most<< a few
minutes to reboot an entire cluster.

> Of course in the nfs-root world there is no such thing as "installing",
> you just change the export/mount options (assuming you share the root fs
> across machines).

And this is fine, presuming that you're booting an environment that
supports diskless nfs-root operation.  Or (within COD) one might boot a
small root that is transferred/installed to the node and mount
everything else.  There is a lot of room for clever ideas.
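On the server side, the export/mount knobs Ashley mentions live in
/etc/exports.  A sketch of what that might look like for a couple of
nfs-root nodes -- the paths and the network address are invented:

```
# /etc/exports on the server (hypothetical paths and network)
/export/root/node01   192.168.1.0/255.255.255.0(rw,no_root_squash)
/export/usr           192.168.1.0/255.255.255.0(ro)
```

The no_root_squash option matters for a root filesystem export, since
the node's own root user has to be able to write to it.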

> There is also a distinction between diskless and nfs-root, I chose to
> use nfs-root on my home machines because it allows greater flexibility
> in what software is running but I still have swap, /tmp and /local
> running off the local hard disk.  In the days before grub it was really
> handy to get the kernel over the network too.

Again, this is a tough call.  For heterogeneous systems, one has to be
able to support different system- and identity-dependent configurations
in e.g. /etc/X11, /etc/conf.modules, /var and elsewhere.

In the old days (when disk was dear and Sun ruled the earth:-) one
typically installed a small-disk configuration with a local (but small)
root partition that contained the kernel, /etc, /var, /tmp all local,
and then mounted /usr (ro), /home/whatever (rw), and often /usr/local
(ro, oddly enough) from a server.  Each system could then write its
own logs and spools in its own local /var, knew its identity and local
configuration in its own local /etc, but shared pretty much the entire
library and execution space.  /bin and /sbin tended to be local as well,
so the system would function single user without /usr mounted.
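That layout translates into an fstab along these lines (the server name,
export paths and device names are invented for illustration):

```
# /etc/fstab on a small-disk client: local root, swap, /var and /tmp;
# shared trees mounted from a server (all names hypothetical)
/dev/hda1            /       ext3  defaults       1 1
/dev/hda2            swap    swap  defaults       0 0
server:/export/usr   /usr    nfs   ro,hard,intr   0 0
server:/export/home  /home   nfs   rw,hard,intr   0 0
```

The read-only /usr mount is what lets a single server image feed many
clients without any of them being able to corrupt it.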

A fully diskless configuration, e.g. a Sun ELC or SLC, would typically
start with its own system-specific root partition, one per host, with
its own /var and /etc and so forth, then proceed as before mounting
/usr, /home/whatever, /usr/local, on top of it.  The downside was
building all of those / partitions for hosts and keeping track of them
and ensuring that the right one was mounted on the right host.  Not
terribly difficult for ten diskless hosts on a LAN, not terribly easy or
scalable for a hundred, even allowing for a lot more homogeneity than
one typically sees with vanilla OTC PC's as nodes OR desktops.
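The bookkeeping burden is easy to see if you sketch how those per-host
root trees get built: clone a template once per host and stamp each copy
with its own identity.  All the paths and hostnames below are made up,
and a real root tree would of course hold far more than a hostname file:

```shell
# Clone a (toy) template root tree once per diskless host and give
# each copy its own identity.
mkdir -p root/template/etc
echo template > root/template/etc/hostname
for host in node01 node02; do
    cp -a root/template "root/$host"
    echo "$host" > "root/$host/etc/hostname"
done
```

Now multiply the per-copy customization (/etc/X11, module configs, and
so on) by a hundred heterogeneous hosts and the scaling problem is
apparent.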

> > I think that this will be very, very cool and might even change the very
> > way we think about compute resources from the desktop on down.
> I agree, this is very cool and definitely the way forward, not just in
> clustering either. LinuxBIOS will help a great deal too.

Absolutely.  BIOS management is currently maddening and is a major
reason that floppyless, headless, diskless nodes aren't much of an
option.  Most of the times we've tried this route we've gotten stuck
with the "and now you need to reflash, reset, reconfigure the BIOS"
problem and found ourselves moving a floppy and video card into one node
at a time.  Even nodes with serial consoles (e.g. Tyan 2466's) aren't
any help if the BIOS resets into a no-serial-console default when they
are reflashed.  Then there is the PXE setup issue, where a PXE chip is
preset to hang forever (until a key is pressed) on a PXE boot instead of
timing out into the BIOS boot order chain.  One has to boot into a
control program, typically from a DOS diskette, and have a floppy,
keyboard and monitor attached to configure a node not to need a floppy,
keyboard and monitor.  Sigh.

All that time, of course, REALLY adds to the human energy required to
install and manage a node.  To the point where we don't do floppyless
nodes, videoless nodes, and so forth any more.  The extra $40 or so for
the hardware is cheaper than the extra hour of time per node required to
shuffle a floppy drive and video card through them every time a BIOS
flash etc. is needed.


> Ashley,

Robert G. Brown	             
Duke University Dept. of Physics, Box 90305
Durham, N.C. 27708-0305
Phone: 1-919-660-2567  Fax: 919-660-2525     email:rgb at

More information about the Beowulf mailing list