Problems installing Scyld
broberts at mbhs.edu
Wed Jul 18 08:25:28 PDT 2001
I have been working on installing Scyld on a small subset of nodes on a
currently running cluster (just for testing purposes), but I have run
into several problems getting the slave nodes to boot properly.
I tried making a boot floppy for the nodes, but that did not work (the
BIOS apparently does not find a bootable floppy), so I have been booting
them from a preview CD-ROM from late last year. The master node,
however, was upgraded with the RPMs from Scyld's FTP site after I
installed it, so it is newer than the CD; even so, the boot from the CD
seems to go okay.
When I boot the nodes they seem to boot fine until they get to the point
where they start bpslave. That line looks okay and the nodes come up
with status 'error' (since they are not partitioned), but 10-15 seconds
later the kernel panics with an unknown paging request. Any commands
that try to run on the node before the panic (ifconfig, cat, etc.) fail
with the error that they are unable to open the C library. I tried
running beofdisk in the time I had before the panic, but the missing C
library made that fail too.
I set up the master node following the installation guide as closely as
possible (though without beosetup, since I installed text-only) and am
reasonably certain that the boot image the nodes use is okay. I
generated it with beoboot -2 -n -k /boot/vmlinuz-2.2.17-33.beosmp (after
clearing out lots of modules in /lib/modules, since they were eating
tens of megabytes) and the process seemed to go fine.
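
Since modules were pruned before the image was built, a quick sanity
check before rerunning beoboot may be worthwhile. The following is only
a minimal sketch, not part of Scyld: the function names kver_from_image
and preflight are hypothetical, and it assumes the module directory is
named after the part of the kernel file name that follows "vmlinuz-".

```shell
#!/bin/sh
# Hypothetical helpers; neither function is a Scyld tool.

# Derive the kernel version string from an image path.
# Assumption: the image is named vmlinuz-<version>.
kver_from_image() {
    basename "$1" | sed 's/^vmlinuz-//'
}

# Check that the kernel image and its module directory both exist,
# then report the size of the module directory in kilobytes.
# $1 = kernel image path, $2 = module root (defaults to /lib/modules)
preflight() {
    image=$1
    modroot=${2:-/lib/modules}
    kver=$(kver_from_image "$image")

    if [ ! -f "$image" ]; then
        echo "missing kernel image: $image" >&2
        return 1
    fi
    if [ ! -d "$modroot/$kver" ]; then
        echo "missing module directory: $modroot/$kver" >&2
        return 1
    fi
    du -sk "$modroot/$kver"
}
```

One possible use is to run preflight /boot/vmlinuz-2.2.17-33.beosmp and
only then rerun the beoboot command quoted above. This only confirms
that the files are present; it says nothing about whether the pruned
module set is sufficient for the nodes.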
I have duplicated this failure on four nodes of the same configuration
(dual Pentium II 450s, 256M RAM, DEC Tulip-compatible NIC) and on one
other node with somewhat newer hardware (dual Pentium III 450s, 256M
RAM, 3c59x NIC), so I doubt the problem is specific to my hardware.
I also noticed that even the newest release of Scyld on the FTP site
still uses the base packages from Red Hat 6.2 rather than integrating
the updated packages from Red Hat's Errata page (including the numerous
security fixes). I upgraded most of my packages using Red Hat's updates,
except for packages like glibc which have been modified for the
distribution. I would suggest integrating some, if not all, of these
fixes into the next distribution, given the seriousness of some of the
bugs and the need for security on a machine as tempting as a cluster.
"Gather your wits and hold on fast/ Your mind must learn to roam"
-- Gypsy Queen, "Tommy", The Who