Scyld Beowulf doesn't like Gigabyte GA-6vxdr7 motherboard

Carpenter, Dean Dean.Carpenter at pharma.com
Tue May 8 11:36:01 PDT 2001


Huh - interesting.  I just rebuilt a netboot image using the UP 2.2.17 from
Scyld ...

	beoboot -2 -n -k /boot/vmlinuz-2.2.17-33.beo -m /lib/modules/2.2.17-33.beo/

Rebooted a compute node.  It comes up in UP as expected, but no NFS.
Checking the /var/log/beowulf/node.0 file, it was trying to load modules
(sunrpc specifically) from /lib/modules/2.2.19/misc.

Now the master node is running 2.2.19.  But why would the compute node try
to load 2.2.19 modules?  I thought the beoboot script builds a boot.img file
that contains the kernel and modules ...
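
A minimal sketch of a sanity check for this kind of mismatch, assuming the
kernel image is named vmlinuz-<version> and the modules directory is named
after the same version, as in the paths above.  The function names are
hypothetical, and this only compares names; it says nothing about where the
node actually looks for modules at boot.

```shell
# Hypothetical helpers: compare the version suffix of a kernel image name
# with the name of a modules directory. Names only, no file contents.

kver_from_image() {
    # /boot/vmlinuz-2.2.17-33.beo -> 2.2.17-33.beo
    basename "$1" | sed 's/^vmlinuz-//'
}

kver_from_moddir() {
    # /lib/modules/2.2.17-33.beo/ -> 2.2.17-33.beo (basename drops the trailing slash)
    basename "$1"
}

versions_match() {
    # Succeeds only when both names carry the same version string.
    [ "$(kver_from_image "$1")" = "$(kver_from_moddir "$2")" ]
}
```

For example, versions_match /boot/vmlinuz-2.2.17-33.beo /lib/modules/2.2.19
fails, which is the combination the node log suggests was in effect.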

Have to scan through beoboot ...

--
Dean Carpenter
Principal Architect
Purdue Pharma
dean.carpenter at pharma.com
deano at areyes.com
94TT :)


-----Original Message-----
From: Carpenter, Dean [mailto:Dean.Carpenter at pharma.com]
Sent: Tuesday, May 08, 2001 2:12 PM
To: beowulf at beowulf.org
Cc: 'David Vos'
Subject: RE: Scyld Beowulf doesn't like Gigabyte GA-6vxdr7 motherboard


OK.  Progress, but not in the right direction :)  Here's what I did, and
I'll be detailed so hopefully someone will notice what I
missed/typoed/screwedup ...

Got 2.2.19 from kernel.org, grabbed the bproc-2.2.tar.bz2 from Scyld.
Patched the kernel source - took a little tweaking, some things had changed.
But it appears to have gone in OK.

	make menuconfig

Turned on all sorts of things, most of them unnecessary, but there to more
or less match what the 2.2.17 menuconfig had.

	make dep
	make -j 4 bzImage
	make -j 4 modules
	make modules_install
	mv arch/i386/boot/bzImage /boot/vmlinuz-2.2.19

Copied /boot/initrd-2.2.17-33.beosmp.img to /tmp/initrd-2.2.19.img.gz,
gunzipped it, and mounted it on /mnt.  Replaced the aic7xxx.o with the 2.2.19
version.  That was the only module being loaded for the master node.

	mount -o loop /tmp/initrd-2.2.19.img /mnt
	cp /lib/modules/2.2.19/scsi/aic7xxx.o /mnt/lib
	umount /mnt
	gzip -9 /tmp/initrd-2.2.19.img
	mv /tmp/initrd-2.2.19.img.gz /boot/initrd-2.2.19.img
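
The whole initrd update, including the copy and gunzip steps described in
the text, could be sketched as one function.  This is only a sketch: the
function names and the DRY_RUN switch are hypothetical, the paths are the
ones used above, and running it for real needs root and loopback support.

```shell
# DRY_RUN=1 (the default here) only prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# update_initrd OLD NEW
#   OLD: version suffix of the existing initrd, e.g. 2.2.17-33.beosmp
#   NEW: version of the new kernel, e.g. 2.2.19
update_initrd() {
    old=$1
    new=$2
    work=/tmp/initrd-$new.img

    run cp "/boot/initrd-$old.img" "$work.gz"          # work on a copy
    run gunzip "$work.gz"                               # leaves $work
    run mount -o loop "$work" /mnt
    run cp "/lib/modules/$new/scsi/aic7xxx.o" /mnt/lib  # swap in the new driver
    run umount /mnt
    run gzip -9 "$work"
    run mv "$work.gz" "/boot/initrd-$new.img"
}
```

Calling update_initrd 2.2.17-33.beosmp 2.2.19 with DRY_RUN=1 prints the
seven commands so they can be checked before anything is touched.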

Added the 2.2.19 kernel and initrd to /etc/lilo.conf, and rebooted.  bproc
failures - not installed yet, but that was expected.

Now running 2.2.19 on the master node.  Built bproc stuff.  That seemed to
go OK as well.  The INSTALL file didn't quite seem to match the actual
though.

	make
	make install

Modules loaded cleanly.  Nice.  Copied the modules to the right place.

	cp vmadump/vmadump.o /lib/modules/2.2.19/misc
	cp ksyscall/ksyscall.o /lib/modules/2.2.19/misc
	cp bproc/bproc.o /lib/modules/2.2.19/misc

Rebooted to see that they load during the boot.  Works fine.  Nice.  So now
the master node is running 2.2.19 patched with bproc, and appears to be
fine.  Time to build a netboot stage 2 image.

	beoboot -d -2 -n -k /boot/vmlinuz-2.2.19 -m /lib/modules/2.2.19 > /tmp/beoboot.txt 2>&1

Check the debug output.  Looks good, it grabbed 2.2.19 kernel and the right
modules.  OK, boot one of the new eval nodes - everything seems to go OK,
but only seems to.  As the stage 2 kernel boots, the screen goes black for
about 10 seconds, then it coldboots.  Dang it.  Redid the netboot image with
noapic just in case ...

	beoboot -d -2 -n -c noapic -k /boot/vmlinuz-2.2.19 -m /lib/modules/2.2.19 > /tmp/beoboot.txt 2>&1

No go.  Same thing.  Dang it :(
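
To keep the kernel and modules arguments from drifting apart between
attempts, the command line can be generated from a single version string.
A rough sketch: beoboot_cmd is a hypothetical helper, and the flags are
copied verbatim from the two commands above rather than checked against the
beoboot documentation.  It only prints the command; it does not run beoboot.

```shell
# beoboot_cmd VERSION [KERNEL_ARGS]
#   Prints a beoboot command that uses the same VERSION for both the
#   kernel image and the modules directory.
beoboot_cmd() {
    ver=$1
    cmdline=$2
    if [ -n "$cmdline" ]; then
        echo "beoboot -d -2 -n -c $cmdline -k /boot/vmlinuz-$ver -m /lib/modules/$ver"
    else
        echo "beoboot -d -2 -n -k /boot/vmlinuz-$ver -m /lib/modules/$ver"
    fi
}
```

beoboot_cmd 2.2.19 noapic reproduces the second command above, minus the
redirection to /tmp/beoboot.txt.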

My next step is to build a 2.2.19 kernel with only what's needed for the
master and compute nodes.  Although not completely homogeneous, it will be
pretty close.  Another option is to try the latest Alan Cox 2.2.19 ...
Hmmm.  I think I'll grab that first - more chance of Via chipset fixes in
there.

These eval nodes came with a Red Hat 7.1 base install with a 2.4.x kernel.
That comes up fine in SMP mode, so that's another (albeit more painful)
option.  How hard is it to patch bproc etc. into 2.4.x?



