[Beowulf] moving of Linux HDD to other node: udev problem at boot

Mikhail Kuzminsky kus at free.net
Wed Aug 19 10:12:41 PDT 2009

As was discussed here earlier, there are NUMA problems with Nehalem on 
a number of Linux distributions/kernels. I was told that the old 
OpenSuSE 10.3 default kernel (2.6.22) may work correctly with Nehalem 
in the NUMA sense, i.e. it gives the right /sys/devices/system/node 
content.
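
A minimal sketch of how the /sys/devices/system/node content could be 
checked from a running system, assuming Python 3 is available there. 
The function name list_numa_nodes is only illustrative; on a dual-socket 
machine with working NUMA detection one would expect two node ids.

```python
import os
import re


def list_numa_nodes(node_dir="/sys/devices/system/node"):
    """Return the sorted NUMA node ids found as nodeN directories."""
    if not os.path.isdir(node_dir):
        return []
    ids = []
    for name in os.listdir(node_dir):
        match = re.fullmatch(r"node(\d+)", name)
        # Only nodeN directories count; files such as "online" are skipped.
        if match and os.path.isdir(os.path.join(node_dir, name)):
            ids.append(int(match.group(1)))
    return sorted(ids)


if __name__ == "__main__":
    nodes = list_numa_nodes()
    print("NUMA nodes seen by the kernel:", nodes)
```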

I moved a Western Digital SATA HDD with SuSE 10.3 installed (on a dual 
Barcelona server) to a dual Nehalem server with a Supermicro X8DTi 
motherboard, where it is the master HDD.

But booting SuSE 10.3 on the Nehalem server was not successful. The 
GRUB loader (whose menu.lst configuration uses "by-id" identification 
of disk partitions) works OK. But the Linux kernel boot did not finish 
successfully: the /boot/04-udev.sh script (whose task is udev 
initialization; I think it comes from the initrd) does not see the 
root partition (the 1st partition on the HDD)!

At boot I see these messages:
SCSI subsystem initialized
ACPI Exception (processor_core_0787): Processor device isn't present
<a set of messages about usb>
Trying manual resume from /dev/sda2      /* it's swap */
resume device /dev/sda2 not found (ignoring)
Waiting for device /dev/disk/by-id/scsi-SATA-WDC_WD<name_of_disk>-part1 ...
                                         /* echo from udev.sh */

and then an offer to try again. After this script finishes, I don't 
see any HDDs in /dev.
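
For comparison, a sketch of how the "by-id" names can be listed on a 
system that does boot (for example on the original server), assuming 
Python 3 is available there; it will not run inside the initrd shell. 
The function name resolve_by_id is hypothetical. The output can be 
compared with the name that menu.lst and the udev script wait for.

```python
import os


def resolve_by_id(by_id_dir="/dev/disk/by-id"):
    """Map each by-id name to the device node its symlink resolves to."""
    if not os.path.isdir(by_id_dir):
        # No by-id directory at all means udev created no disk links.
        return {}
    mapping = {}
    for name in sorted(os.listdir(by_id_dir)):
        mapping[name] = os.path.realpath(os.path.join(by_id_dir, name))
    return mapping


if __name__ == "__main__":
    links = resolve_by_id()
    if not links:
        print("no entries under /dev/disk/by-id")
    for name, target in links.items():
        print(name, "->", target)
```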

The BIOS setting for this SATA device is "enhanced". "Compatible" mode 
gives the same result.

What may be the source of the problem? Maybe the HDD driver used by 
the initrd?
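
One way to test the initrd driver idea is sketched below. It assumes 
that, as on SuSE systems, the modules put into the initrd are listed in 
the INITRD_MODULES line of /etc/sysconfig/kernel on the installed 
system. The candidate driver names ahci and ata_piix are an assumption 
about what an Intel SATA controller may need, not something confirmed 
for this board, and the function names are hypothetical.

```python
import os
import re
import shlex


def parse_initrd_modules(text):
    """Return the module names from an INITRD_MODULES="..." line, or []."""
    for line in text.splitlines():
        match = re.match(r"\s*INITRD_MODULES\s*=\s*(.*)$", line)
        if match:
            # shlex strips the shell quoting and any trailing comment.
            parts = shlex.split(match.group(1), comments=True)
            return parts[0].split() if parts else []
    return []


def missing_drivers(modules, candidates=("ahci", "ata_piix")):
    """Return the candidate drivers (assumed names) absent from modules."""
    return [name for name in candidates if name not in modules]


if __name__ == "__main__":
    path = "/etc/sysconfig/kernel"
    if os.path.isfile(path):
        with open(path) as fh:
            modules = parse_initrd_modules(fh.read())
        print("initrd modules:", modules)
        print("candidates not in initrd:", missing_drivers(modules))
    else:
        print(path, "not found")
```

If a needed driver turns out to be missing, the usual remedy on SuSE is 
to add it to INITRD_MODULES and rebuild the initrd with mkinitrd, which 
has to be done from a system that can boot or from a rescue environment.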

Mikhail Kuzminsky
Computer Assistance to Chemical Research Center
Zelinsky Institute of Organic Chemistry RAS

PS. If I look at the content of /sys (after the udev.sh script 
finishes), it is right in the NUMA sense, i.e. 
/sys/devices/system/node contains normal node0 and node1.
