[Beowulf] noob Root-NFS question

Cocoy Dayao cocoy.dayao at gmail.com
Sun May 3 15:25:16 PDT 2009


Dear list,

I'm not sure if this is the right forum for this, but here goes; I
hope you can help me out. I wanted to play around with building my own
Beowulf cluster. Yes, I know there are easier ways to do this, such as
automated tools like Caos Linux. However, I wanted to use Gentoo and
learn from the ground up. I wanted a diskless client, so I booted a box
via PXE, TFTP and DHCP. It boots, but it can't seem to find NFS.

And yes, I've googled; none of the suggestions I found have worked. I
don't know what I'm missing.

I get this:

rpcbind: server 192.168.2.1 not responding, timed out
Root-NFS: Unable to get nfsd port number from server, using default
Looking up port of RPC 100005/1 on 192.168.2.1
rpcbind: server 192.168.2.1 not responding, timed out
Root-NFS: Unable to get mountd port number from server, using default
Root-NFS: Server returned error -5 while mounting /diskless/192.168.2.11
VFS: Unable to mount root fs via NFS, trying floppy.
VFS: Cannot open root device "nfs" or unknown-block(2,0)
Kernel Panic - not syncing: VFS unable to mount root fs on unknown-block(2,0)

Both client and server have root NFS enabled in their respective
kernels.

I have turned off the firewall on the server and still get the same
error.

pxelinux.cfg is this:

DEFAULT /kernel8
APPEND root=/dev/nfs rw nfsroot=192.168.2.1:/diskless/192.168.2.11 init=sbin/init

This is rpcinfo:

talon dhcp # rpcinfo -p 192.168.2.1
    program vers proto   port
     100000    4   tcp    111  portmapper
     100000    3   tcp    111  portmapper
     100000    2   tcp    111  portmapper
     100000    4   udp    111  portmapper
     100000    3   udp    111  portmapper
     100000    2   udp    111  portmapper
     100024    1   udp  45975  status
     100024    1   tcp  57882  status
     100005    1   udp  57290  mountd
     100005    1   tcp  50765  mountd
     100005    2   udp  57290  mountd
     100005    2   tcp  50765  mountd
     100005    3   udp  57290  mountd
     100005    3   tcp  50765  mountd
     100003    2   udp   2049  nfs
     100003    3   udp   2049  nfs
     100021    1   udp  57739  nlockmgr
     100021    3   udp  57739  nlockmgr
     100021    4   udp  57739  nlockmgr
     100021    1   tcp  45392  nlockmgr
     100021    3   tcp  45392  nlockmgr
     100021    4   tcp  45392  nlockmgr
     100003    2   tcp   2049  nfs
     100003    3   tcp   2049  nfs
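
The kernel message "Looking up port of RPC 100005/1" refers to program 100005, version 1, which the table above lists as mountd. As a cross-check, here is a minimal sketch that parses a few rows copied from that output and looks up the ports the client is asking for. It assumes the kernel's root-NFS client queries over UDP; that is not stated anywhere in this post. The names RPCINFO_ROWS and parse_rows are illustrative only.

```python
# Rows copied from the rpcinfo -p output above: program, version, proto, port, name.
RPCINFO_ROWS = """\
100005    1   udp  57290  mountd
100005    1   tcp  50765  mountd
100003    2   udp   2049  nfs
100003    3   udp   2049  nfs
100003    2   tcp   2049  nfs
"""


def parse_rows(text):
    """Map (program, version, proto) -> port for each well-formed row."""
    table = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated rows
        program, version, proto, port = fields[:4]
        table[(int(program), int(version), proto)] = int(port)
    return table


table = parse_rows(RPCINFO_ROWS)

# Assumption: the client uses UDP for both lookups.
mountd_udp = table.get((100005, 1, "udp"))
nfs_udp = table.get((100003, 2, "udp"))

print("mountd v1 over UDP:", mountd_udp)
print("nfs v2 over UDP:", nfs_udp)
```

Both services are registered on the server, so the "not responding, timed out" messages suggest the lookups never get an answer back, rather than the services being absent.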

ps -aef | grep rpc is this:

talon conf.d # ps -aef | grep rpc
root      1101     2  0 18:14 ?        00:00:00 [rpciod/0]
root      1102     2  0 18:14 ?        00:00:00 [rpciod/1]
root      8332     1  0 18:15 ?        00:00:00 /sbin/rpcbind
nobody    8356     1  0 18:15 ?        00:00:00 /sbin/rpc.statd --no-notify
root      8379     1  0 18:15 ?        00:00:00 /usr/sbin/rpc.mountd
root      8587  8560  0 18:22 pts/0    00:00:00 grep --colour=auto rpc

tcpdump:

17), length 57) master.talon.11978 > node01.talon.57100: UDP, length 29
17:39:47.683582 IP (tos 0x0, ttl 64, id 53556, offset 0, flags [DF], proto UDP (17), length 52) master.talon.11974 > node01.talon.57099: UDP, length 24
17:39:48.451700 IP (tos 0x0, ttl 64, id 54326, offset 0, flags [DF], proto UDP (17), length 57) master.talon.11976 > node01.talon.57100: UDP, length 29
17:39:49.665576 IP (tos 0x0, ttl 64, id 63547, offset 0, flags [DF], proto UDP (17), length 57) master.talon.11978 > node01.talon.57100: UDP, length 29
17:39:49.762700 IP (tos 0x0, ttl 64, id 55637, offset 0, flags [DF], proto UDP (17), length 57) master.talon.11975 > node01.talon.57100: UDP, length 29
17:39:50.661534 arp who-has node01.talon tell master.talon
17:39:51.662530 arp who-has node01.talon tell master.talon
17:39:52.401575 IP (tos 0x0, ttl 64, id 58276, offset 0, flags [DF], proto UDP (17), length 57) master.talon.11977 > node01.talon.57100: UDP, length 29
17:39:52.662526 arp who-has node01.talon tell master.talon
17:39:54.471660 arp who-has node01.talon tell master.talon

The ARP requests begin at the point where the kernel panic occurs.

This is my /etc/exports file:

#/etc/exports: NFS file systems being exported.  See exports(5).
/diskless/192.168.2.11   *(rw,no_root_squash,no_all_squash,no_subtree_check)
/opt    192.168.2.0/24(ro,no_root_squash,no_all_squash,no_subtree_check)
/usr    192.168.2.0/24(ro,no_root_squash,no_all_squash,no_subtree_check)
/home   192.168.2.0/24(rw,no_root_squash,no_all_squash,no_subtree_check)

/var/log         192.168.2.11(rw,no_root_squash,no_all_squash,no_subtree_check)

My DHCP configuration is this:

# my dhcpd.conf for diskless clients
allow booting;
#allow bootp;

#tftp
next-server 192.168.2.1;
#option root-path "/diskless/192.168.2.11";

option space PXE;
option PXE.mtftp-ip               code 1 = ip-address;
option PXE.mtftp-cport            code 2 = unsigned integer 16;
option PXE.mtftp-sport            code 3 = unsigned integer 16;
option PXE.mtftp-tmout            code 4 = unsigned integer 8;
option PXE.mtftp-delay            code 5 = unsigned integer 8;
option PXE.discovery-control      code 6 = unsigned integer 8;
option PXE.discovery-mcast-addr   code 7 = ip-address;

subnet 192.168.2.0 netmask 255.255.255.128 {
         range 192.168.2.11 192.168.2.20;
         option domain-name-servers 192.168.2.1;
         option domain-name "talon";
         option routers 192.168.2.1;
         option broadcast-address 192.168.2.195;
         option root-path "192.168.2.1:/diskless/192.168.2.11";
         default-lease-time 600;
         max-lease-time 7200;
         next-server 192.168.2.1;

         class "pxeclient" {
                 match if substring (option vendor-class-identifier, 0, 9) = "PXEClient";
                 vendor-option-space PXE;

                 option PXE.mtftp-ip 0.0.0.0;
                 #option PXE.mtftp-ip 192.168.2.1;
                 filename "pxelinux.0";
         }
         # host declaration for diskless node

         host node01.talon {
                 hardware ethernet 00:1c:c0:4f:bd:e1;
                 fixed-address 192.168.2.11;
         }
}
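
One sanity check on the subnet block above is to compute the broadcast address implied by the netmask and compare it with the configured one. This is a minimal sketch using Python's standard ipaddress module. The addresses are copied from the dhcpd.conf above; the variable names are illustrative. Whether any mismatch is related to the NFS timeouts is not established here.

```python
import ipaddress

# Values copied from the subnet declaration in dhcpd.conf above.
network = ipaddress.ip_network("192.168.2.0/255.255.255.128")
configured_broadcast = ipaddress.ip_address("192.168.2.195")
server = ipaddress.ip_address("192.168.2.1")
client = ipaddress.ip_address("192.168.2.11")

# Broadcast address that follows from the network address and netmask.
expected_broadcast = network.broadcast_address

print("expected broadcast:", expected_broadcast)
print("configured broadcast matches:", configured_broadcast == expected_broadcast)
print("configured broadcast inside subnet:", configured_broadcast in network)
print("server and client inside subnet:", server in network and client in network)
```

With a 255.255.255.128 netmask the subnet covers 192.168.2.0 through 192.168.2.127, so the configured broadcast address of 192.168.2.195 falls outside it, while the server and client addresses fall inside.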

I also tried this: I used the same cable, attached it to my Mac, and
mounted the /diskless/192.168.2.11 NFS share, which points to the
diskless client's root. I was able to mount it, so I know NFS works.
And yes, I made sure to turn off the Mac's wireless, so only Ethernet
was plugged in.

What did I miss?

I appreciate your advice.

Cocoy
www.twitter.com/cocoy
"People who are really serious about software should make their own  
hardware" -- Alan Kay