Scyld 27Z-8 Gig Net - HELP!
Karen Keadle-Calvert calvert at scyld.com
Thu Sep 26 08:35:50 PDT 2002
- Previous message: Scyld 27Z-8 Gig Net - HELP!
- Next message: Scyld 27Z-8 Gig Net - HELP!
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Stanley,

I know you said you modified all of the files, but just to review: under 27z-8, you need to modify the file /etc/beowulf/config.boot to add the device and vendor information for the newer e1000 card. So you'll need to add the following line:

    pci 0x8086 0x100E e1000

In addition, make sure you have a 'bootmodule' entry for "e1000" near the beginning of the file. Next, rebuild your node boot floppy and beoboot images and try rebooting.

If you've already done all of that (which it sounds like you have), then attached are some directions for building an e1000 driver under Scyld. Hopefully, this solves your problem.

Regards,
Karen

Stanley, Matthew D. wrote:

>I have several clusters running the public release of 27Z-8. Up until now they have been exclusively via-rhine and 3c59x based 100mbit clusters. We wanted to upgrade to gigabit ethernet and decided to upgrade our 4 machine cluster using Dlink DGE-500T cards (ns820/ns83820 based). I compiled the latest netdrivers.tgz file, and the ns820 driver appeared to work fine as a link to the outside world but did not function on the beoboot floppy, even though I compiled for that kernel and even did a full kernel set rebuild (rpm -bb) including the new netdrivers.tgz file. What happened was that right after it would find the card, find the master server and assign the IP address, it would just sit at the line where it requests /var/beowulf/boot.img.
>
>Ok, so I gave up on Dlink cards and purchased 4 Intel PRO/1000MT cards, the new version which requires the new release of drivers since its PCI ID is 8086:100E and not 8086:1000. I again compiled the drivers and tested the card to the internet side with 0 problems. I then created my boot images and tried to boot. It gets a little farther than the Dlink: it actually starts to boot the net boot image and then locks up and never completes.
>
>Am I missing something here? I've modified all of the files, it finds the cards, and it even works for days on the internet if I switch my card to eth0 and not eth1. It appears to be a driver issue, yet I have similar problems with two completely different sets of cards. I have even tried using a 100 mbit hub instead of a gigabit switch with identical results. I can also just take out the cards and put in 3c59x cards and the problem is fixed!
>
>We use our clusters for NAMD only. Is there a way to just install full versions of Scyld and then execute bpslave? If so, what modifications need to be done to the node_up and other scripts to make that work? I realize this means more administration, but at this point I have spent weeks trying to make this work, and I can install and update 4 machines in a matter of a couple of hours.
>
>Are there settings in beoboot which change the way it gets the information from the master node, maybe making it more reliable, like broadcast/multicast, etc.?
>
>Any help would be appreciated,
>
>Matt Stanley
>Systems Administrator
>Structural Biology Core
>University of Missouri - Columbia
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
>
>
-------------- next part --------------
HOW TO ADD DRIVERS - Example shown for Intel Pro/1000 series gigabit adapters
------------------

=> If available, get the prebuilt modules for the appropriate kernel from:

       ftp://www.scyld.com/pub/beowulf/<version>/updates

   For example, for the 2.2.19-12 kernel:

       ftp://www.scyld.com/pub/beowulf/27z-8/updates/e1000-3.6.8.1.tar.gz

=> If not available, download source code for driver.
   The Intel Pro/1000 series driver can be found at
       ftp://www.intel.com/df-support/2897/eng
   or
       http://downloadfinder.intel.com/scripts-df/Product_Filter.asp?ProductID=415
   or
       http://support.intel.com/support/go/linux/e1000.htm

   NOTE: If the kernel source rpm was not installed, you'll have to install it first. It is installed by default under 27cz-9, but not under 28cz-8-beta2. The kernel source is available on the distribution CD under Scyld/RPMS/kernel-source-2.4.9-21.1.i386.rpm

=> Add this line to the beginning of the Makefile:

       CFLAGS = $(KCFLAGS)

=> Make the beoboot, SMP, and UP modules for the version of the Scyld kernel that you are running (27cz-9 shown here):

       > make KCFLAGS="-D__BOOT_KERNEL_H_ -D__module__beoboot"
       > mv e1000.o /lib/modules/2.2.19-14.beobeoboot/net
       > make KCFLAGS="-D__BOOT_KERNEL_H_ -D__BOOT_KERNEL_SMP=1"
       > mv e1000.o /lib/modules/2.2.19-14.beosmp/net
       > make KCFLAGS="-D__BOOT_KERNEL_H_ -D__BOOT_KERNEL_UP=1"
       > mv e1000.o /lib/modules/2.2.19-14.beo/net

=> Add new entries for this module to the PCI table

   1. Add, if necessary, the following bootmodule entry to the configuration file (in /etc/beowulf/config.boot for 27cz-9 and /etc/beowulf/config for 28cz-4):

          bootmodule e1000

   2. Add entries to the device list for each device supported by this driver (in /etc/beowulf/config.boot for 27cz-9 and /usr/share/kudzu/pcitable for 28cz-1):

          pci 0x8086 0x1000 e1000
          pci 0x8086 0x1001 e1000
          pci 0x8086 0x1004 e1000
          pci 0x8086 0x1008 e1000
          pci 0x8086 0x1009 e1000
          pci 0x8086 0x100c e1000

=> Build the dependency file (for each kernel) used by modprobe to load the correct module:

   For the single processor kernel:
       depmod -a -e -F /boot/System.map-2.2.19-14.beo 2.2.19-14.beo

   For the SMP (more than one processor) kernel:
       depmod -a -e -F /boot/System.map-2.2.19-14.beosmp 2.2.19-14.beosmp

   For the beoboot kernel (Stage 1 image):
       depmod -a -e -F /boot/System.map-2.2.19-14.beobeoboot 2.2.19-14.beobeoboot

=> Rebuild the Phase 1 and Phase 2 kernel images:

       /usr/bin/beoboot -1 -f -o /dev/fd0 -c "apm=power-off"
       /usr/bin/beoboot -2 -n -k /boot/vmlinuz-`uname -r` -o /var/beowulf/boot.img -c "apm=power-off"

NOTE:
----
If your master node is single processor and your compute nodes are SMP, and you don't have an SMP kernel installed, you'll have to get the RPM from the distribution CD and install it (using rpm -U). This happens when you install on a single processor machine, because the installer selects the kernel to install based on the machine it is running on. You must run the same kernel on all of the machines in the cluster. The SMP kernel can run on both single processor and SMP machines.
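
Before rebuilding the boot images, it may help to confirm that the configuration file really contains the bootmodule and pci entries described above. The following is only a minimal sketch: check_e1000_config is a hypothetical helper (not a Scyld tool), it only reads the file, and it does not check that the bootmodule entry sits near the beginning of the file. The device IDs to look for are passed as arguments, so the 0x100E ID for the newer card can be included.

```shell
#!/bin/sh
# check_e1000_config FILE DEVICE_ID...
# Hypothetical helper, not part of Scyld: reports which of the expected
# e1000 entries are absent from FILE. It never modifies FILE.
check_e1000_config() {
    cfg=$1
    shift
    missing=0
    # The driver must be listed as a boot module.
    if ! grep -qE '^[[:space:]]*bootmodule[[:space:]]+e1000([[:space:]]|$)' "$cfg"; then
        echo "missing: bootmodule e1000"
        missing=1
    fi
    # One pci line per device ID, Intel vendor ID 0x8086.
    # -i lets 0x100E in the argument match 0x100e in the file.
    for dev in "$@"; do
        if ! grep -qiE "^[[:space:]]*pci[[:space:]]+0x8086[[:space:]]+${dev}[[:space:]]+e1000([[:space:]]|\$)" "$cfg"; then
            echo "missing: pci 0x8086 $dev e1000"
            missing=1
        fi
    done
    return $missing
}

# Demonstration on a scratch file rather than the live configuration.
scratch=$(mktemp)
printf 'bootmodule e1000\npci 0x8086 0x1000 e1000\n' > "$scratch"
if check_e1000_config "$scratch" 0x1000 0x100E; then
    echo "all entries present"
fi
rm -f "$scratch"
```

The demonstration prints "missing: pci 0x8086 0x100E e1000", since the scratch file only lists the older 0x1000 device. On a 27cz-9 style setup the same function could be pointed at /etc/beowulf/config.boot instead of the scratch file.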
