[Beowulf] OFED/IB for FC8
Michael H. Frese Michael.Frese at NumerEx-LLC.com
Wed Jun 4 15:52:14 PDT 2008
Following Jeff Layton's post to this list [Cheap SDR IB] on January 28, we purchased 8 Infinihost LX's and an 8 port switch, and began trying to get the OpenFabrics (OFED) release of MVAPICH for Fedora Core 6 to run on our new machines. We develop and run a multiphysics code in a relatively fine-grained parallel mode where latency dominates the performance scaling, so it seemed like a good thing to try. This is our first exposure to InfiniBand, though we have considerable experience with MPI, both in-memory and over GigE, including using netpipe to measure latency and bandwidth.

Those machines have AMD Athlon X2 6000+'s on Asus M2N-SLI Deluxe motherboards with an open PCI Express slot that will handle x4.

The main issue is that we are presently running Fedora Core 8 and the 2.6.21 SMP kernel, but there is no OFED release for FC8 yet. Is anyone else working on this? Has anyone succeeded at getting it to work?

We started with OFED version 1.2.5 from

    http://www.openfabrics.org/downloads/OFED/ofed-1.2.5/OFED-1.2.5-RPMS/

We downloaded all the rpms for the redhat-release-4AS-6.1 version. In particular, the kernel rpms are kernel-ib-devel-1.2-2.6.9_55.ELsmp and kernel-ib-1.2-2.6.9_55.ELsmp. We used the 1.2.5 version because there don't seem to be any rpms for the 1.3 version.

All the OFED rpms for FC6 installed on FC8 without difficulty, except for opensm-3.0.3-0.ppc64.rpm. It did not report "missing dependencies ..."; it just got stuck. We had to kill the 'rpm -ivh', remove the lock file, and rebuild the rpm database. After that,

    # lsmod | grep ib

shows about 15 IB-related kernel modules.

Even so, at this point, some of the IB stuff works. We can run ibnetdiscover and see the HCAs on the two machines that have the rpms installed, and the switch, too. We could use that to make a topology file, but we don't know where to put it, or even whether we should put it anywhere. We can run ibchecknet, and though it finds 4 nodes, it says they are all bad.
It also reports "lid 0 address resolution: FAILED". We have not succeeded in getting ibping to work, and aren't really sure how to specify the remote address for it.

We found

    /usr/share/doc/ofed-docs-1.2/README.txt
    /usr/share/doc/ofed-docs-1.2/OFED_Installation_Guide.txt

and, as described there, did

    # /etc/init.d/openibd start
    Loading QLogic InfiniPath driver: [FAILED]
    Loading HCA driver and Access Layer: [ OK ]
    Setting up InfiniBand network interfaces:
    Failed to configure IPoIB connected mode for ib0
    Bringing up interface ib0: [FAILED]
    Setting up service network . . . [ done ]
    Loading ib_sdp [FAILED]
    Loading ib_vnic [FAILED]
    Module ib_vnic not loaded.
    Bringing up VNIC interfaces [FAILED]

That mostly looks bad. Does anyone have any suggestions?

We are willing to try a build from source, but we are unsure of what challenges might lie down that path. We'd rather not fall back to FC6, but we may have to do that.

Thanks for your help.

Mike Frese
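
One check that may be relevant to the failures above: the kernel-ib rpms named in this message carry 2.6.9_55.ELsmp in their names, while the running kernel is reported as 2.6.21. The following is a minimal sketch of comparing the two, not a diagnosis. It assumes the kernel version is the part of the rpm name after the OFED version field, and that the rpm name uses "_" where the kernel release string uses "-". The function names rpm_kernel_version and same_kernel are hypothetical helpers, not part of OFED.

```shell
#!/bin/sh
# Hypothetical helper: extract the kernel version embedded in a kernel-ib
# rpm name, e.g. kernel-ib-1.2-2.6.9_55.ELsmp -> 2.6.9_55.ELsmp
rpm_kernel_version() {
    # Drop the package-name prefix, then the OFED version field ("1.2-").
    printf '%s\n' "$1" |
        sed -e 's/^kernel-ib-devel-//' -e 's/^kernel-ib-//' -e 's/^[^-]*-//'
}

# Hypothetical helper: succeed only if the rpm was built for the given kernel.
# $1 = rpm name, $2 = kernel release string such as the output of `uname -r`
same_kernel() {
    # Assumption: the rpm name writes "_" where the kernel release has "-".
    built_for=$(rpm_kernel_version "$1" | tr '_' '-')
    [ "$built_for" = "$2" ]
}

running=$(uname -r)
if same_kernel "kernel-ib-1.2-2.6.9_55.ELsmp" "$running"; then
    echo "kernel-ib rpm matches the running kernel ($running)"
else
    echo "kernel-ib rpm was built for a different kernel than $running"
fi
```

If the versions differ, the modules listed by lsmod may be coming from the distribution kernel rather than from the OFED rpms, which would be worth confirming before debugging further.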
