[Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

Prentice Bisbal pbisbal at pppl.gov
Tue Apr 30 14:42:47 PDT 2019


One more thing:

https://github.com/intel/psm
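
That repository is the upstream source for libpsm_infinipath.so.1. If
your distro doesn't package it, building from source is roughly this
(untested sketch; check the repo's README for the actual targets):

    git clone https://github.com/intel/psm
    cd psm
    make
    sudo make install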

Prentice

On 4/30/19 5:41 PM, Prentice Bisbal wrote:
>
> I agree with Gus that you should be asking this question on the 
> OpenMPI mailing list, where there's more expertise specific to 
> OpenMPI. Based on your error message:
>
>> mca_base_component_repository_open: unable to open
>> mca_btl_usnic: libpsm_infinipath.so.1: cannot open shared object file:
>> No such file or directory (ignored)
>
> it looks like this is not a problem with the IB libraries in general, 
> but that you are missing the PSM libraries. PSM is an additional 
> library used by QLogic cards, so installing Mellanox OFED will not 
> help you here; in fact, it will probably just make things worse. 
> Since Intel bought QLogic's InfiniBand business a while back, I think 
> you need to install the Intel PSM RPMs.
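>
> For example, something like this should show which package provides 
> the missing library and pull it in (package name is from memory; 
> check what your repos actually call it):
>
>     # which package owns the missing PSM library?
>     yum provides '*/libpsm_infinipath.so.1'
>     # on RHEL/CentOS 7 this is usually infinipath-psm
>     yum install infinipath-psm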
>
> If you are not using QLogic cards in your cluster, you will need to 
> rebuild OpenMPI without PSM support.
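>
> A minimal sketch of such a rebuild (assuming you configure from a 
> source tree; I've left out the rest of the Red Hat configure options 
> for brevity):
>
>     ./configure --prefix=/usr/lib64/openmpi3 --without-psm --without-psm2
>     make -j && make install
>
> Alternatively, since the messages say "(ignored)" and are largely 
> cosmetic, you can silence them without rebuilding:
>
>     export OMPI_MCA_mca_base_component_show_load_errors=0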
>
> Did you build this software on one system and install it on a shared 
> filesystem, or copy it to the other nodes after the build? If so, the 
> build system probably has the proper libraries installed while the 
> other nodes do not. The configure command line for this build doesn't 
> explicitly mention PSM:
>
>> Configure command line: '--prefix=/usr/lib64/openmpi3'
>>     '--mandir=/usr/share/man/openmpi3-x86_64'
>>     '--includedir=/usr/include/openmpi3-x86_64'
>>     '--sysconfdir=/etc/openmpi3-x86_64'
>>     '--disable-silent-rules' '--enable-builtin-atomics'
>>     '--enable-mpi-cxx' '--with-sge' '--with-valgrind'
>>     '--enable-memchecker' '--with-hwloc=/usr' 'CC=gcc' 'CXX=g++'
>>     'LDFLAGS=-Wl,-z,relro '
>>     'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
>>     -fstack-protector-strong --param=ssp-buffer-size=4
>>     -grecord-gcc-switches -m64 -mtune=generic'
>>     'CXXFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
>>     -fstack-protector-strong --param=ssp-buffer-size=4
>>     -grecord-gcc-switches -m64 -mtune=generic'
>>     'FC=gfortran' 'FCFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>     -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4
>>     -grecord-gcc-switches -m64 -mtune=generic'
>
> When a feature is included that isn't explicitly requested on the 
> configure line, it usually means the configure script probed for the 
> supporting libraries, found them on the build host, and enabled 
> support for that feature automatically.
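>
> You can check which components actually made it into a given build 
> with ompi_info itself, e.g.:
>
>     ompi_info | grep -Ei 'btl|mtl'
>
> If a psm-flavored component shows up there but the library is missing 
> on a node, you get exactly the load errors you're seeing.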
>
>
> Prentice
> On 4/30/19 2:28 PM, Gus Correa wrote:
>> Hi Faraz,
>>
>> My impression is that you're missing the IB libraries, and that Open MPI
>> was not built with IB support.
>> This is very likely the case if you're using the Open MPI 
>> packages from CentOS (openmpi3.x86_64, openmpi3-devel.x86_64),
>> which probably only have TCP/IP support built in (the common 
>> denominator network of most computers).
>> Building Open MPI from source is not difficult, and a must if you 
>> have IB cards.
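>>
>> A bare-bones sketch of such a build with verbs (openib) support (the
>> prefix here is just an example; exact options depend on your OFED
>> install):
>>
>>     ./configure --prefix=/opt/openmpi-3.0.2 --with-verbs
>>     make -j && make install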
>>
>> Notwithstanding the MPI expertise of the Beowulf mailing list 
>> subscribers,
>> if you post your message on the Open MPI mailing list, you'll get 
>> specific and detailed help in no time,
>> and minimize the suffering.
>>
>> My two cents,
>> Gus Correa
>>
>> On Tue, Apr 30, 2019 at 12:28 PM Faraz Hussain <info at feacluster.com> wrote:
>>
>>     Thanks, here is the output below:
>>
>>     [hussaif1 at lustwzb34 ~]$ ompi_info
>>     [lustwzb34:10457] mca_base_component_repository_open: unable to open mca_btl_usnic: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
>>     [lustwzb34:10457] mca_base_component_repository_open: unable to open mca_mtl_ofi: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
>>     [lustwzb34:10457] mca_base_component_repository_open: unable to open mca_mtl_psm: libpsm_infinipath.so.1: cannot open shared object file: No such file or directory (ignored)
>>                       Package: Open MPI mockbuild at
>>                                x86-041.build.eng.bos.redhat.com Distribution
>>                      Open MPI: 3.0.2
>>        Open MPI repo revision: v3.0.2
>>         Open MPI release date: Jun 01, 2018
>>                      Open RTE: 3.0.2
>>        Open RTE repo revision: v3.0.2
>>         Open RTE release date: Jun 01, 2018
>>                          OPAL: 3.0.2
>>            OPAL repo revision: v3.0.2
>>             OPAL release date: Jun 01, 2018
>>                       MPI API: 3.1.0
>>                  Ident string: 3.0.2
>>                        Prefix: /usr/lib64/openmpi3
>>       Configured architecture: x86_64-unknown-linux-gnu
>>                Configure host: x86-041.build.eng.bos.redhat.com
>>                 Configured by: mockbuild
>>                 Configured on: Wed Jun 13 14:18:03 EDT 2018
>>                Configure host: x86-041.build.eng.bos.redhat.com
>>        Configure command line: '--prefix=/usr/lib64/openmpi3'
>>                                '--mandir=/usr/share/man/openmpi3-x86_64'
>>                                '--includedir=/usr/include/openmpi3-x86_64'
>>                                '--sysconfdir=/etc/openmpi3-x86_64'
>>                                '--disable-silent-rules' '--enable-builtin-atomics'
>>                                '--enable-mpi-cxx' '--with-sge' '--with-valgrind'
>>                                '--enable-memchecker' '--with-hwloc=/usr' 'CC=gcc'
>>                                'CXX=g++' 'LDFLAGS=-Wl,-z,relro '
>>                                'CFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>                                -fexceptions -fstack-protector-strong
>>                                --param=ssp-buffer-size=4 -grecord-gcc-switches
>>                                -m64 -mtune=generic'
>>                                'CXXFLAGS= -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2
>>                                -fexceptions -fstack-protector-strong
>>                                --param=ssp-buffer-size=4 -grecord-gcc-switches
>>                                -m64 -mtune=generic'
>>                                'FC=gfortran' 'FCFLAGS= -O2 -g -pipe -Wall
>>                                -Wp,-D_FORTIFY_SOURCE=2 -fexceptions
>>                                -fstack-protector-strong --param=ssp-buffer-size=4
>>                                -grecord-gcc-switches -m64 -mtune=generic'
>>                      Built by: mockbuild
>>                      Built on: Wed Jun 13 14:25:02 EDT 2018
>>                    Built host: x86-041.build.eng.bos.redhat.com
>>                    C bindings: yes
>>                  C++ bindings: yes
>>                   Fort mpif.h: yes (all)
>>                  Fort use mpi: yes (limited: overloading)
>>             Fort use mpi size: deprecated-ompi-info-value
>>              Fort use mpi_f08: no
>>       Fort mpi_f08 compliance: The mpi_f08 module was not built
>>        Fort mpi_f08 subarrays: no
>>                 Java bindings: no
>>        Wrapper compiler rpath: runpath
>>                    C compiler: gcc
>>           C compiler absolute: /usr/bin/gcc
>>        C compiler family name: GNU
>>            C compiler version: 4.8.5
>>                  C++ compiler: g++
>>         C++ compiler absolute: /usr/bin/g++
>>                 Fort compiler: gfortran
>>             Fort compiler abs: /usr/bin/gfortran
>>               Fort ignore TKR: no
>>         Fort 08 assumed shape: no
>>            Fort optional args: no
>>                Fort INTERFACE: yes
>>          Fort ISO_FORTRAN_ENV: yes
>>             Fort STORAGE_SIZE: no
>>            Fort BIND(C) (all): no
>>            Fort ISO_C_BINDING: yes
>>       Fort SUBROUTINE BIND(C): no
>>             Fort TYPE,BIND(C): no
>>       Fort T,BIND(C,name="a"): no
>>                  Fort PRIVATE: no
>>                Fort PROTECTED: no
>>                 Fort ABSTRACT: no
>>             Fort ASYNCHRONOUS: no
>>                Fort PROCEDURE: no
>>               Fort USE...ONLY: no
>>                 Fort C_FUNLOC: no
>>       Fort f08 using wrappers: no
>>               Fort MPI_SIZEOF: no
>>                   C profiling: yes
>>                 C++ profiling: yes
>>         Fort mpif.h profiling: yes
>>        Fort use mpi profiling: yes
>>         Fort use mpi_f08 prof: no
>>                C++ exceptions: no
>>                Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
>>                                OMPI progress: no, ORTE progress: yes, Event lib: yes)
>>                 Sparse Groups: no
>>        Internal debug support: no
>>        MPI interface warnings: yes
>>           MPI parameter check: runtime
>>     Memory profiling support: no
>>     Memory debugging support: no
>>                    dl support: yes
>>         Heterogeneous support: no
>>       mpirun default --prefix: no
>>               MPI I/O support: yes
>>             MPI_WTIME support: native
>>           Symbol vis. support: yes
>>         Host topology support: yes
>>                MPI extensions: affinity, cuda
>>         FT Checkpoint support: no (checkpoint thread: no)
>>         C/R Enabled Debugging: no
>>        MPI_MAX_PROCESSOR_NAME: 256
>>          MPI_MAX_ERROR_STRING: 256
>>           MPI_MAX_OBJECT_NAME: 64
>>              MPI_MAX_INFO_KEY: 36
>>              MPI_MAX_INFO_VAL: 256
>>             MPI_MAX_PORT_NAME: 1024
>>        MPI_MAX_DATAREP_STRING: 128
>>                 MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                 MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                 MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA btl: openib (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA btl: self (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA btl: tcp (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA btl: vader (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                  MCA compress: bzip (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                  MCA compress: gzip (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA crs: none (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                        MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                     MCA event: libevent2022 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA hwloc: external (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                        MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                        MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>               MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>               MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                MCA memchecker: valgrind (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                    MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                   MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                      MCA pmix: pmix2x (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                    MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v3.0.2)
>>                     MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA timer: linux (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA dfs: app (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                       MCA dfs: orted (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                       MCA dfs: test (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                    MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA errmgr: dvm (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA ess: env (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA ess: tool (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                     MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                   MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA iof: tool (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA iof: orted (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                  MCA notifier: syslog (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                      MCA odls: default (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA oob: ud (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA ras: gridengine (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA rml: oob (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA routed: binomial (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA routed: debruijn (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA routed: direct (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                    MCA routed: radix (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                    MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                    MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                    MCA schizo: orte (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                    MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                     MCA state: app (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                     MCA state: dvm (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                     MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                     MCA state: novm (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                     MCA state: orted (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                     MCA state: tool (MCA v2.1.0, API v1.0.0, Component v3.0.2)
>>                       MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA coll: basic (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA coll: inter (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA coll: self (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA coll: sync (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA fcoll: static (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                     MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                        MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                        MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                        MCA io: romio314 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA mtl: psm2 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v3.0.2)
>>                       MCA pml: v (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                       MCA rte: orte (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                  MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                  MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                  MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>                      MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component v3.0.2)
>>                      MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v3.0.2)
>>                 MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component v3.0.2)
>>
>>
>>     Quoting John Hearns <hearnsj at googlemail.com>:
>>
>>     > Hello Faraz. Please start by running this command: ompi_info
>>     >
>>     > On Tue, 30 Apr 2019 at 15:15, Faraz Hussain <info at feacluster.com> wrote:
>>     >
>>     >> I installed Red Hat 7.5 on two machines with the following
>>     >> Mellanox cards:
>>     >>
>>     >> 87:00.0 Network controller: Mellanox Technologies MT27520 Family
>>     >> [ConnectX-3 Pro]
>>     >>
>>     >> I followed the steps outlined here to verify RDMA is working:
>>     >>
>>     >> https://community.mellanox.com/s/article/howto-enable-perftest-package-for-upstream-kernel
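>>     >>
>>     >> (For reference, the checks boil down to: ibv_devinfo should show
>>     >> the port state as PORT_ACTIVE, and the perftest pair ib_write_bw
>>     >> on one node plus ib_write_bw <server> on the other exercises RDMA
>>     >> directly.)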
>>     >>
>>     >> However, I cannot seem to get Open MPI 3.0.2 to work. When I run
>>     >> it, I get this error:
>>     >>
>>     >>
>>     >> --------------------------------------------------------------------------
>>     >> No OpenFabrics connection schemes reported that they were able to be
>>     >> used on a specific port. As such, the openib BTL (OpenFabrics
>>     >> support) will be disabled for this port.
>>     >>
>>     >>   Local host:       lustwzb34
>>     >>   Local device:     mlx4_0
>>     >>   Local port:       1
>>     >>   CPCs attempted:   rdmacm, udcm
>>     >> --------------------------------------------------------------------------
>>     >>
>>     >> Then it just hangs until I press Ctrl-C.
>>     >>
>>     >> I understand this may be an issue with Red Hat, Open MPI, or
>>     >> Mellanox. Any ideas on how to debug where the problem lies?
>>     >>
>>     >> Thanks!
>>     >>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Beowulf mailing list, Beowulf at beowulf.org sponsored by Penguin Computing
>> To change your subscription (digest mode or unsubscribe) visit
>> https://beowulf.org/cgi-bin/mailman/listinfo/beowulf