[Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?

Faraz Hussain info at feacluster.com
Thu May 2 08:40:56 PDT 2019


Thanks. Before I go down the path of installing things willy-nilly, is
there some guide I should be following instead? I obviously have a
problem with my Mellanox drivers, combined with "user error".

So should I be paying Mellanox to help? Or is it a Red Hat issue? Or is
it our hardware vendor, HP, who should be involved?

Looks like I need support on how to get support :-)


Quoting Christopher Samuel <chris at csamuel.org>:

>> root@lustwzb34:/root # systemctl status rdma
>> Unit rdma.service could not be found.
>
> You're missing this RPM then, which might explain a lot:
>
> $ rpm -qi rdma-core
> Name        : rdma-core
> Version     : 17.2
> Release     : 3.el7
> Architecture: x86_64
> Install Date: Tue 04 Dec 2018 03:58:16 PM AEDT
> Group       : Unspecified
> Size        : 107924
> License     : GPLv2 or BSD
> Signature   : RSA/SHA256, Tue 13 Nov 2018 01:45:22 AM AEDT, Key ID 24c6a8a7f4a80eb5
> Source RPM  : rdma-core-17.2-3.el7.src.rpm
> Build Date  : Wed 31 Oct 2018 07:10:24 AM AEDT
> Build Host  : x86-01.bsys.centos.org
> Relocations : (not relocatable)
> Packager    : CentOS BuildSystem <http://bugs.centos.org>
> Vendor      : CentOS
> URL         : https://github.com/linux-rdma/rdma-core
> Summary     : RDMA core userspace libraries and daemons
> Description :
> RDMA core userspace infrastructure and documentation, including initscripts,
> kernel driver-specific modprobe override configs, IPoIB network scripts,
> dracut rules, and the rdma-ndd utility.
>
> -- 
>   Chris Samuel  :  http://www.csamuel.org/  :  Berkeley, CA, USA




