[Beowulf] OpenMPI over libfabric (was Re: Top 5 reasons why mailing lists are better than Twitter)

Matt Wallis mattw at madmonks.org
Tue Nov 22 07:31:40 UTC 2022



> On 22 Nov 2022, at 06:16, Christopher Samuel <chris at csamuel.org> wrote:
> 
> On 11/21/22 4:39 am, Scott Atchley wrote:
> 
>> We have OpenMPI running on Frontier with libfabric. We are using HPE's CXI (Cray eXascale Interface) provider instead of RoCE though.
> 
> Yeah I'm curious to know if Matt's issues are about OpenMPI->libfabric or libfabric->RoCE?
> 
> FWIW we're using Cray's MPICH over libfabric (also over CXI). The ABI portability of MPICH is really useful to us, as it allows us to patch containers used via Shifter to replace their MPI libraries with the Cray ones and have their code use the HSN natively.

At the moment I’m stuck trying to pinpoint this myself.

We’re using Intel E810 NICs. Low-level RDMA seems to be working and iperf gives the expected performance; for MPI, however, these NICs apparently need PSM3.
For MPI performance I’ve been running the OSU Micro-Benchmarks, in particular osu_bw.
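
For reference, the sort of invocation I’ve been testing with looks roughly like this (host names are placeholders, and the explicit provider selection is just the documented libfabric knob, not necessarily required):

    # Ask OpenMPI to use the cm PML with the OFI (libfabric) MTL,
    # and libfabric to use its PSM3 provider
    export FI_PROVIDER=psm3
    mpirun -np 2 --host node01,node02 \
        --mca pml cm --mca mtl ofi \
        ./osu_bw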

I’ve had osu_bw working over TCP at about 1.8 GB/s, so I know the basic setup was working.
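
For anyone wanting to reproduce that baseline, forcing the plain TCP path is straightforward with stock OpenMPI flags (a sketch; host names are placeholders):

    # Pin OpenMPI to the ob1 PML with the TCP BTL for a baseline run
    mpirun -np 2 --host node01,node02 \
        --mca pml ob1 --mca btl tcp,self \
        ./osu_bw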

PSM3 support comes from libfabric at this time; OpenMPI itself seems to top out at PSM2. So, in the interest of not installing the entire oneAPI stack, I thought I would just rebuild OpenMPI against libfabric, and libfabric with PSM3 support.

I used Spack to get it done. The initial result after the first build was a series of errors from mpirun telling me that the PSM3 module could not open the VLAN interface that’s being used for this. While not ideal, that at least suggested my build worked. I pinged Intel; they believe it should work, but asked me to upgrade to the latest ice driver.
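
The Spack spec was along these lines (a sketch using the variant names in current Spack packages, not necessarily my exact command):

    # Build OpenMPI with its OFI (libfabric) transport, and libfabric
    # with the PSM3 provider enabled
    spack install openmpi fabrics=ofi ^libfabric fabrics=psm3,tcp,udp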

After upgrading to the latest ice driver, there’s now nothing. Every mpirun hangs indefinitely: no orted on the remote node, nothing. I’ve left it for 30+ minutes. No errors, no timeouts, nothing.
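
For anyone following along, the obvious knobs for seeing where the launch stalls are OpenMPI’s own MCA verbosity settings (a sketch; host names are placeholders):

    # Show what the process launcher (plm) and daemon spawner (odls)
    # are doing during startup
    mpirun --mca plm_base_verbose 10 --mca odls_base_verbose 10 \
        -np 2 --host node01,node02 ./osu_bw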

Intel gave me some environment variables to set; no effect. It’s as if the module is no longer being loaded.
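
For context, the usual documented libfabric/PSM3 debug knobs look like this (not necessarily the exact set Intel sent; the interface name is a placeholder):

    # Documented libfabric/PSM3 debug settings
    export FI_PROVIDER=psm3
    export FI_LOG_LEVEL=debug
    export PSM3_IDENTIFY=1      # have PSM3 print what it detects
    export PSM3_NIC=vlan100     # placeholder VLAN interface name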

In further discussions with Intel we realised there were a bunch of MLNX (Mellanox) libraries still floating around, so I set about purging all of those. Nothing, no change.

I’ve tried stracing it: it sits in a single poll(), no runaway loop, just one poll that never returns. The preceding entries are the usual library lookups, and it seems to find everything it needs.
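
The strace invocation was nothing special, roughly (following forks so any children show up too; host names are placeholders):

    # Follow forks and timestamp each syscall; the hang shows up as a
    # final poll() that never returns
    strace -f -tt -o /tmp/mpirun.strace \
        mpirun -np 2 --host node01,node02 ./osu_bw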

I’ve installed the Intel Fabric Suite, which comes with its own OpenMPI build. Same result.

I’m about to rebuild it and the ice driver; I’m just confused as to how it went from PSM3 complaining about an interface to nothing at all. ldd shows all the correct libraries being found, and lsmod shows the correct modules loaded in the kernel.
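
The sanity checks I mean are roughly these (fi_info ships with libfabric; the grep patterns are just illustrative):

    # Is the PSM3 provider actually visible to libfabric?
    fi_info -p psm3

    # Are the expected libraries resolvable, and are the Intel Ethernet
    # drivers (ice, and irdma for RDMA on E810) loaded?
    ldd ./osu_bw | grep -i -e mpi -e fabric -e psm
    lsmod | grep -e ice -e irdma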

Matt.   

