[Beowulf] slow mpi init/finalize
Michael Di Domenico
mdidomenico4 at gmail.com
Tue Oct 17 06:51:43 PDT 2017
On Tue, Oct 17, 2017 at 8:54 AM, Peter Kjellström <cap at nsc.liu.se> wrote:
>> however, your test above fails on my machines
>> user@n1# ib_acme -d n3
>> service: localhost
>> destination: n3
>> ib_acm_resolve_ip failed: cannot assign requested address
>> return status 0x0
> Did this fail instantly or with the typical ~1m timeout?
It fails instantly.
> If you have IntelMPI also try what I suggested and use the ucm dapl.
> For example for the first port on an mlx4 hca that's "ofa-v2-mlx4_0-1u".
> You can make sure that it comes first in your dat.conf (/etc/rdma
> or /etc/infiniband) or pass it explicitly to IntelMPI:
> I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u mpiexec.hydra ...
> You may want to set I_MPI_DEBUG=4 or so to see what it does.
I'll give this a whirl today, hopefully.
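
For anyone following along, here is a rough shell sketch of the dat.conf check suggested above: it reports which provider is listed first in each dat.conf it can find. The helper name first_dat_provider is made up for illustration. The two paths and the provider name are the ones quoted in this thread. It assumes the provider name is the first whitespace-separated field on each non-comment line, so treat it as a starting point rather than a definitive check.

```shell
#!/bin/sh
# Print the name of the first provider entry in a dat.conf file.
# Blank lines and lines whose first token starts with '#' are skipped.
# (first_dat_provider is a hypothetical helper, not part of any package.)
first_dat_provider() {
    awk 'NF > 0 && $1 !~ /^#/ { print $1; exit }' "$1"
}

# Check the two locations mentioned in the thread; skip any that are absent.
for conf in /etc/rdma/dat.conf /etc/infiniband/dat.conf; do
    if [ -r "$conf" ]; then
        echo "$conf: first provider is $(first_dat_provider "$conf")"
    fi
done

# To bypass the dat.conf ordering for a single run, the thread suggests
# setting the provider explicitly (arguments elided as in the original):
#   I_MPI_DEBUG=4 I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1u mpiexec.hydra ...
```

If the first provider reported is not the ucm one (name ending in "u" in the example above), either reorder dat.conf or pass I_MPI_DAPL_PROVIDER explicitly as quoted.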