[Beowulf] Puzzling Intel mpi behavior with slurm
skylar.thompson at gmail.com
Thu Apr 5 17:11:27 PDT 2018
At least for Grid Engine/OpenMPI, the preferred mechanism ("tight
integration") has the shepherds running on each exec host start MPI,
without any SSH/RSH required at all. I'm not sure if you've run across
this documentation, but it might help to figure out what's going on:
I'm guessing you're using the "srun" method right now.
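
If that guess is right, the difference between the two cases comes down to the environment mpirun sees. Here is a rough sketch of the decision, not Intel's actual logic: it assumes the Hydra launcher honours I_MPI_HYDRA_BOOTSTRAP, and that inside a Slurm job (where Slurm sets SLURM_JOB_ID) it can start remote processes through srun instead of ssh. The function name guess_launcher is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: guess how mpirun would reach the remote hosts,
# looking only at the environment. It does not run mpirun or srun.
# Assumptions (not confirmed in this thread):
#   - I_MPI_HYDRA_BOOTSTRAP, if set, names the bootstrap mechanism
#   - inside a Slurm job, SLURM_JOB_ID is set and srun can be used
guess_launcher() {
    if [ -n "${I_MPI_HYDRA_BOOTSTRAP:-}" ]; then
        # An explicit setting wins over any auto-detection.
        echo "$I_MPI_HYDRA_BOOTSTRAP"
    elif [ -n "${SLURM_JOB_ID:-}" ]; then
        # Inside an allocation: remote start can go through Slurm, no ssh.
        echo "slurm"
    else
        # Plain login shell: fall back to ssh, which hangs without keys.
        echo "ssh"
    fi
}
```

Calling guess_launcher from the login shell and again from inside the batch script should show whether the two environments differ in the way I'm assuming.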
On Thu, Apr 5, 2018 at 8:10 AM, Faraz Hussain <info at feacluster.com> wrote:
> Here's something quite baffling. I have a cluster running slurm but have
> not set up passwordless ssh for a user yet. So when the user runs "mpirun -n
> 2 -hostfile hosts hostname", it will hang because of the ssh issue. That is
> Now the baffling thing is the mpirun command works inside a slurm script!
> How can it work if passwordless ssh has not been configured? Does slurm use
> some different authentication (munge?) to login to the hosts and execute
> the hostname command?
> Or does slurm have some fancy behind the scenes integration with Intel mpi