[Beowulf] MPI + IB question

Jörg Saßmannshausen j.sassmannshausen at ucl.ac.uk
Thu Nov 15 02:26:14 PST 2012


Dear all,

I am a little bit confused about a problem I have encountered a few times now.

I have three clusters which have an InfiniBand network. One of the older clusters has 
Mellanox MT23108 cards and a Voltaire sLB-24 switch; the newer cluster has 
Mellanox MT26428 cards and a QLogic 12300 switch. All clusters are running Debian 
Squeeze, all of them are 64 bit machines, and all of them have the required 
packages for the IB network installed. I have tested the IB network and, as far 
as I can tell, it is up and running without any problems. Most of the 
programs I am using run well over the IB network; however, two of them are 
behaving a bit oddly. I will give only one example. I do not expect a 
solution for the specific problem here, but I would like to understand what is 
going on.

If I compile the latest version of GAMESS-US with MPI, it runs fine when I 
start it like this in the rungms wrapper script:

/opt/openmpi/gfortran/1.4.3/bin/mpirun -np 4 --hostfile 
/home/sassy/gamess/mpi/host /home/sassy/build/gamess/gamess.01.x

However, as Open MPI is 'clever' about picking its transport, I wanted to make sure 
that I am really using the IB network here and not the gigabit network. Thus, I 
added the flag that excludes the TCP BTL and started the program like this:
 
/opt/openmpi/gfortran/1.4.3/bin/mpirun -np 4 --hostfile 
/home/sassy/gamess/mpi/host --mca btl ^tcp 
/home/sassy/build/gamess/gamess.01.x

That crashes immediately; I have included the verbose output of that run in 
the attached file.
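
For what it is worth, the positive form of the same selection (explicitly listing 
the BTLs instead of excluding tcp) would look like the line below, with the same 
hostfile and binary as above. As far as I understand it, this should force Open MPI 
to use openib for the inter-node traffic or fail outright:

/opt/openmpi/gfortran/1.4.3/bin/mpirun -np 4 --hostfile /home/sassy/gamess/mpi/host --mca btl openib,sm,self /home/sassy/build/gamess/gamess.01.x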

So far, so good. However, if I do not use the cluster with the Voltaire 
switch (described above) but the one with the more recent QLogic switch, and 
simply _copy_ the binary over, it works: there is no crash when I am using the 
IB network and the program runs.

My question is: why? I had thought that MPI is an interface and that anything 
which has to do with the node-to-node communication is handled by the MPI 
library, so the program GAMESS-US just makes its calls to MPI and Open MPI then 
handles the communication, regardless of the network. If there is a TCP network 
around, Open MPI uses that, and if there is an IB network around, Open MPI uses 
that.

However, from the above observation (and I have a very similar case with NWChem) 
it appears to me that GAMESS-US has problems with the Voltaire network but no 
problems with the QLogic network, which I find a bit puzzling.
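
Is there a way to see directly whether the openib BTL is actually available and 
usable on a given cluster? My naive check would be something along these lines 
(I am assuming here that ompi_info sits next to mpirun and that the usual OFED 
diagnostics are installed on the nodes), but I am not sure it is sufficient:

# does this Open MPI build have the openib BTL at all?
/opt/openmpi/gfortran/1.4.3/bin/ompi_info | grep -i openib

# is the HCA port active on the compute node?
ibv_devinfo | grep -e hca_id -e state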

As I said, I am not after a specific solution for this particular problem; I 
really would like to understand why the _same_ binary works on one IB network 
and fails on another. Recompiling GAMESS-US on the failing cluster does not 
help either; I get the same problems.
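
My (possibly wrong) understanding is that the BTL selection is purely a runtime 
decision of the Open MPI library and has nothing to do with how GAMESS-US was 
compiled, which would explain why recompiling changes nothing. If that is right, 
then explicitly forcing the TCP path on the failing cluster should still work, 
and only the openib path should break:

# force the gigabit/TCP path for comparison; this one is expected to work
/opt/openmpi/gfortran/1.4.3/bin/mpirun -np 4 --hostfile /home/sassy/gamess/mpi/host --mca btl tcp,sm,self /home/sassy/build/gamess/gamess.01.x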

All the best from a foggy London

Jörg
-- 
*************************************************************
Jörg Saßmannshausen
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ 

email: j.sassmannshausen at ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
-------------- next part --------------
/opt/openmpi/gfortran/1.4.3/bin/mpirun -np 4 --hostfile /home/sassy/gamess/mpi/host --mca btl ^tcp --mca btl_openib_verbose 100 --mca orte_base_help_aggregate 0 --mca btl_base_verbose 30 /home/sassy/build/gamess/gamess.01.x
[node24:05105] mca: base: components_open: Looking for btl components
[node24:05106] mca: base: components_open: Looking for btl components
[node24:05106] mca: base: components_open: opening btl components
[node24:05106] mca: base: components_open: found loaded component self
[node24:05106] mca: base: components_open: component self has no register function
[node24:05105] mca: base: components_open: opening btl components
[node24:05106] mca: base: components_open: component self open function successful
[node24:05106] mca: base: components_open: found loaded component sm
[node24:05106] mca: base: components_open: component sm has no register function
[node24:05105] mca: base: components_open: found loaded component self
[node24:05105] mca: base: components_open: component self has no register function
[node24:05106] mca: base: components_open: component sm open function successful
[node24:05105] mca: base: components_open: component self open function successful
[node24:05105] mca: base: components_open: found loaded component sm
[node24:05105] mca: base: components_open: component sm has no register function
[node24:05105] mca: base: components_open: component sm open function successful
[node32:32503] mca: base: components_open: Looking for btl components
[node32:32504] mca: base: components_open: Looking for btl components
[node32:32503] mca: base: components_open: opening btl components
[node32:32503] mca: base: components_open: found loaded component self
[node32:32503] mca: base: components_open: component self has no register function
[node32:32503] mca: base: components_open: component self open function successful
[node32:32504] mca: base: components_open: opening btl components
[node32:32504] mca: base: components_open: found loaded component self
[node32:32504] mca: base: components_open: component self has no register function
[node32:32503] mca: base: components_open: found loaded component sm
[node32:32503] mca: base: components_open: component sm has no register function
[node32:32504] mca: base: components_open: component self open function successful
[node32:32504] mca: base: components_open: found loaded component sm
[node32:32504] mca: base: components_open: component sm has no register function
[node32:32503] mca: base: components_open: component sm open function successful
[node32:32504] mca: base: components_open: component sm open function successful
[node24:05106] select: initializing btl component self
[node24:05106] select: init of component self returned success
[node24:05105] select: initializing btl component self
[node24:05105] select: init of component self returned success
[node24:05105] select: initializing btl component sm
[node24:05105] select: init of component sm returned success
[node24:05106] select: initializing btl component sm
[node24:05106] select: init of component sm returned success
[node32:32504] select: initializing btl component self
[node32:32503] select: initializing btl component self
[node32:32503] select: init of component self returned success
[node32:32503] select: initializing btl component sm
[node32:32503] select: init of component sm returned success
[node32:32504] select: init of component self returned success
[node32:32504] select: initializing btl component sm
[node32:32504] select: init of component sm returned success
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[5187,1],1]) is on host: node24
  Process 2 ([[5187,1],0]) is on host: node32
  BTLs attempted: self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[5187,1],2]) is on host: node32
  Process 2 ([[5187,1],1]) is on host: node24
  BTLs attempted: self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[5187,1],3]) is on host: node24
  Process 2 ([[5187,1],0]) is on host: node32
  BTLs attempted: self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
At least one pair of MPI processes are unable to reach each other for
MPI communications.  This means that no Open MPI device has indicated
that it can be used to communicate between these processes.  This is
an error; Open MPI requires that all MPI processes be able to reach
each other.  This error can sometimes be the result of forgetting to
specify the "self" BTL.

  Process 1 ([[5187,1],0]) is on host: node32
  Process 2 ([[5187,1],1]) is on host: node24
  BTLs attempted: self sm

Your MPI job is now going to abort; sorry.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** The MPI_Init_thread() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** Your MPI job will now abort.
*** The MPI_Init_thread() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[node24:5105] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
[node32:32504] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** The MPI_Init_thread() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[node24:5106] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
--------------------------------------------------------------------------
*** The MPI_Init_thread() function was called before MPI_INIT was invoked.
*** This is disallowed by the MPI standard.
*** Your MPI job will now abort.
[node32:32503] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun has exited due to process rank 2 with PID 32504 on
node node32 exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
unset echo

