[Beowulf] use an MPI library through a shared library
Mathieu Gontier mg.mailing-list at laposte.net
Wed Dec 5 00:28:05 PST 2007
Yep, I use ldd every day. But here the problem comes from a structure
that gets corrupted between MorphMPI and MPI:
typedef struct {
    int MorphMPI_SOURCE;
    int MorphMPI_TAG;
    int MorphMPI_ERROR;
    void* mpi_status;
} MorphMPI_Status;
Here the member mpi_status is used to point to a real MPI_Status. In MPICH:
typedef struct {
    int MPI_SOURCE;
    int MPI_TAG;
    int MPI_ERROR;
    int count;
} MPI_Status;
Then, when my MorphMPI_Status is passed to MorphMPI_Get_count(), the
pointer MorphMPI_Status::mpi_status itself is not corrupted, but the
count field of the MPI_Status it points to is: the value should be 4,
and instead it is "random".
I tried changing the MorphMPI_Status structure (adding another integer
to align it on 64 bits, keeping only the void*, ...) without success.
As a reminder, this problem appears only when MPI is used through a
dynamically linked MorphMPI library.
Does anyone have an idea?
Mathieu Gontier
Core Development Engineer
Read the attached v-card for telephone, fax, address
Look at our web-site http://www.fft.be
Joe Landman wrote:
> Greetings Mathieu:
>
> Mathieu Gontier wrote:
>
> [...]
>
>> So, I have a small problem whichever MPI library I use (I tried
>> with MPICH-1.2.5.2, MPICHGM and IntelMPI).
>> When MorphMPI is linked statically with my parallel application,
>> everything is OK; but when MorphMPI is linked dynamically with my
>> parallel application, MPI_Get_count returns a wrong value.
>>
>> I concluded it is difficult to use an MPI library through a shared
>> library. I wonder if someone has more information about it (in this
>
> Not likely. I would suggest ldd. It is your friend.
>
> For example:
>
> joe@pegasus-i:~/workspace/source-mpi$ ldd matmul_mpi_3.exe
> libm.so.6 => /lib/libm.so.6 (0x00002b5409d17000)
> libmpi.so.0 => not found
> libopen-rte.so.0 => not found
> libopen-pal.so.0 => not found
> librt.so.1 => /lib/librt.so.1 (0x00002b5409f99000)
> libdl.so.2 => /lib/libdl.so.2 (0x00002b540a1a2000)
> libnsl.so.1 => /lib/libnsl.so.1 (0x00002b540a3a6000)
> libutil.so.1 => /lib/libutil.so.1 (0x00002b540a5c0000)
> libpthread.so.0 => /lib/libpthread.so.0 (0x00002b540a7c3000)
> libc.so.6 => /lib/libc.so.6 (0x00002b540a9de000)
> /lib64/ld-linux-x86-64.so.2 (0x00002b5409af9000)
>
> Notice that libmpi.so.0 is not found, so I can't run this by hand,
> unless I force the issue using LD_LIBRARY_PATH:
>
> joe@pegasus-i:~/workspace/source-mpi$ export LD_LIBRARY_PATH="/home/joe/local/lib64/:/home/joe/local/lib/"
> joe@pegasus-i:~/workspace/source-mpi$ ldd matmul_mpi_3.exe
> libm.so.6 => /lib/libm.so.6 (0x00002ae35ca50000)
> libmpi.so.0 => /home/joe/local/lib/libmpi.so.0 (0x00002ae35ccd1000)
> libopen-rte.so.0 => /home/joe/local/lib/libopen-rte.so.0 (0x00002ae35cfe8000)
> libopen-pal.so.0 => /home/joe/local/lib/libopen-pal.so.0 (0x00002ae35d2b3000)
> librt.so.1 => /lib/librt.so.1 (0x00002ae35d514000)
> libdl.so.2 => /lib/libdl.so.2 (0x00002ae35d71d000)
> libnsl.so.1 => /lib/libnsl.so.1 (0x00002ae35d921000)
> libutil.so.1 => /lib/libutil.so.1 (0x00002ae35db3b000)
> libpthread.so.0 => /lib/libpthread.so.0 (0x00002ae35dd3e000)
> libc.so.6 => /lib/libc.so.6 (0x00002ae35df59000)
> /lib64/ld-linux-x86-64.so.2 (0x00002ae35c832000)
>
> and it might even run ...
>
> joe@pegasus-i:~/workspace/source-mpi$ ./matmul_mpi_3.exe
> D[tid=0]: running on machine = pegasus-i
> D: checking arguments: N_args=1
> D: arg[0] = ./matmul_mpi_3.exe
> Allocating memory ...
> array size in MB = 7.629 MB
> (remember, you have 2 of these)normalization a: 0.05510, b: 0.00173
> 0 : loop_min = 0, loop_max = 1000
> ...
>
> Do you have some sort of LD_LIBRARY_PATH set up? Or something set in
> /etc/ld.so.conf that points to where these things are? Remember,
> mpirun/mpiexec's alternative purpose in life is to set up the correct
> run time environment for you, so you might want to see what is going
> on with the environment in your equivalent command.
>
>
