[Beowulf] MorphMPI based on fortran itf

Ashley Pittman ashley at quadrics.com
Wed Oct 12 05:11:09 PDT 2005


On Wed, 2005-10-12 at 13:36 +0200, Toon Knapen wrote:
> Ashley Pittman wrote:
> 
> > The second problem is that of linking.  Most MPI vendors already have
> > MPI_Init in their own library; having another library with its own
> > wrapper MPI_Init in it is going to lead to a whole world of pain to do
> > with dynamic linking and symbol resolution.  This is not something that
> > has ever been done before to the best of my knowledge and there is a
> > very good reason for that.
> 
> Right, but the header file of the MorphMPI library could define
> MorphMPI_Init for instance to avoid this. Additionally it could generate
> the necessary macros (*if* the user requests this) to automatically
> convert all its calls to MPI_Init to MorphMPI_Init. Of course one
> should be sure that this is done in the whole application or not at all
> but there are easy ways to verify that.

Yes it could, but I don't understand what you mean by "if the user
requests this".  If it's an ABI then it applies to code that has already
been compiled, so there is no possibility of preprocessing macros based
on the user's request.

I'm not denying that it's possible; indeed it's been mentioned before on
this list more than once.  Doing it, though, is non-trivial in C and
doing it in Fortran is just wrong.
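
To make the source-level redirection concrete, here is a minimal sketch
of what such a header could do.  Only the names MPI_Init and
MorphMPI_Init come from this thread; the MORPHMPI_REDIRECT switch, the
call counter and app_startup are hypothetical, and MorphMPI_Init is a
stand-in that does not call any real MPI library.

```c
#include <assert.h>

/* Simulates building the application with -DMORPHMPI_REDIRECT
   (a hypothetical opt-in switch, not part of any real package). */
#define MORPHMPI_REDIRECT 1

static int morph_init_calls = 0;

/* Stand-in for the wrapper's entry point.  A real one would call the
   vendor's MPI_Init; this one only counts how often it is entered. */
int MorphMPI_Init(int *argc, char ***argv)
{
    (void)argc;
    (void)argv;
    morph_init_calls++;
    return 0;   /* stand-in for a success code */
}

/* What the hypothetical morphmpi.h would contain. */
#ifdef MORPHMPI_REDIRECT
#define MPI_Init MorphMPI_Init
#endif

/* Application code, written against the usual MPI name. */
int app_startup(int *argc, char ***argv)
{
    int rc = MPI_Init(argc, argv);   /* rewritten by the preprocessor */
    assert(rc == 0);
    return rc;
}
```

The rename happens entirely in the preprocessor, so it only affects
source files that include the header and are recompiled.  An object
file or library that already references MPI_Init is untouched, which is
why this is an API-level trick rather than an ABI.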

> > Thirdly there is the performance issue.  Any MPI vendor worth his salt
> > tries very hard to reduce the number of function calls and libraries
> > between the application and the network; adding another one is a step
> > in the wrong direction.  This may not matter so much for Ethernet
> > clusters but certainly for some people the software stack accounts for
> > a surprising percentage of "network" latency.
> 
> Do you really think one extra function call would make enough of a
> difference to be unacceptable? If that were the case, MPI libraries
> would only be available as archives instead of dynamic libraries,
> because a call to a dynamic library also costs an extra dereference.

You are comparing apples and oranges: one pass-through function has a
greater cost than the extra dereference that using shared libraries
gives you.  Even if you could make every pass-through function simply
cast its args and call the underlying library, you are still running
code from a different shared library, and hence different physical
pages, which is going to evict useful stuff from the icache.

In reality it's much harder than this because you will also need to
convert types.  Suppose the ABI decides to use 32 bit opaque values for
the communicator and the underlying library uses 64 bit: you then need
a hash table to convert from one to the other each and every time you
make any MPI call, not to mention hash management, doing the right
thing in COMM_FREE and making sure the whole thing is thread safe...

What if the user passes in a communicator handle morphMPI doesn't
recognise?  Does it then report an error or pass on garbage to the real
MPI?  If the former, how do you generate the error code?  If the
latter, how do you pick the garbage to send?
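
As a rough illustration of the bookkeeping involved, here is a sketch of
the handle translation.  Everything in it is an assumption made for the
example: the type names, the error values and the fixed-size table.  It
uses a direct-indexed table rather than a real hash table to stay short,
and it takes a mutex on every lookup, which is exactly the kind of
per-call cost described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* All names and values below are hypothetical, for illustration only. */
#define MORPH_MAX_COMMS 64
#define MORPH_SUCCESS   0
#define MORPH_ERR_COMM  1   /* stand-in for "invalid communicator" */
#define MORPH_ERR_FULL  2   /* translation table exhausted */

typedef uint32_t morph_comm_t;   /* 32 bit handle exposed by the ABI */
typedef uint64_t native_comm_t;  /* 64 bit handle used by the real MPI */

struct comm_slot {
    int           in_use;
    native_comm_t native;
};

static struct comm_slot comm_table[MORPH_MAX_COMMS];
static pthread_mutex_t  comm_lock = PTHREAD_MUTEX_INITIALIZER;

/* Return nonzero if the handle refers to a live slot.  Caller holds the lock. */
static int slot_is_live(morph_comm_t handle)
{
    return handle >= 1 && handle <= MORPH_MAX_COMMS
        && comm_table[handle - 1].in_use;
}

/* Record a native handle and hand back a small non-zero ABI handle. */
int morph_comm_register(native_comm_t native, morph_comm_t *out)
{
    int rc = MORPH_ERR_FULL;
    assert(out != NULL);
    pthread_mutex_lock(&comm_lock);
    for (uint32_t i = 0; i < MORPH_MAX_COMMS; i++) {
        if (!comm_table[i].in_use) {
            comm_table[i].in_use = 1;
            comm_table[i].native = native;
            *out = i + 1;   /* 0 is reserved as "no communicator" */
            rc = MORPH_SUCCESS;
            break;
        }
    }
    pthread_mutex_unlock(&comm_lock);
    return rc;
}

/* Needed on every wrapped call that takes a communicator. */
int morph_comm_lookup(morph_comm_t handle, native_comm_t *out)
{
    int rc = MORPH_ERR_COMM;
    assert(out != NULL);
    pthread_mutex_lock(&comm_lock);
    if (slot_is_live(handle)) {
        *out = comm_table[handle - 1].native;
        rc = MORPH_SUCCESS;
    }
    pthread_mutex_unlock(&comm_lock);
    return rc;
}

/* What a COMM_FREE wrapper would do after freeing the native handle. */
int morph_comm_release(morph_comm_t handle)
{
    int rc = MORPH_ERR_COMM;
    pthread_mutex_lock(&comm_lock);
    if (slot_is_live(handle)) {
        comm_table[handle - 1].in_use = 0;
        rc = MORPH_SUCCESS;
    }
    pthread_mutex_unlock(&comm_lock);
    return rc;
}
```

This version chooses to report an error for an unknown handle, but the
error value is invented by the wrapper, so the question of how it maps
onto the underlying library's error codes remains open.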

What if the ABI provides a function, for example the new comm_spawn
function, but the underlying MPI layer doesn't?  What error code does
morphMPI pass back to the application?
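
One possible shape for this, purely as a sketch: the wrapper keeps a
table of function pointers for the underlying library and has to invent
its own error value for entries that are missing.  The struct, the
simplified spawn signature, MORPH_ERR_UNSUPPORTED and fake_spawn are all
hypothetical, and nothing here calls a real MPI library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values; a real wrapper would have to agree these with
   every application built against it. */
#define MORPH_SUCCESS         0
#define MORPH_ERR_UNSUPPORTED 1001

/* Much simplified stand-in for a spawn entry point. */
typedef int (*spawn_fn)(const char *command, int maxprocs);

/* Entry points the wrapper found in the underlying library.
   A NULL pointer means the library does not provide that call. */
struct morph_backend {
    spawn_fn comm_spawn;
};

/* Pretend backend implementation, used only for the example. */
static int fake_spawn(const char *command, int maxprocs)
{
    (void)command;
    return maxprocs > 0 ? MORPH_SUCCESS : 1;
}

/* Wrapper exposed to the application. */
int MorphMPI_Comm_spawn(const struct morph_backend *be,
                        const char *command, int maxprocs)
{
    assert(command != NULL);
    if (be == NULL || be->comm_spawn == NULL)
        return MORPH_ERR_UNSUPPORTED;   /* value invented by the wrapper */
    return be->comm_spawn(command, maxprocs);
}
```

The application now has to know about an error value that no underlying
MPI would ever return, which is part of the problem being raised.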

And yes, people do seem to care about the extra dereference that shared
libraries give you; a lot of people do actually insist on statically
linking whenever possible.  Note that in the future this assertion might
not hold true: shared libraries are going to be just as fast once
prelinking is common.

Ashley,


