[Beowulf] MPI_Open_port: difficult questions (I think)
jimlasc at gmail.com
Wed Aug 10 13:04:54 PDT 2005
I have the following:
A cluster with n nodes.
Each node can speak with its left and right neighbour. (k can speak with k-1
They form a ring.
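
To make the ring layout concrete, here is a minimal sketch in C of the neighbour arithmetic. It assumes the nodes are numbered 0..n-1 and that the ring wraps around, so node 0's left neighbour is node n-1. The function names are made up for illustration and no MPI call is involved.

```c
/* Hypothetical helpers: neighbours of rank k in a ring of n nodes (ranks 0..n-1). */

/* Left neighbour; adding n before the modulo keeps the result non-negative for k = 0. */
int ring_left(int k, int n)
{
    return (k - 1 + n) % n;
}

/* Right neighbour; the last rank wraps around to rank 0. */
int ring_right(int k, int n)
{
    return (k + 1) % n;
}
```

With 4 nodes, for example, node 0 would talk to nodes 3 and 1, and node 3 would talk to nodes 2 and 0.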
There is a lot of synchronisation-information flowing through the ring,
Some of the synchronisation is done using IRecv's that I post at the beginning
of the program
(because I never know in advance whether there will be a synchronisation type A
msg, a type B msg, or no msg at all...)
I made a communicator comm_all which, in the beginning (when no nodes have
joined yet), is equal to MPI_COMM_WORLD,
but the new nodes (which join after startup, see below) should also
become "members" of this communicator.
Now I want to add nodes.
By adding a node I mean the following:
connecting a computer which is unknown at the time of startup (one I just
bought, for example) to the ring, and allowing it (the new node) to speak
with its neighbour nodes.
(1) How should I implement that (see below...)?
(2) When I use MPI_Comm_spawn, I can't "say" that the process has to be spawned
on the new node, because MPI decides itself where to spawn, is this correct?
(3) So I should use MPI_Open_port on a "master node" and connect the new node
to the master node, correct?
And MPI_Comm_accept is blocking, so if I want the new node to be able to
connect at any moment,
(4) I should use a thread solely for the MPI_Comm_accept, is this correct?
(5) When I use MPI_Intercomm_merge, is there a way to say that I want the
nodes 0-n to keep their rank, and that I want the new node to have rank n+1?
I ask because (see above) I posted a lot of IRecv's in the startup phase (and the
IRecv's are reposted once they are filled),
so I prefer only having to change the IRecv's on nodes 0 and n instead of
all the IRecv's
(and, this gives fewer problems for messages which are between sender and
(6) When a node number changes while a message is in flight between sender and
receiver, I can consider the message as lost, correct?
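
The claim behind question (5), that only the two end nodes would need new IRecv's if the old ranks are kept and the new node is appended at the end, can be checked with a small sketch in plain C. It assumes a wrap-around ring with ranks 0..old_size-1 and a new node that receives rank old_size. All names are hypothetical, and the sketch says nothing about whether MPI_Intercomm_merge actually assigns ranks this way.

```c
#include <stdbool.h>

/* Hypothetical helpers: ring neighbours of rank k among `size` nodes. */
static int left_of(int k, int size)  { return (k - 1 + size) % size; }
static int right_of(int k, int size) { return (k + 1) % size; }

/*
 * Returns true if rank k (0 <= k < old_size) gets a different left or right
 * neighbour once a single node is appended with rank old_size.
 */
bool neighbours_change(int k, int old_size)
{
    int new_size = old_size + 1;
    return left_of(k, old_size)  != left_of(k, new_size) ||
           right_of(k, old_size) != right_of(k, new_size);
}
```

For a ring of five nodes (ranks 0 to 4), this reports a change only for ranks 0 and 4. Ranks 1 to 3 keep both neighbours, so their posted receives would not need to be touched.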
When finished I want it to be totally decentralised, so that the new node
can connect to a node of its choice.
(7) This means I should open a port on every node from the "start-group",