[Beowulf] MPI_Open_port: difficult questions (I think)

Jim Lasc jimlasc at gmail.com
Wed Aug 10 13:04:54 PDT 2005


Hello,

I have the following:
A cluster with n nodes. 
Each node can speak with its left and right neighbour (node k can speak with
(k-1+n)%n and (k+1)%n).
They form a ring.
There is a lot of synchronisation information flowing through the ring.
Some of the synchronisation is done using MPI_Irecv calls that I posted at the
beginning of the program
(because I never know in advance whether there will be a synchronisation type A
msg, a type B msg, or no msg at all...).
I made a communicator comm_all which in the beginning (when no nodes have
joined yet) is equal to MPI_COMM_WORLD,
but the new nodes (which join after start-up, see below) should also
become "members" of this communicator.
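
The neighbour rule of the ring described above can be sketched in plain C as follows. This is only a minimal illustration: the helper names ring_left, ring_right and print_ring are hypothetical, and the sketch assumes ranks 0..n-1, with the left neighbour wrapped by adding n so that node 0 gets node n-1.

```c
#include <stdio.h>

/* Left neighbour of node k in a ring of n nodes (n >= 1, 0 <= k < n).
   Adding n before the modulo keeps the result non-negative for k == 0. */
int ring_left(int k, int n) { return (k - 1 + n) % n; }

/* Right neighbour of node k in the same ring. */
int ring_right(int k, int n) { return (k + 1) % n; }

/* Print both neighbours of every node in a ring of n nodes. */
void print_ring(int n) {
    for (int k = 0; k < n; k++)
        printf("node %d: left=%d right=%d\n",
               k, ring_left(k, n), ring_right(k, n));
}
```

Calling print_ring(4) from main would list the two neighbours of each of the four nodes.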

Now I want to add nodes. 
By adding a node I mean the following:
connecting a computer which is unknown at the time of start-up (one I just
bought, for example) to the ring, and allowing it (the new node) to speak
with its neighbour nodes.

(1) How should I implement that (see below...)?
(2) When I use MPI_Comm_spawn, I can't "say" that the process has to be spawned
on the new node, because MPI decides itself where to spawn; is this correct?
(3) So I should use MPI_Open_port on a "master node" and connect the new node
to the master node, correct?
And MPI_Comm_accept is blocking, so if I want the new node to be able to
connect at any moment,
(4) I should use a thread solely for the MPI_Comm_accept, is this correct?
(5) When I use MPI_Intercomm_merge, is there a way to say that I want the
nodes 0 to n-1 to keep their rank, and that I want the new node to have rank n?
I ask because (see above) I posted a lot of MPI_Irecv calls in the start-up
phase (and they are reposted once they are filled),
so I prefer having to change only the MPI_Irecv calls of nodes 0 and n-1
instead of all of them
(and this gives fewer problems for messages which are in flight between sender
and receiver).
(6) When a node's rank changes while a message is in flight between sender and
receiver, I can consider the message as lost, correct?
When finished, I want it to be totally decentralised, so that the new node
can connect to a node of its choice.
(7) This means I should open a port on every node of the "start group",
correct?
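
To illustrate the reasoning behind question (5), here is a small plain-C sketch (no MPI calls). It checks which existing nodes get a different neighbour when one node is appended to the ring. It assumes the existing ranks are 0..old_n-1 and that the new node receives rank old_n; whether a merged communicator really assigns ranks that way is exactly what the question asks, so this is an assumption and not a statement about MPI. The function name neighbours_change is hypothetical.

```c
/* Returns 1 if existing rank k (0 <= k < old_n) gets a different left or
   right neighbour when a new node with rank old_n is appended to the ring,
   and 0 otherwise. Assumes all existing nodes keep their ranks. */
int neighbours_change(int k, int old_n) {
    int new_n = old_n + 1;
    int old_left  = (k - 1 + old_n) % old_n;
    int new_left  = (k - 1 + new_n) % new_n;
    int old_right = (k + 1) % old_n;
    int new_right = (k + 1) % new_n;
    return old_left != new_left || old_right != new_right;
}
```

Under these assumptions only the first and the last existing rank see a change (node 0 gets a new left neighbour, node old_n-1 a new right neighbour), so only those two nodes would need to repost their receives.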

Jim Lascov.