PVM and MOSIX

Coetmeur, Alain alain.coetmeur at icdc.caissedesdepots.fr
Thu Nov 2 01:48:07 PST 2000


Maybe the best solution is to
use PVM as usual, so that it spawns one process
per processor on each machine.
MOSIX will then do its best to balance the load
if nodes are used by others, or if the PVMPov
daemons have different workloads.

What you observed is also what I have seen
with a parallel computation program
based on plain RPC distribution over TCP/IP.

MOSIX is good at balancing a load that is already
roughly balanced...
My view is that it is better at managing
a real-world shared virtual computer than at running
a benchmark.

Look also at the mosrun options (there is a random-node option)
so that the PVM daemons are migrated right at startup...
The best approach is really to distribute the processes
with remote exec, but forcing an initial migration and allowing
further migration may be enough.

Another point is that you need enough swap space
for all the processes, and maybe a good share of real RAM,
although MOSIX will try to balance RAM usage.

Contact the MOSIX list for more info.


-----Original Message-----
From: Andreas Boklund [mailto:andreas at amy.udd.htu.se]
Date: Wednesday, November 1, 2000 09:18
To: Kiran
Cc: beowulf at beowulf.org
Subject: Re: PVM and MOSIX



I have a cluster that uses both PVM and MOSIX,
and I have used it to run PVMPov.

If MOSIX is much faster, I would recommend that you make sure that
PVMPov really utilizes both machines. Otherwise it could be a
configuration error.


Alternative explanation:

To see which was best I invoked several PVMPov processes, either one per
node OR all on the master node, leaving it to MOSIX to distribute them.

What I noticed was that on 2 nodes the time was almost identical, at about
70 seconds.

Here is a shorter version of my chart
(read off a graph and rounded, but I think you get the general idea):


Processes    Time (seconds)
             PVM     MOSIX
2            70      70
4            15      35
6            12      40
8            10      45
10           5       45
12           <5      45
14           <5      50
16           <5      55
18           <5      65
20           <5      60
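
As a rough illustration of the chart above, the gap between the two columns
can be worked out with a short Python sketch. It uses only the numbers quoted
in the chart. One assumption: the "<5" entries are stored as 5 seconds, an
upper bound, so the MOSIX/PVM ratio for those rows is only a lower bound. The
function names are made up for this example.

```python
# Timings read off the chart in the message (seconds).
# Assumption: entries given as "<5" are stored as 5, an upper bound,
# so the MOSIX/PVM ratio for those rows is only a lower bound.
PROCESSES = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
PVM_SECONDS = [70, 15, 12, 10, 5, 5, 5, 5, 5, 5]
PVM_IS_UPPER_BOUND = [False] * 5 + [True] * 5
MOSIX_SECONDS = [70, 35, 40, 45, 45, 45, 50, 55, 65, 60]


def mosix_to_pvm_ratios():
    """Return (processes, ratio, is_lower_bound) for each row of the chart."""
    rows = []
    for n, pvm, mosix, bound in zip(
        PROCESSES, PVM_SECONDS, MOSIX_SECONDS, PVM_IS_UPPER_BOUND
    ):
        rows.append((n, mosix / pvm, bound))
    return rows


def best_mosix_process_count():
    """Return the process count with the lowest MOSIX time in the chart."""
    best_index = min(range(len(PROCESSES)), key=lambda i: MOSIX_SECONDS[i])
    return PROCESSES[best_index]


if __name__ == "__main__":
    for n, ratio, bound in mosix_to_pvm_ratios():
        prefix = ">=" if bound else "  "
        print(f"{n:2d} processes: MOSIX/PVM {prefix} {ratio:.1f}")
    print("lowest MOSIX time at", best_mosix_process_count(), "processes")
```

Read this way, the two setups are level at 2 processes, and the MOSIX-only
run is already more than twice as slow at 4 processes, which is also where
its time is lowest.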

As you can see, the PVM times decreased with the number of nodes, while the
MOSIX timing first went down and then up again. The reason for this is
that the computer has to become "overloaded" before MOSIX starts to
migrate processes, and when you start many processes the master node gets so
overloaded that it can't serve the processes that it has migrated with new
figures to crunch (well, it moves them back, fills them up and then sends them
away again). Hence the increase in time with a higher number of
processes. We did find that if we used a larger image than skyvase, the
initial dip spanned more processors. This is because the extra
time is spent in the initial stage, before the system has stabilized (all
processes have been migrated).
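
The "down and then up again" shape can be imitated with a toy model. This is
only a sketch: the function name, the formula and every parameter value below
are hypothetical, chosen for illustration, and are not fitted to the
measurements in the chart. It assumes the work is split over at most `nodes`
machines and that the master pays a fixed cost per process to keep the
migrated processes supplied with data.

```python
def toy_mosix_time(n_procs, work=120.0, nodes=4, serve_cost=2.5):
    """Toy estimate of run time when all processes start on the master.

    Hypothetical model (not from the original measurements):
      * the work is split over at most `nodes` machines, and
      * the master pays `serve_cost` seconds per process to keep
        migrated processes supplied with data.
    """
    if n_procs < 1:
        raise ValueError("n_procs must be at least 1")
    parallel_part = work / min(n_procs, nodes)  # stops shrinking past `nodes`
    master_part = serve_cost * n_procs          # keeps growing with n_procs
    return parallel_part + master_part


if __name__ == "__main__":
    for n in range(2, 21, 2):
        print(f"{n:2d} processes: {toy_mosix_time(n):.1f} s (toy model)")
```

In this sketch the time drops while extra processes still add parallelism,
then climbs once only the master's per-process cost keeps growing.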
 

As I see it, there shouldn't be any way that MOSIX could distribute
the PVMPov processes better than PVMPov alone.


Good luck
//Andreas Boklund


*********************************************************
*  Administator of Amy(studentserver) and Sfinx(Iris23) *
*                                                       *
*   Voice: 070-7294401                                  *
*   ICQ: 12030399                                       *
*   Email: andreas at shtu.htu.se, boklund at linux.nu        *
*                                                       *
*   That is how you find me, How do -I- find you ?      *
*********************************************************






