[Beowulf] problem of mpich-1.2.7p1

Gus Correa gus at ldeo.columbia.edu
Wed Feb 3 11:02:21 PST 2010


Hi Christian

The program attachment didn't come through again.
You may try to cut and paste the program at the bottom of the message.

Now I see: you are worried about MPI performance,
not about the program's correctness at this point.

If your program does too little work,
it is likely that the initialization/finalization
and the whole MPI setup and communication
take more time than the actual computation.
If this is the case, and particularly if your
network is slow (say, 100 Mbit/s Ethernet), you will see better performance
on fewer nodes when the "problem size" is small.

There is nothing wrong with this.
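This phenomenon is closely related to what is
called "Amdahl's Law", which says that the part of the run time
that does not shrink as you add nodes limits the overall speedup: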

http://en.wikipedia.org/wiki/Amdahl's_law
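
As a rough illustration of the textbook form of Amdahl's Law, here is a
minimal Python sketch. The function name and the 50% parallel fraction are
my own assumptions for the example, not numbers measured on your cluster.

```python
def amdahl_speedup(parallel_fraction, nodes):
    """Ideal speedup when only `parallel_fraction` of the run time scales with nodes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


# Assumed example: only half of the run time can be spread across nodes.
for nodes in (1, 2, 4, 8):
    print(nodes, "nodes -> speedup", round(amdahl_speedup(0.5, nodes), 3))
```

Note that this ideal formula never predicts a slowdown. To get a run that is
actually slower on more nodes, you also need an overhead that grows with the
number of nodes, such as MPI startup and communication.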

In general the "problem size" is controlled by one or a few numbers
in your code or in your parameter files.
Problem size may be controlled by, say,
the size of an array or matrix,
the number of iterations of a main loop, etc.
Could you perhaps increase the problem size in your code,
say boost it 10 or 100 times,
and see whether one node alone still beats many nodes?
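
To show why a bigger problem can change the picture, here is a toy cost
model in Python. It is only a sketch: the function `total_time`, the
assumption that the overhead grows linearly with the number of extra nodes,
and all the numbers (4 ms of work, 6 ms of overhead per extra node) are
invented for illustration and are not derived from your program.

```python
def total_time(work_ms, nodes, overhead_ms_per_extra_node):
    """Toy model: perfectly divided compute time plus a per-node MPI overhead."""
    compute = work_ms / nodes
    overhead = overhead_ms_per_extra_node * (nodes - 1)
    return compute + overhead


# Assumed numbers: a small problem, and the same problem made 100 times bigger.
for work_ms in (4.0, 400.0):
    for nodes in (1, 4):
        t = total_time(work_ms, nodes, 6.0)
        print(f"work={work_ms} ms, nodes={nodes}: total={t} ms")
```

In this model the small problem is slower on 4 nodes than on 1, because the
overhead dominates, while the problem that is 100 times bigger is faster on
4 nodes, because the computation now dominates.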

I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


christian suhendra wrote:
> Oh... the contents of /mirror/mpich-1.2.7p1/share/machines.LINUX
> are the hostnames of the nodes.
> Here are the contents:
> cluster1
> cluster2
> cluster3
> cluster4
> 
> When I ran canon on 1 node, the total time was 4.316000 msecs,
> but when I ran canon on 4 nodes with
> mpirun -np 4
> the total was 21.552000 msecs.
> 
> It takes longer than on 1 node. It is supposed to be faster than on 1
> node/PC.
> In this case I just need mpich or my program to work on all the nodes so
> that the total time is shorter than when running on 1 node.
> 
> I attached my program so you can investigate the problem, but I
> think the real problem is in the configuration.
> 
> 
> Thank you so much, Mr. Gus.
> I really need your help. I don't know how to solve this problem; even my
> lecturer at my university doesn't know how to solve it. Actually, this
> is the final project for my thesis,
> and I chose it because I want to be an expert in this field someday.
> 
> 
> regards
> christian



