[Beowulf] Re: Performance degrading

Gus Correa gus at ldeo.columbia.edu
Mon Dec 21 10:06:04 PST 2009


Hi Jorg

To clarify what is going on,
I would try the "cpi.c" program (comes with MPICH2), or the
"connectivity_c.c" and "ring_c.c" programs (come with OpenMPI).
Get them with the source code from the MPICH2 and OpenMPI sites;
they are in the "examples" directories.
Compilation is straightforward with mpicc.
Run these programs on one node (4 processes) first,
then on several nodes (say -np 8, -np 12, etc).
Remember that OpenMPI's mpiexec has the "-byslot" and "-bynode" options,
which let you experiment with different mappings of processes to
cores and nodes (see "man mpiexec").
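For example, something along these lines should work (the "myhosts"
file below is just a placeholder for your own hostfile, and paths
will differ on your system):

   mpicc -o cpi cpi.c              # from the MPICH2 examples
   mpicc -o ring_c ring_c.c        # from the OpenMPI examples

   mpiexec -np 4 ./cpi                                # one node first
   mpiexec -hostfile myhosts -np 8 -bynode ./ring_c   # then across nodes
   mpiexec -hostfile myhosts -np 12 -byslot ./ring_c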

If cpi.c runs fine, then the problem is probably not with OpenMPI
or with your network, etc.
That will also help narrow down your investigation.

I know nothing about your programs, but I find it strange that
yours starts more processes than you request.
As Glen noted, this used to be the case with the old
MPICH1, which you are not using.
Hence, your program seems to be doing things under the hood,
beyond mpiexec.

In any case, have you tried "mpiexec -np 3 nwchem" (if 3 is an
acceptable number of processes for NWChem)?

Also, not being the administrator doesn't prevent you from installing
a newer OpenMPI (the current version is 1.4)
from source code in your own area and using it.
You just need to point PATH, LD_LIBRARY_PATH, and MANPATH
to your own OpenMPI in your .bashrc/.cshrc file.
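Roughly like this (adjust the tarball name, version, and install
prefix to whatever you actually download and prefer):

   tar xzf openmpi-1.4.tar.gz
   cd openmpi-1.4
   ./configure --prefix=$HOME/openmpi-1.4
   make
   make install

   # then in ~/.bashrc (use the setenv equivalents in ~/.cshrc):
   export PATH=$HOME/openmpi-1.4/bin:$PATH
   export LD_LIBRARY_PATH=$HOME/openmpi-1.4/lib:$LD_LIBRARY_PATH
   export MANPATH=$HOME/openmpi-1.4/share/man:$MANPATH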

I am no expert, just a user, but here is what I think may be happening.
Oversubscribing processors/cores leads to context switching
across the processes, which is a killer for MPI performance.
Oversubscribing memory (e.g. the total memory of all user processes
above 80% or so of RAM) leads to memory paging, another performance killer.
I would guess both situations open plenty of opportunity for
gridlock: one process tries to communicate with another that is
on hold, and when the one on hold becomes active,
the one that was trying to talk goes on hold, and so on.
Sometimes the programs just hang, sometimes one MPI process goes astray 
losing communication with the others.
Something like this may be happening to you.
I think the message is: MPI and oversubscription
(of processors or memory) don't mix well.
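If you want to check whether that is happening on your nodes, a few
standard Linux tools (nothing MPI-specific) should tell you:

   grep -c ^processor /proc/cpuinfo     # number of cores on the node
   ps -u $USER -o pid,pcpu,pmem,comm    # your processes, CPU and memory use
   free -m                              # overall memory and swap usage
   vmstat 2                             # nonzero "si"/"so" columns = paging

If you see more busy processes than cores, or the node is swapping,
you are oversubscribed.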

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------



Jörg Saßmannshausen wrote:
> Hi guys,
> 
> ok, some more information. 
> I am using OpenMPI-1.2.8 and I only start 4 processes per node. So my hostfile 
> looks like this:
> comp12 slots=4
> comp18 slots=4
> comp08 slots=4
> 
> And yes, one process is the idle one which does things in the background. I
> have observed similar degradations before with a different program (GAMESS)
> where in the end, running a job on one node was _faster_ than running it on
> more than one node. Clearly, there is a problem here.
> 
> Interesting to note that the fifth process is consuming memory as well; I did
> not see that at the time when I posted it. That is somewhat odd as well, as a
> different calculation (same program) does not show that behaviour. I assume
> it is one extra process per job-group which will act as a master or shepherd 
> for the slave processes. I know that GAMESS (which does not use MPI but ddi) 
> has one additional process as data-server.
> 
> IIRC, the extra process does come from NWChem, but I doubt I am 
> oversubscribing the node as it usually should not do much, as mentioned 
> before. 
> 
> I am still wondering whether that could be a network issue?
> 
> Thanks for your comments!
> 
> All the best
> 
> Jorg
> 
> 
> On Wednesday 16 December 2009 04:42:59 beowulf-request at beowulf.org wrote:
>> Hi Glen, Jorg
>>
>> Glen: Yes, you are right about MPICH1/P4 starting extra processes.
>> However, I wonder if that is what is happening to Jorg,
>> or if what he reported is just plain CPU oversubscription.
>>
>> Jorg:  Do you use MPICH1/P4?
>> How many processes did you launch on a single node, four or five?
>>
>> Glen:  Out of curiosity, I dug out the MPICH1/P4 I still have on an
>> old system, compiled and ran "cpi.c".
>> Indeed there are extra processes there, besides the ones that
>> I intentionally started in the mpirun command line.
>> When I launch two processes on a two-single-core-CPU machine,
>> I also get two (not only one) extra processes, for a total of four.
>>
>> However, as you mentioned,
>> the extra processes do not seem to use any significant CPU.
>> Top shows the two actual processes close to 100% and the
>> extra ones close to zero.
>> Furthermore, the extra processes don't use any
>> significant memory either.
>>
>> Anyway, in Jorg's case all processes consumed about
>> the same (low) amount of CPU, but ~15% memory each,
>> and there were 5 processes (only one "extra"? is it one per CPU socket?
>> is it one per core? one per node?).
>> Hence, I would guess Jorg's context is different.
>> But ... who knows ... only Jorg can clarify.
>>
>> These extra processes seem to be related to the
>> mechanism used by MPICH1/P4 to launch MPI programs.
>> They don't seem to appear in recent OpenMPI or MPICH2,
>> which have other launching mechanisms.
>> Hence my guess that Jorg had an oversubscription problem.
>>
>> Considering that MPICH1/P4 is old, no longer maintained,
>> and seems to cause more distress than joy in current kernels,
>> I would not recommend it to Jorg or to anybody anyway.
>>
>> Thank you,
>> Gus Correa
>> ---------------------------------------------------------------------
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> ---------------------------------------------------------------------
> 
