[Beowulf] Re: Performance degrading


Jörg Saßmannshausen jorg.sassmannshausen at strath.ac.uk
Wed Dec 16 01:41:39 PST 2009


Hi guys,

ok, some more information. 
I am using OpenMPI-1.2.8 and I only start 4 processes per node. So my hostfile 
looks like this:
comp12 slots=4
comp18 slots=4
comp08 slots=4

And yes, one process is the idle one which does things in the background. I 
have observed similar degradations before with a different program (GAMESS) 
where in the end, running a job on one node was _faster_ than running it on 
more than one node. Clearly, there is a problem here.

Interesting to note that the fifth process is consuming memory as well; I did 
not see that at the time when I posted it. That is somehow odd as well, as a 
different calculation (same program) does not show that behaviour. I assume 
it is one extra process per job-group which will act as a master or shepherd 
for the slave processes. I know that GAMESS (which does not use MPI but ddi) 
has one additional process as data-server.

IIRC, the extra process does come from NWChem, but I doubt I am 
oversubscribing the node as it usually should not do much, as mentioned 
before. 
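
A minimal sketch of how the oversubscription question could be checked: parse the hostfile shown above and compare the declared slots with the number of job processes seen on each node. The function names and the per-node counts in `observed` are hypothetical (only the "fifth process" on one node is mentioned in this thread), and the default of one slot for a host without `slots=` is an assumption of this sketch.

```python
def parse_hostfile(text):
    """Parse lines such as 'comp12 slots=4' into a {host: slots} dict."""
    slots = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        count = 1  # assumption: one slot when no 'slots=' field is given
        for field in fields[1:]:
            if field.startswith("slots="):
                count = int(field.split("=", 1)[1])
        slots[fields[0]] = count
    return slots


def oversubscribed(slots, running):
    """Return {host: excess} for hosts running more processes than slots."""
    return {
        host: count - slots.get(host, 0)
        for host, count in running.items()
        if count > slots.get(host, 0)
    }


HOSTFILE = """\
comp12 slots=4
comp18 slots=4
comp08 slots=4
"""

# Hypothetical per-node process counts, e.g. read off 'top' on each node.
observed = {"comp12": 5, "comp18": 4, "comp08": 4}

slots = parse_hostfile(HOSTFILE)
extra = oversubscribed(slots, observed)
print(extra)
```

A non-empty result only says that more processes than slots are present; whether the extra one actually competes for CPU still has to be read from its CPU usage.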

I am still wondering whether that could be a network issue?

Thanks for your comments!

All the best

Jorg


On Wednesday 16 December 2009 04:42:59 beowulf-request at beowulf.org wrote:
> Hi Glen, Jorg
>
> Glen: Yes, you are right about MPICH1/P4 starting extra processes.
> However, I wonder if that is what is happening to Jorg,
> or if what he reported is just plain CPU oversubscription.
>
> Jorg:  Do you use MPICH1/P4?
> How many processes did you launch on a single node, four or five?
>
> Glen:  Out of curiosity, I dug out the MPICH1/P4 I still have on an
> old system, compiled and ran "cpi.c".
> Indeed there are extra processes there, besides the ones that
> I intentionally started in the mpirun command line.
> When I launch two processes on a two-single-core-CPU machine,
> I also get two (not only one) extra processes, in a total of four.
>
> However, as you mentioned,
> the extra processes do not seem to use any significant CPU.
> Top shows the two actual processes close to 100% and the
> extra ones close to zero.
> Furthermore, the extra processes don't use any
> significant memory either.
>
> Anyway, in Jorg's case all processes consumed about
> the same (low) amount of CPU, but ~15% memory each,
> and there were 5 processes (only one "extra"?, is it one per CPU socket?
> is it one per core? one per node?).
> Hence, I would guess Jorg's context is different.
> But ... who knows ... only Jorg can clarify.
>
> These extra processes seem to be related to the
> mechanism used by MPICH1/P4 to launch MPI programs.
> They don't seem to appear in recent OpenMPI or MPICH2,
> which have other launching mechanisms.
> Hence my guess that Jorg had an oversubscription problem.
>
> Considering that MPICH1/P4 is old, no longer maintained,
> and seems to cause more distress than joy in current kernels,
> I would not recommend it to Jorg or to anybody anyway.
>
> Thank you,
> Gus Correa
> ---------------------------------------------------------------------
> Gustavo Correa
> Lamont-Doherty Earth Observatory - Columbia University
> Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------

-- 
*************************************************************
Jörg Saßmannshausen
Research Fellow
University of Strathclyde
Department of Pure and Applied Chemistry
295 Cathedral St.
Glasgow
G1 1XL

email: jorg.sassmannshausen at strath.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html



