[Beowulf] [gorelsky@stanford.edu: CCL:dual-core Opteron 275 performance]


Mikhail Kuzminsky kus at free.net
Wed Jul 13 09:31:48 PDT 2005


In message from Alan Louis Scheinine <scheinin at crs4.it> (Tue, 12 Jul 2005 12:24:27 +0200):
>  1) Gerry Creager wrote "Hoowa!"
>     Since the results seem useful, I would like to add the following.
>     On dual-CPU boards with Athlon32 CPUs, the program "bolam" was slow if
>     both CPUs on the board were used, it was better to have one MPICH process
>     per compute node.  This problem did not appear in another cluster that had
>     Opteron dual-CPU boards (single-core), that is, two processes for each node
>     did not cause a slowdown.  This is an indication that "bolam" is at a
>     threshold for memory access being a bottleneck.
The original post by S. Gorelsky (re-sent by E. Leitl) was about the
good scalability of a 4-core, dual-CPU Opteron 275 server on the
Gaussian 03 DFT/test397 test. I am currently testing a similar
Supermicro server with 2 x Opteron 275, but with DDR333 instead of the
DDR400 used by S. Gorelsky. I used SuSE 9.0 with a 2.4.21 kernel.

As I understand it, S. Gorelsky's original results were probably
obtained with shared-memory parallelization! If I use G03 with Linda
(which is the main parallelization tool for G03; the shared-memory
model of G03 is available only for a more restricted subset of
quantum-chemical methods), then the results are much worse.

On 4 cores I obtained a speedup of only 2.95 with Linda vs 3.6 with
shared memory. The difference, as I understand it, is simply due to
data exchanges through RAM in the Linda case; in the shared-memory
model such memory traffic is absent.
FYI: the speedup obtained by S. Gorelsky on 4 CPUs is 3.4 (I hope I
calculated it properly :-)).
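
For comparison, the three speedups above can be converted to parallel
efficiency (speedup divided by the number of cores). This is only a
minimal sketch in Python; the function name and the labels are
illustrative, and the numbers are simply the ones quoted in this message.

```python
def parallel_efficiency(speedup, n_cores):
    """Fraction of ideal linear scaling achieved on n_cores."""
    return speedup / n_cores

# Speedups on 4 cores as quoted in this message (labels are illustrative).
speedups = {
    "G03/Linda": 2.95,
    "G03/shared memory": 3.6,
    "S. Gorelsky (shared memory, DDR400)": 3.4,
}

for label, s in speedups.items():
    eff = parallel_efficiency(s, 4)
    print(f"{label}: speedup {s:.2f}, efficiency {eff:.0%}")
```

That is, roughly 74% efficiency with Linda against 90% with shared
memory on the same hardware.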

I also obtained similar results for other quantum-chemical methods,
which show that using Linda with G03 may give poor scalability on
dual-core Opterons.

We also have a quantum-chemical application (which we are developing
ourselves) that is bandwidth-limited when run in parallel, and using
1 CPU (1 MPI process) per dual-Xeon node with Myrinet/MPICH is
strongly preferred. On Opteron nodes with two single-core CPUs the
situation is better.

But now, on Opteron nodes with 4 cores on 2 CPUs, forcing the use of
only 2 of the 4 cores, one on each chip, will require CPU affinity
support in Linux.
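
To illustrate what "one core per chip" would mean in practice, here is
a rough sketch in Python. It assumes a kernel and Python build that
expose os.sched_setaffinity (Linux only), and it assumes a hypothetical
numbering in which cores 0-1 sit on chip 0 and cores 2-3 on chip 1; the
real numbering must be checked on the actual node. The helper names are
illustrative only.

```python
import os

def one_core_per_chip(core_to_chip):
    """Return the lowest-numbered core on each chip, sorted."""
    chosen = {}
    for core in sorted(core_to_chip):
        # Keep only the first core seen for each chip.
        chosen.setdefault(core_to_chip[core], core)
    return sorted(chosen.values())

def pin_current_process(cores):
    """Restrict this process to `cores` if the platform allows it."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # affinity calls are not available here
    usable = set(cores) & os.sched_getaffinity(0)
    if not usable:
        return False  # none of the requested cores exist or are allowed
    os.sched_setaffinity(0, usable)
    return True

# ASSUMPTION: hypothetical core -> chip layout of a 2 x dual-core node.
layout = {0: 0, 1: 0, 2: 1, 3: 1}

if __name__ == "__main__":
    cores = one_core_per_chip(layout)
    print("cores selected:", cores)
    print("affinity applied:", pin_current_process(cores))
```

With two MPI processes per node, each process would instead be pinned
to a single one of the selected cores.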

Yours
Mikhail

>     A complication for this interpretation is that the Athlon32 nodes use
>     Linux kernel 2.4.21.
>  2) Mikhail Kuzminsky asked "do you have "node interleave memory" switched off?"
>     Reading the BIOS:
>     Bank interleaving "Auto", there are two memory modules per CPU so there
>        should be bank interleaving.
>     Node interleaving "Disable"
>  3) In an email Guy Coates asked
>     > Did you need to use numa-tools to specify the CPU placement, or did the
>     > kernel "do the right thing" by itself?
>     The kernel did the right thing by itself.
>     I have a question: what are numa-tools?
>     On the computer I find
>     man -k numa
>        numa   (3)  - NUMA policy library
>        numactl(8)  - Control NUMA policy for processes or shared memory
>     rpm -qa | grep -i numa
>        numactl-0.6.4-1.13
>     Is numactl the "numa-tools"?  Is there another package to consider installing?
>     I see that numactl has many "man" pages.
>
>Reference, previous message:
> >In all cases, 4 MPI processes on a machine with 4 cores (two dual-core CPUs).
> >Meteorology program 1, "bolam"    CPU time, real time (in seconds)
> >      Linux kernel 2.6.9-11.ELsmp     122        128
> >      Linux kernel 2.6.12.2            64         77
> >
> >Meteorology program 2, "non-hydrostatic"
> >      Linux kernel 2.6.9-11.ELsmp     598        544
> >      Linux kernel 2.6.12.2           430        476



