[Beowulf] hang-up of HPC Challenge

Mikhail Kuzminsky kus at free.net
Mon Aug 18 11:20:16 PDT 2008


I ran the HPC Challenge benchmark suite on ONE dual-socket quad-core 
Opteron 2350 (rev. B3) based server (8 logical CPUs) with 16 GB of 
RAM. The tests were run under SuSE 10.3/x86-64, with LAM MPI 7.1.4 
and MPICH 1.2.7 from the SuSE distribution, using ATLAS 3.9. 
Unfortunately there is only one such cluster node, so I can't 
reproduce the run on another node :-(

For N (matrix size) up to 10000 everything looks OK. But for larger N 
(15000/20000/...) running hpcc 
(mpirun -np 8 hpcc) makes Linux hang.
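
As a rough sanity check on these problem sizes, the sketch below estimates the memory needed for one dense N x N matrix of 8-byte doubles, which is the dominant HPL allocation. It covers only that matrix; the other HPCC tests allocate their own arrays, so real usage will be higher. Reading "16 GB" as 16 GiB and the helper name hpl_matrix_bytes are assumptions made for illustration.

```python
# Rough memory footprint of a dense N x N double-precision matrix.
BYTES_PER_DOUBLE = 8
RAM_BYTES = 16 * 2**30  # 16 GB of RAM, read here as 16 GiB (an assumption)


def hpl_matrix_bytes(n: int) -> int:
    """Bytes needed to store an n x n matrix of 8-byte doubles."""
    return n * n * BYTES_PER_DOUBLE


if __name__ == "__main__":
    for n in (10000, 15000, 20000):
        size = hpl_matrix_bytes(n)
        share = 100 * size / RAM_BYTES
        print(f"N={n}: {size / 2**30:.2f} GiB total, {share:.1f}% of RAM")
```

By this estimate even N=20000 needs about 3 GiB in total across all 8 processes, well under the installed RAM, which is consistent with the lack of swapping.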

In the "top" output I see 8 hpcc instances, each using about 100% of 
a CPU, with reasonable amounts of virtual and RSS memory per hpcc 
process and no swap in use. Usually there are no PTRANS results in 
the hpccoutf.txt results file, but in a few cases (when I "actively 
watched" the hpcc run by issuing ps/top) I saw reasonable PTRANS 
results but no HPLinpack results. Once I obtained PTRANS, HPL and 
DGEMM results for N=20000, but the hang came later, during the STREAM 
tests. Maybe this is simply because, at the moment of the hang, the 
output buffer had not yet been flushed to the output file on the HDD.

One possible reason for the hang-ups is a memory hardware problem, 
but what about possible software causes?

The hpcc executable is 64-bit and dynamically linked. 
/etc/security/limits.conf is empty. The stack size limit (for the 
user issuing mpirun) is "unlimited", the main memory limit is about 
14 GB, and the virtual memory limit is about 30 GB. ATLAS was compiled 
for 32-bit integers, but that is enough for such N values. Even 
/proc/sys/kernel/shmmax is 2^63-1.
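
The 32-bit integer claim can be checked with simple arithmetic. The sketch below assumes the quantity that has to fit in a signed 32-bit integer is the element count of the full N x N matrix (a conservative reading, since each of the 8 processes holds only part of it); fits_int32 is a hypothetical helper, not an ATLAS or HPCC function.

```python
import math

INT32_MAX = 2**31 - 1  # largest value of a signed 32-bit integer


def fits_int32(n: int) -> bool:
    """True if the element count of an n x n matrix fits in a signed int32."""
    return n * n <= INT32_MAX


# Largest N whose N*N element count still fits in a signed 32-bit integer.
max_n = math.isqrt(INT32_MAX)

if __name__ == "__main__":
    for n in (10000, 15000, 20000):
        print(f"N={n}: N*N={n * n}, fits in int32: {fits_int32(n)}")
    print(f"largest N with N*N <= INT32_MAX: {max_n}")
```

Under that assumption the limit is N = 46340, so N = 15000 or 20000 leaves a wide margin and 32-bit integer overflow is an unlikely cause here.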

What else could be causing the hang?

Mikhail Kuzminskiy
Computer Assistance to Chemical Research Center
Zelinsky Institute of Organic Chemistry
Moscow