[Beowulf] HPL as a learning experience


Carsten Aulbert carsten.aulbert at aei.mpg.de
Tue Mar 16 08:27:30 PDT 2010


Hi all,

I wanted to run high performance linpack mostly for fun (and of course to 
learn more about it and stress test a couple of machines). However, so far 
I've had very mixed results.

I downloaded the 2.0 version released in September 2008 and managed to 
compile it with mpich 1.2.7 on Debian Lenny. The resulting xhpl binary is 
dynamically linked like this:

        linux-vdso.so.1 =>  (0x00007fffca372000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x00007fb47bca8000)
        librt.so.1 => /lib/librt.so.1 (0x00007fb47ba9f000)
        libgfortran.so.3 => /usr/lib/libgfortran.so.3 (0x00007fb47b7c4000)
        libm.so.6 => /lib/libm.so.6 (0x00007fb47b541000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00007fb47b32a000)
        libc.so.6 => /lib/libc.so.6 (0x00007fb47afd7000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb47bec4000)

Then I wanted to run a couple of tests on a single quad-CPU node (with 12 GB 
physical RAM). I used

http://www.advancedclustering.com/faq/how-do-i-tune-my-hpldat-file.html

to generate HPL.dat files for a single-core and a dual-core test ([1] and [2]).
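
For reference, the memory side of such an input file can be estimated by hand: HPL factors a dense N x N matrix of doubles, so the matrix alone takes about 8 * N^2 bytes, shared across the P x Q process grid. The sketch below shows that arithmetic. The helper names, the 80% fill fraction, and reading 12 GB as 12 * 2^30 bytes are assumptions for illustration and are not taken from the tuning page.

```python
import math

BYTES_PER_DOUBLE = 8


def matrix_bytes(n):
    """Bytes needed for the dense N x N matrix of doubles (workspace excluded)."""
    return BYTES_PER_DOUBLE * n * n


def suggest_n(ram_bytes, fraction=0.8, nb=128):
    """Largest N, rounded down to a multiple of NB, whose matrix fits in
    `fraction` of `ram_bytes`. The 0.8 default is an assumed rule of thumb."""
    max_elements = int(ram_bytes * fraction) // BYTES_PER_DOUBLE
    n = math.isqrt(max_elements)
    return (n // nb) * nb


if __name__ == "__main__":
    gib = 2 ** 30
    n = 14592  # N used in both input files [1] and [2]
    total = matrix_bytes(n)
    print(f"N={n}: matrix is {total / gib:.2f} GiB in total")
    for p, q in [(1, 1), (1, 2)]:
        # The matrix is distributed over the P x Q grid, so each process
        # holds roughly an equal share.
        print(f"  P={p} Q={q}: about {total / (p * q) / gib:.2f} GiB per process")
    print("N for 80% of 12 GiB with NB=128:", suggest_n(12 * gib))
```

Under these assumptions the matrix for N = 14592 is about 1.6 GiB in total, which is small compared to the 12 GB in the node.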

Starting the single core run does not pose any problem:
/usr/bin/mpirun.mpich -np 1  -machinefile machines /nfs/xhpl

where machines is a simple file that lists the name of this host four times. 
So far so good:
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR11C2R4       14592   128     1     1             407.94          5.078e+00
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.0087653 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0209927 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0045327 ...... PASSED
============================================================================
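
The Gflops column can be cross-checked from N and Time. The sketch below assumes the usual LU operation count of 2/3 * N^3 + 3/2 * N^2 floating-point operations, which is my understanding of what HPL reports; the function name is made up for this example.

```python
def hpl_gflops(n, seconds):
    """Gflops for an N x N solve, assuming 2/3*N^3 + 3/2*N^2 operations."""
    flops = (2.0 / 3.0) * n ** 3 + 1.5 * n ** 2
    return flops / seconds / 1.0e9


if __name__ == "__main__":
    # N and Time taken from the WR11C2R4 result line above.
    print(f"{hpl_gflops(14592, 407.94):.3f} Gflops")
```

With N = 14592 and 407.94 s this comes out at roughly 5.08 Gflops, in line with the reported 5.078e+00.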

When I start the two-core run, I get the following error message after a 
couple of seconds (once RSS reaches the VIRT value shown in top):

/usr/bin/mpirun.mpich -np 2  -machinefile machines /nfs/xhpl
p0_20535:  p4_error: interrupt SIGSEGV: 11
rm_l_1_20540: (1.804688) net_send: could not write to fd=5, errno = 32

SIGSEGV with p4_error indicates a segfault within HPL. That's as far as I've 
got with Google, and right now I have no idea how to proceed. I somehow doubt 
that this venerable program is so buggy that I'd hit a bug on my first day ;)

Any ideas what I might be doing wrong?

Cheers

Carsten

[1]
single core test
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any) 
8            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
14592         Ns
1            # of NBs
128           NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1            Ps
1            Qs
16.0         threshold
1            # of panel fact
2            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
4            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
1            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0                               Number of additional problem sizes for PTRANS
1200 10000 30000                values of N
0                               number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64        values of NB

[2]
dual core setup
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any) 
8            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
14592         Ns
1            # of NBs
128           NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1            Ps
2            Qs
16.0         threshold
1            # of panel fact
2            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
4            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
1            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
##### This line (no. 32) is ignored (it serves as a separator). ######
0                               Number of additional problem sizes for PTRANS
1200 10000 30000                values of N
0                               number of additional blocking sizes for PTRANS
40 9 8 13 13 20 16 32 64        values of NB
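
One quick sanity check on input files like [1] and [2] is to pull out N, NB, P and Q and compare P x Q against the -np value passed to mpirun. The following is only a minimal sketch: it assumes the "values first, label second" layout shown above, and the function names are hypothetical rather than part of HPL.

```python
LABELS = ("Ns", "NBs", "Ps", "Qs")


def parse_hpl_dat(text):
    """Collect the integer values in front of the Ns/NBs/Ps/Qs labels."""
    values = {}
    for line in text.splitlines():
        numbers = []
        for token in line.split():
            if token.isdigit():
                numbers.append(int(token))
                continue
            # First non-numeric token: keep the line only if it is a known label.
            if numbers and token in LABELS:
                values.setdefault(token, numbers)
            break
    return values


def ranks_needed(values):
    """Largest P*Q over the listed process grids (Ps[i] pairs with Qs[i])."""
    return max(p * q for p, q in zip(values["Ps"], values["Qs"]))


if __name__ == "__main__":
    sample = """\
1            # of problems sizes (N)
14592         Ns
1            # of NBs
128           NBs
1            # of process grids (P x Q)
1            Ps
2            Qs
"""
    cfg = parse_hpl_dat(sample)
    print(cfg, "-> needs", ranks_needed(cfg), "MPI processes")
```

For the dual-core file this gives a 1 x 2 grid, which matches -np 2.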


