[Beowulf] hang-up of HPC Challenge
Mikhail Kuzminsky kus at free.net
Wed Aug 20 10:52:51 PDT 2008
In message from Greg Lindahl <lindahl at pbm.com> (Tue, 19 Aug 2008 19:39:38 -0700):
>On Wed, Aug 20, 2008 at 03:45:43AM +0400, Mikhail Kuzminsky wrote:
>> For some localization of possible problem reason, I ran pure HPL test
>> instead of HPCC. HPL performs direct output to screen instead of writing
>> to the file.
>>
>> Using MPICH w/np=8 I obtained normal HPL result for N=35000 - including
>> 3 "PASSED" strings for ||Ax-b|| calculations. BUT ! Linux hang-ups
>> immediately after output of this strings.
>
>Well, what did your configuration file tell HPL to do? Does it have
>another test, perhaps a bigger one, or is it supposed to exit? We
>aren't mind-readers.

Sorry about that. I have now performed two HPL runs with the same N=10000:

(1st) a "single" HPL run, i.e. ONE N=10000, ONE blocksize value, and ONE value for every other HPL.dat parameter;
(2nd) a "multiple" HPL run with the same (one) N=10000 and blocksize=100, but with sets of values for PFACT etc. (see the output below).

The 1st run finished successfully; the 2nd led to a Linux hang-up.

Yours
Mikhail

"single" HPL run:

============================================================================
HPLinpack 1.0a  --  High-Performance Linpack benchmark  --  January 20, 2004
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Labs.,  UTK
============================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 10000
NB     : 100
PMAP   : Row-major process mapping
P      : 2
Q      : 4
PFACT  : Right
NBMIN  : 4
NDIV   : 2
RFACT  : Crout
BCAST  : 1ringM
DEPTH  : 1
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 16 double precision words

----------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
   1) ||Ax-b||_oo / ( eps * ||A||_1  * N        )
   2) ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  )
   3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

============================================================================
T/V            N     NB    P    Q        Time        Gflops
----------------------------------------------------------------------------
WR11C2R4   10000    100    2    4       23.32     2.859e+01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 0.0767386 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 0.0181586 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0040588 ...... PASSED
============================================================================

Finished 1 tests with the following results:
    1 tests completed and passed residual checks,
    0 tests completed and failed residual checks,
    0 tests skipped because of illegal input values.
----------------------------------------------------------------------------

End of Tests.
============================================================================

[1]+  Done                    mpirun -np 8 xhpl

"multiple" HPL run:

============================================================================
HPLinpack 1.0a  --  High-Performance Linpack benchmark  --  January 20, 2004
Written by A. Petitet and R. Clint Whaley,  Innovative Computing Labs.,  UTK
============================================================================

An explanation of the input/output parameters follows:
T/V    : Wall time / encoded variant.
N      : The order of the coefficient matrix A.
NB     : The partitioning blocking factor.
P      : The number of process rows.
Q      : The number of process columns.
Time   : Time in seconds to solve the linear system.
Gflops : Rate of execution for solving the linear system.

The following parameter values will be used:

N      : 10000
NB     : 100
PMAP   : Row-major process mapping
P      : 2
Q      : 4
PFACT  : Left Crout Right
NBMIN  : 2 4
NDIV   : 2
RFACT  : Left Crout Right
BCAST  : 1ring
DEPTH  : 0
SWAP   : Mix (threshold = 64)
L1     : transposed form
U      : transposed form
EQUIL  : yes
ALIGN  : 16 double precision words

----------------------------------------------------------------------------

- The matrix A is randomly generated for each test.
- The following scaled residual checks will be computed:
   1) ||Ax-b||_oo / ( eps * ||A||_1  * N        )
   2) ||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  )
   3) ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo )
- The relative machine precision (eps) is taken to be 1.110223e-16
- Computational tests pass if scaled residuals are less than 16.0

============================================================================
T/V            N     NB    P    Q        Time        Gflops
----------------------------------------------------------------------------
WR00L2L2   10000    100    2    4       23.02     2.897e+01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 0.0980967 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 0.0232126 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0051885 ...... PASSED
============================================================================
T/V            N     NB    P    Q        Time        Gflops
----------------------------------------------------------------------------
WR00L2L4   10000    100    2    4       22.97     2.903e+01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 0.0832258 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 0.0196937 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0044019 ...... PASSED
============================================================================
T/V            N     NB    P    Q        Time        Gflops
----------------------------------------------------------------------------
WR00L2C2   10000    100    2    4       22.95     2.905e+01
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) = 0.0980967 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) = 0.0232126 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0051885 ...... PASSED

... and here Linux hangs ...

>
>-- greg
>
>
>_______________________________________________
>Beowulf mailing list, Beowulf at beowulf.org
>To change your subscription (digest mode or unsubscribe) visit
>http://www.beowulf.org/mailman/listinfo/beowulf
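
The T/V labels in the output above (WR11C2R4, WR00L2L2, ...) encode which variant each test ran, so they can help narrow down which parameter combination was about to start when the machine hung. The following is a minimal Python sketch, not part of HPL: the helper names decode_variant and expected_labels are hypothetical, the field layout (W, PMAP, DEPTH, BCAST, RFACT, NDIV, PFACT, NBMIN) is read off the two runs shown here, and the iteration order (NBMIN varying fastest, then PFACT, NDIV, RFACT) is an assumption that merely agrees with the three labels printed before the hang.

```python
from itertools import product

# BCAST digit as numbered in HPL.dat (0=1rg, 1=1rM, 2=2rg, 3=2rM, 4=Lng, 5=LnM).
BCAST_NAMES = {"0": "1ring", "1": "1ringM", "2": "2ring",
               "3": "2ringM", "4": "Blong", "5": "BlongM"}
FACT_NAMES = {"L": "Left", "C": "Crout", "R": "Right"}
PMAP_NAMES = {"R": "Row-major", "C": "Column-major"}


def decode_variant(label):
    """Split an 8-character T/V label such as 'WR00L2C2' into named fields."""
    if len(label) != 8 or label[0] != "W":
        raise ValueError(f"unexpected T/V label: {label!r}")
    return {
        "PMAP": PMAP_NAMES[label[1]],
        "DEPTH": int(label[2]),
        "BCAST": BCAST_NAMES[label[3]],
        "RFACT": FACT_NAMES[label[4]],
        "NDIV": int(label[5]),
        "PFACT": FACT_NAMES[label[6]],
        "NBMIN": int(label[7]),
    }


def expected_labels(rfacts, ndivs, pfacts, nbmins, depth=0, bcast=0, pmap="R"):
    """List labels in the ASSUMED test order: NBMIN fastest, then PFACT, NDIV, RFACT."""
    return [
        f"W{pmap}{depth}{bcast}{rf}{nd}{pf}{nb}"
        for rf, nd, pf, nb in product(rfacts, ndivs, pfacts, nbmins)
    ]


# Parameter sets from the "multiple" run: RFACT and PFACT = Left Crout Right,
# NDIV = 2, NBMIN = 2 4, DEPTH = 0, BCAST = 1ring (digit 0).
labels = expected_labels("LCR", [2], "LCR", [2, 4])
print(len(labels), "tests expected")
print("first four:", labels[:4])
print("fourth test:", decode_variant(labels[3]))
```

Under these assumptions the "multiple" run should consist of 18 tests, and the hang would have occurred at or before the start of the fourth one (PFACT = Crout, NBMIN = 4). Putting only that combination into HPL.dat might show whether the hang is tied to that variant or to running several tests back to back.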
