[Beowulf] Re: HPL input file

Mark Hahn hahn at mcmaster.ca
Mon Feb 26 07:39:34 PST 2007


> I am trying to find out the speed of my cluster using HPL but I am not able
> to understand what values to set in HPL.dat to find out the peak performance
> (e.g. the values of N, NB, PxQ, etc). Kindly help me in this regard.

following is the HPL.dat I'm currently using as a load-generator
for my cluster's 8GB dual-socket-single-core nodes.  it's not for 
generating HPL scores, but rather just to stress the system.

comments:
 	- you choose the problem size (N) to match your memory - too low a value
 	will result in not enough work per cpu and lower efficiency.  on my
 	system, I found no significant advantage to using more than 1GB/proc,
 	but that should depend on the CPU and interconnect speed.  (faster
 	cpus need more work per process to amortize communication; faster
 	communication lowers the amount of work needed to amortize it.)

 	- I didn't find any strong dependence on NB.

 	- P*Q=ncpus; for a switched interconnect, conventional wisdom is that
 	you want PxQ to be close to square.  on my machine (full-bisection
 	quadrics with dual-processor nodes) I think I've measured it being
 	slightly faster when run in a 1:2 shape (Q ~= 2P).

 	- I haven't found any strong performance dependency on any of the
 	other parameters, but other clusters may be different if they have
 	slower or non-flat networks, more procs/node, etc.
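
The first and third comments above can be turned into a small calculation. The following is only a sketch, not part of the original setup: the function names are hypothetical, the 0.8 memory fraction is an assumed rule of thumb rather than a figure from this post, and it assumes the matrix is N x N doubles (8 bytes each) spread over all processes. It needs Python 3.8+ for math.isqrt.

```python
import math

def suggest_n(total_mem_bytes, nb, fraction=0.8):
    """Largest N that is a multiple of nb and whose N x N matrix of
    8-byte doubles fits in `fraction` of the total memory.
    The default fraction is an assumption, not a measured value."""
    n = int(math.sqrt(fraction * total_mem_bytes / 8))
    return (n // nb) * nb

def grid_candidates(nprocs):
    """All (P, Q) with P * Q == nprocs and P <= Q, most square first."""
    pairs = [(p, nprocs // p)
             for p in range(1, math.isqrt(nprocs) + 1)
             if nprocs % p == 0]
    return sorted(pairs, key=lambda pq: pq[1] - pq[0])

if __name__ == "__main__":
    # One 8 GiB node, NB = 200 as in the file attached to this message.
    print(suggest_n(8 * 2**30, 200))
    # Candidate process grids for a hypothetical 32-process run.
    print(grid_candidates(32))
```

For 32 processes the first candidate is 4 x 8, which happens to be the 1:2 shape mentioned above; for a perfect square such as 16 it is 4 x 4. Which shape is actually faster on a given interconnect still has to be measured.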

regards, mark hahn.

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
5            # of problems sizes (N)
1000 31700 31700 31700 31700  Ns
1            # of NBs
200          NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1            Ps
2            Qs
16.0         threshold
1            # of panel fact
1            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
4            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
1            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
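
As a rough sanity check on the N value in the file above, the sketch below computes the matrix footprint and converts a run time into Gflops. It is an illustration under stated assumptions: the helper names are hypothetical, the footprint counts only the N x N matrix of 8-byte doubles (HPL needs some workspace beyond that), and the operation count 2/3 N^3 + 3/2 N^2 is, to my understanding, the one HPL uses when it reports Gflops.

```python
def matrix_bytes(n):
    """Bytes needed for an n x n matrix of 8-byte doubles."""
    return 8 * n * n

def hpl_flops(n):
    """Assumed HPL operation count for an order-n problem."""
    return (2.0 / 3.0) * n**3 + 1.5 * n**2

def gflops(n, seconds):
    """Gflops for an order-n run that took `seconds` of wall time."""
    return hpl_flops(n) / seconds / 1e9

if __name__ == "__main__":
    n = 31700
    gib = matrix_bytes(n) / 2**30
    print(f"N={n}: matrix needs {gib:.2f} GiB")
    # The run time here is a made-up placeholder, not a measurement.
    print(f"at 3600 s this would be {gflops(n, 3600.0):.2f} Gflops")
```

With N = 31700 the matrix alone comes to about 7.5 GiB, which is consistent with the stated goal of stressing an 8GB node rather than finding the most efficient problem size.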


