[Beowulf] Re: HPL input file

Mark Hahn hahn at mcmaster.ca
Mon Feb 26 07:39:34 PST 2007


> I am trying to find out the speed of my cluster using HPL but I am not able
> to understand what values to set in HPL.dat to find out the peak performance
> (e.g. the values of N, NB, PxQ, etc). Kindly help me in this regard.

following is the HPL.dat I'm currently using as a load-generator
for my cluster's 8GB dual-socket-single-core nodes.  it's not for 
generating HPL scores, but rather just to stress the system.

comments:
 	- choose the problem size (N) to match your memory: too small a value
 	leaves too little work per cpu and lowers efficiency.  on my
 	system, I found no significant advantage to using more than 1GB/proc,
 	but that should depend on the CPU and interconnect speed.  (faster
 	cpus need more work per proc to amortize communication; faster
 	communication lowers the amount of work needed to amortize it.)

 	- I didn't find any strong dependence on NB.

 	- P*Q=ncpus; for a switched interconnect, conventional wisdom is that
 	you want PxQ to be close to square.  on my machine (full-bisection
 	quadrics with dual-processor nodes) I think I've measured it being
 	slightly faster when run in a 1:2 shape (Q ~= 2P).

 	- I haven't found any strong performance dependency on any of the
 	other parameters, but other clusters may be different if they have
 	slower or non-flat networks, more procs/node, etc.
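
As a rough illustration of the sizing comments above, here is a minimal Python sketch. The function names, the 64-process example and the 1 GiB/proc budget are hypothetical placeholders, not values from any real cluster. It assumes the N x N matrix of 8-byte doubles is the only significant memory consumer, and it rounds N down to a multiple of NB, which is a common convenience rather than something HPL requires.

```python
import math

def problem_size(nprocs, bytes_per_proc, nb):
    """Largest N (a multiple of nb) whose N x N double matrix fits the budget."""
    total_bytes = nprocs * bytes_per_proc
    n = math.isqrt(total_bytes // 8)   # 8 bytes per double-precision element
    return (n // nb) * nb              # round down to a whole number of blocks

def grid_shapes(nprocs):
    """All (P, Q) with P * Q == nprocs and P <= Q, most square first."""
    shapes = [(p, nprocs // p)
              for p in range(1, math.isqrt(nprocs) + 1)
              if nprocs % p == 0]
    return sorted(shapes, key=lambda pq: pq[1] - pq[0])

if __name__ == "__main__":
    nprocs = 64                        # hypothetical process count
    print("N  =", problem_size(nprocs, 1 << 30, 200))
    print("PxQ candidates:", grid_shapes(nprocs))
```

The candidate list is only a starting point: which shape is fastest (square, or something like Q ~= 2P) still has to be measured on the actual interconnect.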

regards, mark hahn.

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
5            # of problems sizes (N)
1000 31700 31700 31700 31700  Ns
1            # of NBs
200          NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1            Ps
2            Qs
16.0         threshold
1            # of panel fact
1            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
4            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
1            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
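
To see how the N values in this file relate to the 8GB nodes mentioned above, the following small Python sketch computes the memory taken by the N x N matrix of doubles. The helper name is hypothetical, and the calculation ignores HPL's workspace and the OS, so treat the figures as a lower bound on real usage.

```python
def matrix_gib(n):
    """Memory for an n x n matrix of 8-byte doubles, in GiB."""
    return n * n * 8 / 2**30

if __name__ == "__main__":
    p, q = 1, 2                        # process grid from the file above
    for n in (1000, 31700):            # the distinct N values in the file
        total = matrix_gib(n)
        print(f"N={n}: {total:.2f} GiB total, "
              f"{total / (p * q):.2f} GiB per process")
```

For N=31700 this comes to roughly 7.5 GiB in total, which is consistent with a file meant to load nearly all of a node's memory rather than to produce a tuned HPL score.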


