[Beowulf] evaluating FLOPS capacity of our cluster

Gus Correa gus at ldeo.columbia.edu
Mon May 11 15:58:24 PDT 2009


Greg Lindahl wrote:
> On Mon, May 11, 2009 at 05:56:43PM -0400, Gus Correa wrote:
> 
>> However, here is somebody that did an experiment with increasing
>> values of N, and his results suggest that performance increases  
>> logarithmically with problem size (N), not linearly,
>> saturating when you get closer to the maximum possible for your
>> current memory size.
> 
> This is well-known. And of course faster interconnect means that you
> get closer to peak at smaller problem sizes.
>
> -- greg
>

Hi Greg, list

What you said goes right to my point and question:

How close am I now to the maximum performance that can be achieved?

Memory is 16GB/node, maximum possible is 128GB/node.
24 nodes with Infiniband III, single switch.
Max problem size with current memory: N=196,000;
max problem size if I had 128GB/node: N=554,000 (quick sizing sketch below).
Current Rmax/Rpeak=83.6%, and I've read the comment that anything
below 85% is not good enough (for a "small cluster" I presume).
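Aside on the N figures above: they follow from the usual HPL sizing rule of
letting the N x N matrix of 8-byte doubles fill roughly 80% of total memory.
A quick sketch; the 0.80 fraction and the use of decimal GB are my own
assumptions, chosen because they reproduce the numbers quoted above:

    import math

    def hpl_max_n(nodes, mem_per_node_gb, mem_fraction=0.80):
        # Largest N such that an N x N matrix of 8-byte doubles fits
        # in the given fraction of total cluster memory.
        total_bytes = nodes * mem_per_node_gb * 1e9
        return int(math.sqrt(mem_fraction * total_bytes / 8))

    print(hpl_max_n(24, 16))     # ~196,000 (current 16GB/node)
    print(hpl_max_n(24, 128))    # ~554,000 (hypothetical 128GB/node)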

So, is 85% feasible?
Is 85% the top?

Imagining the nodes had 128GB, N=554,000,
what is your guess for Rmax/Rpeak?
(YMMV is not an answer! :) )
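Side note on cost: at N=554,000 each HPL run would take a while, since HPL
performs about 2/3*N^3 + 2*N^2 floating-point operations, so wall-clock time
grows with N^3. A rough sketch; the 1000 GFLOPS Rmax is only a placeholder,
not a measured number:

    def hpl_runtime_hours(n, rmax_gflops):
        # Approximate wall-clock time of one HPL run,
        # using HPL's operation count of 2/3*N^3 + 2*N^2.
        flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
        return flops / (rmax_gflops * 1e9) / 3600.0

    print(hpl_runtime_hours(196000, 1000.0))   # ~1.4 hours
    print(hpl_runtime_hours(554000, 1000.0))   # ~31 hours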

Many thanks,
Gus Correa

PS - The problem with this HPL thing is that it becomes addictive,
and I need to go do some work, production, not tests, tests, tests ...

---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
