[Beowulf] 512 nodes Myrinet cluster Challenges

Wed May 3 06:11:48 PDT 2006

Hi Patrick

Patrick Geoffray wrote:

[...]

> I find it amusing that you have previously expressed reservations about 
> the Linpack benchmark used in the Top500 but you blindly trust the HPCC 
> results. A benchmark is useful only if it's widely used and if it is 
> properly implemented.

I am not taking Vincent's side here, simply making some points.

Benchmarks are effectively meaningless unless they use the code you are 
interested in with data and run conditions that you are using or plan to 
use.  You may be able to abstract kernels and test cases (a la Linpack 
and many others), but how well they correlate against actual observable 
performance is rarely known (the kernel is not the only thing to run).

My point being that I am not particularly happy with either Linpack, or 
HPCC.  They do absolutely nothing to help people understand the 
performance of BLAST/mpiBLAST, HMMer, Amber, ... on clusters. *

[...]

> The key is to set the right requirements in your RFP. Naive RFPs would
> use broken benchmarks like HPCC. Smart RFPs would require benchmarking
> real application cores under reliability and performance constraints.

... other RFPs use federal procurement benchmarks that have very little 
to do with the actual machines they intend to purchase.

Benchmarking is a measurement.  For it to be a meaningful measurement it 
has to have at least some minimal relevance and predictive ability 
relative to end users' codes and cases.  Simply throwing together an RFP 
with a randomly chosen (and not terribly relevant) benchmark, or 
occasionally with wishful thinking, does not a realistic or good 
computing system make.

> It's not that "you get what you pay for", it's "you get what you ask for
> at the best price".

Yup, something very close to this is usually the case.  More often, 
though, we see spec non-compliant proposals win the day, usually from 
the volume box stackers/shippers who pay little to no attention to the 
spec.  Nor do they run the benchmarks.  Not gonna name any names, they 
know who they are.

[...]

> Ok, you want to buy a 1024-node cluster. How do you measure at full
> system load? You ask to benchmark another 1024-node cluster? You
> can't, no vendor has such a cluster ready for evaluation. Even if they
> had one, things change so quickly in HPC, it would be obsolete very
> quickly from a sales point of view.

We have had a number of RFPs over the last year requesting that we run 
on the proposed system.  It is quite hard to get the RFP writers to 
understand that very few companies either have a system of that size or 
specification, or would be willing to let it be used for the month 
required to do the runs and nothing but the runs.

Benchmark facilities are cost centers.  No vendor makes money by running 
benchmarks.  You would be lucky to have 32-128 nodes to rub together in 
most cases.  Expecting a full-on 1024-node cluster of any flavor in the 
configuration you want is, well ...

> The only way is to benchmark something smaller (256 nodes) and define
> performance requirements at 1024 nodes. If the winning bid does not
> match the acceptance criteria, you refuse the machine or you negotiate a
> "punitive package".

Heh... These need to be realistic as well.
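
As a rough illustration of why such targets need to be realistic, here 
is a minimal sketch that projects run time at 1024 nodes from two 
smaller timed runs.  It assumes run time follows Amdahl's law, 
T(n) = T1 * (s + (1 - s) / n), which is a simplification: real codes 
also pay communication costs that grow with node count, so treat the 
result as optimistic.  The function names and the timings are 
hypothetical, made up for the example.

```python
def fit_amdahl(n_a, t_a, n_b, t_b):
    """Fit T(n) = T1 * (s + (1 - s) / n) to two timed runs.

    Returns (s, t1): the serial fraction and the implied one-node time.
    Assumes pure Amdahl scaling (no communication overhead term).
    """
    r = t_a / t_b  # ratio of the two measured run times
    # Solve r * (s + (1 - s) / n_b) = s + (1 - s) / n_a for s.
    s = (1.0 / n_a - r / n_b) / (r - r / n_b - 1.0 + 1.0 / n_a)
    t1 = t_a / (s + (1.0 - s) / n_a)
    return s, t1


def project_time(s, t1, n):
    """Projected run time on n nodes under the fitted Amdahl model."""
    return t1 * (s + (1.0 - s) / n)


if __name__ == "__main__":
    # Hypothetical timings (seconds) at 64 and 256 nodes, not real data.
    s, t1 = fit_amdahl(64, 25.46875, 256, 13.8671875)
    print("serial fraction:", s)
    print("projected time at 1024 nodes:", project_time(s, t1, 1024))
```

With these made-up numbers the fitted serial fraction is 1%, and going 
from 256 to 1024 nodes buys only about a 1.26x improvement.  An 
acceptance criterion that assumes anything close to 4x would not be 
met by any vendor.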

[...]

> Not often in HPC. The HPC market is so small and so low-volume, you
> cannot take the risk of alienating customers like that, they won't come
> back. If they don't come back, you go out of business.

Hmmm... I disagree with the statement about the HPC market being so 
small.  Contact me offline if you want to discuss it further.

* This is why we created the Bioinformatics Benchmark System 
(http://www.scalableinformatics.com/bbs), specifically to enable people 
to easily use/create their own benchmark test cases.  We are finding 
that people prefer to run our example cases, so the onus is upon us to 
make them more relevant and current.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615