[Beowulf] [landman at scalableinformatics.com: Re: [Bioclusters] topbiocluster.org]

Eugen Leitl eugen at leitl.org
Fri Jun 24 03:36:59 PDT 2005


----- Forwarded message from Joe Landman <landman at scalableinformatics.com> -----

From: Joe Landman <landman at scalableinformatics.com>
Date: Thu, 23 Jun 2005 21:20:58 -0400
To: "Clustering,  compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
Subject: Re: [Bioclusters] topbiocluster.org
User-Agent: Mozilla Thunderbird 1.0.2 (Windows/20050317)
Reply-To: "Clustering,  compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>

I had sent a note to James offline; it's worth posting a similar letter 
here.

James Cuff wrote:
>Ok,
>
>So I put my money where my mouth is, (well 50 bucks anyway)
>
>http://topbiocluster.org is alive
>
>(well once the DNS gets pushed everywhere that is, I only set it last
>night, some of you may have to hang fire for a bit :-))
>
>We all talk a lot on this list about which cluster this, that and the
>other for application this, that and the rest.  I also saw the latest
>Top500 list yesterday, and to be frank I'm all done with Linpack; we do
>other stuff, and it matters.


Absolutely.  Linpack makes sense for folks doing large dense linear 
algebra.  Few folks here are doing that.  Moreover, IDC data seems to 
indicate that a sizeable fraction of compute cycles goes to 
informatics-style computing, yet there is no equivalent of HPL for this 
community.

We started to do this with the baseline tests for BBSv3, and we had 
hoped for some feedback.  Most of what we got was from vendors wanting 
to use it for marketing purposes (which is fine, but for the numbers to 
mean something the content needs to be relevant to a large swath of 
users).

>There are two good benchmark tools I know of, both are currently listed on
>the topbiocluster.org 'site', but I'm going to need a bit of help from
>folk to actually get this thing off the ground.
>
>My first thoughts are we build a list of what is actually out there in
>terms of bioclusters, bit like Glen's QA mail from the other day, then we
>start to go about doing the benchmark gig.

In looking over the reported results, I was struck by the thought that 
someone is designing/building/selling slow filesystems.  That matters 
because this community is fundamentally data-motion bound: huge 
databases and data sets take non-zero time to move, even on fast links.

>I'm also looking to the vendors a bit here (I know some of you folk hang
>out in here :-)).  Let me know off list if I'm opening up a can of worms,
>or if you would like to help.  I want to keep this open, but there are
>often things best talked about off list...

I believe this is a can of worms that needed to be opened some years 
ago.  We tried to pry this open with the BBSv3 baseline tests last year 
and to gather some feedback.  Since then we have added Amber8 tests, 
GAMESS tests, and a few others.  But we are still missing some critical 
tests.

Bonnie is nice, but it doesn't replicate the workload of most of the 
tools we have encountered.  Most of the tools we have seen have use 
cases that are either large sequential reads punctuated by occasional 
writes, or effectively random IO.  Other tools introduce effectively 
random network latency (remote interactions with web-service systems).
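A minimal sketch of those two access patterns, timing large sequential reads against small random-offset reads on the same scratch file (the file name, sizes, and counts are illustrative, not from any BBS test):

```python
import os
import random
import time

def make_test_file(path, size_mb=16):
    """Write a scratch file of pseudo-random data, 1 MB at a time."""
    with open(path, "wb") as f:
        for _ in range(size_mb):
            f.write(os.urandom(1024 * 1024))

def sequential_read(path, block=1024 * 1024):
    """Large sequential reads: the BLAST-style pattern described above."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block):
            pass
    return time.perf_counter() - start

def random_read(path, block=4096, count=1024):
    """Effectively random IO: seek to random offsets, read small blocks."""
    size = os.path.getsize(path)
    start = time.perf_counter()
    with open(path, "rb") as f:
        for _ in range(count):
            f.seek(random.randrange(0, size - block))
            f.read(block)
    return time.perf_counter() - start

make_test_file("scratch.bin")
print("sequential: %.4fs  random: %.4fs"
      % (sequential_read("scratch.bin"), random_read("scratch.bin")))
os.remove("scratch.bin")
```

On a single spindle the random case is typically far slower per byte, which is exactly the gap a bonnie-style streaming test can miss.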

>If we get this thing right it _will_ be a one stop shop for biocluster
>performance.  
>
>I really want to capture NFS/SAN/storage figures in here, we all know it's
>not just about the number of CPUs.  We really need to see if we can
>capture the whole *cluster* performance, not just raw CPU horsepower...

Thank you.  The fastest CPUs can be hamstrung by terrible IO or simply 
poor/non-scalable cluster IO designs.  The fastest networks can be 
hamstrung by poor-quality switches/NICs.  Bad OS choices have a huge 
performance impact, as do many other choices along those lines.  The 
wrong compilers or compiler options can make fast codes creep.

>
>So, let's open this up, and lets get talking...  
>
>- How can we best start to fill in this web site?

First off, get end users to start talking about the things that they 
care about in performance:  what bottlenecks their runs?  With enough 
data, we can move to step 2.

Second, build tests that exercise the weak spots as well as the strong 
spots.  Sure, the latest, greatest multi-core CPUs are great.  Just 
don't run them with a single spindle or a poorly designed RAID5.

>
>- Would people be happy to submit figures about their cluster?

Hopefully.  BBSv4 is aimed at making this very simple.

>
>- What numbers shall we use for ranking?  What to run etc.

One fundamental error in the SPEC numbers is reducing the 
multidimensional performance space to a single number by the dubious 
practice of averaging over things with very different 
characteristics/dimensions.  I would argue for a vector, and the vector 
would be per application.  That is, have a BLAST vector with blastx, 
psiblast, rpsblast, ... .  Have an HMMer vector with hmmalign, a Pfam 
search, ... .  Have a data-transfer vector: time to copy nt to all 
nodes in the cluster, divided by the number of nodes.  Have a 
web-services vector.  This way you don't lose information (some systems 
may be better designed for one subset of tasks than another).
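A toy illustration of the vector idea (the application names follow the examples above; the timings are invented):

```python
# Per-application benchmark vectors: keep every component rather than
# collapsing them into one average.  Timings (seconds) are made up.
results = {
    "blast":    {"blastx": 412.0, "psiblast": 388.0, "rpsblast": 205.0},
    "hmmer":    {"hmmalign": 96.0, "pfam_search": 731.0},
    "transfer": {"nt_copy_per_node": 58.0},
}

def vector(app):
    """Return the ordered (test, seconds) vector for one application."""
    return sorted(results[app].items())

def compare(app, a, b):
    """Component-wise comparison of two clusters on one application:
    cluster a 'wins' a component when its time is lower."""
    return [(test, a[test] < b[test]) for test, _ in vector(app)]

print(vector("blast"))
```

Because no components are averaged away, a cluster tuned for BLAST-style streaming IO and one tuned for web-service latency can each show where they are strong.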

>
>- How do we capture storage aspects?

Bonnie is a start, but I would question how well correlated it is with 
real use cases.  I would think that a more typical use case would 
involve remote queries of a large database, moving large databases, 
local queries of large databases, etc.

>I'm happy to do some of the grunt work here to collect information etc.  
>I guess it's best that we keep all the chat open on this list, and I'll
>see what pops up.  As things come in, I'll start to flesh out the website
>some more.  Also, once we have a bit more of a scope as to what we will
>actually rank, list and store, I'll be happy to start on the MySQL
>database, and get things rocking.  

I had an XML format for the output, and I was speaking with a few other 
folks about standardizing it across several tools.  This would enable 
easier construction of comparison tools.
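The actual BBS schema isn't shown in this thread; a hypothetical record along those lines, built with Python's standard ElementTree (all element and attribute names here are invented for illustration), might look like:

```python
import xml.etree.ElementTree as ET

# Hypothetical benchmark-result record; not the real BBS format.
run = ET.Element("benchmark_run", cluster="example-cluster", nodes="16")
app = ET.SubElement(run, "application", name="blast")
ET.SubElement(app, "test", name="blastx", seconds="412.0")
ET.SubElement(app, "test", name="psiblast", seconds="388.0")

xml_text = ET.tostring(run, encoding="unicode")
print(xml_text)
```

A shared record shape like this is what would let different tools' results be loaded into one database and compared mechanically.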

>
>submit at topbiocluster.org will work to send things in so I can get them
>into a database if we actually get going on it.
>
>Let's see what happens, this could be a bumpy ride, but it should be fun.
>
>"Cabin crew, doors to automatic and cross check!"
>
>So I guess the floor is now open... 
>
>Best,
>
>J.

As indicated, we fully support this effort.  BBSv4 is being built as we 
speak, fixing many unanticipated and sometimes surprising "features" 
and adding functionality and consistency.  Documentation too (the most 
requested feature).

Please, folks, suggest some tests which stress IO and the rest of the 
system: specifically, real workloads.  Those are the only benchmarks 
that matter.

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615

_______________________________________________
Bioclusters maillist  -  Bioclusters at bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters

----- End forwarded message -----
-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a>
______________________________________________________________
ICBM: 48.07100, 11.36820            http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE

