[Beowulf] Bioinformatics Benchmark System v3 rc1 ready

Mon Jul 12 05:55:44 PDT 2004

Hi Jan:

On Mon, 2004-07-12 at 06:27, Jan-Frode Myklebust wrote:
> It failed to run because of perl-dependencies on my ppc970 running
> RHEL 3 AS...  Instead of fixing the problem (it's not directly on the
> internet, so getting modules from CPAN is a problem), I just picked
> the benchmark-setup from the scripts and ran them by hand.  It would
> be nice if you could just skip the fancy xml input/output, and thereby
> reduce the complexity of it.. I mean, shouldn't it be fairly easy to 
> write the whole bbs in plain bourne shell ?  

On the Perl modules, could you let me know which ones were missing from
RHEL 3 AS on PPC?  I included a method to handle missing modules, though
I don't know what this particular version of RHEL requires.

As for the fancy XML input and output, one of the intended uses of this
tool is automatic generation of benchmark tests and results for
complex distributed systems, in conjunction with other tools under
development, which might not always have a Bourne shell available.
Additionally, the input system is meant to be as flexible and simple as
possible, so we try to avoid being "fancy" :)

As for a rewrite in plain Bourne shell, we are currently doing a
number of things that would be quite hard to do in a shell script
(though it is possible to have the tool generate Bourne shell scripts
for small portions of the run).  Around the 3.5 version time frame, we
will be doing things that cannot be done in a Bourne shell script,
related to distributed benchmark measurements.

> The blast benchmarks seem to require quite a bit of memory..  The
> test-systems we've gotten only have 1 GB memory, and that seems to be
> too little to run the blast benchmark without hitting swap, or
> fighting buffer cache. So, it would be nice if the memory requirement
> were reduced..

The bbsv1 benchmarks?  These are the originals from the previous
version.  The baselines are (eventually) supposed to cover the small
through huge cases.

What I didn't mention in the docs was database formatting.  You can use
formatdb -v N and choose N such that the indices do not overflow the
buffer cache.  One of the uses of this tool is to figure out where this
size might be, by examining BLAST execution time versus database volume
size.  I will try to work up an example of this in the next few days
and get it onto the page.
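
In the meantime, the idea can be sketched roughly as follows.  This is
not the BBS implementation, just an illustration under stated
assumptions: formatdb and blastall (NCBI toolkit) are on the PATH, the
file names and volume sizes are placeholders, and db_name(), timed()
and sweep() are hypothetical helpers.  Per the formatdb usage text, the
-v value is taken to be in millions of letters.

```python
import os
import subprocess
import time


def db_name(fasta, volume_size):
    """Hypothetical naming scheme: one database base name per volume size."""
    return "%s.v%d" % (fasta, volume_size)


def formatdb_cmd(fasta, volume_size, protein=True):
    """formatdb command splitting the database into volumes of size N (-v)."""
    return ["formatdb", "-i", fasta, "-p", "T" if protein else "F",
            "-v", str(volume_size), "-n", db_name(fasta, volume_size)]


def blastall_cmd(db, query, program="blastp"):
    """blastall command searching `db` with `query`, discarding the report."""
    return ["blastall", "-p", program, "-d", db, "-i", query,
            "-o", os.devnull]


def timed(cmd, runner=subprocess.run):
    """Run `cmd` and return its wall-clock time in seconds."""
    start = time.perf_counter()
    runner(cmd, check=True)
    return time.perf_counter() - start


def sweep(fasta, query, volume_sizes, runner=subprocess.run):
    """Reformat the database at each volume size, then time one BLAST run.

    Returns a list of (volume_size, blast_seconds) pairs.
    """
    results = []
    for n in volume_sizes:
        runner(formatdb_cmd(fasta, n), check=True)  # formatting is not timed
        seconds = timed(blastall_cmd(db_name(fasta, n), query), runner)
        results.append((n, seconds))
    return results
```

Something like sweep("nr.fa", "query.fa", [100, 250, 500, 1000]) would
give one timing per volume size; a sharp jump in the timings would
suggest where the volumes stop fitting in buffer cache.  Since the
state of the cache affects each measurement, repeating every run a few
times is advisable.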

> Also, it would be great if you could collect the results from the
> bbs-runs on different platforms and publish them on a webpage, a'la
> spec.org. 

I am trying to set up a reasonable set of baseline runs, and we were
seeking input on what they should include.  You indicate a need for
small memory footprint jobs, which we will be working on.  Once the
baseline is firmed up, we will enable a publication mechanism (via the
XML output).  Sort of SPEC-like, but hopefully with more meaning for the
informatics folks :)

Thanks, your comments and suggestions are quite helpful!

Joe

> 
> 
>   -jf
-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615