Diskless for Bioinformatics?

Tim Carlson tim.carlson at pnl.gov
Fri Jun 22 14:34:14 PDT 2001


On Fri, 22 Jun 2001, William Pearson wrote:

If you are going to use off-the-shelf NCBI BLAST, then your runs are
trivially parallel and will scale that way.
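
To make "trivially parallel" concrete, here is a rough sketch of one way to carve up a query set so each node runs its own independent BLAST. The helper names are made up for illustration, and it assumes the legacy blastall front end with protein queries (blastp); adjust to whatever you actually run.

```python
def split_fasta(text, n_chunks):
    """Split FASTA text into up to n_chunks pieces without cutting a record."""
    records, current = [], []
    for line in text.splitlines():
        # A header line starts a new record; flush the previous one first.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    # Deal records out round-robin so the chunks hold similar record counts.
    chunks = [records[i::n_chunks] for i in range(n_chunks)]
    return ["\n".join(c) + "\n" for c in chunks if c]


def blast_command(query_file, database, out_file):
    """Build the command line for one independent blastall run (not executed here)."""
    # Assumption: protein queries against a protein database, hence blastp.
    return ["blastall", "-p", "blastp", "-d", database,
            "-i", query_file, "-o", out_file]
```

Each chunk goes to a different node or CPU, and nothing has to talk to anything else until you collect the output files at the end.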

Surprisingly, you don't gain much speed going with a P4 for doing BLAST
runs. P3s do just nicely. Our dual P3/800 performs at about 60% of the
speed of our dual P4/1.7 (similarly equipped with RAM and disk) when
blasting against the NR database, even though it runs at less than half
the clock rate.
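
For what it's worth, the arithmetic behind that comparison can be written out as below. It uses only the clock rates and the 60% figure quoted above; the function name is just for illustration.

```python
def per_clock_ratio(slow_mhz, fast_mhz, relative_speed):
    """Work done per clock by the slower-clocked box, relative to the faster one.

    relative_speed is the slower box's throughput as a fraction of the
    faster box's throughput (0.60 for the P3/800 versus the P4/1.7).
    """
    clock_ratio = slow_mhz / fast_mhz  # 800 / 1700 is roughly 0.47
    return relative_speed / clock_ratio


# P3/800 versus P4/1.7 GHz at 60% of the throughput.
ratio = per_clock_ratio(800, 1700, 0.60)
print(f"P3 does about {ratio:.2f}x the BLAST work per clock")
```

So the P4 is faster overall, but not by anything like what the clock rates suggest.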

You just need to think about how you are going to get the databases out to
the nodes. This should be a small fraction of the compute time, since we
assume you are going to do a *bunch* of blasts against each db.
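
One possible way to handle the staging, assuming the nodes have local disk and can see a shared directory holding the formatted database: copy the files to local scratch once, then skip the copy on later runs. The function name and directory layout are hypothetical, and the "already staged" check is deliberately crude (file size only).

```python
import shutil
from pathlib import Path


def stage_database(shared_dir, local_dir, db_name):
    """Copy db_name.* from shared_dir to local_dir; return the names copied."""
    src, dst = Path(shared_dir), Path(local_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for f in sorted(src.glob(db_name + ".*")):
        target = dst / f.name
        # Crude freshness check: same size means "assume already staged".
        if target.exists() and target.stat().st_size == f.stat().st_size:
            continue
        shutil.copy2(f, target)
        copied.append(f.name)
    return copied
```

The first job on a node pays for the copy; every job after that just reads the local files, which is why the transfer cost fades once you run many searches per database.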

I haven't looked into the commercial versions of BLAST, but if you are
going to be doing a lot of that stuff, it may not hurt to look, provided
you have enough money :)

> > Just a little genetics research firm, needing some serious horsepower to
> > start running big hammer and blast jobs.  The data we have now is just the
> > bare minimum we need to get by, but if we had things like a working beowulf
> > the scientists upstairs would start making, since they'd be able to use it,
> > much more data.
>
> I think you want disks - they make it easier to debug a node separately,
> and for your BLAST applications (which will not run in parallel, you
> must run many separate instances) you can have all 16-32-64 CPUs
> loading up the database independently.
>


Tim Carlson
Voice: (509) 375-5978
Email: Tim.Carlson at pnl.gov




