Parallel BLAST

Steve Gaudet SGaudet at turbotekcomputer.com
Mon Apr 15 11:11:55 PDT 2002


> -----Original Message-----
> From: William R. Pearson [mailto:wrp at alpha0.bioch.virginia.edu]
> Sent: Sunday, April 14, 2002 10:32 PM
> To: beowulf at beowulf.org
> Subject: Parallel BLAST
> 
> 
>   
> > Why is it that BLAST is not available for MPI/PVM?  I would think
> > clusters would be the perfect host for such an application.
> > Is it that there is no need because BLAST is already so fast and
> > no one wants to break the database out onto node-resident disks?
> > Or is it that BLAST is kept running on single-processor or
> > shared-memory machines so that the DB is always in memory, ready
> > to roll without loading, and doing the same for a cluster is not
> > worth it because the same trick is difficult to do on a node given
> > the current way clusters are built?  I assume the same is true
> > for FASTA?
> 
> I suspect that BLAST is not available for MPI/PVM because (1) it is
> too fast, and (2) there is not much demand for it.  
> 
> 95% of the time, BLAST is almost an in-memory grep (the other 5% of
> the time it is working on the things it is looking for).  Sequence
> comparison is embarrassingly parallel, and very easily threaded.
> Distributing the sequence databases and collecting results has more
> overhead (there probably aren't many distributed grep programs
> either).  FASTA is 5-10X slower than BLAST, and Smith-Waterman is
> another 5-20X slower than FASTA.  For these slower programs, the
> communications overhead is low relative to the compute time, so
> distributed systems work OK for FASTA, and great for Smith-Waterman
> (where the overhead fraction is very small).
> 
> Of course, it is a lot easier to compile a threaded program, and just
> run it, than it is to install and configure the MPI or PVM environment
> and the programs to run in it.  Bioinformatics software is often run
> by computer savvy biologists, not high-performance computing folks,
> and not having to install and configure PVM/MPI is a big advantage.
> The NCBI probably does not make a PVM/MPI parallel BLAST because there
> is very little demand for it, and it does not meet their computational
> needs.
--------------
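
To put rough numbers on the overhead argument above, here is a minimal
sketch in Python. It assumes a fixed distribution cost per run (shipping
the database out and collecting results) and perfectly divisible compute.
The timings and node count are made-up illustrative values, not
measurements; only the 5-10X and 5-20X slowdown ranges come from the
quoted message.

```python
def overhead_fraction(compute_seconds, overhead_seconds, nodes):
    """Fraction of wall-clock time spent on distribution overhead.

    Assumes compute divides evenly across nodes and that the overhead
    is a fixed cost per run (both are simplifying assumptions).
    """
    parallel_compute = compute_seconds / nodes
    return overhead_seconds / (overhead_seconds + parallel_compute)


# Hypothetical values chosen only for illustration.
BLAST_SECONDS = 60.0      # single-node BLAST run time
OVERHEAD_SECONDS = 30.0   # distribute database + collect results
NODES = 8

# Slowdown relative to BLAST, using the low end of the quoted ranges:
# FASTA is 5X slower, Smith-Waterman another 5X slower than FASTA.
RELATIVE_COST = {"BLAST": 1, "FASTA": 5, "Smith-Waterman": 25}


def overhead_by_program():
    """Return the overhead fraction for each program under the model."""
    return {
        name: overhead_fraction(BLAST_SECONDS * factor, OVERHEAD_SECONDS, NODES)
        for name, factor in RELATIVE_COST.items()
    }


if __name__ == "__main__":
    for name, fraction in overhead_by_program().items():
        print(f"{name:15s} overhead fraction: {fraction:.2f}")
```

Under these assumed numbers the overhead dominates the BLAST run and
shrinks steadily for FASTA and Smith-Waterman, which is the shape of the
argument in the quoted message. The absolute values mean nothing outside
this toy model.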

There's also a commercial version from Turbogenomics.

http://www.turbogenomics.com

Offering:

1) Ready to go, plug-n-play solution for parallel BLAST
2) Expertise and 20+ years of experience in parallel computing
3) Dynamic database splitting feature to take advantage of computers that
have less memory than the size of the database
4) Smart load balancing - achieve linear to superlinear speedup
5) No modification made to the NCBI BLAST algorithm to ensure identical
results with the non-parallel version
6) Easy drop-in update whenever NCBI releases newer versions of their
algorithm
7) Excellent support
8) 30-day money-back guarantee
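
For readers wondering what database splitting (item 3) could look like
in general, here is a minimal Python sketch. It is not the Turbogenomics
implementation, which I have not described here. All names
(split_database, search_chunk, parallel_search) are hypothetical, the
"search" is a plain substring match standing in for a real alignment,
and the sketch ignores score statistics that depend on database size.

```python
from concurrent.futures import ThreadPoolExecutor


def split_database(sequences, max_chunk_bytes):
    """Greedily pack (name, sequence) records into chunks whose total
    sequence length stays within max_chunk_bytes.

    A single record longer than the limit still gets a chunk of its own.
    """
    chunks, current, current_size = [], [], 0
    for name, seq in sequences:
        size = len(seq)
        if current and current_size + size > max_chunk_bytes:
            chunks.append(current)
            current, current_size = [], 0
        current.append((name, seq))
        current_size += size
    if current:
        chunks.append(current)
    return chunks


def search_chunk(chunk, query):
    """Stand-in for a real search: exact substring match (grep-like)."""
    return [name for name, seq in chunk if query in seq]


def parallel_search(sequences, query, max_chunk_bytes, workers=4):
    """Search every chunk independently and merge hits in database order."""
    chunks = split_database(sequences, max_chunk_bytes)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in chunk order, so the merged hit list
        # matches what a single serial pass would produce.
        per_chunk = pool.map(search_chunk, chunks, [query] * len(chunks))
        return [name for hits in per_chunk for name in hits]


if __name__ == "__main__":
    db = [("a", "ACGTACGT"), ("b", "TTTT"), ("c", "GGACGT"), ("d", "CCCC")]
    print(parallel_search(db, "ACGT", max_chunk_bytes=10))
```

Threads are used here only to keep the sketch short. On a cluster each
chunk would go to a separate node, and the merge step is where the
result-collection overhead shows up.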


Cheers,


Steve Gaudet 
Linux Solutions Engineer
   ..... 
  <(©¿©)> 
 
===================================================================
| Turbotek Computer Corp.    tel:603-666-3062 ext. 21             |
| 8025 South Willow St.      fax:603-666-4519                     |
| Building 2, Unit 105       toll free:800-573-5393               |
| Manchester, NH 03103       e-mail:sgaudet at turbotekcomputer.com  |
|                            web: http://www.turbotekcomputer.com |
===================================================================

  


