Parallel BLAST - help
ting at fai.fujitsu.com
Tue Apr 16 15:08:59 PDT 2002
I have a three-node Beowulf cluster with an MPI environment up and running now,
and I have downloaded the FASTA data from NCBI onto the master node.
I successfully wrote code to split the data,
but unfortunately I do not have runnable code to get the
data back from the nodes to the host (master). :-(
Can anyone suggest an approach, or a web site where I
can find runnable code to use? It would help me a lot.
Thank you very much.
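One way the collection step could look, sketched in Python and not taken from any existing parallel BLAST code: each node returns its local hits as a list sorted by descending score, and the master merges those lists into a single ranking. The function name `merge_hits` and the `(score, subject_id)` tuple layout are assumptions made for illustration. With mpi4py, the per-node lists would typically reach the master through `comm.gather(local_hits, root=0)`; here they are passed in directly so the merge logic runs without a cluster.

```python
import heapq
from itertools import islice


def merge_hits(per_node_hits, top_n=None):
    """Merge per-node hit lists into one list ordered by descending score.

    per_node_hits: one list per node of (score, subject_id) tuples, each
    list already sorted by descending score (an assumed layout).
    top_n: keep only the best top_n hits; None keeps everything.
    """
    # heapq.merge streams the already-sorted inputs without re-sorting them.
    merged = heapq.merge(*per_node_hits, key=lambda hit: hit[0], reverse=True)
    return list(islice(merged, top_n))


if __name__ == "__main__":
    # Hypothetical results from three nodes.
    node_results = [
        [(98.0, "seqA"), (41.5, "seqD")],
        [(77.2, "seqB")],
        [(88.1, "seqC"), (12.0, "seqE")],
    ]
    for score, subject in merge_hits(node_results, top_n=3):
        print(score, subject)
```

The merge only makes sense if the scores are comparable across database pieces; statistics that depend on the total database size would need to be computed against the full database, not against each piece.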
From: Steve Gaudet
Sent: Monday, April 15, 2002 11:12 AM
To: 'William R. Pearson'; beowulf at beowulf.org
Subject: RE: Parallel BLAST
> -----Original Message-----
> From: William R. Pearson
> Sent: Sunday, April 14, 2002 10:32 PM
> To: beowulf at beowulf.org
> Subject: Parallel BLAST
> > Why is it that BLAST is not available for MPI/PVM? I would think
> > clusters would be the perfect host for such an application.
> > Is it that there is no need because BLAST is already so fast and
> > no one wants to break the database out onto node-resident disks?
> > Or is it that BLAST is kept running on single-processor or
> > shared-memory machines so that the DB is always in memory, ready to
> > roll without loading, and doing the same for a cluster is not worth
> > it because the same trick is difficult to do on a node given the
> > current way clusters are built? I assume the same is true for FASTA?
> I suspect that BLAST is not available for MPI/PVM because (1) it is
> too fast, and (2) there is not much demand for it.
> 95% of the time, BLAST is almost an in-memory grep (the other 5% of
> the time it is working on the things it is looking for). Sequence
> comparison is embarrassingly parallel, and very easily threaded.
> Distributing the sequence databases and collecting results has more
> overhead (there probably aren't many distributed grep programs
> either). FASTA is 5-10X slower than BLAST, and Smith-Waterman is
> another 5-20X slower than FASTA. Here, the communications overhead is
> low, and distributed systems work OK for FASTA, and great for
> Smith-Waterman (where the overhead fraction is very small).
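To make the overhead argument concrete, here is a back-of-envelope sketch. The timings are invented for illustration only; just the relative slowdowns (FASTA about 5-10X slower than BLAST, Smith-Waterman another 5-20X) come from the quoted text. It assumes a fixed communication cost per search and computes the fraction of total time spent on it.

```python
def overhead_fraction(compute_seconds, comm_seconds):
    """Fraction of total time spent on communication rather than computation."""
    return comm_seconds / (compute_seconds + comm_seconds)


# Hypothetical numbers: 10 s of BLAST compute, 5 s to distribute and collect.
blast_compute = 10.0
comm = 5.0
fasta_compute = blast_compute * 5  # low end of the quoted 5-10X range
sw_compute = fasta_compute * 5     # low end of the quoted 5-20X range

for name, compute in [
    ("BLAST", blast_compute),
    ("FASTA", fasta_compute),
    ("Smith-Waterman", sw_compute),
]:
    print(f"{name}: {overhead_fraction(compute, comm):.1%} overhead")
```

With these made-up inputs the communication share falls from about a third of the run for BLAST to about 2% for Smith-Waterman, which is the shape of the argument: the slower the comparison, the less the distribution cost matters.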
> Of course, it is a lot easier to compile a threaded program, and just
> run it, than it is to install and configure the MPI or PVM environment
> and the programs to run in it. Bioinformatics software is often run
> by computer savvy biologists, not high-performance computing folks,
> and not having to install and configure PVM/MPI is a big advantage.
> The NCBI probably does not make a PVM/MPI parallel BLAST because there
> is very little demand for it, and it does not meet their computational
There's also a commercial version from Turbogenomics.
1) Ready to go, plug-n-play solution for parallel BLAST
2) Expertise and 20+ years of experience in parallel computing
3) Dynamic database splitting feature to take advantage of computers that
have less memory than the size of the database
4) Smart load balancing - achieve linear to superlinear speedup
5) No modification made to the NCBI BLAST algorithm to ensure identical
results with the non-parallel version
6) Easy drop-in update whenever NCBI releases newer versions of their
7) Excellent support
8) 30-day money-back guarantee
Linux Solutions Engineer
| Turbotek Computer Corp. tel:603-666-3062 ext. 21 |
| 8025 South Willow St. fax:603-666-4519 |
| Building 2, Unit 105 toll free:800-573-5393 |
| Manchester, NH 03103 e-mail:sgaudet at turbotekcomputer.com |
| web: http://www.turbotekcomputer.com |