Sequence analysis (blast/fasta/hmmer) on Beowulfs

William Pearson wrp at virginia.edu
Sat Aug 11 11:12:44 PDT 2001


The fasta package (ftp.virginia.edu/pub/fasta/fasta3.shar.Z) provides
virtually all of its programs (including SSEARCH for Smith-Waterman)
under either the PVM or MPI environment, so I would expect the MPI
versions to work on a Scyld/Beowulf cluster.  Earlier versions of the
programs required that the sequence databases be visible to the worker
nodes, but with the current version of the PVM/MPI parallel programs,
only the manager/host process needs to have access to the databases.
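
To illustrate the manager/worker arrangement described above, here is a
minimal Python sketch.  It is not the fasta package's actual protocol:
threads stand in for PVM/MPI worker processes, toy_score() is a made-up
k-mer count standing in for Smith-Waterman, and every function name is
hypothetical.  The point is only that the manager parses the database
and hands sequences to the workers in memory, so the workers never open
a database file.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_fasta(text):
    """Return a list of (name, sequence) pairs from FASTA-formatted text."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Use the first word of the header line as the sequence name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def toy_score(query, subject, k=3):
    """Toy similarity: distinct k-mers shared (an assumption, NOT Smith-Waterman)."""
    def kmers(s):
        return {s[i:i + k] for i in range(len(s) - k + 1)}
    return len(kmers(query) & kmers(subject))


def worker(query, batch):
    """Score one batch; it receives sequences in memory and opens no files."""
    return [(toy_score(query, seq), name) for name, seq in batch]


def manager_search(query, database_text, n_workers=3, top=2):
    """Manager: parse the database, deal it out to workers, merge the hits."""
    records = parse_fasta(database_text)  # only the manager sees the database
    batches = [records[i::n_workers] for i in range(n_workers)]
    # Threads are a stand-in for PVM/MPI worker processes on other nodes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(lambda b: worker(query, b), batches))
    hits = [hit for batch_hits in results for hit in batch_hits]
    hits.sort(key=lambda h: (-h[0], h[1]))  # best score first, then by name
    return hits[:top]
```

Calling manager_search(query, database_text) returns the top-scoring
(score, name) pairs.  In a real PVM/MPI program the batches would be
sent as messages rather than passed to threads.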

We have just upgraded our Linux cluster to RedHat 7.1, and the MPI
versions of the programs no longer work on more than two nodes (they
work on 2 nodes just fine, but with 3 or more, MPI does not start up
properly).

Somewhere on the WWW is a reference to a parallel implementation of
BLAST.  We worked on this several years ago, but I do not believe there
is a generally available PVM/MPI implementation of the current BLAST
version.  Instead, parallel versions of BLAST use threads on shared
memory machines, and there are Perl scripts that automatically send out
individual sequences to individual machines in a cluster and collect the
results.  The script approach might be a bit of a challenge under Scyld,
because the databases would have to be visible on the cluster/node
machines.
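
The send-out-and-collect approach can be sketched roughly as follows.
This is not one of the Perl scripts mentioned above; it is a
hypothetical Python outline with made-up function and node names, and
it leaves out the step that actually runs the search on each node.  It
only shows dealing queries out round-robin and putting the results back
in the original query order.

```python
from itertools import cycle


def assign_queries(query_names, nodes):
    """Round-robin: map each node name to the list of queries it should run."""
    if not nodes:
        raise ValueError("at least one node is required")
    plan = {node: [] for node in nodes}
    for query, node in zip(query_names, cycle(nodes)):
        plan[node].append(query)
    return plan


def collect(query_names, per_node_results):
    """Merge per-node {query: result} dicts back into the original query order."""
    merged = {}
    for results in per_node_results:
        merged.update(results)
    return [(query, merged[query]) for query in query_names]
```

Each node would still need to read the database itself, which is the
difficulty noted above for Scyld.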

I believe there is also a PVM implementation of HMMER available from
Sean Eddy's group.

Bill Pearson



