[Beowulf] Re: [Bioclusters] error while running mpiblast (fwd from landman at scalableinformatics.com)
eugen at leitl.org
Wed Mar 2 02:51:11 PST 2005
----- Forwarded message from Joe Landman <landman at scalableinformatics.com> -----
From: Joe Landman <landman at scalableinformatics.com>
Date: Wed, 02 Mar 2005 00:25:05 -0500
To: "Clustering, compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
Subject: Re: [Bioclusters] error while running mpiblast
User-Agent: Mozilla Thunderbird 1.0 (X11/20041207)
Reply-To: "Clustering, compute farming & distributed computing in life science informatics" <bioclusters at bioinformatics.org>
James Cuff wrote:
>"Iam running this on SGI multiprocessor(numa)",
>you are running on a single shared (well near unified, and SGI do this
>very, very well) memory server with, as you said and appear to understand
>What on earth are you going to gain from MPI? Standard NCBI threads
>should do for you just fine, or maybe I've been smoking the funny stuff.
It is quite possible that mpiblast will scale better than NCBI blast
on this system. MPI forces you to pay attention to locality of
reference, so you tend to do a good job partitioning your code (that is,
if it scales). NCBI blast is built with pthreads, and I haven't seen it
scale much beyond about 10 CPUs on an SMP. The coarser grain of the
mpiblast partitioning (the pthread partitioning is very fine grain) will
very likely result in better scalability on a NUMA machine.
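
A minimal sketch of the coarse-grained idea described above, in Python for brevity. This is not mpiblast's code: the function names, the toy 3-mer scoring, and the plain loop standing in for separate MPI ranks are all assumptions made for illustration. The point is only that each worker touches its own fragment of the database and the partial results are merged at the end.

```python
import heapq


def split_database(records, n_fragments):
    """Split records into n_fragments contiguous, near-equal fragments."""
    if n_fragments < 1:
        raise ValueError("n_fragments must be at least 1")
    base, extra = divmod(len(records), n_fragments)
    fragments, start = [], 0
    for i in range(n_fragments):
        size = base + (1 if i < extra else 0)
        fragments.append(records[start:start + size])
        start += size
    return fragments


def kmers(seq, k=3):
    """Return the set of k-mers in seq (toy stand-in for real scoring)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def search_fragment(query, fragment, k=3):
    """Score every record in one fragment; only that fragment's data is read."""
    q = kmers(query, k)
    return [(len(q & kmers(seq, k)), name) for name, seq in fragment]


def partitioned_search(query, records, n_fragments, top=3):
    """Search each fragment independently, then merge the partial hit lists."""
    fragments = split_database(records, n_fragments)
    # In an MPI run each fragment would be searched by a separate rank;
    # here they run one after another so the sketch stays self-contained.
    partial = [search_fragment(query, f) for f in fragments]
    all_hits = [hit for hits in partial for hit in hits]
    return heapq.nlargest(top, all_hits)


if __name__ == "__main__":
    db = [("a", "ACGTACGT"), ("b", "TTTTTTTT"), ("c", "ACGTTTTT"),
          ("d", "GGGGGGGG"), ("e", "ACGAACGA")]
    print(partitioned_search("ACGTAC", db, n_fragments=3))
```

The merged result does not depend on how many fragments are used, which is what makes this kind of partitioning easy to spread across nodes.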
Not only that, but large multi-CPU NUMA machines have problems with
memory hotspots. I remember in the Origin days we used to play games
with DPLACE directives and whatnot else to control memory layout,
replication of pages, etc. This was under Irix, and there were rich sets
of tools to help. I don't think many of them are available under Linux
right now (possibly in the SGI ProPack). You don't see much of a problem
in 2/4 way systems. It becomes serious when you load data into a page,
and you start getting 16 requestors for that page. Page migration is not
a win here, since the page still lives on only one node. Read-only page
replication can be a huge win here. Luckily, with MPI, all references
are local to begin with ...
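
The hot-page argument above can be put into a toy model. This is only a sketch under stated assumptions: the policy names ("fixed", "migrate", "replicate") and the function `page_load` are made up for illustration and do not correspond to any Irix or Linux interface. It counts, for CPUs on several NUMA nodes all reading the same page, how many requests the busiest node has to serve and how many of the accesses are remote.

```python
from collections import Counter


def page_load(requestor_nodes, policy, home_node=0):
    """Toy model: return (max requests served by one node, remote accesses).

    requestor_nodes: NUMA node id of each CPU reading the same page.
    policy: "fixed"     - the page stays on home_node,
            "migrate"   - the page moves to the node with most requestors,
            "replicate" - every requesting node gets a read-only copy.
    """
    if not requestor_nodes:
        return 0, 0
    per_node = Counter(requestor_nodes)
    if policy == "replicate":
        # Each node reads its own copy: load is spread, nothing is remote.
        return max(per_node.values()), 0
    if policy == "fixed":
        owner = home_node
    elif policy == "migrate":
        owner = per_node.most_common(1)[0][0]
    else:
        raise ValueError("unknown policy: " + policy)
    total = len(requestor_nodes)
    # A single copy means one node serves every request, wherever it lives.
    return total, total - per_node[owner]


if __name__ == "__main__":
    # 16 CPUs spread over 8 nodes, two per node, all reading one page.
    nodes = [cpu // 2 for cpu in range(16)]
    for policy in ("fixed", "migrate", "replicate"):
        print(policy, page_load(nodes, policy))
```

With 16 requestors on 8 nodes, the model gives (16, 14) for both the fixed and the migrated page, and (2, 0) with replication: moving a single copy around does not relieve the hotspot, while read-only copies do.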
That said, I don't have ready access to one, so I cannot test this
hypothesis, though I might just throw together a BBS experiment to test
this. I'd love to play with a nice 9MB cache machine. This would be a
sweet blast engine :) Expensive... yes, but running out of cache is a
"good thing" (TM).
>If you _do_ happen to have multiple NUMA's in a cluster, (1) you are very
>lucky and (2) you should still listen to Joe's advice... Local is
>only local so far, try:
>(or as Joe maybe put better)
Lucas sent me a note indicating that in 1.3.0 they allow shared and
local storage to coexist. Aaron/Lucas, if you are about, could you
clarify some of this? I don't want to lead people astray (and I will
need to update the SGE tool).
> WFM, YMMV..
Note: We have not built the mpiblast RPM for Itanium (nor for that
matter, any of our RPMs). Is there any interest in this? Curious.
>James Cuff, D. Phil.
>Group Leader, Applied Production Systems
>Broad Institute of MIT and Harvard. 320 Charles Street,
>Cambridge, MA. 02141. Tel: 617-252-1925 Fax: 617-258-0903
>Bioclusters maillist - Bioclusters at bioinformatics.org
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615
----- End forwarded message -----
Eugen* Leitl <a href="http://leitl.org">leitl</a>
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE