[Beowulf] Re: vectors vs. loops
landman at scalableinformatics.com
Tue May 3 04:58:23 PDT 2005
With all due respect, I think Robert is correct. The majority of the
existing scientific code base is serial code with very limited (if any)
vectorizable or parallelizable content. This is in large part due to
the way people write these codes.
Aside from that, simply looking at the field of bioinformatics, I would
be hard pressed to find a code capable of using long vectors out of the
box (HMMer can use short vectors, and even then a little rewriting is
needed). These codes tend to be trivially/embarrassingly parallel, though.
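To make "embarrassingly parallel" concrete, here is a minimal sketch in Python. The `score` function is a hypothetical stand-in for a per-sequence search (it is not HMMer or BLAST), and the thread pool only illustrates the structure; a real run would spread the queries over separate processes or cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def score(query: str, target: str) -> int:
    """Hypothetical per-sequence work: count positions where the two strings agree."""
    return sum(1 for a, b in zip(query, target) if a == b)


def search_all(queries, target, workers=4):
    # Each query is scored independently of every other query, so the
    # work can be split across any number of workers with no
    # communication between them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: score(q, target), queries))


if __name__ == "__main__":
    print(search_all(["ACGT", "ACGA", "TTTT"], "ACGT"))
```

The result order matches the query order, so splitting the query list into chunks and concatenating the per-chunk results gives the same answer as a serial run.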
The issue may be one of labeling. What you interpret as "the majority
of scientific codes" may be very different than what I interpret as "the
majority ..." and what Robert interprets as "the majority ...".
Supercomputing, and high performance computing in a more general sense,
is not just about numerically intensive codes anymore. IDC reports that
the largest fractions of machines purchased for HPC in recent years have
been going to "scientific research" and "life science computing". The
latter is effectively unvectorizable, and the former has only a small
fraction of overall content that is vectorizable.
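Why a small vectorizable fraction matters can be sketched with Amdahl's law. The function name and the 10x vector speedup below are illustrative assumptions, not measurements from any machine discussed in this thread.

```python
def overall_speedup(vector_fraction: float, vector_speedup: float) -> float:
    """Amdahl's law: only vector_fraction of the original runtime is accelerated."""
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must lie in [0, 1]")
    if vector_speedup <= 0.0:
        raise ValueError("vector_speedup must be positive")
    # The serial remainder runs at its old speed; the vectorizable part
    # shrinks by the factor vector_speedup.
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / vector_speedup)


if __name__ == "__main__":
    # Assumed 10x speedup on the vectorizable part, for illustration only.
    for f in (0.1, 0.5, 0.9, 0.99):
        print(f"vectorizable fraction {f:.2f}: {overall_speedup(f, 10.0):.2f}x overall")
```

Under these assumptions a code that is half vectorizable gains less than a factor of 2 overall, however fast the vector unit is.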
The question is whether or not you consider BLASTing 10000 ESTs vs the
nt database to be a supercomputing problem. I do (as do a fair number
of others). Many old-timer linear algebra folks do not.
Joachim Worringen wrote:
> Robert G. Brown wrote:
>> However, most code doesn't vectorize too well (even, as you say, with
>> directives), so people would end up getting 25 MFLOPs out of 300 MFLOPs
>> possible -- faster than a desktop, sure, but using a multimillion dollar
>> machine to get a factor of MAYBE 10 in speedup compared to (at the time)
>> $5-10K machines. In the meantime, I'm sure that there were people who
>> had code that DID vectorize well pulling their hair because of all those
>> 100 hour accounts that basically wasted 90% of the resource.
> This general statement is just wrong. Many scientific codes *do*
> vectorize well, and in this case, you do not get <10% of peak as you
> indicated, but typically 30 to 50% or even more (see e.g. Leonid
> Oliker's recent papers on this topic) for the *overall application* (the
> vectorized loop alone is close to 100%). This is a significant difference.
> Another 'legend' comes up in your statement: a single node of a vector
> machine does not cost "multimillion" dollars. The price factor is quite
> close to the performance factor for suitable applications.
>> My own code doesn't vectorize too well because it isn't heavy on linear
>> algebra and the loops are over 3 dimensional lattices where
>> nearest-neighbor sums CANNOT be local in a single memory stream and
> Real vector architectures have very efficient scatter/gather memory
> operations, and support indirect addressing efficiently as well.
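For readers unfamiliar with the terms, the access pattern under discussion can be sketched as follows. This is an illustration only, assuming a periodic n x n x n lattice stored as a flat array with a six-neighbor stencil; it is not Robert's code, and the function names are hypothetical.

```python
def build_neighbors(n: int):
    """Flat indices of the six nearest neighbors of every site on a periodic n^3 lattice."""
    def idx(x, y, z):
        # Periodic wrap-around in each dimension (an assumption of this sketch).
        return (x % n) * n * n + (y % n) * n + (z % n)

    table = []
    for x in range(n):
        for y in range(n):
            for z in range(n):
                table.append((
                    idx(x + 1, y, z), idx(x - 1, y, z),
                    idx(x, y + 1, z), idx(x, y - 1, z),
                    idx(x, y, z + 1), idx(x, y, z - 1),
                ))
    return table


def neighbor_sums(field, table):
    # Gather: each site reads its neighbors through the index table
    # (indirect addressing) instead of from one contiguous memory stream.
    return [sum(field[j] for j in nbrs) for nbrs in table]


if __name__ == "__main__":
    n = 3
    sums = neighbor_sums([1.0] * n ** 3, build_neighbors(n))
    print(min(sums), max(sums))
```

Only the z-direction neighbors sit next to the site in memory; the y and x neighbors are n and n*n elements away, which is the stride that hardware gather support is meant to handle.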
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452
cell : +1 734 612 4615