[Beowulf] Re: vectors vs. loops

Tue May 3 04:58:23 PDT 2005

With all due respect, I think Robert is correct.  The majority of the
existing scientific code base is serial code with very limited (if any)
vectorizable or parallelizable content.  This is in large part due to
the way people write it.

Aside from that, simply looking at the field of bioinformatics, I would
be hard pressed to find a code capable of using long vectors out of the
box (HMMer can use short vectors, and even then a little rewriting is
needed).  These codes tend to be trivially/embarrassingly parallel though.

The issue may be one of labeling.  What you interpret as "the majority
of scientific codes" may be very different from what I interpret as "the
majority ..." and what Robert interprets as "the majority ...".
Supercomputing, and high performance computing in a more general sense,
is not just about numerically intensive codes anymore.  IDC reports that
the largest fractions of machines purchased for HPC in recent years have
been going to "scientific research" and "life science computing".  The
latter is effectively unvectorizable, and in the former only a small
fraction of the overall content is vectorizable.
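
To put a rough number on why a small vectorizable fraction matters,
here is a minimal sketch using Amdahl's law.  The function name, the
fractions and the 10x vector gain are my own illustrative assumptions,
not measurements from any real code or machine.

```python
def amdahl_speedup(vector_fraction, vector_gain):
    """Overall speedup when only `vector_fraction` of the original runtime
    is accelerated by a factor of `vector_gain` (Amdahl's law)."""
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must be between 0 and 1")
    if vector_gain <= 0.0:
        raise ValueError("vector_gain must be positive")
    serial_part = 1.0 - vector_fraction          # runs at the old speed
    vector_part = vector_fraction / vector_gain  # runs vector_gain times faster
    return 1.0 / (serial_part + vector_part)


if __name__ == "__main__":
    # Hypothetical fractions of runtime that vectorize, with an assumed 10x gain.
    for fraction in (0.1, 0.5, 0.9):
        print(fraction, round(amdahl_speedup(fraction, 10.0), 2))
```

Under these assumptions a code with only 10% vectorizable runtime barely
moves, while one at 90% gets a bit over half of the 10x, so which codes
you count as "the majority" decides the conclusion.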

The question is whether or not you consider BLASTing 10000 ESTs vs the 
nt database to be a supercomputing problem.  I do (as do a fair number 
of others).  Many old-timer linear algebra folks do not.
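
For what "embarrassingly parallel" means for a job like this, a minimal
sketch follows.  search_chunk is a hypothetical stand-in that only
computes a toy score (it does not call BLAST), and threads are used just
to keep the example self-contained; on a cluster each chunk would
normally be a separate job on a separate node.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Deal items round-robin into n_chunks independent lists."""
    return [items[i::n_chunks] for i in range(n_chunks)]


def search_chunk(queries):
    """Hypothetical stand-in for one independent search job.

    A real job would search each query against the database; here we
    return the query length as a toy score.
    """
    return {query: len(query) for query in queries}


def run_farm(queries, n_workers=4):
    """Run every chunk independently and merge the partial results."""
    chunks = split_into_chunks(queries, n_workers)
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # No chunk needs data from any other chunk, so the only
        # communication is collecting the results at the end.
        for partial in pool.map(search_chunk, chunks):
            results.update(partial)
    return results


if __name__ == "__main__":
    toy_queries = ["ACGT" * k for k in range(1, 11)]
    print(len(run_farm(toy_queries)))
```

The point of the sketch is that the queries never talk to each other,
which is why these jobs scale across nodes without any vector hardware.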

Joe

Joachim Worringen wrote:
> Robert G. Brown wrote:
> 
>> However, most code doesn't vectorize too well (even, as you say, with
>> directives), so people would end up getting 25 MFLOPs out of 300 MFLOPs
>> possible -- faster than a desktop, sure, but using a multimillion dollar
>> machine to get a factor of MAYBE 10 in speedup compared to (at the time)
>> $5-10K machines.  In the meantime, I'm sure that there were people who
>> had code that DID vectorize well pulling their hair because of all those
>> 100 hour accounts that basically wasted 90% of the resource.
> 
> 
> This general statement is just wrong. Many scientific codes *do* 
> vectorize well, and in this case, you do not get <10% of peak as you 
> indicated, but typically 30 to 50% or even more (see e.g. Leonid 
> Oliker's recent papers on this topic) for the *overall application* (the 
> vectorized loop alone is close to 100%). This is a significant difference.
> 
> Another 'legend' comes up in your statement: a single node of a vector 
> machine does not cost "multimillion" dollars. The price factor is quite 
> close to the performance factor for suitable applications.
> 
>> My own code doesn't vectorize too well because it isn't heavy on linear
>> algebra and the loops are over 3 dimensional lattices where
>> nearest-neighbor sums CANNOT be local in a single memory stream and
> 
> 
> Real vector architectures have very efficient scatter/gather memory 
> operations, and support indirect addressing efficiently as well.
> 
>  Joachim
> 

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: landman at scalableinformatics.com
web  : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 734 786 8452
cell : +1 734 612 4615