[Beowulf] Re: vectors vs. loops

Eugen Leitl eugen at leitl.org
Fri May 6 05:36:14 PDT 2005


On Wed, May 04, 2005 at 03:03:51PM -0400, Robert G. Brown wrote:
> On Wed, 4 May 2005, Eugen Leitl wrote:

> > There are tricks to optimize available memory bandwidth on modern x86
> > architectures though, as described in
> > 
> > http://leitl.org/docs/comp/AMD_block_prefetch_paper.pdf
> > 
> > (and far more in http://leitl.org/docs/comp/AMD64softoptguide.pdf ).
> 
> Awesome documents -- very informative!  I'm saving copies for my own

Thanks! That's the reason I mirrored them.

> edification (presuming that is permitted by their respective licenses).

These are freely available whitepapers and manuals from AMD. I haven't seen
any license restricting their use. 
 
> Do you have any idea how the "fully optimized loops" in the example code
> compare timewise to gcc results for obvious implementations of the same
> loops, or ditto for other compilers?  How necessary is it for us to

I don't recall whether they posted all the benchmarks, but IIRC the pure C
variant doesn't give the 300% (vs naive) boost, as the compiler doesn't
generate the required (MOVNTQ?) instruction.
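
For illustration only, here is a minimal sketch of how a non-temporal store can be requested from C through compiler intrinsics instead of inline assembler. It is not the code from the AMD paper and does not implement its block prefetch scheme. It swaps in the SSE2 streaming store (MOVNTDQ, via `_mm_stream_si128`) rather than the MMX MOVNTQ mentioned above, and the helper name `stream_copy` and the prefetch distance of 256 bytes are assumptions made for the example. Whether this gets close to the hand-tuned assembler numbers would have to be measured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; also provides _mm_prefetch and _mm_sfence */
#endif

/* Hypothetical helper: copy n bytes from src to dst.
 * The streaming path is taken only when both pointers are 16-byte aligned
 * and n is a multiple of 64; otherwise it falls back to memcpy. */
static void stream_copy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    if ((uintptr_t)dst % 16 == 0 && (uintptr_t)src % 16 == 0 && n % 64 == 0) {
        const __m128i *s = (const __m128i *)src;
        __m128i *d = (__m128i *)dst;
        size_t vecs = n / 16;               /* number of 16-byte vectors */

        for (size_t i = 0; i < vecs; i += 4) {
            /* Hint the hardware about data 256 bytes ahead (assumed distance),
             * but only while that address is still inside the source buffer. */
            if (i + 16 < vecs)
                _mm_prefetch((const char *)(s + i + 16), _MM_HINT_NTA);

            __m128i a = _mm_load_si128(s + i);
            __m128i b = _mm_load_si128(s + i + 1);
            __m128i c = _mm_load_si128(s + i + 2);
            __m128i e = _mm_load_si128(s + i + 3);

            /* Non-temporal stores: write around the cache. */
            _mm_stream_si128(d + i, a);
            _mm_stream_si128(d + i + 1, b);
            _mm_stream_si128(d + i + 2, c);
            _mm_stream_si128(d + i + 3, e);
        }
        _mm_sfence();                       /* make the streamed stores globally visible */
        return;
    }
#endif
    memcpy(dst, src, n);                    /* portable fallback */
}
```

A fair comparison would time this against a plain `memcpy` and a naive C loop on buffers much larger than the cache, since streaming stores are only expected to help when the destination is not reused soon.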

> start inlining assembler in order to get a threefold improvement in
> effective throughput in a straightforward core loop?  Do compilers

I think you have to use inline assembler for the full speed boost.

> automatically use block prefetch and three phase implementations of the
> floating point involved?

Given that gcc 4.0 is ante portas, things might have changed in that respect.

If you do benchmarks, can you please post full numbers?

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a>
______________________________________________________________
ICBM: 48.07078, 11.61144            http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
http://moleculardevices.org         http://nanomachines.net