[Beowulf] Re: vectors vs. loops
Eugen Leitl eugen at leitl.org
Fri May 6 05:36:14 PDT 2005
On Wed, May 04, 2005 at 03:03:51PM -0400, Robert G. Brown wrote:
> On Wed, 4 May 2005, Eugen Leitl wrote:
>
> > There are tricks to optimize available memory bandwidth on modern x86
> > architectures though, as described in
> >
> > http://leitl.org/docs/comp/AMD_block_prefetch_paper.pdf
> >
> > (and far more in http://leitl.org/docs/comp/AMD64softoptguide.pdf ).
>
> Awesome documents -- very informative! I'm saving copies for my own

Thanks! That's the reason I mirrored them.

> edification (presuming that is permitted by their respective licenses).

These are freely available whitepapers and manuals from AMD. I haven't seen
any license restricting their use.

> Do you have any idea how the "fully optimized loops" in the example code
> compare timewise to gcc results for obvious implementations of the same
> loops, or ditto for other compilers? How necessary is it for us to

I don't recall whether they posted all the benchmarks, but IIRC the pure C
variant doesn't give the 300% (vs naive) boost, as the compiler doesn't
generate the required (MOVNTQ?) instruction.

> start inlining assembler in order to get a threefold improvement in
> effective throughput in a straightforward core loop? Do compilers

I think you have to use inline assembler for the full speed boost.

> automatically use block prefetch and three phase implementations of the
> floating point involved?

Given that gcc 4.0 is ante portas, things might have changed in that respect.
If you do benchmarks, can you please post full numbers?

-- 
Eugen* Leitl <a href="http://leitl.org">leitl</a>
______________________________________________________________
ICBM: 48.07078, 11.61144 http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
http://moleculardevices.org http://nanomachines.net
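For readers who want a concrete picture of the kind of loop under discussion, here is a minimal sketch in C of a copy that combines software prefetch with non-temporal (streaming) stores. It is not the code from the AMD paper, and it has not been benchmarked. The function name stream_copy and the PREFETCH_AHEAD distance are made up for illustration. It uses the SSE2 intrinsic _mm_stream_si128 (which maps to MOVNTDQ) in place of the MMX MOVNTQ mentioned above, and it falls back to plain memcpy when SSE2 is not available at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics; also pulls in _mm_prefetch and _mm_sfence */
#endif

/* Assumed prefetch distance in bytes; a real value would need tuning per CPU. */
#define PREFETCH_AHEAD 512

/* Hypothetical helper: copy n bytes from src to dst (buffers must not overlap),
 * using non-temporal stores so the destination does not pollute the cache. */
static void stream_copy(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;

    /* Streaming stores need a 16-byte aligned destination: copy bytes until aligned. */
    while (n > 0 && ((uintptr_t)d & 15u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Main loop: one 64-byte chunk (a typical cache line) per iteration. */
    while (n >= 64) {
        /* Hint the source data we will need soon; only when it is still in bounds. */
        if (n > PREFETCH_AHEAD)
            _mm_prefetch((const char *)(s + PREFETCH_AHEAD), _MM_HINT_NTA);

        __m128i a = _mm_loadu_si128((const __m128i *)(s +  0));
        __m128i b = _mm_loadu_si128((const __m128i *)(s + 16));
        __m128i c = _mm_loadu_si128((const __m128i *)(s + 32));
        __m128i e = _mm_loadu_si128((const __m128i *)(s + 48));

        /* Non-temporal stores write around the cache. */
        _mm_stream_si128((__m128i *)(d +  0), a);
        _mm_stream_si128((__m128i *)(d + 16), b);
        _mm_stream_si128((__m128i *)(d + 32), c);
        _mm_stream_si128((__m128i *)(d + 48), e);

        s += 64;
        d += 64;
        n -= 64;
    }

    /* Make the streaming stores globally visible before returning. */
    _mm_sfence();

    /* Tail: whatever is left (fewer than 64 bytes). */
    memcpy(d, s, n);
#else
    /* No SSE2 at compile time: plain copy. */
    memcpy(dst, src, n);
#endif
}
```

Timing this against a plain memcpy and against a naive loop, on buffers much larger than the cache, would be one way to check whether a given compiler gets anywhere near the hand-written assembler figures without inline assembly.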
