SSE & compilers
David van der Spoel
spoel at xray.bmc.uu.se
Tue Aug 21 07:11:59 PDT 2001
On Tue, 21 Aug 2001, Dan Kirkpatrick wrote:
>1. What do you know about SSE? Apparently the P III and P 4 have extra
>hardware for fp work which is ignored by current compilers but can buy big
>factors in speed...
>we can code in assembler for it pretty easily if the PIII processors have
We have rewritten the inner loops of our molecular dynamics code GROMACS
(http://www.gromacs.org) using SSE (and 3DNow!). This roughly doubles the
performance on Pentium IIIs. You do need a Linux 2.4 kernel for SSE; 3DNow!
will work with recent 2.2 kernels as well. Note that SSE is single
precision only on the P3. The P4 can also do double precision SSE, but
each SIMD instruction then handles only two values instead of the four it
handles in single precision. This effectively halves the SIMD throughput
relative to single precision.
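To illustrate the four-wide versus two-wide point, here is a minimal sketch
in C using compiler intrinsics. It is not taken from GROMACS; the function
names sum_sq_f32 and sum_sq_f64 are hypothetical, and it assumes a compiler
that ships emmintrin.h and defines __SSE2__ (one guard is used for both
functions for brevity, although the single precision part needs only SSE).
Without SSE2 the code falls back to plain scalar loops.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics; this header also provides the SSE ones */
#endif

/* Sum of squares in single precision: four floats per 128-bit register. */
float sum_sq_f32(const float *x, size_t n)
{
    assert(x != NULL || n == 0);
    float total = 0.0f;
    size_t i = 0;
#if defined(__SSE2__)
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 v = _mm_loadu_ps(x + i);          /* unaligned load of 4 floats */
        acc = _mm_add_ps(acc, _mm_mul_ps(v, v)); /* 4 multiplies and 4 adds */
    }
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif
    for (; i < n; ++i) /* scalar tail, or the whole loop without SSE2 */
        total += x[i] * x[i];
    return total;
}

/* Same loop in double precision: only two doubles per 128-bit register. */
double sum_sq_f64(const double *x, size_t n)
{
    assert(x != NULL || n == 0);
    double total = 0.0;
    size_t i = 0;
#if defined(__SSE2__)
    __m128d acc = _mm_setzero_pd();
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_loadu_pd(x + i);         /* unaligned load of 2 doubles */
        acc = _mm_add_pd(acc, _mm_mul_pd(v, v)); /* 2 multiplies and 2 adds */
    }
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    total = lanes[0] + lanes[1];
#endif
    for (; i < n; ++i) /* scalar tail, or the whole loop without SSE2 */
        total += x[i] * x[i];
    return total;
}
```

For the same array length, the double precision loop needs twice as many
SIMD iterations as the single precision one, which is where the factor of
two comes from.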
>2. Apparently there are several optimizing compilers out there (like
>portland) which do better than gcc. Any suggestions? Information on costs?
YMMV, but the Portland compiler did not help by more than 5% on our MD code
(the C/Fortran version, that is).
Dr. David van der Spoel, Biomedical center, Dept. of Biochemistry
Husargatan 3, Box 576, 75123 Uppsala, Sweden
phone: 46 18 471 4205 fax: 46 18 511 755
spoel at xray.bmc.uu.se spoel at gromacs.org http://zorn.bmc.uu.se/~spoel