SSE & compilers

Don Holmgren djholm at
Tue Aug 21 07:50:27 PDT 2001

1. I've written SSE versions of several of the low-level linear algebra routines
specific to our work (su3 matrix-matrix and matrix-vector operations, where the
matrices are 3x3 complex and the vectors are 3x1 complex).  Details about
performance on P-III and P-IV, the kernel patch from Andrea Arcangeli necessary
for the 2.2 kernels, and using the NASM assembler (as well as the new binutils
and gcc) are

2. The Intel compilers (C/C++/Fortran), currently in beta release (i.e., free for
a little while), support SSE/SSE2 and will do some vectorization.  The C compiler
does not succeed in vectorizing su3 algebra, but on P-IVs we've seen substantial
improvement on floating-point-intensive code with this compiler.
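
To make item 1 concrete, here is a minimal sketch of an su3 matrix-vector
multiply in plain C and with SSE.  It is an illustration only, not the routines
described above: it uses compiler intrinsics from xmmintrin.h rather than NASM,
and the type names, function names, and data layout (complex_f, su3_matrix,
su3_vector, mult_su3_mat_vec_scalar, mult_su3_mat_vec_sse) are assumptions made
for the example.

```c
#include <xmmintrin.h>  /* SSE single-precision intrinsics */

/* Hypothetical layout: interleaved real/imaginary single-precision parts. */
typedef struct { float re, im; } complex_f;
typedef struct { complex_f e[3][3]; } su3_matrix;  /* 3x3 complex matrix */
typedef struct { complex_f c[3]; } su3_vector;     /* 3x1 complex vector */

/* Scalar reference: out = a * b. */
void mult_su3_mat_vec_scalar(const su3_matrix *a, const su3_vector *b,
                             su3_vector *out)
{
    for (int i = 0; i < 3; i++) {
        float re = 0.0f, im = 0.0f;
        for (int j = 0; j < 3; j++) {
            re += a->e[i][j].re * b->c[j].re - a->e[i][j].im * b->c[j].im;
            im += a->e[i][j].re * b->c[j].im + a->e[i][j].im * b->c[j].re;
        }
        out->c[i].re = re;
        out->c[i].im = im;
    }
}

/* SSE sketch: the three output rows occupy lanes 0..2 of a 4-float
 * register (lane 3 is padding), and the loop runs over columns. */
void mult_su3_mat_vec_sse(const su3_matrix *a, const su3_vector *b,
                          su3_vector *out)
{
    __m128 acc_re = _mm_setzero_ps();
    __m128 acc_im = _mm_setzero_ps();

    for (int j = 0; j < 3; j++) {
        /* Column j of a; _mm_set_ps takes its arguments from lane 3 to 0. */
        __m128 a_re = _mm_set_ps(0.0f, a->e[2][j].re, a->e[1][j].re,
                                 a->e[0][j].re);
        __m128 a_im = _mm_set_ps(0.0f, a->e[2][j].im, a->e[1][j].im,
                                 a->e[0][j].im);
        /* Broadcast b[j] to all four lanes. */
        __m128 b_re = _mm_set1_ps(b->c[j].re);
        __m128 b_im = _mm_set1_ps(b->c[j].im);

        /* (a_re + i a_im)(b_re + i b_im), accumulated per row. */
        acc_re = _mm_add_ps(acc_re, _mm_sub_ps(_mm_mul_ps(a_re, b_re),
                                               _mm_mul_ps(a_im, b_im)));
        acc_im = _mm_add_ps(acc_im, _mm_add_ps(_mm_mul_ps(a_re, b_im),
                                               _mm_mul_ps(a_im, b_re)));
    }

    float re[4], im[4];
    _mm_storeu_ps(re, acc_re);  /* unaligned stores; lane 3 is discarded */
    _mm_storeu_ps(im, acc_im);
    for (int i = 0; i < 3; i++) {
        out->c[i].re = re[i];
        out->c[i].im = im[i];
    }
}
```

Gathering each column with _mm_set_ps keeps the sketch short but costs
shuffles; a tuned routine would more likely store the data in a layout that
loads directly into registers.  On 32-bit x86, gcc needs -msse to compile this.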

Don Holmgren

On Tue, 21 Aug 2001, Dan Kirkpatrick wrote:

> 1. What do you know about SSE? Apparently the P-III and P-IV have extra 
> hardware for fp work which is ignored by current compilers but can buy big 
> factors in speed...
> We can code in assembler for it pretty easily if the P-III processors have 
> it.  Comments?
> 2. Apparently there are several optimizing compilers out there (like 
> portland) which do better than gcc.  Any suggestions?  Information on costs?
> Thanks!
> Dan
> =======================================================
> Dan Kirkpatrick                   dkirk at
> Computer Systems Manager
> Department of Physics
> Syracuse University, Syracuse, NY
>    Fax:(315) 443-9103
> =======================================================
