I think you are referring to the SSE2 instructions available in (pretty
much) all x86 CPUs these days. There is also ACML, AMD's math library for
Opterons, which provides optimized BLAS routines. You may hear these
referred to as the 'vectorized' libraries.
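
To give a rough idea of what "vectorized" means with SSE2, here is a
minimal sketch of a BLAS-style y = a*x + y loop written with the SSE2
intrinsics from <emmintrin.h>. Each 128-bit register holds two doubles,
so the main loop handles two elements per instruction. The function name
daxpy_sse2 is made up for this illustration; it is not an ACML routine,
and a tuned library will do considerably more (alignment, unrolling,
cache blocking).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper for illustration only: y[i] += a * x[i]. */
void daxpy_sse2(size_t n, double a, const double *x, double *y)
{
    __m128d va = _mm_set1_pd(a);            /* broadcast a into both lanes */
    size_t i = 0;

    /* Main loop: two doubles per 128-bit register. */
    for (; i + 2 <= n; i += 2) {
        __m128d vx = _mm_loadu_pd(x + i);   /* unaligned load of x[i], x[i+1] */
        __m128d vy = _mm_loadu_pd(y + i);   /* unaligned load of y[i], y[i+1] */
        vy = _mm_add_pd(vy, _mm_mul_pd(va, vx));
        _mm_storeu_pd(y + i, vy);           /* write both results back */
    }

    /* Scalar tail for an odd element count. */
    for (; i < n; ++i)
        y[i] += a * x[i];
}
```

In practice you would call the library's own BLAS routine rather than
write this by hand; the sketch is only meant to show where the speed-up
comes from.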

Go to developer.amd.com for links to the libraries and various
presentations. In particular, you might be interested in some of Tim
Wilken's presentations there.


AMD has advertised that their 64-bit architectures have vector
registers. This was initially very exciting because eigensolvers are not
easily parallelized for medium-sized problems, but vectorize very well.
I say this based on experience with the CRAY Y-MP, where we saw speed-ups
of over 100 when we used a vectorized eigensolver. The Y-MP had, I think,
128 vector registers, which could easily account for the 100x speed-up.
Does anyone know:

1. How many vector registers does each of the AMD 64-bit CPUs have?
2. Is there a source for the old CRAY vector library?

