G4's for scientific computing

William R. Pearson wrp at alpha0.bioch.virginia.edu
Sun Apr 14 18:57:56 PDT 2002


One of the advantages of the MacOSX gcc compiler is that inline
Altivec instructions are available at a high level.  One can
define vector arrays, and do vector operations directly from C code, e.g.

	while(vec_any_gt(T2, NAUGHT)) {
	  T2 = vec_sub(LSHIFT(T2), RR);
	  FF = vec_max(FF, T2);
	}
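
For readers without a G4, the loop above can be modeled in plain C.
This is only a scalar sketch of the semantics, not Altivec code, and it
rests on assumptions the fragment does not spell out: that the vectors
hold eight signed 16-bit lanes, that NAUGHT is all zeros, that RR is a
positive penalty replicated in every lane, and that LSHIFT moves each
element up one lane and fills with zero.  The names v8i16, v8_shift_up,
v8_splat and propagate are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 8 };                 /* 128-bit vector / 16-bit elements */

typedef struct { int16_t lane[LANES]; } v8i16;

/* Same value in every lane. */
static v8i16 v8_splat(int16_t x) {
    v8i16 r;
    for (int i = 0; i < LANES; i++) r.lane[i] = x;
    return r;
}

/* Scalar stand-in for vec_any_gt: true if any lane of a exceeds b. */
static int v8_any_gt(v8i16 a, v8i16 b) {
    for (int i = 0; i < LANES; i++)
        if (a.lane[i] > b.lane[i]) return 1;
    return 0;
}

/* Scalar stand-in for vec_sub: lane-wise a - b, no saturation. */
static v8i16 v8_sub(v8i16 a, v8i16 b) {
    v8i16 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = (int16_t)(a.lane[i] - b.lane[i]);
    return r;
}

/* Scalar stand-in for vec_max: lane-wise maximum. */
static v8i16 v8_max(v8i16 a, v8i16 b) {
    v8i16 r;
    for (int i = 0; i < LANES; i++)
        r.lane[i] = a.lane[i] > b.lane[i] ? a.lane[i] : b.lane[i];
    return r;
}

/* ASSUMED meaning of LSHIFT: move each element up one lane, zero-fill. */
static v8i16 v8_shift_up(v8i16 a) {
    v8i16 r;
    r.lane[0] = 0;
    for (int i = 1; i < LANES; i++) r.lane[i] = a.lane[i - 1];
    return r;
}

/* The while loop from the post, with the assumptions above filled in. */
static v8i16 propagate(v8i16 t2, v8i16 ff, int16_t penalty) {
    /* A positive, modest penalty drives every lane negative within
       LANES iterations without overflowing 16 bits. */
    assert(penalty > 0 && penalty <= INT16_MAX / LANES);
    const v8i16 naught = v8_splat(0);
    const v8i16 rr = v8_splat(penalty);
    while (v8_any_gt(t2, naught)) {
        t2 = v8_sub(v8_shift_up(t2), rr);
        ff = v8_max(ff, t2);
    }
    return ff;
}
```

Starting from T2 = {10,0,0,0,0,0,0,0}, FF all zero and a penalty of 3,
propagate() returns {0,7,4,1,0,0,0,0}: the score moves up one lane per
pass and loses 3 each time until nothing positive is left.  On the G4
each of the v8_* helpers corresponds to a single vector operation.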

We are testing an Altivec version of FASTA; an Altivec BLAST was announced
several months ago.  We like Altivec because we can manipulate eight
16-bit integers or sixteen 8-bit integers at once, and biological sequence
comparison code is essentially all integer.  We see a 6-fold speedup
when things are done 8-fold parallel.  On our codes, a dual 533 MHz G4
with Altivec code is 6X faster than a dual 1 GHz PIII (we don't have a
GHz G4 yet).  Because of the high-level Altivec primitives in the
Apple gcc compiler, vectorizing was very easy; we would have to
be much more sophisticated to do the same thing on the PIII (and the
potential speed-up would be half as large, since the vector there is 64,
not 128 bits).
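
The lane counts and the "half as large" estimate above are simple
arithmetic, sketched here in C.  The helper names lanes and efficiency
are invented for illustration; the only inputs are the figures quoted
in this message (128-bit vs. 64-bit vectors, 16-bit and 8-bit integers,
a 6-fold speedup at 8-fold parallelism).

```c
#include <assert.h>

/* Number of integer elements that fit in one vector register. */
static int lanes(int vector_bits, int element_bits) {
    assert(element_bits > 0 && vector_bits % element_bits == 0);
    return vector_bits / element_bits;
}

/* Fraction of the ideal (lane-count) speedup that was observed. */
static double efficiency(double observed_speedup, int lane_count) {
    assert(lane_count > 0);
    return observed_speedup / lane_count;
}
```

lanes(128, 16) is 8 and lanes(128, 8) is 16, while lanes(64, 16) is
only 4, which is where the factor of two in potential speed-up comes
from.  efficiency(6.0, lanes(128, 16)) is 0.75, i.e. the observed
6-fold gain is three quarters of the 8-fold ideal.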

Four months ago I might have agreed with the statement that one must have
hand-tuned Altivec code, which pretty much excludes general purpose
scientific computing, but our experience has been very positive.  Our
programs are not specialized signal processing programs, yet, in
retrospect, it was easy to get a very dramatic speed-up.

Bill Pearson


