[Beowulf] OT? GPU accelerators for finite difference time domain
mfatica at gmail.com
Sun Apr 1 12:30:32 PDT 2007
CUDA comes with a full BLAS and FFT library (for 1D, 2D and 3D transforms).
You can get a significant speedup even for 2D transforms or for a batch of 1D transforms.
You can offload only the compute-intensive parts of your code to the GPU
from C and C++ (writing a wrapper for Fortran should be trivial).
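As a rough sketch of what such a Fortran wrapper could look like: the C function below is callable from Fortran under two assumptions that depend on your compiler, namely that Fortran passes every argument by reference and that the external name is the lowercase routine name with a trailing underscore (the gfortran default). The name offload_saxpy_ is hypothetical and not part of CUDA, and the loop is a plain CPU stand-in for the place where the host-to-device copy, the GPU library call and the copy back would go.

```c
/* Hypothetical Fortran-callable wrapper (the name is an assumption, not a
 * CUDA API). Fortran side:  call offload_saxpy(n, alpha, x, y)
 * Assumes default INTEGER maps to int and REAL*4 maps to float. */
void offload_saxpy_(const int *n, const float *alpha,
                    const float *x, float *y)
{
    /* CPU stand-in: a real wrapper would copy x and y to the GPU here,
     * call the library's single-precision routine, and copy y back.
     * This loop computes y = alpha*x + y so the sketch runs anywhere. */
    for (int i = 0; i < *n; ++i) {
        y[i] = (*alpha) * x[i] + y[i];
    }
}
```

The rest of the Fortran program stays untouched; only the routine behind this one entry point changes when the loop is swapped for the GPU path.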
The current generation of the hardware supports only single precision,
but there will be a double precision version towards the end of the
PS: I work on CUDA at Nvidia, so I may be a little biased...
On 4/1/07, Mark Hahn <hahn at mcmaster.ca> wrote:
> as far as I know, there are not any well-developed libraries which simply
> harness whatever GPU you provide, but don't require your whole program to
> be GPU-ized. the cost of sharing data with a GPU is significant, but
> blas-3 might have a high enough work-to-size ratio to make it feasible.
> 3d fft's might also be expressible in GPU-friendly terms (the trick would
> be to utilize, not fight, the GPU's inherent memory-access preferences.)
> perhaps some MCMC stuff might be SIMD-able? I doubt that sequence analysis
> would make much sense, since GPUs are not well-tuned to access host memory,
> and sequence programs are not actually that compute-intensive. I'd guess
> that anything involving sparse matrices would be difficult to do on a GPU.
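The work-to-size argument for BLAS-3 in the quoted message can be made concrete with a little arithmetic. The sketch below is only a counting model, with hypothetical function names: it assumes square n x n single-precision matrices, counts 2*n^3 flops for C = A*B, and assumes A and B are sent to the GPU once and C is read back once. A BLAS-1 operation (y = alpha*x + y on vectors of length n) is included for contrast.

```c
/* Flops per byte moved between host and GPU, single precision.
 * Counting model only; real transfers also have a fixed latency. */

/* n x n matrix multiply: 2*n^3 flops, 3*n^2 floats transferred. */
double sgemm_flops_per_byte(double n)
{
    double flops = 2.0 * n * n * n;
    double bytes = 3.0 * n * n * sizeof(float);
    return flops / bytes;   /* grows linearly with n: n/6 */
}

/* y = alpha*x + y: 2*n flops, 3*n floats transferred (x, y in; y out). */
double saxpy_flops_per_byte(double n)
{
    double flops = 2.0 * n;
    double bytes = 3.0 * n * sizeof(float);
    return flops / bytes;   /* constant 1/6, independent of n */
}
```

Under this model the matrix multiply does n/6 flops per byte transferred, so the transfer cost is amortized better as the matrices grow, while the vector update stays at 1/6 flop per byte no matter how large n is.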