[Beowulf] Nvidia GT200, double precision vs. native pair

Thu Sep 4 12:54:20 PDT 2008

On Thu, Sep 04, 2008 at 09:56:13AM -0600, Craig Tierney wrote:
> Subject: Re: [Beowulf] Re: Beowulf Digest, Vol 55, Issue 2

> This is not correct.  The NVIDIA GT200 series supports IEEE DP FP in
> hardware.  NVIDIA has only 1 DP FP unit per streaming multiprocessor
> (30 on the GTX280), which is 1/8 the number of single-precision
> floating point units (each thread has its own).  So the max DP FP
> rate on a GTX280 is about 90 Gflops.

So has anyone taken those 8 single-precision floating point units and
tried using them to get double-precision or better accuracy?  Perhaps
using the "native-pair" and "speculative precision" approaches
discussed here:

  http://aggregate.org/NPAR/

The 2006 paper there describes doing so on an Nvidia GeForce 6800
Ultra, on which a native-pair calculation (c. 64 bits) took about 10x
the clock cycles of a single 32-bit flop (the ratio was better for
sqrt).
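
For anyone who hasn't seen the trick, here is a minimal sketch of the
idea behind a native pair: keep each value as an unevaluated sum
(hi, lo) of two 32-bit floats and use an error-free addition to
recover the rounding error that a plain add throws away.  This is not
code from the paper.  The names f32, two_sum and pair_add are my own,
it covers addition only, and it emulates binary32 in Python via
struct, assuming round-to-nearest, which GPU hardware of that era may
not have provided.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def two_sum(a, b):
    """Knuth's error-free add: returns (s, e) with s + e == a + b exactly.

    a and b must already be binary32 values; every intermediate result
    is rounded back to binary32 to mimic single-precision hardware.
    """
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def pair_add(x, y):
    """Add two (hi, lo) pairs and return a renormalized (hi, lo) pair."""
    s, e = two_sum(x[0], y[0])        # high parts, with exact error term
    e = f32(e + f32(x[1] + y[1]))     # fold in the low parts
    return two_sum(s, e)              # renormalize so hi carries the bulk
```

With these definitions, f32(1.0 + 2.0**-30) comes back as plain 1.0,
while pair_add((1.0, 0.0), (2.0**-30, 0.0)) keeps the small term in
the low word.  A single pair_add costs roughly a dozen single-precision
adds, which at least makes a 10x cycle count plausible.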

-- 
Andrew Piskorski <atp at piskorski.com>
http://www.piskorski.com/