[Beowulf] AMD64 results...

Josip Loncaric josip at lanl.gov
Thu Dec 16 08:17:06 PST 2004


Robert G. Brown wrote:
> [...]   One can see how having 64 bits would really
> speed up 64 bit division compared to doing it in software across
> multiple 32 bit registers...

Correct me if I'm wrong, but doesn't the floating point unit normally 
use an internal iterative process to perform the division?  This would 
not involve 32-bit registers...

I'm not so sure about *integer* 64-bit division.  Integer division may 
involve multiple 32-bit integer registers.

Good ole' Cray-1 used an iterative process for floating point division 
which worked like this: given a floating point number x, use the first 8 
bits of the mantissa to index into a lookup table containing initial 
guesses, then do a few steps of Newton-Raphson iteration involving only 
multiply-add operations to get the fully converged reciprocal mantissa, 
fix the exponent, thus obtaining 1/x, then multiply y*(1/x) to get y/x.
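A rough sketch of that scheme in Python, for illustration only.  This is 
not the Cray-1's actual table or iteration count: the 256-entry table of 
midpoint reciprocals, the three Newton-Raphson steps, and the names 
reciprocal() and divide() are all assumptions made for this example, and 
it only handles normal, finite, nonzero x.

```python
import math

TABLE_BITS = 8  # assumed: index the table with 8 bits of the mantissa

# Initial guesses: reciprocal of the midpoint of each of the 256 equal
# sub-intervals of [0.5, 1), the range math.frexp() uses for the mantissa.
TABLE = [1.0 / (0.5 + (i + 0.5) / 2 ** (TABLE_BITS + 1))
         for i in range(2 ** TABLE_BITS)]


def reciprocal(x, steps=3):
    """Approximate 1/x by table lookup plus Newton-Raphson refinement."""
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        raise ValueError("x must be finite and nonzero")
    m, e = math.frexp(abs(x))          # abs(x) == m * 2**e, 0.5 <= m < 1
    index = int((m - 0.5) * 2 ** (TABLE_BITS + 1))
    r = TABLE[index]                   # initial guess for 1/m
    for _ in range(steps):
        # Newton-Raphson step for f(r) = 1/r - m; the relative error is
        # roughly squared each time, and only multiplies/subtracts are used.
        r = r * (2.0 - m * r)
    result = math.ldexp(r, -e)         # fix the exponent: 1/x = (1/m) * 2**-e
    return -result if x < 0 else result


def divide(y, x):
    """Compute y/x as y * (1/x), as described above."""
    return y * reciprocal(x)
```

With a table this size the initial guess is good to roughly 9 bits, so 
three steps are enough to reach double precision.  Note that y*(1/x) is 
not guaranteed to be correctly rounded in the last bit the way a true 
IEEE 754 division is.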

As I recall, the famous Pentium FDIV bug involved some corner cases in a 
table-driven iterative division process (SRT rather than Newton-Raphson, 
with a few lookup table entries missing), all of which is internal to 
the floating point unit.  Moreover, in addition to following the 
32/64-bit IEEE 754 
standard for floating point arithmetic, some implementations (e.g. 
Pentium, Opteron) support x87 legacy internal 80-bit representations of 
floating point numbers, which can really help when accumulating long 
sums and computing square roots, etc.  Prof. Kahan has numerous 
arguments in favor of this internal 80-bit representation...

Sincerely,
Josip


