### Faster Cube Root

Over the weekend, I decided to do some spelunking for a faster cube root. After a lot of searching, I realized most of the existing implementations (Kahan/BSD, Turkowski, et al.) rely on Newton-Raphson.


I found a little gem in a 1942 (!) volume of the Journal of the American Statistical Association by Otis Lancaster, *Machine Method for the Extraction of Cube Root*. With a little massaging of Lancaster's equation**, I offer you:

```c
double Lancaster(const double a, const double R)
{
    const double a3 = a * a * a;
    const double b = a * (a3 + R + R) / (a3 + a3 + R);
    return b;
}
```

Given an approximation **a** of the cube root of **R**, the function will return an approximation of the cube root of **R** that is closer than **a**.

In my tests, at the cost of an extra add* and multiply, Lancaster converges faster than Newton-Raphson -- the number of bits of precision seems to triple with each iteration, which is better than Newton-Raphson's doubling. With two iterations, it's the difference between a 4X improvement and a 9X improvement (quadratic vs. cubic convergence).
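To see that claim in numbers, here's a minimal side-by-side sketch; `lancaster_step` is the function above, `newton_step` is the standard Newton-Raphson update for f(x) = x³ - R, and the names are mine:

```c
#include <assert.h>
#include <math.h>

/* Lancaster's refinement step, as given above. */
static double lancaster_step(double a, double R)
{
    const double a3 = a * a * a;
    return a * (a3 + R + R) / (a3 + a3 + R);
}

/* The standard Newton-Raphson step for f(x) = x^3 - R. */
static double newton_step(double a, double R)
{
    return a - (a * a * a - R) / (3.0 * a * a);
}
```

Starting both from the same rough guess a = 1.2 for the cube root of 2, one Lancaster step lands well over an order of magnitude closer than one Newton step, and a second Lancaster step is already down at double-precision noise.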

Consequences. Your initial guess doesn't have to be much better than 5 bits of precision to get 16 bits of precision with one iteration of Lancaster. With a 6-bit initial guess, two iterations of Lancaster will give you near double precision (52 bits). If an initial guess is in the 7-to-8-bit range, one iteration will give you single precision (23 bits).

*Note: Kahan's cbrt( ) includes a bit hack that gives a 5-bit initial guess.*

This is another subject I hope to write more about.

*The (a3 + R) add need only be done once.
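If I'm reading that footnote right, it means the common subexpression can be hoisted, something like:

```c
/* Lancaster's step with the common (a3 + R) subexpression computed once
   and reused in numerator and denominator. Algebraically the same
   formula; results may differ from the original by an ulp or so because
   the additions are grouped differently. */
static double lancaster_shared(const double a, const double R)
{
    const double a3 = a * a * a;
    const double s = a3 + R;          /* the add, done only once */
    return a * (s + R) / (a3 + s);    /* a*(a3 + 2R) / (2*a3 + R) */
}
```

That leaves one multiply-heavy cube, three adds total, one multiply, and one divide per iteration.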

** I've realized Lancaster's equation is derived from Halley's method. More here.
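For the record, the derivation is short (my working, not Lancaster's). Halley's iteration applied to f(x) = x³ - R:

```latex
x_{n+1} = x_n - \frac{2 f(x_n)\, f'(x_n)}{2 f'(x_n)^2 - f(x_n)\, f''(x_n)}
\qquad f(x) = x^3 - R,\; f'(x) = 3x^2,\; f''(x) = 6x
```

```latex
x_{n+1} = x_n - \frac{6 x_n^2 (x_n^3 - R)}{18 x_n^4 - 6 x_n (x_n^3 - R)}
        = x_n - \frac{x_n (x_n^3 - R)}{2 x_n^3 + R}
        = \frac{x_n (x_n^3 + 2R)}{2 x_n^3 + R}
```

which is exactly a * (a3 + R + R) / (a3 + a3 + R).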
