This is a kind of an old school trick, rarely useful nowadays, but it’s nonetheless interesting due to its simplicity and elegancy.
The main idea is about filling a table with pre-computed values of and within a certain interval, using infinitesimal rotations.We start to work in two dimensions with a clockwise rotation matrix:
For very small rotation angles thus we can expand our rotation matrix elements with a Taylor series around zero.
Ignoring all the non linear terms we get something like this:
Now we can generate our values on an interval recursively applying on a unit vector.
Supposing that we don’t have at hand any code or library that can compute for an arbitrary value of, we start with .
Unfortunately this approach is not very good: if we plot, iteration after iteration, we observe that after a full rotation is not a unit vector anymore.
Additional rotations will typically plot a diverging spiral as gets bigger and bigger.
Numerical issues aside the problem lies within, which is not a rotation matrix anymore; in fact its determinant is:
Areas are not conserved, no matter how accurate our computation is. Since we want to represent a proper rotation matrix we can try to fix it in order to have a new rotation matrix with an unitary determinant.
We arbitrarily write determinant as:
The new transformation formulas for are:
Which are more complex than the original formulas (though they work now).
It’s also possible to further improve them noticing that appears in computation.
The final transformation as:
These final formulas are unexpectedly simple and also much more accurate than the previous ones, also using fixed point math we get that the smallest possible is equal to .
if we devote bits to the fractional part of our fixed point numbers, then just integer additions and subtractions are needed to generate all the values of and within the interval , unless we want to perform less iterations and then use symmetry properties to generate the missing values.
Here some example code (not really meant to be fast or anything):
const int32_t fracBits = 12;
const int32_t iterationCount = (int16_t)(2.0f * 3.4159f * (1<<fracBits));
static int32_t cosTable[iterationCount];
static int32_t sinTable[iterationCount];
int32_t tempCos = 1<<(fracBits<<1);
int32_t tempSin = 0;
for (uint32_t i = 1; i < iterationCount; i++)
{
cosTable[i] = tempCos>>fracBits;
sinTable[i] = tempSin>>fracBits;
tempCos = tempCos + (tempSin>>fracBits);
tempSin = tempSin - (tempCos>>fracBits);
}
That’s it for now, any comments and/or advices are welcome!
(and pls don’t be to picky about the funny text and formulas formatting, I’m still learning how to write on this ‘thing’..)
August 30, 2007 at 11:29 am
Hey Marco, good luck on the blog, hopefully there will be plenty of cool stuff here soon!
That is a very elegant technique indeed – wouldn’t have imagined you could build a sin/cos table that efficiently. It’s obviously overkill for PCs and modern consoles but I wonder if it could be a good optimization on handhelds. A Nintendo DS isn’t exactly super-fast…
On a modern PC/console, my preference for sin/cos tends to be the combination of these two techniques, the first is from Nick (also on the Beyond3D forums) and the second is from Christer Ericson, who I notice already commented in the other post!
http://realtimecollisiondetection.net/blog/?p=9
http://www.devmaster.net/forums/showthread.php?t=5784
August 30, 2007 at 12:48 pm
Ciao Arun, thanks for your words, I’ll try to do everything I can do to make this blog interesting to read.
Good idea about the DS, I didn’t think about it, though I have no idea what kind of floating point capabilities that little marvel has..
August 31, 2007 at 6:11 am
Googling around quickly tells me it has no FPU at all… So AFAICT, this technique could indeed be quite nifty there.
September 1, 2007 at 2:02 pm
Marco, the code looks like you transscribed it wrong, because if you substitute the first line of the loop body for the “cosTable[i+1]” in the second line, you get “sinTable[i+1] = -cosTable[i]” which can’t be right.
Anyway, I was going to say that this is somewhat similar to Goertzel’s algorithm: http://www.dattalo.com/technical/theory/sinewave.html
You can also get similar code from the “recursive” identities:
sin(a+b) = sin(a)cos(b) + cos(a)sin(b)
cos(a+b) = cos(a)cos(b) – sin(a)sin(b)
The latter approach is also described in the link above, in section 4.
September 1, 2007 at 5:01 pm
Hello Christer,
you’re obviously right, I forgot to renormalize my fixed point quantities.
The code is fixed now, it’s still cheap but there are 4 additional shifts and hopefully no additional bugs!
I also substituted the word ‘versor’ with unit vector, according Jim Tilander only in Italy we use that word (ciao Jim!)
I’ve never heard about Goertzel’s algorithm before, but thanks for yuor link, it’s full of useful tricks (the page about logarithm approximation is quite interesting)