Garklein's stuff

Estimating ln(x) in colorForth (without floats)
===============================================

Motivation [skip to avoid a Project Euler spoiler]
--------------------------------------------------
You can solve Project Euler 25 in 2 ways. The most obvious way is to use bignums to generate Fibonacci numbers until you've found the first with over 1000 digits. But I wanted to get a little spicier.

As it turns out, the answer to this problem is

          ln(5)/2 + 999 ln(10)
1 + floor --------------------
                 ln(φ)

The reasoning is left as an exercise to the reader ;)

Now, we're faced with an issue: how do we compute ln(x) in colorForth?


Fixed point arithmetic
----------------------
For that matter, how can we represent decimal numbers at all? There are no floats in colorForth; all we have to work with are 32-bit ints.

For a bit of background, Forths traditionally have no floating point. Instead, they use *Fixed Point*.

In fixed point arithmetic, we treat each number as if it has an implicit denominator. For example, we might say that 1 represents 1/100, 2 represents 2/100, 500 represents 500/100 (aka 5), etc.

With this representation, we get addition and subtraction for free - to add/subtract fractions with like denominators, you just add/subtract the numerators. Multiplication and division by integers are also the same as normal. Multiplication and division of one fixed point number by another, however, require a bit more thought.

Suppose we have two fixed point numbers x and y, and we want to compute their product. x represents x'/100 and y represents y'/100. The actual value of their product is x'y', which means that our desired fixed point representation is x'y'/100.

If we solve for x' and y', we get that x' = 100x and y' = 100y. Subbing this into x'y'/100 gives us

100x * 100y
-----------,
    100

or simply xy/100.

Now, the especially astute ones among you might spot an issue. For large x and y, computing their product will overflow a mere 32-bit integer. To resolve this, colorForth provides a special word: */.

x y z */ computes xy/z (I like to read it as multiplying by a fraction: x * y/z), but uses a 64-bit intermediate result for the product. This greatly reduces the chance of overflow.

With 100 as our fixed point denominator, multiplication is x y 100 */. Similarly, division is x 100 y */.

Let's put this into practice with some colorForth code. I'll be using an arbitrary power of two as my denominator, say, 2^19. We can get a bit cheeky here when defining it. Our denominator is our way to represent 1, so we can call the word 1.

1 -n 80000 ;

When defining a number as a word, we need to use its text representation, which is typed with a backtick before the number. Visually, our 1 is identical to the number 1; and it only differs in its internal representation and how we type it. Being unable to tell whether a stray 1 in source code means 1 or 2^19 is perhaps a bad idea, but we roll with it.

To convert x to fixed point, we use x 1 *. To multiply or divide fixed point x and y, we use x y 1 */ and x 1 y */ respectively. Here, the fraction view is helpful: multiplying is x * y/1 and dividing is x * 1/y - exactly like you'd expect!

Here are some helpers to convert a number to fixed point, and to do fixed point division.

*1 n-n 1 * ;
*1/ nn-n 1 swap */ ;


Logging time!
-------------
Now that we've got fractions, we can get to the ln part.

There *is* one "log" that is very easy to compute - floor(log2(x)). This is just the index of the highest set bit.

From here, it's clear that we can represent any number x as 2^k * m, where k is a natural number (the floor(log2(x)) from earlier) and m is in the real interval [1, 2).

ln(x)
= ln(2^k * m)
= ln(2^k) + ln(m)
= k ln(2) + ln(m)

Now, we just need to be able to compute ln(x) over the interval [1, 2], and we can do just that with a Taylor series.

Working out the Taylor series for ln(x) around x=a gives us

                 ∞                      n
                \--    n+1   -n  (x - a)
ln(x) = ln(a) +  >  (-1)  * a  * -------
                /--                 n
               n = 1

Time to implement this in colorForth!

I decided to use a word tayl to compute the sum, which will take a and x from the stack.

tayl will use a for loop to calculate each term, with the loop's index being n. We need some set up for this loop. There needs to be an accumulator somewhere, to hold the sum so far. Since stack manipulation gets messy, we want to minimize manipulation inside the loop. The variable a is never needed on its own, only x - a, so we can pre-compute that.

This word rearranges the stack to 0 (the accumulator), then x - a, then a.

0,x-a,a ax-cya 0 -rot swap dup negate u+ ;

Inside the loop, it will be useful to be able to get n from a nested word. Inside the for loop, the index will be on the return stack. When we call a word (that, say, computes one part of the term), this pushes an address onto the return stack. When that word calls n, it will push again onto the return stack. So, n needs to pick the 3rd element from the return stack.

n -n 2pop i -rot 2push ;

2pop pops the two addresses off the return stack, i copies the index, -rot buries it under the two addresses, and 2push pushes the addresses back on.

There are two terms which each need to be raised to the power of n, so let's make a helper for that.

**n n-n 1 n for over 1 */ next nip ;

This word takes in the base that should be raised to the power of n. It starts with (our) 1, then multiplies it by the base n times. Finally, it nips the base off the stack. Since the colorForth :j<for> loop bricks when looping 0 times, this will fail when n = 0. Luckily, we sum starting at n = 1.

Now, let's look at the sum itself. We might be tempted to first calculate (x - a)/a (or x/a - 1), then raise that to the n, but I fear we would lose precision. Instead, I will calculate (x - a)^n and a^n separately, then divide them.

After our 0,x-a,a setup, x - a will be below the top; so we can over **n to compute (x - a)^n. Our a value now is second from the top. over **n *1/ will raise it to the power and divide.

Next, we need to divide by n -- a simple i /. Since our divisor is a whole number and not fixed point, we can use normal division.

Finally, we need to fix the sign. This term should be added if n is odd, and subtracted if n is even.

sign n-n n 1 ? drop if ; then negate ;

1 ? tests n to see if the lowest bit is set. If it is, n is odd, and we just return. Otherwise, n is even, and we negate the number.

At the end of each loop iteration, we need to add this to the accumulator, which is now buried under our term, a, and x - a.

acc cnn-cnn push rot pop + -rot ;

A push is needed to temporarily offload our new term to the return stack. Then, rot pries the accumulator up from under a and x - a, we add the term, and we send the accumulator back down.

As for the loop itself, we will use 16 for ... next to loop from 16 to 1. After the loop, all that's next is to discard x - a and a. Putting it all together gives

tayl xa-n 0,x-a,a 16 for,
..over **n over **n *1/ i / sign acc next,
..2drop ;

Our final Taylor series will be used on the interval [1, 2]; so I think picking a = 3/2 makes sense.

To compute the Taylor series of ln(x) around a = 3/2, we first need to compute ln(3/2). We can use the the Taylor series of ln(x) around a = 1 for this. We know that ln(1) = 0, so we won't need to add anything to our result from tayl. This means that ln(3/2) is simply 1 3/2 tayl.

Since this value is used a lot, I'll compute it at compile time. In colorForth, a transition from yellow to green at compile time takes the top stack element and compiles it as a literal. In other words, it saves the top compile time stack element, and pushes it onto the stack at runtime.

Now, we can finally estimate ln(x) around a = 3/2! I'll call this function ln'.

3/2 3 *1 2 / ;
ln' n-n 3/2 swap tayl 1 3/2 tayl + ;


It's all coming together
------------------------
If you remember from above, ln(x) = k ln(2) + ln(m) when x is of the form 2^k * m. We now have all the parts, so all that's missing is the final assembly.

We can obtain k from finding the most significant set bit. Then, m is x/2^k. Since finding the highest set bit and raising 2 to the n are both clunky and inefficient in colorForth, we'll implement them in machine code.

To find the highest set bit, we can use bsr, which has opcode 0f bd. We want to find the most significant bit of the top of the stack, which in colorForth is always held in eax. This result should go back onto the top of the stack, so our modr/m byte will be 11 (both operands are registers) 000 (eax) 000 (eax), or c0.

macro*bsr n-n c0bd0f 3, ; forth

bsr calculates floor(log2(x)) for an integer. However, our input is in fixed point format, so we first need to divide by 1. I will call this word log2', since floorlog2 is quite long.

log2' n-n 1 / bsr ;

To compute 2^n, we can start with 0, then use bts to set the desired bit. Since we'll be zeroing eax, we need to first save its value into another register. 8b for mov, then 11 001 (ecx) 000 (eax) is c8.

Next, we need to zero eax, which we can do by xoring it with itself. 31 (xor) 11 000 000 = c0.

Finally, we use bts (0f ab). It takes the destination as the r/m operand and the bit to set as the r operand, so our modr/m byte will be 11 001 000 = c8.

macro*2** n-n c031c88b , c8ab0f 3, ; forth

We calculate the decimal part of ln(x) with ln(m). m = x/2^k, so this word will need to take both x and k from the stack.

decimal nn-n 2** / ln' ;

The whole part (from a log2 perspective) is k ln 2, so this only needs k from the stack.

whole n-n 2 *1 ln' + ;

Now, we just need to calculate k, call decimal and whole, and add them together.

ln n-n dup ln2' push i decimal pop whole + ;

k is needed by both functions. For clarity, I push it to the return stack. The first time we need k, I use i; for the second and final time, I pop it.

Time to try it out! Let's try ln(12345/42): 12345 1 42 */ ln

colorForth responds with 2979694.

ln(12345/42) = ~5.6833369
2979694/2^19 = ~5.6833153

Not bad! 2^19 * ln(12345/42) rounds to 279705, so we're only off by 9 * 2^-19.


Our code isn't divine enough!
----------------------------
If you remember the original equation, we need ln(φ). A trick I learned from K is that φ can be approximated by repeated iteration of f(x) = 1 + 1/x. If you're curious why this works, 3b1b has a great video.

phi -n 1 16 for 1 swap *1/ 1 + next ;


The results
-----------
Our final calculation is

pe25 5 *1 ln 2 / 10 *1 ln 999 * + phi,
..ln / 1 + ;

which gives the right answer!

If we remove the division and rounding, we get that our final fraction is 1206428689/252292, which is ~0.015 off from the correct answer.

The multiplication by 999 really compounds the error from the ln(10). Our estimation for ln(10^999) is 1206006786, while the true answer is ~1206010516: a difference of ~3730 * 2^19.

Another chasm of lost precision occurs when dividing, because the numerator is so much greator than the denominator. Even adding 1 to the ln(phi) approximation (changing the denominator by 2^-19) massively affects the final answer by ~0.015.

Of course, this code could be made more accurate by using precomputed values for ln3/2 and ln2, but I wanted to do everything from scratch. After all, if you're hardcoding any values at all, why not just skip straight to hardcoding the final answer?

The full code can be found on block 908 of my Project Euler solutions.


~/cf/ln.html