In the previous posts, we discussed the theory of binary trees. Now we will look at at two actually useful implementations of binary trees: the
cache function and the
Obviously you can also use those functions without knowing anything about binary trees - but I think it's always nicer to understand what's going on. So let's go.
As the name already tells us, the
cache function caches previously calculated results for us. Let's study its syntax first. From the documentation we see that
cache requires a variable to store the cache, as well as a variable and optionally a program.
(cache 'var 'any [. prg]) -> any
Speeds up some calculations by maintaining a tree of previously calculated results in an
idxstructure ("memoization") in
var. A hash of the argument any is used internally to build the index key. If no
prgis given, the internal var holding a previously stored value is returned (note that var may have a name which is not human-readable).
: (off C) # empty cache -> NIL # store some values in C : (cache 'C 'c1 (* 3 4)) : (cache 'C 'c2 (* 4 5)) : (cache 'C 'c3 "hello") : (cache 'C 'c4 "hello") : (cache 'C 'c5 (+ 10 1))
As you can see,
cache accepts both symbols (like "hello") or programs (like
(* 3 4) as arguments. This will be very useful later on.
What the cache function does is hashing the key (
c1, c2..) to a 16-Bit number, and using it to build the index key. This does not guarantee that the tree is balanced, but due to the randomization the results should be quite good especially when we have a lot of numbers. Let's check:
: (depth C) -> (4 . 3) : C -> (("볐" . c1) (("벪" . c2) (("粋" . c3) (("籇" . c5)) ("뱦" . c4)))) : (view C T) ("볐" . c1) ("벪" . c2) ("뱦" . c4) ("粋" . c3) ("籇" . c5)
You may wonder why the key was stored in Chinese?! Well, of course it is not Chinese - by coincidence the hash-generated 16 bit numbers correspond to the UTF values of some Chinese characters.
The tree depth is 4, and the root node doesn't have a right child. Thus we can see that the tree is not balanced, but with increasing numbers of keys this will certainly improve.
val (cache ...) lets us check the stored values:
: (val (cache 'C 'c5)) -> 11 : (val (cache 'C 'c1)) -> 12
And now let's see why this is useful.
Good old Fibonacci
A good example is the famous recursive implementation of the Fibonacci sequence function. For those who forgot:
The Fibonacci sequence is a sequence where each number is the sum of the two preceding ones, starting from 0 and 1. (0, 1, 1, 2, 3, 5, 8, 13...).
A recursive implementation of the Fibonacci sequence could look like this:
: (de fibo (N) (if (>= 2 N) 1 (+ (fibo (dec N)) (fibo (- N 2))) ) )
It looks quite elegant, but actually it creates a gigantic overhead. Let's analyze what happens if we call
(fibo (dec N)) side is evaluated:
- Step 1
fibo 5: calls
- Step 2
fibo 4(from Step 1): calls
- Step 3
fibo 3(from Step 1): calls
- Step 4
fibo 3(from Step 2): calls
(fibo (- N 2)) side is evaluated bottom-up:
- Step 5
fibo 3(from Step 4): calls
- Step 6
fibo 3(from Step 3): calls
and so on. Every step produces two new steps, which means the number of calculations grows exponentially. To calculate
fibo 6, we enter the function 25 times! Obviously, we are calling the same functions again and again.
How can we improve this? The obvious (and boring) option would be not to use a recursive function. The second option is to use caching, i. e. storing the already calculated results so that we don't have to do those again.
As we learned from the definition above, the
cache function can take a program as argument. In our case, this program should be the whole Fibonacci calcuation, and it is stored under the current iteration value
N in an empty list
: (de fiboCache (N) (cache '(NIL) N (if (>= 2 N) 1 (+ (fiboCache (dec N)) (fiboCache (- N 2))) ) ) )
Everytime the program comes across a key that has been already cached, the cached value is returned instead of evaluating the program. Therefore, if we call
fiboCache 6, the program body is evaluated exactly 6 times - after that everything can be found in the cache.
Let's use the
bench function to compare the cached and the standard version:
: (bench (fiboCache 10000)) 0.101 sec : (bench (fibo 30)) 0.100 sec
This shows that the cached version can calculate 10.000 Fibonacchi numbers in 0.1 s, while the non-cached version can get only 30 numbers!
In the next post, we will have a look at the function
enum, which is also internally represented by a binary tree. As we will see, it can be used to emulate (possibly sparse) arrays.