Welcome back to our "Binary Tree" week!
Today we will discuss the enum
function. It can be used to emulate arrays (as you might have noticed, PicoLisp doesn't have an array data type). But what do arrays have to do with binary trees...?!
The enum
function is only available in pil21. If you get a enum -- Undefined
error when you executing the examples below, you probably haven't installed the latest version.
The enum
function
Let's see what we can learn from the documentation about the enum
function.
(enum 'var 'cnt ['cnt ..]) -> lst
(enum 'var) -> lst
Enumerates cells by maintaining a binary tree in
var
. The keys are implicit from the enumeratedcnts
, and the resulting tree is balanced (independent from the insertion order). (...)
enum
can be used to emulate (possibly sparse) arrays.
The enum
function expects a cnt
- which means a (small) integer - which specifies the position of a cell in a balanced tree. Like for any cell, the corresponding value could be defined with the set
function.
The cell is returned if it already exists, otherwise it is created as a new node in the tree. A typical use case would be the conversion of a list, as in the following example:
: (for (I . S) '(a b c d e f g)
(set (enum 'E I) S) )
-> o
The for
loop takes the index (I
) and value (S
) of each list item. I
is stored in the enum
tree, then its value is set to S
.
Let's check the result:
: (view E T)
g
c
e
a
f
b
d
We see a perfectly balanced tree. To read out some values, we use val (enum ...) )
:
: (val (enum 'E 1))
-> a
: (val (enum 'E 3))
-> c
We can check if a cell already exists using the enum?
function. If it exists, the whole subtree is returned, otherwise NIL
.
: (enum? E 3)
-> (c (e) g)
: (enum? E 10)
-> NIL
cache
and enum
- what's the difference?
enum
doesn't store any keys. The key is implicit due to its position in the tree.
On the other hand,cache
is creating the index using hash function, which implies that any key can be stored. The backdraw is that the hashing step needs to be done everytime an item should be looked up.cache
accepts a program as argument. If the program has already been evaluated, the value is returned without re-evaluating.enum?
can only check if an index exists or not.enum
is creating balanced trees,cache
is creating probably balanced trees (in real applications, this gap should be rather small)
Array emulation
Probably you can already guess in which regards enum
is similar to arrays: Arrays can usually be indexed directly by a syntax like myArr[1]
. enum
allows a very similar handling by (val (enum 'E 1))
.
But how does it look like in terms of efficiency? Let's take a list of 1 Mio. items and assume we want to look up the item at position 900.000.
Arrays. Usually arrays take up consecutive space in the memory. If we know the size per entry, the computer "knows" the required position by calculating "starting point + 900.000*size". --> approx. 1 calculation
Lists. In PicoLisp, lists are connected via pointers. Therefore we need to jump from pointer to pointer to get to our desired cell --> approx. 900.000 calculations.
enum/binary tree. The
enum
function guarantees that the tree is balanced, so we should be able to find our item within 20 steps --> approx. 20 calculations.
Multidimensional Arrays
We can even create multidimensional arrays, since enum
accepts more than one cnt
variable. Let's create a 4x4 Matrix:
: (off A)
: (for I 4
(for J 4
(set (enum 'A I J) (pack I "-" J)) ))
To get the item in the "second line, second row":
: (val (enum 'A 2 2))
-> "2-2"
Good old Fibonacci, revisited
And now finally, let's consider our Fibonacci example again and compare it to the cache
version. We can expect a difference, because we can store and access the variables directly under the iteration number N
.
However, it requires some further work from us. While cache
accepts a program as parameter and evaluates it only in case the value has not been stored yet, we need to do this manually for enum
:
- check the value of
enum '(NIL) N
and set it toE
. If it has already been calculated,E
will be non-NIL
, otherwise it will beNIL
. - If
(val E)
evaluates to non-NIL
(i. e. has already been calculated), this value is returned. - Otherwise we set
E
to the result of the recursive calculation.
(de fiboEnum (N)
(let E (enum '(NIL) N)
(or
(val E)
(set E
(if (>= 2 N)
1
(+ (fiboEnum (dec N)) (fiboEnum (- N 2))) ) ) ) ) )
In the previous post, we showed that the cached version can calculate 10.000 Fibonacci numbers in 0.1 s (of course depending on the respective hardware). How many can we do with fiboEnum
in the same time?
: (bench (fiboEnum 14500))
0.098 sec
We get close to it at Fibonacci number 14500. This a speed increase of almost 50% compared to the cache
version!
If you still haven't seen enough of Fibonacci, stay tuned for the next Rosetta-Code post which will include analytic and iterative approaches, and the second Euler Project challenge where we will sum up the even Fibonacci numbers.
After that we will start a new topic - Web Application Programming ๐
Sources
picolisp.com/wiki/?ArrayAbstinence
software-lab.de/doc/index.html