Getting started with the PicoLisp database

The following posts are based on this tutorial which was included in a previous release of PicoLisp.


Opening and Writing to a Database file

Let's start with the basics: opening and writing to a database file. We can do this from the REPL ($ pil +):

: (pool "test.db")
-> T

This command opens the file "test.db" in your current folder, creating it if it doesn't exist yet. The file ending ".db" is not mandatory. The database root is stored in a global constant called *DB. Let's try it:

: *DB
-> {1}

As you can see, *DB points to the external symbol {1} (external symbol names are enclosed in braces). We can use our "normal" symbol functions to inspect the content of *DB or to add and change properties (read here for an overview of the most important functions).

For example, let's inspect it with show:

: (show *DB)
{1} NIL
-> {1}

Unsurprisingly, our *DB is empty. Let's add some values. For example, we can add the key-value pairs a=1 and b=2 with put:

: (put *DB 'a 1)
-> 1
: (put *DB 'b 2)
-> 2

Let's inspect the variable again:

: (show *DB)
{1} NIL
   b 2
   a 1
-> {1}

We can also modify the value with set. Let's set it to "Hello world". The set function only modifies the val part of a symbol and doesn't change any of the properties. (Here you can learn more about the set function).

: (set *DB "Hello world")
-> "Hello world"

: (show *DB)
{1} "Hello world"
   b 2
   a 1
-> {1}
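
To read single values back instead of printing the whole symbol, the built-in functions val and get can be used. A short sketch, assuming the session above is still open:

: (val *DB)
-> "Hello world"
: (get *DB 'a)
-> 1
: (get *DB 'b)
-> 2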

Why should we use *DB instead of {1}?

Theoretically, *DB and {1} are equivalent, as the former is only a pointer to the latter. For example, we could also have written (set '{1} "Hello world"). In practice, however, this can lead to a memory leak: the garbage collector temporarily sets *DB to NIL and restores its value after collection, so that external symbols which are no longer used can be freed. If the database is accessed through a literal external symbol, the garbage collector might not be able to free it, which wastes memory and reduces efficiency.


Creating a new external object

New objects are created with the new function. As we have learned, symbols can be internal, transient or external. If we use new without any parameters, we create an anonymous symbol:

: (new)
-> $177264230632614

(We have seen this before in the OOP tutorial: all objects created from classes are also anonymous symbols, as we can see from the $ sign at the beginning of the symbol name.)

If we use a flag T when we call new, we create an external symbol in the database file:

: (new T)
-> {2}

Let's store it in the database root {1}. For demonstration purposes, we are accessing it now directly as {2}.

: (put *DB 'newSym '{2})
-> {2}

: (show *DB)
{1} "Hello world"
   newSym {2}
   b 2
   a 1
-> {1}

Let's modify {2}. For example, we can put another key-value pair inside, like x=777:

: (put *DB 'newSym 'x 777)
-> 777

To only show {2} instead of the full *DB content, we can use the following syntax:

: (show *DB 'newSym)
{2} NIL
   x 777
-> {2}
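
The same chaining of keys works for reading. As a small sketch, assuming the state from the steps above, get follows newSym to {2} and then looks up x there:

: (get *DB 'newSym 'x)
-> 777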

Committing the changes

Until now, the changes to the *DB symbol existed only in memory. In order to write them to disk, we need to call (commit). If we prefer to go back to the last committed state, we can call (rollback).

: (commit)
-> T
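
To see what rollback does, here is a minimal sketch, assuming it is run right after the commit above. The key c is made up for this illustration. The uncommitted property is discarded, while everything that was committed stays untouched:

: (put *DB 'c 3)
-> 3
: (rollback)
-> T
: (get *DB 'c)
-> NIL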

If we now exit the REPL and open the database file, the symbols will still be there: the data is persistent.

$ pil +

: (pool "test.db")
-> T

: (show *DB)
{1} "Hello world"
   newSym {2}
   b 2
   a 1
-> {1}

Database transactions: Steps in detail

In a typical case, there will be more than one process operating on the database. In order to keep it synchronized, all these processes should be children of the same parent process. A transaction is normally initiated by calling (dbSync) and closed by calling (commit 'upd). For smaller transactions, there are shortcut functions like new! and put!> (exclamation mark by convention) that call (dbSync) and (commit 'upd) implicitly.

A transaction proceeds through the following five phases:

  1. dbSync waits to get a lock on the root object *DB. Other processes continue reading and writing meanwhile.
  2. dbSync calls sync to synchronize with changes from other processes. We hold the shared lock, but other processes may continue reading.
  3. We make modifications to the internal state of external symbols with put>, set>, lose> etc. We - and also other processes - can still read the DB.
  4. We call (commit 'upd). commit obtains an exclusive lock (no more read operations by other processes), writes an optional transaction log, and then writes all modified symbols. Because upd is passed to commit, other processes synchronize with these changes.
  5. Finally, commit releases all locks.
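
The phases above can be sketched for the small test database from this post. This is only an illustration under some assumptions: it uses plain put because *DB is not an entity object, it changes the key b from the earlier examples, and it takes put! to be the function counterpart of the put!> message mentioned above. In a single process there is nothing to synchronize with, so the calls simply go through.

# Phases 1 and 2: lock the root object and synchronize with other processes
(dbSync)
# Phase 3: modify the in-memory state of an external symbol
(put *DB 'b 3)
# Phases 4 and 5: write to disk, notify other processes, release the locks
(commit 'upd)

# Shortcut for a single change: the same three steps in one call
(put! *DB 'b 4)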

Sources

PicoLisp64 database tutorial
software-lab.de/doc/index.html