Common Lisp Package: DARTS.LIB.HASHTREE

Purely functional hash-based map structure This package provides a purely functional data structure for mapping keys to values. The underlying algorithms are hash-based, and modelled after the paper ``Ideal Hash Trees┬┤┬┤ by Phil Bagwell. A few differences to Common Lisp's standard hash tables: - all objects are immutable after construction - you can specify any equality test predicate you need - works with any user supplied hash function By being immutable, these structures are automatically thread-safe and can be shared across any number of concurrently running threads safely. That fact was actually the main motivation for the development of this package. Note, that hash trees as implemented here are not necessarily a simple drop-in replacement for Common Lisp's standard hash tables. In particular, hash trees work best with equal based equality (as opposed to eq). Also, many Lisp implementations provide weak hash tables, which is not supported by hash trees at all.

README:

Functional Maps and Trees

This project provides a few purely functional data structures. Right now, it provides:

  • Hash Trees: purely functional map, based on hashing. The implementation is derived from Phil Bagwell's paper "Ideal Hash Trees"

  • Property Tree: a purely functional (sorted) map from strings to arbitrary values. The implementation is derived from S. Adams' paper "Implementing Sets Efficiently in a Functional Language"

The tree structures provided by this project are immutable, i.e., cannot be modified once they have been constructed. Mutation operations return newly constructed tree instances (which may actually share state with the original tree, to which the operation was applied).

Hash Tree

A hash tree is a trie structure indexed by the hash codes of the keys. See the paper for details on the algorithm. Hash trees are provided by the package DARTS.LIB.HASHTREE.

Unlike Common Lisp's built-in hash tables, hash trees can use any equivalence predicate and hash function you care to provide. The default equivalence predicate is eql, and the default hash function is sxhash.

  • make-hashtree &key test hash =>tree Creates and returns a new hash tree, which uses the given equivalence predicate test (defaults to eql) hash function hash (defaults to sxhash).

  • hashtree-count tree =>number Returns the number of entries in the given hashtree tree

  • hashtree-hash tree =>function Returns the hash function used by the given hashtree.

  • hashtree-test tree =>function Returns the equivalence predicate used by tree.

  • hashtree-map function tree Calls function for each entry in tree. The function must accept two arguments, the entry's key value as first, and the entry's associated value as second argument. hashtree-map always returns nil.

  • hashtree-keys tree =>list Returns a list containing all key values present in the given tree, in no particular order.

  • hashtree-values tree =>list Returns a list containing all associated values present in the given tree, in no particular order.

  • hashtree-pairs tree =>list Returns a list of pairs (key . value) containing all entries of tree.

  • do-hashtree (key value) tree &body body Iterates over all entries of the hashtree, which is the result of evaluating tree. For each entry, binds key to the entry's key and value to its associated value, then evaluates the forms in body as progn would do. The result of this macro is undefined.

  • hashtree-empty-p tree =>flag Tests, whether tree is empty

  • hashtreep value =>flag Tests, whether value is a hash tree.

  • hashtree-get key tree &optional default =>value, indicator Searches for the value associated with key in tree. If found, returns it as primary result, and t as second result. If no matching entry exists, returns default as primary result, and nil as the secondary.

  • hashtree-remove key tree =>new-tree, indicator Removes the entry for key from tree, returning the resulting new hash tree as primary return value. The secondary return value is a boolean flag indicating, whether a matching entry was found (and removed), or not.

  • hashtree-update key value tree =>new-tree, action Inserts a key/value pair or replaces the value in an existing pair in tree. Returns the new tree as primary return value. The secondary value is an indicator of what was done by the call. Possible values are :added if no matching key was present prior to the call, and :replaced, if the value of an existing entry was replaced by the new value value.

  • hashtree-fold function seed tree =>value

Property Trees

A property tree is a weight-balanced binary search tree. Right now, the implementation only supports string designators as keys. The algorithms are easily generalizable, but I could not yet decide on an interface for that, and string keys were the only ones, I needed up to now.

Since property trees are actually ordered, all iteration functions (like ptree-map, ptree-fold, ...) iterate over the entries of a property tree in proper key order. Likewise, collector functions like ptree-keys, ptree-values, ... will always yield the elements in proper key order.

  • ptreep value =>flag
  • empty-ptree-p tree =>flag
  • ptree-size tree =>integer
  • ptree-key tree =>value
  • ptree-value tree =>value
  • ptree-left tree =>ptree
  • ptree-right tree =>ptree
  • ptree-minimum tree =>ptree
  • ptree-maximum tree =>ptree
  • ptree-smallest key tree =>ptree
  • ptree-largest key tree =>ptree
  • ptree-get key tree &optional default =>value, flag
  • ptree-insert key value tree &optional test =>ptree, flag
  • ptree-update key tree mutator =>ptree, flag
  • ptree-remove key tree =>ptree, flag
  • ptree-map function tree &key direction collectp start end =>value
  • ptree-fold function seed tree &key direction start end =>value
  • ptree-pairs tree =>list
  • ptree-keys tree =>list
  • ptree-values tree =>list
  • ptree-intersection tree1 tree2 =>ptree
  • ptree-union tree1 tree2 =>ptree
  • ptree-difference tree1 tree2 =>ptree
  • ptree-iterator tree =>function
  • ptree-equal tree1 tree2 =>boolean
  • ptree &rest pairs =>ptree

Future Plans

  • Generalization of the property tree code to arbitrary <-style comparison predicates

  • Homogenization of the hashtree and ptree interfaces (e.g., ptree-update vs. ptree-insert vs. hashtree-update)

FUNCTION

Public

HASHTREE (&REST ARGS)

hashtree &rest ARGS => TREE Constructs a new hash tree using the default test and hash function pair (i.e., eql and sxhash). Remaining arguments are used to initialize the new hash tree; an even number of arguments must be supplied. The arguments on positions 2k (with k = 0, 1, ...) are taken as the keys and the values at positions 2k + 1 (with k = 0, 1, ...) are used as the associated values. Example: (hashtree-get 2 (hashtree 1 :one 2 :two 3 :three)) ==> :TWO ==> T This function is provided as convenience constructor for hash trees using the standard hash control functions.

HASHTREE-COUNT (MAP)

Counts the number of entries in a hash tree. Note, that this function has a runtime complexity of `O(n)`, with `n` being the number of key/value pairs in MAP.

HASHTREE-EMPTY-P (MAP)

Tests, whether the given hash tree MAP is empty.

HASHTREE-FOLD (FUNCTION INITIAL-VALUE MAP)

Reduces a hash tree into a single summary value. The function FUNCTION must have the signature (lambda (key value summary) ...). It is called for each key/value pair present in MAP. On the first invocation, the summary value will be INITIAL-VALUE, and on each subsequent invocation it will be the primary return value of FUNCTION from the previous invocation. After all pairs have been processed, that last return value of FUNCTION is returned. If MAP is the empty tree, then this function returns INITIAL-VALUE. Note, that the order, in which this function visits the key/value pairs in MAP, is undefined.

HASHTREE-GET

Obtains the value associated with a given key KEY in MAP. If no matching association exists, returns DEFAULT instead. This function returns as secondary value a boolean, which indicates, whether a matching key/value pair was found (T) or not (NIL).

HASHTREE-HASH (MAP)

Obtains the hash function used by the given hash tree MAP.

HASHTREE-KEYS (MAP)

Computes a list of the keys of all key/value pairs in MAP. Note, that the order of the values in the resulting list is undefined.

HASHTREE-MAP (FUNCTION MAP)

Maps a function across all elements in a hashtree. This function applies FUNCTION to each element in MAP; FUNCTION must accept two arguments and will be called with an element's key as first, and its associated value as second argument. The result of this function is undefined. Note, that the order, in which this function visits the key/value pairs in MAP, is undefined.

HASHTREE-PAIRS (MAP)

Computes a list of all key/value pairs in MAP. The result is a list of conses `(key . value)`. Note, that the order of the values in the resulting list is undefined.

HASHTREE-REMOVE

Removes the association of KEY from hash tree MAP. Returns a copy of MAP without the key/value pair. No guarantees are made as to whether this function returns MAP unchanged or a copy of MAP, if no matching pair is found; the only guarantee made is, that the resulting value will be equivalent to MAP. This function returns as a secondary value a boolean flag, which indicates, whether KEY was found and subsequently removed (T) or not (NIL).

HASHTREE-TEST (MAP)

Obtains the test function used by the given hash tree MAP.

HASHTREE-UPDATE

Updates the hash tree MAP, adding an association of KEY to VALUE. Any previous association of KEY in MAP is replaced. This function returns a copy of MAP with the new association as primary value. The secondary value is one of :added there was no previous association of KEY in MAP :replaced a former association of KEY has been replaced.

HASHTREE-VALUES (MAP)

Computes a list of the values of all key/value pairs in MAP. Note, that the order of the values in the resulting list is undefined.

MAKE-HASHTREE (&KEY (TEST #'EQL) (HASH #'SXHASH))

Creates a new hash tree. The hash tree will use the function supplied as the value of the :test parameter as its equality test predicate, and the value passed as :hash as its hash function. If omitted, the default equality test is eql, and the default hash function is sxhash.

Undocumented

HASHTREEP (OBJECT)

Private

HASH-CONTROL-%EMPTY (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

HASH-CONTROL-HASH (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

HASH-CONTROL-TEST (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

HASHTREE-CONTROL (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

HASHTREE-NODE (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

LEAF-BUCKETS (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

LEAF-HASH (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

MAKE-HASH-CONTROL (TEST HASH)

Creates a new "hash controller". A hash controller is basically a pair of two functions `test` and `hash`. The `test` function is a predicate `(lambda (a b) ...) => boolean`, which tests two values `a` and `b` for equality. The default `test` function is `eql`. The `hash` function has the signature `(lambda (x) ...) => integer`. It is called in order to compute a hash value for the given value `x`. The default hash function is `sxhash`.

SUBTABLE-MASK (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

SUBTABLE-VECTOR (INSTANCE)

@arg[extid]{A @class{extid}} @return[sytemid]{puri:uri or nil} Returns the System ID part of this External ID.

Undocumented

%MAKE-HASH-CONTROL (TEST HASH)

%MAKE-HASHTREE (CONTROL &OPTIONAL (NODE +EMPTY+))

COMPUTE-HASH (VALUE FN)

COPY-LEAF (INSTANCE)

COPY-SUBTABLE (INSTANCE)

EMPTYP (OBJECT)

SETFHASH-CONTROL-%EMPTY (NEW-VALUE INSTANCE)

HASH-CONTROL-P (OBJECT)

LEAF-P (OBJECT)

MAKE-EMPTY (&KEY)

MAKE-LEAF (HASH BUCKETS)

MAKE-NODE (&KEY)

MAKE-SUBTABLE (MASK VECTOR)

NODE-P (OBJECT)

SUBTABLE-P (OBJECT)

MACRO

Public

DEFINE-HASHTREE-CONSTRUCTOR (NAME &KEY (TEST '#'EQL) (HASH '#'SXHASH))

Defines a factory function for hash trees using a dedicated pair of equality predicate TEST and hash function HASH. The forms TEST and HASH are evaluated and must yield funcallable values (i.e., symbols or functions). The primary effect of this macro is the definition of a new function NAME, which has the signature (lambda (&rest args) ...). It must be called with an even number of arguments. The arguments at indices 2k (for k in 0, 1, ...) are taken to be the keys, and arguments at indices 2k + 1 are the associated values. The result of calling NAME is a hash tree containing the key/value pairs passed as arguments. Example: > (define-hashtree-constructor integer-tree :test #'= :hash #'identity) INTEGER-TREE > (integer-tree 1 :one 2 :two 3 :three) #<HASHTREE ...> > (hashtree-get 2 *) :TWO T

DO-HASHTREE ((KEY VALUE) TREE &BODY BODY)

Evaluates the form TREE, which must yield a hash tree instance. Visits each key/value pair in that tree, binding the variable named KEY to the pair's key, and VALUE to the pair's value, and evaluates the forms in BODY like progn does. There is an implicit anonymous block surrounding the expansion of this form, which may be used to stop the iteration before all elements have been visited. The result value of this form is nil, unless an explicit result value is specified in BODY by doing a `return`.

VARIABLE

Private

Undocumented

*SIMPLE-HASHTREE-CONTROLLER*

+EMPTY+

+ROOT+

CLASS

Public

Undocumented

HASHTREE (&REST ARGS)

Private

Undocumented

EMPTY

HASH-CONTROL

LEAF

NODE

SUBTABLE