libavl

A library for manipulation of AVL trees

Ben Pfaff


Introduction to AVL trees

Consider some techniques that can be used to find a particular item in an ordered data set. Typical methods include sequential searching, binary searching, digital searches, and hash tables. Sequential searching is simple, but slow (O(n)). Digital searching requires that the entire data set be known in advance, and memory efficient implementations are also slow.

Hash tables are fast (O(1)) for static data sets, but they require expensive table enlargements if the table size can't be predicted in advance, and they can be wasteful of memory. In addition, it can be difficult and time-consuming to choose an effective hash function. Some hash table variants also make deletion an expensive operation.

Binary search techniques work almost as quickly (O(log(n))) on an ordered table, or on a binary tree. Plus, binary trees are efficient for insertion, deletion, and searching, if data are inserted in random order. But if data are inserted in sorted order, then an ordinary binary tree degenerates into a linked list, and searching it becomes a sequential search.
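The degenerate case is easy to reproduce. The following is a minimal sketch of an ordinary, unbalanced binary search tree; the names bst_node, bst_insert, bst_height, bst_free, and height_after_sorted_inserts are illustrative only and are not part of libavl.

```c
#include <stdlib.h>

/* A plain, unbalanced binary search tree node (illustrative only). */
struct bst_node {
    int key;
    struct bst_node *left, *right;
};

/* Insert key below *root; duplicates are ignored.
   Returns 1 on success, 0 if memory allocation fails. */
int bst_insert(struct bst_node **root, int key)
{
    while (*root != NULL) {
        if (key == (*root)->key)
            return 1;
        root = key < (*root)->key ? &(*root)->left : &(*root)->right;
    }
    *root = malloc(sizeof **root);
    if (*root == NULL)
        return 0;
    (*root)->key = key;
    (*root)->left = (*root)->right = NULL;
    return 1;
}

/* Height counted in nodes: 0 for an empty tree. */
int bst_height(const struct bst_node *n)
{
    if (n == NULL)
        return 0;
    int l = bst_height(n->left), r = bst_height(n->right);
    return 1 + (l > r ? l : r);
}

/* Release every node of the tree. */
void bst_free(struct bst_node *n)
{
    if (n != NULL) {
        bst_free(n->left);
        bst_free(n->right);
        free(n);
    }
}

/* Height of the tree built by inserting 1..n in ascending order,
   or -1 if memory allocation fails. */
int height_after_sorted_inserts(int n)
{
    struct bst_node *root = NULL;
    int height = -1;
    int ok = 1;
    for (int i = 1; ok && i <= n; i++)
        ok = bst_insert(&root, i);
    if (ok)
        height = bst_height(root);
    bst_free(root);
    return height;
}
```

With keys inserted in ascending order, every node has only a right child, so the height of the tree equals the number of nodes and a search has to step through them one by one.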

One further advantage of binary trees is that they allow easy iteration over the data in the tree in sorted order. With hash tables it is necessary to sort the data before iterating, and after sorting the data is no longer in hash form.

AVL trees, invented by Russian mathematicians G. M. Adel'son-Velskii and E. M. Landis, solve the problem of degenerate trees by ensuring that, for each node, the difference in height between its two subtrees (the balance factor) is never greater than 1. This is done by performing a rotation whenever an insertion or deletion would push the balance factor above 1. Somewhat amazingly, these rotations can be done while retaining O(log(n)) efficiency.
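As a rough illustration of how rotations restore balance, here is a minimal recursive sketch. It is not libavl's implementation: it stores a height in every node rather than a balance factor, and every name in it (sk_node, sk_insert, and so on) is hypothetical.

```c
#include <stdlib.h>

struct sk_node {
    int key;
    int height;                     /* height in nodes of this subtree */
    struct sk_node *left, *right;
};

int sk_height(const struct sk_node *n) { return n ? n->height : 0; }

/* Recompute n's height from its children. */
void sk_update(struct sk_node *n)
{
    int l = sk_height(n->left), r = sk_height(n->right);
    n->height = 1 + (l > r ? l : r);
}

/* Balance factor: right subtree height minus left subtree height. */
int sk_balance(const struct sk_node *n)
{
    return sk_height(n->right) - sk_height(n->left);
}

/* Rotate left around x; x's right child becomes the subtree root. */
struct sk_node *sk_rotate_left(struct sk_node *x)
{
    struct sk_node *y = x->right;
    x->right = y->left;
    y->left = x;
    sk_update(x);
    sk_update(y);
    return y;
}

/* Rotate right around y; y's left child becomes the subtree root. */
struct sk_node *sk_rotate_right(struct sk_node *y)
{
    struct sk_node *x = y->left;
    y->left = x->right;
    x->right = y;
    sk_update(y);
    sk_update(x);
    return x;
}

/* Insert key and rebalance on the way back up; returns the new root. */
struct sk_node *sk_insert(struct sk_node *n, int key)
{
    if (n == NULL) {
        n = malloc(sizeof *n);
        if (n == NULL)
            abort();
        n->key = key;
        n->height = 1;
        n->left = n->right = NULL;
        return n;
    }
    if (key < n->key)
        n->left = sk_insert(n->left, key);
    else if (key > n->key)
        n->right = sk_insert(n->right, key);
    else
        return n;                                   /* duplicate: no change */
    sk_update(n);
    if (sk_balance(n) > 1) {                        /* right-heavy */
        if (sk_balance(n->right) < 0)
            n->right = sk_rotate_right(n->right);   /* double rotation */
        n = sk_rotate_left(n);
    } else if (sk_balance(n) < -1) {                /* left-heavy */
        if (sk_balance(n->left) > 0)
            n->left = sk_rotate_left(n->left);      /* double rotation */
        n = sk_rotate_right(n);
    }
    return n;
}

/* Build a tree from the keys 1..n inserted in ascending order. */
struct sk_node *sk_build_ascending(int n)
{
    struct sk_node *root = NULL;
    for (int i = 1; i <= n; i++)
        root = sk_insert(root, i);
    return root;
}
```

Inserting the keys 1 through 7 in ascending order, the worst case for an ordinary binary tree, leaves 4 at the root of a tree three nodes high.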

Binary trees, including AVL trees, come in a few varieties: unthreaded, threaded, and right-threaded. Each of these types has its own advantages and disadvantages; all three are implemented by libavl. In theory there could also be left-threaded trees, but in practice these are rare, and libavl does not implement left-threaded trees.

There are other alternatives to hash tables that have some of the same properties as AVL trees; for instance, skip lists, red-black trees, 2-3 trees, and splay trees all allow O(log(n)) insertion and deletion. The main disadvantage of these methods is that their operations are not as well documented in the literature.

Introduction to libavl

libavl is three libraries in one: an unthreaded AVL tree library, a threaded AVL tree library, and a right-threaded AVL tree library. Identifiers in each of these libraries are prefixed by avl_, avlt_, and avltr_, respectively. The corresponding header files are `avl.h', `avlt.h', and `avltr.h', and the functions that they declare are defined in the `.c' files with the same names.

Threading is a clever method that simplifies binary tree traversal. Nodes in an unthreaded binary tree that have zero or one subnodes have two or one null subnode pointers, respectively. In a threaded binary tree, however, a null left child pointer is used to point to the node's inorder(1) predecessor; a null right child pointer points to its inorder successor. In this way, it is always possible to find the next node and the previous node of a node in a threaded tree, given only a pointer to the node in question. In an unthreaded tree, this can't be done without either performing a search of the tree from the top or maintaining a stack that keeps track of the current node's parent nodes.
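The successor lookup that threads make possible can be sketched as follows. The node layout here (a tnode with one tag per link) is an assumption made for illustration; it is not the layout libavl uses, and tnode_next and tnode_demo are hypothetical names.

```c
#include <stddef.h>

enum link_kind { CHILD, THREAD };

/* Hypothetical threaded node: each link is either a real child or a
   thread to the inorder predecessor (left) or successor (right). */
struct tnode {
    int key;
    struct tnode *link[2];      /* [0] = left, [1] = right */
    enum link_kind tag[2];      /* what each link currently holds */
};

/* Inorder successor of n, or NULL if n is the last node. */
struct tnode *tnode_next(const struct tnode *n)
{
    struct tnode *p = n->link[1];
    if (n->tag[1] == THREAD)
        return p;               /* the thread is the successor (NULL at the end) */
    while (p->tag[0] == CHILD)  /* else: leftmost node of the right subtree */
        p = p->link[0];
    return p;
}

/* Build the three-node tree 2 -> (1, 3) by hand and walk it in order,
   collecting the keys as decimal digits. */
int tnode_demo(void)
{
    struct tnode a = {1, {NULL, NULL}, {THREAD, THREAD}};
    struct tnode b = {2, {NULL, NULL}, {CHILD, CHILD}};
    struct tnode c = {3, {NULL, NULL}, {THREAD, THREAD}};
    int digits = 0;

    b.link[0] = &a;
    b.link[1] = &c;
    a.link[1] = &b;             /* thread: the successor of 1 is 2 */
    c.link[0] = &b;             /* thread: the predecessor of 3 is 2 */

    for (const struct tnode *p = &a; p != NULL; p = tnode_next(p))
        digits = digits * 10 + p->key;
    return digits;
}
```

No stack and no search from the root is needed: each step follows either one thread or a short run of left links.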

Advantages of a threaded tree compared to an unthreaded one include:

Some disadvantages of threaded trees are:

A right-threaded binary tree is similar to a threaded binary tree, but threads are only maintained on the right side of each node. This allows for traversal to the right (from smallest to largest value) but not to the left (from largest to smallest value). Right-threaded trees are convenient when the properties of a threaded tree are desirable, but traversal in reverse sort order is not necessary. Not threading the left links saves time in tree maintenance.

Although there exist left-threaded binary trees corresponding to right-threaded trees, left-threaded binary trees are not implemented by libavl. An analogous effect can be obtained by simply sorting the tree in the opposite order.

Most AVL tree functions are implemented in all three libraries, but threading allows more generality of operation. So, the threaded and right-threaded libraries offer a few additional functions for finding the next or previous node from a given node. In addition, they offer functions for converting trees from threaded or right-threaded representations to unthreaded, and vice versa.(2)

Types

The following types are defined and used by libavl:

Data Type: avl_tree
Data Type: avlt_tree
Data Type: avltr_tree
These are the data types used to represent an AVL tree. Although they are defined in the libavl header files, it should never be necessary to access their members directly. Instead, all accesses should take place through libavl functions.

Data Type: avl_node
Data Type: avlt_node
Data Type: avltr_node
These are the data types used to represent individual nodes in an AVL tree. Similar cautions apply as with avl_tree structures.

Data Type: avl_traverser
Data Type: avlt_traverser
Data Type: avltr_traverser
These are the data types used by the avl_traverse family of functions to iterate across the tree. Again, these are opaque structures.

Data Type: avl_comparison_func
Every AVL tree must have an ordering defined by a function of this type. It must have the following signature:
int compare (const void *a, const void *b, void *param)

The return value is expected to be like that returned by strcmp in the standard C library: negative if a < b, zero if a = b, positive if a > b. param is an arbitrary value defined by the user when the AVL tree was created.
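As a sketch, two comparison functions with this signature might look like the following. The names compare_ints and compare_strings are illustrative, and treating param as a pointer to a "descending" flag is only one possible convention, not something libavl prescribes.

```c
#include <stddef.h>
#include <string.h>

/* Order ints stored behind the data pointers; param is unused. */
int compare_ints(const void *a, const void *b, void *param)
{
    const int *x = a, *y = b;
    (void) param;
    return (*x > *y) - (*x < *y);   /* avoids the overflow risk of *x - *y */
}

/* Order C strings. If param points to a nonzero int (an assumed
   convention), the order is reversed. */
int compare_strings(const void *a, const void *b, void *param)
{
    const int *descending = param;
    int cmp = strcmp(a, b);
    if (descending != NULL && *descending)
        return (cmp < 0) - (cmp > 0);
    return cmp;
}
```

Either function could be passed as the compare argument when a tree is created, with the flag (or a null pointer) as param.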

Data Type: avl_node_func
This is a function called to perform an operation on an AVL node. It must have the following signature:
void operate (void *data, void *param)

data is the node data and param is an arbitrary user-defined value set when the AVL tree was created.
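A node function typically uses param to carry state between calls. The sketch below adds up int items into an accumulator; sum_ints and sum_demo are hypothetical names, and the loop in sum_demo merely stands in for the calls a tree walk would make.

```c
#include <stddef.h>

/* Add the int behind data to the long accumulator behind param. */
void sum_ints(void *data, void *param)
{
    long *total = param;
    *total += *(int *) data;
}

/* Stand-in for a tree walk: call sum_ints once per item. */
long sum_demo(void)
{
    int items[] = {3, 5, 8};
    long total = 0;
    for (size_t i = 0; i < sizeof items / sizeof items[0]; i++)
        sum_ints(&items[i], &total);
    return total;
}
```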

Data Type: avl_copy_func

This is a function called to make a new copy of a node's data. It must have the following signature:

void *copy (void *data, void *param)

The function should return a new copy of data. param is an arbitrary user-defined value set when the AVL tree was created.
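For a tree whose items are heap-allocated strings, a copy function could look like this sketch. The name copy_string is illustrative, and since this manual does not say how a null return from the copy function is treated, the sketch makes no assumption about it beyond returning whatever malloc returned.

```c
#include <stdlib.h>
#include <string.h>

/* Return a freshly allocated duplicate of the string behind data.
   param is unused. Returns a null pointer if allocation fails. */
void *copy_string(void *data, void *param)
{
    const char *src = data;
    size_t n = strlen(src) + 1;     /* include the terminating null byte */
    char *dst = malloc(n);
    (void) param;
    if (dst != NULL)
        memcpy(dst, src, n);
    return dst;
}
```

Copying each item this way means the new tree owns its own strings, so freeing one tree's data does not invalidate the other's.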

Macro: AVL_MAX_HEIGHT
This macro defines the maximum height of an AVL tree that can be handled by functions that maintain a stack of nodes descended. The default value is 32, which is enough for any AVL tree of up to 5,704,880 nodes regardless of insertion order; a more evenly filled tree of that height can hold more. If the value is increased, then some functions that keep a bitmap of descended nodes in a single int will have to be rewritten. On the other hand, the default value can be reduced without harm, although there is little reason to do this.

Tree Creation

These functions deal with creation and destruction of AVL trees.

Function: avl_tree * avl_create (avl_comparison_func compare, void *param)
Function: avlt_tree * avlt_create (avl_comparison_func compare, void *param)
Function: avltr_tree * avltr_create (avl_comparison_func compare, void *param)
Create a new, empty AVL tree with comparison function compare. Arbitrary user data param is saved so that it can be passed to all user callback functions.

Function: void avl_destroy (avl_tree *tree, avl_node_func free)
Function: void avlt_destroy (avlt_tree *tree, avl_node_func free)
Function: void avltr_destroy (avltr_tree *tree, avl_node_func free)
Destroys AVL tree tree, releasing all of its storage. If free is non-null, then it is called for every node in postorder before that node is freed.

Function: void avl_free (avl_tree *tree)
Function: void avlt_free (avlt_tree *tree)
Function: void avltr_free (avltr_tree *tree)
Destroys AVL tree tree, releasing all of its storage. The data in each node is freed with a call to the standard C library function free.

Function: avl_tree * avl_copy (const avl_tree *tree, avl_copy_func copy)
Function: avlt_tree * avlt_copy (const avlt_tree *tree, avl_copy_func copy)
Function: avltr_tree * avltr_copy (const avltr_tree *tree, avl_copy_func copy)
Copies the contents of AVL tree tree into a new AVL tree, and returns the new tree. If copy is non-null, then it is called to make a new copy of each node's data; otherwise, the node data is copied verbatim into the new tree.

Function: int avl_count (const avl_tree *tree)
Function: int avlt_count (const avlt_tree *tree)
Function: int avltr_count (const avltr_tree *tree)
Returns the number of nodes in AVL tree tree.

Function: void * xmalloc (size_t size)
This is not a function defined by the AVL tree library. Instead, it is a function that the user program can define. It must allocate size bytes using malloc and return a pointer to the allocated block. It can handle out-of-memory errors however it chooses, but it must never return a null pointer.

If there is an xmalloc function defined for use by the AVL tree library, the AVL tree source files (`avl.c', `avlt.c', `avltr.c') must be compiled with HAVE_XMALLOC defined. Otherwise, the AVL libraries will use their internal static xmalloc, which handles out-of-memory errors by printing a message `virtual memory exhausted' to stderr and terminating the program with exit code EXIT_FAILURE.
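A user-supplied xmalloc that mirrors the behavior described for the internal one might look like this sketch. Rounding a zero-byte request up to one byte is an addition made here, because malloc(0) is allowed to return a null pointer even when memory is available.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocate size bytes or terminate the program; never returns null. */
void *xmalloc(size_t size)
{
    void *p = malloc(size != 0 ? size : 1);     /* avoid malloc(0) */
    if (p == NULL) {
        fputs("virtual memory exhausted\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

A program that prefers a different policy, such as releasing a cache and retrying, can put that logic here instead, as long as the function never returns a null pointer.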

Insertion and Deletion

These functions insert nodes, delete nodes, or search for nodes in AVL trees.

Function: void ** avl_probe (avl_tree *tree, void *data)
Function: void ** avlt_probe (avlt_tree *tree, void *data)
Function: void ** avltr_probe (avltr_tree *tree, void *data)
These are the workhorse functions for insertion into AVL trees. They search AVL tree tree for a node with data matching data. If found, a pointer to the matching data is returned. Otherwise, a new node is created for data, and a pointer to that data is returned. In either case, the user may change the data pointer stored at the returned location, but the key data used by the tree's comparison function must not be changed(3).

It is usually easier to use one of the avl_insert or avl_replace functions instead of avl_probe directly.

Please note: It's not a particularly good idea to insert a null pointer as a data item into an AVL tree, because several of the AVL tree functions return a null pointer to indicate failure. You can sometimes avoid a problem by using functions that return a pointer to a pointer instead of a plain pointer. Also be wary of this when you cast an arithmetic type to a void pointer for insertion--on typical architectures, 0's become null pointers when you do this.
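One possible workaround when storing small non-negative integers directly in the data pointer is to offset them by one, so that 0 no longer becomes a null pointer. This is only a sketch under the stated assumptions: the helper names int_to_item and item_to_int are hypothetical, the platform must provide intptr_t, and the tree's comparison function would have to decode items the same way.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode a non-negative int as a non-null pointer by adding one.
   The value -1 would map to a null pointer on typical architectures,
   so negative values are not supported by this scheme. */
void *int_to_item(int value)
{
    return (void *) ((intptr_t) value + 1);
}

/* Recover the int stored by int_to_item. */
int item_to_int(const void *item)
{
    return (int) ((intptr_t) item - 1);
}
```

With this encoding, a null pointer returned by a tree function can only mean failure, never a stored zero.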

Function: void * avl_insert (avl_tree *tree, void *data)
Function: void * avlt_insert (avlt_tree *tree, void *data)
Function: void * avltr_insert (avltr_tree *tree, void *data)
If a node with data matching data exists in AVL tree tree, returns the matching data item. Otherwise, inserts data into tree and returns a null pointer.

Function: void avl_force_insert (avl_tree *tree, void *data)
Function: void avlt_force_insert (avlt_tree *tree, void *data)
Function: void avltr_force_insert (avltr_tree *tree, void *data)
Inserts data into tree. If a node with data matching data exists in tree, aborts the program with an assertion violation. This function is implemented as a macro; if it is used, the standard C header assert.h must also be included. If macro NDEBUG is defined when the libavl header is included, this function is short-circuited to a direct call to avl_insert, without checking the return value.

Function: void * avl_replace (avl_tree *tree, void *data)
Function: void * avlt_replace (avlt_tree *tree, void *data)
Function: void * avltr_replace (avltr_tree *tree, void *data)
If a node with data matching data exists in AVL tree tree, replaces the node's data with data and returns the node's former contents. Otherwise, inserts data into tree and returns a null pointer.

Function: void * avl_delete (avl_tree *tree, const void *data)
Function: void * avlt_delete (avlt_tree *tree, const void *data)
Function: void * avltr_delete (avltr_tree *tree, const void *data)
Searches AVL tree tree for a node with data matching data. If found, the node is deleted and its data is returned. Otherwise, returns a null pointer.

Function: void * avl_force_delete (avl_tree *tree, const void *data)
Function: void * avlt_force_delete (avlt_tree *tree, const void *data)
Function: void * avltr_force_delete (avltr_tree *tree, const void *data)
Deletes a node with data matching data from AVL tree tree. If no matching node is found, aborts the program with an assertion violation. If macro NDEBUG is defined when the libavl header is included, this function is short-circuited to a direct call to avl_delete, without checking the return value.

Function: void * avl_find (avl_tree *tree, const void *data)
Function: void ** avlt_find (avlt_tree *tree, const void *data)
Function: void ** avltr_find (avltr_tree *tree, const void *data)
Searches AVL tree tree for a node with data matching data. If found, returns the node's data (for threaded and right-threaded trees, a pointer to the node's data). Otherwise, returns a null pointer.

Iteration

These functions allow the caller to iterate across the items in an AVL tree.

Function: void avl_walk (const avl_tree *tree, avl_node_func operate, void *param)
Function: void avlt_walk (const avlt_tree *tree, avl_node_func operate, void *param)
Function: void avltr_walk (const avltr_tree *tree, avl_node_func operate, void *param)
Walks through all the nodes in AVL tree tree in inorder, and calls function operate for each node. param overrides the value passed to avl_create for this operation only. operate must not change the key data in the nodes in a way that would reorder the data values or cause two values to be equal.

Function: void * avl_traverse (const avl_tree *tree, avl_traverser *trav)
Function: void * avlt_traverse (const avlt_tree *tree, avlt_traverser *trav)
Function: void * avltr_traverse (const avltr_tree *tree, avltr_traverser *trav)
Returns each of AVL tree tree's nodes' data values in sequence, then a null pointer to indicate the last item. trav must be initialized to 0 before the first call in a declaration like this:
avl_traverser trav = {0};

Each avl_traverser is a separate, independent iterator.

For threaded and right-threaded trees, avlt_next or avltr_next, respectively, are faster and more memory-efficient than avlt_traverse or avltr_traverse.

Function: void ** avlt_next (const avlt_tree *tree, void **data)
Function: void ** avltr_next (const avltr_tree *tree, void **data)
data must be a null pointer or a pointer to a data item in AVL tree tree. Returns a pointer to the next data item after data in tree in inorder (this is the first item if data is a null pointer), or a null pointer if data was the last item in tree.

Function: void ** avlt_prev (const avlt_tree *tree, void **data)
data must be a null pointer or a pointer to a data item in AVL tree tree. Returns a pointer to the previous data item before data in tree in inorder (this is the last, or greatest valued, item if data is a null pointer), or a null pointer if data was the first item in tree.

Conversion

Function: avlt_tree * avlt_thread (avl_tree *tree)
Function: avltr_tree * avltr_thread (avl_tree *tree)
Adds symmetric threads or just right threads to unthreaded AVL tree tree and returns a pointer to tree cast to the appropriate type. After one of these functions is called, threaded or right-threaded functions, as appropriate, must be used with tree; unthreaded functions may not be used.

Function: avl_tree * avlt_unthread (avlt_tree *tree)
Function: avl_tree * avltr_unthread (avltr_tree *tree)
Removes all threads from threaded or right-threaded AVL tree tree and returns a pointer to tree cast to avl_tree *. After one of these functions is called, unthreaded functions must be used with tree; threaded or right-threaded functions may not be used.

Author

libavl was written by Ben Pfaff, based on algorithms from Donald Knuth's venerable Art of Computer Programming series from Addison-Wesley, primarily Volumes 1 and 3.

Ben can be contacted at blp@gnu.org.

Index

a

  • Adel'son-Velskii, G. M.
  • Art of Computer Programming
  • author
  • avl_comparison_func
  • avl_copy
  • avl_copy_func
  • avl_count
  • avl_create
  • avl_delete
  • avl_destroy
  • avl_find
  • avl_force_delete
  • avl_force_insert
  • avl_free
  • avl_insert
  • AVL_MAX_HEIGHT
  • avl_node
  • avl_node_func
  • avl_probe
  • avl_replace
  • avl_traverse
  • avl_traverser
  • avl_tree
  • avl_walk
  • avlt_count
  • avlt_create
  • avlt_delete
  • avlt_destroy
  • avlt_find
  • avlt_force_delete
  • avlt_force_insert
  • avlt_free
  • avlt_insert
  • avlt_next
  • avlt_node
  • avlt_probe
  • avlt_replace
  • avlt_thread
  • avlt_traverse
  • avlt_traverser
  • avlt_tree
  • avlt_unthread
  • avlt_walk
  • avltr_count
  • avltr_create
  • avltr_delete
  • avltr_destroy
  • avltr_find
  • avltr_force_delete
  • avltr_force_insert
  • avltr_free
  • avltr_insert
  • avltr_next
  • avltr_node
  • avltr_prev
  • avltr_probe
  • avltr_replace
  • avltr_thread
  • avltr_traverse
  • avltr_traverser
  • avltr_tree
  • avltr_unthread
  • avltr_walk

b

  • binary tree

h

  • hash table

k

  • Knuth, Donald Ervin

l

  • Landis, E. M.
  • left threads

p

  • Pfaff, Benjamin Levy

r

  • right threads

t

  • threads

u

  • unthreaded

x

  • xmalloc

Footnotes

(1)

In tree traversal, inorder refers to visiting the nodes in their sorted order from smallest to largest.

(2)

In general, you should build the sort of tree that you need to use, but occasionally it is useful to convert between tree types.

(3)

It can be changed if this would not change the ordering of the nodes in the tree; i.e., if this would not cause the data in the node to be less than or equal to the previous node's data or greater than or equal to the next node's data.


This document was generated on 11 May 1999 using texi2html 1.56k.