libavl
A library for manipulation of AVL trees
Ben Pfaff
Consider some techniques that can be used to find a particular item in
an ordered data set. Typical methods include sequential searching,
binary searching, digital searches, and hash tables. Sequential
searching is simple, but slow (O(n)). Digital searching
requires that the entire data set be known in advance, and memory
efficient implementations are also slow.
Hash tables are fast (O(1)) for static data sets, but they require
expensive table enlargements if table size can't be predicted in
advance, and they can be wasteful of memory. In addition, it can be
difficult and time-consuming to choose an effective hash function. Some
hash table variants also make deletion an expensive operation.
Binary search techniques work almost as quickly (O(log(n))) on an ordered
table, or on a binary tree. Plus, binary trees are efficient for
insertion, deletion, and searching, if data are inserted in random
order. But if data are inserted in sorted order then ordinary binary
tree searching can degenerate to a sequential search.
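The degenerate case can be made concrete with a minimal sketch of a plain, unbalanced binary search tree. The names bst_node, bst_insert, bst_height, and height_after_inserting are hypothetical and are not part of libavl; the point is only that keys inserted in ascending order produce a tree as tall as the number of keys.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical unbalanced binary search tree node; not a libavl type. */
struct bst_node {
  int key;
  struct bst_node *left, *right;
};

/* Insert KEY below *ROOT without any rebalancing.  Duplicates are ignored. */
void bst_insert(struct bst_node **root, int key) {
  while (*root != NULL) {
    if (key == (*root)->key)
      return;
    root = key < (*root)->key ? &(*root)->left : &(*root)->right;
  }
  *root = malloc(sizeof **root);
  assert(*root != NULL);
  (*root)->key = key;
  (*root)->left = (*root)->right = NULL;
}

/* Height counted in levels: an empty tree has height 0. */
int bst_height(const struct bst_node *node) {
  if (node == NULL)
    return 0;
  int l = bst_height(node->left);
  int r = bst_height(node->right);
  return 1 + (l > r ? l : r);
}

/* Release every node below NODE. */
void bst_free(struct bst_node *node) {
  if (node == NULL)
    return;
  bst_free(node->left);
  bst_free(node->right);
  free(node);
}

/* Build a tree from the N keys in KEYS and return its height. */
int height_after_inserting(const int *keys, int n) {
  struct bst_node *root = NULL;
  for (int i = 0; i < n; i++)
    bst_insert(&root, keys[i]);
  int h = bst_height(root);
  bst_free(root);
  return h;
}
```

With seven keys, inserting them as 1, 2, ..., 7 gives a height of 7 (a linked list in all but name), while the order 4, 2, 6, 1, 3, 5, 7 gives a height of 3.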
One further advantage of binary trees is that they allow easy iteration
over the data in the tree in sorted order. With hash tables it is
necessary to sort the data before iterating, and after sorting the data
is no longer in hash form.
AVL trees, invented by Russian mathematicians G. M. Adel'son-Velskii and
E. M. Landis, solve the problem of degeneration by ensuring that, for
each node, the difference in height between its subtrees (the balance
factor) is never greater than 1 in magnitude. This is done by performing
a rotation whenever an insertion or deletion would push the balance
factor outside that range. Somewhat amazingly, these rotations can be
done while retaining O(log(n)) efficiency.
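As an illustration of what a rotation does, the following sketch performs a single left rotation on a right-heavy subtree. The node layout and the names height, balance_factor, and rotate_left are assumptions made for this example; libavl's own nodes store the balance factor rather than recomputing heights, and a full implementation also needs right rotations and double rotations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; libavl's avl_node differs. */
struct node {
  int key;
  struct node *left, *right;
};

/* Height in levels; an empty subtree has height 0. */
int height(const struct node *n) {
  if (n == NULL)
    return 0;
  int l = height(n->left), r = height(n->right);
  return 1 + (l > r ? l : r);
}

/* Balance factor: right subtree height minus left subtree height. */
int balance_factor(const struct node *n) {
  return height(n->right) - height(n->left);
}

/* Rotate left around ROOT, whose right child must exist.
   Returns the new root of the subtree; inorder sequence is unchanged. */
struct node *rotate_left(struct node *root) {
  struct node *pivot = root->right;
  assert(pivot != NULL);
  root->right = pivot->left;   /* pivot's left subtree moves under ROOT */
  pivot->left = root;          /* ROOT becomes pivot's left child */
  return pivot;
}
```

Applied to the chain 1 -> 2 -> 3 (each node the right child of the previous one), the balance factor at the root is 2 before the rotation and 0 afterwards, with 2 as the new root.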
Binary trees, including AVL trees, come in a few varieties: unthreaded,
threaded, and right-threaded. Each of these types has its own
advantages and disadvantages; all three are implemented by libavl. In
theory there could also be left-threaded trees, but in practice these
are rare, and libavl does not implement left-threaded trees.
There are other alternatives to hash tables that have some of the same
properties as AVL trees; for instance, skip lists, red-black trees, 2-3
trees, and splay trees all allow O(log(n)) insertion and deletion. The
main disadvantage of these methods is that their operations are not as
well documented in the literature.
libavl is three libraries in one: an unthreaded AVL tree library, a
threaded AVL tree library, and a right-threaded AVL tree library.
Identifiers in each of these libraries are prefixed by avl_, avlt_, and
avltr_, respectively. The corresponding header files are `avl.h',
`avlt.h', and `avltr.h', and the functions that they declare are
defined in the `.c' files with the same names.
Threading is a clever method that simplifies binary tree
traversal. Nodes in an unthreaded binary tree that have zero or one
subnodes have two or one null subnode pointers, respectively. In a
threaded binary tree, however, a null left child pointer is used to
point to the node's inorder(1) predecessor; a null right child pointer points to its inorder
successor. In this way, it is always possible to find the next node and
the previous node of a node in a threaded tree, given only a pointer to
the node in question. In an unthreaded tree, this can't be done without
either performing a search of the tree from the top or maintaining a
stack keeping track of the current node's parent nodes.
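The idea can be sketched in a few lines of C. The tnode structure and the tnode_step function below are hypothetical and do not reflect the layout of libavl's avlt_node; the sketch assumes that the thread at either end of the tree is a null pointer with its thread flag set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threaded node; libavl's avlt_node has a different layout. */
struct tnode {
  int key;
  struct tnode *link[2];    /* link[0] = left, link[1] = right */
  unsigned char thread[2];  /* nonzero: link is a thread, not a child */
};

/* Step from NODE to its inorder neighbour.  DIR is 1 for the successor
   and 0 for the predecessor.  Returns NULL past either end of the tree. */
struct tnode *tnode_step(struct tnode *node, int dir) {
  assert(node != NULL && (dir == 0 || dir == 1));
  struct tnode *p = node->link[dir];
  if (node->thread[dir])
    return p;               /* the thread points straight at the neighbour */
  /* A real child: descend as far as possible in the opposite direction. */
  while (!p->thread[!dir])
    p = p->link[!dir];
  return p;
}
```

No stack and no search from the root is involved: the function needs only the node it is given, which is the property described above.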
Advantages of a threaded tree compared to an unthreaded one include:
- Faster traversal, since no stack need be maintained.
- Less memory usage during traversal, since no stack need be maintained.
- Algorithms that require moving forward and backward in the tree during
  traversal are much simplified. (The unthreaded traversal functions in
  this library implement only forward movement.)
- Greater generality, since one can go from a node to its successor or
  predecessor given only the node; no traversal need be in progress.
Some disadvantages of threaded trees are:
- Slower tree creation, since threads need to be maintained. This can
  partly be alleviated by constructing the tree as an unthreaded tree,
  then threading it with a special libavl function.
- In theory, threaded trees need two extra bits per node to indicate
  whether each child pointer points to an ordinary node or the node's
  successor/predecessor node. In libavl, however, these bits are stored
  in a byte that is used for structure alignment padding in unthreaded
  binary trees, so no extra storage is used.
A right-threaded binary tree is similar to a threaded binary
tree, but threads are only maintained on the right side of each node.
This allows for traversal to the right (from smallest to largest value)
but not to the left (from largest to smallest value). Right-threaded
trees are convenient when the properties of a threaded tree are
desirable, but traversal in reverse sort order is not necessary. Not
threading the left links saves time in tree maintenance.
Although there exist left-threaded binary trees corresponding to
right-threaded trees, left-threaded binary trees are not implemented by
libavl. An analogous effect can be obtained by simply sorting the tree
in the opposite order.
Most AVL tree functions are implemented in all three libraries, but
threading allows more generality of operation. So, the threaded and
right-threaded libraries offer a few additional functions for finding
the next or previous node from a given node. In addition, they offer
functions for converting trees from threaded or right-threaded
representations to unthreaded, and vice versa.(2)
The following types are defined and used by libavl:
- Data Type: avl_tree
-
- Data Type: avlt_tree
-
- Data Type: avltr_tree
-
These are the data types used to represent an AVL tree. Although they
are defined in the libavl header files, it should never be necessary to
access them directly. Instead, all accesses should take place through
libavl functions.
- Data Type: avl_node
-
- Data Type: avlt_node
-
- Data Type: avltr_node
-
These are the data types used to represent individual nodes in an AVL
tree. Similar cautions apply as with avl_tree structures.
- Data Type: avl_traverser
-
- Data Type: avlt_traverser
-
- Data Type: avltr_traverser
-
These are the data types used by the avl_traverse family of
functions to iterate across the tree. Again, these are opaque
structures.
- Data Type: avl_comparison_func
-
Every AVL tree must have an ordering defined by a function of this
type. It must have the following signature:
int compare (const void *a, const void *b, void *param)
The return value is expected to be like that returned by strcmp
in the standard C library: negative if a < b, zero if
a = b, positive if a > b. param is an
arbitrary value defined by the user when the AVL tree was created.
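A minimal sketch of a comparison function with this signature, assuming that each data item is a pointer to an int. The name compare_ints is illustrative only; param is accepted but unused.

```c
#include <assert.h>
#include <stddef.h>

/* Compare two ints stored behind void pointers.  PARAM is unused here. */
int compare_ints(const void *a, const void *b, void *param) {
  const int *x = a, *y = b;
  (void) param;
  assert(x != NULL && y != NULL);
  /* Avoid "*x - *y", which can overflow for values of large magnitude. */
  return (*x > *y) - (*x < *y);
}
```

The expression in the return statement yields -1, 0, or 1, which satisfies the strcmp-like contract without risking signed overflow.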
- Data Type: avl_node_func
-
This is a function called to perform an operation on an AVL node. It
must have the following signature:
void operate (void *data, void *param)
data is the node data and param is an arbitrary user-defined
value set when the AVL tree was created.
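As an illustration, a node function can use param to carry state between calls. The sketch below assumes that each data item points to an int and that param points to a long accumulator; the name sum_ints is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Node function: adds the int stored in DATA to the running total
   that PARAM points to. */
void sum_ints(void *data, void *param) {
  const int *value = data;
  long *total = param;
  assert(value != NULL && total != NULL);
  *total += *value;
}
```

Called once per node, a function like this leaves the sum of all stored values in the accumulator.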
- Data Type: avl_copy_func
-
This is a function called to make a new copy of a node's data. It must
have the following signature:
void *copy (void *data, void *param)
The function should return a new copy of data. param is an
arbitrary user-defined value set when the AVL tree was created.
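For example, if each node's data is a NUL-terminated string allocated with malloc, a copy function might look like the sketch below. The name copy_string is hypothetical, and the handling of allocation failure is deliberately simplistic.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy function: returns a freshly allocated duplicate of the string
   that DATA points to.  PARAM is unused. */
void *copy_string(void *data, void *param) {
  const char *src = data;
  (void) param;
  assert(src != NULL);
  size_t size = strlen(src) + 1;
  char *dst = malloc(size);
  assert(dst != NULL);   /* a real program would report the failure */
  memcpy(dst, src, size);
  return dst;
}
```

Because each copy owns its own storage, the original and the copied tree can then be destroyed independently.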
- Macro: AVL_MAX_HEIGHT
-
This macro defines the maximum height of an AVL tree that can be handled
by functions that maintain a stack of nodes descended. The default
value is 32, which is enough for any AVL tree with up to 5,704,880
nodes regardless of insertion order, and for as many as 4,294,967,295
(2^32 - 1) nodes if the tree is perfectly balanced. If the value is
increased, then some functions that keep a bitmap of descended nodes in
a single int will have to be rewritten. On the other hand, the default
value can be reduced without harm, although there is little reason to do
this.
These functions deal with creation and destruction of AVL trees.
- Function: avl_tree * avl_create (avl_comparison_func compare, void *param)
-
- Function: avlt_tree * avlt_create (avl_comparison_func compare, void *param)
-
- Function: avltr_tree * avltr_create (avl_comparison_func compare, void *param)
-
Create a new, empty AVL tree with comparison function compare.
Arbitrary user data param is saved so that it can be passed to all
user callback functions.
- Function: void avl_destroy (avl_tree *tree, avl_node_func free)
-
- Function: void avlt_destroy (avlt_tree *tree, avl_node_func free)
-
- Function: void avltr_destroy (avltr_tree *tree, avl_node_func free)
-
Destroys AVL tree tree, releasing all of its storage. If
free is non-null, then it is called for every node in postorder
before that node is freed.
- Function: void avl_free (avl_tree *tree)
-
- Function: void avlt_free (avlt_tree *tree)
-
- Function: void avltr_free (avltr_tree *tree)
-
Destroys AVL tree tree, releasing all of its storage. The data in
each node is freed with a call to the standard C library function
free.
- Function: avl_tree * avl_copy (const avl_tree *tree, avl_copy_func copy)
-
- Function: avlt_tree * avlt_copy (const avlt_tree *tree, avl_copy_func copy)
-
- Function: avltr_tree * avltr_copy (const avltr_tree *tree, avl_copy_func copy)
-
Copies the contents of AVL tree tree into a new AVL tree, and
returns the new tree. If copy is non-null, then it is called to
make a new copy of each node's data; otherwise, the node data is copied
verbatim into the new tree.
- Function: int avl_count (const avl_tree *tree)
-
- Function: int avlt_count (const avlt_tree *tree)
-
- Function: int avltr_count (const avltr_tree *tree)
-
Returns the number of nodes in AVL tree tree.
- Function: void * xmalloc (size_t size)
-
This is not a function defined by the AVL tree library. Instead, it is
a function that the user program can define. It must allocate
size bytes using malloc and return a pointer to the allocated block. It
can handle out-of-memory errors however it chooses, but it may not ever
return a null pointer.
If there is an xmalloc function defined for use by the AVL tree
library, the AVL tree source files (`avl.c', `avlt.c',
`avltr.c') must be compiled with HAVE_XMALLOC defined.
Otherwise, the AVL libraries will use their internal static
xmalloc, which handles out-of-memory errors by printing a message
`virtual memory exhausted' to stderr and terminating the program
with exit code EXIT_FAILURE.
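A user-supplied xmalloc could be as simple as the following sketch. The error message and the choice of abort are assumptions made for this example; any policy is acceptable as long as the function never returns a null pointer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* User-supplied allocator: returns SIZE bytes or does not return at all. */
void *xmalloc(size_t size) {
  /* malloc(0) is allowed to return NULL, so request at least one byte. */
  void *block = malloc(size ? size : 1);
  if (block == NULL) {
    fputs("out of memory\n", stderr);
    abort();
  }
  return block;
}
```

A definition like this belongs in the user program, with the libavl source files compiled with HAVE_XMALLOC defined as described above.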
These functions insert nodes, delete nodes, or search for nodes in AVL
trees.
- Function: void ** avl_probe (avl_tree *tree, void *data)
-
- Function: void ** avlt_probe (avlt_tree *tree, void *data)
-
- Function: void ** avltr_probe (avltr_tree *tree, void *data)
-
These are the workhorse functions for insertion into AVL trees. They
search AVL tree tree for a node with data matching data. If
found, a pointer to the matching data is returned. Otherwise, a new
node is created for data, and a pointer to that data is returned.
In either case, the data pointer stored at the returned location can be
changed by the user, but the key data used by the tree's comparison
function must not be changed(3).
It is usually easier to use one of the avl_insert or avl_replace
functions instead of avl_probe directly.
Please note: It's not a particularly good idea to insert a null
pointer as a data item into an AVL tree, because several of the AVL tree
functions return a null pointer to indicate failure. You can sometimes
avoid a problem by using functions that return a pointer to a pointer
instead of a plain pointer. Also be wary of this when you cast an
arithmetic type to a void pointer for insertion--on typical
architectures, 0's become null pointers when you do this.
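The following sketch contrasts the two ways of turning an int into a data item. Both helper names are hypothetical; the result of the integer-to-pointer cast is implementation-defined, which is why the text above says "on typical architectures".

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Risky: encode VALUE directly in the pointer.  On typical architectures
   a value of 0 yields a null pointer, which several libavl functions also
   return to indicate failure. */
void *item_by_cast(int value) {
  return (void *) (intptr_t) value;
}

/* Safer: give each item its own storage, so the pointer is never null
   regardless of the value stored. */
void *item_by_allocation(int value) {
  int *slot = malloc(sizeof *slot);
  assert(slot != NULL);
  *slot = value;
  return slot;
}
```

The second approach costs one allocation per item, and the comparison function must then dereference the pointers rather than compare them as integers.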
- Function: void * avl_insert (avl_tree *tree, void *data)
-
- Function: void * avlt_insert (avlt_tree *tree, void *data)
-
- Function: void * avltr_insert (avltr_tree *tree, void *data)
-
If a node with data matching data exists in AVL tree tree,
returns the matching data item. Otherwise, inserts data into
tree and returns a null pointer.
- Function: void avl_force_insert (avl_tree *tree, void *data)
-
- Function: void avlt_force_insert (avlt_tree *tree, void *data)
-
- Function: void avltr_force_insert (avltr_tree *tree, void *data)
-
Inserts data into tree. If a node with data matching
data exists in tree, aborts the program with an assertion
violation. This function is implemented as a macro; if it is used, the
standard C header assert.h must also be included. If macro NDEBUG
is defined when the libavl header is included, this function
is short-circuited to a direct call to avl_insert, without
checking the return value.
- Function: void * avl_replace (avl_tree *tree, void *data)
-
- Function: void * avlt_replace (avlt_tree *tree, void *data)
-
- Function: void * avltr_replace (avltr_tree *tree, void *data)
-
If a node with data matching data exists in AVL tree tree,
replaces the node's data with data and returns the node's former
contents. Otherwise, inserts data into tree and returns
a null pointer.
- Function: void * avl_delete (avl_tree *tree, const void *data)
-
- Function: void * avlt_delete (avlt_tree *tree, const void *data)
-
- Function: void * avltr_delete (avltr_tree *tree, const void *data)
-
Searches AVL tree tree for a node with data matching data.
If found, the node is deleted and its data is returned. Otherwise,
returns a null pointer.
- Function: void * avl_force_delete (avl_tree *tree, const void *data)
-
- Function: void * avlt_force_delete (avlt_tree *tree, const void *data)
-
- Function: void * avltr_force_delete (avltr_tree *tree, const void *data)
-
Deletes a node with data matching data from AVL tree tree.
If no matching node is found, aborts the program with an assertion
violation. If NDEBUG is defined when the libavl header is included,
this function is short-circuited to a direct call to avl_delete,
without checking the return value.
- Function: void * avl_find (avl_tree *tree, const void *data)
-
- Function: void ** avlt_find (avlt_tree *tree, const void *data)
-
- Function: void ** avltr_find (avltr_tree *tree, const void *data)
-
Searches AVL tree tree for a node with data matching data.
If found, returns the node's data (for threaded and right-threaded
trees, a pointer to the node's data). Otherwise, returns a null
pointer.
These functions allow the caller to iterate across the items in an AVL
tree.
- Function: void avl_walk (const avl_tree *tree, avl_node_func operate, void *param)
-
- Function: void avlt_walk (const avlt_tree *tree, avl_node_func operate, void *param)
-
- Function: void avltr_walk (const avltr_tree *tree, avl_node_func operate, void *param)
-
Walks through all the nodes in AVL tree tree in inorder, and calls
function operate for each node. param overrides the value
passed to avl_create for this operation only. operate must
not change the key data in the nodes in a way that would reorder the
data values or cause two values to be equal.
- Function: void * avl_traverse (const avl_tree *tree, avl_traverser *trav)
-
- Function: void * avlt_traverse (const avlt_tree *tree, avlt_traverser *trav)
-
- Function: void * avltr_traverse (const avltr_tree *tree, avltr_traverser *trav)
-
Returns each of AVL tree tree's nodes' data values in sequence, one per
call, then a null pointer to indicate that the last item has been
passed. trav must be initialized to 0 before the first call, in a
declaration like this:
avl_traverser trav = {0};
Each avl_traverser is a separate, independent iterator.
For threaded and right-threaded trees, avlt_next or avltr_next,
respectively, are faster and more memory-efficient than avlt_traverse
or avltr_traverse.
- Function: void ** avlt_next (const avlt_tree *tree, void **data)
-
- Function: void ** avltr_next (const avltr_tree *tree, void **data)
-
data must be a null pointer or a pointer to a data item in AVL
tree tree. Returns a pointer to the next data item after
data in tree in inorder (this is the first item if
data is a null pointer), or a null pointer if data was the
last item in tree.
- Function: void ** avlt_prev (const avlt_tree *tree, void **data)
-
data must be a null pointer or a pointer to a data item in AVL
tree tree. Returns a pointer to the previous data item before
data in tree in inorder (this is the last, or greatest
valued, item if data is a null pointer), or a null pointer if
data was the first item in tree.
- Function: avlt_tree * avlt_thread (avl_tree *tree)
-
- Function: avltr_tree * avltr_thread (avl_tree *tree)
-
Adds symmetric threads or just right threads to unthreaded AVL tree
tree and returns a pointer to tree cast to the appropriate
type. After one of these functions is called, threaded or
right-threaded functions, as appropriate, must be used with tree;
unthreaded functions may not be used.
- Function: avl_tree * avlt_unthread (avlt_tree *tree)
-
- Function: avl_tree * avltr_unthread (avltr_tree *tree)
-
Removes all threads from threaded or right-threaded AVL tree tree
and returns a pointer to tree cast to avl_tree *. After
one of these functions is called, unthreaded functions must be used with
tree; threaded or right-threaded functions may not be used.
libavl was written by Ben Pfaff, based on algorithms from Donald Knuth's
venerable Art of Computer Programming series from Addison-Wesley,
primarily Volumes 1 and 3.
Ben can be contacted at blp@gnu.org.
Footnotes
(1) In tree traversal, inorder refers to visiting the nodes in their
sorted order from smallest to largest.
(2) In general, you should build the sort of tree that you need to use,
but occasionally it is useful to convert between tree types.
(3) It can be changed if this would not change the ordering of the nodes
in the tree; i.e., if this would not cause the data in the node to be
less than or equal to the previous node's data or greater than or equal
to the next node's data.
This document was generated on 11 May 1999 using
texi2html 1.56k.