Title: Binary Search Trees
1Binary Search Trees
Keith Schwarz
Julie Zelenski
Eric Roberts CS 106B November 6, 2009
2Outline
3Limitations of Hash Tables
- The hash table strategy we introduced a week ago
is by far the most common implementation of maps
in practice.
- Despite its many advantages, most notably its
extraordinary efficiency, the hash table
implementation is not necessarily the best
strategy for all applications.
- The hash table strategy has the following
limitations:
  - Hash tables depend on being able to compute a
    hash function on some key. Expanding the
    hash-function idea so that it applies to types
    other than strings can be subtle.
  - The iterator for a map based on hash tables does
    not generate its values in any kind of order.
    Even when the keys have a natural order (such as
    the lexicographic order used with strings), a
    hash table iterator cannot take advantage of that
    fact.
4Binary Search in a Sorted Array
- Instead of using a map, suppose that you entered
all the state abbreviations (along with their
names) in an array so that the abbreviations were
arranged in alphabetical order.
- How many operations (as a function of the size N)
are needed to find a state abbreviation, such as
"GA"?
- The lookup operation in a sorted array runs in
O(log N) time.
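The O(log N) lookup described above can be sketched as a binary search over a sorted vector of abbreviations. The function name `FindAbbreviation` and the use of `std::vector` are assumptions made for illustration; the lecture does not show this code.

```cpp
#include <string>
#include <vector>

// Returns the index of key in the sorted vector, or -1 if it is absent.
// Each iteration halves the remaining range, so the loop runs
// O(log N) times for a vector of N elements.
int FindAbbreviation(const std::vector<std::string> &sorted,
                     const std::string &key) {
    int low = 0;
    int high = static_cast<int>(sorted.size()) - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;   // middle of the current range
        if (key == sorted[mid]) return mid;
        if (key < sorted[mid]) {
            high = mid - 1;                 // continue in the left half
        } else {
            low = mid + 1;                  // continue in the right half
        }
    }
    return -1;
}
```

The search depends on being able to jump directly to the middle element, which is exactly what an array provides and a linked list does not.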
5Insertion in a Sorted Array
- But what about adding a new key? Suppose, for
example, that Puerto Rico (PR) becomes a state.
- All the cells after the insertion point need to
shift over by one to make room.
- The add operation in a sorted array runs in O(N)
time.
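To make the O(N) cost concrete, here is a minimal sketch of insertion into a sorted vector. The name `InsertSorted` is hypothetical, and the explicit shifting loop is written out only to show where the linear work comes from.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Inserts key into a sorted vector while keeping it sorted.
// Every element larger than key shifts one cell to the right,
// so the worst case (inserting at the front) moves all N elements.
void InsertSorted(std::vector<std::string> &sorted, const std::string &key) {
    sorted.push_back(key);                  // grow the array by one cell
    std::size_t i = sorted.size() - 1;
    while (i > 0 && key < sorted[i - 1]) {
        sorted[i] = sorted[i - 1];          // shift a larger element right
        i--;
    }
    sorted[i] = key;                        // drop the key into the gap
}
```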
6The Problem of Finding the Middle
- When we worked with the editor buffer two weeks
ago, we solved the insertion problem by using a
linked list instead of an array.
- Unfortunately, turning a sorted array into a
linked list makes it impossible to apply binary
search because there is no way to find the middle
element.
- But what if you could point to the middle element
in a linked list? That idea seems unlikely, but
it is the key to finding a data structure that
offers O(log N) performance for both the lookup
and add operations.
7Binary Search Trees
- The structure that ends up solving this problem
is called a binary search tree (or BST for
short). Each node in a BST has exactly two
subtrees: a left subtree that contains all the
nodes that come before the current node and a
right subtree that contains all the nodes that
come after it. Either or both of these subtrees
may be NULL.
- The classic example of a binary search tree uses
the names from Walt Disney's Snow White and the
Seven Dwarfs.
8A Simple BST Implementation
- To get a sense of how binary search trees work,
it is useful to start with a simple design in
which keys are always strings.
- Each node in the tree is then a structure
containing a key and two subtrees, each of which
is either NULL or a pointer to some other node.
This design suggests the following type
definition:

    struct nodeT {
        string key;
        nodeT *left, *right;
    };

- The code for finding a node in a tree begins by
comparing the desired key with the key in the
root node. If the strings match, you've found the
correct node; if not, you simply call yourself
recursively on the left or right subtree,
depending on whether the key you want comes
before or after the current one.
9A Simple BST Implementation
/*
 * Function: FindNode
 * Usage: nodeT *node = FindNode(t, key);
 * --------------------------------------
 * Finds a node with the specified key in the binary search
 * tree rooted at t. If a node matching key appears in the
 * tree, FindNode returns a pointer to that node. If the key
 * does not appear, FindNode returns NULL.
 */

nodeT *FindNode(nodeT *t, string key) {
    if (t == NULL) return NULL;
    if (key == t->key) return t;
    if (key < t->key) {
        return FindNode(t->left, key);
    } else {
        return FindNode(t->right, key);
    }
}
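Because each recursive call in FindNode is the last thing the function does, the same search can also be written as a loop. The sketch below is an alternative formulation, not the version from the lecture; the name `FindNodeIterative` is hypothetical, and `nodeT` is repeated so the example stands alone.

```cpp
#include <cstddef>
#include <string>

struct nodeT {
    std::string key;
    nodeT *left, *right;
};

// Iterative variant of the search: walks down from the root, moving
// left or right at each node, until the key is found or the search
// runs off the bottom of the tree (t becomes NULL).
nodeT *FindNodeIterative(nodeT *t, const std::string &key) {
    while (t != NULL && key != t->key) {
        t = (key < t->key) ? t->left : t->right;
    }
    return t;
}
```

Either form visits at most one node per level, so the cost of a lookup is proportional to the height of the tree.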
11Exercise Building a Binary Search Tree
Diagram the BST that results from executing the
following code:

    nodeT *colors = NULL;
    InsertNode(colors, "red");
    InsertNode(colors, "orange");
    InsertNode(colors, "yellow");
    InsertNode(colors, "green");
    InsertNode(colors, "blue");
    InsertNode(colors, "indigo");
    InsertNode(colors, "violet");
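The exercise calls InsertNode without showing its definition. The following is a minimal sketch of how such a function might look, assuming the tree pointer is passed by reference (which the call `InsertNode(colors, "red")` on an initially NULL pointer requires) and assuming duplicate keys are simply ignored; neither detail is confirmed by the slides.

```cpp
#include <cstddef>
#include <string>

struct nodeT {
    std::string key;
    nodeT *left, *right;
};

// Inserts key into the tree rooted at t. The pointer is passed by
// reference so that an empty subtree (NULL) can be replaced in place
// by the newly allocated node. A key already in the tree is ignored
// (an assumption; the lecture does not say how duplicates are handled).
void InsertNode(nodeT *&t, const std::string &key) {
    if (t == NULL) {
        t = new nodeT;
        t->key = key;
        t->left = t->right = NULL;
    } else if (key < t->key) {
        InsertNode(t->left, key);
    } else if (key > t->key) {
        InsertNode(t->right, key);
    }
}
```

The shape of the resulting tree depends on the order of insertion: the first key becomes the root, and each later key settles at the first empty position found by the usual left-or-right search.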
12Traversal Strategies
- It is easy to write a function that performs some
operation for every key in a binary search tree,
because recursion makes it simple to apply that
operation to each of the subtrees.
- The order in which keys are processed depends on
when you process the current node with respect to
the recursive calls:
  - If you process the current node before either
    recursive call, the result is a preorder
    traversal.
  - If you process the current node after the
    recursive call on the left subtree but before the
    recursive call on the right subtree, the result
    is an inorder traversal. In the case of the
    simple BST implementation that uses strings as
    keys, the keys will appear in lexicographic
    order.
  - If you process the current node after completing
    both recursive calls, the result is a postorder
    traversal. Postorder traversals are particularly
    useful if you are trying to free all the nodes in
    a tree.
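The remark about freeing nodes can be illustrated with a short sketch. The name `FreeTree` and the returned node count are assumptions added for illustration; the point is only that the node is deleted after both of its subtrees.

```cpp
#include <cstddef>
#include <string>

struct nodeT {
    std::string key;
    nodeT *left, *right;
};

// Frees every node in the tree rooted at t and returns how many nodes
// were freed. The current node is deleted only after both recursive
// calls (postorder), so its child pointers are never read after the
// node's memory has been released.
int FreeTree(nodeT *t) {
    if (t == NULL) return 0;
    int leftCount = FreeTree(t->left);
    int rightCount = FreeTree(t->right);
    delete t;
    return leftCount + rightCount + 1;
}
```

Deleting the node first, as a preorder traversal would, leaves no safe way to reach `t->left` and `t->right` afterward.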
13Preorder Traversal
void PreorderTraversal(nodeT *t) {
    if (t != NULL) {
        cout << t->key << endl;
        PreorderTraversal(t->left);
        PreorderTraversal(t->right);
    }
}

Output: Grumpy Doc Bashful Dopey Sleepy Happy Sneezy
14Inorder Traversal
void InorderTraversal(nodeT *t) {
    if (t != NULL) {
        InorderTraversal(t->left);
        cout << t->key << endl;
        InorderTraversal(t->right);
    }
}

Output: Bashful Doc Dopey Grumpy Happy Sleepy Sneezy
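An inorder traversal is also what lets a tree-based map iterate in key order, which a hash table cannot do. As a small sketch of that idea, the hypothetical helper `CollectInOrder` below gathers the keys into a vector instead of printing them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct nodeT {
    std::string key;
    nodeT *left, *right;
};

// Appends the keys of the tree rooted at t to out. Because the left
// subtree is visited first, then the node, then the right subtree,
// the keys of a valid BST are appended in sorted order.
void CollectInOrder(nodeT *t, std::vector<std::string> &out) {
    if (t != NULL) {
        CollectInOrder(t->left, out);
        out.push_back(t->key);
        CollectInOrder(t->right, out);
    }
}
```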
15Postorder Traversal
void PostorderTraversal(nodeT *t) {
    if (t != NULL) {
        PostorderTraversal(t->left);
        PostorderTraversal(t->right);
        cout << t->key << endl;
    }
}

Output: Bashful Dopey Doc Happy Sneezy Sleepy Grumpy
16The bst.h Interface
/*
 * File: bst.h
 * -----------
 * This file provides an interface for a general binary
 * search tree class template.
 */

#ifndef _bst_h
#define _bst_h

#include "cmpfn.h"

/*
 * Class: BST
 * ----------
 * This interface defines a class template for a binary search
 * tree. For maximum generality, the BST is supplied as a class
 * template. The data type is set by the client. The client
 * specializes the tree to hold a specific type, e.g. BST<int>
 * or BST<studentT>. The one requirement on the type is that the
 * client must supply a comparison function that compares two
 * elements (or be willing to use the default comparison
 * function that relies on < and ==).
 */

template <typename ElemType>
class BST {
public:
18The bst.h Interface
/*
 * Constructor: BST
 * Usage: BST<int> bst;
 *        BST<song> songs(CompareSong);
 * ------------------------------------
 * The constructor initializes a new empty binary search
 * tree. The one argument is a comparison function, which is
 * called to compare data values. This argument is optional;
 * if not given, OperatorCmp from cmpfn.h is used, which
 * applies the built-in operator < to its operands. If the
 * behavior of < on your type is defined and sufficient, you
 * do not need to supply your own comparison function.
 */

    BST(int (*cmpFn)(ElemType v1, ElemType v2) = OperatorCmp);

/*
 * Destructor: ~BST
 * Usage: (usually implicit)
 * -------------------------
 * This function deallocates the storage for a tree.
 */

    ~BST();
19The bst.h Interface
/*
 * Method: find
 * Usage: if (bst.find(key) != NULL) ...
 * -------------------------------------
 * This method applies the binary search algorithm to find a
 * particular key in this tree. The argument is the key to use
 * for comparison. If a node matching key appears in the tree,
 * find returns a pointer to the data in that node; otherwise,
 * find returns NULL.
 */

    ElemType *find(ElemType key);

/*
 * Method: add
 * Usage: bst.add(elem);
 * ---------------------
 * This method adds a new node to this tree. The elem argument
 * is compared with the data in existing nodes to find the
 * proper position. If a node with the same value already
 * exists, the contents are overwritten with the new copy and
 * false is returned. If no matching node is found, a new node
 * is allocated and added to the tree, and true is returned.
 */

    bool add(ElemType elem);
20The bst.h Interface
/*
 * Method: remove
 * Usage: bst.remove(key);
 * -----------------------
 * This method removes a node in this tree that matches the
 * specified key. If a node matching key is found, the node is
 * removed from the tree and true is returned. If no match is
 * found, no changes are made and false is returned.
 */

    bool remove(ElemType key);

/*
 * Method: clear
 * Usage: bst.clear();
 * -------------------
 * This method removes all elements from this tree. The tree
 * is made empty and will have no nodes after being cleared.
 */

    void clear();
21The bstpriv.h Data Structure
/*
 * File: bstpriv.h
 * ---------------
 * This file contains the private section of the bst.h
 * interface.
 */

/* Type definition for a node */

    struct nodeT {
        ElemType data;
        nodeT *left, *right;
    };

/* Instance variables */

    nodeT *root;                        /* Root of the tree */
    int (*cmpFn)(ElemType, ElemType);   /* Comparison function */
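To show how the root pointer and the stored comparison function might work together, here is a stripped-down sketch of find and add. The class name `MiniBST` and the helpers `addNode` and `freeTree` are hypothetical, remove and clear are omitted, and copying is not handled; this is not the library's actual implementation. OperatorCmp is repeated so the example stands alone.

```cpp
#include <cstddef>

// Default comparison, written to match the OperatorCmp template in cmpfn.h.
template <typename Type>
int OperatorCmp(Type v1, Type v2) {
    if (v1 == v2) return 0;
    if (v1 < v2) return -1;
    return 1;
}

// MiniBST is a hypothetical stand-in for the BST class template.
template <typename ElemType>
class MiniBST {
public:
    MiniBST(int (*cmp)(ElemType, ElemType) = OperatorCmp<ElemType>)
        : root(NULL), cmpFn(cmp) {}
    ~MiniBST() { freeTree(root); }

    // Returns a pointer to the matching data, or NULL if none exists.
    ElemType *find(ElemType key) {
        nodeT *t = root;
        while (t != NULL) {
            int sign = cmpFn(key, t->data);
            if (sign == 0) return &t->data;
            t = (sign < 0) ? t->left : t->right;
        }
        return NULL;
    }

    // Returns true if a new node was added, false if an existing
    // node with an equal value was overwritten.
    bool add(ElemType elem) { return addNode(root, elem); }

private:
    struct nodeT {
        ElemType data;
        nodeT *left, *right;
    };

    nodeT *root;                        // Root of the tree
    int (*cmpFn)(ElemType, ElemType);   // Comparison function

    bool addNode(nodeT *&t, ElemType elem) {
        if (t == NULL) {
            t = new nodeT;
            t->data = elem;
            t->left = t->right = NULL;
            return true;
        }
        int sign = cmpFn(elem, t->data);
        if (sign == 0) {
            t->data = elem;             // overwrite the existing copy
            return false;
        }
        return addNode(sign < 0 ? t->left : t->right, elem);
    }

    void freeTree(nodeT *t) {           // postorder deallocation
        if (t != NULL) {
            freeTree(t->left);
            freeTree(t->right);
            delete t;
        }
    }
};
```

Every ordering decision goes through `cmpFn`, so the tree never applies `<` or `==` to the elements directly; that is what allows a client to store types with no built-in ordering.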
22The cmpfn.h Interface
/*
 * File: cmpfn.h
 * -------------
 * This interface exports a comparison function template.
 */

#ifndef _cmpfn_h
#define _cmpfn_h

/*
 * Function template: OperatorCmp
 * Usage: int sign = OperatorCmp(v1, v2);
 * --------------------------------------
 * This function template is a generic function to compare two
 * values using the built-in == and < operators. It is supplied
 * as a convenience for those situations where a comparison
 * function is required, and the type has a built-in ordering
 * that you would like to use.
 */

template <typename Type>
int OperatorCmp(Type v1, Type v2) {
    if (v1 == v2) return 0;
    if (v1 < v2) return -1;
    return 1;
}

#endif
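For a type with no built-in ordering, the client supplies a function with the same shape as OperatorCmp. The constructor comment mentions `CompareSong` without defining it, so the version below is only a guess at what such a function could look like; the `songT` structure and its fields are hypothetical.

```cpp
#include <string>

// Hypothetical record type; the lecture does not define its fields.
struct songT {
    std::string title;
    std::string artist;
};

// Client-supplied comparison function following the OperatorCmp
// convention: returns 0 if the songs are equal, -1 if s1 comes
// first, and 1 if s2 comes first. Songs are ordered by title, with
// the artist used to break ties.
int CompareSong(songT s1, songT s2) {
    if (s1.title == s2.title) {
        if (s1.artist == s2.artist) return 0;
        return (s1.artist < s2.artist) ? -1 : 1;
    }
    return (s1.title < s2.title) ? -1 : 1;
}
```

A function like this would be passed to the constructor, as in the `BST<song> songs(CompareSong)` usage line from the interface comment.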
23The End