Title: 2005MEE Software Engineering
12005MEE Software Engineering
2Topics
- Trees and graphs
- Binary Trees
- general structure
- implementation requirements
- recursion
- common operations
- traversal methods
- AVL Trees
- description and rationale
- algorithms for implementation
- Complexity considerations
3Trees and Graphs
- Trees and graphs are an efficient way of storing large amounts of data
- especially useful when relationships between data are important
- can represent equations, etc.
- A graph is a set of elements (nodes) connected by lines (branches)
- branches may be uni- or bi-directional
- inward branches are indegree branches
- outward branches are outdegree branches
4Trees and Graphs
[Figure: example graph with nodes 9, 23, 45, 67, -3 and 0]
5Trees
- Trees are a subset of graphs
- each node has at most ONE indegree branch
- root node has ZERO indegree branches
- can have multiple outdegree branches
- all branches are uni-directional
- Other tree terms
- leaf node: node which has no outdegree branches
- internal node: neither the root nor a leaf
- parent: has at least one successor node (child)
- child: has a predecessor node (its parent)
- ancestor: any node on the path from this node to the root
- descendant: any node which has this node as an ancestor
- node level: distance from the root node
- height: largest node level in the tree + 1
6Tree Example
[Figure: example tree with the root node and the leaf nodes labelled; node values 0, 56, 17, 4, -3 and 25]
7Trees
- A tree can be considered as a collection of sub-trees
- each node is itself the root of a valid tree
- trees therefore are recursive structures
- recursive algorithms are therefore useful when dealing with trees
- Recursive definition of a tree
- A tree is either empty, OR
- has a value, and zero or more subtrees, which are also trees
8Binary Trees
- A tree with at most two child nodes per parent
- typically labelled as left and right
- Possible to calculate
- maximum nodes for given height
- min and max height for a given number of nodes
- Binary tree properties
- balance: difference in height of left and right subtrees
- complete: how full the tree is
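The quantities above can be sketched in C. This is only an illustration: the names `BTNode`, `bt_height`, `bt_balance`, `bt_max_nodes` and `bt_min_height` are hypothetical, and height is taken as "largest node level + 1", so an empty tree has height 0 and a single node has height 1.

```c
#include <stddef.h>

/* Hypothetical node type for illustration; not taken from the course code. */
typedef struct BTNode {
    int data;
    struct BTNode *left;
    struct BTNode *right;
} BTNode;

/* Height = largest node level + 1, so an empty tree has height 0. */
int bt_height(const BTNode *node)
{
    if (node == NULL)
        return 0;
    int lh = bt_height(node->left);
    int rh = bt_height(node->right);
    return 1 + (lh > rh ? lh : rh);
}

/* Balance = height of left subtree minus height of right subtree. */
int bt_balance(const BTNode *node)
{
    if (node == NULL)
        return 0;
    return bt_height(node->left) - bt_height(node->right);
}

/* Maximum number of nodes in a binary tree of height h: 2^h - 1.
   Only valid for heights below the bit width of unsigned long. */
unsigned long bt_max_nodes(unsigned height)
{
    return (1UL << height) - 1UL;
}

/* Minimum height needed to hold n nodes: smallest h with 2^h - 1 >= n.
   (The maximum height for n nodes is simply n: a degenerate chain.) */
unsigned bt_min_height(unsigned long n)
{
    unsigned h = 0;
    while (bt_max_nodes(h) < n)
        h++;
    return h;
}
```

For example, 7 nodes fit in a tree of height 3, but an eighth node forces a height of at least 4.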
9Binary Search Trees (BST)
- Binary trees are not particularly useful as is
- however, by enforcing some simple constraints, they become very useful
- Typical constraint
- all values in left subtree < node value
- all values in right subtree > node value
- This allows for very fast (O(log N)) searching of the tree
- assuming tree is close to balanced
- highly unbalanced trees become linked lists!
11Binary Search Trees
- Operations are similar to set operations
- create
- add
- remove
- search
- empty
- destroy
- In general, all BST operations are recursive
- Complexities are almost all O(log N), provided the tree is close to balanced
12BST Operations
- Adding a node
- recursive operation
- if value == root value, do nothing
- if value < root value
- if lchild is NULL, create new node for lchild
- else add to left subtree
- if value > root value
- if rchild is NULL, create new node for rchild
- else add to right subtree
- Adding to left and right trees is done using same function (recursively)
13Adding to BST
[Figure: adding element 13 to a BST]
14Adding to BST
- Special cases
- tree is empty, simply add value as root node
- Typically, a private function is used to perform recursive insertion
- public function simply calls private function on root node
- private functions are characterised by static keyword
- can only be called from within that file
- See example code
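A minimal sketch of the insertion algorithm with a private (static) recursive helper and a public wrapper. The type `BST` and the names `insert_node` and `bst_add` are assumptions for illustration and may differ from the course's example code. The sketch also folds the "child is NULL" check into the recursive call, which covers the empty-tree special case at the same time.

```c
#include <stdlib.h>

/* Hypothetical node layout; BST is a pointer to a node. */
typedef struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
} *BST;

/* Private helper: 'static' limits its visibility to this file.
   Returns the (possibly new) root of the subtree. */
static BST insert_node(BST node, int value, int *added)
{
    if (node == NULL) {                /* empty (sub)tree: create node here */
        BST n = malloc(sizeof *n);
        if (n != NULL) {
            n->data = value;
            n->lchild = NULL;
            n->rchild = NULL;
            *added = 1;
        }
        return n;
    }
    if (value < node->data)
        node->lchild = insert_node(node->lchild, value, added);
    else if (value > node->data)
        node->rchild = insert_node(node->rchild, value, added);
    /* value == node->data: already present, do nothing */
    return node;
}

/* Public function: simply calls the private helper on the root node.
   Returns 1 if the value was added, 0 if it was already present
   or memory could not be allocated. */
int bst_add(BST *tree, int value)
{
    int added = 0;
    *tree = insert_node(*tree, value, &added);
    return added;
}
```

Starting from `BST t = NULL;`, calling `bst_add(&t, 10)`, `bst_add(&t, 5)` and `bst_add(&t, 13)` gives a root of 10 with 5 on the left and 13 on the right; adding 13 a second time leaves the tree unchanged.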
15Searching a BST
- Basic algorithm
- if root == NULL, return not found
- if value == root value, return found
- if value < root value,
- search left subtree
- if value > root value,
- search right subtree
- Recursive algorithm
- Very similar to adding a new value
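The search algorithm maps almost line for line onto a recursive C function. A sketch, assuming a hypothetical node layout with `data`, `lchild` and `rchild` fields:

```c
#include <stddef.h>

/* Hypothetical node layout; BST is a pointer to a node. */
typedef struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
} *BST;

/* Returns 1 if value is in the tree, 0 otherwise. */
int bst_search(BST tree, int value)
{
    if (tree == NULL)
        return 0;                              /* not found */
    if (value == tree->data)
        return 1;                              /* found */
    if (value < tree->data)
        return bst_search(tree->lchild, value);
    return bst_search(tree->rchild, value);
}
```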
16Deleting from a BST
- Most complex operation on BSTs
- involves significant tree manipulation in some cases
- Basic algorithm
- search for node (using search algorithm)
- if found, delete node
- Deleting a node
- if node is a leaf, simply remove and update parent
- if node has one child, replace node with child
- if node has two children, replace node's value with either
- largest value in left subtree
- smallest value in right subtree
17BST Deletion Example
18BST Deletion
- Also a recursive function
- calls itself to delete item from subtree during search process
- also called recursively to remove rightmost node in left tree once value has been copied
- Code is quite complex, large capacity for errors
- for exam, remember algorithm, NOT code!
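To make the algorithm concrete, here is one possible sketch of recursive deletion. It uses the "largest value in left subtree" option for the two-child case. The names `bst_new_node` and `bst_remove` and the node layout are assumptions, not the course's code.

```c
#include <stdlib.h>

/* Hypothetical node layout; BST is a pointer to a node. */
typedef struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
} *BST;

/* Helper for building trees by hand; returns NULL if allocation fails. */
BST bst_new_node(int value)
{
    BST n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = value;
        n->lchild = NULL;
        n->rchild = NULL;
    }
    return n;
}

/* Removes value (if present) and returns the new root of the subtree. */
BST bst_remove(BST tree, int value)
{
    if (tree == NULL)
        return NULL;                                   /* not found */
    if (value < tree->data) {
        tree->lchild = bst_remove(tree->lchild, value);
    } else if (value > tree->data) {
        tree->rchild = bst_remove(tree->rchild, value);
    } else if (tree->lchild != NULL && tree->rchild != NULL) {
        /* two children: copy the largest value in the left subtree
           (its rightmost node), then remove that node recursively */
        BST largest = tree->lchild;
        while (largest->rchild != NULL)
            largest = largest->rchild;
        tree->data = largest->data;
        tree->lchild = bst_remove(tree->lchild, largest->data);
    } else {
        /* leaf or one child: replace node with its child (may be NULL) */
        BST child = (tree->lchild != NULL) ? tree->lchild : tree->rchild;
        free(tree);
        return child;
    }
    return tree;
}
```

Because each call returns the new root of its subtree, the parent's pointer is updated automatically in all three cases.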
19Binary Tree Traversal
- Traversal: method of accessing every element in a tree
- can be done in many orders
- operation can be anything
- adding to another list, displaying, saving to file, etc.
- Common traversal orders
- pre-order
- in-order
- post-order
20Pre-order Traversal
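- Node is visited first, then left subtree, then right subtree
- will output items in an order that will allow perfect reconstruction of tree
Pre-order traversal (example tree: root 7 with children 1 and 19; 19 has children 10 and 25; 10 has children 8 and 13):
7, [1 subtree], [19 subtree]
7, 1, 19, [10 subtree], [25 subtree]
7, 1, 19, 10, 8, 13, 25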
21In-order Traversal
- Left subtree is traversed first, then this node, then right subtree
- will output items in ascending order (for a BST)
In-order traversal:
[1 subtree], 7, [19 subtree]
1, 7, [10 subtree], 19, [25 subtree]
1, 7, 8, 10, 13, 19, 25
22Post-order Traversal
- Left subtree is traversed first, then right subtree, then node
- useful for certain types of trees (reverse Polish notation)
Post-order traversal:
[1 subtree], [19 subtree], 7
1, [10 subtree], [25 subtree], 19, 7
1, 8, 13, 10, 25, 19, 7
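The three depth-first orders differ only in where the node itself is visited relative to the two recursive calls. The sketch below records the visited values in an array so the orders can be compared directly; the function names and node layout are assumptions for illustration.

```c
#include <stddef.h>

/* Hypothetical node layout, following the lchild/rchild/data naming. */
struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
};

/* Each function appends the visited values to out[] starting at index n
   and returns the new count. The caller must make out[] large enough. */

size_t pre_order(const struct bst_node *t, int *out, size_t n)
{
    if (t == NULL)
        return n;
    out[n++] = t->data;                      /* node first */
    n = pre_order(t->lchild, out, n);
    return pre_order(t->rchild, out, n);
}

size_t in_order(const struct bst_node *t, int *out, size_t n)
{
    if (t == NULL)
        return n;
    n = in_order(t->lchild, out, n);
    out[n++] = t->data;                      /* node in the middle */
    return in_order(t->rchild, out, n);
}

size_t post_order(const struct bst_node *t, int *out, size_t n)
{
    if (t == NULL)
        return n;
    n = post_order(t->lchild, out, n);
    n = post_order(t->rchild, out, n);
    out[n++] = t->data;                      /* node last */
    return n;
}
```

Applied to a tree with root 7, children 1 and 19, where 19 has children 10 and 25 and 10 has children 8 and 13, these produce 7, 1, 19, 10, 8, 13, 25 (pre-order), 1, 7, 8, 10, 13, 19, 25 (in-order) and 1, 8, 13, 10, 25, 19, 7 (post-order).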
23Breadth-order Traversal
- Tree is traversed by level
- root, then all level 1 nodes, all level 2 nodes, etc.
- NOT recursive: implemented with a queue
Breadth-order traversal: 7, 1, 19, 10, 25, 8, 13
24Breadth-order Traversal
- The tree is processed as follows
- add the root node to the queue
- while the queue is not empty
- remove next tree from queue
- visit the root of the tree
- add tree's children (if any) to rear of queue
- This will provide the desired output
- see example in class
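The queue-based procedure above might look like this in C. To keep the sketch short, the queue is a fixed-size array; `QUEUE_MAX`, `breadth_order` and the node layout are assumptions, and a real implementation would more likely use a growable queue.

```c
#include <stddef.h>

#define QUEUE_MAX 64   /* assumed upper bound on tree size, for this sketch only */

/* Hypothetical node layout, following the lchild/rchild/data naming. */
struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
};

/* Writes the values level by level into out[] and returns how many were
   written. Each node enters the queue exactly once, so a plain array with
   head and tail indices is enough. Nodes that would not fit in the queue
   are skipped. */
size_t breadth_order(const struct bst_node *root, int *out)
{
    const struct bst_node *queue[QUEUE_MAX];
    size_t head = 0, tail = 0, n = 0;

    if (root != NULL)
        queue[tail++] = root;                 /* add the root to the queue */

    while (head < tail) {                     /* while the queue is not empty */
        const struct bst_node *t = queue[head++];   /* remove next tree */
        out[n++] = t->data;                          /* visit its root */
        if (t->lchild != NULL && tail < QUEUE_MAX)
            queue[tail++] = t->lchild;               /* children to the rear */
        if (t->rchild != NULL && tail < QUEUE_MAX)
            queue[tail++] = t->rchild;
    }
    return n;
}
```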
25Uses of Traversal
- Traversal can be used to perform any function on all the nodes
- in general, a function pointer is provided as an argument to the traversal function
- Example, for tree holding int values
    int traverse ( BST tree, int order, int (*travfunc)(int) );
- the last argument (travfunc) is the traversal function pointer
26Traversal Example
    #include <stdio.h>

    int traverse ( BST tree, int order, int (*travfunc)(int) )
    {
        if ( tree != NULL )
        {
            /* in-order traversal only for example */
            traverse ( tree->lchild, order, travfunc );
            travfunc ( tree->data );
            traverse ( tree->rchild, order, travfunc );
        }
        return 0;
    }

    int printint ( int num )
    {
        return printf ( "%i\n", num );
    }

    int main ( void )
    {
        BST tree = bst_create();
        traverse ( tree, IN_ORDER, printint );
        return 0;
    }
27Tree Traversal
- A generic binary tree will contain void pointers
- traversal function should take void pointers as argument, and cast to correct type
- in general, contents (at least the key value) should NOT be modified as this may invalidate tree structure
- strictly increasing functions of the key are OK, since they preserve the ordering
- Some uses of traversal
- copying the tree
- displaying the tree
- saving the tree to file (pre-order)
- putting tree into other data structure (list, etc)
28AVL Trees
- Binary trees can become very unbalanced
- poor order of insertion
- data isn't very random
- bad luck
- Unbalanced trees are inefficient (slow)
- To increase efficiency, a method of balancing trees is required
- AVL trees (Adelson-Velskii and Landis) do this
- An AVL tree is a binary tree in which, for ALL subtrees of the tree
- |Height(left) - Height(right)| <= 1
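The AVL condition can be checked directly by computing subtree heights recursively. The sketch below visits every node once, so it is an O(n) check. The function name `avl_check` and the node layout are assumptions for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node layout, following the lchild/rchild/data naming. */
struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
};

/* Returns the height of the tree if EVERY subtree satisfies
   |Height(left) - Height(right)| <= 1, or -1 otherwise.
   An empty tree has height 0. */
int avl_check(const struct bst_node *t)
{
    if (t == NULL)
        return 0;
    int lh = avl_check(t->lchild);
    if (lh < 0)
        return -1;                 /* left subtree already violates the rule */
    int rh = avl_check(t->rchild);
    if (rh < 0)
        return -1;                 /* right subtree already violates the rule */
    if (abs(lh - rh) > 1)
        return -1;                 /* this node is out of balance */
    return 1 + (lh > rh ? lh : rh);
}
```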
29AVL Tree Functionality
- Each node has an extra field
- the balance field
- Three possible values
- left high (1), right high (-1), even (0)
- this value indicates which subtree of the node is higher
- This field is updated after every operation to reflect new status of tree
- clearly, it is modified by insertions and deletions
30AVL Tree Functionality
- Rebalancing is required when
- left subtree of a LH tree increases in height
- right subtree of a RH tree increases in height
- Within these categories, either the left or right branch of the subtree can grow
- fixing the tree is dependent upon which branch it is
- Unbalance can be solved by rotating the tree about the root (1 or more times)
31AVL Tree Examples
- Simplest cases
- left of left: adding to the left side of the left subtree of a node which is LH
[Figure: root 18 (LH) with children 12 (EV) and 20 (EV); 12 has children 8 (EV) and 14 (EV); new node 4 is to be added as the left child of 8]
32AVL Tree Examples
Height of the LEFT child has increased, was already LH, so unbalanced
New node was added to LEFT child of LEFT subtree
[Figure: root 18 (LH) with children 12 (LH) and 20 (EV); 12 has children 8 (LH) and 14 (EV); 8 has left child 4 (EV)]
33AVL Tree Examples
Solution: rotate the out-of-balance node to the RIGHT
- Left child (12) becomes the root
- Old root (18) becomes the RIGHT child
- Right child of left child (if any) becomes LEFT child of old root node
[Figure: the unbalanced tree from the previous slide (18, 12, 20, 8, 14, 4), annotated with the rotation]
34AVL Tree Examples
Root node is now balanced (EV)
[Figure: root 12 (EV) with children 8 (LH) and 18 (EV); 8 has left child 4 (EV); 18 has children 14 (EV) and 20 (EV)]
35AVL Tree Examples
- Exactly the same procedure if the right child of a RH node becomes higher
- rotate to the left
[Figure: root 14 (RH) with children 12 (EV) and 20 (RH); 20 has children 18 (EV) and 23 (LH); new node 22 (EV) is the left child of 23]
36AVL Tree Examples
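[Figure: after the left rotation: root 20 (EV) with children 14 (EV) and 23 (LH); 14 has children 12 (EV) and 18 (EV); 23 has left child 22 (EV)]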
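The two single rotations are only a few pointer assignments each. A sketch, with hypothetical names `rotate_right` and `rotate_left`; updating the balance fields is left out here so the pointer manipulation stays visible.

```c
#include <stddef.h>

/* Hypothetical node layout; BST is a pointer to a node. */
typedef struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
} *BST;

/* Rotate to the RIGHT about root; returns the new root.
   root must have a left child. */
BST rotate_right(BST root)
{
    BST new_root = root->lchild;        /* left child becomes the root */
    root->lchild = new_root->rchild;    /* its right child (if any) becomes
                                           the LEFT child of the old root */
    new_root->rchild = root;            /* old root becomes the RIGHT child */
    return new_root;
}

/* Rotate to the LEFT about root: mirror image of rotate_right.
   root must have a right child. */
BST rotate_left(BST root)
{
    BST new_root = root->rchild;
    root->rchild = new_root->lchild;
    new_root->lchild = root;
    return new_root;
}
```

Rotating the tree 18 (children 12 and 20; 12 has children 8 and 14; 8 has left child 4) to the right gives a root of 12 with 8 on the left and 18 on the right, and 14 moves across to become the left child of 18.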
37More Complex Balancing
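- Slightly more complex operation is required if the RIGHT subtree of the LEFT child of a LH node increases in height
- right of left unbalanced tree
[Figure: root 18 (LH) with children 12 (EV) and 20 (EV); 12 has children 4 (EV) and 14 (EV); new node 16 is to be added as the right child of 14]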
38More Complex Balancing
- Root node is now unbalanced
- left child is RIGHT HIGH
- Solution
- rotate left child to the LEFT, then root to the RIGHT
[Figure: root 18 (LH) with children 12 (RH) and 20 (EV); 12 has children 4 (EV) and 14 (RH); 14 has right child 16]
39More Complex Balancing
- After first rotation
- now the left child is LEFT HIGH as in previous example
[Figure: root 18 (LH) with children 14 (LH) and 20 (EV); 14 has children 12 (LH) and 16 (EV); 12 has left child 4 (EV)]
40More Complex Balancing
- Rotate root to right
- Tree is now balanced
[Figure: root 14 (EV) with children 12 (LH) and 18 (EV); 12 has left child 4 (EV); 18 has children 16 (EV) and 20 (EV)]
41More Complex Balancing
- Identical (mirror-image) process for left of right unbalanced node
- rotate right child to the RIGHT
- rotate node to the LEFT
- See examples in lecture
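Both double rotations can be written as two single rotations. A sketch under the same assumptions as before (hypothetical names and node layout, balance fields not updated):

```c
#include <stddef.h>

/* Hypothetical node layout; BST is a pointer to a node. */
typedef struct bst_node {
    int data;
    struct bst_node *lchild;
    struct bst_node *rchild;
} *BST;

static BST rotate_right(BST root)
{
    BST new_root = root->lchild;
    root->lchild = new_root->rchild;
    new_root->rchild = root;
    return new_root;
}

static BST rotate_left(BST root)
{
    BST new_root = root->rchild;
    root->rchild = new_root->lchild;
    new_root->lchild = root;
    return new_root;
}

/* Right of left: rotate the left child to the LEFT,
   then the node itself to the RIGHT. */
BST rotate_left_right(BST root)
{
    root->lchild = rotate_left(root->lchild);
    return rotate_right(root);
}

/* Left of right (mirror image): rotate the right child to the RIGHT,
   then the node itself to the LEFT. */
BST rotate_right_left(BST root)
{
    root->rchild = rotate_right(root->rchild);
    return rotate_left(root);
}
```

Applying `rotate_left_right` to the tree 18 (children 12 and 20; 12 has children 4 and 14; 14 has right child 16) yields a root of 14 with 12 on the left and 18 on the right, and 16 becomes the left child of 18.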
42Automatic Balancing
- Checking the balance of a tree is an O(n) operation
- not efficient to do this after every insert/delete!
- Need a fast way of updating balance information
- Solution
- after every insert, inform parent node if height has increased
- after every delete, inform parent node if height has decreased
- if so, update balance of parent node
- As these are recursive, information is returned from recursive calls
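One way to pass the "height has increased" information back up the recursion is an extra output flag, here called `taller`. The sketch below covers insertion only and is not the course's code: the type `AVL`, the helper names and the flag are all assumptions. The balance values LH (1), EV (0) and RH (-1) are the three values of the balance field.

```c
#include <stdlib.h>

enum { RH = -1, EV = 0, LH = 1 };

typedef struct avl_node {               /* hypothetical node layout */
    int data;
    int balance;                        /* LH, EV or RH */
    struct avl_node *lchild, *rchild;
} *AVL;

static AVL rotate_right(AVL root)
{
    AVL pivot = root->lchild;
    root->lchild = pivot->rchild;
    pivot->rchild = root;
    return pivot;
}

static AVL rotate_left(AVL root)
{
    AVL pivot = root->rchild;
    root->rchild = pivot->lchild;
    pivot->lchild = root;
    return pivot;
}

/* Left subtree of an LH node has grown: rotate to restore balance. */
static AVL left_balance(AVL root)
{
    AVL left = root->lchild;
    if (left->balance == LH) {                      /* left of left */
        root->balance = left->balance = EV;
        return rotate_right(root);
    }
    AVL sub = left->rchild;                         /* right of left */
    root->balance = (sub->balance == LH) ? RH : EV;
    left->balance = (sub->balance == RH) ? LH : EV;
    sub->balance = EV;
    root->lchild = rotate_left(left);
    return rotate_right(root);
}

/* Right subtree of an RH node has grown: mirror image of left_balance. */
static AVL right_balance(AVL root)
{
    AVL right = root->rchild;
    if (right->balance == RH) {                     /* right of right */
        root->balance = right->balance = EV;
        return rotate_left(root);
    }
    AVL sub = right->lchild;                        /* left of right */
    root->balance = (sub->balance == RH) ? LH : EV;
    right->balance = (sub->balance == LH) ? RH : EV;
    sub->balance = EV;
    root->rchild = rotate_right(right);
    return rotate_left(root);
}

/* Inserts value and returns the new subtree root. On return, *taller is 1
   if the height of this subtree increased, which tells the caller (the
   parent node) whether its own balance field needs updating. */
AVL avl_insert(AVL root, int value, int *taller)
{
    if (root == NULL) {
        AVL n = malloc(sizeof *n);
        *taller = (n != NULL);
        if (n != NULL) {
            n->data = value;
            n->balance = EV;
            n->lchild = n->rchild = NULL;
        }
        return n;
    }
    if (value < root->data) {
        root->lchild = avl_insert(root->lchild, value, taller);
        if (*taller) {
            if (root->balance == LH) { root = left_balance(root); *taller = 0; }
            else if (root->balance == EV) root->balance = LH;   /* still taller */
            else { root->balance = EV; *taller = 0; }
        }
    } else if (value > root->data) {
        root->rchild = avl_insert(root->rchild, value, taller);
        if (*taller) {
            if (root->balance == RH) { root = right_balance(root); *taller = 0; }
            else if (root->balance == EV) root->balance = RH;   /* still taller */
            else { root->balance = EV; *taller = 0; }
        }
    } else {
        *taller = 0;                                /* duplicate: no change */
    }
    return root;
}
```

No heights are ever recomputed: each call inspects only the flag returned by its child and its own balance field, so the balance update adds constant work per level of the tree.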
43Huffman Trees
- Huffman trees are an interesting application of binary (not BST) trees
- They are used to perform data compression and encoding
- useful when relative probabilities of data patterns are unequal
- example: letters in English text
- t, e, a are much more common than x, z, q
- Huffman trees capture this inequality and create optimal bit patterns to represent each symbol
44Creating Huffman Trees
- All symbols (letters) are stored in ascending order of frequency
- for example, in an English message, the frequencies might be
- A 10, C 3, D 4, E 15, G 2, I 4, K 2, M 3, N 6, O 8, R 7, S 5, T 12, U 5
- These are then stored in ascending order, so K, G, M, C, ..., E
45Creating Huffman Trees
- Each symbol and its frequency can be considered as individual binary trees
- each starts as a single, unconnected node
- To combine into single tree
- remove two trees with LOWEST frequencies
- make these the left and right children of a new node
- new node has frequency equal to sum of these trees
- insert new tree into sorted list
- repeat until single tree remains
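The combining procedure above can be sketched with a sorted array of tree pointers. The names `HuffNode`, `huff_new`, `insert_sorted` and `huff_build` are hypothetical. How ties between equal frequencies are broken is arbitrary, so the exact shape of the tree (and hence individual codes) may differ from the lecture's figures even though the frequency totals agree.

```c
#include <stdlib.h>
#include <string.h>

typedef struct huff_node {
    char symbol;                       /* meaningful for leaf nodes only */
    unsigned freq;
    struct huff_node *left, *right;
} HuffNode;

static HuffNode *huff_new(char symbol, unsigned freq,
                          HuffNode *left, HuffNode *right)
{
    HuffNode *n = malloc(sizeof *n);
    if (n != NULL) {
        n->symbol = symbol;
        n->freq = freq;
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Inserts tree into list[0..count-1], keeping ascending order of frequency.
   The array must have room for one more element. */
static void insert_sorted(HuffNode **list, size_t count, HuffNode *tree)
{
    size_t i = count;
    while (i > 0 && list[i - 1]->freq > tree->freq) {
        list[i] = list[i - 1];
        i--;
    }
    list[i] = tree;
}

/* Builds a Huffman tree from n symbols and their frequencies.
   Returns the root, or NULL if n == 0 or memory runs out (in which case
   nodes already allocated are leaked; acceptable only in a sketch). */
HuffNode *huff_build(const char *symbols, const unsigned *freqs, size_t n)
{
    if (n == 0)
        return NULL;
    HuffNode **list = malloc(n * sizeof *list);
    if (list == NULL)
        return NULL;

    size_t count = 0;
    for (size_t i = 0; i < n; i++) {        /* one single-node tree per symbol */
        HuffNode *leaf = huff_new(symbols[i], freqs[i], NULL, NULL);
        if (leaf == NULL) { free(list); return NULL; }
        insert_sorted(list, count, leaf);
        count++;
    }
    while (count > 1) {
        /* the two LOWEST frequencies are at the front of the sorted list */
        HuffNode *merged = huff_new('\0', list[0]->freq + list[1]->freq,
                                    list[0], list[1]);
        if (merged == NULL) { free(list); return NULL; }
        memmove(list, list + 2, (count - 2) * sizeof *list);  /* remove them */
        count -= 2;
        insert_sorted(list, count, merged);  /* new tree back into the list */
        count++;
    }
    HuffNode *root = list[0];
    free(list);
    return root;
}
```

With the fourteen example frequencies (A 10, C 3, D 4, E 15, G 2, I 4, K 2, M 3, N 6, O 8, R 7, S 5, T 12, U 5), the root ends up with frequency 86, the sum of all the symbol frequencies.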
46Huffman Tree Example
Original nodes (all single): G 2, K 2, C 3, M 3, D 4, I 4, S 5, U 5, N 6, R 7, O 8, A 10, T 12, E 15
47Huffman Tree Example
G and K combined into single node with frequency 4
[Figure: remaining single nodes C 3, M 3, D 4, I 4, S 5, U 5, N 6, R 7, O 8, A 10, T 12, E 15, plus a new node of frequency 4 with children G 2 and K 2]
48Huffman Tree Example
C and M combined into single node with frequency 6
[Figure: remaining single nodes D 4, I 4, S 5, U 5, N 6, R 7, O 8, A 10, T 12, E 15, plus the combined nodes]
49Huffman Tree Example
D and the combined G/K node (frequency 4) combined into single node with frequency 8
[Figure: remaining single nodes I 4, S 5, U 5, N 6, R 7, O 8, A 10, T 12, E 15, plus the combined nodes]
50Huffman Tree Example
After many such merges..
51Huffman Tree Example
The final tree! (root node has frequency 86, the sum of all symbol frequencies)
52Calculating the Code
- Code for each symbol (letter) is created by following the branches from the root node
- every left branch = 0
- every right branch = 1
- Final combination is the code
- NOTE: variable length!
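Finding the code for a symbol is a depth-first walk from the root that records 0 for each left branch and 1 for each right branch. A sketch with hypothetical names (`HuffNode`, `huff_code`); the code is written as a string of '0' and '1' characters for readability rather than as packed bits.

```c
#include <stddef.h>

typedef struct huff_node {
    char symbol;                       /* meaningful for leaf nodes only */
    unsigned freq;
    struct huff_node *left, *right;
} HuffNode;

/* Searches the tree for target. If found, buf holds its code as a
   NUL-terminated string of '0'/'1' characters and 1 is returned;
   otherwise 0 is returned. depth is the length of the path so far
   (pass 0 for the root). buf needs room for the tree height + 1 chars. */
int huff_code(const HuffNode *tree, char target, char *buf, size_t depth)
{
    if (tree == NULL)
        return 0;
    if (tree->left == NULL && tree->right == NULL) {    /* leaf node */
        if (tree->symbol != target)
            return 0;
        buf[depth] = '\0';
        return 1;
    }
    buf[depth] = '0';                                   /* left branch = 0 */
    if (huff_code(tree->left, target, buf, depth + 1))
        return 1;
    buf[depth] = '1';                                   /* right branch = 1 */
    return huff_code(tree->right, target, buf, depth + 1);
}
```

For a small hand-built tree whose root has leaf E on the left and, on the right, an internal node with leaves A (left) and T (right), this gives E = 0, A = 10 and T = 11, showing the variable code length.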
53Code Creation Example
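[Figure: part of the final tree (root 86 with children 35 and 51), with each left branch labelled 0 and each right branch labelled 1, showing the paths down to T 12 and K 2]
Example codes: T = 101, K = 00001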
54Huffman Encoding
- High frequency symbols (T, A, E) have low number of bits
- Low frequency symbols (G, K, C, M) have high number of bits
- Unbalanced tree leads to HIGH compression
- most common values represented with very few bits
- Balanced tree leads to LOW compression
- all symbols represented with equal number of bits
55Huffman Coding
- All symbols are leaf nodes of the tree
- When transmitting or storing data
- tree is sent/stored first
- each symbol is encoded and sent in single bitstream
- Decoding is unambiguous since once a leaf is reached no further traversal is possible
- see example in lecture
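Decoding follows the bitstream down the tree and restarts at the root each time a leaf is reached. A sketch with hypothetical names (`HuffNode`, `huff_decode`), again using a string of '0' and '1' characters in place of a real bitstream:

```c
#include <stddef.h>

typedef struct huff_node {
    char symbol;                       /* meaningful for leaf nodes only */
    unsigned freq;
    struct huff_node *left, *right;
} HuffNode;

/* Decodes bits (a string of '0'/'1' characters) into out, which is
   NUL-terminated on return. Returns the number of symbols decoded.
   out needs room for one char per symbol plus the terminator.
   Decoding stops early if the input does not match the tree. */
size_t huff_decode(const HuffNode *root, const char *bits, char *out)
{
    size_t n = 0;
    const HuffNode *node = root;

    if (root != NULL) {
        for (; *bits != '\0'; bits++) {
            node = (*bits == '0') ? node->left : node->right;
            if (node == NULL)
                break;                                   /* malformed input */
            if (node->left == NULL && node->right == NULL) {
                out[n++] = node->symbol;                 /* reached a leaf */
                node = root;                             /* restart at root */
            }
        }
    }
    out[n] = '\0';
    return n;
}
```

With a tree in which E = 0, A = 10 and T = 11, the bitstream 01011 decodes to E, A, T with no separators needed between the codes.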