Title: Trees: General Principles, Ways of Thinking


1
Trees: General Principles, Ways of Thinking
  • Chapters 17 and 18 in DSPS
  • Chapter 4 in DSAA

2
Applications
  • Coding
  • Huffman, prefix
  • Parsing/Compiling
  • tree is standard internal representation for code
  • Information Storage/Retrieval
  • binary trees, AA-trees, AVL, Red-Black, Splay
  • Game-Playing (Scenario analysis)
  • virtual trees
  • alpha-beta search
  • Decision Trees
  • representation of choices
  • automatically constructed from data

3
General Trees
  • Tree Definition
  • distinguished root node
  • all other nodes have unique, sole parent
  • Depth of a node
  • number of edges from root to node
  • Height of a node
  • number of edges from node to deepest descendant
  • Balanced
  • Goal: O(log n) insert/delete/find
  • the heights of the sons of any node differ by at most 1
    (or, more generally, by at most k)
  • K-arity
  • nodes have at most k sons

4
Depth of a Node
[Tree diagram: nodes labeled with their depths - the root 0,
its children 1, their children 2]
Often convenient to add another field to node
structure for additional information such as
depth, height, visited, cost, father, number of
visits, number of nodes below, etc.
5
Height of a Node
[Tree diagram: nodes labeled with their heights - leaves 0,
the root 3]
6
Simple Relationships
  • Leaf height is 0
  • Height of a node is 1 + maximum height of its sons
  • Root depth is 0
  • Depth of a node is 1 + depth of its father
  • These can be computed recursively.

7
Three Tree Representations
  • List ("son") representation (variable number of children)
  • Object value
  • List children // a list of the sons (Node objects)
  • Sibling representation (variable number of children)
  • Object value
  • Node child // the leftmost child
  • Node sibling // each node points to its next sibling to
    the right
  • Array representation (k is a bound on the number of
    children)
  • Object value
  • Node[] children // array of size k

8
Sibling Representation
[Diagram: a tree with nodes a through f drawn twice - once in
the usual parent/children form and once in child/sibling form]
9
Depth of node (list rep)
  • Recall depth(node) is number of links form node
    to root.
  • Idea
  • depth of sons is 1 depth of father
  • call depth(root, 0)
  • Define depth(node n,int d)
  • mark depth at node n d
  • for each son of n, call
    depth(son,d1) (use iterator)
  • Marking can be done in two ways
  • have an addition field (int depth) for each node
  • have an array int depthnumber of nodes
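A minimal sketch of this marking pass in Java, assuming a
hypothetical Node class with a list of sons and an int depth
field (names are illustrative, not the course's own classes):

    import java.util.ArrayList;
    import java.util.List;

    class Node {
        Object value;
        List<Node> children = new ArrayList<>();  // the sons of this node
        int depth;                                // extra field used for marking

        // Mark n with depth d, then visit each son at depth d + 1.
        static void markDepth(Node n, int d) {
            n.depth = d;
            for (Node son : n.children) {
                markDepth(son, d + 1);
            }
        }
    }

Calling markDepth(root, 0) labels every node with its depth in
O(n) time.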

10
Depth of node (sibling rep)
  • Compute the depth of a node
  • Recall depth(node) is number of links form node
    to root.
  • Idea
  • depth of left son is 1 depth of father
  • depth of siblings is same as depth of father
  • Call depth(root, 0)
  • Define depth(node n, int d)
  • mark depth at node n as d
  • call depth(n.leftson,d1)
  • call depth(n.sibling, d)
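A minimal sketch in Java for the child/sibling representation,
assuming a hypothetical SibNode class (illustrative names):

    class SibNode {
        Object value;
        SibNode child;    // leftmost child
        SibNode sibling;  // next sibling to the right
        int depth;        // extra field used for marking

        // Mark n with depth d; the leftmost child is one level deeper,
        // while siblings share n's father and hence n's depth.
        static void markDepth(SibNode n, int d) {
            if (n == null) return;       // no child / no further sibling
            n.depth = d;
            markDepth(n.child, d + 1);
            markDepth(n.sibling, d);
        }
    }

As before, markDepth(root, 0) labels the whole tree.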

11
Height of Node
  • List representation
  • if the node is a leaf, height = 0
  • else height = 1 + max(heights of the sons)
  • Sibling representation (see the sketch below)
  • if the node is a leaf, height = 0
  • else height = max(1 + height of the left son, max of the
    heights of the siblings)
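The sibling-representation formula folds the two maxima into a
single recursion. A minimal Java sketch, assuming the same
child/sibling fields as in the previous sketch and the
convention height(empty) = -1, so a leaf evaluates to
1 + (-1) = 0:

    class SibHeight {
        Object value;
        SibHeight child;    // leftmost child
        SibHeight sibling;  // next sibling to the right

        // Returns the larger of height(n) and the heights of n's right
        // siblings; called on a node with no right sibling (e.g., the
        // root), this is simply that node's height.
        static int height(SibHeight n) {
            if (n == null) return -1;
            return Math.max(1 + height(n.child), height(n.sibling));
        }
    }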

12
Virtual Trees
  • Trees are often conceptual objects, but take too
    much room to store. Store only what is needed.
  • Representation
  • Node
  • object value
  • Node nextSon() returns null if no more sons,
    else returns the next son
  • In this representation you generate sons on the fly
    (see the sketch below)
  • E.g., in game playing you typically only store the depth
    of tree nodes.
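A minimal sketch of the nextSon() idea, assuming a toy "game"
whose moves from position v are 2v and 2v+1 (the class and the
move rule are invented purely for illustration):

    class VirtualNode {
        int position;      // the position this node represents
        int handedOut;     // how many sons nextSon() has produced so far

        VirtualNode(int position) { this.position = position; }

        // Returns null if there are no more sons, else the next son.
        // Sons are created on demand and never stored in the parent.
        VirtualNode nextSon() {
            if (position >= 8) return null;    // toy cutoff: these are leaves
            if (handedOut == 0) { handedOut++; return new VirtualNode(2 * position); }
            if (handedOut == 1) { handedOut++; return new VirtualNode(2 * position + 1); }
            return null;
        }
    }

A depth-first search over such nodes explores the whole
conceptual tree while keeping only the current path in memory.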

13
Standard Operations
  • Copying
  • Traversals (see the sketch below)
  • preorder, inorder, postorder, level-order
  • illustrated with printing, but any processing is fine
  • Find (Object o)
  • Insertion(Object o)
  • Deletion(Object o)
  • Complexity of these operations varies with
    constraints / structure of tree that must be
    preserved.
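A minimal sketch of the four traversals on a binary node,
illustrated with printing; the BNode class and its fields are
assumptions for the example:

    import java.util.ArrayDeque;
    import java.util.Queue;

    class BNode {
        Object value;
        BNode left, right;

        static void preorder(BNode n) {            // node, left, right
            if (n == null) return;
            System.out.println(n.value);
            preorder(n.left);
            preorder(n.right);
        }

        static void inorder(BNode n) {             // left, node, right
            if (n == null) return;
            inorder(n.left);
            System.out.println(n.value);
            inorder(n.right);
        }

        static void postorder(BNode n) {           // left, right, node
            if (n == null) return;
            postorder(n.left);
            postorder(n.right);
            System.out.println(n.value);
        }

        static void levelOrder(BNode n) {          // breadth-first, via a queue
            Queue<BNode> q = new ArrayDeque<>();
            if (n != null) q.add(n);
            while (!q.isEmpty()) {
                BNode cur = q.remove();
                System.out.println(cur.value);
                if (cur.left != null) q.add(cur.left);
                if (cur.right != null) q.add(cur.right);
            }
        }
    }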

14
Binary Trees
  • Object representation: a node has
  • Object value
  • Node left, right
  • Array representation
  • use an Object[] array
  • requires that you know the size of the tree, or use
    growable arrays
  • no pointer overhead
  • Trick: if a node is stored at index i, then
  • its left son is stored at 2i
  • its right son is stored at 2i+1
  • the root is stored at 1
  • the father of node i is at i/2
  • Generalizes to k-ary trees naturally (see the sketch below).
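A minimal sketch of the 1-based array trick (index 0 is left
unused so the index arithmetic stays clean; the class is
illustrative):

    class ArrayBinaryTree {
        Object[] a;                          // a[1] is the root

        ArrayBinaryTree(int capacity) { a = new Object[capacity + 1]; }

        static int leftOf(int i)   { return 2 * i; }      // left son of node i
        static int rightOf(int i)  { return 2 * i + 1; }  // right son of node i
        static int fatherOf(int i) { return i / 2; }      // father of node i (i > 1)

        boolean occupied(int i) { return i < a.length && a[i] != null; }
    }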

15
Binary Search Trees
  • Ordering invariant: left < node < right
  • i.e., any descendant in the left subtree is less
    than any descendant in the right subtree.
  • Operations (let d be the depth of the tree)
  • Object find(Key k)
  • sometimes the key and the object are the same
  • insert(Object o) or insert(Key k, Object o)
  • Object findMin()
  • removeMin()
  • removeElement(Object o)
  • Cost: all O(d) via separate and conquer (see the find
    sketch below)
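A minimal find sketch, assuming integer keys and an
illustrative BSTNode class; one comparison per level throws
away an entire subtree ("separate and conquer"), so the cost
is O(d):

    class BSTNode {
        int key;
        Object value;
        BSTNode left, right;

        static Object find(BSTNode n, int k) {
            while (n != null) {
                if (k < n.key)      n = n.left;     // answer can only be on the left
                else if (k > n.key) n = n.right;    // answer can only be on the right
                else                return n.value; // successful search
            }
            return null;                            // failed search
        }
    }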

16
Removing elements is tricky
  • How would you remove the value at the root?
  • Plan for remove(Object o)
  • 1. Find o, i.e., let n be the node in the tree with
    value o
  • 2. Keep a pointer ptr to the father of n
  • 3. If (n.right == null) ptr.son = n.left // not
    code
  • 4. Else
  • a. find the min in n.right
  • b. remove the min from n.right
  • c. ptr.son = new Node(min, n.left, n.right)
  • Assumes an appropriate constructor.
  • Make pictures of the cases.

17
Support routines
  • BinaryNode findMin(BinaryNode n)
  • Recursively
  • if (n.left == null) return n
  • else return findMin(n.left)
  • O(d) time and space
  • BinaryNode findMin(BinaryNode n)
  • Iteratively
  • while (n.left != null) n = n.left
  • return n
  • O(d) time, O(1) space

18
Remove Min
  • removeMin(BinaryNode n) idea
  • Node n = n.findMin()
  • father(n).right = n.right
  • // idea ok, code not right
  • What if the minimum is the root?
  • BinaryNode removeMin(BinaryNode n)
  • if (n.left != null)
  • n.left = removeMin(n.left)
  • else
  • n = n.right
  • return n

19
RemoveMin Examples
20
Remove Node Examples
[Diagram: example trees with nodes a through g illustrating
node removal]
21
removeNode
  • BinaryNode removeNode(BinaryNode x, BinaryNode n)
    // remove x from n
  • if (x < n) n.left = removeNode(x, n.left)
  • else if (x > n) n.right = removeNode(x, n.right)
  • // Now x == n
  • else if (n.left != null && n.right != null)
  • n.data = findMin(n.right).data
  • n.right = removeMin(n.right)
  • else // left or right is empty
  • n = (n.left != null) ? n.left : n.right
  • return n

22
Find a node (three meanings)
  • Search tree
  • given a node id, find id in tree.
  • Search tree
  • find a node with a specific property, e.g.
  • kth largest element (Order Statistic)
  • Separate and conquer answers this in O(log n) time
  • Arbitrary tree
  • find a node with a specific property
  • E.g. node is a position in game tree, find win
  • E.g. node is particular tour, find node(tour)
    with least cost

23
Separate and Conquer
  • Finding the kth smallest (Case Analysis)
  • Where can it be?

If the left subtree has i nodes, the right subtree has N-i-1
nodes. The kth smallest is at the root exactly when the left
subtree has k-1 nodes. If i < k-1, search for the (k-i-1)th
smallest in the right subtree. If i >= k, search for the kth
smallest in the left subtree. Complexity is the depth of the
tree, i.e. O(log n) when the tree is balanced. (See the sketch
below.)
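A minimal sketch of this case analysis, assuming each node
stores the size of its subtree (one of the bookkeeping fields
mentioned earlier); k is 1-based and the class is illustrative:

    class SizedNode {
        int key;
        int size = 1;                  // number of nodes in this subtree
        SizedNode left, right;

        static int sizeOf(SizedNode n) { return n == null ? 0 : n.size; }

        // kth smallest key under n, assuming 1 <= k <= sizeOf(n).
        static int kthSmallest(SizedNode n, int k) {
            int i = sizeOf(n.left);                  // i keys are smaller than the root
            if (k == i + 1) return n.key;            // the root is the kth smallest
            if (k <= i)     return kthSmallest(n.left, k);         // it lies on the left
            return kthSmallest(n.right, k - i - 1);  // skip the left subtree and the root
        }
    }

The recursion visits one node per level, so the cost is the
depth of the tree.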
24
Analysis Definitions
  • Problem: what is the average time to find or insert
    an element?
  • Definitions follow from the problem
  • Internal path length of a binary tree (IPL)
  • sum of the depths of all nodes = ipl
  • average cost of a successful search = average
    depth + 1 (the cost is the number of nodes you look at)
  • External path length of a binary tree (EPL)
  • sum of the costs of accessing all N+1 null references
    = epl
  • average cost of an insertion or a failed search =
    epl/(N+1)

25
Example of IPL and EPL
[Diagram: a 5-node tree with node depths 0, 1, 1, 2, 2; the
null references are drawn as external boxes]
IPL = 1+1+2+2 = 6
EPL = 2+2+3+3+3+3 = 16 = IPL + 2N (here N = 5)
What happens if you remove a leaf?
26
Picture Proof of how IPL relates to the IPL of the subtrees
[Diagram: an N-node tree split at the root into an i-node left
subtree and an (N-i-1)-node right subtree]
Each non-root node (N-1 of them) has its path length reduced
by 1 when measured inside its own subtree, so
IPL(tree) = IPL(left subtree) + IPL(right subtree) + N - 1.
27
Some Theorems
  • The average internal path length of a binary search
    tree is about 1.38 N log N
  • Proof that it is O(N log N)
  • Let D(N) = average ipl for a tree with N nodes
  • D(0) = D(1) = 0
  • D(N) = average over all splits of the tree (draw a
    picture)
  • D(N) = (N-1) + (1/N)(D(0) + ... + D(N-1))  (left splits)
  •        + (1/N)(D(0) + ... + D(N-1))        (right splits)
  • same as the quicksort analysis (to be
    done)
  • O(N log N)
  • Why does EPL = IPL + 2N? (induction)

28
Analysis Goal: express f(n) in terms of f(n-1), then expand
  • (2/n)(D(0) + ... + D(n-1)) + n = D(n)
  • 2(D(0) + ... + D(n-1)) + n^2 = nD(n)   (multiply by n)
  • Goal: compare with the previous equation, subtract, and
    hope
  • 2(D(0) + ... + D(n-2)) + (n-1)^2 = (n-1)D(n-1)
  • 2D(n-1) + 2n - 1 = nD(n) - (n-1)D(n-1)   (subtract)
  • nD(n) = (n+1)D(n-1) + 2n   (dropping the -1)
  • D(n)/(n+1) = D(n-1)/n + 2/(n+1)   EUREKA!
    Expand.
  • Hence D(n)/(n+1) = 2/(n+1) + 2/n + ... + 2/1

  • = 2(harmonic series), which is O(log n)
  • Conclusion: D(n) is O(n log n)

29
1/11/21/n is O(log n)
  • General Trick sum approximates integral and vice
    versa
  • Area under function 1/x is given by log(x).

[Plot: the curve 1/x with unit-width rectangles of heights
1, 1/2, 1/3, ... approximating the area under it]
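A worked version of the comparison (a standard bound, stated
here for completeness): since 1/x is decreasing,

    ln(n)  =  integral of 1/x from 1 to n
          <=  1/1 + 1/2 + ... + 1/n
          <=  1 + integral of 1/x from 1 to n  =  1 + ln(n),

so the harmonic sum is Theta(log n).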
30
Balanced Trees
  • Depth of the tree controls the amount of work for many
    operations, so...
  • Goal keep depth small
  • what does that mean?
  • What can be achieved?
  • What needs to be achieved?
  • AVL 1962 - very balanced
  • B-trees 1972 (reduce disk accesses)
  • Red-Black 1978
  • AA 1993, a little faster now
  • Splay trees probabilistically balanced (on
    finds)
  • All use rotations

31
AVL Tree
  • Recall the height of the empty tree is -1
  • In an AVL tree, for all nodes, the heights of the left and
    right subtrees differ by at most 1.
  • AVL trees have logarithmic height
  • Fibonacci numbers F1 = 1, F2 = 1, F3 = 2,
    F4 = 3, ...
  • Induction strikes. Thm: S(h) = F(h+3) - 1
  • Let S(i) = size of the smallest AVL tree of height i
  • S(0) = 1, S(1) = 2 (why?)
  • So S(1) = F(4) - 1
  • S(h) = S(h-1) + S(h-2) + 1 = (F(h+2) - 1) + (F(h+1) - 1) + 1
  •      = F(h+3) - 1
  • Hence the number of nodes grows exponentially with the
    height (see the check below).
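A quick check of the first few values (added for illustration),
using F1 = 1, F2 = 1, F3 = 2, F4 = 3, F5 = 5, F6 = 8, F7 = 13:

    S(0) = 1 = F(3) - 1
    S(1) = 2 = F(4) - 1
    S(2) = S(1) + S(0) + 1 = 4  = F(5) - 1
    S(3) = S(2) + S(1) + 1 = 7  = F(6) - 1
    S(4) = S(3) + S(2) + 1 = 12 = F(7) - 1

Since the Fibonacci numbers grow exponentially, an AVL tree of
height h contains at least exponentially many nodes, so h is
O(log N).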

32
On Insertion, what can go wrong?
  • Tree balanced before insertion

[Diagram: a tree that is balanced before the insertion, with
subtree heights H-1 and H marked]
33
Insertion
  • After an insertion, there are 4 ways the tree can become
    unbalanced. Check it out.
  • Outside imbalances are handled by single rotations
  • Inside imbalances are handled by double rotations.

[Diagram: an out-of-balance node after insertion, with nodes
p, q, r and subtrees a, b, c]
34
Maintaining Balance
  • Rebalancing: single and double rotations
  • Left rotation after insertion

[Diagram: a single left rotation after insertion, on nodes 1
and 2 with subtrees a, b, c]
35
Another View
[Diagram: left and right single rotations on nodes 1 and 2
with subtrees a, b, c, shown before and after]
Notice what happens to heights
36
Another View
[Diagram: left and right single rotations on nodes 1 and 2
with subtrees a, b, c, shown before and after]
Notice what happens to heights: for the LEFT rotation, in
general subtree a goes up 1, b stays at the same level, and c
goes down 1.
37
Single (left) rotation
  • Switches parent and child
  • In the diagram (n1 and n2 stand for the nodes labeled 1
    and 2): static Node leftRotate(Node n2)
  • Node n1 = n2.left
  • n2.left = n1.right
  • n1.right = n2
  • return n1
  • Appropriate test question
  • do it, i.e., given a sequence of inserts such as 6, 2, 7,
    1, -1, etc., show the succession of trees after the
    inserts and rotations.
  • Similar for right rotation

38
Double Rotation (left)
Out of balance: split.
[Diagram: nodes 1, 2, 3 before and after the double rotation]
39
In Steps
[Diagram: the double rotation carried out in steps on nodes
1, 2, 3 with subtrees a, b, c, d; node 2 ends up as the root
with 1 and 3 as its children]
40
Double Rotation Code (left-right)
  • Idea: rotate the left child with its right child
  • Then the node with its new left child
  • static BinaryNode doubleLeft(BinaryNode n)
  • n.left = rotateRight(n.left)
  • return rotateLeft(n)
  • Analogous code for the other middle case
  • All rotations are O(1) operations
  • Out-of-balance is checked after insertions and after
    deletions. All O(1).
  • For AVL trees, d is O(log N), so all operations are
    O(log N).

41
Red-Black Trees
  • Every node red or black
  • Root is black
  • If node red, children black
  • Every path from node to null has same number of
    black nodes
  • Implementation used in Swing library (JDK1.2) for
    search trees.
  • Single top-down pass means faster than AVL
  • Depth typically same as for AVL trees.
  • Code has many cases - skipping
  • Red-black trees are what you get via java.util.TreeSet
  • and you can supply a comparator (see the sketch below)
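A short usage sketch: TreeSet is backed by a red-black tree
(via TreeMap). Note that the generics and
Comparator.comparingInt used below come from JDKs newer than
the 1.2 mentioned above, and the comparator is supplied at
construction time:

    import java.util.Comparator;
    import java.util.TreeSet;

    public class TreeSetDemo {
        public static void main(String[] args) {
            // Natural ordering, maintained by the underlying red-black tree.
            TreeSet<String> names = new TreeSet<>();
            names.add("carol"); names.add("alice"); names.add("bob");
            System.out.println(names.first());     // alice

            // A different comparator gives a different ordering.
            TreeSet<String> byLength =
                new TreeSet<>(Comparator.comparingInt(String::length));
            byLength.add("carol"); byLength.add("al"); byLength.add("bob");
            System.out.println(byLength);          // [al, bob, carol]
        }
    }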

42
AA Trees
  • Simpler variant of Red-black trees
  • simpler and more efficient
  • Add two more properties
  • 5. Left children may not be red.
  • 6. Remove colors, use levels
  • Leaves are at level 1
  • If red, its level is the level of its parent
  • If black, its level is the level of its parent minus 1
  • Code also has many special cases

43
B-tree of order M
  • Goal: reduce the number of disk accesses
  • Generalization of binary trees
  • Method: keep the top of the tree in memory and use a large
    branching factor
  • A disk access is about 1000 times slower than a memory
    access
  • An M-ary tree yields O(log_(M/2) N) accesses
  • Data is stored only at the leaves
  • Nonleaves store up to M-1 keys
  • The root is a leaf or has 2 to M children
  • All internal nodes have (M+1)/2 to M children
  • All leaves are at the same depth and have (L+1)/2 to L
    children
  • Often L is set together with M
  • Practical algorithm, but the code is longish (many cases)

44
B-Tree Picture internal node
Key
Ptrs
...
Goal Store as many keys a possible Keys are in
order M-1 Keys M ptrs Space MptrSize
(M-1)KeySize
45
Representation
  • Leaf nodes are arrays of size M (or linked lists)
  • Internal nodes are
  • an array of size M-1 of keys
  • an array of size M of pointers to nodes
  • The keys are in order
  • The choice of M depends on the machine architecture and
    the problem.
  • M is the largest value for which
  • keySize*(M-1) + ptrSize*M still fits in a disk block
    (see the sketch below)
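A minimal sketch of this internal-node layout (the class and
field names are illustrative, not from the lecture):

    class BTreeInternalNode {
        final int m;           // order of the B-tree
        int keyCount;          // number of keys currently stored (at most m - 1)
        long[] keys;           // up to m - 1 keys, kept in sorted order
        Object[] children;     // up to m pointers to children (nodes or disk blocks)

        BTreeInternalNode(int m) {
            this.m = m;
            this.keys = new long[m - 1];
            this.children = new Object[m];
        }

        // Index of the child to follow when searching for key k:
        // the first child whose separating key exceeds k.
        int childIndexFor(long k) {
            int i = 0;
            while (i < keyCount && k >= keys[i]) i++;
            return i;
        }
    }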

46
Example Analysis (all on disk)
  • Suppose a disk block holds 8,192 bytes.
  • Suppose each key is 32 bytes, each branch (pointer) is 4
    bytes, and each data record is 256 bytes.
  • L = 32 (= 8192/256)
  • If the B-tree has order M, each interior node holds M-1
    keys.
  • An interior node therefore occupies 32(M-1) + 4M =
    36M - 32 bytes.
  • The largest M with 36M - 32 <= 8192 is M = 228.

47
Splay Trees
  • Like splay lists, only probabilistically ordered
  • Goal: minimize access time
  • Method: no reordering on insert
  • Reordering on finds only (as in splay lists)
  • Rotating the inserted node all the way up moves it to the
    root but makes the tree unbalanced
  • Instead, use double rotations: zig-zag and zig-zig
  • This rebalances the tree
  • Guarantees O(M log N) total cost for M operations, i.e.,
    amortized O(log N).

48
Summary
  • Depth of the tree determines overall costs
  • Balancing is achieved by rotations
  • AVL trees require 2 passes for insertions/deletions
  • a pass down to find the point
  • a pass up to do the corrections
  • Red-Black and AA trees require 1 pass
  • B-Trees are used for accessing information that won't fit
    in memory
  • General themes: CASE ANALYSIS, separate and conquer