Title: 2IL05 Data Structures 2IL06 Introduction to Algorithms
1 2IL05 Data Structures, 2IL06 Introduction to Algorithms
- Spring 2009, Lecture 9: Augmenting Data Structures
2 Announcements
- 2IL06
- last lecture with relevant material
- you need at least 50 points (from 5 assignments) to participate in the exam
3 Data structures
- Data structures are used in many applications
- directly: the user repeatedly queries the data structure
- indirectly: to make algorithms run faster
- In most cases a standard data structure is sufficient (possibly provided by a software library)
- But sometimes one needs additional operations that aren't supported by any standard data structure
- → need to design a new data structure?
- Not always: often augmenting an existing structure is sufficient
4 Example
- S: set of elements, each with a unique key.
- Operations
- Search(S, k): return a pointer to an element x in S with key[x] = k, or NIL if such an element does not exist.
- OS-Select(S, i): return a pointer to an element x in S with the i-th smallest key (the key with rank i)
- Solution: sorted array
- the key with rank i is stored in A[i]
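The sorted-array solution can be sketched in a few lines of Python. This is only an illustration under assumptions: the function names `search` and `os_select` are made up for this sketch, the list is 0-indexed (so rank i lives at index i - 1), and `search` returns an index rather than a pointer.

```python
from bisect import bisect_left

def search(a, k):
    """Return the index of key k in the sorted list a, or None if k is absent."""
    pos = bisect_left(a, k)          # binary search, O(log n)
    if pos < len(a) and a[pos] == k:
        return pos
    return None

def os_select(a, i):
    """Return the key with rank i (1-based) in the sorted list a, in O(1)."""
    if not 1 <= i <= len(a):
        raise IndexError("rank out of range")
    return a[i - 1]

# Example keys, stored in sorted order
keys = sorted([10, 2, 18, 50, 12, 17])
```

Both queries are fast here, but keeping the list sorted makes an insertion or deletion cost O(n) in the worst case, which is what motivates a tree-based structure.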
5 Example
- S: set of elements, each with a unique key.
- Operations
- Search(S, k): return a pointer to an element x in S with key[x] = k, or NIL if such an element does not exist.
- OS-Select(S, i): return a pointer to an element x in S with the i-th smallest key (the key with rank i)
- Insert(S, x): insert element x into S, that is, S ← S ∪ {x}
- Delete(S, x): remove element x from S
- Solution?
6 Idea: Use red-black trees
- OS-Select(S, 3): report key with rank 3
- Is the key with rank 3 in the left subtree, in the right subtree, or in the root?
- Idea 1: store the rank of each node in the node
[Figure: red-black tree; the node with key 2 has rank 1]
7 Idea 1: Store the rank in each node
- Insertion can change the rank of every node!
- → worst case O(n)
[Figure: tree whose nodes show key and rank: 2 (1), 10 (2), 12 (3), 17 (4), 18 (5), 50 (6)]
8 Idea 1: Store the rank in each node
- Problem: insertion can change the rank of every node!
- → worst case O(n)
- Idea 2: store the size of the subtree in each node
[Figure: tree whose nodes show key and rank: 2 (1), 10 (2), 12 (3), 17 (4), 18 (5), 50 (6)]
9 Idea 2: Store the size of the subtree
[Figure: tree whose nodes show key and subtree size: 10 (6), 2 (1), 18 (4), 12 (2), 17 (1), 50 (1)]
10 Idea 2: Store the size of the subtree
- store in each node x:
- left[x], right[x]
- parent[x]
- key[x]
- color[x]
- size[x] = number of keys in the subtree rooted at x (size[NIL] = 0)
- The resulting structure is an order-statistic tree
11 Order-statistic trees: OS-Select
- OS-Select(x, i): return pointer to node containing the i-th smallest key of the subtree rooted at x
- OS-Select(x, i)
    r ← size[left[x]] + 1
    if i = r
      then return x
    else if i < r
      then return OS-Select(left[x], i)
      else return OS-Select(right[x], i − r)
- Running time? O(log n)
[Figure: example OS-Select(x, 17) with left-subtree sizes 16, 24, and 15, giving r = 17, 25, and 16]
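A minimal Python sketch of OS-Select follows, assuming a plain binary search tree node with a `size` field. The `Node` class and the helper `size` are illustrative names, not part of the slides, and the red-black colouring is left out because OS-Select never looks at it.

```python
def size(node):
    """size[NIL] = 0, mirroring the convention on the slides."""
    return node.size if node is not None else 0

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        # size[x] = size[left[x]] + size[right[x]] + 1
        self.size = size(left) + size(right) + 1

def os_select(x, i):
    """Return the node holding the i-th smallest key in the subtree rooted at x."""
    if x is None:
        return None                      # i is out of range
    r = size(x.left) + 1                 # rank of x within its own subtree
    if i == r:
        return x
    if i < r:
        return os_select(x.left, i)
    return os_select(x.right, i - r)     # skip the r keys that are <= key[x]

# Example tree with keys 2, 10, 12, 17, 18, 50
root = Node(10, Node(2), Node(18, Node(12, None, Node(17)), Node(50)))
```

Each call descends one level, so on a balanced tree the running time is proportional to the height, O(log n).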
12 Order-statistic trees: OS-Rank
- OS-Rank(T, x): return the rank of x in the linear order determined by an inorder walk of T
- = 1 + number of keys smaller than key[x]
13 Order-statistic trees: OS-Rank
- OS-Rank(T, x): return the rank of x in the linear order determined by an inorder walk of T
- = 1 + number of keys smaller than key[x]
- OS-Rank(T, x)
    r ← size[left[x]] + 1
    y ← x
    while y ≠ root[T]
      do if y = right[p[y]]
           then r ← r + size[left[p[y]]] + 1
         y ← p[y]
    return r
- Running time? O(log n)
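The upward walk of OS-Rank can be sketched in Python as below. The sketch assumes parent pointers and recognises the root by `parent is None`, so the tree argument T of the pseudocode is dropped; the `Node` class and `size` helper are illustrative names.

```python
def size(node):
    return node.size if node is not None else 0

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left, self.right, self.parent = left, right, None
        self.size = 1
        for child in (left, right):
            if child is not None:
                child.parent = self      # wire up parent pointers
                self.size += child.size

def os_rank(x):
    """Return the rank of node x in the whole tree (1-based)."""
    r = size(x.left) + 1                 # rank of x in the subtree rooted at x
    y = x
    while y.parent is not None:          # y is not the root
        if y is y.parent.right:
            # everything in the parent's left subtree, and the parent itself,
            # precedes x in the inorder walk
            r += size(y.parent.left) + 1
        y = y.parent
    return r

# Example tree with keys 2, 10, 12, 17, 18, 50
n17 = Node(17)
n12 = Node(12, None, n17)
n50 = Node(50)
n18 = Node(18, n12, n50)
n2 = Node(2)
root = Node(10, n2, n18)
```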
14 OS-Rank: Correctness
- OS-Rank(T, x)
    r ← size[left[x]] + 1
    y ← x
    while y ≠ root[T]
      do if y = right[p[y]]
           then r ← r + size[left[p[y]]] + 1
         y ← p[y]
    return r
- Invariant: At the start of each iteration of the while loop, r = rank of key[x] in T_y (T_y: subtree with root y)
- Initialization
- r = rank of key[x] in T_x (y = x)
- = number of keys smaller than key[x] in T_x + 1
- = size[left[x]] + 1 (binary-search-tree property)
15 OS-Rank: Correctness
- OS-Rank(T, x)
    r ← size[left[x]] + 1
    y ← x
    while y ≠ root[T]
      do if y = right[p[y]]
           then r ← r + size[left[p[y]]] + 1
         y ← p[y]
    return r
- Invariant: At the start of each iteration of the while loop, r = rank of key[x] in T_y
- Termination
- loop terminates when y = root[T]
- → subtree rooted at y is the entire tree
- → r = rank of key[x] in the entire tree
16 OS-Rank: Correctness
- OS-Rank(T, x)
    r ← size[left[x]] + 1
    y ← x
    while y ≠ root[T]
      do if y = right[p[y]]
           then r ← r + size[left[p[y]]] + 1
         y ← p[y]
    return r
- Invariant: At the start of each iteration of the while loop, r = rank of key[x] in T_y
- Maintenance
- case i: y = right[p[y]]
- → all keys in T_left[p[y]] and key[p[y]] are smaller than key[x]
- → rank of key[x] in T_p[y] = rank of key[x] in T_y + size[left[p[y]]] + 1
17 OS-Rank: Correctness
- OS-Rank(T, x)
    r ← size[left[x]] + 1
    y ← x
    while y ≠ root[T]
      do if y = right[p[y]]
           then r ← r + size[left[p[y]]] + 1
         y ← p[y]
    return r
- Invariant: At the start of each iteration of the while loop, r = rank of key[x] in T_y
- Maintenance
- case ii: y = left[p[y]]
- → all keys in T_right[p[y]] and key[p[y]] are larger than key[x]
- → rank of key[x] in T_p[y] = rank of key[x] in T_y
18 Order-statistic trees: Insertion and deletion
- Insertion and deletion: as in a regular red-black tree, but we have to update the size[x] field
19 Red-black trees: Insertion
- Do a regular binary search tree insertion
- Fix the red-black properties
- Step 1
- find the leaf where the node should be inserted
- replace the leaf by a red node that contains the key to be inserted
- size of the new node = 1
- increment the size of each node on the search path
[Figure: key 15 is inserted as a new node x with size[x] = 1 into a tree containing keys 50, 12, and 17]
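Step 1 can be sketched in Python as an ordinary binary search tree insertion that increments `size` along the search path. This is a simplified illustration: the colouring and the later fix-up step are omitted, keys are assumed to be unique, and `Node` and `insert` are names chosen for this sketch.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1                    # size of a freshly inserted node

def insert(root, key):
    """Insert key (assumed not yet present) and return the root of the tree."""
    new = Node(key)
    if root is None:
        return new
    x = root
    while True:
        x.size += 1                      # x lies on the search path
        if key < x.key:
            if x.left is None:
                x.left = new
                break
            x = x.left
        else:
            if x.right is None:
                x.right = new
                break
            x = x.right
    return root

# Insert 50, 12, 17 and then 15
root = None
for k in (50, 12, 17, 15):
    root = insert(root, k)
```

Only the nodes on one root-to-leaf path are touched, so the extra work for maintaining `size` is proportional to the height of the tree.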
20 Red-black trees: Insertion
- Do a regular binary search tree insertion
- Fix the red-black properties
- Red-black properties
  1. Every node is either red or black.
  2. The root is black.
  3. Every leaf (nil[T]) is black.
  4. If a node is red, then both its children are black.
  5. For each node, all paths from the node to descendant leaves contain the same number of black nodes.
- The new node is red → Property 2 or 4 can be violated. Remove the violation by rotations and recoloring.
[Figure: tree with keys 50, 12, 17 and the newly inserted key 15]
21 Rotation
[Figure: right rotation around y and left rotation around x]
- A rotation affects only size[x] and size[y]
- We can determine the new values based on the sizes of the children:
- size[x] = size[left[x]] + size[right[x]] + 1
- and the same for y
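The size update after a rotation can be sketched as follows. The sketch assumes nodes without parent pointers, so each rotation returns the new subtree root and the caller is expected to reattach it; `Node`, `size`, `update_size`, `rotate_left` and `rotate_right` are illustrative names.

```python
def size(node):
    return node.size if node is not None else 0

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = size(left) + size(right) + 1

def update_size(x):
    x.size = size(x.left) + size(x.right) + 1

def rotate_left(x):
    """Left rotation around x; returns y, the new root of this subtree."""
    y = x.right
    x.right = y.left
    y.left = x
    update_size(x)       # x is now the lower node, so recompute it first
    update_size(y)
    return y

def rotate_right(y):
    """Right rotation around y; returns x, the new root of this subtree."""
    x = y.left
    y.left = x.right
    x.right = y
    update_size(y)       # y is now the lower node, so recompute it first
    update_size(x)
    return x

x = Node(10, Node(5), Node(20, Node(15), Node(30)))
y = rotate_left(x)
```

The order of the two updates matters: the node that ends up lower must be recomputed first, because the new size of the upper node depends on it.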
22 Order-statistic trees
- The operations Insert, Delete, Search, OS-Select, and OS-Rank can be executed with an order-statistic tree in O(log n) time.
23 Augmenting data structures
- Methodology for augmenting a data structure (with the OS tree as example)
- Choose an underlying data structure. (OS tree: R-B tree)
- Determine additional information to maintain. (OS tree: size[x])
- Verify that we can maintain the additional information for existing data structure operations. (OS tree: maintain size during insert and delete)
- Develop new operations. (OS tree: OS-Select and OS-Rank)
- You don't need to do these steps in strict order!
- Red-black trees are very well suited to augmentation
24 Augmenting red-black trees
- Theorem: Augment a R-B tree with field f, where f[x] depends only on information in x, left[x], and right[x] (including f[left[x]] and f[right[x]]). Then we can maintain the values of f in all nodes during insert and delete without affecting the O(log n) performance.
- When we alter information in x, changes propagate only upward on the search path for x
25 Augmenting red-black trees
- Theorem: Augment a R-B tree with field f, where f[x] depends only on information in x, left[x], and right[x] (including f[left[x]] and f[right[x]]). Then we can maintain the values of f in all nodes during insert and delete without affecting the O(log n) performance.
- Proof (insert)
- Step 1: Do a regular binary search tree insertion
- go up from the inserted node and update f
- additional time: O(log n)
26 Augmenting red-black trees
- Theorem: Augment a R-B tree with field f, where f[x] depends only on information in x, left[x], and right[x] (including f[left[x]] and f[right[x]]). Then we can maintain the values of f in all nodes during insert and delete without affecting the O(log n) performance.
- Proof (insert)
- Step 2: Fix the red-black properties by rotations and recoloring
- update f for x, y, and their ancestors
- additional time per rotation: O(log n)
27 Example: Interval trees
28 Interval trees
- S: set of closed intervals (closed: endpoints are part of the interval)
- Operations
- Interval-Insert(T, x): adds an element x, whose int field is assumed to contain an interval, to the interval tree T.
- Interval-Delete(T, x): removes the element x from the interval tree T.
- Interval-Search(T, j): returns a pointer to a node x in T such that int[x] overlaps j, or nil[T] if no such element exists.
[Figure: interval i with endpoints low[i] and high[i], and a query interval j]
29 Methodology
- Choose an underlying data structure.
- Determine additional information to maintain.
- Verify that we can maintain the additional information for existing data structure operations.
- Develop new operations.
30 Methodology
- Choose an underlying data structure.
- use red-black trees
- each node x contains an interval int[x]
- key is the left endpoint low[int[x]]
- an inorder walk would list the intervals sorted by low endpoint
- Determine additional information to maintain.
- Verify that we can maintain the additional information for existing data structure operations.
- Develop new operations.
31 Methodology
- Choose an underlying data structure. ✓
- Determine additional information to maintain.
- Verify that we can maintain the additional information for existing data structure operations.
- Develop new operations.
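32 Additional information for Interval-Search
[Figure: node storing interval i; its left subtree holds intervals with low[·] ≤ low[i], its right subtree holds intervals with low[·] > low[i]]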
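33 Additional information for Interval-Search
- case 1: i ∩ j ≠ Ø
- → report i
- case 2: j lies left of i (high[j] < low[i])
- → j cannot overlap any interval in the right subtree
[Figure: node storing interval i; its left subtree holds intervals with low[·] ≤ low[i], its right subtree holds intervals with low[·] > low[i]]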
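34 Additional information for Interval-Search
- case 1: i ∩ j ≠ Ø
- → report i
- case 2: j lies left of i (high[j] < low[i])
- → j cannot overlap any interval in the right subtree
- case 3: j lies right of i (low[j] > high[i])
- → need additional information!
- max[x] = max endpoint value in the subtree rooted at x = max{high[i] : i is stored in the subtree rooted at x}
[Figure: node storing interval i; its left subtree holds intervals with low[·] ≤ low[i], its right subtree holds intervals with low[·] > low[i]]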
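35 Additional information for Interval-Search
- case 1: i ∩ j ≠ Ø
- → report i
- case 2: j lies left of i (high[j] < low[i])
- → j cannot overlap any interval in the right subtree
- case 3: j lies right of i (low[j] > high[i])
- → j overlaps an interval in the left subtree if and only if low[j] ≤ max[left[x]], where x is the node storing i
- max[x] = max endpoint value in the subtree rooted at x = max{high[i] : i is stored in the subtree rooted at x}
[Figure: node storing interval i; its left subtree holds intervals with low[·] ≤ low[i], its right subtree holds intervals with low[·] > low[i]]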
36 Methodology
- Choose an underlying data structure. ✓
- Determine additional information to maintain. ✓
- Verify that we can maintain the additional information for existing data structure operations.
- Develop new operations.
37 Interval-Search
- Interval-Search(T, j)
    x ← root[T]
    while x ≠ nil[T] and j does not overlap int[x]
      do if left[x] ≠ NIL and max[left[x]] ≥ low[j]
           then x ← left[x]
           else x ← right[x]
    return x
- Correctness
- Invariant: If tree T contains an interval that overlaps j, then there is such an interval in the subtree rooted at x.
- Running time? O(log n)
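A Python sketch of Interval-Search is given below. It assumes an unbalanced binary search tree keyed on the low endpoint instead of a red-black tree, since balancing does not change the search logic; the `Node` class, the recursive `insert`, and the `overlaps` helper are illustrative names and not part of the slides.

```python
class Node:
    def __init__(self, low, high):
        self.low, self.high = low, high
        self.left = None
        self.right = None
        self.max = high                  # max endpoint in this subtree

def insert(root, low, high):
    """Insert the closed interval [low, high], keyed on low; return the root."""
    if root is None:
        return Node(low, high)
    if low < root.low:
        root.left = insert(root.left, low, high)
    else:
        root.right = insert(root.right, low, high)
    root.max = max(root.max, high)       # keep max correct along the search path
    return root

def overlaps(node, low, high):
    """Closed intervals overlap when neither lies strictly left of the other."""
    return node.low <= high and low <= node.high

def interval_search(root, low, high):
    """Return a node whose interval overlaps [low, high], or None."""
    x = root
    while x is not None and not overlaps(x, low, high):
        if x.left is not None and x.left.max >= low:
            x = x.left
        else:
            x = x.right
    return x

root = None
for lo, hi in [(16, 21), (8, 9), (25, 30), (5, 8), (15, 23),
               (17, 19), (26, 26), (0, 3), (6, 10), (19, 20)]:
    root = insert(root, lo, hi)
```

The loop follows a single root-to-leaf path, so on a balanced tree the search takes O(log n) time.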
38 Methodology
- Choose an underlying data structure. ✓
- Determine additional information to maintain. ✓
- Verify that we can maintain the additional information for existing data structure operations.
- Develop new operations. ✓
39 Maintain additional information
- Theorem: Augment a R-B tree with field f, where f[x] depends only on information in x, left[x], and right[x] (including f[left[x]] and f[right[x]]). Then we can maintain the values of f in all nodes during insert and delete without affecting the O(log n) performance.
- Additional information: max[x] = max endpoint value in the subtree rooted at x
- max[x] depends only on
- information in x: high[int[x]]
- information in left[x]: max[left[x]]
- information in right[x]: max[right[x]]
- → insert and delete still run in O(log n) time
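Because max[x] is a function of x and its two children only, it can be recomputed locally whenever the tree is restructured. The Python sketch below illustrates this for a left rotation; it is an assumption-laden example, with `Node`, `subtree_max`, `update_max` and `rotate_left` as made-up names and without parent pointers or colours.

```python
NEG_INF = float("-inf")

class Node:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        self.max = high
        update_max(self)

def subtree_max(node):
    """max of an empty subtree is taken to be minus infinity."""
    return node.max if node is not None else NEG_INF

def update_max(x):
    # max[x] = max(high[int[x]], max[left[x]], max[right[x]])
    x.max = max(x.high, subtree_max(x.left), subtree_max(x.right))

def rotate_left(x):
    """Left rotation around x; returns y, the new root of this subtree."""
    y = x.right
    x.right = y.left
    y.left = x
    update_max(x)        # lower node first
    update_max(y)
    return y

# x = [5, 8] with right child y = [10, 30], whose children are [7, 9] and [15, 18]
y = Node(10, 30, Node(7, 9), Node(15, 18))
x = Node(5, 8, None, y)
top = rotate_left(x)
```

Only the two rotated nodes need a local recomputation here, which is the situation the theorem describes.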
40 Search for all intervals
- Interval-Search-All(T, j): report all intervals that overlap j
- Worst case running time? Θ(k log n), where k = number of reported intervals
- How? See Assignment 7
- There is another type of interval tree that can answer this query in Θ(k + log n) time.
41 Another interval tree
- use a red-black tree, key is left endpoint low[·]
- store each interval i ∈ S in the highest node x such that i contains low[x]
[Figure: interval i1 stored at a node of the tree]
42 Another interval tree
- use a red-black tree, key is left endpoint low[·]
- store each interval i ∈ S in the highest node x such that i contains low[x]
- → each node can store several intervals
- → use two sorted lists (one on low[·] and one on high[·])
[Figure: a node storing intervals i1, i2, i3 in two lists, ordered i2, i3, i1 and i2, i1, i3]
43 Tutorials this week
- Small tutorials on Tuesday 3+4.
- No big tutorial on Wednesday 7+8.
- No small tutorial on Friday 7+8, but next week Friday 7+8!