Title: CS473-Algorithms I
1CS473-Algorithms I
- Lecture X
- Augmenting Data Structures
2How to Augment a Data Structure
- FOUR STEP PROCEDURE
- 1. Choose an underlying data structure (UDS)
- 2. Determine additional info to be maintained in the UDS
- 3. Verify that additional info can be maintained for the modifying operations on the UDS
- 4. Develop new operations
- Note: Order of steps may vary and be intermixed in real design.
3Example
- Design of our order statistic trees
- 1. Choose Red-Black (R-B) TREES
- 2. Additional Info: Subtree sizes
- 3. INSERT, DELETE => ROTATIONS
- 4. OS-RANK, OS-SELECT
- Bad design choice for OS-TREES
- 2. Additional Info: Store in each node its rank in the subtree
- OS-RANK, OS-SELECT would run quickly, but
- inserting a new minimum element would cause a change to this info in every node of the tree.
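The four steps can be sketched in Python for the order-statistic example. This is only an illustration under simplifying assumptions: the underlying tree is a plain unbalanced BST rather than an R-B tree (so there are no rotations to fix), keys are distinct, and the names `Node`, `size`, `insert`, `os_select` and `os_rank` are hypothetical. `os_rank` here walks down from the root instead of using parent pointers.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # additional info: number of nodes in this subtree


def size(node):
    """Subtree size, treating a missing child as size 0."""
    return node.size if node else 0


def insert(root, key):
    """Insert key into an unbalanced BST and refresh sizes on the way back up."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    root.size = 1 + size(root.left) + size(root.right)
    return root


def os_select(node, i):
    """Return the node holding the i-th smallest key (1-based), or None."""
    while node is not None:
        r = size(node.left) + 1  # rank of node inside its own subtree
        if i == r:
            return node
        if i < r:
            node = node.left
        else:
            i -= r
            node = node.right
    return None


def os_rank(root, key):
    """Return the 1-based rank of key, or 0 if key is not in the tree."""
    rank, node = 0, root
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            rank += size(node.left) + 1  # skip node and its left subtree
            node = node.right
        else:
            return rank + size(node.left) + 1
    return 0
```

Storing subtree sizes keeps each update local to one root-to-leaf path, which is the property the bad design choice (storing ranks) lacks.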
4Augmenting R-B Trees Theorem
- Theorem
- Let f be a field that augments a R-B Tree T of n nodes
- Suppose that f[x] for a node x can be computed using only
- the info in nodes x, left[x], right[x]
- f[left[x]] and f[right[x]]
- Then f can be maintained in all nodes of T during INSERT and DELETE without affecting the O(lgn) running time
- Proof: Main idea
- Changing f[x] => Update only f[p[x]] but nothing else
- Updating f[p[x]] => Update only f[p[p[x]]] but nothing else
- And so on up the tree until f[root[T]] is updated
- When f[root[T]] is updated, no other node depends on the new value
- So the process terminates
- Changing an f field in a node costs O(lgn) time since the height of a R-B tree is O(lgn)
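The main idea of the proof can be sketched as a loop that walks from the changed node up to the root. The sketch assumes parent pointers and uses a hypothetical example field, the sum of the keys in the subtree; `Node`, `compute_f` and `propagate_up` are illustrative names only.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.left = None
        self.right = None
        self.f = key  # example field: sum of keys in the subtree


def compute_f(x):
    """Compute f[x] from x and the f fields of its children only."""
    total = x.key
    if x.left is not None:
        total += x.left.f
    if x.right is not None:
        total += x.right.f
    return total


def propagate_up(x):
    """Recompute f on the path from x to the root; return nodes touched."""
    touched = 0
    while x is not None:
        x.f = compute_f(x)
        x = x.parent
        touched += 1
    return touched
```

The number of nodes touched is at most the height of the tree plus one, which is O(lgn) for an R-B tree.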
5INTERVALS: DEFINITIONS
- DEFINITION: A closed interval
- An ordered pair of real numbers [t1,t2] with t1 ≤ t2
- [t1,t2] = { t ∈ R : t1 ≤ t ≤ t2 }
- INTERVALS
- Used to represent events that each occupy a continuous period of time
- We wish to query a database of time intervals to find out what events occurred during a given interval
- Represent an interval [t1,t2] as an object i, with the fields low[i] = t1 and high[i] = t2
- Intervals i and i' overlap if i ∩ i' ≠ Ø, that is, low[i] ≤ high[i'] AND low[i'] ≤ high[i]
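6INTERVALS: DEFINITIONS
- Any two intervals i and i' satisfy the interval trichotomy
- That is, exactly one of the following 3 properties holds
- a) i and i' overlap
- (Figure: four ways in which i and i' can overlap)
- b) high[i] < low[i']
- (Figure: i lies entirely to the left of i')
- c) high[i'] < low[i]
- (Figure: i' lies entirely to the left of i)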
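The overlap test and the trichotomy translate directly into code. The following is a small sketch; the `Interval` type and the function names are assumptions made for illustration.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    low: float
    high: float


def overlaps(i, j):
    """Closed intervals overlap iff low[i] <= high[j] and low[j] <= high[i]."""
    return i.low <= j.high and j.low <= i.high


def trichotomy(i, j):
    """Return which of the three mutually exclusive cases holds."""
    if overlaps(i, j):
        return "a"  # i and j overlap
    if i.high < j.low:
        return "b"  # i is entirely to the left of j
    return "c"      # j is entirely to the left of i
```

Because the intervals are closed, two intervals that share only an endpoint, such as [5,11] and [11,20], count as overlapping.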
7INTERVAL TREES
- Maintain a dynamic set of elements with each element x containing an interval int[x]
- Support the following operations
- INSERT(T,x): Adds an element x whose int field contains an interval to the tree
- DELETE(T,x): Removes the element x from the tree T
- SEARCH(T,i): Returns a pointer to an element x in T such that
- int[x] overlaps with i
- NIL if no such element in the set.
8INTERVAL TREES(Cont.)
- S1: Underlying Data Structure
- Choose R-B Tree
- Each node x contains an interval int[x]
- Key of x = low[int[x]]
- Inorder tree walk of the tree lists the intervals in sorted order by low endpoints
- S2: Additional information
- Store in each node x the maximum endpoint max[x] in the subtree rooted at the node
9EXAMPLE
- Intervals: [7,10], [5,11], [17,19], [4,8], [15,18], [21,23]
- Fields stored in the tree nodes:
int        max
[17,19]    23
[5,11]     18
[21,23]    23
[4,8]      8
[15,18]    18
[7,10]     10
10INTERVAL TREES(cont.)
- S3: Maintaining Additional Info (max[x])
- max[x] = maximum { high[int[x]], max[left[x]], max[right[x]] }
- Thus, by theorem, INSERT and DELETE run in O(lgn) time
11INTERVAL TREES(cont.)
- INSERT OPERATION
- Fix subtree max's on the way down
- As we traverse the path for INSERTION, comparing the new low to that of node intervals,
- use the new high to update max of nodes as appropriate
- Restore balance with rotations, updating the max fields for each rotation
- (Figure: right rotation involving nodes [6,20] and [11,35]. Only the max fields of the two rotated nodes are recomputed; the subtrees below them are marked "No change".)
- Thus, fixing max fields for rotation takes O(1) time.
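A sketch of how a rotation can repair the max fields in O(1) time. It assumes nodes without parent pointers, so the rotation returns the new subtree root; `Node`, `update_max` and `right_rotate` are hypothetical names, and the interval values in the usage note are illustrative only.

```python
import math


def subtree_max(node):
    """max field of a subtree, with -infinity for a missing child."""
    return node.max if node else -math.inf


def update_max(x):
    """max[x] = maximum of high[int[x]], max[left[x]], max[right[x]]."""
    x.max = max(x.high, subtree_max(x.left), subtree_max(x.right))


class Node:
    def __init__(self, low, high, left=None, right=None):
        self.low, self.high = low, high
        self.left, self.right = left, right
        update_max(self)


def right_rotate(y):
    """Rotate right around y and return the new subtree root."""
    x = y.left
    y.left = x.right
    x.right = y
    update_max(y)  # y is now the lower node, so recompute it first
    update_max(x)  # then the new subtree root
    return x
```

For example, with `x = Node(6, 20, Node(4, 14), Node(9, 19))` and `y = Node(11, 35, x, Node(25, 30))`, calling `right_rotate(y)` changes only the max fields of `x` and `y`; the three subtrees keep their values.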
12INTERVAL TREES(cont.)
- S4: Developing new operations
- INTERVAL-SEARCH(T,i)
- x ← root[T]
- while x ≠ NIL and i ∩ int[x] = Ø do
-   if left[x] ≠ NIL and max[left[x]] ≥ low[i] then
-     x ← left[x]
-   else
-     x ← right[x]
- return x
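The search can be sketched in Python as follows. To stay short, the sketch assumes an unbalanced BST keyed on low endpoints instead of an R-B tree, so only the search logic and the max bookkeeping on the way down are shown; `Node`, `insert` and `interval_search` are hypothetical names.

```python
class Node:
    def __init__(self, low, high):
        self.low, self.high = low, high
        self.left = None
        self.right = None
        self.max = high  # maximum high endpoint in this subtree


def insert(root, low, high):
    """Insert [low, high], fixing max fields on the way down."""
    if root is None:
        return Node(low, high)
    root.max = max(root.max, high)
    if low < root.low:
        root.left = insert(root.left, low, high)
    else:
        root.right = insert(root.right, low, high)
    return root


def interval_search(root, low, high):
    """Return some node whose interval overlaps [low, high], or None."""
    x = root
    while x is not None and not (x.low <= high and low <= x.high):
        if x.left is not None and x.left.max >= low:
            x = x.left   # an overlap, if any exists, can be found on the left
        else:
            x = x.right  # nothing on the left can reach low
    return x


# Usage with the six intervals of the earlier example slide.
root = None
for lo, hi in [(17, 19), (5, 11), (21, 23), (4, 8), (15, 18), (7, 10)]:
    root = insert(root, lo, hi)
hit = interval_search(root, 14, 16)
print((hit.low, hit.high) if hit else None)
```

Inserting the intervals in this order happens to produce the max values listed on the example slide (23 at the root, 18 for [5,11], and so on).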
13INTERVAL TREES(cont.)
- Time: O(lgn)
- Starts with x at the root and proceeds downward
- on a single path, until
- EITHER an overlapping interval is found
- OR x becomes NIL
- Each iteration takes O(1) time
- Height of the tree = O(lgn)
14Correctness of the Search Procedure
- Key Idea: Need to check only 1 of the node's 2 children
- Theorem
- Case 1: If search goes right then
- Either overlap in the right subtree or no overlap
- Case 2: If search goes left then
- Either overlap in the left subtree or no overlap
15Correctness of the Search Procedure
- Case 1: Go Right
- If overlap in right, then done
- Otherwise (if no overlap in RIGHT)
- Either left[x] = NIL => No overlap in LEFT
- OR left[x] ≠ NIL and max[left[x]] < low[i]
- For each interval i' in LEFT
- high[i'] ≤ max[left[x]]
- < low[i]
- Therefore, no overlap in LEFT
16Correctness of the Search Procedure
- Case 2: Go Left
- If overlap in left, then done
- Otherwise (if no overlap in LEFT)
- low[i] ≤ max[left[x]] = high[i'] for some i' in LEFT
- Since i and i' don't overlap and low[i] ≤ high[i']
- We have high[i] < low[i'] (Interval Trichotomy)
- Since tree is sorted by lows we have
- high[i] < low[i'] ≤ any low in RIGHT
- Therefore, no overlap in RIGHT
17Pictorial View of Case 1 and Case 2
- (Figure: intervals drawn along the time axis t, with max[left[x]] marked, for Case 1 and Case 2)
- Case 1: i' is any interval in left
- Case 2: i'' is any interval in right
- i' is in left such that high[i'] = max[left[x]]
18Interval Trees
- How to enumerate all intervals overlapping a given interval
- Can do in O(klgn) time,
- where k = number of overlapping intervals
- Find and Delete overlapping intervals one by one
- When done reinsert them
- Theoretical best is O(k+lgn)
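For comparison, here is a sketch of a different enumeration technique from the find, delete and reinsert approach above: a pruned tree walk that skips any subtree whose max field or low keys rule out an overlap. It assumes an unbalanced BST keyed on low endpoints, and `Node`, `insert` and `all_overlaps` are hypothetical names. No running-time claim beyond the pruning itself is intended.

```python
class Node:
    def __init__(self, low, high):
        self.low, self.high = low, high
        self.left = None
        self.right = None
        self.max = high  # maximum high endpoint in this subtree


def insert(root, low, high):
    """Insert [low, high], fixing max fields on the way down."""
    if root is None:
        return Node(low, high)
    root.max = max(root.max, high)
    if low < root.low:
        root.left = insert(root.left, low, high)
    else:
        root.right = insert(root.right, low, high)
    return root


def all_overlaps(x, low, high, out):
    """Append every interval overlapping [low, high] to out, in order of low."""
    if x is None or x.max < low:
        return  # nothing in this subtree reaches the query
    all_overlaps(x.left, low, high, out)
    if x.low <= high and low <= x.high:
        out.append((x.low, x.high))
    if x.low <= high:  # otherwise every low on the right exceeds high
        all_overlaps(x.right, low, high, out)
```

Unlike the delete and reinsert approach, this walk leaves the tree unchanged.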
19How to maintain a dynamic set of numbers that support min-gap operations
- MIN-GAP(Q): returns the magnitude of the difference of the two closest numbers in Q
- Example: Q = {1,5,9,15,18,22} => MIN-GAP(Q) = 18-15 = 3
- Underlying Data Structure
- A R-B Tree containing the numbers keyed on the numbers
- Additional Info at each Node
- min-gap[x]: minimum gap value in the subtree T_x rooted at x
- min[x]: minimum value (key) in T_x
- max[x]: maximum value (key) in T_x
- These values are ∞ if x is a leaf node
20Step 3: Maintaining the Additional Info
- min[x] = min[left[x]] if left[x] ≠ NIL, key[x] otherwise
- max[x] = max[right[x]] if right[x] ≠ NIL, key[x] otherwise
- min-gap[x] = Min { min-gap[left[x]], min-gap[right[x]], key[x] - max[left[x]], min[right[x]] - key[x] }
- Each field can be computed from info in the node and its children
- Hence, by theorem, they would be maintained during insert and delete operations without affecting the O(lgn) running time
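The three recurrences can be sketched as a single update function that reads only the node and its children. In this sketch a missing child contributes ∞ to the minimum; the names `Node` and `update_fields` are assumptions, and the tree is built by hand rather than by R-B insertion.

```python
import math


def update_fields(x):
    """Recompute min, max and min_gap of x from x and its children only."""
    l, r = x.left, x.right
    x.min = l.min if l else x.key
    x.max = r.max if r else x.key
    x.min_gap = min(
        l.min_gap if l else math.inf,        # best gap inside the left subtree
        r.min_gap if r else math.inf,        # best gap inside the right subtree
        x.key - l.max if l else math.inf,    # gap to the predecessor of x
        r.min - x.key if r else math.inf,    # gap to the successor of x
    )


class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left, self.right = left, right
        update_fields(self)


# Q = {1, 5, 9, 15, 18, 22}, built bottom-up so children exist first.
root = Node(9, Node(5, Node(1)), Node(18, Node(15), Node(22)))
print(root.min, root.max, root.min_gap)
```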
21How to maintain a dynamic set of numbers that support min-gap operations (cont.)
- The reason for defining the min and max fields is to make it possible to compute min-gap from the info at the node and its children
- Develop the new operation MIN-GAP(Q)
- MIN-GAP(Q) simply returns the min-gap value of the root
- It is an O(1) time operation
- It is also possible to find the two closest numbers in O(lgn) time
22How to maintain a dynamic set of numbers that support min-gap operations (cont.)
- CLOSEST-NUMBERS(Q)
- x ← root[Q]
- gapmin ← min-gap[x]
- while x ≠ NIL do
-   if gapmin = min-gap[left[x]] then
-     x ← left[x]
-   elseif gapmin = min-gap[right[x]] then
-     x ← right[x]
-   elseif gapmin = key[x] - max[left[x]] then
-     return key[x], max[left[x]]
-   else
-     return min[right[x]], key[x]
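A runnable sketch of CLOSEST-NUMBERS follows. It assumes an unbalanced BST with distinct integer keys, treats a missing child as having min-gap ∞, and returns the pair with the smaller number first; `Node`, `update`, `insert`, `gap` and `closest_numbers` are hypothetical names.

```python
import math


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.min = self.max = key
        self.min_gap = math.inf


def gap(node):
    """min-gap of a subtree, with infinity for a missing child."""
    return node.min_gap if node else math.inf


def update(x):
    """Recompute the three fields of x from x and its children."""
    l, r = x.left, x.right
    x.min = l.min if l else x.key
    x.max = r.max if r else x.key
    x.min_gap = min(gap(l), gap(r),
                    x.key - l.max if l else math.inf,
                    r.min - x.key if r else math.inf)


def insert(root, key):
    """Unbalanced BST insert that refreshes the fields on the way back up."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    update(root)
    return root


def closest_numbers(root):
    """Return the two closest keys as (smaller, larger), or None if fewer than two."""
    if root is None or root.min_gap == math.inf:
        return None
    x, gapmin = root, root.min_gap
    while x is not None:
        if gap(x.left) == gapmin:
            x = x.left
        elif gap(x.right) == gapmin:
            x = x.right
        elif x.left is not None and x.key - x.left.max == gapmin:
            return (x.left.max, x.key)
        else:
            return (x.key, x.right.min)  # the gap must involve the successor
    return None
```

The loop follows a single root-to-leaf path, so on a balanced tree it takes O(lgn) time.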
23How to find the overlap of rectilinear rectangles
- Given a set R of n rectilinearly oriented rectangles
- i.e. sides of all rectangles are parallel to the x and y axes
- Each rectangle r ∈ R is represented with 4 values
- xmin[r], xmax[r], ymin[r], ymax[r]
- Give an O(nlgn)-time algorithm
- to decide whether R contains two rectangles that overlap
24- OVERLAP(R)
- TY ← Ø
- SORT xmin and xmax values of rectangles in R
- for each extremum x value in the sorted order do
-   r ← rectangle[x]
-   yint ← [ymin[r], ymax[r]]
-   if x = xmin[r] then
-     v ← INTERVAL-SEARCH(TY, yint)
-     if v ≠ NIL then
-       return TRUE
-     else
-       z ← MAKE-NEW-NODE()
-       left[z] ← right[z] ← p[z] ← NIL
-       int[z] ← yint
-       INSERT(TY,z)
-   else /* x = xmax[r] */
-     DELETE(TY,yint)
- return FALSE
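The sweep can be sketched in Python as below. Two assumptions are worth stating loudly: a plain list with a linear scan stands in for the interval tree TY, so this sketch does not reach O(nlgn), and at equal x values left edges are processed before right edges so that rectangles that merely touch count as overlapping. `Rect` and `any_overlap` are hypothetical names.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    xmin: float
    xmax: float
    ymin: float
    ymax: float


def any_overlap(rects):
    """Return True if some pair of rectangles overlaps (closed rectangles)."""
    events = []
    for idx, r in enumerate(rects):
        events.append((r.xmin, 0, idx))  # 0 = left edge, sorts before right edge
        events.append((r.xmax, 1, idx))  # 1 = right edge
    events.sort()

    active = []  # y-intervals of rectangles cut by the sweep line (stand-in for TY)
    for _, kind, idx in events:
        r = rects[idx]
        yint = (r.ymin, r.ymax)
        if kind == 0:
            # INTERVAL-SEARCH stand-in: linear scan for an overlapping y-interval
            if any(lo <= yint[1] and yint[0] <= hi for lo, hi in active):
                return True
            active.append(yint)   # INSERT
        else:
            active.remove(yint)   # DELETE
    return False
```

Replacing the list with an interval tree that supports INSERT, DELETE and INTERVAL-SEARCH in O(lgn) each gives the O(nlgn) bound, since sorting the 2n endpoints also costs O(nlgn).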