Title: Data Structures
1Data Structures
- Dynamic Sets
- Heaps
- Binary Trees Sorting
- Hashing
2Dynamic Sets
- Data structures that hold elements indexed with (usually unique) keys
- Support some basic operations, such as
- Search for an element with a given key
- Insert a new element
- Delete an element
- Find the minimum-key element
- Find the maximum-key element
- Elements can have unique or non-unique keys, depending on the application
- Keys can be ordered (e.g., integers)
3Some operations on dynamic sets
- SEARCH(S, k)
- Given set S and key k, return x such that key(x) = k, or NIL if not found
- INSERT(S, x)
- Augment S by adding x to it
- DELETE(S, x)
- Delete x from S
- MINIMUM(S), MAXIMUM(S)
- Return the element x in S with minimum (maximum) key(x)
- SUCCESSOR(S, x)
- Given x, return y in S with minimum key(y) > key(x), or NIL if key(x) is maximum
- PREDECESSOR(S, x)
- Given x, return y in S with maximum key(y) < key(x), or NIL if key(x) is minimum
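To make the operation list concrete, here is a minimal Python sketch of a dynamic set backed by an unsorted list. The class name `ListSet` and its method names are illustrative assumptions, not part of the slides; elements are taken to be their own keys, and `None` stands in for NIL.

```python
from typing import List, Optional


class ListSet:
    """Dynamic set of integer keys kept in an unsorted Python list (illustrative)."""

    def __init__(self) -> None:
        self.keys: List[int] = []

    def search(self, k: int) -> Optional[int]:
        # Linear scan; None plays the role of NIL.
        return k if k in self.keys else None

    def insert(self, x: int) -> None:
        self.keys.append(x)

    def delete(self, x: int) -> None:
        # Removes one occurrence of x; raises ValueError if x is absent.
        self.keys.remove(x)

    def minimum(self) -> Optional[int]:
        return min(self.keys) if self.keys else None

    def maximum(self) -> Optional[int]:
        return max(self.keys) if self.keys else None

    def successor(self, x: int) -> Optional[int]:
        # Smallest key strictly greater than x, or None if x is the maximum.
        larger = [k for k in self.keys if k > x]
        return min(larger) if larger else None

    def predecessor(self, x: int) -> Optional[int]:
        # Largest key strictly smaller than x, or None if x is the minimum.
        smaller = [k for k in self.keys if k < x]
        return max(smaller) if smaller else None


s = ListSet()
for key in (9, 16, 4, 1):
    s.insert(key)
print(s.successor(4), s.predecessor(4))  # prints: 9 1
```

Every operation except `insert` scans the whole list here, which is the kind of trade-off the choice of data structure is meant to address.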
4Data Structures for Sets
- Several data structures can support sets
- Arrays
- Linked Lists
- Trees
- Heaps
- Hash Tables
- etc
- Depending on the required set of operations, and
on their frequency, different data structures are
preferable
5Example Linked Lists
[Figure: singly linked list head[L] → 9 → 16 → 4 → 1 → NIL]
- A list L consists of a head, head[L], and a set of <key, next_ptr> values
- Every list ends with NIL
- Lists can additionally point to objects indexed by the keys
6Variants of Linked Lists
[Figure: three linked lists with head[L]: simple (ending in NIL), doubly linked (NIL at both ends), and circular with sentinel nil[L]]
- Simple
- Doubly linked
- Circular
- Notice the sentinel nil[L]
7Implementing Sets as Lists
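- HEAD(L): return nil[L].next
- TAIL(L): return nil[L].prev
- INSERT(L, x)
- x.next ← HEAD(L)
- HEAD(L).prev ← x
- HEAD(L) ← x
- x.prev ← nil[L]
[Figure: x is linked in at the head of a circular list with sentinel nil[L], one pointer update per step]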
12Implementing Sets as Lists
- LIST-DELETE(L, x)
- x.prev.next ← x.next
- x.next.prev ← x.prev
- LIST-SEARCH(L, k)
- x ← HEAD(L)
- while x ≠ nil[L] and x.key ≠ k
- x ← x.next
- return x
- How fast do INSERT and DELETE run?
- How fast does LIST-SEARCH run?
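The list operations above can be sketched in Python as follows. The class names `Node` and `SentinelList` and the helper `keys` are assumptions made for illustration; the sentinel object plays the role of nil[L]. In this sketch insertion and deletion touch a constant number of pointers, so they run in O(1) given a pointer to the node, while the search walks the list and takes O(n) in the worst case.

```python
class Node:
    def __init__(self, key=None):
        self.key = key
        self.prev = self  # a lone node points at itself
        self.next = self


class SentinelList:
    """Circular doubly linked list; self.nil stands in for nil[L]."""

    def __init__(self):
        self.nil = Node()  # nil.next is the head, nil.prev is the tail

    def insert(self, x):
        # Splice x in at the head: O(1).
        x.next = self.nil.next
        self.nil.next.prev = x
        self.nil.next = x
        x.prev = self.nil

    def delete(self, x):
        # Unlink x, given a pointer to it: O(1).
        x.prev.next = x.next
        x.next.prev = x.prev

    def search(self, k):
        # Walk from the head until the key or the sentinel is reached: O(n).
        x = self.nil.next
        while x is not self.nil and x.key != k:
            x = x.next
        return x  # the sentinel itself means "not found"

    def keys(self):
        out, x = [], self.nil.next
        while x is not self.nil:
            out.append(x.key)
            x = x.next
        return out


L = SentinelList()
for key in (1, 4, 16, 9):
    L.insert(Node(key))
print(L.keys())  # prints: [9, 16, 4, 1]
```

Because the sentinel is always present, neither `insert` nor `delete` needs a special case for an empty list or for the first and last elements.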
13Sorted Lists
- INSERT x
- Search for y such that y.key ≤ x.key ≤ y.next.key, then link x in between y and y.next
- DELETE x
- Same as unsorted lists
- MINIMUM
- Return HEAD(L)
- MAXIMUM
- Return TAIL(L)
- LIST-SEARCH x
- Same as unsorted lists
- EXTRACT-MAX
- DELETE MAXIMUM
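As a rough illustration of the sorted-list INSERT, the sketch below keeps a circular list with a sentinel in ascending key order. The function names `sorted_insert` and `to_keys` are hypothetical. Once the list is sorted, MINIMUM and MAXIMUM are simply the head and the tail.

```python
class Node:
    def __init__(self, key=None):
        self.key = key
        self.prev = self
        self.next = self


def sorted_insert(nil, x):
    """Insert node x into the ascending circular list whose sentinel is nil."""
    y = nil
    # Advance y while the next key is still smaller than x.key.
    while y.next is not nil and y.next.key < x.key:
        y = y.next
    # Link x between y and y.next.
    x.next = y.next
    x.prev = y
    y.next.prev = x
    y.next = x


def to_keys(nil):
    out, x = [], nil.next
    while x is not nil:
        out.append(x.key)
        x = x.next
    return out


nil = Node()  # empty list: the sentinel points at itself
for key in (9, 16, 4, 1):
    sorted_insert(nil, Node(key))
print(to_keys(nil))                  # prints: [1, 4, 9, 16]
print(nil.next.key, nil.prev.key)    # MINIMUM and MAXIMUM: prints 1 16
```

The price of O(1) MINIMUM and MAXIMUM is that INSERT now scans for its position, so it takes O(n) in the worst case.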
14Running Times of Basic Operations
15Heaps
- Heap: a very efficient binary tree data structure
- Heap operations
- INSERT
- DELETE
- EXTRACT-MAX
- Heapsort
- A priority queue based on heaps
- Heap operations
- DECREASE-KEY
16Heaps Definition
[Figure: contents of the heap, array A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1], with heap_size(A) and length(A) marked]
- A heap is an array A[1..length(A)]
- heap_size(A) ≤ length(A) is the size of the heap
- A[1], ..., A[heap_size(A)] are the elements in the heap
17Heaps Definition
[Figure: A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] drawn as an array and as a binary tree with root 16]
- A heap is also a binary tree
- PARENT(i)
- return ⌊i/2⌋
- LEFT(i)
- return 2i
- RIGHT(i)
- return 2i + 1
18Heaps Definition
- The heap property
- For every i > 1,
- A[PARENT(i)] ≥ A[i]
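The index arithmetic and the heap property can be checked with a few lines of Python. This is a sketch under one stated assumption: Python lists are 0-based, so the slide's 1-based formulas ⌊i/2⌋, 2i and 2i + 1 become (i − 1) // 2, 2i + 1 and 2i + 2. The function names are illustrative, and the array is the example heap as read from the slide figures.

```python
def parent(i):
    return (i - 1) // 2


def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


def is_max_heap(a):
    # Heap property: every non-root element is no larger than its parent.
    return all(a[parent(i)] >= a[i] for i in range(1, len(a)))


A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
print(is_max_heap(A))                                  # prints: True
print(is_max_heap([4, 16, 10, 14, 7, 9, 3, 2, 8, 1]))  # prints: False
```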
19Maintaining the Heap Property
[Figure: A = [4, 16, 10, 14, 7, 9, 3, 2, 8, 1] as an array and as a binary tree]
- Example
- Heap property is violated at the root
- Left and right subtrees are heaps
- HEAPIFY(A, 1) will fix this
- HEAPIFY
- Propagate the problem down
- At each step, exchange the problem node with its largest child
- After step 1: A = [16, 4, 10, 14, 7, 9, 3, 2, 8, 1]
- After step 2: A = [16, 14, 10, 4, 7, 9, 3, 2, 8, 1]
- After step 3: A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
23Maintaining the Heap Property
- HEAPIFY(A, i)
- l ← LEFT(i)
- r ← RIGHT(i)
- if l ≤ heap_size(A) and A[l] > A[i]
- then max ← l
- else max ← i
- if r ≤ heap_size(A) and A[r] > A[max]
- then max ← r
- if max ≠ i
- then exchange(A[i], A[max])
- HEAPIFY(A, max)
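A minimal Python sketch of HEAPIFY follows. It assumes 0-based indexing, so the children of i are 2i + 1 and 2i + 2 and the bounds test is `< heap_size` rather than `≤`. Passing `heap_size` as an explicit parameter is a choice made for this sketch. The input is the example array with the violation at the root.

```python
def heapify(a, i, heap_size):
    """Sift a[i] down until the subtree rooted at i is a max-heap.

    Assumes the subtrees rooted at the children of i are already max-heaps.
    """
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < heap_size and a[l] > a[i] else i
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, heap_size)  # propagate the problem down


A = [4, 16, 10, 14, 7, 9, 3, 2, 8, 1]
heapify(A, 0, len(A))
print(A)  # prints: [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```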
24Heaps are balanced
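- Claim: The size of each child heap is < 2/3 N, where N is the size of the parent
- Proof
- Let N = parent heap size
- A, B = child heap sizes
[Figure: both child heaps are full down to k levels; the last level holds a nodes on the left and b nodes on the right]
- $A = 1 + 2 + \dots + 2^{k-1} + a$
- $B = 1 + 2 + \dots + 2^{k-1} + b$
- $N = 1 + 2 + \dots + 2^{k} + a + b = 1 + A + B$
- Worst case: $a = 2^{k}$, $b = 0$
- $B = 1 + 2 + \dots + 2^{k-1} = 2^{k} - 1$
- $A = 1 + 2 + \dots + 2^{k} = 2B + 1$
- $N = 1 + A + B = 3B + 2$
- $A/N = \frac{2(B+1) - 1}{3(B+1) - 1} < \frac{2(B+1)}{3(B+1)} = \frac{2}{3}$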
25HEAPIFY runs in time O(log n)
- $T(n) \le T(2n/3) + \Theta(1)$, so $T(n) = O(\log n)$ by the Master Theorem
- Case 2: $T(n) = 1 \cdot T(n / (3/2)) + c$
- $f(n) = c$
- $a = 1$, $b = 3/2$
- $n^{\log_b a} = n^{\log_{3/2} 1} = n^{0} = 1$
- $f(n) = \Theta(n^{\log_b a})$, therefore
- $T(n) = \Theta(n^{\log_b a} \log n) = \Theta(\log n)$
- Alternatively, $T(n) = O(h)$, where h is the height of the heap
26Building a Heap
[Figure: A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7] as an array and as a binary tree]
- Build a heap starting from an unordered array
- The leaves are already heaps of size 1
- Go up one-by-one to the root, fixing the heap property
- Do that by running HEAPIFY
- After HEAPIFY(A, 4): A = [4, 1, 3, 14, 16, 9, 10, 2, 8, 7]
- After HEAPIFY(A, 3): A = [4, 1, 10, 14, 16, 9, 3, 2, 8, 7]
- After HEAPIFY(A, 2): A = [4, 16, 10, 14, 7, 9, 3, 2, 8, 1]
- After HEAPIFY(A, 1): A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
- Each HEAPIFY takes time O(height)
- BUILD-HEAP(A)
- heap_size(A) ← length(A)
- for i ← ⌊length(A)/2⌋ downto 1
- HEAPIFY(A, i)
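BUILD-HEAP can be sketched in Python as below, again assuming 0-based indexing: the last internal node is at index n // 2 − 1, which corresponds to ⌊length(A)/2⌋ in the 1-based pseudocode. The `heapify` helper is repeated only so the block runs on its own.

```python
def heapify(a, i, heap_size):
    # Sift a[i] down inside a[0:heap_size] (0-based children: 2i+1, 2i+2).
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < heap_size and a[l] > a[i] else i
    if r < heap_size and a[r] > a[largest]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        heapify(a, largest, heap_size)


def build_heap(a):
    # Leaves are already heaps of size 1, so start at the last internal node
    # and walk up to the root.
    for i in range(len(a) // 2 - 1, -1, -1):
        heapify(a, i, len(a))


A = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_heap(A)
print(A)  # prints: [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
```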
34Running Time of BUILD-HEAP
- How fast does BUILD-HEAP run?
- Here is a bound
- At most N = heap_size(A) calls to HEAPIFY
- Each call takes at most O(log N) time
- Therefore, running time is O(N log N)
- Are we done?
35Two lemmas left as exercises
- Lemma 1
- An N-element heap has height $\lfloor \lg N \rfloor$
- Lemma 2
- An N-element heap has at most $\lceil N / 2^{h+1} \rceil$ nodes of height h
[Figure: a heap of height 3 with node heights h = 3, 2, 1, 0 labeled by level]
36Running Time of BUILD-HEAP
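- Each HEAPIFY takes time O(h)
- HEAPIFY is called at most once per node
- $T(N) = \sum_{h=0}^{\lfloor \lg N \rfloor} \lceil N / 2^{h+1} \rceil \, O(h) = O\left(N \sum_{h=0}^{\lfloor \lg N \rfloor} h / 2^{h}\right)$
- $\sum_{h=0}^{\lfloor \lg N \rfloor} h / 2^{h} < \sum_{h=0}^{\infty} h (1/2)^{h} = \frac{1/2}{(1 - 1/2)^{2}} = 2$, by A.8
- $T(N) = O(2N) = O(N)$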
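The two sums in this argument can be checked numerically. The sketch below is only a sanity check, not a proof; the function names are hypothetical, and the constant hidden in O(h) is taken to be 1.

```python
def partial_sum(m):
    """Sum of h / 2^h for h = 0..m; approaches 2 as m grows."""
    return sum(h / 2 ** h for h in range(m + 1))


def heapify_work_bound(n):
    """Sum over heights h of ceil(n / 2^(h+1)) * h, for n >= 1."""
    height = n.bit_length() - 1  # floor(lg n)
    # -(-n // d) is integer ceiling division.
    return sum(-(-n // 2 ** (h + 1)) * h for h in range(height + 1))


print(partial_sum(10))         # prints: 1.98828125
print(heapify_work_bound(10))  # prints: 10
```

For every n tried here the bound stays below 2n, which agrees with the O(N) conclusion.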
37Heapsort
[Figure: A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1] as an array and as a binary tree, animated through the sort]
- HEAPSORT(A)
- BUILD-HEAP(A)
- for i ← length(A) downto 2
- exchange(A[1], A[i])
- heap_size(A)--
- HEAPIFY(A, 1)
- Array after each iteration (heap | sorted part)
- i = 10: [14, 8, 10, 4, 7, 9, 3, 2, 1 | 16]
- i = 9: [10, 8, 9, 4, 7, 1, 3, 2 | 14, 16]
- i = 8: [9, 8, 3, 4, 7, 1, 2 | 10, 14, 16]
- i = 7: [8, 7, 3, 4, 2, 1 | 9, 10, 14, 16]
- i = 6: [7, 4, 3, 1, 2 | 8, 9, 10, 14, 16]
- i = 5: [4, 2, 3, 1 | 7, 8, 9, 10, 14, 16]
- i = 4: [3, 2, 1 | 4, 7, 8, 9, 10, 14, 16]
- i = 3: [2, 1 | 3, 4, 7, 8, 9, 10, 14, 16]
- i = 2: [1 | 2, 3, 4, 7, 8, 9, 10, 14, 16]
- Running time?
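Here is a hedged Python sketch of HEAPSORT. It assumes 0-based indexing and uses an iterative sift-down in place of the recursive HEAPIFY; the behavior is the same, and the variable `end` plays the role of the shrinking heap_size(A). Building the heap costs O(n), and each of the n − 1 iterations costs O(log n), so the whole sort runs in O(n log n).

```python
def heapify(a, i, heap_size):
    # Iterative sift-down of a[i] within a[0:heap_size].
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def heapsort(a):
    n = len(a)
    # BUILD-HEAP: heapify every internal node, from the last one up to the root.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    # Repeatedly move the maximum to the end and shrink the heap by one.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        heapify(a, 0, end)


A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1]
heapsort(A)
print(A)  # prints: [1, 2, 3, 4, 7, 8, 9, 10, 14, 16]
```

The sort is in place: the sorted suffix grows inside the same array that holds the shrinking heap.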
53Priority Queues
- A priority queue S is a data structure supporting
- MAXIMUM(S)
- Returns the maximum key in S
- EXTRACT-MAX(S)
- Removes the maximum key in S and returns it
- INCREASE-KEY(S, xptr, xnew)
- Increases the key xold, stored at xptr, to xnew > xold
- INSERT(S, x)
- S ← S ∪ {x}
54Heaps as Priority Queues
- Heaps can be efficient priority queues
- MAXIMUM is implemented in O(1)
- MAXIMUM(A)
- return A[1]
- EXTRACT-MAX is implemented in ??
- EXTRACT-MAX(A)
- if heap_size(A) < 1 then error(underflow)
- max ← A[1]
- A[1] ← A[heap_size(A)]
- heap_size(A)--
- HEAPIFY(A, 1)
- return max
55Heaps as Priority Queues
- INCREASE-KEY is implemented in O(log n)
- INCREASE-KEY(A, i, key)
- if key < A[i] then error(key too small)
- A[i] ← key
- while i > 1 and A[PARENT(i)] < A[i]
- exchange(A[i], A[PARENT(i)])
- i ← PARENT(i)
56Example of INCREASE-KEY
[Figure: INCREASE-KEY(A, 9, 15) on A = [16, 14, 10, 8, 7, 9, 3, 2, 4, 1], as an array and as a binary tree]
- A[9] ← 15: A = [16, 14, 10, 8, 7, 9, 3, 2, 15, 1]
- Exchange with parent A[4]: A = [16, 14, 10, 15, 7, 9, 3, 2, 8, 1]
- Exchange with parent A[2]: A = [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]
- A[1] = 16 is not smaller than 15, so the loop stops
61Heaps as Priority Queues
- INSERT is implemented in O(log n)
- INSERT(A, key)
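- heap_size(A)++
- A[heap_size(A)] ← −∞
- INCREASE-KEY(A, heap_size(A), key)
[Figure: INSERT 11 into the heap [16, 15, 10, 14, 7, 9, 3, 2, 8, 1]: a new leaf −∞ is added, its key is raised to 11, and it is exchanged with its parent 7, giving [16, 15, 10, 14, 11, 9, 3, 2, 8, 1, 7]]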
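The four priority-queue operations can be put together in one small Python class. This is a sketch under stated assumptions: the class name `MaxHeapPQ` and its method names are illustrative, indexing is 0-based, and the Python list's own length serves as heap_size. In this sketch `maximum` is O(1), while `extract_max`, `increase_key` and `insert` each follow one root-to-leaf path at most and so take O(log n).

```python
import math


class MaxHeapPQ:
    """Max-priority queue on a 0-based array heap (illustrative sketch)."""

    def __init__(self, heap=None):
        # 'heap' is assumed to satisfy the heap property already.
        self.a = list(heap) if heap else []

    def maximum(self):
        return self.a[0]

    def extract_max(self):
        if not self.a:
            raise IndexError("heap underflow")
        top = self.a[0]
        last = self.a.pop()        # shrink the heap by one
        if self.a:
            self.a[0] = last       # move the last leaf to the root
            self._heapify(0)
        return top

    def increase_key(self, i, key):
        if key < self.a[i]:
            raise ValueError("key too small")
        self.a[i] = key
        # Float the key up while it is larger than its parent.
        while i > 0 and self.a[(i - 1) // 2] < self.a[i]:
            p = (i - 1) // 2
            self.a[i], self.a[p] = self.a[p], self.a[i]
            i = p

    def insert(self, key):
        self.a.append(-math.inf)   # new leaf smaller than any key
        self.increase_key(len(self.a) - 1, key)

    def _heapify(self, i):
        n = len(self.a)
        while True:
            l, r, largest = 2 * i + 1, 2 * i + 2, i
            if l < n and self.a[l] > self.a[largest]:
                largest = l
            if r < n and self.a[r] > self.a[largest]:
                largest = r
            if largest == i:
                return
            self.a[i], self.a[largest] = self.a[largest], self.a[i]
            i = largest


pq = MaxHeapPQ([16, 14, 10, 8, 7, 9, 3, 2, 4, 1])
pq.increase_key(8, 15)   # 0-based index 8 is A[9] on the slides
pq.insert(11)
print(pq.a)              # prints: [16, 15, 10, 14, 11, 9, 3, 2, 8, 1, 7]
print(pq.extract_max())  # prints: 16
```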
62Summary