Disjoint Data Sets - PowerPoint PPT Presentation

About This Presentation
Title:

Disjoint Data Sets

Slides: 24
Provided by: michal85

Transcript and Presenter's Notes

1
Disjoint Data Sets
2
This class
  • The methods of disjoint set data structure
  • An application
  • Implementations and improvements
  • An array
  • Backward forest stored in an array
  • Backward forest with improved height
  • Backward forest with improved height and path
    compression

3
Data Structures for Disjoint Sets
  • A disjoint-set data structure is a collection of
    sets S = {S1, ..., Sk} such that Si ∩ Sj = ∅ for
    i ≠ j.
  • The methods are:
  • find(x) returns a reference to the set Si ∈ S such
    that x ∈ Si
  • merge(x, y) results in S ← (S − {Si, Sj}) ∪
    {Si ∪ Sj}, where x ∈ Si and y ∈ Sj
  • A merge consists of 2 finds and a union of two
    sets
  • S = {{a}, {b}, {c}, {d}, {e}}
  • union(a, d) updates the collection to S =
    {{a, d}, {b}, {c}, {e}}

a ← find(a)
d ← find(d)
4
The Number of Operations
  • Assume
  • Initially there are N sets
  • Each merge reduces the number of sets by 1. So
    the maximum number of merges is N-1.
  • There are n find and m < N union operations
  • The order in which they are done is unknown
  • Goal: We need an implementation that gives an
    optimal aggregate time for a sequence of n + m
    operations

5
Application of disjoint-set data structure
  • Problem: Find the connected components of a
    graph.
  • 1. Make a set of each vertex.
  • 2. For each edge, if the two end points are
    not in the same set, merge the two sets.
  • At the end, each set contains the vertices of a
    connected component.
  • We can now answer the question: are vertices x
    and y in the same component?

6
Example: Find Connected Vertices
Graph G with vertices 1, 2, 3, 4, 5 and edges
E = {(1,2), (1,5), (2,5), (3,4)}
1. Make a set of each vertex
Set of sets of vertices: V = {{1}, {2}, {3}, {4}, {5}}
2. For each edge in E do
merge(1,2): V = {{1, 2}, {3}, {4}, {5}}
merge(1,5): V = {{1, 2, 5}, {3}, {4}}
merge(2,5): V = {{1, 2, 5}, {3}, {4}}
merge(3,4): V = {{1, 2, 5}, {3, 4}}
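
The two steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name connected_components and the dictionary named parent are hypothetical (they do not appear in the slides), and the sets are kept as a simple parent-pointer forest with the smaller value as representative.

```python
def connected_components(vertices, edges):
    """Group vertices into connected components using disjoint sets."""
    # Step 1: make a set of each vertex (each vertex represents itself).
    parent = {v: v for v in vertices}

    def find(x):
        # Follow parent links until reaching a root, which points to itself.
        while parent[x] != x:
            x = parent[x]
        return x

    # Step 2: for each edge, merge the two sets if the end points differ.
    for u, v in edges:
        rep_u, rep_v = find(u), find(v)
        if rep_u != rep_v:
            # Assumption: keep the smaller value as the representative.
            parent[max(rep_u, rep_v)] = min(rep_u, rep_v)

    # Collect the vertices of each component under its representative.
    components = {}
    for v in vertices:
        components.setdefault(find(v), []).append(v)
    return sorted(components.values())


example = connected_components([1, 2, 3, 4, 5],
                               [(1, 2), (1, 5), (2, 5), (3, 4)])
print(example)  # [[1, 2, 5], [3, 4]]
```

Two vertices are in the same component exactly when they appear in the same inner list.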
7
Disjoint Set Implementation in an array
  • We can use an array or a linked list to
    implement the collection. In this lecture we
    examine only an array implementation.
  • The size of the array is N for a total of N
    elements.
  • One element is the representative of the set.
  • In the array Set, each element i, for i = 1, ..., N,
    has the value rep of the representative of its
    set (Set[i] = rep).
  • We use the smallest value of the elements in a
    set as the representative.

8
Using an Array to implement DS
Set = {{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}}
index: 1 2 3 4 5 6 7 8
Set:   1 2 3 4 5 6 7 8
merge(4, 7): Set = {{1}, {2}, {3}, {4, 7}, {5}, {6}, {8}}
index: 1 2 3 4 5 6 7 8
Set:   1 2 3 4 5 6 4 8
9
DS implemented as an array
  • find1(x)
    • return Set[x]
  • Θ(1).
  • union1(repx, repy)
    • smaller ← min(repx, repy)
    • larger ← max(repx, repy)
    • for k ← 1 to N do if Set[k] = larger then
      Set[k] ← smaller
  • Θ(N) in every case. After N - 1 union operations
    the computation time is Θ(N²), which is too slow.
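
A minimal Python sketch of find1 and union1 follows. It assumes a 1-based array with index 0 left unused; the helpers make_sets and merge1 are hypothetical names added for illustration and are not defined in the slides.

```python
def make_sets(n):
    """Return the array Set for n singleton sets (index 0 is unused)."""
    return list(range(n + 1))


def find1(set_arr, x):
    """Return the representative of x with a single array lookup."""
    return set_arr[x]


def union1(set_arr, repx, repy):
    """Relabel the set with the larger representative; scans all N entries."""
    smaller, larger = min(repx, repy), max(repx, repy)
    for k in range(1, len(set_arr)):
        if set_arr[k] == larger:
            set_arr[k] = smaller


def merge1(set_arr, x, y):
    """Hypothetical helper: two finds followed by a union."""
    union1(set_arr, find1(set_arr, x), find1(set_arr, y))


sets = make_sets(8)
merge1(sets, 4, 7)
print(sets[1:])  # [1, 2, 3, 4, 5, 6, 4, 8]
```

The loop in union1 runs over the whole array even when only one entry changes, which is where the Θ(N) cost per union comes from.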

10
DS is implemented as an array
  • For the following sequence of merges we show the
    resulting array

index:                           1 2 3 4 5 6
Initial array:                   1 2 3 4 5 6
After merge(5, 6):               1 2 3 4 5 5
After merge(4, {5, 6}):          1 2 3 4 4 4
After merge(3, {4, 5, 6}):       1 2 3 3 3 3
After merge(2, {3, 4, 5, 6}):    1 2 2 2 2 2
After merge(1, {2, 3, 4, 5, 6}): 1 1 1 1 1 1
11
Backward forests
  • Sets are represented by backward rooted trees,
    with the element in the root representing the set
  • Each node points to its parent in the tree
  • The root points to itself
  • Backward forests can be stored in an array

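Array representation
index: 1 2 3 4 5 6 7
Set:   1 1 1 3 4 4 7
(Two trees: root 1 with children 2 and 3, where 4 is a child of 3 and 5 and 6 are children of 4; root 7 alone.)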
12
Backward forests stored in an array
  • find2(x)
    • rep ← x
    • while (rep != Set[rep])
      • rep ← Set[rep]
    • return rep
  • find2 is O(height) of the tree in the worst case

Example: find2(4)
index: 1 2 3 4 5 6 7
Set:   1 1 1 3 4 4 7
(rep = 4) → (Set[rep] = 3)
(rep = 3) → (Set[rep] = 1)
(rep = 1), (Set[rep] = 1): stop and return 1
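
The loop in find2 can be sketched in Python as follows, assuming a 1-based array with index 0 unused. The array literal reproduces the forest from the slide; the variable name forest is an illustrative choice.

```python
def find2(set_arr, x):
    """Walk parent links from x until reaching a root (a node that points to itself)."""
    rep = x
    while rep != set_arr[rep]:
        rep = set_arr[rep]
    return rep


# Forest from the slide: index 0 is unused, Set[1..7] = 1 1 1 3 4 4 7.
forest = [0, 1, 1, 1, 3, 4, 4, 7]
print(find2(forest, 4))  # 1, reached via 4 -> 3 -> 1
print(find2(forest, 7))  # 7, which is its own root
```

The number of loop iterations equals the depth of x in its tree, which is why the cost is bounded by the tree height.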
13
Backward forests stored in an array
  • union2(repx, repy)
    • smaller ← min(repx, repy)
    • larger ← max(repx, repy)
    • Set[larger] ← smaller
  • union2 is O(1)

14
Disjoint-set implemented as forests
  • Example: merge2(2, 5)
  • find2(2) traverses up one link and returns 1.
    find2(5) traverses up 2 links and returns 3.
  • union2 adds a back link from the root of the tree
    with rep 3 to the root of the tree with rep 1.

index:       1 2 3 4 5 6
Set before:  1 1 3 3 4 4
Set after:   1 1 1 3 4 4
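
The merge2(2, 5) example can be replayed with a short Python sketch. It assumes a 1-based array with index 0 unused, and it assumes that merge2 is simply two calls to find2 followed by union2, since the slides do not list the body of merge2.

```python
def find2(set_arr, x):
    """Walk parent links from x up to the root of its tree."""
    rep = x
    while rep != set_arr[rep]:
        rep = set_arr[rep]
    return rep


def union2(set_arr, repx, repy):
    """Link the root with the larger value to the root with the smaller value."""
    smaller, larger = min(repx, repy), max(repx, repy)
    set_arr[larger] = smaller


def merge2(set_arr, x, y):
    """Assumed composition: two finds, then one constant-time union."""
    union2(set_arr, find2(set_arr, x), find2(set_arr, y))


forest = [0, 1, 1, 3, 3, 4, 4]  # Set[1..6] before the merge
merge2(forest, 2, 5)
print(forest[1:])  # [1, 1, 1, 3, 4, 4]
```

Only one array entry changes (Set[3]), in contrast to the full scan of union1.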
15
Disjoint-set implemented as backward forests: What
is the worst case height?
  • The following example shows that N - 1 merges may
    create a tree of height N - 1
  • Now N - 1 unions take a total of O(N) time.
  • n find operations take O(nN) in the worst case.
  • Initially, each element is a tree by itself.

16
Disjoint-set implemented as forests
  • The order of execution of the "merge2" calls affects
    the height of the trees. Consider the following
    sequence of merges:
  • merge2(5, 6)
  • merge2(4, {5, 6})
  • merge2(3, {4, 5, 6})
  • merge2(2, {3, 4, 5, 6})
  • merge2(1, {2, 3, 4, 5, 6})

Tree of height N - 1: 6 → 5 → 4 → 3 → 2 → 1
index: 1 2 3 4 5 6
Set:   1 1 2 3 4 5
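
To see the chain form, the merge sequence above can be replayed in Python. This is a sketch under assumptions: the array is 1-based with index 0 unused, and the helper names find2_with_count and build_chain are hypothetical, introduced only to count the links a find follows.

```python
def find2_with_count(set_arr, x):
    """Return (root, number of parent links followed) for x."""
    rep, links = x, 0
    while rep != set_arr[rep]:
        rep = set_arr[rep]
        links += 1
    return rep, links


def union2(set_arr, repx, repy):
    """Link the root with the larger value to the root with the smaller value."""
    smaller, larger = min(repx, repy), max(repx, repy)
    set_arr[larger] = smaller


def build_chain(n):
    """Apply merge2(n-1, n), merge2(n-2, ...), ..., merge2(1, ...) to n singletons."""
    set_arr = list(range(n + 1))
    for x in range(n - 1, 0, -1):
        repx, _ = find2_with_count(set_arr, x)
        repy, _ = find2_with_count(set_arr, x + 1)
        union2(set_arr, repx, repy)
    return set_arr


chain = build_chain(6)
print(chain[1:])                   # [1, 1, 2, 3, 4, 5]
print(find2_with_count(chain, 6))  # (1, 5): N - 1 links for N = 6
```

Each merge here finds both roots without following any links, so the unions stay cheap; the cost shows up later, when a find starts from the deepest node.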
17
Disjoint-set forests with improved height
  • A heuristic to improve time by decreasing the
    height of the trees.
  • Requires another array that contains the heights,
    initialized to 0.
  • We modify union2 to decrease the height of the
    trees to O(lg N) in the worst case.
  • union3 links the root of the tree with the
    smaller height to the root of the tree with the
    larger height.
  • Now find2 is O(lg N) and union3 is O(1).

18
Disjoint-set forests with improved height
  • union3(repx, repy)
    • if (height[repx] == height[repy])
      • height[repx]++
      • Set[repy] ← repx // y's tree points to x's
        tree
    • else
      • if height[repx] > height[repy]
        • Set[repy] ← repx // y's tree points to
          x's tree
      • else
        • Set[repx] ← repy // x's tree points to
          y's tree
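
A Python sketch of union3 is given here, assuming 1-based arrays with index 0 unused. Two things are additions not found in the slides: the early return when both representatives are equal (without it the height would be incremented for a tree merged with itself), and the helper merge3, which is assumed to be two finds followed by union3.

```python
def find2(set_arr, x):
    """Walk parent links from x up to the root of its tree."""
    rep = x
    while rep != set_arr[rep]:
        rep = set_arr[rep]
    return rep


def union3(set_arr, height, repx, repy):
    """Link the root of the shorter tree under the root of the taller tree."""
    if repx == repy:
        return  # assumption: guard added here, not present in the slides
    if height[repx] == height[repy]:
        height[repx] += 1          # equal heights: the combined tree grows by one
        set_arr[repy] = repx       # y's tree points to x's tree
    elif height[repx] > height[repy]:
        set_arr[repy] = repx       # y's tree points to x's tree
    else:
        set_arr[repx] = repy       # x's tree points to y's tree


def merge3(set_arr, height, x, y):
    """Assumed composition: two finds, then union3."""
    union3(set_arr, height, find2(set_arr, x), find2(set_arr, y))


n = 6
sets = list(range(n + 1))
height = [0] * (n + 1)
# The merge order that produced a chain of height N - 1 with union2.
for x in range(n - 1, 0, -1):
    merge3(sets, height, x, x + 1)
print(sets[1:])   # [5, 5, 5, 5, 5, 5]
print(height[5])  # 1
```

With the same merge order, every element ends up one link from the root instead of forming a chain.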

19
Merge with reduced height
  • Example: merge3(2, 5)
  • find2(2) traverses up one link and returns 1.
    find2(5) traverses up 2 links and returns 3.
  • union3 adds a back link from the root of the tree of
    height 1 with rep 1 to the root of the tree of
    height 2 with rep 3.

Set and height, with h(1) = 1 and h(3) = 2
index:         1 2 3 4 5 6
Set before:    1 1 3 3 4 4
height before: 1 0 2 1 0 0
Set after:     3 1 3 3 4 4
height after:  1 0 2 1 0 0
20
Disjoint-set forests also with path compression
  • Another heuristic to improve time
  • Path compression (done during find3). The nodes
    along a path from x to the root will now point
    directly to the root.
  • This doubles the time of a find, since the path
    is traversed twice
  • To save time, find3 does not update the height
  • Rank is used instead of height, since the true
    height of the tree may be smaller than the rank
  • Useful when the number of finds n is very large,
    since most of the time find3 will be O(1)

21
Find and compress
Example: find3(4)
  • find3(x)
    • // find root of tree with x
    • root ← x
    • while (root != Set[root])
      • root ← Set[root]
    • // compress path from x to root
    • node ← x
    • while (node != root)
      • parent ← Set[node]
      • Set[node] ← root // node points to root
      • node ← parent
    • return root

[Figure: the tree with nodes 1 to 5, before and after find3(4)]
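
The two passes of find3 can be sketched in Python as below, assuming a 1-based array with index 0 unused. The five-node chain used as input is a hypothetical example chosen for illustration; it is not claimed to match the original figure.

```python
def find3(set_arr, x):
    """Find the root of x, then make every node on the path point to it."""
    # First pass: find the root of the tree containing x.
    root = x
    while root != set_arr[root]:
        root = set_arr[root]
    # Second pass: compress the path from x to the root.
    node = x
    while node != root:
        parent = set_arr[node]
        set_arr[node] = root   # node now points directly to the root
        node = parent
    return root


# Hypothetical chain 5 -> 4 -> 3 -> 2 -> 1 (index 0 unused).
chain = [0, 1, 1, 2, 3, 4]
print(find3(chain, 4))  # 1
print(chain[1:])        # [1, 1, 1, 1, 4]
```

Node 5 is not on the path from 4 to the root, so it keeps pointing to 4; a later find3(5) would follow two links and then shorten that path as well.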
22
Disjoint-set forests with path compression
  • Careful analysis shows that when a sequence of n
    finds and m < N unions is performed,
  • the computation time using path compression becomes
    O((n + m) α(n + m, n)), where α(n + m, n) is the
    inverse of the Ackermann function.
  • The Ackermann function grows very fast, but the
    inverse of the Ackermann function grows more
    slowly than lg n (and lg n grows very slowly).
    For all practical n + m and n, α(n + m, n) ≤ 3,
    and the time for n finds and m unions is linear in
    n + m.

23
Summary
  • The worst case time to perform n finds and m < N
    unions is:
  • An array: O(n + mN)
  • Backward forest stored in an array: O(nN + m)
  • Backward forest with improved height: O(n lg N + m)
  • Backward forest with improved height and path
    compression: O((n + m) α(n + m, n))
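
For reference, the last variant in the summary (improved height plus path compression) might be packaged as a small Python class. This is a sketch under assumptions: the class name DisjointSet, its method names, and the boolean return value of merge are illustrative choices, and elements are assumed to be the integers 1 to N.

```python
class DisjointSet:
    """Backward forest with union by rank and path compression (illustrative)."""

    def __init__(self, n):
        # Index 0 is unused so that elements are numbered 1..n.
        self.parent = list(range(n + 1))
        self.rank = [0] * (n + 1)

    def find(self, x):
        # First pass: locate the root.
        root = x
        while root != self.parent[root]:
            root = self.parent[root]
        # Second pass: point every node on the path at the root.
        while x != root:
            next_node = self.parent[x]
            self.parent[x] = root
            x = next_node
        return root

    def merge(self, x, y):
        """Merge the sets of x and y; return False if they were already merged."""
        repx, repy = self.find(x), self.find(y)
        if repx == repy:
            return False
        if self.rank[repx] < self.rank[repy]:
            repx, repy = repy, repx       # make repx the root of higher rank
        self.parent[repy] = repx
        if self.rank[repx] == self.rank[repy]:
            self.rank[repx] += 1          # equal ranks: the new root's rank grows
        return True


ds = DisjointSet(5)
results = [ds.merge(u, v) for u, v in [(1, 2), (1, 5), (2, 5), (3, 4)]]
print(results)                   # [True, True, False, True]
print(ds.find(2) == ds.find(5))  # True
print(ds.find(1) == ds.find(3))  # False
```

Ranks are never decreased by find, so after compression a rank is only an upper bound on the true height of a tree.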