Title: A Simpler Minimum Spanning Tree Verification Algorithm
1A Simpler Minimum Spanning Tree Verification Algorithm
- Valerie King
- Algorithmica, 18:263-270, 1997
D94922004, D96945017, R96922013, R96922028, R96922032, R96922035
2008/05/22
2Outline
- Introduction
- Borůvka Tree Property
- Komlós's Algorithm for a Full Branching Tree
- Implementation
- Data Structures
- The Algorithm
- More Details
- Analysis
- Conclusion and Open Problems
4Valerie King
- University of Victoria.
- Research interests
- Randomized algorithms and data structures.
- Applications to networks and computational biology.
- Distributed computing.
5Abstract
- Problem
- Determine whether a given spanning tree in a graph is a minimum spanning tree.
- In 1984, Komlós presented an algorithm
- A linear number of comparisons.
- But nonlinear overhead to determine which comparisons to make.
- This paper simplifies Komlós's algorithm, and gives a linear time procedure using table lookup in the unit cost RAM model.
6Related Works(1/2)
- Tarjan (1979)
- Path compression on balanced trees.
- O(m α(m,n)), almost linear running time.
- α(m,n) is a very slowly growing function.
- O(m) storage space.
- Komlós (1984)
- O(m) binary comparisons between edge costs.
- Non-linear time to determine which comparisons to
make.
7Related Works(2/2)
- Dixon, Rauch, and Tarjan (1992)
- O(m), linear time algorithm.
- Combines Tarjan's almost-linear-time algorithm and Komlós's algorithm, with a preprocessing and table-lookup method for small subproblems.
- Decompose the tree into a large subtree and many microtrees.
- Path compression (Tarjan 1979) is used on the large subtree.
- The comparison decision tree needed to implement Komlós's strategy for each possible microtree is precomputed and stored in a table. Each microtree with its query paths is encoded. The table is used to look up the appropriate comparisons to make.
- Model of computation: binary comparison, addition, and subtraction on edge costs on a unit-cost random-access machine.
8Definitions(1/3)
- A graph G = (V, E) with n nodes and m edges.
- A path of length k from x to y in G is a sequence of edges {x, v_1}, {v_1, v_2}, ..., {v_{k-1}, y}.
- A tree T = (V, E) is a graph such that T is connected and contains no cycles.
- T is a spanning tree of G if T is a tree that is a subgraph of G with the same vertex set as G.
9Definitions(2/3)
- G = (V, E_G): a graph with a weight w(e) on each edge e.
- T = (V, E_T) is a spanning tree of G.
- T is a minimum spanning tree if its total weight Σ_{e ∈ E_T} w(e) is minimum among all spanning trees of G.
10Definitions(3/3)
- T(x, y): the set of edges in the path in a tree T from node x to node y.
- In a tree T, there is a unique path from x to y.
- A rooted tree is a tree with a distinguished node called the root.
- A full branching tree is a rooted tree with all leaves on the same level and each internal node having at least two children.
- B(x, y): the set of edges in the path in a full branching tree B from leaf x to leaf y.
11Key Observation(1/4)
- A spanning tree is a minimum spanning tree iff the weight of each non-tree edge {u, v} is at least the weight of the heaviest edge in the path in the tree between u and v.
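The observation above can be checked directly, without any of the machinery introduced later. The following is a minimal sketch, not the algorithm of the paper: it walks the tree path for every non-tree edge, so it takes O(n) time per query instead of linear time overall. The function names and the (u, v, w) edge format are assumptions made for illustration.

```python
from collections import defaultdict

def heaviest_on_path(tree_adj, x, y):
    """Weight of the heaviest edge on the unique tree path from x to y."""
    # Iterative DFS from x, carrying the heaviest weight seen so far.
    stack = [(x, None, float("-inf"))]
    while stack:
        node, came_from, best = stack.pop()
        if node == y:
            return best
        for nxt, w in tree_adj[node]:
            if nxt != came_from:
                stack.append((nxt, node, max(best, w)))
    raise ValueError("y is not reachable from x in the tree")

def is_mst(tree_edges, non_tree_edges):
    """Edges are (u, v, w) triples (an assumed format)."""
    tree_adj = defaultdict(list)
    for u, v, w in tree_edges:
        tree_adj[u].append((v, w))
        tree_adj[v].append((u, w))
    # T is minimum iff every non-tree edge is at least as heavy as the
    # heaviest tree edge on the path between its endpoints.
    return all(w >= heaviest_on_path(tree_adj, u, v)
               for u, v, w in non_tree_edges)
```

For the path 0-1-2 with weights 1 and 2, a non-tree edge {0, 2} of weight 3 leaves the tree minimum, while weight 1 does not.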
12Key Observation(2/4)
(Figure: example weighted tree with nodes x, y, a, b and the path T(x,y) marked.)
13Key Observation(3/4)
(Figure: example weighted tree with nodes x, y, a, b.)
14Key Observation(4/4)
- These verification methods use this fact.
- Find the heaviest edge in each such path for each non-tree edge {u, v} in the graph.
- Compare the weight of {u, v} to it.
(Figure: example weighted graph with nodes u and v.)
15Tree Path Problem
- Finding the heaviest edges in the paths between
specified pairs of nodes (query paths).
(Figure: example weighted tree with query paths.)
16Main Ideas(1/2)
- T is a spanning tree.
- A simple O(n) algorithm to construct a full branching tree B with no more than 2n edges.
- The weight of the heaviest edge in T(x, y) is the weight of the heaviest edge in B(x, y).
- Use the version of Komlós's algorithm for full branching trees.
- Much simpler than his algorithm for general trees.
17Main Ideas(2/2)
- Linear time implementation using table lookup of a few simple functions.
- The tables can be constructed in time linear in the size of the tree.
- Model of computation: unit-cost random access machine with word size Θ(log n) bits.
- Allow edge weights to be compared, added, or subtracted at unit cost.
18Organization
- The construction of full branching tree B
- Proof of the property of B
- Komlós's algorithm for determining the maximum weighted edge in each of m paths of a full branching tree
- Implementation of Komlós's algorithm
- Data structure
- Algorithm
- Details
- Analysis
20Spanning Tree T → Full Branching Tree B
- Full branching tree: rooted tree, leaves on the same level, internal nodes have at least two children.
- B has no more than 2n edges.
- O(n) time to construct B.
- The weight of the heaviest edge in T(x, y) is the weight of the heaviest edge in B(x, y).
(Figure: spanning tree T and full branching tree B.)
21Use Borůvka's Algorithm (1/3)
- Tree B is constructed with node set W and edge set F by adding nodes and edges to B.
- Initial phase: For each node v ∈ V, create a leaf f(v) of B.
(Figure: T and the leaves of B.)
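22Borůvka's Algorithm (2/3)
- Adding phase: Let A be the set of blue trees which are joined into one blue tree t in phase i. Add a new node f(t) to W and add the edges {(f(a), f(t)) | a ∈ A} to F, each labeled with the weight of the edge selected by a.
(Figure: T and B after an adding phase.)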
23Borůvka's Algorithm (3/3)
- Repeat edge contraction until there is one blue tree.
(Figure: T and the completed tree B.)
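The construction of B can be sketched as follows. This is an illustrative version under stated assumptions rather than the O(n) procedure of the paper: it rescans the remaining edges in every phase, ties between equal weights are broken by edge index, and the names `boruvka_tree`, `parent`, and `label` are hypothetical. Leaves of B reuse the node numbers 0..n-1 of T, so f(v) = v.

```python
def boruvka_tree(n, tree_edges):
    """Build B from a spanning tree T given as (u, v, w) triples.

    Returns (parent, label): parent[b] is the parent of B-node b and
    label[b] is the weight on the edge from b to its parent.
    """
    comp = list(range(n))          # comp[v] = B-node of the blue tree holding v
    parent, label = {}, {}
    next_id = n
    edges = [(w, i, u, v) for i, (u, v, w) in enumerate(tree_edges)]
    while edges:
        # Each blue tree selects its lightest incident edge.
        best = {}
        for e in edges:
            for c in (comp[e[2]], comp[e[3]]):
                if c not in best or e < best[c]:
                    best[c] = e
        # Union-find over blue trees joined by the selected edges.
        root = {c: c for c in best}

        def find(c):
            while root[c] != c:
                root[c] = root[root[c]]
                c = root[c]
            return c

        for _, _, u, v in best.values():
            root[find(comp[u])] = find(comp[v])
        # One new node f(t) per merged blue tree t, with an edge from each
        # old blue tree a, labeled by the weight of the edge a selected.
        new_node = {}
        for c, (w, _, _, _) in best.items():
            r = find(c)
            if r not in new_node:
                new_node[r] = next_id
                next_id += 1
            parent[c] = new_node[r]
            label[c] = w
        comp = [new_node[find(c)] for c in comp]
        # Contract: drop edges that now lie inside a single blue tree.
        edges = [e for e in edges if comp[e[2]] != comp[e[3]]]
    return parent, label
```

On the path 0-1-2-3 with weights 1, 3, 2, the first phase joins {0, 1} and {2, 3}, and the second phase joins the two results under a root, with both root edges labeled 3.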
24Borůvka's Algorithm (3.001/3)
- Problem 1: Do the edge weights increase along the path from a leaf to the root?
- No, the weight of the edge which t1 selects may be smaller than an edge weight in T_{t1}, for example
- But the weight of the edge which t1 selects must be at least the minimum edge weight in T_{t1}
(Figure: blue trees t1 and t2.)
25Borůvka's Algorithm (3.002/3)
- Problem 2: In the adding phase, if each blue tree chooses the edge with the maximum weight instead of the minimum weight, does B still have the following property?
- For any pair of nodes x and y in T, the weight of the heaviest edge in T(x, y) equals the weight of the heaviest edge in B(f(x), f(y)).
- No, for example
- the weight of the heaviest edge in T(a, c) = 3,
- the weight of the heaviest edge in B(a, c) = 5.
(Figure: counterexample trees T and B.)
26Borůvka's Tree Property (1/5)
- The number of blue trees drops by a factor of at least two after each phase.
- B is a full branching tree.
- Theorem 1
- For any pair of nodes x and y in T, the weight of the heaviest edge in T(x, y) equals the weight of the heaviest edge in B(f(x), f(y)).
27Borůvka's Tree Property (2/5)
- Claim 1: For every edge e ∈ B(f(x), f(y)), there is an edge e' ∈ T(x, y) such that w(e') ≥ w(e).
- Let e = {a, b}, where a is the endpoint farther from the root. Then a = f(t) for some blue tree t which contains either x or y.
- Let e' be the edge in T(x, y) with exactly one endpoint in blue tree t. Since t had the option of selecting e', w(e') ≥ w(e).
(Figure: T and B; blue tree t contains x.)
28Borůvka's Tree Property (3/5)
- Claim 2: Let e be a heaviest edge in T(x, y). Then there is an edge of the same weight in B(f(x), f(y)).
- e must be selected.
- Case 1: If e is selected by a blue tree t which contains x or y, then an edge in B(f(x), f(y)) is labeled with w(e).
(Figure: B and T; blue tree t contains x.)
29Borůvka's Tree Property (4/5)
- Claim 2: Let e be a heaviest edge in T(x, y). Then there is an edge of the same weight in B(f(x), f(y)).
- Case 2: Assume that e is selected by a blue tree which does not contain x or y. This blue tree contains one endpoint of e and thus one intermediate node on the path from x to y.
- Therefore it is incident to at least two edges on the path. Then e is the heavier of the two, which contradicts the fact that a blue tree selects its lightest incident edge.
(Figure: T and B.)
30Borůvka's Tree Property (5/5)
- Claim 1: For every edge e ∈ B(f(x), f(y)), there is an edge e' ∈ T(x, y) such that w(e') ≥ w(e).
- Claim 2: Let e be a heaviest edge in T(x, y). Then there is an edge of the same weight in B(f(x), f(y)).
- Theorem: For any pair of nodes x and y in T, the weight of the heaviest edge in T(x, y) equals the weight of the heaviest edge in B(f(x), f(y)).
32MST Verification Using Full Branching Tree
- Given a spanning tree T of a graph G.
- Construct a full branching tree B from T using Borůvka's algorithm.
- For every non-tree edge e = {u, v} in G, find the heaviest edge e' in the query path B(f(u), f(v)), and check whether w(e) ≥ w(e').
- By Borůvka's tree property,
- if w(e) ≥ w(e') for every non-tree edge e, then T is a minimum spanning tree of G.
33Komlós's Algorithm (1/5)
- Given a full branching tree with n nodes and m query paths between pairs of leaves, Komlós's algorithm can compute the heaviest edge in every query path using only a linear number of comparisons.
- For each query path (leaf x -> leaf y), break up the path into two half-paths, each from a leaf up to the lowest common ancestor of the pair.
- Find the heaviest edge in each half-path and compare the two edges to determine the heaviest edge in the whole path.
34Komlós's Algorithm (2/5)
- A(v): the set of all half-paths of query paths which contain v, restricted to the interval [root, v].
- Let p be the parent of v. A(v|p): the set of paths in A(v) restricted to the interval [root, p].
- Example: query paths a -> d, c -> e, b -> f
- A(v) = {(v -> p -> s), (v -> p)}
- A(v|p) = {(p -> s)}
- A(p) = {(p -> s), (p -> s -> r)}
(Figure: full branching tree B with nodes r, s, p, v, a, b, c, d, e, f.)
35Komlós's Algorithm (3/5)
- If we know the heaviest edge in each path in A(p), we can determine the heaviest edge in each path in A(v) through A(v|p).
- Assume we know the heaviest edge in each path in A(p).
- Because A(v|p) ⊆ A(p), we already know the heaviest edge in each path in A(v|p).
- To determine the heaviest edge in each path in A(v), we need only to compare w(v, p) to the heaviest edge in each path in A(v|p).
(Figure: node v, its parent p, and edges e1, e2, e3, e4.)
36Komlós's Algorithm (4/5)
- Starting from the root, descend level by level, and as each node v is encountered, the heaviest edge in each path in A(v) can be determined.
- For a query path (x -> y), use A(x) and A(y) to determine the heaviest edge in each half-path, and compare the two to determine the heaviest edge in the query path.
37Komlós's Algorithm (5/5)
- The ordering of the weights of the heaviest edges in A(p) can be determined by the length of their respective paths, since for any two paths s and t in A(p), path s includes path t or vice versa.
- Comparing w(v, p) to each weight in A(v|p) can be done by using binary search.
- Komlós shows that the upper bound on the number of comparisons needed to find the heaviest edge in each half-path is O(n log((m + n)/n)).
(Figure: edges e1, e2, e3, e4 above p ordered by weight, with w(v, p) located among them by binary search.)
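The descent with binary search can be sketched as below. This is a plain illustration of the comparison strategy only; it keeps ordinary Python lists and dictionaries, so it does not achieve the linear-time overhead that the table-lookup implementation is designed for. The representation is an assumption: `parent` and `label` describe B (label[v] = w(v, p)), each path in A(v) is identified by the depth of its upper endpoint, and every query joins two distinct leaves.

```python
def heaviest_on_queries(parent, label, root, queries):
    """Heaviest edge weight on each leaf-to-leaf query path of B."""
    children = {}
    for v, p in parent.items():
        children.setdefault(p, []).append(v)
    depth, order, i = {root: 0}, [root], 0
    while i < len(order):                      # breadth-first order
        v = order[i]
        i += 1
        for c in children.get(v, []):
            depth[c] = depth[v] + 1
            order.append(c)
    # upper[v]: depths of the upper endpoints of the half-paths in A(v).
    upper = {v: set() for v in order}
    lca_depth = []
    for x, y in queries:
        a, b = x, y
        while a != b:                          # leaves are on the same level
            a, b = parent[a], parent[b]
        lca_depth.append(depth[a])
        upper[x].add(depth[a])
        upper[y].add(depth[a])
    for v in reversed(order):                  # children before parents
        if v != root:
            p = parent[v]
            upper[p] |= {d for d in upper[v] if d < depth[p]}
    # best[v][d]: heaviest weight on the path from v up to depth d.
    best = {root: {}}
    for v in order[1:]:
        p, w = parent[v], label[v]
        kept = sorted(d for d in upper[v] if d < depth[p])   # A(v|p)
        weights = [best[p][d] for d in kept]   # non-increasing: longer path first
        lo, hi = 0, len(weights)
        while lo < hi:                         # first entry lighter than w
            mid = (lo + hi) // 2
            if weights[mid] < w:
                hi = mid
            else:
                lo = mid + 1
        weights[lo:] = [w] * (len(weights) - lo)
        best[v] = dict(zip(kept, weights))
        if depth[p] in upper[v]:               # the path consisting of {v, p} only
            best[v][depth[p]] = w
    return [max(best[x][d], best[y][d])
            for (x, y), d in zip(queries, lca_depth)]
```

Because the paths in A(v|p) are nested, the entries lighter than w(v, p) always form a suffix of the list, which is why one binary search per node is enough.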
39Node Label and Edge Tag
- Node Label - O(log n) bits
- Leaves: numbered in the order in which DFS visits them.
- Internal nodes: the label in its subtree with the longest all-0s suffix.
(Figure: example tree with node labels 0000, 1000, 0100, 0010.)
40Node Label Property
- Nodes on the same level never have the same label.
41Node Label and Edge Tag
- Edge tag - O(log log n) bits
- v: the endpoint which is farther from the root.
- distance(v): v's distance from the root.
- i(v): index of the rightmost 1 in v's label.
- Tag: <distance(v), i(v)>
(Figure: example tree with node labels and edge tags such as <1, 0>, <1, 4>, <2, 2>, <3, 1>.)
42(No Transcript)
43LCA Lowest Common Ancestor
- LCA(v)
- A bit vector of size O(log n)
- The ith bit of LCA(v) = 1 iff there is a path in A(v) whose upper endpoint is at distance i from the root
- For example
- Query paths (u, v), (u, w)
- A(u) = {(u, a), (u, r)}
- LCA(u) = 1100
44BigLists and SmallLists (1/3)
- If |A(v)| > wordsize / tagsize, A(v) is big
- Otherwise, A(v) is small
45BigLists and SmallLists (2/3)
- For example
- Query paths (u, v), (u, w).
- A(u) = {(u, a), (u, r)}.
- A1(u) = (a, r), A2(u) = (p, a).
- |A(u)| = 2 > wordsize / tagsize = 4 / 4 = 1
- A(u) is big.
46BigLists and SmallLists (3/3)
- For example
- Query path (c, d).
- A(c) = {(c, b)}.
- A1(c) = (c, b).
- |A(c)| = 1 ≤ 1
- A(c) is small.
48Goal of the Algorithm
- Generate bigList(v) or smallList(v) in time proportional to log |A(v)|.
- The time spent implementing Komlós's algorithm at each node does not exceed the worst case number of comparisons needed at each node.
49Implementation Details of the Algorithm (1/7)
- The computation of the LCAs.
- Compute all LCAs for each pair of endpoints of the m query paths.
- Form the vector LCA(l) for each leaf l.
- Form the vector LCA(v) for a node at distance i from the root by ORing together the LCAs of its children and setting the jth bits to 0 for all j ≥ i.
- Compute all LCAs for each pair of endpoints of the m query paths using an algorithm that runs in O(n + m) time.
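The bottom-up ORing step can be sketched as follows, assuming the LCA depth of every query has already been computed. The names `lca_vectors` and `leaf_lca_depths` are hypothetical. Note one representation choice that differs from the slides: here bit d of a Python integer (counting from the least significant bit) stands for distance d from the root, whereas the slides write the bit for distance 0 leftmost.

```python
def lca_vectors(parent, depth, leaf_lca_depths):
    """Compute LCA(v) for every node of B as an integer bit vector.

    leaf_lca_depths[l] lists, for leaf l, the depths of the lowest common
    ancestors of the query paths that end at l.
    """
    vec = {v: 0 for v in depth}
    for leaf, depths in leaf_lca_depths.items():
        for d in depths:
            vec[leaf] |= 1 << d
    # Visit deeper nodes first so each node is complete before it is
    # ORed into its parent.
    for v in sorted(parent, key=lambda node: -depth[node]):
        p = parent[v]
        keep = (1 << depth[p]) - 1     # clears every bit j >= depth(p)
        vec[p] |= vec[v] & keep
    return vec
```

The mask removes exactly the paths whose upper endpoint is p itself or below it, since those paths do not extend above p.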
50Implementation Details of the Algorithm (2/7)
(Figure: example tree with the LCA bit vectors of its nodes, such as 100, 010, 110, and 000.)
51Implementation Details of the Algorithm (3/7)
- Subword
- tagsize = O(log log n) bits.
- swnum
- swnum = ⌊wordsize / tagsize⌋
- The maximum number of subwords stored in a word.
52Implementation Details of the Algorithm (4/7)
- select_r
- I and J are two strings of r bits. It outputs a list of the bits of J which have been selected by I.
- e.g. select(01010000, 11000000) = (10).
- selectS_r
- Two inputs I and J. I is a string of no more than r bits, no more than swnum of which are 1. J is a list of no more than swnum subwords. It outputs a list of the subwords of J which have been selected by I.
- e.g. selectS((01), (t1, t2)) = (t2).
53Implementation Details of the Algorithm (5/7)
- weight_r
- It outputs the number of bits of a string set to 1.
- e.g. weight(01010000) = 2.
- index_r
- It outputs a list of pointers to the positions of the 1s in the r-bit vector.
- e.g. index(01100000) = (2, 3) (2 and 3 are pointers).
- subword1
- A constant word in which the (i·tagsize)th bit is 1 for i = 1, ..., swnum, and the remaining bits are 0.
- e.g. (00100100), where wordsize = 8 and tagsize = 3.
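The semantics of these functions can be pinned down with a small reference sketch. It works on bit strings and Python lists for readability, which is an assumption of this illustration; the point of the slides is that the real versions run in constant time on machine words through table lookup. Positions are counted from 1, starting at the left, as in the examples above.

```python
def select(i_bits, j_bits):
    """Bits of J at the positions where I has a 1."""
    return "".join(j for i, j in zip(i_bits, j_bits) if i == "1")

def select_s(i_bits, subwords):
    """Subwords of J at the positions where I has a 1."""
    return [s for i, s in zip(i_bits, subwords) if i == "1"]

def weight(bits):
    """Number of bits set to 1."""
    return bits.count("1")

def index(bits):
    """1-based positions of the 1s, used as pointers."""
    return [k for k, b in enumerate(bits, start=1) if b == "1"]

def subword1(wordsize, tagsize):
    """Word with a 1 at every (i * tagsize)th bit, i = 1..swnum."""
    swnum = wordsize // tagsize
    ones = {k * tagsize for k in range(1, swnum + 1)}
    return "".join("1" if pos in ones else "0"
                   for pos in range(1, wordsize + 1))
```

Each example on the slides can be reproduced with these definitions, for instance `select("01010000", "11000000")` gives `"10"` and `subword1(8, 3)` gives `"00100100"`.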
54Implementation Details of the Algorithm (6/7)
- To implement these operations, we need to preprocess a few functions so that we may do table lookup of these functions.
- When the size of the input is no greater than log n + c, where c is a constant, the preprocessing time is O(n).
55Implementation Details of the Algorithm (7/7)
- We cannot afford to build a table for select_wordsize and selectS_wordsize, which take inputs of 2·wordsize bits, since the table would be too large.
- However, we can compute these functions as needed in constant time using table lookups of those functions on input size wordsize/2.
- A table for all inputs of length r can be built by first building a table for inputs of size r/2, looking up the result for the two halves, and, in constant time, putting the results together to form the entry.
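The halving idea can be sketched for select as follows. The encoding is an assumption of this illustration: each table entry is a pair (value, length) holding the selected bits packed into an integer together with how many bits were selected, and r is taken to be a power of two. The names `combine` and `build_select_table` are hypothetical.

```python
def combine(half_table, h, i, j):
    """select on 2h-bit inputs from two lookups in the h-bit table."""
    mask = (1 << h) - 1
    hi_val, hi_len = half_table[(i >> h, j >> h)]
    lo_val, lo_len = half_table[(i & mask, j & mask)]
    # Bits selected from the high half come first, then the low half.
    return ((hi_val << lo_len) | lo_val, hi_len + lo_len)

def build_select_table(r):
    """Table of select for all pairs of r-bit inputs (r a power of two)."""
    if r == 1:
        # A single bit of J is kept only when the bit of I is 1.
        return {(i, j): ((j, 1) if i else (0, 0))
                for i in (0, 1) for j in (0, 1)}
    h = r // 2
    half = build_select_table(h)
    return {(i, j): combine(half, h, i, j)
            for i in range(1 << r) for j in range(1 << r)}
```

With a table for 4-bit inputs, `combine(table4, 4, 0b01010000, 0b11000000)` evaluates select on 8-bit inputs on demand and returns `(0b10, 2)`, matching the earlier example select(01010000, 11000000) = (10), without ever storing the 8-bit table.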
57A Top-down Approach
- Initially, A(root) = ∅.
- We proceed down the tree, from the parent p to each of the children v.
- Generate the list of the heaviest edges in A(v|p).
- Compare w(v, p) to the weights of these edges, by performing binary search on the list, and insert the tag of {v, p} in the appropriate places to form the list of the heaviest edges in A(v).
- Continue until the leaves are reached.
58An Illustration of the Algorithm
(Figure: A(p) is restricted to A(v|p); binary search places the tag t of edge {v, p} to form A(v).)
- Comparison time: O(log |A(v)|) binary search time
- Update time: |A(v)| time at worst
59A Preliminary Analysis
- So far we have only a naive algorithm, whose running time is not linear.
- Observe that only a linear number of comparisons are used, but with a non-linear time overhead to maintain the list.
- A linear time algorithm for this problem on a pointer machine is not thought possible (such a result would imply a linear-time algorithm for computing LCAs on a pointer machine).
- Our goal: give a linear time implementation in the unit cost RAM model, in which an operation on O(log n) bits can be done in constant time.
60How to Handle a BigList and a SmallList
- Depending on |A(v)|, our goal is that
- if |A(v)| ≤ wordsize / tagsize, the overhead of maintaining the small list of A(v) is O(1) time.
- if |A(v)| > wordsize / tagsize, the overhead of maintaining the big list of A(v) is O(log log n) time.
- The overall time is therefore dominated by the number of comparisons.
61Case 1 v is small
- Goal: Maintaining the list in O(1) time
- If p is small, then create smallList(v|p) from smallList(p).
- L ← select(LCA(p), LCA(v)).
- smallList(v|p) ← selectS(L, smallList(p)).
- e.g. Let LCA(v) = (01001000), LCA(p) = (11000000). Let smallList(p) be (t1, t2). Then L = select(11000000, 01001000) = (01) and smallList(v|p) = selectS((01), (t1, t2)) = (t2).
- If p is big, then create smallList(v|p) from LCA(v) and LCA(p).
- smallList(v|p) ← index(select(LCA(p), LCA(v))).
- e.g. Let LCA(p) = (01101110), LCA(v) = (01001000). Then smallList(v|p) = index(select(01101110, 01001000)) = index(10100) = (1, 3).
62Case 1-1 v is small, p is small
- LCA(p) = (1 1 0 1 1 0 0 0), list: (t1, t2, t3, t4)
- LCA(v) = (0 1 0 1 1 0 0 0)
- Select: (0 1 1 1)
- Retrieve: (t2, t3, t4)
- Binary search, then substitute: (t2, t, t)
(Figure: A(p) is restricted to A(v|p), then the tag t is inserted to form A(v); p is small and v is small.)
- Except binary search, all operations can be done in O(1) time.
63Case 1-2 v is small, p is big
- LCA(p) = (0 1 1 0 1 1 1 0), list: (t1, t2, t3, t4, t5)
- LCA(v) = (0 1 0 0 1 0 0 0)
- Select: (1 0 1 0 0)
- Index: (1, 3)
- Binary search, then substitute: (1, t)
(Figure: A(p) is restricted to A(v|p), then the tag t is inserted to form A(v); p is big and v is small.)
- Except binary search, all operations can be done in O(1) time.
64Case 2 v is big
- Goal: Maintaining the list in O(log log n) time.
- If v has a big ancestor a.
- If p ≠ a, then create bigList(v|p) from bigList(v|a) and smallList(p).
- e.g. Let LCA(a) = (01101110), LCA(v) = (00100101), and let bigList(a) be (t1, t2, t3, t4, t5). Thus bigList(v|a) = (t2, t4). Suppose smallList(p) = (2, t). Then bigList(v|p) = (t2, t).
- 2 is a pointer to the second item of bigList(a).
- t is the tag of some edge below a in the tree.
- If v doesn't have a big ancestor.
- bigList(v|p) ← smallList(p).
65Case 2-1 v is big, there exists a big ancestor a
- LCA(a) = (0 1 1 0 1 1 1 0), list: (t1, t2, t3, t4, t5)
- LCA(v) = (0 0 1 0 0 1 0 1)
- Select: (0 1 0 1 0)
- Retrieve: (t2) (t4) ()
- Combination: (t2, t4) and (2, t) give (t2, t)
- Binary search, then substitute: (t2, t, t)
(Figure: A(a) for the big ancestor a is restricted to A(v|a) and combined with the small list at p to give A(v|p), then the tag t is inserted to form A(v); v is big.)
- Except binary search, all operations can be done in O(log log n) time.
66Case 2-2 v is big, no big ancestor
- LCA(p) = (0 1 1 0 1 0 0 0), list: (t1, t2, t3)
- Copy: (t1, t2, t3)
- Binary search, then substitute: (t1, t, t, t)
(Figure: A(v|p) = A(p), then the tag t is inserted to form A(v); p is small and v is big.)
- Except binary search, all operations can be done in O(log log n) time.
68Time Complexity
- Preparation time
- LCA query problem: O(m + n) time.
- Table lookup approach for O(log n)-bit operations: O(n) time.
- The overhead for each case does not exceed O(log |A(v)|) time, which leads to an O(m + n) time for maintaining the lists.
- Compare the heaviest edge in each half-path with the weight of each non-tree edge: O(m) time.
- Overall time complexity is O(m + n) time.
70Conclusions and Open Problems
- The paper reduces Komlós's algorithm to the simpler case of the full branching tree, and gives an algorithm with linear-time overhead for its implementation in the unit cost RAM.
- Some open problems remain
- Give a linear time algorithm for the MST verification problem on a pointer machine.
- The tree-path problem: Given a static tree, preprocess it so that one can quickly retrieve the heaviest edge in any online query path.
71Thank You