Title: A GENERAL APPROXIMATION TECHNIQUE FOR CONSTRAINED FOREST PROBLEMS
1 A GENERAL APPROXIMATION TECHNIQUE FOR CONSTRAINED FOREST PROBLEMS
- A general approximation technique for graph problems.
- Applies to problems of covering the vertices of a graph
  - at minimum cost,
  - while satisfying certain requirements.
- Examples of problems that fit in this framework:
  - shortest path, minimum-cost spanning tree, traveling salesman, and Steiner tree problems.
- The approximation algorithm
  - runs in $O(n^2 \log n)$ time;
  - comes within a factor of 2 of optimal for most of these problems.
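2 Introduction: integer program
- Given
  - a graph $G = (V, E)$,
  - a function $f : 2^V \to \{0, 1\}$,
  - a nonnegative cost function $c : E \to \mathbb{Q}_+$.
- Integer program (IP)
  - Min $\sum_{e \in E} c_e x_e$
  - Subject to
    - $x(\delta(S)) \ge f(S)$ for all $\emptyset \ne S \subset V$
    - $x_e \in \{0, 1\}$ for all $e \in E$
- $\delta(S)$: the set of edges with exactly one endpoint in $S$.
- $x(F) = \sum_{e \in F} x_e$.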
3 Integer program (continued)
[Figure: example graph on vertices 1, 2, 3, 4 with edges a, b, c, d, e]
- $S \in \{\{1\}, \{2\}, \{3\}, \{4\}, \{1,2\}, \{1,3\}, \{1,4\}, \{2,3\}, \{2,4\}, \{3,4\}, \{1,2,3\}, \{1,2,4\}, \{1,3,4\}, \{2,3,4\}\}$
- $\delta(\{1\}) = \{a, d\}$, $\delta(\{2\}) = \{a, b, e\}$, $\delta(\{1,2\}) = \{b, d, e\}$, $\delta(\{1,3\}) = \{a, e, c\}$, $\delta(\{1,2,3\}) = \{c, b\}$, $\delta(\{2,3,4\}) = \{a, d\}$, $\delta(\{1,2,4\}) = \{e, c, d\}$, $\delta(\{1,3,4\}) = \{a, e, b\}$
- In this covering problem we need to find a minimum-cost set $F$ of edges such that at least one edge of every $\delta(S)$ with $f(S) = 1$ belongs to $F$.
- For example, $F = \{a, b, c\}$ is such a set, and it is a spanning tree, if $f(S) = 1$ for all $\emptyset \ne S \subset V$.
- The minimal solutions to (IP) are incidence vectors of forests.
- Let (LP) denote the linear programming relaxation of (IP), obtained by relaxing the integrality restriction on the variables $x_e$ to $x_e \ge 0$.
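The cut sets and the covering constraint can be checked mechanically on the small example. The sketch below is only illustrative: the edge endpoints are an assumption inferred from the $\delta$ sets listed above (they are not stated explicitly), and the helper names `delta`, `proper_subsets` and `is_feasible` are hypothetical.

```python
from itertools import combinations

# Edge endpoints are an assumption inferred from the delta sets listed on the slide.
EDGES = {"a": (1, 2), "b": (2, 4), "c": (3, 4), "d": (1, 3), "e": (2, 3)}
VERTICES = {1, 2, 3, 4}


def delta(S):
    """Return the names of edges with exactly one endpoint in S."""
    S = set(S)
    return {name for name, (u, v) in EDGES.items() if (u in S) != (v in S)}


def proper_subsets(vertices):
    """Yield every nonempty proper subset of the vertex set."""
    items = sorted(vertices)
    for size in range(1, len(items)):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_feasible(F, f):
    """Check x(delta(S)) >= f(S) for every nonempty proper subset S."""
    return all(len(delta(S) & set(F)) >= f(S) for S in proper_subsets(VERTICES))


def f_spanning(S):
    """Spanning-tree requirement: every nonempty proper subset must be crossed."""
    return 1
```

With these assumptions, `is_feasible({"a", "b", "c"}, f_spanning)` holds, while `{"a", "b"}` fails because no chosen edge crosses the cut around vertex 3.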
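4 The proper function
- A proper function $f : 2^V \to \{0, 1\}$ has the following properties:
  - Symmetry: $f(S) = f(V \setminus S)$ for all $S \subseteq V$.
  - Disjointness: if $A$ and $B$ are disjoint, then $f(A) = f(B) = 0$ implies $f(A \cup B) = 0$.
  - $f(V) = 0$.
- Examples of proper functions and proper constrained forest problems:
  - Shortest s-t path: $f(S) = 1$ if $|S \cap \{s, t\}| = 1$, and $f(S) = 0$ otherwise.
  - Steiner tree with terminal set $T$: $f(S) = 1$ if $\emptyset \ne S \cap T \ne T$, and $f(S) = 0$ otherwise.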
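For small vertex sets the three properties can be verified by brute force. The following is a minimal sketch, not part of the original slides: `is_proper`, `make_st_path_f` and `make_steiner_f` are hypothetical names, and the check is exponential in the number of vertices, so it is only useful as a sanity test.

```python
from itertools import combinations


def all_subsets(vertices):
    """Return every subset of the vertex set as a frozenset."""
    items = sorted(vertices)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_proper(f, vertices):
    """Brute-force check of symmetry, disjointness and f(V) = 0 (exponential time)."""
    V = frozenset(vertices)
    subsets = all_subsets(V)
    if f(V) != 0:
        return False
    if any(f(S) != f(V - S) for S in subsets):
        return False
    for A in subsets:
        for B in subsets:
            if not (A & B) and f(A) == 0 and f(B) == 0 and f(A | B) != 0:
                return False
    return True


def make_st_path_f(s, t):
    """f(S) = 1 iff exactly one of s, t lies in S."""
    return lambda S: 1 if len(set(S) & {s, t}) == 1 else 0


def make_steiner_f(terminals):
    """f(S) = 1 iff S contains some but not all terminals."""
    T = frozenset(terminals)
    return lambda S: 1 if 0 < len(T & set(S)) < len(T) else 0
```

Both example functions pass the check on a four-vertex set, whereas a function such as "1 iff $|S| = 1$" fails symmetry.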
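5 The minimum-cost spanning tree problem
- The minimum-cost spanning tree problem can be modeled as (IP) with the proper function $f(S) = 1$ for all $\emptyset \ne S \subset V$.
- We are looking for
  - Min $\sum_{e \in E} c_e x_e$
  - Subject to
    - $\sum_{e \in \delta(S)} x_e \ge 1$ for all $\emptyset \ne S \subset V$
    - $x_e \ge 0$ and integer for all $e \in E$
- $f(S) = 1$ for every nonempty $S \subset V$, because at least one edge must cross the cut $(S, V \setminus S)$.
[Figure: the example graph on vertices 1, 2, 3, 4 with edges a, b, c, d, e, showing a cut $(S, V - S)$]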
6 The shortest s-t path problem
- The shortest s-t path problem can also be modeled as (IP) with the proper function $f(S) = 1$ iff $|S \cap \{s, t\}| = 1$, that is, iff exactly one of the vertices $s$, $t$ is an element of $S$.
- Every cut that separates $s$ from $t$ must be covered by at least one edge.
- We are looking for
  - Min $\sum_{e \in E} c_e x_e$
  - Subject to
    - $\sum_{e \in \delta(S)} x_e \ge 1$ if $|S \cap \{s, t\}| = 1$
    - $\sum_{e \in \delta(S)} x_e \ge 0$ otherwise
    - $x_e \ge 0$ and integer for all $e \in E$
[Figure: example graph on vertices s, a, b, t with edge costs 2, 3, 4, 5, 8]
7 The Steiner tree problem
- Given
  - an undirected graph $G = (V, E)$,
  - a nonnegative cost $c_e$ for each $e \in E$,
  - a terminal set $T \subseteq V$.
- Find
  - a minimum-cost set of edges such that all terminals are connected.
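8 The Steiner tree problem (continued)
- The Steiner tree problem can also be modeled as (IP) with the proper function
  - $f(S) = 1$ if $\emptyset \ne S \cap T \ne T$, and $f(S) = 0$ otherwise.
- We are looking for
  - Min $\sum_{e \in E} c_e x_e$
  - Subject to
    - $\sum_{e \in \delta(S)} x_e \ge 1$ if $\emptyset \ne S \cap T \ne T$
    - $\sum_{e \in \delta(S)} x_e \ge 0$ otherwise
    - $x_e \ge 0$ and integer for all $e \in E$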
9 The Algorithm for Proper Constrained Forest Problems - Description
- Input
  - an undirected graph $G = (V, E)$,
  - edge costs $c_e \ge 0$ for all $e \in E$,
  - a proper function $f$.
- Output
  - a set of edges $F'$ whose incidence vector is a feasible solution for (IP).
- The algorithm maintains a forest $F$ of edges, which is initially empty.
  - $F$ is the set of edges that are candidates to be output.
- In every iteration:
  - select an edge $(i, j)$ between two distinct connected components of $F$;
  - merge these two components by adding $(i, j)$ to $F$.
- The loop terminates when $f(C) = 0$ for all connected components $C$ of $F$.
  - Since $f(V) = 0$, the loop finishes after at most $n - 1$ iterations.
- The output set $F'$ consists of edges taken from $F$:
  - if an edge $e$ can be removed from $F$ such that $f(C) = 0$ for all components $C$ of $F - e$, then $e$ is omitted from $F'$;
  - if $e \in F$ and $f(C) = 1$ for some connected component $C$ of $(V, F - e)$, then $e$ is taken to be an edge of $F'$.
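10 The main algorithm
Initialization
- $F \leftarrow \emptyset$
- Comment: implicitly set $y_S \leftarrow 0$ for all $S \subset V$
- $LB \leftarrow 0$
- $\mathcal{C} \leftarrow \{\{v\} : v \in V\}$
- For each $v \in V$
  - $d(v) \leftarrow 0$
Main Loop
- While $\exists C \in \mathcal{C} : f(C) = 1$
  - Find an edge $e = (i, j)$ with $i \in C_p \in \mathcal{C}$, $j \in C_q \in \mathcal{C}$, $C_p \ne C_q$ that minimizes $\varepsilon = \frac{c_e - d(i) - d(j)}{f(C_p) + f(C_q)}$
  - $F \leftarrow F \cup \{e\}$
  - For all $v \in C_r \in \mathcal{C}$ do $d(v) \leftarrow d(v) + \varepsilon \cdot f(C_r)$
  - Comment: implicitly set $y_C \leftarrow y_C + \varepsilon \cdot f(C)$ for all $C \in \mathcal{C}$
  - $LB \leftarrow LB + \varepsilon \sum_{C \in \mathcal{C}} f(C)$
  - $\mathcal{C} \leftarrow \mathcal{C} \cup \{C_p \cup C_q\} - \{C_p\} - \{C_q\}$
Final Step
- $F' \leftarrow \{e \in F$ : for some connected component $N$ of $(V, F - \{e\})$, $f(N) = 1\}$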
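The three phases can be sketched directly in Python. This is a naive illustration under stated assumptions, not the efficient implementation discussed later: the function names `constrained_forest` and `_needed`, the edge dictionary format, and the tie-breaking rule (first minimum in dictionary order) are all hypothetical choices. Every iteration scans all edges, and the final step recomputes connected components for each edge of $F$.

```python
import math


def constrained_forest(vertices, edges, f):
    """Naive version of the main algorithm.

    vertices: list of hashable vertex names.
    edges: dict mapping an edge name to (i, j, cost).
    f: proper function taking a frozenset of vertices, returning 0 or 1.
    Returns (F, F_prime, LB, d).
    """
    comp = {v: frozenset([v]) for v in vertices}   # vertex -> its component
    d = {v: 0.0 for v in vertices}
    F, LB = [], 0.0

    while any(f(C) == 1 for C in set(comp.values())):
        # Pick the edge between two different components with the smallest epsilon.
        best_eps, best_edge = math.inf, None
        for name, (i, j, cost) in edges.items():
            Cp, Cq = comp[i], comp[j]
            if Cp == Cq or f(Cp) + f(Cq) == 0:
                continue  # inside one component, or epsilon would be infinite
            eps = (cost - d[i] - d[j]) / (f(Cp) + f(Cq))
            if eps < best_eps:
                best_eps, best_edge = eps, name
        if best_edge is None:
            raise ValueError("no edge leaves an active component: instance infeasible")

        # Grow the duals of all active components uniformly by epsilon.
        active = [C for C in set(comp.values()) if f(C) == 1]
        for C in active:
            for v in C:
                d[v] += best_eps
        LB += best_eps * len(active)

        # Add the tight edge and merge its two components.
        i, j, _ = edges[best_edge]
        merged = comp[i] | comp[j]
        for v in merged:
            comp[v] = merged
        F.append(best_edge)

    # Final step: keep e only if removing it leaves some component N with f(N) = 1.
    F_prime = [e for e in F if _needed(e, F, vertices, edges, f)]
    return F, F_prime, LB, d


def _needed(e, F, vertices, edges, f):
    """Return True if (V, F - {e}) has a connected component N with f(N) = 1."""
    adj = {v: set() for v in vertices}
    for name in F:
        if name != e:
            i, j, _ = edges[name]
            adj[i].add(j)
            adj[j].add(i)
    seen = set()
    for start in vertices:
        if start in seen:
            continue
        stack, component = [start], {start}
        while stack:
            u = stack.pop()
            for w in adj[u] - component:
                component.add(w)
                stack.append(w)
        seen |= component
        if f(frozenset(component)) == 1:
            return True
    return False
```

As a usage example, assume the shortest s-t path instance has edge costs sa = 3, sb = 2, at = 4, ab = 5, bt = 8 (the last two are read off the figure layout and are an assumption). Calling `constrained_forest(["s", "a", "b", "t"], edges, f)` with the s-t path function then selects sb, sa, at in that order, prunes sb in the final step, and ends with a lower bound of 7.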
11 The Algorithm variables
- $F$: set of edges that are candidates for output.
- $F'$: set of output edges.
- $\mathcal{C}$: set of connected components $C$ of $F$.
- $d(v)$: a variable that records the increase of the $y_S$:
  - $d(i) = \sum_{S : i \in S} y_S$, which can be shown by induction.
- $\varepsilon$: the feasible increase; it affects $d(v)$ and $LB$.
  - $\sum_{S : e \in \delta(S)} y_S = d(i) + d(j)$ for an edge $e = (i, j)$ with $i, j$ in different components $C \in \mathcal{C}$.
  - In each iteration $y_S$ increases by $\varepsilon$ for the active components, without violating the packing constraints as long as
  - $d(i) + d(j) + \varepsilon f(C_p) + \varepsilon f(C_q) \le c_e$ for $i \in C_p$, $j \in C_q$, $C_p \ne C_q$.
- $LB$: lower bound on the optimal cost.
  - Corresponds to the dual solution $y$: $LB = \sum_{S \subset V} y_S$.
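12 LP Duality - reminder
- Primal: Min $\sum_{j=1}^{n} c_j x_j$ subject to $\sum_{j=1}^{n} a_{ij} x_j \ge b_i$ for all $i$, and $x_j \ge 0$ for all $j$.
- Dual: Max $\sum_{i=1}^{m} b_i y_i$ subject to $\sum_{i=1}^{m} a_{ij} y_i \le c_j$ for all $j$, and $y_i \ge 0$ for all $i$.
[Example: a primal program in variables $x_1, x_2, x_3$ and its dual in variables $y_1, y_2$]
- Weak Duality Theorem: $b^T y \le c^T x$.
- Complementary Slackness Conditions: $x$ and $y$ are optimal if and only if (see 3rd lecture, page 8)
  - Primal: $\forall j$, $x_j > 0 \Rightarrow \sum_{i=1}^{m} a_{ij} y_i = c_j$
  - Dual: $\forall i$, $y_i > 0 \Rightarrow \sum_{j=1}^{n} a_{ij} x_j = b_i$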
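Weak duality follows in one line from the two feasibility conditions. The derivation below is a standard sketch, assuming $x$ is primal feasible and $y$ is dual feasible for the min/max pair stated in this reminder.

```latex
\[
b^T y \;=\; \sum_{i=1}^{m} b_i y_i
\;\le\; \sum_{i=1}^{m} \Bigl(\sum_{j=1}^{n} a_{ij} x_j\Bigr) y_i
\;=\; \sum_{j=1}^{n} \Bigl(\sum_{i=1}^{m} a_{ij} y_i\Bigr) x_j
\;\le\; \sum_{j=1}^{n} c_j x_j \;=\; c^T x .
\]
```

The first inequality uses $y_i \ge 0$ together with primal feasibility, and the second uses $x_j \ge 0$ together with dual feasibility. Both hold with equality exactly when the complementary slackness conditions are satisfied.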
13 The Dual of LP
- Max $\sum_{S \subset V} f(S)\, y_S$
- Subject to
  - $\sum_{S : e \in \delta(S)} y_S \le c_e$ for all $e \in E$
  - $y_S \ge 0$ for all $\emptyset \ne S \subset V$
- An active component: any component $C$ of $F$ for which $f(C) = 1$.
- In each iteration the algorithm tries to increase $y_C$ uniformly for each active component $C$ by a value $\varepsilon$ which is as large as possible without violating the packing constraints $\sum_{S : e \in \delta(S)} y_S \le c_e$.
- Finding such an $\varepsilon$ makes a packing constraint tight for some edge $(i, j)$ between two distinct components.
- The algorithm then adds $(i, j)$ to $F$ and merges these two components.
[Figure: number line showing $0 \le LB$ (dual value $b^T y$) $\le$ LP opt. $\le$ IP opt. $\le c^T x \le (2 - 2/|A|)\,LB$; $LB$ grows as the $y_S$ increase by $\varepsilon$]
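14 Example: the shortest s-t path problem
- $f(S) = 1$ iff $|S \cap \{s, t\}| = 1$
[Figure: graph on vertices s, a, b, t with edge costs 2, 3, 4, 5, 8]
- $F = \{sa, at, sb\}$
- $F' = \{sa, at\}$

Iteration | Components      | f(C)    | ε   | d(s), d(a), d(b), d(t) | F          | LB
1         | {s,b}, {a}, {t} | 1, 0, 1 | 2   | 2, 0, 0, 2             | sb         | 4
2         | {s,b,a}, {t}    | 1, 1    | 1   | 3, 0, 1, 3             | sb, sa     | 6
3         | {s,a,b,t}       | 0       | 0.5 | 3.5, 0.5, 1.5, 3.5     | sa, sb, at | 7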
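As a worked check of iteration 2 of this example, the candidate values of $\varepsilon$ can be computed by hand. The costs $c_{sb} = 2$, $c_{sa} = 3$ and $c_{at} = 4$ follow from the listed values of $\varepsilon$ and $d$; the assignment $c_{ab} = 5$, $c_{bt} = 8$ is an assumption read off the figure layout. After iteration 1 the components are $\{s,b\}$ (active), $\{a\}$ (inactive) and $\{t\}$ (active), with $d(s) = d(t) = 2$ and $d(a) = d(b) = 0$.

```latex
\begin{align*}
\varepsilon_{sa} &= \frac{c_{sa} - d(s) - d(a)}{f(\{s,b\}) + f(\{a\})} = \frac{3 - 2 - 0}{1 + 0} = 1, \\
\varepsilon_{at} &= \frac{c_{at} - d(a) - d(t)}{f(\{a\}) + f(\{t\})} = \frac{4 - 0 - 2}{0 + 1} = 2, \\
\varepsilon_{bt} &= \frac{c_{bt} - d(b) - d(t)}{f(\{s,b\}) + f(\{t\})} = \frac{8 - 0 - 2}{1 + 1} = 3, \\
\varepsilon_{ab} &= \frac{c_{ab} - d(a) - d(b)}{f(\{a\}) + f(\{s,b\})} = \frac{5 - 0 - 0}{0 + 1} = 5 .
\end{align*}
```

The minimum is $\varepsilon = 1$ at edge $sa$, so $d$ becomes $(3, 0, 1, 3)$ and $LB = 4 + 1 \cdot 2 = 6$, matching the second row of the example.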
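15 Example: the shortest s-t path problem (continued)
- In each iteration, one edge leaving the connected component of $s$ or of $t$ is chosen.
- No edge is chosen that has no endpoint in the component of $s$ or of $t$, since for such an edge $\varepsilon = \infty$:
  - assume that $e = (i, j)$ with $i \in C_p \in \mathcal{C}$, $j \in C_q \in \mathcal{C}$;
  - $|C_p \cap \{s, t\}| = 0 \Rightarrow f(C_p) = 0$ and $|C_q \cap \{s, t\}| = 0 \Rightarrow f(C_q) = 0$;
  - hence $\varepsilon = \frac{c_e - d(i) - d(j)}{f(C_p) + f(C_q)} = \infty$.
- The main loop terminates when $s$ and $t$ are in the same component.
- Final step: remove all edges not on the path from $s$ to $t$.
- The result obeys the complementary slackness conditions (so $LB$ is optimal):
  - $\forall S$: $y_S > 0 \Rightarrow |F' \cap \delta(S)| = 1$, and $|F' \cap \delta(S)| = \sum_{e \in \delta(S)} x_e = f(S)$.
  - $\forall e \in F'$: $\sum_{S : e \in \delta(S)} y_S = c_e$, where $e \in F'$ means $x_e = 1$ (so $x_e > 0$).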
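16 Example: the minimum-cost spanning tree problem
- $f(S) = 1$ for all $\emptyset \ne S \subset V$
[Figure: graph on vertices 1, 2, 3, 4 with edge costs a = 1, b = 2, e = 3, d = 4, c = 7]
- $F = \{a, b, e\}$
- $F' = \{a, b, e\}$

Iteration | Components         | f(C)       | ε   | d(1), d(2), d(3), d(4) | F       | LB
Init      | {1}, {2}, {3}, {4} | 1, 1, 1, 1 | -   | 0, 0, 0, 0             | Ø       | 0
1         | {1,2}, {3}, {4}    | 1, 1, 1    | 0.5 | 0.5, 0.5, 0.5, 0.5     | a       | 2
2         | {1,2,4}, {3}       | 1, 1       | 0.5 | 1, 1, 1, 1             | a, b    | 3.5
3         | {1,2,3,4}          | 0          | 0.5 | 1.5, 1.5, 1.5, 1.5     | a, b, e | 4.5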
17 Example: the minimum-cost spanning tree problem (continued)
- The minimum-cost spanning tree problem corresponds to the proper function $f(S) = 1$ for all $\emptyset \ne S \subset V$.
- As long as no single component covers all vertices, $f(C) = 1$ for every component $C$.
  - All components are always active.
- In each iteration the minimum-cost edge joining two components is selected.
  - The algorithm reduces to Kruskal's algorithm.
  - It produces the optimal minimum-cost spanning tree.
- It does not obey the complementary slackness conditions; usually $LB$ is strictly less than the cost of the optimal solution.
  - There exist sets $S$ with $y_S > 0$ and $|F' \cap \delta(S)| > 1$.
  - In our example, $\sum_{e \in \delta(\{1,2\})} x_e = 2 > f(\{1,2\}) = 1$.
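The reduction to Kruskal's algorithm can be checked on the example graph with a plain Kruskal implementation. This is a minimal sketch: the function name `kruskal` and the edge format are hypothetical, and the edge endpoints are an assumption inferred from the $\delta$ sets given for the example graph.

```python
def kruskal(vertices, edges):
    """Kruskal's algorithm: scan edges by increasing cost, keep those joining two components.

    edges: dict mapping an edge name to (i, j, cost). Returns (tree_edges, total_cost).
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative, halving paths on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree, total = [], 0
    for name, (i, j, cost) in sorted(edges.items(), key=lambda item: item[1][2]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append(name)
            total += cost
    return tree, total


# Example graph; endpoints are an assumption inferred from the delta sets of the example.
EXAMPLE_EDGES = {"a": (1, 2, 1), "b": (2, 4, 2), "c": (3, 4, 7), "d": (1, 3, 4), "e": (2, 3, 3)}
tree, total = kruskal([1, 2, 3, 4], EXAMPLE_EDGES)
print(tree, total)
```

Under these assumptions the tree is a, b, e with cost 6, the same edge set the example reports for $F'$. The final lower bound in the example is 4.5, so the tree cost is within the guaranteed factor $2 - 2/4 = 1.5$ of $LB$ even though $LB$ is not tight.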
18 Analysis
- A proof that the algorithm has the following properties:
  - The algorithm produces a feasible solution.
  - The solution is within a factor of $2 - 2/|A|$ of the optimal solution.
- $F$ is the set of candidate edges selected by the algorithm.
- $F'$ is the forest output by the algorithm.
- $x$ is the incidence vector of the edges of $F'$.
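19 Observation 1
- If $f(S) = 0$ and $f(B) = 0$ for some $B \subseteq S$, then $f(S - B) = 0$.
- Proof
  - $f(V - S) = f(S) = 0$. (Symmetry.)
  - $f((V - S) \cup B) = 0$. (Disjointness.)
  - $f(S - B) = f((V - S) \cup B) = 0$. (Symmetry.)
[Figure: $(V - S) \cup B$ is the complement of $S - B$]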
20 Lemma 1
- For each connected component $N$ of $F'$, $f(N) = 0$.
- Proof
  - $N \subseteq C$ for some component $C$ of $F$, by the construction of $F'$.
  - Let $e_1, \ldots, e_k$ be the edges of $F$ such that $e_i \in \delta(N)$ (possibly $k = 0$).
  - Let $N_i$ and $C - N_i$ be the two components created by removing $e_i$ from the edges of component $C$, with $N \subseteq C - N_i$.
  - $f(N_i) = 0$, since $e_i \notin F'$.
  - The sets $N, N_1, N_2, \ldots, N_k$ are disjoint and form a partition of $C$.
  - $f(C - N) = f(\bigcup_{i=1}^{k} N_i) = 0$ by disjointness.
  - Because $f(C) = 0$, Observation 1 implies that $f(N) = 0$.
[Figure: component $C$ of $F$, split into $N$ and $N_1, N_2, \ldots, N_k$ by the edges $e_1, e_2, \ldots, e_k$ of $F - F'$]
21 Feasibility proof
- The incidence vector $x$ is a feasible solution to (IP).
- Proof
  - Suppose not, and assume that $x(\delta(S)) = 0$ for some $S$ such that $f(S) = 1$.
  - Let $N_1, \ldots, N_p$ be the components of $F'$.
  - If $x(\delta(S)) = 0$ then for all $i$, either $S \cap N_i = \emptyset$ or $S \cap N_i = N_i$.
  - Thus $S = N_{i_1} \cup \ldots \cup N_{i_k}$ for some $i_1, \ldots, i_k$.
  - $f(N_i) = 0$ for all $i$, by Lemma 1.
  - $f(S) = 0$ by the disjointness of $f$.
  - This contradicts our assumption that $f(S) = 1$.
  - Therefore, $x$ must be a feasible solution.
22 Approximation proof
- $A = \{v \in V : f(\{v\}) = 1\}$.
- Let $Z_{LP}$ be the cost of the optimal solution to (LP).
- Let $Z_{IP}$ be the cost of the optimal solution to (IP).
- The algorithm produces a set of edges $F'$ and a value $LB$ such that
  - $\sum_{e \in F'} c_e \le (2 - 2/|A|)\,LB = (2 - 2/|A|) \sum_{S \subset V} y_S \le (2 - 2/|A|)\,Z_{LP} \le (2 - 2/|A|)\,Z_{IP}$.
- Hence the algorithm is a $(2 - 2/|A|)$-approximation algorithm for the constrained forest problem for any proper function $f$.
- We use the dual solution $y$ implicitly constructed by the algorithm.
- $Z_{LP} \le Z_{IP}$. (Relaxation.)
- Because $y$ is a feasible dual solution and $y_S > 0$ only if $f(S) = 1$, it follows that $LB = \sum_{S \subset V} y_S \le Z_{LP}$.
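23 Proof
- $\sum_{e \in F'} c_e = \sum_{e \in F'} \sum_{S : e \in \delta(S)} y_S = \sum_{S \subset V} y_S\,|F' \cap \delta(S)|$.
  - First equality: by duality (the packing constraint of every edge in $F'$ is tight). Second equality: a different way of writing the same sum.
- Proof of the theorem, by induction on the main loop, that
  - $\sum_{S \subset V} y_S\,|F' \cap \delta(S)| \le (2 - 2/|A|) \sum_{S \subset V} y_S$.
- True before the first iteration: initially all $y_S = 0$.
- The increase in each iteration must preserve the inequality:
  - $\sum_{C \in \mathcal{C} : f(C) = 1} \varepsilon\,|F' \cap \delta(C)| \le (2 - 2/|A|)\,\varepsilon\,|\mathcal{C}^1|$,
  - where $\mathcal{C}$ is the set of components at the beginning of the iteration, and $\mathcal{C}^1 = \{C \in \mathcal{C} : f(C) = 1\}$.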
24 Induction step proof
- The basic intuition behind the proof:
  - The average degree of a vertex in a forest of at most $|A|$ vertices is at most $2 - 2/|A|$.
- To begin, construct a graph $H$:
  - Consider the active and inactive components of this iteration as the vertices of $H$.
  - Consider the edges $e \in F' \cap \delta(C)$ for all $C \in \mathcal{C}$ as the edges of $H$.
  - Remove all isolated vertices in $H$ that correspond to inactive components. ($H$ is a forest.)
- We claim that no leaf in $H$ corresponds to an inactive vertex.
[Figure: the graph $H$ for the s-t shortest path example, with the active and inactive components as vertices and the edges $e \in F' \cap \delta(C)$, before and after removal of isolated vertices]
25 Proof of claim about H
- Suppose otherwise.
- Let $v$ be a leaf corresponding to an inactive component.
  - $C_v$: its associated inactive component.
  - $e$: the edge incident to $v$.
  - $C$: the component of $F$ which contains $C_v$.
- Let $N$ and $C - N$ be the two components formed by removing edge $e$ from the edges of component $C$.
- Without loss of generality, say that $C_v \subseteq N$.
- The set $N - C_v$ is partitioned by the components $C_1, \ldots, C_k$.
- Since vertex $v$ is a leaf, no edge in $F'$ connects $C_v$ to any $C_i$.
- $f(\bigcup_i C_i) = 0$, by the construction of $F'$.
- Since $f(C_v) = 0$ also, it follows by disjointness that $f(N) = 0$.
- $f(C - N) = 0$, since $f(C) = 0$ and by Observation 1.
- By the construction of $F'$, $e \notin F'$, which is a contradiction.
[Figure: component $C$ of $F$, with $N$ containing $C_v$ and $C_1, C_2, \ldots, C_k$, and the leaf $v$]
26 End of proof
- In the graph $H$, the degree $d_v$ of the vertex $v$ corresponding to component $C$ is $|F' \cap \delta(C)|$.
- Let $N_a$ be the set of vertices in $H$ corresponding to active components, so that $|N_a| = |\mathcal{C}^1|$.
- Let $N_i$ be the set of vertices in $H$ that correspond to inactive components.
- Then $\sum_{v \in N_a} d_v = \sum_{v \in N_a \cup N_i} d_v - \sum_{v \in N_i} d_v \le 2(|N_a| + |N_i| - 1) - 2|N_i| = 2|N_a| - 2$.
- This inequality holds since $H$ is a forest with at most $|N_a| + |N_i| - 1$ edges, and each vertex corresponding to an inactive component has degree at least 2.
- Multiplying each side by $\varepsilon$, we obtain $\varepsilon \sum_{v \in N_a} d_v \le \varepsilon\,(2|N_a| - 2)$, or
  - $\varepsilon \sum_{C \in \mathcal{C}^1} |F' \cap \delta(C)| \le 2\varepsilon\,(|\mathcal{C}^1| - 1) \le (2 - 2/|A|)\,\varepsilon\,|\mathcal{C}^1|$,
- since the number of active components is always no more than $|A|$. Hence the theorem is proven.
27 Implementation
- The algorithm runs in $O(\min(n^2 \log n,\ mn\,\alpha(m, n)))$ time.
  - For practical implementations $\alpha(m, n) \le 4$.
- Computing $f$: $O(n) \cdot O(n) = O(n^2)$.
- Maintaining the components $\mathcal{C}$ with union-find: $O(n\,\alpha(n, n))$.
- The algorithmic problems are
  - selecting an edge that minimizes $\varepsilon$;
  - getting $F'$ from $F$.
- A naive approach uses $O(m\,\alpha(m, n))$ time in each iteration
  - to compute the reduced cost $\varepsilon$ for each edge $e = (i, j)$,
  - to check whether the edge spans two different components.
- Other loop operations take $O(n)$ time.
- The whole running time is $O(mn\,\alpha(m, n))$ for the main loop,
  - since there are at most $n - 1$ iterations.
- Not efficient for dense graphs.
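The component bookkeeping mentioned above is the classic union-find structure. The following is a generic sketch with path compression and union by size; the class name `UnionFind` and its methods are illustrative and not taken from the slides.

```python
class UnionFind:
    """Disjoint-set forest with path compression and union by size."""

    def __init__(self, items):
        self.parent = {x: x for x in items}
        self.size = {x: 1 for x in items}

    def find(self, x):
        """Return the representative of the component containing x."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:          # compress the path just walked
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Merge the components of x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
        return True
```

In the main loop, testing whether an edge $(i, j)$ spans two different components then amounts to comparing `find(i)` with `find(j)`, and merging $C_p$ with $C_q$ is a single `union` call.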
28 Smart minimum edge cost finding
- Reducing the time to find the minimum edge in dense graphs to $O(n \log n)$.
- There are three ideas behind this reduced time bound.
- Introduce a notion of time into the algorithm.
  - Let the time $T$ be 0 at the beginning of the algorithm.
  - Increment $T$ by the value of $\varepsilon$ each time through the main loop.
- Maintain a priority queue of edges, where the key of an edge is the time $T$ at which its reduced cost is expected to be zero.
  - We assume that the activity (or inactivity) of the components will continue indefinitely.
  - The activity of a component can change only when it is merged with another component, and only edges incident to the component are affected.
  - In this case, we can recompute the key for each incident edge: delete the element with the old key and reinsert it with the new key.
- For a lower time bound, maintain a single edge between any two components.
  - If there is more than one edge between any two components, one of the edges will always have a reduced cost no greater than that of the others.
  - The others may be removed from consideration altogether.
29 Implementation of the three ideas
- We sort the edges into a queue (in $O(m \log n)$ time).
- Each time through the loop:
  - Extract the minimum edge $(i, j)$ from the queue, where $i \in C_p$ and $j \in C_q$.
  - Delete all edges incident to $C_p$ and $C_q$ from the queue.
  - For each component $C_r$ different from $C_p$ and $C_q$:
    - update the keys of the two edges from $C_p$ to $C_r$ and from $C_q$ to $C_r$,
    - select the one edge that has the minimum key value,
    - reinsert that edge into the queue.
- Each iteration performs $O(n)$ queue insertions and deletions,
  - since there are at most $n$ components at any point in time.
- Time bound of $O(n \log n)$ per iteration,
  - $O(n^2 \log n)$ for the entire loop.
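The key of an edge can be computed from the current time and the current $d$ values. The sketch below only illustrates this key computation and the initial queue construction with Python's `heapq`; it does not implement the deletion and reinsertion of keys or the single-edge-per-component-pair rule. The names `edge_key` and `build_queue` and the data layout are assumptions made for illustration.

```python
import heapq
import math


def edge_key(T, cost, d_i, d_j, f_p, f_q):
    """Time at which the edge's reduced cost reaches zero if activity never changes.

    T is the current time, d_i and d_j the current d values of the endpoints,
    f_p and f_q the f values (0 or 1) of the endpoints' components.
    """
    rate = f_p + f_q
    if rate == 0:
        return math.inf                     # neither side grows: never tight
    return T + (cost - d_i - d_j) / rate


def build_queue(T, edges, d, comp_of, f):
    """Return a heap of (key, edge_name) for edges joining different components."""
    heap = []
    for name, (i, j, cost) in edges.items():
        Cp, Cq = comp_of[i], comp_of[j]
        if Cp != Cq:
            heapq.heappush(heap, (edge_key(T, cost, d[i], d[j], f(Cp), f(Cq)), name))
    return heap
```

For instance, with $T = 2$, an edge of cost 3 whose endpoints have $d$ values 2 and 0 and exactly one active component gets key $2 + (3 - 2 - 0)/1 = 3$, which is the time at which the main loop would add it.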
30 Compute F' from F
- We iterate through the components $C$ of $F$.
- Given a component $C$:
  - We root the tree at some vertex.
  - Put each leaf of the tree in a separate list.
  - Compute the $f$ value for each of the leaves.
  - An edge joining a vertex to its parent is discarded if the $f$ value for the set of vertices in its subtree is 0.
  - Whenever we have computed the $f$ value for all the children of some vertex $v$:
    - we concatenate the lists of all the children of $v$,
    - add $v$ to the list,
    - compute $f$ of the vertices in the list.
- We finish once we have examined every edge in the tree.
- Since there are $O(n)$ edges, the process takes $O(n)$ time, not counting the $O(n)$ evaluations of $f$.
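The bottom-up pruning can be sketched as follows. The function name `prune_forest` and the edge format are hypothetical. The sketch assumes that $f$ is proper and that $f(C) = 0$ for every component $C$ of $F$, so that testing only the subtree side of each edge is sufficient (by Observation 1, the other side then has $f$ value 0 as well). For brevity it extends Python lists instead of concatenating linked lists, so it does not achieve the linear list-handling cost described above.

```python
def prune_forest(vertices, F, edges, f):
    """Return F' = edges of forest F whose subtree side has f value 1.

    edges: dict mapping an edge name to (i, j, cost); F: list of edge names forming a forest.
    Assumes f is proper and f(C) = 0 for every component C of F.
    """
    adj = {v: [] for v in vertices}
    for name in F:
        i, j, _ = edges[name]
        adj[i].append((j, name))
        adj[j].append((i, name))

    kept, visited = set(), set()
    for root in vertices:
        if root in visited:
            continue
        # Iterative DFS: record each vertex's parent and parent edge in visiting order.
        order, parent_edge, parent = [], {root: None}, {root: None}
        stack = [root]
        visited.add(root)
        while stack:
            u = stack.pop()
            order.append(u)
            for w, name in adj[u]:
                if w not in visited:
                    visited.add(w)
                    parent[w], parent_edge[w] = u, name
                    stack.append(w)
        # Process children before parents, merging vertex lists upward.
        subtree = {v: [v] for v in order}
        for v in reversed(order):
            if parent[v] is None:
                continue
            if f(frozenset(subtree[v])) == 1:
                kept.add(parent_edge[v])
            subtree[parent[v]].extend(subtree[v])
    return [name for name in F if name in kept]
```

On the shortest s-t path example with $F = \{sb, sa, at\}$, the subtree below edge $sb$ is $\{b\}$ with $f$ value 0, so $sb$ is discarded and the result is $\{sa, at\}$.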
31 The generalized Steiner tree problem
- Given
  - an undirected graph $G = (V, E)$,
  - a nonnegative cost $c_e$ for each $e \in E$,
  - terminal sets $T_i \subseteq V$ ($i = 1, \ldots, p$).
- Find
  - a minimum-cost set of edges such that, for each $i$, all terminals in $T_i$ are connected.
- Our approximation algorithm has a performance guarantee of $2 - 2/k$, where $k = |\bigcup_{i = 1, \ldots, p} T_i|$.
- When $p = 1$, the problem reduces to the classical Steiner tree problem.
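32 The generalized Steiner tree problem (continued)
- This problem is a proper constrained forest problem with
  - $f(S) = 1$ if there exists $i \in \{1, \ldots, p\}$ with $\emptyset \ne S \cap T_i \ne T_i$, and $f(S) = 0$ otherwise.
- We are looking for
  - Min $\sum_{e \in E} c_e x_e$
  - Subject to
    - $\sum_{e \in \delta(S)} x_e \ge 1$ if $\emptyset \ne S \cap T_i \ne T_i$ for some $i$
    - $\sum_{e \in \delta(S)} x_e \ge 0$ otherwise
    - $x_e \ge 0$ and integer for all $e \in E$
- For every cut that separates some $T_i$, at least one edge of the cut belongs to $F$.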
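The proper function of the generalized Steiner tree problem is straightforward to express in code. This is a minimal sketch; the factory name `make_generalized_steiner_f` is hypothetical.

```python
def make_generalized_steiner_f(terminal_sets):
    """Build f(S) = 1 iff some terminal set T_i is split by S (S holds some but not all of T_i)."""
    sets = [frozenset(T) for T in terminal_sets]

    def f(S):
        S = frozenset(S)
        return 1 if any(0 < len(S & T) < len(T) for T in sets) else 0

    return f
```

With a single terminal set this is exactly the Steiner tree function, and with sets of the form $\{c_i, d_i\}$ it covers the fixed destination point-to-point case.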
33 Point-to-point connection problems
- Given
  - an undirected graph $G = (V, E)$,
  - a nonnegative cost $c_e$ for each $e \in E$,
  - a set $C = \{c_1, \ldots, c_p\}$ of sources,
  - a set $D = \{d_1, \ldots, d_p\}$ of destinations.
- Find
  - a minimum-cost set $F$ of edges such that each source-destination pair is connected in $F$.
- The fixed destination case:
  - $c_i$ is required to be connected to $d_i$; this is a special case of the generalized Steiner tree problem where $T_i = \{c_i, d_i\}$.
- The nonfixed destination case:
  - each component of the forest $F$ is required to contain the same number of sources and destinations.
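34 Point-to-point connection problems (continued): the nonfixed destination case
- This problem is a proper constrained forest problem with
  - $f(S) = 1$ if $|S \cap C| \ne |S \cap D|$, and $f(S) = 0$ otherwise.
- We are looking for
  - Min $\sum_{e \in E} c_e x_e$
  - Subject to
    - $\sum_{e \in \delta(S)} x_e \ge 1$ if $|S \cap C| \ne |S \cap D|$
    - $\sum_{e \in \delta(S)} x_e \ge 0$ otherwise
    - $x_e \ge 0$ and integer for all $e \in E$
- For this problem we obtain a $(2 - 1/p)$-approximation algorithm, where $p$ is the number of vertices in the set of sources (equivalently, of destinations).
[Figure: the source set $C$ and the destination set $D$]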
35 Summary
- Our algorithm finds an approximately minimum-cost forest by merging components together and then discarding unnecessary edges, in $O(\min(n^2 \log n,\ mn\,\alpha(m, n)))$ time.