Title: Randomization in Graph Optimization Problems
1 Randomization in Graph Optimization Problems
- David Karger
- MIT
- http://theory.lcs.mit.edu/karger
2 Randomized Algorithms
- Flip coins to decide what to do next
- Avoid the hard work of making the right choice
- Often faster and simpler than deterministic algorithms
- Different from average-case analysis
- Input is worst case
- Algorithm adds randomness
3 Methods
- Random selection
- if most candidate choices are good, then a random choice is probably good
- Monte Carlo simulation
- simulations estimate event likelihoods
- Random sampling
- generate a small random subproblem
- solve, extrapolate to whole problem
- Randomized rounding for approximation
4 Cuts in Graphs
- Focus on undirected graphs
- A cut is a vertex partition
- Value is number (or total weight) of crossing edges
5 Optimization with Cuts
- Cut values determine the solution of many graph optimization problems
- min-cut / max-flow
- multicommodity flow (sort of)
- bisection / separator
- network reliability
- network design
- Randomization helps solve these problems
6 Presentation Assumption
- For the entire presentation, we consider unweighted graphs (all edges have weight/capacity one)
- All results apply unchanged to arbitrarily weighted graphs
- Integer weights = parallel edges
- Rational weights scale to integers
- Analysis unaffected
- Some implementation details differ
7 Basic Probability
- Conditional probability
- Pr[A ∩ B] = Pr[A] Pr[B | A]
- Independent events multiply
- Pr[A ∩ B] = Pr[A] Pr[B]
- Linearity of expectation
- E[X + Y] = E[X] + E[Y]
- Union bound
- Pr[X ∪ Y] ≤ Pr[X] + Pr[Y]
8 Random Selection for Minimum Cuts
- Random choices are good
- when problems are rare
9 Minimum Cut
- Smallest cut of the graph
- Cheapest way to separate it into 2 parts
- Various applications:
- network reliability (small cuts are weakest)
- subtour elimination constraints for TSP
- separation oracle for network design
- Not s-t min-cut
10 Max-flow/Min-cut
- s-t flow: edge-disjoint packing of s-t paths
- s-t cut: a cut separating s and t
- [FF]: s-t max-flow = s-t min-cut
- max-flow saturates all s-t min-cuts
- most efficient way to find s-t min-cuts
- [GH]: min-cut is all-pairs s-t min-cut
- find using n flow computations
11 Flow Algorithms
- Push-relabel [GT]
- push excess around the graph till it's gone
- max-flow in O(mn) (note: O hides logs)
- Recent O(m^{3/2}) [GR]
- min-cut in O(mn^2) --- harder than flow
- Pipelining [HO]
- save push/relabel data between flows
- min-cut in O(mn) --- as easy as flow
12 Contraction
- Find an edge that doesn't cross the min-cut
- Contract (merge) its endpoints into 1 vertex
13 Contraction Algorithm
- Repeat n − 2 times:
- find a non-min-cut edge
- contract it (keep parallel edges)
- Each contraction decrements the vertex count
- At the end, 2 vertices are left
- unique cut
- corresponds to a min-cut of the starting graph
15 Picking an Edge
- Must contract non-min-cut edges
- [NI]: O(m)-time algorithm to pick such an edge
- n contractions: O(mn) time for min-cut
- slightly faster than flows
- If only we could find an edge faster...
- Idea: min-cut edges are few
16 Randomize
- Repeat until 2 vertices remain (one trial is sketched below):
- pick a random edge
- contract it
- (keep fingers crossed)
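A minimal Python sketch of one trial of this randomized contraction, assuming a connected input graph given as an edge list on vertices 0..n−1; the union-find bookkeeping is an implementation convenience of this sketch, not part of the slides.

```python
import random

def contraction_trial(edges, n):
    """One trial of randomized contraction (a sketch; assumes G is connected).

    `edges` is a list of (u, v) pairs; parallel edges are allowed and matter,
    since an edge is picked uniformly at random.  Skipping self-loops below is
    equivalent to keeping parallel edges but discarding loops after a merge.
    Returns the value of the cut this trial finds (min-cut w.p. >= 2/n^2).
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    remaining = n
    while remaining > 2:
        u, v = random.choice(edges)         # uniformly random edge
        ru, rv = find(u), find(v)
        if ru != rv:                        # ignore edges inside a super-vertex
            parent[ru] = rv                 # contract: merge the two endpoints
            remaining -= 1
    # the two surviving super-vertices define the cut; count crossing edges
    return sum(1 for u, v in edges if find(u) != find(v))

# Amplification (see the Repetition slide): run ~n^2 trials, keep the best.
# best = min(contraction_trial(edges, n) for _ in range(7 * n * n))
```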
17 Analysis I
- Min-cut is small --- few edges
- Suppose the graph has min-cut c
- Then minimum degree is at least c
- Thus at least nc/2 edges
- A random edge is probably safe
- Pr[min-cut edge] ≤ c/(nc/2)
- = 2/n
- (easy generalization to the capacitated case)
18 Analysis II
- Algorithm succeeds if it never accidentally contracts a min-cut edge
- Contracts vertices from n down to 2
- When k vertices remain, the chance of error is 2/k
- thus, the chance of being right is 1 − 2/k
- Pr[always right] is the product of the probabilities of being right each time
19 Analysis III
- Pr[always right] = ∏_{k=3..n} (1 − 2/k) = ∏_{k=3..n} (k − 2)/k = 2/(n(n − 1)) ≈ 2/n^2
- not too good!
20 Repetition
- Repetition amplifies success probability
- basic failure probability 1 − 2/n^2
- so repeat 7n^2 times
- failure probability drops to (1 − 2/n^2)^{7n^2} ≈ e^{−14}
21 How fast?
- Easy to perform 1 trial in O(m) time
- just use an array of edges, no data structures
- But need n^2 trials: O(mn^2) time
- Simpler than flows, but slower
22 An improvement [KS]
- When k vertices remain, error probability is 2/k
- big when k is small
- Idea: once k is small, change the algorithm
- algorithm needs to be safer
- but can afford to be slower
- Amplify by repetition!
- Repeat the base algorithm many times
23 Recursive Algorithm
- Algorithm RCA(G, n): (sketched below)
- G has n vertices
- repeat twice:
- randomly contract G to n/√2 vertices
- RCA(G, n/√2)
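A compact Python sketch of RCA under the same edge-list representation as before; the small base case and the ceil(n/√2) contraction target are assumptions of this sketch, not quoted from the slide.

```python
import math
import random

def contract_to(edges, vertices, target):
    """Contract the connected multigraph on `vertices` down to `target`
    super-vertices by merging endpoints of uniformly random edges."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    count = len(vertices)
    while count > target:
        u, v = random.choice(edges)
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            count -= 1
    # rebuild edge list between super-vertices, dropping self-loops
    new_edges = [(find(u), find(v)) for u, v in edges if find(u) != find(v)]
    return new_edges, {find(v) for v in vertices}

def rca(edges, vertices):
    """Recursive contraction: contract to ~n/sqrt(2), recurse twice, keep the
    better answer.  Returns a cut value (min-cut w.p. Omega(1/log n))."""
    n = len(vertices)
    if n <= 6:                              # base case: one direct contraction
        final_edges, _ = contract_to(edges, vertices, 2)
        return len(final_edges)
    target = math.ceil(n / math.sqrt(2))
    best = float("inf")
    for _ in range(2):                      # the two independent iterations
        sub_edges, sub_vertices = contract_to(edges, vertices, target)
        best = min(best, rca(sub_edges, sub_vertices))
    return best
```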
24 Main Theorem
- On any capacitated, undirected graph, Algorithm RCA
- runs in O(n^2) time with simple structures
- finds the min-cut with probability ≥ 1/log n
- Thus, O(log n) repetitions suffice to find the minimum cut (failure probability 10^{−6}) in O(n^2 log^2 n) time.
25 Proof Outline
- Graph has O(n^2) (capacitated) edges
- So O(n^2) work to contract, then two subproblems of size n/√2
- T(n) = 2 T(n/√2) + O(n^2) = O(n^2 log n)
- Algorithm fails only if both iterations fail
- An iteration succeeds if its contractions and its recursion succeed
- P(n) ≥ 1 − (1 − ½ P(n/√2))^2 = Ω(1/log n)
26 Failure Modes
- Monte Carlo algorithms always run fast and probably give you the right answer
- Las Vegas algorithms probably run fast and always give you the right answer
- To make a Monte Carlo algorithm Las Vegas, need a way to check the answer
- repeat till the answer is right
- No fast min-cut check known (flow is slow!)
27 How do we verify a minimum cut?
28 Enumerating Cuts
- The probabilistic method, backwards
29 Cut Counting
- Original contraction algorithm finds any given min-cut with probability at least 2/(n(n − 1))
- Only one cut is found per run
- Disjoint events, so probabilities add
- So at most n(n − 1)/2 min-cuts
- otherwise probabilities would sum to more than one
- Tight:
- a cycle has exactly this many min-cuts
30 Enumeration
- RCA as stated has constant probability of finding any given min-cut
- If run O(log n) times, probability of missing a given min-cut drops to 1/n^3
- But only n^2 min-cuts
- So probability of missing any is at most 1/n
- So, with probability 1 − 1/n, find all
- O(n^2 log^3 n) time
31 Generalization
- If G has min-cut c, a cut of value αc is an α-mincut
- Lemma: the contraction algorithm finds any given α-mincut with probability Ω(n^{−2α})
- Proof: just add a factor to the basic analysis
- Corollary: O(n^{2α}) α-mincuts
- Corollary: can find all of them in O(n^{2α}) time
- just change the contraction factor in RCA
32 Summary
- A simple, fast min-cut algorithm
- Random selection avoids rare problems
- Generalization to near-minimum cuts
- Bound on number of small cuts
- Probabilistic method, backwards
33 Network Reliability
34 The Problem
- Input:
- Graph G with n vertices
- Edge failure probabilities
- For simplicity, fix a single p
- Output:
- FAIL(p) = probability G is disconnected by edge failures
35 Approximation Algorithms
- Computing FAIL(p) is #P-complete [V]
- Exact algorithm seems unlikely
- Approximation scheme
- Given G, p, ε, outputs an ε-approximation
- May be randomized
- succeed with high probability
- Fully polynomial (FPRAS) if runtime is polynomial in n, 1/ε
36 Monte Carlo Simulation
- Flip a coin for each edge, test the graph (sketched below)
- k failures in t trials ⇒ FAIL(p) ≈ k/t
- E[k/t] = FAIL(p)
- How many trials needed for confidence?
- bad luck on trials can yield a bad estimate
- clearly need at least 1/FAIL(p)
- Chernoff bound: O(1/(ε^2 FAIL(p))) trials suffice to give probable accuracy within ε
- Time O(m/(ε^2 FAIL(p)))
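A sketch of this naive simulation, assuming an edge-list input; each trial deletes every edge independently with probability p and checks connectivity with union-find. Per the slide, pass trials = O(1/(ε² FAIL(p))) for probable accuracy within ε.

```python
import random

def estimate_fail(n, edges, p, trials):
    """Naive Monte Carlo estimate of FAIL(p): fraction of trials in which the
    surviving graph on vertices 0..n-1 is disconnected (a sketch)."""
    failures = 0
    for _ in range(trials):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        components = n
        for u, v in edges:
            if random.random() >= p:        # edge survives with prob. 1 - p
                ru, rv = find(u), find(v)
                if ru != rv:
                    parent[ru] = rv
                    components -= 1
        failures += components > 1          # disconnected => network failure
    return failures / trials
```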
37 Chernoff Bound
- Random variables X_i ∈ [0, 1]
- Sum X = ∑ X_i
- Bound deviation from expectation
- Pr[|X − E[X]| ≥ ε E[X]] < exp(−ε^2 E[X]/4)
- If E[X] ≥ 4(log n)/ε^2, tight concentration
- deviation by ε has probability < 1/n
- No one variable is a big part of E[X]
38 Application
- Let X_i = 1 if trial i is a failure, else 0
- Let X = X_1 + … + X_t
- Then E[X] = t · FAIL(p)
- Chernoff says X is within relative ε of E[X] with probability 1 − exp(−ε^2 t FAIL(p)/4)
- So choose t to cancel the other terms
- High probability: t = O(log n / (ε^2 FAIL(p)))
- deviation by ε has probability < 1/n
39 Review
- Contraction algorithm
- O(n^{2α}) α-mincuts
- enumerate in O(n^{2α}) time
40 Network reliability problem
- Random edge failures
- Estimate FAIL(p) = Pr[graph disconnects]
- Naïve Monte Carlo simulation
- Chernoff bound --- tight concentration
- Pr[|X − E[X]| ≥ ε E[X]] < exp(−ε^2 E[X]/4)
- O(log n / (ε^2 FAIL(p))) trials expect O(log n / ε^2) network failures --- good for Chernoff
- So estimate within ε in O(m/(ε^2 FAIL(p))) time
41 Rare Events
- When FAIL(p) is too small, it takes too long to collect sufficient statistics
- Solution: skew trials to make the interesting event more likely
- but in a way that lets you recover the original probability
42 DNF Counting
- Given a DNF formula (OR of ANDs)
- (e_1 ∧ e_2 ∧ e_3) ∨ (e_1 ∧ e_4) ∨ (e_2 ∧ e_6)
- Each variable set true with probability p
- Estimate Pr[formula true]
- #P-complete
- [KL], [KLM]: FPRAS
- skew to make true outcomes common
- time linear in formula size
43 Rewrite problem
- Assume p = 1/2
- count satisfying assignments
- Satisfaction matrix
- S_ij = 1 if the ith assignment satisfies the jth clause
- We want the number of nonzero rows
- Randomly sampling rows won't work
- might be too few nonzeros
44 New sample space
- So normalize every nonzero row to sum to one (divide by its number of nonzeros)
- Now the sum of the nonzeros is the desired value
- So it is sufficient to estimate the average nonzero
45 Sampling Nonzeros
- We know the number of nonzeros per column:
- to satisfy a given clause, all variables in the clause must be true
- all other variables are unconstrained
- Estimate the average by random sampling (sketched below)
- know the number of nonzeros per column
- so can pick a random column
- then pick a random true-for-column assignment
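A sketch of this skewed-sampling estimator in the spirit of [KL]/[KLM], assuming p = 1/2 as on the previous slide (so a clause over s variables has 2^{n−s} satisfying assignments, its column's nonzero count). The "first satisfied clause" test below is one standard way to realize the normalized-row average; negated literals are omitted for simplicity.

```python
import random

def dnf_count(clauses, num_vars, samples):
    """Estimate the number of satisfying assignments of a DNF formula.

    `clauses` is a list of sets of variable indices; a clause is satisfied
    when all its variables are true.  A sketch, not a tuned implementation."""
    # column weight = number of assignments satisfying clause j
    weights = [2 ** (num_vars - len(c)) for c in clauses]
    total = sum(weights)                    # total nonzeros in the matrix
    hits = 0
    for _ in range(samples):
        # pick a column proportional to its nonzeros, then a uniformly
        # random assignment satisfying that clause
        j = random.choices(range(len(clauses)), weights=weights)[0]
        assign = [random.random() < 0.5 for _ in range(num_vars)]
        for v in clauses[j]:
            assign[v] = True
        # count a hit only if j is the first clause this assignment satisfies;
        # each satisfying assignment then contributes exactly once
        first = next(i for i, c in enumerate(clauses)
                     if all(assign[v] for v in c))
        hits += first == j
    return total * hits / samples           # E[estimate] = #nonzero rows
```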
46 Few Samples Needed
- Suppose k clauses
- Then E[sample] ≥ 1/k:
- 1 ≤ #satisfied clauses ≤ k
- 1 ≥ sample value ≥ 1/k
- Adding O(k log n / ε^2) samples gives a large mean
- so Chernoff says the sample mean is probably a good estimate
47 Reliability Connection
- Reliability as DNF counting:
- variable per edge, true if the edge fails
- a cut fails if all its edges do (AND of edge variables)
- the graph fails if some cut does (OR of cuts)
- FAIL(p) = Pr[formula true]
- Problem: the DNF has 2^n clauses
48 Focus on Small Cuts
- Fact: FAIL(p) > p^c
- Theorem: if p^c = 1/n^{2+δ}, then Pr[some cut of value > αc fails] < n^{−αδ}
- Corollary: FAIL(p) ≈ Pr[some α-mincut fails],
- where α = 1 + 2/δ
- Recall: O(n^{2α}) α-mincuts
- Enumerate with RCA, run DNF counting
49 Proof of Theorem
- Given p^c = 1/n^{2+δ}
- At most n^{2α} cuts have value αc
- Each fails with probability p^{αc} = 1/n^{α(2+δ)}
- Pr[any cut of value αc fails] = O(n^{−αδ})
- Sum over all α > 1
50 Algorithm
- RCA can enumerate all α-minimum cuts with high probability in O(n^{2α}) time
- Given the α-minimum cuts, can ε-estimate the probability one fails via Monte Carlo simulation for DNF counting (formula size O(n^{2α}))
- Corollary: when FAIL(p) < n^{−(2+δ)}, can ε-approximate it in O(cn^{2+4/δ}) time
51 Combine
- For large FAIL(p), naïve Monte Carlo
- For small FAIL(p), RCA + DNF counting
- Balance: ε-approximation in O(mn^{3.5}/ε^2) time
- Implementations show this is practical for hundreds of nodes
- Again, no way to verify correctness
52 Summary
- Naïve Monte Carlo simulation works well for common events
- Need to adapt for rare events
- Cut structure and DNF counting let us do this for network reliability
53 Random Sampling
54 Random Sampling
- General tool for faster algorithms:
- pick a small, representative sample
- analyze it quickly (small)
- extrapolate to the original (representative)
- Speed-accuracy tradeoff
- smaller sample means less time
- but also less accuracy
55 Min-cut Duality
- Edmonds: min-cut = max tree packing
- convert to a directed graph
- source vertex s (doesn't matter which)
- spanning trees directed away from s
- Gabow: augmenting trees
- add a tree in O(m) time
- min-cut c (via max packing) in O(mc)
- great if m and c are small
56 Example
- min-cut 2
- 2 directed spanning trees
57 Sampling for Approximation
58 Random Sampling
- Gabow's scheme is great if m, c are small
- Random sampling
- reduces m, c
- scales cut values (in expectation)
- if you pick half the edges, you get about half of each cut
- So find tree packings, cuts in samples
- Problem: maybe some large deviations
59 Sampling Theorem
- Given graph G, build a sample G(p) by including each edge with probability p
- A cut of value v in G has expected value pv in G(p)
- Definition: constant ρ = 8 (ln n) / ε^2
- Theorem: with high probability, all cuts in G(ρ/c) have (1 ± ε) times their expected values.
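For concreteness, a tiny sketch of building the sample G(p) and the slide's constant ρ:

```python
import math
import random

def skeleton(edges, p):
    """Build G(p): keep each edge independently with probability p (a sketch)."""
    return [e for e in edges if random.random() < p]

def rho(n, eps):
    """The slide's constant rho = 8 ln(n) / eps^2."""
    return 8.0 * math.log(n) / (eps * eps)

# With p = rho(n, eps) / c, every cut in the sample is within (1 +/- eps)
# of p times its original value, with high probability.
```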
60 A Simple Application
- Gabow packs trees in O(mc) time
- Build G(ρ/c)
- minimum expected cut ρ
- by the theorem, min-cut is probably near ρ
- find min-cut in time O(ρm) using Gabow
- corresponds to a near-min-cut in G
- Result: (1 + ε) times min-cut in O(ρm) time
61 Proof of Sampling Idea
- Chernoff bound says the probability of a large deviation in a cut's value is small
- Problem: exponentially many cuts. Perhaps some deviate a great deal
- Solution: we showed there are few small cuts
- only small cuts are likely to deviate much
- but they are few, so the Chernoff bound applies
62 Proof of Sampling
- Sampled with probability ρ/c:
- a cut of value αc has mean αρ
- Chernoff: deviates from expected size by more than ε with probability at most n^{−3α}
- At most n^{2α} cuts have value αc
- Pr[any cut of value αc deviates] = O(n^{−α})
- Sum over all α ≥ 1
63 Las Vegas Algorithms
- Finding good certificates
64 Approximate Tree Packing
- Break the edges into c/ρ random groups
- Each looks like a sample at rate ρ/c:
- O(ρm/c) edges
- each has min expected cut ρ
- so the theorem says min-cut (1 − ε)ρ
- So each has a packing of size (1 − ε)ρ
- Gabow finds it in time O(ρ^2 m/c) per group
- so overall time is c · O(ρ^2 m/c) = O(ρ^2 m)
65 Las Vegas Algorithm
- Packing algorithm is Monte Carlo
- Previously found approximate cut (faster)
- If they are close, each certifies the other:
- cut exceeds the optimum cut
- packing is below the optimum cut
- If not, re-run both
- Result: Las Vegas, expected time O(ρ^2 m)
66 Exact Algorithm
- Randomly partition edges into two groups
- each is like a ½-sample: ε = O(c^{−1/2})
- Recursively pack trees in each half
- c/2 − O(c^{1/2}) trees each
- Merge the packings
- gives a packing of size c − O(c^{1/2})
- augment to a maximum packing: O(mc^{1/2})
- T(m, c) = 2 T(m/2, c/2) + O(mc^{1/2}) = O(mc^{1/2})
67 Nearly Linear Time
68 Analyze Trees
- Recall: G packs c (directed) edge-disjoint spanning trees
- Corollary: in such a packing, some tree crosses the min-cut only twice
- To find the min-cut:
- find a tree packing
- find the smallest cut that ≤ 2 tree edges cross
- Problem: packing takes O(mc) time
69 Constraint trees
- Min-cut c
- c directed trees
- 2c directed min-cut edges
- On average, two min-cut edges per tree
- Definition:
- a tree 2-crosses a cut if exactly two of its edges cross it
70 Finding the Cut
- From the crossing tree edges, deduce the cut:
- Remove the tree edges
- No other edges cross
- So each component is on one side
- And opposite its neighbors' side
71 Sampling
- Solution: use G(ρ/c) with ε = 1/8
- pack O(ρ) trees in O(m) time
- original min-cut has ≤ (1 + ε)ρ edges in G(ρ/c)
- some tree 2-crosses it in G(ρ/c)
- and thus 2-crosses it in G
- Analyze O(ρ) trees in G
- time O(m) per tree
- Monte Carlo
72 Simplify
- First discuss the case where only one tree edge crosses the min-cut
73 Analyzing a tree
- Root the tree, so cut = subtree
- Use a dynamic program up from the leaves to determine subtree cuts efficiently
- Given cuts at the children of a node, compute the cut at the parent
- Definitions:
- v↓ = nodes below v
- C(v) = value of the cut at subtree v↓
74 The Dynamic Program
- (figure: rooted tree with node u and children v, w; edges grouped by least common ancestor u, marked keep/discard)
75 Algorithm: 1-Crossing Trees
- Compute edges' LCAs: O(m)
- Compute cuts at leaves
- cut values = degrees
- each edge is incident on at most two leaves
- total time O(m)
- Dynamic program upwards: O(n)
- Total: O(m + n) (see the sketch below)
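A Python sketch of this bottom-up computation; the explicit recurrence C(v) = deg(v) + Σ_{children w} C(w) − 2·#(edges with LCA v) is my reconstruction consistent with these bullets, not quoted from the slide.

```python
def subtree_cut_values(n, root, children, graph_edges, lca):
    """Compute C(v) = number of graph edges with exactly one endpoint in v's
    subtree, for every tree node v (a sketch).

    `children[v]` lists v's tree children; `lca` maps each graph edge (u, w)
    to its least common ancestor in the rooted tree."""
    deg = [0] * n
    lca_count = [0] * n
    for u, w in graph_edges:
        deg[u] += 1
        deg[w] += 1
        lca_count[lca[(u, w)]] += 1         # edges "resolved" at this node

    # iterative post-order: parents appear before children in `order`
    order, stack = [root], [root]
    while stack:
        v = stack.pop()
        for w in children[v]:
            order.append(w)
            stack.append(w)

    C = [0] * n
    for v in reversed(order):               # children are processed first
        C[v] = deg[v] - 2 * lca_count[v] + sum(C[w] for w in children[v])
    return C
```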
76 2-Crossing Trees
- Cut corresponds to two subtrees v↓, w↓
- (figure: tree with subtrees at v and w, marked keep/discard)
- n^2 table entries C(v, w)
- fill in O(n^2) time with a dynamic program
77 Linear Time
- Bottleneck is the C(v, w) computations
- Avoid: find the right twin w for each v
- Compute using the addpath and minpath operations of dynamic trees [ST]
- Result: O(m log^3 n) time (messy)
78 How do we verify a minimum cut?
79 Network Design
80 Problem Statement
- Given vertices, and a cost c_vw to buy an edge from v to w, find the minimum-cost purchase that creates a graph with desired connectivity properties
- Example: minimum-cost k-connected graph
- Generally NP-hard
- Recent approximation algorithms [GW], [JV]
81 Integer Linear Program
- Variable x_vw = 1 if we buy edge vw
- Solution cost ∑ x_vw c_vw
- Constraint: for every cut, ∑ x_vw ≥ k
- Relaxing integrality gives a tractable LP
- Exponentially many cuts
- but separation oracles exist (e.g., min-cut)
- What is the integrality gap?
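Written out as a math block (a reconstruction from the bullets above, in the slide's notation):

```latex
\min \sum_{(v,w)} c_{vw}\, x_{vw}
\quad \text{s.t.} \quad
\sum_{(v,w)\ \text{crossing}\ C} x_{vw} \;\ge\; k
\quad \text{for every cut } C,
\qquad
x_{vw} \in \{0,1\}
\ \ \text{(relaxed to } 0 \le x_{vw} \le 1\text{)}.
```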
82 Randomized Rounding
- Given LP solution values x_vw
- Build a graph where edge vw is present with probability x_vw (sketched below)
- Expected cost is at most opt: ∑ x_vw c_vw
- Expected number of edges crossing any cut satisfies the constraint
- If the expected number is large for every cut, the sampling theorem applies
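A one-function sketch of the rounding step; the dictionary input x, mapping edges to fractional values, is an assumed representation.

```python
import random

def randomized_rounding(x):
    """Buy edge (v, w) independently with probability x[(v, w)] (a sketch).

    By linearity, E[cost] equals the LP cost, and each cut's expected number
    of crossing edges equals its fractional value, so the sampling theorem
    bounds the deviations."""
    return [edge for edge, frac in x.items() if random.random() < frac]
```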
83 k-connected subgraph
- Fractional solution is k-connected
- So every cut has (expected) ≥ k edges crossing in the rounded solution
- Sampling theorem says every cut has at least k − (k log n)^{1/2} edges
- Close approximation for large k
- Can often repair: e.g., get a k-connected subgraph at cost 1 + ((log n)/k)^{1/2} times the minimum
84 Nonuniform Sampling
- Concentrate on the important things
- [Benczur-Karger, Karger, Karger-Levine]
85 s-t Min-Cuts
- Recall: if G has min-cut c, then in G(ρ/c) all cuts approximate their expected values to within ε
- Applications
- Trouble if c is small and v is large
86 The Problem
- Cut sampling relied on the Chernoff bound
- Chernoff bounds require that no one edge is a large fraction of the expectation of a cut it crosses
- If the sampling rate is 1/c, each edge across a min-cut is too significant
- But if an edge only crosses large cuts, then sampling rate 1/c is OK!
87 Biased Sampling
- Original sampling theorem is weak when
- m is large
- c is small
- But if m is large
- then G has dense regions
- where c must be large
- where we can sample more sparsely
88 Problem: Old Time → New Time
- Approx. s-t min-cut: O(mn) → O(n^2/ε^2)
- Approx. s-t max-flow: O(m^{3/2}) → O(mn^{1/2}/ε)
- Flow of value v: O(mv) → O(n^{11/9} v)
- Approx. bisection: O(m^2) → O(n^2/ε^2)
- m ⇒ n/ε^2 in weighted, undirected graphs
89 Strong Components
- Definition: a k-strong component is a maximal vertex-induced subgraph with min-cut k
- (figure: example graph with components of strength 2 and 3)
90 Nonuniform Sampling
- Definition: an edge is k-strong if its endpoints are in the same k-strong component
- Stricter than k-connected endpoints
- Definition: the strong connectivity c_e of edge e is the largest k for which e is k-strong
- Plan: sample dense regions lightly
91 Nonuniform Sampling
- Idea: if an edge is k-strong, then it is in a k-connected graph
- so it is safe to sample it with probability 1/k
- Problem: if we sample edges with different probabilities, E[cut value] gets messy
- Solution: if we sample e with probability p_e, give it weight 1/p_e
- Then E[cut value] = original cut value
92 Compression Theorem
- Definition: given compression probabilities p_e, the compressed graph G(p_e)
- includes edge e with probability p_e, and
- gives it weight 1/p_e if included
- Note: E[G(p_e)] = G
- Theorem: G(ρ/c_e)
- approximates all cuts by ε
- has O(ρn) edges
- (a sketch of the construction follows)
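A sketch of the compression construction, assuming the edge strengths are already known (the Construction slide below addresses how to approximate them):

```python
import random

def compress(edges, strength, rho):
    """Include edge e with p_e = min(1, rho / c_e) and weight 1/p_e, where
    c_e = strength[e] is e's strong connectivity (a sketch).

    Every cut's expected weight equals its original value, and the result
    has O(rho * n) edges by the strength lemma."""
    result = []
    for e in edges:
        p = min(1.0, rho / strength[e])
        if random.random() < p:
            result.append((e, 1.0 / p))     # (edge, assigned weight)
    return result
```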
93 Proof (approximation)
- Basic idea: in a k-strong component, edges get sampled with probability ρ/k
- so the original sampling theorem works
- Problem: some edges may be in stronger components, sampled less
- Induct up from the strongest components:
- apply the original sampling theorem inside
- then freeze them, so they don't affect weaker parts
94 Strength Lemma
- Lemma: ∑ 1/c_e ≤ n
- Consider a connected component C of G
- Suppose C has min-cut k
- Then every edge e in C has c_e ≥ k
- So the k edges crossing C's min-cut have
- ∑ 1/c_e ≤ ∑ 1/k = k (1/k) = 1
- Delete these edges (cost 1)
- Repeat n − 1 times: no more edges!
95 Proof (edge count)
- Edge e is included with probability ρ/c_e
- So the expected number is ∑ ρ/c_e
- We saw ∑ 1/c_e ≤ n
- So the expected number is at most ρn
96 Construction
- To sample, must find edge strengths
- can't, but approximation suffices
- Sparse certificates identify weak edges
- construct in linear time [NI]
- contain all edges crossing cuts of value ≤ k
- iterate until strong components emerge
- Iterate for 2^i-strong edges, for all i
- tricks turn it strongly polynomial
97 Certificate Algorithm
- Repeat k times (sketched below):
- find a spanning forest
- delete it
- Each iteration deletes one edge from every cut (the forest is spanning)
- So at the end, any edge crossing a cut of size ≤ k has been deleted
- [NI]: merge all iterations in O(m) time
98 Flows
- Uniform sampling led to flow algorithms
- randomly partition edges
- merge flows from each partition element
- Compression is problematic
- edge capacities changed
- so flow path capacities are distorted
- flow in the compressed graph doesn't fit in the original graph
99 Smoothing
- If an edge has strength c_e, divide it into βρ/c_e edges of capacity c_e/(βρ)
- Creates βρ ∑ 1/c_e ≤ βρn edges
- Now each edge is only a 1/(βρ) fraction of any cut of its strong component
- So sampling a 1/β fraction works
- So dividing into β groups works
- Yields a (1 − ε) max-flow in time …
100 Cleanup
- Approximate max-flow can be made exact by augmenting paths
- Integrality problems:
- augmenting paths are fast for small integer flows
- but the breakup by smoothing ruins integrality
- Surmountable:
- flows in the dense and sparse parts are separable
- Result: max-flow in O(n^{11/9} v) time
101 Proof By Picture
- (figure: graph with dense regions and terminals s, t)
102 Compress Dense Regions
- (figure)
103 Solve Sparse Flow
- (figure)
104 Replace Dense Parts (keep flow in sparse bits)
- (figure)
105 Fill In Dense Parts
- (figure)
106 Conclusions
107 Conclusion
- Randomization is a crucial tool for algorithm design
- Often yields algorithms that are faster or simpler than their traditional counterparts
- In particular, gives significant improvements for core problems in graph algorithms
108 Randomized Methods
- Random selection
- if most candidate choices are good, then a random choice is probably good
- Monte Carlo simulation
- simulations estimate event likelihoods
- Random sampling
- generate a small random subproblem
- solve, extrapolate to the whole problem
- Randomized rounding for approximation
109 Random Selection
- When most choices are good, pick one at random
- Recursive contraction algorithm for minimum cuts
- Extremely simple (also to implement)
- Fast in theory and in practice [CGKLS]
110 Monte Carlo
- To estimate an event's likelihood, run trials
- Slow for very rare events
- Bias samples to reveal the rare event
- FPRAS for network reliability
111 Random Sampling
- Generate a representative subproblem
- Use it to estimate the solution to the whole
- gives an approximate solution
- may be quickly repaired to an exact solution
- Bias the sample toward important or sensitive parts of the problem
- New max-flow and min-cut algorithms
112 Randomized Rounding
- Convert fractional solutions to integral ones
- Get approximation algorithms for integer programs
- Sampling from a well-designed sample space of feasible solutions
- Good approximations for network design
113 Generalization
- Our techniques work because undirected graphs are matroids
- All our results extend to / are special cases of
- packing bases
- finding minimum quotients
- matroid optimization (MST)
114 Directed Graphs?
- Directed graphs are not matroids
- Directed graphs can have lots of minimum cuts
- Sampling doesn't appear to work
- Residual graphs for flows are directed
- precludes obvious recursive solutions to flow problems
115 Open problems
- Flow in O(nv) time (complete the m ⇒ n reduction)
- eliminate the v dependence
- apply to weighted graphs with large flows
- flow in O(m) time?
- Las Vegas algorithms
- finding good certificates
- Deterministic algorithms
- deterministic construction of samples
- deterministically compress a graph
116 Randomization in Graph Optimization Problems
- David Karger
- MIT
- http://theory.lcs.mit.edu/karger
- karger@mit.edu