Title: Randomization in Graph Optimization Problems
1. Randomization in Graph Optimization Problems
- David Karger
- MIT
- http://theory.lcs.mit.edu/karger
2. Randomized Algorithms
- Flip coins to decide what to do next
- Avoid hard work of making right choice
- Often faster and simpler than deterministic algorithms
- Different from average-case analysis
- Input is worst case
- Algorithm adds randomness
3. Methods
- Random selection
- if most candidate choices good, then a random choice is probably good
- Random sampling
- generate a small random subproblem
- solve, extrapolate to whole problem
- Monte Carlo simulation
- simulations estimate event likelihoods
- Randomized Rounding for approximation
4. Cuts in Graphs
- Focus on undirected graphs
- A cut is a partition of the vertices into two sets
- Value is the number (or total weight) of crossing edges
5. Optimization with Cuts
- Cut values determine the solution of many graph optimization problems
- min-cut / max-flow
- multicommodity flow (sort-of)
- bisection / separator
- network reliability
- network design
- Randomization helps solve these problems
6. Presentation Assumption
- For entire presentation, we consider unweighted graphs (all edges have weight/capacity one)
- All results apply unchanged to arbitrarily weighted graphs
- Integer weights: treat as parallel edges
- Rational weights: scale to integers
- Analysis unaffected
- Some implementation details differ
7. Basic Probability
- Conditional probability
- Pr[A ∩ B] = Pr[A] · Pr[B | A]
- Independent events multiply
- Pr[A ∩ B] = Pr[A] · Pr[B]
- Union Bound
- Pr[X ∪ Y] ≤ Pr[X] + Pr[Y]
- Linearity of expectation
- E[X + Y] = E[X] + E[Y]
8. Random Selection for Minimum Cuts
- Random choices are good when problems are rare
9. Minimum Cut
- Smallest cut of graph
- Cheapest way to separate into 2 parts
- Various applications
- network reliability (small cuts are weakest)
- subtour elimination constraints for TSP
- separation oracle for network design
- Not s-t min-cut
10. Max-flow/Min-cut
- s-t flow: edge-disjoint packing of s-t paths
- s-t cut: a cut separating s and t
- [FF] s-t max-flow = s-t min-cut
- max-flow saturates all s-t min-cuts
- most efficient way to find s-t min-cuts
- [GH] global min-cut is the smallest all-pairs s-t min-cut
- find using n flow computations
11. Flow Algorithms
- Push-relabel [GT]:
- push excess around the graph till it's gone
- max-flow in O(mn) (note: O hides log factors)
- recent O(m^(3/2)) [GR]
- min-cut in O(mn^2) --- harder than flow
- Pipelining [HO]:
- save push/relabel data between flows
- min-cut in O(mn) --- as easy as flow
12. Contraction
- Find an edge that doesn't cross the min-cut
- Contract (merge) endpoints to 1 vertex
13. Contraction Algorithm
- Repeat n - 2 times
- find non-min-cut edge
- contract it (keep parallel edges)
- Each contraction decrements the number of vertices
- At end, 2 vertices left
- unique cut
- corresponds to min-cut of starting graph
15. Picking an Edge
- Must contract non-min-cut edges
- [NI] O(m)-time algorithm to pick an edge
- n contractions: O(mn) time for min-cut
- slightly faster than flows
- If only we could find an edge faster…
- Idea: min-cut edges are few
16. Randomize
- Repeat until 2 vertices remain
- pick a random edge
- contract it
- (keep fingers crossed)
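A minimal Python sketch of one contraction trial, using the plain array-of-edges representation mentioned on the next slides; the function name contract_once and the union-find bookkeeping are illustrative choices, not from the talk:

```python
import random

def contract_once(edges, n):
    """One trial of the randomized contraction algorithm on a connected
    multigraph with vertices 0..n-1 (parallel edges allowed).
    Returns the value of the cut left when 2 super-vertices remain."""
    parent = list(range(n))                  # union-find over merged vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    pool = list(edges)
    vertices = n
    while vertices > 2:
        u, v = random.choice(pool)           # pick a uniformly random edge
        ru, rv = find(u), find(v)
        if ru == rv:                         # self-loop after earlier merges
            pool.remove((u, v))              # discard it and try again
            continue
        parent[ru] = rv                      # contract: merge the endpoints
        vertices -= 1
    # edges whose endpoints lie in different super-vertices cross the cut
    return sum(1 for u, v in pool if find(u) != find(v))
```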
17. Analysis I
- Min-cut is small---few edges
- Suppose graph has min-cut c
- Then minimum degree at least c
- Thus at least nc/2 edges
- Random edge is probably safe
- Pr[min-cut edge] ≤ c/(nc/2) = 2/n
- (easy generalization to capacitated case)
18. Analysis II
- Algorithm succeeds if it never accidentally contracts a min-cut edge
- Contracts vertices from n down to 2
- When k vertices remain, chance of error is 2/k
- thus, chance of being right is 1 - 2/k
- Pr[always right] is the product of the probabilities of being right each time
19. Analysis III
- Pr[always right] ≥ (1 - 2/n)(1 - 2/(n-1)) ⋯ (1 - 2/3)
- = [(n-2)/n] · [(n-3)/(n-1)] ⋯ [1/3] (the product telescopes)
- = 2/(n(n-1)) ≈ 2/n^2
- not too good!
20. Repetition
- Repetition amplifies success probability
- basic failure probability 1 - 2/n^2
- so repeat 7n^2 times: failure probability (1 - 2/n^2)^(7n^2) ≈ e^(-14) < 10^(-6) (sketch below)
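A short sketch of the amplification step, reusing contract_once from above (assumption: the 7n^2 trial count from this slide):

```python
def min_cut_by_repetition(edges, n):
    # Each trial succeeds with probability >= 2/(n(n-1)) > 2/n^2, so all
    # 7n^2 independent trials fail with probability <= (1 - 2/n^2)^(7n^2)
    # ~ e^-14 < 10^-6.  Keep the smallest cut value seen.
    return min(contract_once(edges, n) for _ in range(7 * n * n))
```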
21. How fast?
- Easy to perform 1 trial in O(m) time
- just use array of edges, no data structures
- But need ~n^2 trials: O(mn^2) time
- Simpler than flows, but slower
22. An improvement [KS]
- When k vertices, error probability 2/k
- big when k small
- Idea once k small, change algorithm
- algorithm needs to be safer
- but can afford to be slower
- Amplify by repetition!
- Repeat base algorithm many times
23. Recursive Algorithm
- Algorithm RCA(G, n):
- (G has n vertices)
- repeat twice:
- randomly contract G to n/√2 vertices
- RCA(G, n/√2) (sketched below)
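A sketch of RCA in Python under the same representation as before; the helper contract_to (partial contraction with relabeling) and the small base case are illustrative choices, not spelled out on the slide:

```python
import math
import random

def contract_to(edges, n, target):
    """Randomly contract a connected multigraph on vertices 0..n-1 until
    `target` super-vertices remain; return the edge list relabeled to
    0..target-1, with self-loops dropped (parallel edges kept)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    pool, vertices = list(edges), n
    while vertices > target:
        u, v = random.choice(pool)
        ru, rv = find(u), find(v)
        if ru == rv:
            pool.remove((u, v))              # discard self-loop
            continue
        parent[ru] = rv
        vertices -= 1
    label, out = {}, []
    for u, v in pool:
        ru, rv = find(u), find(v)
        if ru != rv:
            for r in (ru, rv):
                label.setdefault(r, len(label))
            out.append((label[ru], label[rv]))
    return out

def rca(edges, n):
    """Recursive contraction: contract to n/sqrt(2) vertices twice,
    recurse on each copy, return the smaller cut found."""
    if n <= 6:                               # base case: finish by contraction
        return min(len(contract_to(edges, n, 2)) for _ in range(n * n))
    target = int(math.ceil(n / math.sqrt(2)))
    return min(rca(contract_to(edges, n, target), target) for _ in range(2))
```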
24. Main Theorem
- On any capacitated, undirected graph, Algorithm RCA:
- runs in O(n^2) time with simple structures
- finds min-cut with probability ≥ 1/log n
- Thus, O(log n) repetitions suffice to find the minimum cut (failure probability 10^(-6)) in O(n^2 log^2 n) time.
25. Proof Outline
- Graph has O(n^2) (capacitated) edges
- So O(n^2) work to contract, then two subproblems of size n/√2
- T(n) = 2T(n/√2) + O(n^2) = O(n^2 log n)
- Algorithm fails if both iterations fail
- Iteration succeeds if its contractions and its recursion succeed
- P(n) ≥ 1 - [1 - ½·P(n/√2)]^2 = Ω(1/log n)
26. Failure Modes
- Monte Carlo algorithms always run fast and probably give you the right answer
- Las Vegas algorithms probably run fast and always give you the right answer
- To make a Monte Carlo algorithm Las Vegas, need a way to check the answer
- repeat till answer is right
- No fast min-cut check known (flow is slow!)
27. How do we verify a minimum cut?
28. Enumerating Cuts
- The probabilistic method, backwards
29. Cut Counting
- Original contraction algorithm finds any given min-cut with probability at least 2/(n(n-1))
- Only one cut found per run
- Disjoint events, so probabilities add
- So at most n(n-1)/2 min-cuts
- else probabilities would sum to more than one
- Tight
- Cycle has exactly this many min-cuts
30. Enumeration
- RCA as stated has constant probability of finding any given min-cut
- If run O(log n) times, probability of missing a min-cut drops to 1/n^3
- But only n^2 min-cuts
- So probability of missing any is at most 1/n
- So, with probability 1 - 1/n, find all
- O(n^2 log^3 n) time
31. Generalization
- If G has min-cut c, a cut of value αc is an α-mincut
- Lemma: the contraction algorithm finds any given α-mincut with probability Ω(n^(-2α))
- Proof: just add a factor to the basic analysis
- Corollary: O(n^(2α)) α-mincuts
- Corollary: can find all in O(n^(2α)) time
- just change the contraction factor in RCA
32. Summary
- A simple fast min-cut algorithm
- Random selection avoids rare problems
- Generalization to near-minimum cuts
- Bound on number of small cuts
- Probabilistic method, backwards
33. Random Sampling
34. Random Sampling
- General tool for faster algorithms
- pick a small, representative sample
- analyze it quickly (small)
- extrapolate to original (representative)
- Speed-accuracy tradeoff
- smaller sample means less time
- but also less accuracy
35. A Polling Problem
- Population of size m
- Subset of c red members
- Goal: estimate c
- Naïve method: check whole population
- Faster method: sampling
- Choose random subset of population
- Use relative frequency in sample as estimate for frequency in population
36. Analysis: Chernoff Bound
- Random variables X_i ∈ [0,1]
- Sum X = Σ X_i
- Bound deviation from expectation:
- Pr[|X - E[X]| ≥ ε·E[X]] < exp(-ε^2·E[X]/4)
- Probably, X ∈ (1 ± ε)·E[X]
- If E[X] ≥ 4(ln n)/ε^2, tight concentration:
- deviation by ε has probability < 1/n
37. Application to Polling
- Choose each member with probability p
- Let X be total number of reds seen
- Then E[X] = pc
- So estimate c by X/p
- Note: estimate accurate to within 1 ± ε iff X is within 1 ± ε of its expectation:
- X/p ∈ (1 ± ε)·E[X]/p = (1 ± ε)·c
38. Analysis
- Let X_i = 1 if i-th red member is chosen, else 0
- Then X = Σ X_i
- Chernoff bound applies:
- Pr[deviation by ε] < exp(-ε^2·pc/4)
- < 1/n if pc > 4(ln n)/ε^2
- Pretty tight:
- if pc < 1, likely no red samples
- so no meaningful estimate
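A tiny simulation of the polling estimator (names illustrative); with pc ≥ 4(ln n)/ε^2 the estimate X/p lands in (1 ± ε)c with high probability:

```python
import random

def estimate_reds(population, p):
    """population: list of booleans (True = red).  Sample each member
    independently with probability p; return X/p as the estimate of c."""
    x = sum(1 for is_red in population if is_red and random.random() < p)
    return x / p
```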
39. Sampling for Min-Cuts
40. Min-cut Duality
- [Edmonds] min-cut = max tree packing
- convert to directed graph
- source vertex s (doesn't matter which)
- pack spanning trees directed away from s
- [Gabow] augmenting trees:
- add a tree in O(m) time
- min-cut c (via max packing) in O(mc)
- great if m and c are small
41. Example
- (figure) min-cut 2; two directed spanning trees; directed min-cut 2
42. Random Sampling
- Gabow's algorithm great if m, c small
- Random sampling:
- reduces m, c
- scales cut values (in expectation)
- if we pick half the edges, we expect half of each cut
- So find tree packings, cuts in samples
- Problem: maybe some large deviations
43. Sampling Theorem
- Given graph G, build a sample G(p) by including each edge with probability p
- Cut of value v in G has expected value pv in G(p)
- Definition: constant ρ = 8(ln n)/ε^2
- Theorem: with high probability, all exponentially many cuts in G(ρ/c) have (1 ± ε) times their expected values.
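Building G(p) is one line; a sketch for concreteness:

```python
import random

def sample_graph(edges, p):
    # G(p): keep each edge independently with probability p.  With
    # p = rho/c, every cut is within (1 +/- eps) of p times its value, whp.
    return [e for e in edges if random.random() < p]
```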
44. A Simple Application
- Gabow packs trees in O(mc) time
- Build G(ρ/c):
- minimum expected cut ρ
- by theorem, min-cut probably near ρ
- find min-cut in O(ρm) time using Gabow
- corresponds to a near-min-cut in G
- Result: (1 + ε) times min-cut in O(m/ε^2) time
45. Proof of Sampling: Idea
- Chernoff bound says probability of large deviation in cut value is small
- Problem: exponentially many cuts. Perhaps some deviate a great deal
- Solution: we showed there are few small cuts
- only small cuts likely to deviate much
- but they are few, so the Chernoff bound applies
46. Proof of Sampling
- Sampled with probability ρ/c, a cut of value αc has mean αρ
- Chernoff: deviates from expected size by more than ε with probability at most n^(-3α)
- At most n^(2α) cuts have value αc
- Pr[any cut of value αc deviates] = O(n^(-α))
- Sum over all α ≥ 1
47. Las Vegas Algorithms
- Finding Good Certificates
48. Approximate Tree Packing
- Break edges into c/ρ random groups
- Each looks like a sample at rate ρ/c:
- O(ρm/c) edges
- each has min expected cut ρ
- so theorem says min-cut ≥ (1 - ε)ρ
- So each group has a packing of size (1 - ε)ρ
- Gabow finds it in time O(ρ^2·m/c) per group
- so overall time is (c/ρ)·O(ρ^2·m/c) = O(ρm)
49. Las Vegas Algorithm
- Packing algorithm is Monte Carlo
- Previously found approximate cut (faster)
- If close, each certifies the other:
- the cut found is an upper bound on the optimum
- the packing found is a lower bound on the optimum
- If not, re-run both
- Result: Las Vegas, expected time O(ρm)
50. Exact Algorithm
- Randomly partition edges into two groups
- each like a ½-sample: ε = O(c^(-1/2))
- Recursively pack trees in each half
- c/2 - O(√c) trees each
- Merge packings
- gives packing of size c - O(√c)
- augment to maximum packing: O(m√c)
- T(m, c) = 2T(m/2, c/2) + O(m√c) = O(m√c)
51. Nearly Linear Time
52. Analyze Trees
- Recall: G packs c (directed) edge-disjoint spanning trees
- Corollary: in such a packing, some tree crosses the min-cut only twice
- To find min-cut:
- find tree packing
- find smallest cut with ≤ 2 tree edges crossing
53. Constraint trees
- Min-cut c
- c directed trees
- 2c directed min-cut edges
- On average, two min-cut edges per tree
- Definition: a tree 2-crosses a cut if at most two of its edges cross it
54. Finding the Cut
- From the crossing tree edges, deduce the cut:
- remove the tree edges
- no other edges cross
- so each component is entirely on one side
- and opposite its neighbors' side
55. Two Problems
- Packing trees takes too long:
- Gabow's runtime is O(mc)
- Too many trees to check:
- only claimed that one (of c) is good
- Solution: sampling
56. Sampling
- Use G(ρ/c) with ε = 1/8
- pack O(ρ) trees in O(m) time
- original min-cut has ≤ (1 + ε)ρ edges in G(ρ/c)
- some tree 2-crosses it in G(ρ/c)
- and thus 2-crosses it in G
- Analyze O(ρ) trees in G
- time O(m) per tree
- Monte Carlo
57. Simple First Step
- Discuss the case where one tree edge crosses the min-cut
58. Analyzing a tree
- Root the tree; a cut crossed by one tree edge cuts off a subtree
- Use a dynamic program up from the leaves to determine subtree cuts efficiently
- Given cuts at the children of a node, compute the cut at the parent
- Definitions:
- v↓ is the set of nodes below v
- C(v) is the value of the cut at subtree v↓
59. The Dynamic Program
- (figure: cuts at children v, w combine at parent u; edges with least common ancestor u are either kept or discarded)
60. Algorithm: 1-Crossing Trees
- Compute edges' LCAs: O(m)
- Compute cuts at leaves:
- cut values = degrees
- each edge incident on at most two leaves
- total time O(m)
- Dynamic program upwards: O(n)
- Total: O(m + n) (sketch below)
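A sketch of this dynamic program in Python, using the identity C(v) = Σ over u in v↓ of [deg(u) - 2·(#edges with LCA u)], so one bottom-up pass suffices; the naive O(depth) LCA and the parameter names are illustrative simplifications of the O(m) LCA step:

```python
from collections import deque

def subtree_cut_values(n, graph_edges, tree_parent, root):
    """For each node v of a rooted spanning tree (tree_parent[v] = parent,
    tree_parent[root] ignored), return C[v] = number of graph edges
    crossing the cut (v-down, rest)."""
    children = [[] for _ in range(n)]
    for v in range(n):
        if v != root:
            children[tree_parent[v]].append(v)
    depth = [0] * n                          # BFS depths from the root
    q = deque([root])
    while q:
        u = q.popleft()
        for w in children[u]:
            depth[w] = depth[u] + 1
            q.append(w)

    def lca(u, v):                           # naive: walk the deeper node up
        while u != v:
            if depth[u] < depth[v]:
                u, v = v, u
            u = tree_parent[u]
        return u

    # f[u] = deg(u) - 2 * (#edges whose LCA is u); summing f over a subtree
    # counts each crossing edge once and each internal edge zero times
    f = [0] * n
    for u, v in graph_edges:
        f[u] += 1
        f[v] += 1
        f[lca(u, v)] -= 2
    C = f[:]
    for u in sorted(range(n), key=lambda x: -depth[x]):   # deepest first
        if u != root:
            C[tree_parent[u]] += C[u]
    return C                                 # min over v != root gives the cut
```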
61. 2-Crossing Trees
- Cut corresponds to two subtrees
- (figure: subtrees at v and w; edges kept or discarded)
- n^2 table entries C(v, w)
- fill in O(n^2) time with dynamic program
62. Linear Time
- Bottleneck is the C(v, w) computations
- Avoid: find the right twin w for each v
- Compute using addpath and minpath operations of dynamic trees [ST]
- Result: O(m log^3 n) time (messy)
63. How do we verify a minimum cut?
64. Network Design
65. Problem Statement
- Given vertices, and a cost c_vw to buy an edge from v to w, find the minimum cost purchase that creates a graph with desired connectivity properties
- Example: minimum-cost k-connected graph
- Generally NP-hard
- Recent approximation algorithms [GW, JV]
66. Integer Linear Program
- Variable x_vw = 1 if we buy edge vw
- Solution cost Σ x_vw·c_vw
- Constraint: for every cut, Σ x_vw ≥ k
- Relaxing integrality gives a tractable LP
- Exponentially many cuts
- But separation oracles exist (e.g., min-cut)
- What is the integrality gap?
67. Randomized Rounding
- Given LP solution values x_vw
- Build graph where vw is present with probability x_vw
- Expected cost is at most opt: Σ x_vw·c_vw
- Expected number of edges crossing any cut satisfies the constraint
- If the expected number is large for every cut, the sampling theorem applies (sketch below)
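A minimal rounding sketch (the dict-based names are illustrative):

```python
import random

def randomized_rounding(x, cost):
    """x: edge -> fractional LP value in [0,1]; cost: edge -> cost.
    Include each edge independently with probability x[e], so the
    expected cost is sum(x[e]*cost[e]) <= OPT and every cut's expected
    crossing count equals its (feasible) LP value."""
    chosen = [e for e in x if random.random() < x[e]]
    return chosen, sum(cost[e] for e in chosen)
```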
68. k-connected subgraph
- Fractional solution is k-connected
- So every cut has (expected) ≥ k edges crossing in the rounded solution
- Sampling theorem says every cut has at least k - (k log n)^(1/2) edges
- Close approximation for large k
- Can often repair: e.g., get k-connected subgraph at cost 1 + ((log n)/k)^(1/2) times min
69. Repair Methods
- Slightly increase all x_vw before rounding
- e.g., multiply by (1 + ε)
- Works fine, but some x_vw become > 1
- problem if we only want single use of an edge
- Round to approximation, then fix:
- solve the augmentation problem using other network design techniques
- may be a worse approximation, but only to a small part of the cost
70. Nonuniform Sampling
- Concentrate on the important things
- Benczur-Karger, Karger, Karger-Levine
71. s-t Min-Cuts
- Recall: if G has min-cut c, then in G(ρ/c) all cuts approximate their expected values to within ε
- Applications
- Trouble if c is small and v is large
72. The Problem
- Cut sampling relied on the Chernoff bound
- Chernoff bounds require that no one edge is a large fraction of the expectation of a cut it crosses
- If sample rate << 1/c, each edge across a min-cut is too significant
- But if an edge only crosses large cuts, then sample rate << 1/c is OK!
73. Biased Sampling
- Original sampling theorem weak when:
- m is large
- c is small
- But if m is large:
- then G has dense regions
- where c must be large
- where we can sample more sparsely
74. Problem / Old Time / New Time
- Approx. s-t min-cut: old O(mv), new O(nv/ε^2)
- Approx. s-t min-cut: old O(mn), new O(n^2/ε^2)
- Approx. s-t max-flow: old O(m^(3/2)), new O(mn^(1/2)/ε)
- Flow of value v: old O(mv), new O(nv)
- m ⇒ n/ε^2 in weighted, undirected graphs
75. Strong Components
- Definition: A k-strong component is a maximal vertex-induced subgraph with min-cut ≥ k.
- (figure: components of strength 2 and 3)
76. Nonuniform Sampling
- Definition: An edge is k-strong if its endpoints are in the same k-strong component.
- Stricter than k-connected endpoints.
- Definition: The strong connectivity c_e of edge e is the largest k for which e is k-strong.
- Plan: sample dense regions lightly
77. Nonuniform Sampling
- Idea: if an edge is k-strong, then it is inside a k-connected graph
- So safe to sample it with probability 1/k
- Problem: if we sample edges with different probabilities, E[cut value] gets messy
- Solution: if we sample e with probability p_e, give it weight 1/p_e
- Then E[cut value] = original cut value
78. Compression Theorem
- Definition: Given compression probabilities p_e, the compressed graph G[p_e]:
- includes edge e with probability p_e, and
- gives it weight 1/p_e if included
- Note E[G[p_e]] = G
- Theorem: G[ρ/c_e]
- approximates all cuts to within ε
- has O(ρn) edges
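A compression sketch under the theorem's parameters; the strength map (edge -> an approximation of c_e) is assumed given here, and computing it is the subject of slide 83:

```python
import math
import random

def compress(edges, strength, eps):
    """Include edge e with probability p_e = min(1, rho/c_e) and weight
    1/p_e, so cut values are preserved in expectation; by the strength
    lemma (sum of 1/c_e <= n) the result has O(rho * n) edges."""
    n = len({v for e in edges for v in e})
    rho = 8 * math.log(n) / eps ** 2
    out = []
    for e in edges:
        p = min(1.0, rho / strength[e])
        if random.random() < p:
            out.append((e, 1.0 / p))         # (edge, weight) pairs
    return out
```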
79. Application
- Compress graph to ρn = O(n/ε^2) edges
- Find s-t max-flow in the compressed graph
- Gives s-t min-cut in compressed graph
- So approximate s-t min-cut in original
- Assorted runtimes:
- [GT] O(mn) becomes O(n^2/ε^2)
- [FF] O(mv) becomes O(nv/ε^2)
- [GR] O(m^(3/2)) becomes O(n^(3/2)/ε^3)
80. Proof (approximation)
- Basic idea: in a k-strong component, edges get sampled with probability ρ/k
- original sampling theorem works
- Problem: some edges may be in stronger components, sampled less
- Induct up from the strongest components:
- apply original sampling theorem inside
- then freeze, so they don't affect weaker parts
81. Strength Lemma
- Lemma: Σ_e 1/c_e ≤ n
- Consider a connected component C of G
- Suppose C has min-cut k
- Then every edge e in C has c_e ≥ k
- So the k edges crossing C's min-cut have
- Σ 1/c_e ≤ Σ 1/k = k·(1/k) = 1
- Delete these edges (cost 1)
- Repeat: at most n - 1 times until no more edges!
82. Proof (edge count)
- Edge e included with probability ρ/c_e
- So the expected number is Σ ρ/c_e
- We saw Σ 1/c_e ≤ n
- So the expected number is at most ρn
83. Construction
- To sample, must find edge strengths
- can't, but approximation suffices
- Sparse certificates identify weak edges:
- construct in linear time [NI]
- contain all edges crossing cuts of value ≤ k
- iterate until strong components emerge
- Iterate for 2^i-strong edges, all i
- tricks turn it strongly polynomial
84. NI Certificate Algorithm
- Repeat k times:
- find a spanning forest
- delete it
- Each iteration deletes one edge from every cut (forest is spanning)
- So at the end, every edge crossing a cut of size ≤ k has been deleted
- [NI] pipeline all k iterations in O(m) time (simple version sketched below)
85. Approximate Flows
- Uniform sampling led to tree algorithms:
- randomly partition edges
- merge trees from each partition element
- Compression problematic for flow:
- edge capacities changed
- so flow path capacities distorted
- flow in compressed graph doesn't fit in original graph
86. Smoothing
- If edge has strength c_e, divide it into bρ/c_e edges of capacity c_e/(bρ)
- Creates bρ·Σ 1/c_e ≤ bρn edges
- Now each edge is only a 1/(bρ) fraction of any cut of its strong component
- So sampling a 1/b fraction works
- So dividing into b groups works
- Yields (1 - ε) max-flow in O(mn^(1/2)/ε) time
87. Exact Flow Algorithms
- Sampling from residual graphs
88. Residual Graphs
- Sampling can be used to approximate cuts and flows
- A non-maximum flow can be made maximum by augmenting paths
- But the residual graph is directed.
- Can sampling help?
- Yes, to a limited extent
89. First Try
- Suppose current flow has value f
- residual flow value is v - f
- Lemma: if all edges are sampled with probability ρv/(c(v - f)) then, w.h.p., all directed cuts are within ε of their expectations
- Original undirected sampling used ρ/c
- Expectations nonzero, so no empty cut
- So some augmenting path exists
90. Application
- When residual flow is i, seek an augmenting path in a sample of mρv/(ic) edges. Time O(mρv/(ic)).
- Sum over all i from v down to 1
- Total O(mρv(log v)/c) since Σ 1/i = O(log v)
- Here ε can be any constant < 1 (say ½)
- So ρ = O(log n)
- Overall runtime O(mv/c)
91. Proof
- Augmenting a unit of flow from s to t decrements the residual capacity of each s-t cut by exactly one
- (figure: augmentation from s to t)
92. Analysis
- Each s-t cut loses f edges, had at least v
- So it has at least (v - f)/v times as many edges as before
- But we increase the sampling probability by a factor of v/(v - f)
- So the expected number of sampled edges is no worse than before
- So Chernoff and union bound apply as before
93. Strong Connectivity
- Drawback of previous: dependence on minimum cut c
- Solution: use strong connectivities
- Initialize a = 1
- Repeat until done:
- sample edges with probabilities aρ/k_e
- look for an augmenting path
- if none found, double a
94. Analysis
- Theorem: if we sample with probabilities aρ/k_e and a > v/(v - f), then we find an augmenting path w.h.p.
- Runtime:
- a always within a factor of 2 of the right v/(v - f)
- as in compression, edge count O(aρn)
- so runtime O(ρn·Σ_i v/i) = O(nv)
95. Summary
- Nonuniform sampling for cuts and flows
- Approximate cuts in O(n^2) time
- for arbitrary flow value
- Max-flow in O(nv) time
- only useful for small flow value
- but does work for weighted graphs
- large flow open
96. Network Reliability
97. The Problem
- Input:
- graph G with n vertices
- edge failure probabilities
- for exposition, fix a single p
- Output:
- FAIL(p): the probability G is disconnected by edge failures
98. Approximation Algorithms
- Computing FAIL(p) is #P-complete [V]
- Exact algorithm seems unlikely
- Approximation scheme:
- given G, p, ε, outputs an ε-approximation
- may be randomized:
- succeed with high probability
- Fully polynomial (FPRAS) if runtime is polynomial in n, 1/ε
99. Monte Carlo Simulation
- Flip a coin for each edge, test the graph
- k failures in t trials ⇒ FAIL(p) ≈ k/t
- E[k/t] = FAIL(p)
- How many trials needed for confidence?
- bad luck on trials can yield a bad estimate
- clearly need at least 1/FAIL(p)
- Chernoff bound: O(1/(ε^2·FAIL(p))) trials suffice to give probable accuracy within ε
- Time O(m/(ε^2·FAIL(p))) (sketch below)
100. Chernoff Bound
- Random variables X_i ∈ [0,1]
- Sum X = Σ X_i
- Bound deviation from expectation:
- Pr[|X - E[X]| ≥ ε·E[X]] < exp(-ε^2·E[X]/4)
- If E[X] ≥ 4(log n)/ε^2, tight concentration:
- deviation by ε has probability < 1/n
- No one variable is a big part of E[X]
101. Application
- Let X_i = 1 if trial i is a failure, else 0
- Let X = X_1 + … + X_t
- Then E[X] = t·FAIL(p)
- Chernoff says X is within relative ε of E[X] with probability 1 - exp(-ε^2·t·FAIL(p)/4)
- So choose t to cancel the other terms
- High probability: t = O(log n/(ε^2·FAIL(p)))
- Deviation by ε with probability < 1/n
102. For network reliability
- Random edge failures
- Estimate FAIL(p) = Pr[graph disconnects]
- Naïve Monte Carlo simulation
- Chernoff bound---tight concentration:
- Pr[|X - E[X]| ≥ ε·E[X]] < exp(-ε^2·E[X]/4)
- O(log n/(ε^2·FAIL(p))) trials expect O(log n/ε^2) network failures---sufficient for Chernoff
- So estimate within ε in O(m/(ε^2·FAIL(p))) time
103. Rare Events
- When FAIL(p) is too small, it takes too long to collect sufficient statistics
- Solution: skew trials to make the interesting event more likely
- But in a way that lets you recover the original probability
104. DNF Counting
- Given a DNF formula (OR of ANDs):
- (e1 ∧ e2 ∧ e3) ∨ (e1 ∧ e4) ∨ (e2 ∧ e6)
- Each variable set true with probability p
- Estimate Pr[formula true]
- #P-complete
- [KL, KLM]: FPRAS
- skew to make true outcomes common
- time linear in formula size
105. Rewrite problem
- Assume p = 1/2
- Count satisfying assignments
- Satisfaction matrix:
- truth table with one column per clause
- S_ij = 1 if i-th assignment satisfies j-th clause
- We want the number of nonzero rows
106. Satisfaction Matrix
- (figure: example matrix with 3 nonzero rows)
- Randomly sampling rows won't work: there might be too few nonzero rows
107. New sample space
- Normalize each nonzero row to sum to one
- So the sum over all nonzero entries is the desired value
- Goal: estimate the average nonzero entry
- Method: sample random nonzero entries
108. Sampling Nonzeros
- We know the number of nonzeros per column:
- to satisfy a given clause, all variables in the clause must be true
- all other variables are unconstrained
- Estimate the average by random sampling:
- know the number of nonzeros per column
- so can pick a random column
- then pick a random true-for-column assignment (sketch below)
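A sketch of this estimator for p = 1/2, with clauses given as sets of variables that must be true (matching the monotone reliability formula); names are illustrative:

```python
import random

def dnf_count(clauses, nvars, samples):
    """Estimate the number of satisfying assignments = nonzero rows."""
    # nonzeros per column: clause j is satisfied by 2^(nvars - |clause j|)
    # assignments, so we can pick a uniformly random nonzero entry exactly
    weights = [2 ** (nvars - len(c)) for c in clauses]
    total = sum(weights)                     # total nonzero entries
    acc = 0.0
    for _ in range(samples):
        j = random.choices(range(len(clauses)), weights=weights)[0]
        assign = [random.random() < 0.5 for _ in range(nvars)]
        for v in clauses[j]:                 # force clause j true
            assign[v] = True
        # the row was normalized to 1, spread over its nonzero entries
        row = sum(1 for c in clauses if all(assign[v] for v in c))
        acc += 1.0 / row                     # sample value in [1/k, 1]
    return total * acc / samples             # rows = total * E[sample]
```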
109. Few Samples Needed
- Suppose k clauses
- Then E[sample] > 1/k:
- 1 ≤ #satisfied clauses ≤ k
- 1 ≥ sample value ≥ 1/k
- Summing O(k log n/ε^2) samples gives a large mean
- So Chernoff says the sample mean is probably a good estimate
110. Reliability Connection
- Reliability as DNF counting:
- variable per edge, true if edge fails
- a cut fails if all its edges do (AND of edge variables)
- the graph fails if some cut does (OR of cuts)
- FAIL(p) = Pr[formula true]
- Problem: the DNF has 2^n clauses
111. Focus on Small Cuts
- Fact: FAIL(p) > p^c
- Theorem: if p^c = 1/n^(2+δ) then Pr[a >α-mincut fails] < n^(-αδ)
- Corollary: FAIL(p) ≈ Pr[some ≤α-mincut fails], where α = 1 + 2/δ
- Recall: O(n^(2α)) α-mincuts
- Enumerate with RCA, run DNF counting
112. Review
- Contraction Algorithm:
- O(n^(2α)) α-mincuts
- enumerate in O(n^(2α)) time
113. Proof of Theorem
- Given p^c = 1/n^(2+δ)
- At most n^(2α) cuts have value αc
- Each fails with probability p^(αc) = 1/n^(α(2+δ))
- Pr[any cut of value αc fails] = O(n^(-αδ))
- Sum over all α ≥ 1
114. Algorithm
- RCA can enumerate all α-minimum cuts with high probability in O(n^(2α)) time
- Given the α-minimum cuts, can ε-estimate the probability one fails via Monte Carlo simulation for DNF counting (formula size O(n^(2α)))
- Corollary: when FAIL(p) < n^(-(2+δ)), can ε-approximate it in O(c·n^(2+4/δ)) time
115. Combine
- For large FAIL(p), naïve Monte Carlo
- For small FAIL(p), RCA/DNF counting
- Balance: ε-approximation in O(mn^(3.5)/ε^2) time
- Implementations show it is practical for hundreds of nodes
- Again, no way to verify correctness
116. Summary
- Naïve Monte Carlo simulation works well for common events
- Need to adapt for rare events
- Cut structure and DNF counting let us do this for network reliability
117. Conclusions
118. Conclusion
- Randomization is a crucial tool for algorithm design
- Often yields algorithms that are faster or simpler than their traditional counterparts
- In particular, gives significant improvements for core problems in graph algorithms
119. Randomized Methods
- Random selection:
- if most candidate choices good, then a random choice is probably good
- Monte Carlo simulation:
- simulations estimate event likelihoods
- Random sampling:
- generate a small random subproblem
- solve, extrapolate to whole problem
- Randomized rounding for approximation
120. Random Selection
- When most choices are good, make one at random
- Recursive contraction algorithm for minimum cuts
- Extremely simple (also to implement)
- Fast in theory and in practice [CGKLS]
121. Monte Carlo
- To estimate event likelihood, run trials
- Slow for very rare events
- Bias samples to reveal rare event
- FPRAS for network reliability
122. Random Sampling
- Generate representative subproblem
- Use it to estimate solution to whole
- Gives approximate solution
- May be quickly repaired to exact solution
- Bias sample toward important or sensitive parts of the problem
- New max-flow and min-cut algorithms
123. Randomized Rounding
- Convert fractional to integral solutions
- Get approximation algorithms for integer programs
- Sampling from a well-designed sample space of feasible solutions
- Good approximations for network design
124. Generalization
- Our techniques work because undirected graphs are matroids
- All our results extend to, or are special cases of:
- packing bases
- finding minimum quotients
- matroid optimization (MST)
125. Directed Graphs?
- Directed graphs are not matroids
- Directed graphs can have lots of minimum cuts
- Sampling doesn't appear to work
126. Open problems
- Flow in O(n^2) time:
- eliminate v dependence
- apply to weighted graphs with large flows
- Flow in O(m) time?
- Las Vegas algorithms:
- finding good certificates
- Deterministic algorithms:
- deterministic construction of samples
- deterministically compress a graph
127. Randomization in Graph Optimization Problems
- David Karger
- MIT
- http://theory.lcs.mit.edu/karger
- karger@mit.edu