Title: Adaptive annealing: a near-optimal connection between sampling and counting

1. Adaptive annealing: a near-optimal connection between sampling and counting
Daniel Štefankovič (University of Rochester), Santosh Vempala, Eric Vigoda (Georgia Tech)
2. Counting
independent sets, spanning trees, matchings, perfect matchings, k-colorings
3. Compute the number of independent sets
(hard-core gas model)
4. # independent sets = 7
independent set = subset S of vertices, no two in S are neighbors
5. # independent sets = 5598861
independent set = subset S of vertices, no two in S are neighbors
6. graph G → # independent sets in G
#P-complete; #P-complete even for 3-regular graphs
(Dyer, Greenhill, 1997)
7. graph G → # independent sets in G?
approximation, randomization
8. We would like to know Q
Goal: random variable Y such that P( (1−ε)Q ≤ Y ≤ (1+ε)Q ) ≥ 1−δ
Y gives a (1±ε)-estimate
9. (approx) counting ⇔ sampling
Valleau, Card '72 (physical chemistry), Babai '79 (for matchings and colorings), Jerrum, Valiant, V. Vazirani '86
The outcome of the JVV reduction: random variables X_1, X_2, ..., X_t such that
1) E[X_1 X_2 ⋯ X_t] = WANTED
2) the X_i are easy to estimate:
   V[X_i] / E[X_i]^2 = O(1)   (squared coefficient of variation, SCV)
10. (approx) counting ⇔ sampling
1) E[X_1 X_2 ⋯ X_t] = WANTED
2) the X_i are easy to estimate:
   V[X_i] / E[X_i]^2 = O(1)
Theorem (Dyer-Frieze '91)
O(t^2/ε^2) samples (O(t/ε^2) from each X_i) give a (1±ε)-estimator of WANTED with probability ≥ 3/4
11. JVV for independent sets
GOAL: given a graph G, estimate the number of independent sets of G
[figure: 1 / (# independent sets) = P(a uniformly random independent set is the empty set)]
12. JVV for independent sets
P(A ∩ B) = P(A) · P(B|A)
[figure: P(∅) written as a chain of four conditional probabilities, one per vertex; the factors are estimated by X_1, X_2, X_3, X_4]
X_i ∈ {0,1} and E[X_i] ≥ 1/2  ⇒  V[X_i] / E[X_i]^2 = O(1)
13. Self-reducibility for independent sets
[figure: the ratio of independent-set counts 5/7 for the example graph with one vertex removed]
14. Self-reducibility for independent sets
[figure: the same ratio 5/7, shown as the counts 7 and 5 of the two graphs]
15. Self-reducibility for independent sets
[figure: counts 7 and 5 for the example graph and its subgraph]
16. Self-reducibility for independent sets
[figure: the next ratio 3/5 after removing another vertex]
17. Self-reducibility for independent sets
[figure: counts 5 and 3 for the smaller graphs]
18. Self-reducibility for independent sets
[figure: chaining the counts 7, 5, 3, 2 — the telescoping product of the ratios recovers the number of independent sets]
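A small illustrative sketch (not from the slides): the telescoping product above, computed by brute force. The example graph here is an assumption — a 4-cycle, whose independent-set counts 7, 5, 3, 2 match the numbers in the figures.

```python
from itertools import combinations

def independent_sets(vertices, edges):
    """Enumerate all independent sets of the graph (vertices, edges) by brute force."""
    result = []
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            s = set(subset)
            if all(not (u in s and v in s) for (u, v) in edges):
                result.append(s)
    return result

# Hypothetical example graph: the 4-cycle (7 independent sets).
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Remove vertices one at a time; each ratio |IS(G - v)| / |IS(G)| equals
# P(v not in a uniformly random independent set of G).
ratios = []
vertices, edges = list(V), list(E)
while vertices:
    total = len(independent_sets(vertices, edges))
    v = vertices[-1]
    vertices = vertices[:-1]
    edges = [e for e in edges if v not in e]
    ratios.append(len(independent_sets(vertices, edges)) / total)

# The empty graph has exactly one independent set, so inverting the product
# of the ratios recovers |IS(G)|.
count = 1.0
for r in ratios:
    count /= r
print(ratios)   # [5/7, 3/5, 2/3, 1/2] for the 4-cycle
print(count)    # ≈ 7.0
```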
19. JVV: If we have a sampler oracle
[graph G → SAMPLER ORACLE → random independent set of G]
then FPRAS using O(n^2) samples.
20. JVV: If we have a sampler oracle
[graph G → SAMPLER ORACLE → random independent set of G]
then FPRAS using O(n^2) samples.
ŠVV: If we have a sampler oracle
[β, graph G → SAMPLER ORACLE → set from gas-model Gibbs at β]
then FPRAS using O(n) samples.
21. Application: independent sets
O(|V|) samples suffice for counting.
Cost per sample (Vigoda '01, Dyer-Greenhill '01): time O(|V|) for graphs of degree ≤ 4.
Total running time: O(|V|^2).
22. Other applications (total running time)
matchings: O(n^2 m) (using Jerrum, Sinclair '89)
spin systems: Ising model: O(n^2) for β < β_C (using Marinelli, Olivieri '95)
k-colorings: O(n^2) for k > 2Δ (using Jerrum '95)
23. easy = hot, hard = cold
24. Hamiltonian
[figure: configurations labeled with their Hamiltonian values 4, 2, 1, 0]
25. Big set Ω
Hamiltonian H: Ω → {0,...,n}
Goal: estimate |H^{-1}(0)|
|H^{-1}(0)| = E[X_1] ⋯ E[X_t]
26. Distributions between hot and cold
β = inverse temperature
β = 0 ↔ hot ↔ uniform on Ω
β = ∞ ↔ cold ↔ uniform on H^{-1}(0)
μ_β(x) ∝ exp(−H(x)·β)
(Gibbs distributions)
27. Distributions between hot and cold
μ_β(x) = exp(−H(x)·β) / Z(β)
Normalizing factor = partition function:
Z(β) = Σ_{x∈Ω} exp(−H(x)·β)
28. Partition function
have Z(0) = |Ω|,  want Z(∞) = |H^{-1}(0)|
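An illustrative sketch (an assumption, not spelled out on the slides): one concrete way to set this up for independent sets is to take the states to be all vertex subsets and H(S) = number of edges inside S, so that Z(0) = 2^|V| and |H^{-1}(0)| is the number of independent sets. The tiny 4-cycle instance is again hypothetical.

```python
from itertools import combinations
import math

# Hypothetical tiny instance: the 4-cycle again.
V = [0, 1, 2, 3]
E = [(0, 1), (1, 2), (2, 3), (3, 0)]

def H(S):
    """Number of edges with both endpoints in S (0 exactly when S is independent)."""
    return sum(1 for (u, v) in E if u in S and v in S)

# Omega = all subsets of V.
states = [set(c) for r in range(len(V) + 1) for c in combinations(V, r)]

def Z(beta):
    """Partition function Z(beta) = sum over states of exp(-H(x) * beta)."""
    return sum(math.exp(-H(S) * beta) for S in states)

print(Z(0.0))                                 # 16 = 2^|V| = |Omega|
print(sum(1 for S in states if H(S) == 0))    # 7  = |H^{-1}(0)| = # independent sets
```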
29. Assumption: we have a sampler oracle for μ_β
μ_β(x) = exp(−H(x)·β) / Z(β)
[β, graph G → SAMPLER ORACLE → subset of V drawn from μ_β]
30. Assumption: we have a sampler oracle for μ_β
μ_β(x) = exp(−H(x)·β) / Z(β)
W ← μ_β
31. Assumption: we have a sampler oracle for μ_β
μ_β(x) = exp(−H(x)·β) / Z(β)
W ← μ_β
X = exp(H(W)·(β − β'))
32. Assumption: we have a sampler oracle for μ_β
μ_β(x) = exp(−H(x)·β) / Z(β)
W ← μ_β
X = exp(H(W)·(β − β'))
can obtain the following ratio:
E[X] = Σ_{s∈Ω} μ_β(s)·X(s) = Z(β') / Z(β)
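A quick numerical check of the identity on this slide (illustrative only; the level sizes a_k below are made up): for W ← μ_β and X = exp(H(W)(β − β')), the expectation E[X] equals Z(β')/Z(β).

```python
import math

# Hypothetical level sizes a_k = |H^{-1}(k)| for a toy problem with H in {0,...,3}.
a = {0: 2, 1: 5, 2: 9, 3: 4}

def Z(beta):
    return sum(ak * math.exp(-k * beta) for k, ak in a.items())

beta, beta_prime = 0.5, 0.8

# mu_beta(H = k) = a_k e^{-k beta} / Z(beta); take E[X] exactly over the levels.
EX = sum((a[k] * math.exp(-k * beta) / Z(beta)) * math.exp(k * (beta - beta_prime))
         for k in a)

print(EX, Z(beta_prime) / Z(beta))   # the two values agree
```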
33. Our goal restated
Partition function: Z(β) = Σ_{x∈Ω} exp(−H(x)·β)
Goal: estimate Z(∞) = |H^{-1}(0)|
Z(∞) = Z(0) · [Z(β_1)/Z(β_0)] · [Z(β_2)/Z(β_1)] ⋯ [Z(β_t)/Z(β_{t-1})]
β_0 = 0 < β_1 < β_2 < ... < β_t = ∞
34. Our goal restated
Z(∞) = Z(0) · [Z(β_1)/Z(β_0)] · [Z(β_2)/Z(β_1)] ⋯ [Z(β_t)/Z(β_{t-1})]
Cooling schedule: β_0 = 0 < β_1 < β_2 < ... < β_t = ∞
How to choose the cooling schedule?
Minimize the length t, while satisfying
E[X_i] = Z(β_i)/Z(β_{i-1})   and   V[X_i]/E[X_i]^2 = O(1)
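A sketch of how the telescoping product would be estimated once a schedule is in hand. The interface is hypothetical: sample(beta) is assumed to return W ← μ_β and hamiltonian(W) to return H(W); the schedule is taken as given.

```python
import math

def estimate_ratio(sample, hamiltonian, schedule, samples_per_step=1000):
    """Estimate Z(beta_t)/Z(beta_0) along the given cooling schedule,
    multiplying per-step estimates of E[X_i] = Z(beta_i)/Z(beta_{i-1})."""
    product = 1.0
    for beta, beta_next in zip(schedule, schedule[1:]):
        if math.isinf(beta_next):
            # Final step to beta = infinity: X reduces to the indicator [H(W) = 0].
            xs = [1.0 if hamiltonian(sample(beta)) == 0 else 0.0
                  for _ in range(samples_per_step)]
        else:
            xs = [math.exp(hamiltonian(sample(beta)) * (beta - beta_next))
                  for _ in range(samples_per_step)]
        product *= sum(xs) / len(xs)
    return product

# |H^{-1}(0)| would then be estimated as Z(0) * estimate_ratio(...),
# with Z(0) = |Omega| known in advance (e.g. 2^|V| for independent sets).
```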
35. Parameters A and n
Z(β) = Σ_{x∈Ω} exp(−H(x)·β)
A = Z(0)
H: Ω → {0,...,n}
a_k = |H^{-1}(k)|
36. Parameters
A = Z(0),  H: Ω → {0,...,n}

                      A = Z(0)    n
independent sets      2^|V|       |E|
matchings             ≈ |V|!      |V|
perfect matchings     |V|!        |V|
k-colorings           k^|V|       |E|
37. Previous cooling schedules
A = Z(0),  H: Ω → {0,...,n}
β_0 = 0 < β_1 < β_2 < ... < β_t = ∞
Safe steps:
- β → β + 1/n
- β → β·(1 + 1/ln A)
- ln A → ∞
(Bezáková, Štefankovič, Vigoda, V. Vazirani '06)
Cooling schedules of length
O(n · ln A)   (Bezáková, Štefankovič, Vigoda, V. Vazirani '06)
O((ln n)(ln A))
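A sketch of the shape of such a non-adaptive schedule, built only from the safe steps listed above. The exact step conditions are in Bezáková et al. '06; stopping the multiplicative steps at β ≥ ln A before the final jump to ∞ is an assumption for illustration.

```python
import math

def safe_step_schedule(n, A):
    """Non-adaptive schedule: one additive step 0 -> 1/n, multiplicative steps
    beta -> beta*(1 + 1/ln A) until beta >= ln A, then the jump to infinity."""
    lnA = math.log(A)
    schedule = [0.0, 1.0 / n]
    beta = 1.0 / n
    while beta < lnA:
        beta *= 1.0 + 1.0 / lnA
        schedule.append(beta)
    schedule.append(float('inf'))
    return schedule

# Length grows roughly like (ln n)(ln A), matching the O((ln n)(ln A)) bound.
print(len(safe_step_schedule(n=100, A=2**100)))
```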
38. No better fixed schedule possible
A = Z(0),  H: Ω → {0,...,n}
A schedule that works for all partition functions
Z_a(β) = 1 + a·e^{−βn}   (with a ∈ [0, A−1])
has LENGTH ≥ Ω((ln n)(ln A))
39. Parameters
A = Z(0),  H: Ω → {0,...,n}
Our main result: can get an adaptive schedule of length O((ln A)^{1/2})
Previously: non-adaptive schedules of length Ω(ln A)
40. Related work
can get an adaptive schedule of length O((ln A)^{1/2})
Lovász-Vempala: volume of convex bodies in O(n^4); schedule of length O(n^{1/2})
(non-adaptive cooling schedule)
41. Existential part
Lemma: for every partition function there EXISTS a cooling schedule of length O((ln A)^{1/2})
42. Express the SCV using the partition function
(going from β to β')
W ← μ_β,  X = exp(H(W)·(β − β'))
E[X^2] / E[X]^2 = Z(2β'−β)·Z(β) / Z(β')^2 ≤ C
43. Proof
f(β) = ln Z(β)
[figure: the points β, β', 2β'−β on the graph of the convex function f; the condition E[X^2]/E[X]^2 ≤ C becomes f(β) + f(2β'−β) − 2f(β') ≤ ln C, i.e. the chord midpoint lies at most (ln C)/2 above f(β')]
44. Proof
f(β) = ln Z(β)
f is decreasing, f is convex, |f'(0)| ≤ n, f(0) ≤ ln A
In each step either f or f' changes a lot:
let K = Δf; then Δ(ln |f'|) ≥ 1/K
45. f: [a,b] → ℝ, convex, decreasing, can be approximated using
≈ √( (f(a) − f(b)) · ln( f'(a) / f'(b) ) )   segments
46. Technicality: getting to 2β'−β
Proof
[figure: the points β, β', 2β'−β]
47. Technicality: getting to 2β'−β
Proof
[figure: the points β_i, β, β', 2β'−β, β_{i+1}]
48. Technicality: getting to 2β'−β
Proof
[figure: the points β_i, β, β', 2β'−β, β_{i+1}, β_{i+2}]
49. Technicality: getting to 2β'−β
Proof
ln ln A extra steps
[figure: the points β_i, β, β', 2β'−β, β_{i+1}, β_{i+2}, β_{i+3}]
50. Existential → Algorithmic
51. Algorithmic construction
Our main result: using a sampler oracle for μ_β we can construct a cooling schedule of length ≤ 38·(ln A)^{1/2}·(ln ln A)·(ln n)
Total number of oracle calls ≤ 10^7 · (ln A) · (ln ln A + ln n)^7 · ln(1/δ)
52. Algorithmic construction
current inverse temperature β
ideally move to β' such that
B_1 ≤ E[X^2]/E[X]^2 ≤ B_2,   where E[X] = Z(β')/Z(β)
53. Algorithmic construction
current inverse temperature β; ideally move to β' such that B_1 ≤ E[X^2]/E[X]^2 ≤ B_2
the upper bound ⇒ X is easy to estimate
54. Algorithmic construction
current inverse temperature β; ideally move to β' such that B_1 ≤ E[X^2]/E[X]^2 ≤ B_2
the lower bound ⇒ we make progress (assuming B_1 > 1)
55. Algorithmic construction
current inverse temperature β; ideally move to β' such that B_1 ≤ E[X^2]/E[X]^2 ≤ B_2
need to construct a feeler for this
56. Algorithmic construction
current inverse temperature β; ideally move to β' such that B_1 ≤ E[X^2]/E[X]^2 ≤ B_2
need to construct a feeler for E[X^2]/E[X]^2 = Z(β)·Z(2β'−β) / Z(β')^2
57. Algorithmic construction
current inverse temperature β; ideally move to β' such that B_1 ≤ E[X^2]/E[X]^2 ≤ B_2
need to construct a feeler for Z(β)·Z(2β'−β) / Z(β')^2   (bad feeler)
58. Rough estimator for Z(β)/Z(β')
For W ← μ_β we have P(H(W)=k) = a_k·e^{−βk} / Z(β)
59. Rough estimator for Z(β)/Z(β')
If H = k is likely at both β and β' → rough estimator:
For W ← μ_β:   P(H(W)=k) = a_k·e^{−βk} / Z(β)
For U ← μ_{β'}: P(H(U)=k) = a_k·e^{−β'k} / Z(β')
60. Rough estimator for Z(β)/Z(β')
For W ← μ_β:   P(H(W)=k) = a_k·e^{−βk} / Z(β)
For U ← μ_{β'}: P(H(U)=k) = a_k·e^{−β'k} / Z(β')
⇒  [ P(H(U)=k) / P(H(W)=k) ] · e^{k(β'−β)} = Z(β) / Z(β')
61. Rough estimator for Z(β)/Z(β')
For W ← μ_β we have P(H(W) ∈ [c,d]) = Σ_{k=c}^{d} a_k·e^{−βk} / Z(β)
62. Rough estimator for Z(β)/Z(β')
If (β'−β)·(d−c) ≤ 1 then
e^{−1} ≤ [ P(H(U) ∈ [c,d]) / P(H(W) ∈ [c,d]) ] · e^{c(β'−β)} · [ Z(β') / Z(β) ] ≤ e,
i.e. the quantity estimates Z(β)/Z(β') within a factor of e.
We also need P(H(U) ∈ [c,d]) and P(H(W) ∈ [c,d]) to be large.
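A sketch of the interval "feeler" just described, reusing the hypothetical sample(beta)/hamiltonian interface from the earlier sketches. It is only a rough (within a factor ≈ e) estimate, and the interval [c, d] must be heavy at both temperatures so that the empirical counts are non-zero.

```python
import math

def feeler(sample, hamiltonian, beta, beta_prime, c, d, m=1000):
    """Rough estimate of Z(beta)/Z(beta') using the interval [c, d];
    accurate within a factor of about e when (beta' - beta) * (d - c) <= 1."""
    hits_at_beta = sum(c <= hamiltonian(sample(beta)) <= d for _ in range(m))
    hits_at_beta_prime = sum(c <= hamiltonian(sample(beta_prime)) <= d for _ in range(m))
    # P(H in [c,d]) at beta' over the same at beta, times e^{c(beta'-beta)}.
    return (hits_at_beta_prime / hits_at_beta) * math.exp(c * (beta_prime - beta))
```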
63. Split {0,1,...,n} into h ≈ 4·(ln n)·(ln A) intervals
{0}, {1}, {2}, ..., {c}, [c, c·(1+1/ln A)), ...
For any inverse temperature β there exists an interval I with P(H(W) ∈ I) ≥ 1/(8h).
We say that I is HEAVY for β.
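A sketch of one way to build these intervals: singletons up to a crossover point, then intervals whose right endpoints grow by a factor (1 + 1/ln A). The crossover point used here (≈ ln A) is an assumption for illustration, not taken from the slides.

```python
import math

def make_intervals(n, A):
    """Split {0, 1, ..., n} into singletons {0},...,{c} followed by intervals
    whose right endpoints grow by a factor (1 + 1/ln A)."""
    lnA = math.log(A)
    c = min(n, max(1, int(lnA)))        # hypothetical crossover point
    intervals = [(k, k) for k in range(c + 1)]
    lo = c
    while lo < n:
        hi = min(n, max(lo + 1, int(lo * (1.0 + 1.0 / lnA))))
        intervals.append((lo + 1, hi))
        lo = hi
    return intervals

print(len(make_intervals(n=1000, A=2**100)))   # on the order of (ln n)(ln A) intervals
```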
64. Algorithm
repeat:
  find an interval I which is heavy for the current inverse temperature β
  see how far I stays heavy (until some β*)
  use the interval I for the feeler Z(β)·Z(2β'−β) / Z(β')^2
  either make progress, or eliminate the interval I
65. Algorithm
repeat:
  find an interval I which is heavy for the current inverse temperature β
  see how far I stays heavy (until some β*)
  use the interval I for the feeler Z(β)·Z(2β'−β) / Z(β')^2
  either make progress, or eliminate the interval I, or make a long move
66. If we have sampler oracles for μ_β, then we can get an adaptive schedule of length t = O((ln A)^{1/2})
independent sets: O(n^2) (using Vigoda '01, Dyer-Greenhill '01)
matchings: O(n^2 m) (using Jerrum, Sinclair '89)
spin systems: Ising model: O(n^2) for β < β_C (using Marinelli, Olivieri '95)
k-colorings: O(n^2) for k > 2Δ (using Jerrum '95)
68. Appendix: proof of the Dyer-Frieze theorem
1) E[X_1 X_2 ⋯ X_t] = WANTED
2) the X_i are easy to estimate: V[X_i]/E[X_i]^2 = O(1)
Theorem (Dyer-Frieze '91)
O(t^2/ε^2) samples (O(t/ε^2) from each X_i) give a (1±ε)-estimator of WANTED with probability ≥ 3/4
69. The Bienaymé-Chebyshev inequality
P( Y gives a (1±ε)-estimate ) ≥ 1 − V[Y] / (ε^2 · E[Y]^2)
70. The Bienaymé-Chebyshev inequality
P( Y gives a (1±ε)-estimate ) ≥ 1 − V[Y] / (ε^2 · E[Y]^2)
squared coefficient of variation (SCV):
V[Y]/E[Y]^2 = (1/n) · V[X]/E[X]^2   (for Y the average of n i.i.d. copies of X)
71. The Bienaymé-Chebyshev inequality
Let X_1,...,X_n, X be independent, identically distributed random variables, Q = E[X].
Let Y = (X_1 + ... + X_n)/n. Then
P( Y gives a (1±ε)-estimate of Q ) ≥ 1 − V[X] / (ε^2 · n · E[X]^2)
72. Chernoff's bound
Let X_1,...,X_n, X be independent, identically distributed random variables, 0 ≤ X ≤ 1, Q = E[X].
Let Y = (X_1 + ... + X_n)/n. Then
P( Y gives a (1±ε)-estimate of Q ) ≥ 1 − e^{−ε^2 · n · E[X] / 3}
73. Sample size comparison
Bienaymé-Chebyshev:  n = (1/δ) · (1/ε^2) · V[X]/E[X]^2
Chernoff (0 ≤ X ≤ 1):  n = 3 · (1/ε^2) · (1/E[X]) · ln(1/δ)
74. Sample size comparison for 0 ≤ X ≤ 1
Bienaymé-Chebyshev:  n = (1/δ) · (1/ε^2) · (1/E[X])
Chernoff:  n = 3 · (1/ε^2) · (1/E[X]) · ln(1/δ)
75. Median boosting trick
With n = 4 · (1/ε^2) · (1/E[X]) samples:
P( (1−ε)Q ≤ Y ≤ (1+ε)Q ) ≥ 3/4
76. Median trick: repeat 2T times
P( (1−ε)Q ≤ Y ≤ (1+ε)Q ) ≥ 3/4
⇒ P( more than T out of the 2T estimates fall in [(1−ε)Q, (1+ε)Q] ) ≥ 1 − e^{−T/4}
⇒ P( the median is in [(1−ε)Q, (1+ε)Q] ) ≥ 1 − e^{−T/4}
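A small sketch of the median trick as an estimator (median of independent sample means); the distribution used in the demo is made up.

```python
import random
import statistics

def median_of_means(draw, per_group, groups):
    """Average `per_group` samples to form each estimate Y, repeat `groups`
    times (the 2T repetitions on the slide), and return the median."""
    means = [sum(draw() for _ in range(per_group)) / per_group
             for _ in range(groups)]
    return statistics.median(means)

# Toy usage with a hypothetical distribution of mean 1.
random.seed(0)
print(median_of_means(lambda: random.expovariate(1.0), per_group=400, groups=9))
```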
77. For 0 ≤ X ≤ 1
Bienaymé-Chebyshev + median trick:  n = 32 · (1/ε^2) · (1/E[X]) · ln(1/δ)
Chernoff:  n = 3 · (1/ε^2) · (1/E[X]) · ln(1/δ)
78. In general (no boundedness assumption)
Bienaymé-Chebyshev + median trick:  n = 32 · (1/ε^2) · (V[X]/E[X]^2) · ln(1/δ)
Chernoff (0 ≤ X ≤ 1):  n = 3 · (1/ε^2) · (1/E[X]) · ln(1/δ)
79. Appendix: proof of the Dyer-Frieze theorem
1) E[X_1 X_2 ⋯ X_t] = WANTED
2) the X_i are easy to estimate: V[X_i]/E[X_i]^2 = O(1)
Theorem (Dyer-Frieze '91)
O(t^2/ε^2) samples (O(t/ε^2) from each X_i) give a (1±ε)-estimator of WANTED with probability ≥ 3/4
80. How precise do the X_i have to be?
First attempt: Chernoff's bound
81. How precise do the X_i have to be?
First attempt: Chernoff's bound
Main idea: (1±ε/t)(1±ε/t)(1±ε/t)⋯(1±ε/t) ≈ 1±ε
82. How precise do the X_i have to be?
First attempt: Chernoff's bound
Main idea: (1±ε/t)(1±ε/t)(1±ε/t)⋯(1±ε/t) ≈ 1±ε
each term: Θ(t^2) samples  ⇒  Θ(t^3) total (for fixed ε)
83. How precise do the X_i have to be?
Bienaymé-Chebyshev is better (Dyer-Frieze 1991)
X = X_1 X_2 ⋯ X_t, squared coefficient of variation (SCV)
GOAL: SCV(X) ≤ ε^2/4
P( X gives a (1±ε)-estimate ) ≥ 1 − V[X] / (ε^2 · E[X]^2)
84. How precise do the X_i have to be?
Bienaymé-Chebyshev is better (Dyer-Frieze 1991)
Main idea: SCV(X_i) ≤ ε^2/(4t)  ⇒  SCV(X) < ε^2/4
since SCV(X) = (1+SCV(X_1)) ⋯ (1+SCV(X_t)) − 1
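A quick numerical check (with made-up two-point distributions) of the identity used on this slide: for independent factors, means and second moments multiply, so SCV(X) = (1+SCV(X_1))⋯(1+SCV(X_t)) − 1.

```python
def moments(values, probs):
    """Return (E[X], E[X^2]) of a finite distribution."""
    mean = sum(v * p for v, p in zip(values, probs))
    second = sum(v * v * p for v, p in zip(values, probs))
    return mean, second

# Three hypothetical independent factors.
factors = [([1.0, 2.0], [0.5, 0.5]),
           ([1.0, 3.0], [0.75, 0.25]),
           ([2.0, 5.0], [0.4, 0.6])]

prod_of_terms = 1.0          # product of (1 + SCV(X_i))
mean_prod, second_prod = 1.0, 1.0
for values, probs in factors:
    mean, second = moments(values, probs)
    prod_of_terms *= second / mean**2          # 1 + SCV(X_i) = E[X_i^2]/E[X_i]^2
    mean_prod *= mean                          # E[X]   = product of the means
    second_prod *= second                      # E[X^2] = product of the second moments

print(prod_of_terms - 1.0, second_prod / mean_prod**2 - 1.0)   # the two agree
```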
85. How precise do the X_i have to be?
Bienaymé-Chebyshev is better (Dyer-Frieze 1991)
Main idea: SCV(X_i) ≤ ε^2/(4t)  ⇒  SCV(X) < ε^2/4
each term: O(t/ε^2) samples  ⇒  O(t^2/ε^2) total
86. If we have sampler oracles for μ_β, then we can get an adaptive schedule of length t = O((ln A)^{1/2})
independent sets: O(n^2) (using Vigoda '01, Dyer-Greenhill '01)
matchings: O(n^2 m) (using Jerrum, Sinclair '89)
spin systems: Ising model: O(n^2) for β < β_C (using Marinelli, Olivieri '95)
k-colorings: O(n^2) for k > 2Δ (using Jerrum '95)