Title: Statistical Regimes Across Constrainedness Regions
1Statistical Regimes Across Constrainedness Regions
- Carla P. Gomes, Cesar Fernandez
- Bart Selman, and Christian Bessiere
- Cornell University
- Universitat de Lleida
- LIRMM-CNRS
- CP 2004
- Toronto
2Motivation
- Bring together recent results on
- Typical Case Analysis
- Randomized Complete Search Methods
- Heavy-Tailed Phenomena
- Random CSP Models
3Typical Case Analysis Beyond NP-Completeness
Phase transition phenomenon: discriminating easy vs. hard instances
[Figure: % of solvable instances and mean computational cost vs. constrainedness]
(Hogg et al. 96)
4Exceptionally Hard Instances
- Seem to defy the easy-hard pattern
- Such instances occur in the under-constrained area
- They are considerably harder than other similar instances, and even harder than instances from the critically constrained area
(Gent and Walsh 94; Hogg and Williams 94; Smith and Grant 97)
5Are Exceptionally Hard Instances Truly Hard?
- Different algorithms encounter different exceptionally hard instances
- "Hardness" of exceptionally hard instances
- Not necessarily hardness of the instances themselves, but rather of the combination of the instance with the details of the search method
(Gent and Walsh 94; Hogg and Williams 94; Selman and Kirkpatrick 96; Smith and Grant 97)
6Randomized Backtrack Search
- What if we introduce a tiny element of randomness into the search heuristic (e.g., by breaking ties randomly) and run this (still complete) randomized search procedure on the same instance over and over again?
- Study the runtime distributions of a randomized backtrack search on the same instance
- This is a way of isolating the variance caused solely by the algorithm
(Gomes et al. CP 97)
7Extreme Variance in Runtime of Randomized Backtrack Search
Easy instance, 15 preassigned cells
(Gomes et al. 97)
8Heavy-tailed distributions
- Standard distributions (e.g., normal, lognormal, exponential): exponential decay of the tail
- Heavy-tailed distributions (e.g., Pareto-Levy): power-law decay of the tail
(Frost et al. 97; Gomes et al. 97; Hoos 1999; Walsh 99)
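The contrast between the two kinds of decay can be written out with standard textbook definitions. The symbols lambda, C and alpha below are introduced here for illustration and are not taken from the slide.

```latex
% Exponentially decaying tail (example: exponential distribution with rate \lambda > 0)
P[X > x] = e^{-\lambda x}

% Heavy tail of Pareto-Levy type (power-law decay), with a constant C > 0
P[X > x] \sim C\, x^{-\alpha}, \qquad 0 < \alpha < 2, \quad x \to \infty
```

For a power-law tail with alpha < 2 the variance is infinite, and with alpha <= 1 the mean is infinite as well.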
9Visualization of Heavy-tailed Phenomenon (Log-Log Plot of Tail of Distribution)
[Figure: log-log plot. x-axis: runtime (number of backtracks), log scale; y-axis: 1-F(x), unsolved fraction. Curves: heavy-tailed distribution, Normal(2, 1000000), Normal(2, 1)]
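A minimal Python sketch of how such a log-log tail plot can be read numerically, assuming the runtimes are available as a list of distinct positive numbers. The helper names (`survival`, `loglog_tail_slope`) and the choice of fitting only the largest 10% of values are illustrative assumptions, not part of the talk.

```python
import math
import random


def survival(samples):
    """Return sorted (x, fraction of samples strictly greater than x) pairs.

    Assumes the sample values are distinct (e.g. continuous data).
    """
    xs = sorted(samples)
    n = len(xs)
    return [(x, (n - i - 1) / n) for i, x in enumerate(xs)]


def loglog_tail_slope(samples, tail_fraction=0.1):
    """Least-squares slope of log(1 - F(x)) against log(x) over the largest values.

    A roughly constant slope of -alpha suggests power-law decay; a slope that
    keeps getting steeper suggests faster-than-power-law decay.
    """
    pts = [(math.log(x), math.log(s))
           for x, s in survival(samples) if s > 0 and x > 0]
    k = max(2, int(len(pts) * tail_fraction))
    tail = pts[-k:]
    mx = sum(p[0] for p in tail) / k
    my = sum(p[1] for p in tail) / k
    sxx = sum((p[0] - mx) ** 2 for p in tail)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in tail)
    return sxy / sxx


if __name__ == "__main__":
    rng = random.Random(1)
    heavy = [rng.paretovariate(0.5) for _ in range(20000)]  # power-law tail
    light = [rng.expovariate(1.0) for _ in range(20000)]    # exponential tail
    print("Pareto tail slope:     ", loglog_tail_slope(heavy))
    print("Exponential tail slope:", loglog_tail_slope(light))
```

On synthetic data like this, the Pareto sample should give a slope near minus its shape parameter, while the exponential sample gives a much steeper slope.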
10Formal Results
- Abstract search tree models with provably heavy-tailed behavior (Chen, Gomes, Selman 2001)
- Generalization and assignment of semantics to the abstract search tree models (Williams, Gomes, Selman 2003)
- Provably polytime restart strategies (Williams, Gomes, Selman 2003)
11What about concrete CSP models?
(So far, no good characterization of the runtime distributions of concrete CSP models)
12Research Questions
Concrete CSP models, complete randomized backtrack search
- Can we characterize when heavy-tailed behavior occurs and when it does not?
- Can we identify different tail regimes across different constrainedness regions?
- Can we get further insights into the tail regime by analyzing the concrete search trees produced by the backtrack search method?
13Outline of the Rest of the Talk
- Random Binary CSP Models
- Encodings of CSP Models
- Randomized Backtrack Search Algorithms
- Search Trees
- Statistical Tail Regimes Across Constrainedness Regions
- Empirical Results
- Theoretical Model
- Conclusions
14Binary Constraint Networks
- A finite binary constraint network P = (X, D, C)
- A set of n variables X = {x1, x2, ..., xn}
- For each variable, a finite domain: D = {D(x1), D(x2), ..., D(xn)}
- A set C of binary constraints between pairs of variables
- A constraint Cij on the ordered pair of variables (xi, xj) is a subset of the Cartesian product D(xi) x D(xj) that specifies the allowed combinations of values for xi and xj
- Solution to the constraint network: an instantiation of the variables such that all constraints are satisfied
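The definition above can be made concrete with a small Python sketch. The dictionary layout (`domains`, `constraints`) and the function names are illustrative assumptions; constraints are stored as sets of allowed value pairs, following the definition.

```python
from itertools import product


def is_solution(assignment, domains, constraints):
    """Check that `assignment` instantiates every variable and satisfies all constraints.

    domains:     dict variable -> set of values, i.e. D(x)
    constraints: dict (xi, xj) -> set of allowed (vi, vj) pairs, i.e. Cij
    """
    if set(assignment) != set(domains):
        return False
    if any(assignment[x] not in domains[x] for x in domains):
        return False
    return all((assignment[xi], assignment[xj]) in allowed
               for (xi, xj), allowed in constraints.items())


def all_solutions(domains, constraints):
    """Enumerate solutions by brute force (only sensible for tiny networks)."""
    variables = sorted(domains)
    for values in product(*(sorted(domains[x]) for x in variables)):
        candidate = dict(zip(variables, values))
        if is_solution(candidate, domains, constraints):
            yield candidate


# Toy network: three 0/1 variables, neighbours must take different values.
domains = {"x1": {0, 1}, "x2": {0, 1}, "x3": {0, 1}}
not_equal = {(0, 1), (1, 0)}
constraints = {("x1", "x2"): not_equal, ("x2", "x3"): not_equal}

if __name__ == "__main__":
    for solution in all_solutions(domains, constraints):
        print(solution)
```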
15Random Binary CSP Models
Model B <N, D, c, t>
- N: number of variables
- D: size of the domains
- c: number of constrained pairs of variables; with p1 the proportion of binary constraints included in the network, c = p1 N(N-1)/2
- t: tightness of the constraints; with p2 the proportion of forbidden tuples, t = p2 D^2
Model E <N, D, p>
- N: number of variables
- D: size of the domains
- p: proportion of forbidden pairs (out of D^2 N(N-1)/2)
(Gent et al 1996)
(Achlioptas et al 2000)
N from 15 to 50
(Xu and Li 2000)
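A minimal Python sketch of a Model B generator, following the parameter definitions above: exactly c constrained pairs are chosen, and each gets exactly t forbidden tuples. The function name `model_b`, the use of rounding to obtain integer c and t, and the output layout are assumptions made for illustration.

```python
import random
from itertools import combinations, product


def model_b(n, d, p1, p2, rng=None):
    """Generate a random binary CSP in the style of Model B.

    Returns a dict mapping each constrained pair (i, j), i < j, to the set of
    forbidden value pairs (vi, vj). Variables are 0..n-1, values are 0..d-1.
    """
    rng = rng or random
    c = round(p1 * n * (n - 1) / 2)  # number of constrained pairs (rounding is an assumption)
    t = round(p2 * d * d)            # forbidden tuples per constraint (rounding is an assumption)
    pairs = rng.sample(list(combinations(range(n), 2)), c)
    tuples = list(product(range(d), repeat=2))
    return {pair: set(rng.sample(tuples, t)) for pair in pairs}


if __name__ == "__main__":
    instance = model_b(n=10, d=4, p1=0.4, p2=0.25, rng=random.Random(0))
    for pair, forbidden in sorted(instance.items()):
        print(pair, sorted(forbidden))
```

Sampling without replacement keeps the number of constraints and the tightness of each constraint fixed across instances, so only their placement is random.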
16Encodings
- Direct CSP Binary Encoding
- Satisfiability Encoding (direct encoding)
(Walsh 2000)
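As an illustration of the direct SAT encoding, here is a minimal Python sketch. It assumes one Boolean variable per (CSP variable, value) pair, with at-least-one and at-most-one clauses per CSP variable and one conflict clause per forbidden tuple. The function name `direct_encoding` and the data layout are hypothetical, and clauses are written as lists of signed integers in the usual DIMACS style.

```python
from itertools import combinations


def direct_encoding(domains, nogoods):
    """Encode a binary CSP as CNF clauses.

    domains: dict variable -> list of values
    nogoods: dict (xi, xj) -> set of forbidden (vi, vj) pairs
    Returns (clauses, index), where index maps (variable, value) to a
    positive SAT variable number.
    """
    index = {}
    for x in sorted(domains):
        for v in domains[x]:
            index[(x, v)] = len(index) + 1

    clauses = []
    for x in sorted(domains):
        lits = [index[(x, v)] for v in domains[x]]
        clauses.append(lits)                                         # at least one value
        clauses.extend([-a, -b] for a, b in combinations(lits, 2))   # at most one value
    for (xi, xj), forbidden in nogoods.items():
        for vi, vj in sorted(forbidden):
            clauses.append([-index[(xi, vi)], -index[(xj, vj)]])     # conflict clause
    return clauses, index


if __name__ == "__main__":
    # Two 0/1 variables that must differ.
    clauses, index = direct_encoding(
        {"a": [0, 1], "b": [0, 1]},
        {("a", "b"): {(0, 0), (1, 1)}},
    )
    print(index)
    print(clauses)
```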
17Backtrack Search Algorithms
- Look-ahead performed:
- No look-ahead (simple backtracking, BT)
- Removal of values directly inconsistent with the last instantiation performed (forward checking, FC)
- Arc consistency and propagation (maintaining arc consistency, MAC)
- Different heuristics for variable selection (the next variable to instantiate):
- Random (random)
- Variables pre-ordered by decreasing degree in the constraint graph (deg)
- Smallest domain first, ties broken by decreasing degree (domdeg)
- Different heuristics for value selection:
- Random
- Lexicographic
- For the SAT encodings we used the simplified Davis-Putnam-Logemann-Loveland (DPLL) procedure
Variable/value selection: static and random
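A minimal Python sketch of the simplest configuration listed above: BT with no look-ahead, random variable selection and random value selection. It counts one backtrack each time every value of the chosen variable has failed, which is one possible cost measure. The names `randomized_bt` and `runtime_sample` and the nogood-based instance layout are illustrative assumptions, not the implementation used in the talk.

```python
import random


def randomized_bt(n, d, nogoods, rng):
    """Chronological backtracking with random variable and value ordering.

    nogoods: dict (i, j) with i < j -> set of forbidden (vi, vj) pairs.
    Returns (solution dict or None, number of backtracks).
    """
    assignment = {}
    backtracks = 0

    def consistent(var, val):
        # Check the candidate value against every already assigned variable.
        for other, oval in assignment.items():
            if other < var:
                key, pair = (other, var), (oval, val)
            else:
                key, pair = (var, other), (val, oval)
            if pair in nogoods.get(key, ()):
                return False
        return True

    def search():
        nonlocal backtracks
        if len(assignment) == n:
            return True
        var = rng.choice([x for x in range(n) if x not in assignment])
        values = list(range(d))
        rng.shuffle(values)
        for val in values:
            if consistent(var, val):
                assignment[var] = val
                if search():
                    return True
                del assignment[var]
        backtracks += 1  # dead end: every value of `var` failed
        return False

    solved = search()
    return (dict(assignment) if solved else None), backtracks


def runtime_sample(n, d, nogoods, runs, seed=0):
    """Run the randomized search repeatedly on the same instance."""
    rng = random.Random(seed)
    return [randomized_bt(n, d, nogoods, rng)[1] for _ in range(runs)]
```

Because the search stays complete, repeated runs on one instance differ only through the random choices, so the list returned by `runtime_sample` is an empirical runtime distribution for that instance.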
18Inconsistent Subtrees
(Bessiere et al. 2004)
19Distributions
- Runtime distributions of the backtrack search algorithms
- Distribution of the depth of the inconsistent sub-trees found during the search
All runs were performed without censorship.
20Main Results
- 1. Runtime distributions
- 2. Inconsistent sub-tree depth distributions
- Dramatically different statistical regimes across the constrainedness regions of CSP models
21Runtime distributions
22Distribution of Depth of Inconsistent Subtrees
23Applet
24Depth of Inconsistent Search Tree vs. Runtime
Distributions
25Other Models and More Sophisticated Consistency Techniques
[Figure: Model B; panels: BT, MAC]
Heavy-tailed and non-heavy-tailed regions. As the sophistication of the algorithm increases, the heavy-tailed region extends to the right, getting closer to the phase transition.
26SAT encoding: DPLL
27Theoretical Model
28Depth of Inconsistent Search Tree vs. Runtime Distributions: Theoretical Model
- X: search cost (runtime)
- ISTD: depth of an inconsistent sub-tree
- P_istd[ISTD = N]: probability of finding an inconsistent sub-tree of depth N during search
- P[X > x | N]: probability of the search cost being larger than x, given an inconsistent tree of depth N
29Depth of Inconsistent Search Tree vs. Runtime Distributions: Theoretical Model
See paper for proof details
30Regressions for B1, B2, k
Regression for B1 and B2
Regression for k
31Validation: Theoretical Model vs. Runtime Data
α = 0.26 using the model
α = 0.27 using runtime data
32Summary of Results
- 1. As constrainedness increases, the runtime distribution changes from a heavy-tailed to a non-heavy-tailed regime
- Holds for both models (B and E), for CSP and SAT encodings, and for the different backtrack search strategies
33Summary of Results
- 2. Threshold from the heavy-tailed to the non-heavy-tailed regime
- Dependent on the particular search procedure
- As the efficiency of the search method increases, the heavy-tailed region extends: the heavy-tailed threshold gets closer to the phase transition
34Summary of Results
- 3. Distribution of the depth of inconsistent search sub-trees
- Exponentially distributed inconsistent sub-tree depth (ISTD), combined with exponential growth of the search space as the tree depth increases, implies heavy-tailed runtime distributions
- As the ISTD distributions move away from the exponential distribution, the runtime distributions become non-heavy-tailed
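The implication stated above can be illustrated with a simplified derivation. It assumes, as an idealization that is not necessarily the model in the paper, that the depth N of an inconsistent sub-tree is exponentially distributed with rate B_2 and that refuting a sub-tree of depth N costs exactly k^N for some k > 1. The symbols B_2 and k are borrowed here for illustration.

```latex
\begin{align*}
P[N > n] &= e^{-B_2 n}, \qquad X = k^{N}, \quad k > 1, \\
P[X > x] &= P\!\left[N > \frac{\ln x}{\ln k}\right]
          = \exp\!\left(-B_2 \,\frac{\ln x}{\ln k}\right)
          = x^{-\alpha},
          \qquad \alpha = \frac{B_2}{\ln k}, \quad x \ge 1.
\end{align*}
```

Under these assumptions the tail of the search cost is a pure power law, and its variance is infinite whenever alpha < 2. If the depth distribution decays faster than exponentially, the same substitution no longer yields a power law.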
35Research Challenges
- How can we exploit these results to design more efficient search procedures?
- Randomization and restart strategies
- Search heuristics
- Look-ahead and look-back strategies
A very exciting and promising research area!
36Demos and papers
- www.cs.cornell.edu/gomes/
- http://fermat.eup.udl.es/cesar/
- www.cs.cornell.edu/selman/
- http://www.lirmm.fr/bessiere/
37Motivation
- Great strides in designing more efficient complete backtrack search methods for solving constraint satisfaction problems
- Strong search heuristics
- Look-ahead and look-back techniques
- Randomization and restarts
38Motivation
- The study of problem structure: insights into the interplay between structure, search algorithms, and, more generally, typical case complexity
- Phase transition phenomena
- Exceptionally hard instances
- Randomized backtrack search
- Heavy-tailed phenomena in combinatorial search