Learning to Search (PowerPoint presentation transcript, 58 slides)
1
Learning to Search
  • Henry Kautz
  • University of Washington
  • joint work with
  • Dimitri Achlioptas, Carla Gomes, Eric Horvitz,
    Don Patterson, Yongshao Ruan, Bart Selman
  • CORE MSR, Cornell, UW

2
Speedup Learning
  • Machine learning historically considered:
  • Learning to classify objects
  • Learning to search or reason more efficiently
  • Speedup Learning
  • Speedup learning disappeared in the mid-90s
  • Last workshop in 1993
  • Last thesis 1998
  • What happened?
  • It failed.
  • It succeeded.
  • Everyone got busy doing something else.

3
It failed.
  • Explanation based learning
  • Examine structure of proof trees
  • Explain why choices were good or bad (wasteful)
  • Generalize to create new control rules
  • At best, mild speedup (~50%)
  • Could even degrade performance
  • Underlying search engines very weak
  • Etzioni (1993): simple static analysis of
    next-state operators yielded performance as good
    as EBL

4
It succeeded.
  • EBL without generalization
  • Memoization
  • No-good learning
  • SAT clause learning
  • Integrates clausal resolution with DPLL
  • Huge win in practice!
  • Clause-learning proofs can be exponentially
    smaller than best DPLL (tree shaped) proof
  • Chaff (Malik et al. 2001)
  • 1,000,000-variable VLSI verification problems

5
Everyone got busy.
  • The "something else": reinforcement learning
  • Learn about the world while acting in the world
  • Don't reason or classify, just make decisions
  • What isn't RL?

6
Another path
  • Predictive control of search
  • Learn statistical model of behavior of a problem
    solver on a problem distribution
  • Use the model as part of a control strategy to
    improve the future performance of the solver
  • Synthesis of ideas from
  • Phase transition phenomena in problem
    distributions
  • Decision-theoretic control of reasoning
  • Bayesian modeling

7
Big Picture
[Diagram: Problem Instances → Solver → runtime; runtime + static features → Learning / Analysis → Predictive Model]
8
Case Study 1: Beyond 4.25
[Diagram: Problem Instances → Solver → runtime; runtime + static features → Learning / Analysis → Predictive Model]
9
Phase transitions & problem hardness
  • Large and growing literature on random problem
    distributions
  • Peak in problem hardness associated with a critical
    value of some underlying parameter
  • 3-SAT: clause/variable ratio ≈ 4.25
  • Using measured parameter to predict hardness of a
    particular instance problematic!
  • Random distribution must be a good model of
    actual domain of concern
  • Recent progress on more realistic random
    distributions...

10
Quasigroup Completion Problem (QCP)
  • NP-Complete
  • Its structure is similar to that of real-world
    problems: tournament scheduling, classroom
    assignment, fiber optic routing, experiment
    design, ...
  • Can generate hard guaranteed-SAT instances (2000)
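The guaranteed-satisfiable generation mentioned above can be sketched in Python. This is a minimal illustration, not the authors' generator: it builds a random Latin square from the cyclic square (so it does not sample uniformly over all Latin squares) and then blanks cells; the function names are hypothetical.

```python
import random

def random_latin_square(n, rng):
    # Cyclic Latin square with rows, columns, and symbols permuted.
    rows = list(range(n)); rng.shuffle(rows)
    cols = list(range(n)); rng.shuffle(cols)
    syms = list(range(n)); rng.shuffle(syms)
    return [[syms[(rows[i] + cols[j]) % n] for j in range(n)] for i in range(n)]

def qwh_instance(n, holes, rng=None):
    # Quasigroup-with-holes instance: a complete Latin square with `holes`
    # cells blanked (None). Satisfiable by construction.
    rng = rng or random.Random(0)
    sq = random_latin_square(n, rng)
    cells = [(i, j) for i in range(n) for j in range(n)]
    for i, j in rng.sample(cells, holes):
        sq[i][j] = None
    return sq
```

Solving the instance means refilling the blanked cells so every row and column again contains each symbol exactly once.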

11
Phase Transition
[Figure: fraction of unsolvable cases vs. fraction of pre-assignment, showing the underconstrained (almost all solvable), critically constrained (phase transition), and overconstrained (almost all unsolvable) regions]
12
Easy-Hard-Easy pattern in local search
[Figure: computational cost vs. # holes, peaking between the underconstrained and overconstrained regions]
13
Are we ready to predict run times?
  • Problem: high variance

[Figure: run time distributions, log scale]
14
Deep structural features
Hardness is also controlled by structure of
constraints, not just the fraction of holes
15
Random versus balanced


[Figure: balanced vs. random hole patterns]
16
Random versus balanced
[Figure: run times, balanced vs. random]
17
Random vs. balanced (log scale)
[Figure: run times, balanced vs. random, log scale]
18
Morphing balanced and random
19
Considering variance in hole pattern
20
Time on log scale
21
Effect of balance on hardness
  • Balanced patterns yield (on average) problems
    that are 2 orders of magnitude harder than random
    patterns
  • Expected run time decreases exponentially with
    variance in # holes per row or column:
    E(T) ≈ C e^(−ks), where s is the variance
  • Same pattern (different constants) for DPLL!
  • At extreme of high variance (aligned model) can
    prove no hard problems exist

22
Intuitions
  • In unbalanced problems it is easier to identify the
    most critically constrained ("backbone") variables
    and set them correctly

23
Are we done?
  • Unfortunately, not quite.
  • While few unbalanced problems are hard, easy
    balanced problems are not uncommon
  • To do: find additional structural features that
    signify hardness
  • Introspection
  • Machine learning (later this talk)
  • Ultimate goal: accurate, inexpensive prediction
    of hardness of real-world problems

24
Case Study 2: AutoWalksat
[Diagram: Problem Instances → Solver → runtime; runtime → Learning / Analysis → Predictive Model]
25
Walksat
  • Choose a truth assignment randomly
  • While the assignment evaluates to false:
    • Choose an unsatisfied clause at random
    • If possible, flip an unconstrained variable in
      that clause
    • Else, with probability P (noise), flip a variable
      in the clause at random
    • Else flip the variable in the clause that causes
      the smallest number of satisfied clauses to
      become unsatisfied
  • Performance of Walksat is highly sensitive to the
    setting of P
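The loop above can be sketched in Python. This is a minimal illustration, not the original implementation: clauses are DIMACS-style lists of nonzero integer literals, and the helper names are my own.

```python
import random

def walksat(clauses, n_vars, p=0.5, max_flips=100_000, seed=0):
    rng = random.Random(seed)
    # assign[v] is the truth value of variable v (index 0 unused)
    assign = [rng.choice([False, True]) for _ in range(n_vars + 1)]

    def sat(lit):
        return assign[abs(lit)] == (lit > 0)

    def breaks(var):
        # clauses satisfied now that would become unsatisfied if var flipped
        before = [any(sat(l) for l in c) for c in clauses]
        assign[var] = not assign[var]
        after = [any(sat(l) for l in c) for c in clauses]
        assign[var] = not assign[var]
        return sum(1 for b, a in zip(before, after) if b and not a)

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(sat(l) for l in c)]
        if not unsat:
            return assign                    # satisfying assignment found
        clause = rng.choice(unsat)
        cvars = [abs(l) for l in clause]
        free = [v for v in cvars if breaks(v) == 0]  # "unconstrained" vars
        if free:
            v = rng.choice(free)
        elif rng.random() < p:               # noise step
            v = rng.choice(cvars)
        else:                                # greedy step
            v = min(cvars, key=breaks)
        assign[v] = not assign[v]
    return None                              # gave up
```

The noise parameter `p` is exactly the P whose setting the slide says the performance is so sensitive to.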

27
The Invariant Ratio
  • Shortest expected run time when P is set to
    minimize the invariant ratio
  • McAllester, Selman and Kautz (1997)

28
Automatic Noise Setting
  • Probe for the optimal noise level
  • Bracketed Search with Parabolic Interpolation
  • No derivatives required
  • Robust to stochastic variations
  • Efficient
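The probing strategy above can be sketched as follows. This is a minimal sketch, not AutoWalksat itself: it assumes `cost(p)` is a measured, roughly unimodal solver cost at noise level `p`, and the function names are hypothetical.

```python
def parabolic_vertex(x1, y1, x2, y2, x3, y3):
    # Abscissa of the vertex of the parabola through three points.
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    if abs(den) < 1e-12:        # degenerate (collinear or repeated points)
        return x2
    return x2 - 0.5 * num / den

def probe_noise(cost, lo, hi, probes=10):
    # Bracketed, derivative-free search for the noise level minimizing cost.
    a, b, c = lo, (lo + hi) / 2.0, hi
    for _ in range(probes):
        x = parabolic_vertex(a, cost(a), b, cost(b), c, cost(c))
        x = min(max(x, a), c)   # keep the probe inside the bracket
        if cost(x) < cost(b):
            a, b, c = (b, x, c) if x > b else (a, x, b)   # shrink around x
        else:
            a, c = (x, c) if x < b else (a, x)            # shrink toward b
    return b
```

Because only function values are compared, no derivatives are needed, and averaging several runs per probe makes the comparison robust to stochastic variation.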

29
Hard random 3-SAT
30
3-SAT, probes 1, 2
31
3-SAT, probe 3
32
3-SAT, probe 4
33
3-SAT, probe 5
34
3-SAT, probe 6
35
3-SAT, probe 7
36
3-SAT, probe 8
37
3-SAT, probe 9
38
3-SAT, probe 10
39
Summary: random, circuit test, graph coloring,
planning
40
Other features still lurking
  • clockwise: add 10; counter-clockwise:
    subtract 10
  • More complex function of objective function?
  • Mobility? (Schuurmans 2000)

41
Case Study 3: Restart Policies
[Diagram: Problem Instances → Solver → runtime; runtime + static features → Learning / Analysis → Predictive Model]
42
Background
  • Backtracking search methods often exhibit a
    remarkable variability in performance between
  • different heuristics
  • same heuristic on different instances
  • different runs of randomized heuristics

43
Cost Distributions
  • Observation (Gomes 1997): distributions often
    have heavy tails
  • infinite variance
  • mean increases without limit
  • probability of long runs decays by a power law
    (Pareto–Lévy), rather than exponentially (Normal)

44
Randomized Restarts
  • Solution: randomize the systematic solver
  • Add noise to the heuristic branching (variable
    choice) function
  • Cutoff and restart search after some number of
    steps
  • Provably eliminates heavy tails
  • Very useful in practice
  • Adopted by state-of-the-art search engines for
    SAT, verification, scheduling, ...
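A small simulation illustrates why restarts help. It is an illustration only, assuming Pareto-distributed run times with illustrative parameters (the shape `alpha < 1` makes the mean infinite):

```python
import random

def pareto_runtime(rng, alpha=0.8, xm=1.0):
    # Pareto-distributed run time: P(T > t) = (xm / t)**alpha for t >= xm.
    # With alpha < 1 the mean diverges -- a heavy tail.
    return xm / rng.random() ** (1.0 / alpha)

def time_with_restarts(rng, cutoff, alpha=0.8):
    # Total work until some restarted run finishes within `cutoff` steps.
    total = 0.0
    while True:
        t = pareto_runtime(rng, alpha)
        if t <= cutoff:
            return total + t
        total += cutoff          # abandon the run and restart fresh

rng = random.Random(1)
raw = [pareto_runtime(rng) for _ in range(2000)]
restarted = [time_with_restarts(rng, cutoff=10.0) for _ in range(2000)]
```

The raw sample mean is dominated by rare enormous runs, while the restarted total has a small, finite expectation: cutting off long runs converts the heavy tail into a geometric one.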

45
Effect of restarts on expected solution time (log
scale)
46
How to determine restart policy
  • Complete knowledge of run-time distribution:
    (only) a fixed cutoff policy is optimal (Luby
    1993)
  • t* = argmin_t E(R_t), where
  • E(R_t) = expected solution time restarting every t
    steps
  • No knowledge of distribution: within an O(log)
    factor of optimal using the series of cutoffs
  • 1, 1, 2, 1, 1, 2, 4, ...
  • Open cases addressed by our research:
  • Additional evidence about progress of solver
  • Partial knowledge of run-time distribution
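Both quantities above are easy to sketch: the universal 1, 1, 2, 1, 1, 2, 4, ... cutoff sequence, and an empirical estimate of E(R_t). The helper names are mine, and `expected_restart_time` is a plug-in estimate from a sample of observed run times, not a closed form.

```python
def luby(i):
    # i-th term (1-indexed) of the universal sequence 1,1,2,1,1,2,4,...
    k = i.bit_length()
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

def expected_restart_time(runtimes, t):
    # Estimate E(R_t): restart every t steps; each failed attempt costs t,
    # a successful attempt costs its own run time.
    done = [r for r in runtimes if r <= t]
    if not done:
        return float("inf")
    p = len(done) / len(runtimes)            # success probability per attempt
    return t * (1 - p) / p + sum(done) / len(done)
```

Given a sample of run times, the optimal fixed cutoff is then `min(candidates, key=lambda t: expected_restart_time(runtimes, t))`.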

47
Backtracking Problem Solvers
  • Randomized SAT solver
  • Satz-Rand, a randomized version of Satz (Li &
    Anbulagan 1997)
  • DPLL with 1-step lookahead
  • Randomization with noise parameter for
    variable choices
  • Randomized CSP solver
  • Specialized CSP solver for QCP
  • ILOG constraint programming library
  • Variable choice: a variant of the Brélaz heuristic

48
Formulation of Learning Problem
  • Different formulations of evidential problem
  • Consider a burst of evidence over initial
    observation horizon
  • Observation horizon time expended so far
  • General observation policies

49
Formulation of Learning Problem
[Figure: run-time distribution with median run time marked; short and long observation horizons (time expended); example times t1, t2, t3 over 1000 choice points]
50
Formulation of Dynamic Features
  • No simple measurement found sufficient for
    predicting time of individual runs
  • Approach
  • Formulate a large set of base-level and derived
    features
  • Base features capture progress or lack thereof
  • Derived features capture dynamics
  • 1st and 2nd derivatives
  • Min, Max, Final values
  • Use Bayesian modeling tool to select and combine
    relevant features
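The base/derived feature construction above can be sketched as follows. This is a minimal illustration; the feature names are hypothetical, and the real systems summarize many base series, not one.

```python
def derived_features(series):
    # series: base-level measurements (e.g. # backtracks) sampled over the
    # observation horizon. Returns the derived summary statistics named on
    # the slide: 1st/2nd discrete derivatives; min, max, and final values.
    d1 = [b - a for a, b in zip(series, series[1:])]   # 1st derivative
    d2 = [b - a for a, b in zip(d1, d1[1:])]           # 2nd derivative
    feats = {}
    for name, s in (("raw", series), ("d1", d1), ("d2", d2)):
        feats[name + "_min"] = min(s)
        feats[name + "_max"] = max(s)
        feats[name + "_final"] = s[-1]
    return feats
```

Each base series thus yields nine derived variables, which is how 18 and 25 base features blow up into 135 and 127 summary variables for the Bayesian modeling tool to select from.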

51
Dynamic Features
  • CSP 18 basic features, summarized by 135
    variables
  • backtracks
  • depth of search tree
  • avg. domain size of unbound CSP variables
  • variance in distribution of unbound CSP variables
  • Satz 25 basic features, summarized by 127
    variables
  • unbound variables
  • variables set positively
  • Size of search tree
  • Effectiveness of unit propagation and lookahead
  • Total of truth assignments ruled out
  • Degree interaction between binary clauses, l

52
Different formulations of task
  • Single instance
  • Solve a specific instance as quickly as possible
  • Learn model from one instance
  • Every instance
  • Solve an instance drawn from a distribution of
    instances
  • Learn model from ensemble of instances
  • Any instance
  • Solve some instance drawn from a distribution of
    instances, may give up and try another
  • Learn model from ensemble of instances

53
Sample Results: CSP-QWH-Single
  • QWH order 34, 380 unassigned
  • Observation horizon without time
  • Training: solve 4000 times (randomized);
    Test: solve 1000 times
  • Learning: Bayesian network model
  • MS Research tool
  • Structure search with Bayesian information
    criterion (Chickering et al.)
  • Model evaluation
  • Average 81% accurate at classifying run time vs.
    50% with just background statistics (range of 78%
    to 98%)

55
Learned Decision Tree
56
Restart Policies
  • Model can be used to create policies that are
    better than any policy that only uses the run-time
    distribution
  • Example:
  • Observe for 1,000 steps
  • If predicted run time > median, restart
    immediately
  • else run until median reached or solution found
  • If no solution, restart
  • E(R_fixed) ≈ 38,000 but E(R_predict) ≈ 27,000
  • Can sometimes beat fixed even if observation
    horizon > optimal fixed cutoff!
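The example policy can be simulated. This is a sketch, not the experiment behind the slide's 38,000 vs. 27,000 numbers: an oracle (`looks_long`, which peeks at the hidden run length) stands in for the learned model, and the bimodal run-time distribution is purely illustrative.

```python
import random

def policy_expected_time(sample_runtime, looks_long, median, horizon=1000,
                         trials=2000, seed=0):
    # Monte-Carlo estimate of the slide's policy: observe `horizon` steps;
    # restart immediately if the model flags the run as long; otherwise run
    # on until `median` steps and restart there if still unsolved.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        spent = 0.0
        while True:
            t = sample_runtime(rng)          # hidden true run length
            if t <= horizon:                 # solved during observation
                spent += t
                break
            if looks_long(t):                # predicted long: restart now
                spent += horizon
                continue
            if t <= median:                  # run on; solved before median
                spent += t
                break
            spent += median                  # unsolved at median: restart
        total += spent
    return total / trials

# Bimodal run times: half "short" (1500-2000 steps), half pathologically long.
bimodal = lambda rng: rng.uniform(1500, 2000) if rng.random() < 0.5 else 1e5
e_predict = policy_expected_time(bimodal, lambda t: t > 2000, median=2000)
e_fixed = policy_expected_time(bimodal, lambda t: False, median=2000)
```

With the oracle, long runs are abandoned after 1,000 observed steps instead of the 2,000-step fixed cutoff, so `e_predict` comes out below `e_fixed`, mirroring E(R_predict) < E(R_fixed) on the slide.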

57
Ongoing work
  • Optimal predictive policies
  • Dynamic features
  • Run time
  • Static features
  • Partial information about run time distribution
  • E.g. mixture of two or more subclasses of
    problems
  • Cheap approximations to optimal policies
  • Myopic Bayes

58
Conclusions
  • Exciting new direction for improving power of
    search and reasoning algorithms
  • Many knobs to learn how to twist
  • Noise level, restart policies just a start
  • Lots of opportunities for cross-disciplinary work
  • Theory
  • Machine learning
  • Experimental AI and OR
  • Reasoning under uncertainty
  • Statistical physics