Combinatorial Problems I: Finding Solutions - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Combinatorial Problems I: Finding Solutions


1
Combinatorial Problems I: Finding Solutions
  • Ashish Sabharwal
  • Cornell University
  • March 3, 2008
  • 2nd Asian-Pacific School on Statistical Physics
    and Interdisciplinary Applications
    KITPC/ITP-CAS, Beijing, China

2
Cross-fertilization of ideas for the study and
design of Intelligent Systems
  • Computer Science
  • Engineering
  • Mathematics
  • Operations Research
  • Economics
  • Physics (phase transition)
  • Cognitive Science
Research part of Cornell's Intelligent
Information Systems Institute (IISI). Director:
Carla Gomes
3
Combinatorial Problems
  • Examples
  • Routing: Given a partially connected network on
    N nodes, find the shortest path between X and Y
  • Traveling Salesperson Problem (TSP): Given a
    partially connected network on N nodes, find a
    path that visits every node of the network
    exactly once (much harder!!)
  • Scheduling: Given N tasks with earliest start
    times, completion deadlines, and a set of M
    machines on which they can execute, schedule them
    so that they all finish by their deadlines

4
Problem Instance, Algorithm
  • Problem instance: a specific instantiation of the
    problem
  • E.g. three instances for the routing problem with
    N = 8 nodes
  • Objective: a single, generic algorithm for the
    problem that can solve any instance of that
    problem
  • Algorithm: a sequence of steps, a recipe
5
Measuring the Effectiveness of Algorithms
  • Capture scaling with input size N, rather than
    runtime on specific instances
  • The most common notion in Computer Science is
    worst-case complexity: What is the longest time
    (or number of steps) the algorithm might take on
    any input of size N?
  • Perhaps only N steps, or 100 N + 5: linear time, O(N)
  • Maybe N^2 steps, or N^2 + 4 N + 6: quadratic, O(N^2)
  • Maybe N^3 + 1000 log N: cubic, O(N^3)
  • Maybe 2^N, or 2^N + N^1000: exponential, O(2^N)
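To make the scaling classes above concrete, the following sketch tabulates the order of magnitude of each example step count for a few input sizes. The exact formulas (100 N + 5, N^2 + 4 N + 6, N^3 + 1000 log N, 2^N) are a reading of the garbled slide text and should be treated as assumptions; the function name `steps` is purely illustrative.

```python
import math

def steps(n):
    """Step counts for the example running times, for input size n.

    The formulas are assumed reconstructions of the slide's examples.
    """
    return {
        "linear: 100*N + 5": 100 * n + 5,
        "quadratic: N^2 + 4*N + 6": n**2 + 4 * n + 6,
        "cubic: N^3 + 1000*log(N)": n**3 + 1000 * math.log(n),
        "exponential: 2^N": 2**n,
    }

for n in (10, 100, 1000):
    print(f"N = {n}")
    for name, count in steps(n).items():
        # Print only the order of magnitude: 2^N is far too large to print in full.
        print(f"  {name:28s} ~ 10^{math.log10(count):.1f}")
```

For N = 1000 the three polynomial rows stay below 10^10 steps, while the exponential row is already around 10^301.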

6
Polynomial vs. Exponential Complexity
Polynomial time: tractable, can hope to solve
very large problems with enough computing
power. E.g. known routing / shortest path
algorithms: O(N^3)
Exponential time: quickly run into scalability
issues as N increases. E.g. best known
algorithms for TSP
7
Are some problems inherently harder than
others? A large amount of work on answering this
question: computational complexity theory
8
Computational Complexity Hierarchy
(classes listed from Hard at the top to Easy at the bottom)
Hard
  • EXP. EXP-complete: games like Go, ...
  • PSPACE. PSPACE-complete: QBF, adversarial planning,
    chess (bounded), ...
  • P^#P. #P-complete/hard: #SAT, sampling,
    probabilistic inference, ...
  • PH
  • NP. NP-complete: SAT, scheduling, graph
    coloring, puzzles, ...
  • P. P-complete: circuit-value, ...
    In P: sorting, shortest path, ...
Easy
Note: widely believed hierarchy; know P ≠ EXP for
sure
9
NP-Completeness
  • P: class of problems for which a solution can
    be found in poly time, e.g. can find a
    shortest path in poly time
  • NP: class of problems for which a solution can be
    verified in poly time, e.g. can't find a
    TSP solution in poly time (as far as we know)
    but, given a candidate solution (a "witness"),
    can verify the correctness of the witness
    in poly time
  • N = non-deterministic, with the power of
    "guessing"; P = polynomial time
  • NP-complete: the hardest problems within NP

10
NP-Completeness
  • One of the biggest discoveries in Computer
    Science
  • All NP-complete problems are equally hard
    (in terms of worst-case complexity)!
  • An algorithm for any one NP-complete problem can
    be used to solve any other NP-complete problem
    with only a polynomial overhead!
  • There are catalogues of 10,000s of such
    problems, e.g. Boolean satisfiability or SAT,
    TSP, scheduling, (bounded) planning, chip
    verification, 0-1 integer programming, graph
    coloring, logical inference, ...
  • Similarly for PSPACE-complete, #P-complete,
    etc.

11
Can one design a single algorithm that can
efficiently solve thousands of different problems
of interest?
12
The Quest for Machine Reasoning
A cornerstone of Artificial Intelligence
Objective: Develop foundations and technology to enable
effective, practical, large-scale automated
reasoning.
Machine Reasoning (1960-90s):
Computational complexity of reasoning appears to
severely limit real-world applications
Current reasoning technology:
Revisiting the challenge. Significant progress
with new ideas / tools for dealing with
complexity (scale-up), uncertainty, and
multi-agent reasoning
13
General Automated Reasoning
Problem instance → Model Generator (Encoder) →
General Inference Engine → Solution
  • Model Generator (Encoder): domain-specific,
    e.g. logistics, chess, planning, scheduling, ...
  • General Inference Engine: generic, applicable
    to all domains within range of modeling language
Research objective: Better reasoning and
modeling technology
Impact: Faster solutions in several domains
14
Reasoning Complexity
  • EXPONENTIAL COMPLEXITY INHERENT
  • A^N worst case
  • N = no. of variables/objects, A = no. of object
    states
  • TIME/SPACE
  • Higher granularity ⇒ more object states
  • Current implementations trade time with soundness

Search for rules to apply
For N variables: 2^N cases drive complexity!
Check Contradictions
15
Exponential Complexity Growth: The Challenge of
Complex Domains
Note: rough estimates, for propositional reasoning
[Figure: case complexity (vertical axis; ticks at 10^30, 10^47, 10^3010, 10^6020, 10^150,500, 10^301,020) plotted against the number of variables (100, 10K, 20K, 100K, 1M) and rules (constraints).
Domains shown, with variables / rules: Car repair diagnosis (100 / 200); Deep space mission control (10K / 50K); Chess, 20 steps deep (20K / 100K); Military Logistics (100K / 450K); VLSI Verification (0.5M / 1M); War Gaming (1M / 5M).
Reference points on the vertical axis: seconds until heat death of sun; protein folding calculation (petaflop-year); no. of atoms on the earth.]
Credit: Kumar, DARPA. Cited in Computer World
magazine
16
Progress in Last 15 Years
  • Focus: Combinatorial Search Spaces
  • Specifically, the Boolean satisfiability problem,
    SAT
  • Significant progress since the 1990s.
  • How much?
  • Problem size: We went from 100 variables, 200
    constraints (early 90s) to 1,000,000 vars. and
    5,000,000 constraints in 15 years. Search space:
    from 10^15 to 10^300,000. Aside: one can
    encode quite a bit in 1M variables.
  • Tools: 50 competitive SAT solvers available
  • Overview of the state of the art: Plenary talk
    at IJCAI-05 (Selman); Discrete App. Math. article
    (Kautz-Selman 06)

17
How Large are the Problems?
A bounded model checking problem
18
SAT Encoding

(automatically generated from problem
specification)
i.e., ((not x1) or x7), ((not x1) or x6),
etc.
x1, x2, x3, etc. are our Boolean variables (to be
set to True or False)
Should x1 be set to False??
19
10 Pages Later


i.e., (x177 or x169 or x161 or x153 or ... or x33 or x25
or x17 or x9 or x1 or (not x185)): clauses /
constraints are getting more interesting
Note x1
20
4,000 Pages Later


21
Finally, 15,000 Pages Later

Search space of truth assignments
Current SAT solvers solve this instance in under
30 seconds!
22
SAT Solver Progress
Solvers have continually improved over time
Source: Marques-Silva 2002
23
How do SAT Solvers Keep Improving?
  • From academically interesting to practically
    relevant.
  • We now have regular SAT solver competitions.
  • (Germany 89, Dimacs 93, China 96, SAT-02,
    SAT-03, ..., SAT-07)
  • E.g. at SAT-2006 (Seattle, Aug 06):
  • 35 solvers submitted, most of them open source
  • 500 industrial benchmarks
  • 50,000 benchmark instances available on the www
  • This constant improvement in SAT solvers is the
    key to making, e.g., SAT-based planning very
    successful.

24
Current Automated Reasoning Tools
  • Most-successful fully automated methods: based
    on Boolean Satisfiability (SAT) / Propositional
    Reasoning
  • Problems modeled as rules / constraints over
    Boolean variables
  • SAT solver used as the inference engine
  • Applications: single-agent search
  • AI planning:
    SATPLAN-06, fastest optimal planner, ICAPS-06
    competition (Kautz & Selman 06)
  • Verification, hardware and software:
    major groups at Intel, IBM, Microsoft, and
    universities such as CMU, Cornell, and
    Princeton. SAT has become the dominant
    technology.
  • Many other domains: Test pattern generation,
    Scheduling, Optimal Control, Protocol Design,
    Routers, Multi-agent systems, E-Commerce
    (E-auctions and electronic trading agents), etc.

25
Recall: General Automated Reasoning
Problem instance → Model Generator (Encoder) →
General Inference Engine → Solution
  • Model Generator (Encoder): domain-specific,
    e.g. logistics, chess, planning, scheduling, ...
  • General Inference Engine: generic, applicable
    to all domains within range of modeling language
Research objective: Better reasoning and
modeling technology
Impact: Faster solutions in several domains
26
Automated Reasoning with SAT
  • A simple but useful modeling language: Boolean
    formulas
  • Corresponding inference engine: Satisfiability
    or SAT algorithm (e.g. complete search, local
    search, message passing)
  • Numerous applications: hardware and software
    verification, planning, scheduling, e-commerce,
    circuit design, open problems in algebra, ...

27
Boolean Logic
  • Defined over Boolean (binary) variables a, b, c, ...
  • Each of these can be True (1, T) or False (0, F)
  • Variables connected together with logic
    operators: and (∧), or (∨), not (denoted ¬)
  • E.g. ((c ∧ ¬d) ∨ f) is True iff
    either c is True and d is False, or f is True
  • Fact: All other Boolean logic operators can be
    expressed with and, or, not
  • E.g. (a → b) same as (¬a or b)
  • Boolean formula, e.g. F = (a or b) and ¬(a
    and (b or c))
  • (Truth) Assignment: any setting of the variables
    to True or False
  • Satisfying assignment: assignment where the
    formula evaluates to True
  • E.g. F has 3 satisfying assignments:
    (0,1,0), (0,1,1), (1,0,0)

28
Boolean Logic: Example
  • F = (a or b) and ¬(a and (b or c))
  • Note: True often written as 1, False as 0
  • There are 2^3 = 8 possible truth assignments to a,
    b, c
  • (a=0,b=1,c=0) representing (a=False, b=True,
    c=False)
  • (a=0,b=0,c=1)
  • Exactly 3 truth assignments satisfy F:
  • (a=0,b=1,c=0)
  • (a=0,b=1,c=1)
  • (a=1,b=0,c=0)
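The count of three satisfying assignments can be checked by writing out the full truth table. The sketch below reads the formula as F = (a or b) and not (a and (b or c)), with assignments written as (a, b, c) using 1 for True and 0 for False; the Python names are illustrative only.

```python
from itertools import product

def F(a, b, c):
    """F = (a or b) and not (a and (b or c))."""
    return (a or b) and not (a and (b or c))

# Enumerate all 2^3 = 8 truth assignments and keep the satisfying ones.
satisfying = [bits for bits in product([0, 1], repeat=3) if F(*bits)]
print(satisfying)  # [(0, 1, 0), (0, 1, 1), (1, 0, 0)]
```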

29
Boolean Logic: Expressivity
  • All discrete single-agent search problems can be
    cast as a Boolean formula
  • Variables a, b, c, ... often represent "states" of
    the system, "events", "actions", etc.
  • (more on this later, using Planning as an
    example)
  • Very general encoding language. E.g. can handle
  • Numbers (k-bit binary representation)
  • Floating-point numbers
  • Arithmetic operators like +, x, exp(), log()
  • SAT encodings (generated automatically from high
    level languages) routinely used in domains like
    planning, scheduling, verification, e-commerce,
    network design, ...

Recall: Example
Variables (events, states, actions):
  • X1 = email_received
  • X2 = in_meeting
  • X3 = urgent
  • X4 = respond_to_email
  • X5 = near_deadline
  • X6 = postpone
  • X7 = air_ticket_info_request
  • X8 = travel_request
  • X9 = info_request
Rules (constraints):
  • 1. X1 and (not X2) and X3 → X4
  • 2. X2 → not X4
  • 3. X5 → X3 or X6
  • 4. X7 → X8
  • 5. X8 → X9
  • 6. X8 → X5
  • 7. X6 → not X9
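A small sketch of how such rules can drive inference: it reads the seven rules as implications (rule 1 is assumed to be "X1 and (not X2) and X3 implies X4", since the connectives are lost in the transcript), enumerates all 2^9 assignments, and reports which variables take the same value in every assignment that satisfies the rules with X7 (air_ticket_info_request) set to True. The helper names are hypothetical.

```python
from itertools import product

def implies(p, q):
    """Material implication: p -> q."""
    return (not p) or q

def rules_hold(x):
    """x maps a variable index (1..9) to a bool; True if all 7 rules hold."""
    return all([
        implies(x[1] and not x[2] and x[3], x[4]),  # rule 1 (assumed reading)
        implies(x[2], not x[4]),                    # rule 2
        implies(x[5], x[3] or x[6]),                # rule 3
        implies(x[7], x[8]),                        # rule 4
        implies(x[8], x[9]),                        # rule 5
        implies(x[8], x[5]),                        # rule 6
        implies(x[6], not x[9]),                    # rule 7
    ])

# All assignments that satisfy the rules and have X7 = True.
models = []
for bits in product([False, True], repeat=9):
    x = dict(zip(range(1, 10), bits))
    if x[7] and rules_hold(x):
        models.append(x)

# A variable is "forced" if it has the same value in every such model.
forced = {i: models[0][i] for i in range(1, 10)
          if all(m[i] == models[0][i] for m in models)}
print(forced)
```

Under this reading, an air ticket request forces X8, X9, X5 and X3 to True and X6 to False, while X1, X2 and X4 remain open.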
30
Boolean Logic: Standard Representations
  • Each problem constraint typically specified as (a
    set of) clauses (only or, not)
  • E.g. (a or b), (c or d or ¬f), (¬a or c or
    d), ...
  • Formula in conjunctive normal form, or CNF: a
    conjunction of clauses
  • E.g. F = (a or b) and ¬(a and (b or c))
    changes to
  • F_CNF = (a or b) and (¬a or ¬b) and (b
    or ¬c)
  • Alternative [useful for QBF]: specify each
    constraint as a term (only and, not)
  • E.g. (a and ¬d), (b and ¬a and f), (¬b and
    d and e), ...
  • Formula in disjunctive normal form, or DNF: a
    disjunction of terms
  • E.g. F_DNF = (¬a and b) or (a and ¬b and ¬c)
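As a sanity check on the normal forms above, the sketch below compares the original formula F with the stated CNF and DNF versions on all eight assignments. The function names `F_cnf` and `F_dnf` are illustrative, and the formulas are read with ¬ restored where the transcript lost it.

```python
from itertools import product

def F(a, b, c):
    return (a or b) and not (a and (b or c))

def F_cnf(a, b, c):
    # (a or b) and (not a or not b) and (b or not c)
    return (a or b) and (not a or not b) and (b or not c)

def F_dnf(a, b, c):
    # (not a and b) or (a and not b and not c)
    return (not a and b) or (a and not b and not c)

assignments = list(product([False, True], repeat=3))
equivalent = all(
    bool(F(*v)) == bool(F_cnf(*v)) == bool(F_dnf(*v)) for v in assignments
)
print(equivalent)  # True
```

The DNF has one term per group of satisfying assignments: (¬a and b) covers (0,1,0) and (0,1,1), and (a and ¬b and ¬c) covers (1,0,0).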
31
Boolean Satisfiability Testing
  • The Boolean Satisfiability Problem, or SAT:
  • Given a Boolean formula F,
  • find a satisfying assignment for F,
  • or prove that no such assignment exists.
  • A wide range of applications
  • Relatively easy to test for small formulas (e.g.
    with a Truth Table)
  • However, very quickly becomes hard to solve
  • Search space grows exponentially with formula
    size (more on this next)
  • SAT technology has been very successful in taming
    this exponential blow up!

32
SAT Search Space
All vars free
  • SAT Problem: Find a path to a True leaf node.
  • For N Boolean variables, the raw search space is
    of size 2^N
  • Grows very quickly with N
  • Brute-force exhaustive search unrealistic without
    efficient heuristics, etc.

33
SAT Solution
All vars free
  • A solution to a SAT problem can be seen as a path
    in the search tree that leads to the formula
    evaluating to True at the leaf.
  • Goal: Find such a path efficiently out of the
    exponentially many paths.
  • Note: this is a 4 variable example. Imagine a
    tree for 1,000,000 variables!

34
k-CNF, 3-CNF
  • k-CNF: all clauses have k literals
  • 1-CNF SAT: trivial
  • 2-CNF SAT: solvable in O(N^2) time (N = num.
    of variables)
  • 3-CNF SAT: NP-complete
  • 4-CNF SAT: NP-complete

Note: Any Boolean formula can be converted into
CNF, with or without extra variables (without
⇒ size increase)
35
Worst-Case Complexity
  • SAT is an NP-complete problem
  • Worst-case believed to be exponential (roughly 2^N
    for N variables)
  • 10,000 problems in CS are NP-complete (e.g.
    planning, scheduling, protein folding, reasoning)
  • P vs. NP: $1M Clay Prize
  • However, real-world instances are usually not
    pathological and can often be solved very quickly
    with the latest technology!
  • Typical-case complexity provides a more detailed
    understanding and a more positive picture.

36
Exponential Complexity Growth
Planning (single-agent): find the right
sequence of actions
HARD: 10 actions, 10! ≈ 3.6 x 10^6 possible plans
Contingency planning (multi-agent): actions
may or may not produce the desired effect!
REALLY HARD: 10 x 9^2 x 8^4 x 7^8 x ... x 2^256
≈ 10^224 possible contingency
plans!
37
Typical-Case Complexity
A key hardness parameter for k-SAT: the ratio
of clauses to variables
Problems that are not critically constrained tend
to be much easier in practice than the relatively
few critically constrained ones
[Mitchell, Selman, and Levesque 92; Kirkpatrick
and Selman, Science 94]
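The role of the clause-to-variable ratio can be illustrated with a toy experiment: generate random 3-CNF formulas at several ratios and measure what fraction is satisfiable. The sketch below is only a small-scale illustration under stated assumptions (8 variables, 20 instances per ratio, brute-force checking, a fixed seed); these parameters and the function names are not from the slides. For large random 3-SAT, the sharp change is usually reported at roughly 4.3 clauses per variable, but at this tiny size the transition is smeared out.

```python
import random
from itertools import product

def random_3sat(n_vars, n_clauses, rng):
    """Random 3-CNF: each clause picks 3 distinct variables, each negated
    with probability 1/2. Literal +v is variable v, -v is its negation."""
    clauses = []
    for _ in range(n_clauses):
        chosen = rng.sample(range(1, n_vars + 1), 3)
        clauses.append([v if rng.random() < 0.5 else -v for v in chosen])
    return clauses

def is_satisfiable(clauses, n_vars):
    """Brute force over all 2^n assignments (only sensible for tiny n)."""
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def sat_fraction(ratio, n_vars=8, trials=20, seed=0):
    """Fraction of random instances at the given ratio that are satisfiable."""
    rng = random.Random(seed)
    n_clauses = round(ratio * n_vars)
    hits = sum(is_satisfiable(random_3sat(n_vars, n_clauses, rng), n_vars)
               for _ in range(trials))
    return hits / trials

for ratio in (2.0, 4.0, 6.0, 8.0):
    print(ratio, sat_fraction(ratio))
```

Under-constrained instances (low ratio) are almost always satisfiable and over-constrained ones (high ratio) almost never; the critically constrained instances in between are the ones that tend to be hard.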
38
Typical-Case Complexity
SAT solvers continually getting close to tackling
problems in the hardest region!
SP (survey propagation) now handles 1,000,000
variables very near the phase transition region
39
Tractable Sub-Structure Can Dominate and
Drastically Reduce Solution Cost!
2+p-SAT model: mix 2-SAT (tractable) and 3-SAT
(intractable) clauses
[Plot: median runtime vs. number of variables]
> 40% 3-SAT: exponential scaling
≤ 40% 3-SAT: linear scaling!
(Monasson, Selman et al. Nature 99; Achlioptas
00)
40
How are other NP-complete problems translated
into SAT instances? SAT encoding
41
SAT Encoding Example: Planning Domain
  • Planning Problem → Propositional CNF formula, by
    axiom schemas
  • Logistics planning: think of a number of trucks
    and planes that need to transport a bunch of
    packages from their origin to their destination
  • Discrete time, modeled by integers
  • state predicates: indexed by time at which they
    hold
  • E.g. at_location(x,loc,i), free(x,i+1),
    route(cityA,cityB,i)
  • action predicates: indexed by time at which
    action begins
  • E.g. fly(cityA,cityB,i), pickup(x,loc,i),
    drive_truck(loc1,loc2,i)
  • each action takes 1 time step
  • many actions may occur at the same step

42
Encoding Rules
  • Actions imply preconditions and effects
  • fly(x,y,i) → at(x,i) and route(x,y,i)
    and at(y,i+1)
  • Conflicting actions cannot occur at same time (A
    deletes a precondition of B)
  • fly(x,y,i) and y≠z → not fly(x,z,i)
  • If something changes, an action must have caused
    it (Explanatory Frame Axioms)
  • at(x,i) and not at(x,i+1) → ∃y .
    route(x,y) and fly(x,y,i)
  • Initial and final states hold
  • at(NY,0) and ... and at(LA,9) and ...
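To show what "instantiating an axiom schema" means, the sketch below grounds the first schema, read as fly(x,y,i) → at(x,i) and route(x,y,i) and at(y,i+1), over a hypothetical set of cities and time steps. Each implication "fly → p" becomes the clause (not fly or p). The clause representation (pairs of a sign and an atom string) and the function name `fly_axioms` are assumptions for illustration, not the encoding used by any particular planner.

```python
from itertools import permutations

def fly_axioms(cities, horizon):
    """Ground 'fly(x,y,i) -> at(x,i) and route(x,y,i) and at(y,i+1)'
    for all ordered city pairs and time steps 0..horizon-1."""
    clauses = []
    for x, y in permutations(cities, 2):
        for i in range(horizon):
            fly = f"fly({x},{y},{i})"
            consequences = (
                f"at({x},{i})",         # precondition: at the origin at step i
                f"route({x},{y},{i})",  # precondition: a route exists
                f"at({y},{i + 1})",     # effect: at the destination at step i+1
            )
            for atom in consequences:
                # fly -> atom, written as the clause (not fly or atom)
                clauses.append([("not", fly), ("pos", atom)])
    return clauses

for clause in fly_axioms(["NY", "LA"], horizon=1):
    print(clause)
```

With c cities and T time steps this one schema already yields 3 · c · (c − 1) · T clauses, which is why realistic planning instances grow to very large CNF formulas.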

43
Using SAT Solvers for Planning
Modeling and Solving a Planning Problem
Problem description in high level language
→ axiom schemas (manual), length
→ instantiate → instantiated propositional clauses
→ SAT engine(s) → satisfying model
→ interpret (using the mapping) → plan
(fully automatic)
44
Planning Benchmark Complexity
  • Logistics domain: a complex, highly-parallel
    transportation domain
  • E.g. logistics.d problem:
  • 2,165 possible actions per time slot
  • optimal solution contains 74 distinct actions
    over 14 time slots
  • (out of 5 x 10^46 possible sequential plans of
    length 14)
  • Satplan [Selman et al.] approach is currently the
    fastest optimal planning approach. Winner of the
    ICAPS-05 and ICAPS-06 international planning
    competitions.

45
Solution Approaches to SAT
46
Solving SAT: Systematic Search
  • One possibility: enumerate all truth assignments
    one-by-one, test whether any satisfies F
  • Note: testing is easy!
  • But too many truth assignments (e.g. for N = 1000
    variables, have 2^1000 ≈ 10^300 truth assignments)
  • 00000000
  • 00000001
  • 00000010
  • 00000011
  • ...
  • 11111111
  • (2^N assignments in total)
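The enumeration strategy can be sketched in a few lines. The code below walks through the assignments in the same binary-counter order as the list above and returns the first one that satisfies every clause. Clauses are lists of nonzero integers (+v for variable v, -v for its negation), a common convention assumed here; the function name is illustrative.

```python
def brute_force_sat(clauses, n_vars):
    """Try all 2^n assignments in counting order; return the first satisfying
    one as a bit string (leftmost bit = variable 1), or None if there is none."""
    for i in range(2 ** n_vars):
        bits = format(i, f"0{n_vars}b")  # e.g. "010" for i = 2, n_vars = 3
        value = {v: bits[v - 1] == "1" for v in range(1, n_vars + 1)}
        # Testing one assignment is easy: every clause needs one true literal.
        if all(any(value[abs(l)] == (l > 0) for l in clause) for clause in clauses):
            return bits
    return None

# (a or b) and (not a or not b) and (b or not c), with a, b, c = variables 1, 2, 3
print(brute_force_sat([[1, 2], [-1, -2], [2, -3]], 3))  # 010
```

Each test is cheap, but the loop runs up to 2^N times, which is what makes this approach hopeless beyond a few dozen variables.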
47
Solving SAT: Systematic Search
  • Smarter approach: the DPLL procedure [1960s]
  • (Davis, Putnam, Logemann, Loveland)
  • Assign values to variables one at a time
    (partial assignments)
  • Simplify F
  • If contradiction (i.e. some clause becomes
    False), backtrack, flip last unflipped
    variable's value, and continue search
  • Extended with many new techniques: 100s of
    research papers, yearly conference on SAT, e.g.,
    extremely efficient data-structures
    (representation), randomization, restarts,
    learning reasons of failure
  • Provides proof of unsatisfiability if F is unsat.
    [complete method]
  • Forms the basis of dozens of very effective SAT
    solvers! e.g. minisat, zchaff, relsat, rsat, ...
    (open source, available on the www)
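The assign / simplify / backtrack loop described above can be sketched as a short recursive function. This is a minimal teaching version, not the implementation of any of the solvers named: it adds unit propagation (a standard DPLL ingredient not spelled out on the slide), picks the branching literal naively, and has none of the modern extensions such as clause learning or restarts. Clauses are lists of nonzero integers (+v / -v), an assumed convention.

```python
def simplify(clauses, lit):
    """Set literal `lit` to True: drop satisfied clauses and remove the
    opposite literal from the rest. Returns None on a contradiction."""
    out = []
    for clause in clauses:
        if lit in clause:
            continue                      # clause already satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None                   # clause became False
        out.append(reduced)
    return out

def dpll(clauses, assignment=()):
    """Return a list of literals that satisfies all clauses, or None if the
    formula is unsatisfiable. Variables not mentioned are free."""
    if clauses is None or any(len(c) == 0 for c in clauses):
        return None
    if not clauses:
        return list(assignment)           # every clause satisfied
    for clause in clauses:                # unit propagation: forced choices
        if len(clause) == 1:
            return dpll(simplify(clauses, clause[0]), assignment + (clause[0],))
    lit = clauses[0][0]                   # naive branching choice
    for choice in (lit, -lit):            # try a value, then flip it
        result = dpll(simplify(clauses, choice), assignment + (choice,))
        if result is not None:
            return result
    return None                           # both values failed: backtrack

print(dpll([[1, 2], [-1, -2], [2, -3]]))  # [1, -2, -3]
```

Because the search is exhaustive, a `None` result is a proof that no satisfying assignment exists, which is what makes the method complete.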

48
Solving SAT: Local Search
  • Search space: all 2^N truth assignments for F
  • Goal: starting from an initial truth assignment
    A0, compute assignments A1, A2, ..., As such that
    As is a satisfying assignment for F
  • A(i+1) is computed by a local transformation to
    Ai, e.g. one bit flips at each step:
    A1 = 000110111, A2 = 001110111,
    A3 = 001110101, A4 = 101110101, ...,
    As = 111010000 (solution found!)
  • No proof of unsatisfiability if F is unsat.
    [incomplete method]
  • Several SAT solvers based on this approach, e.g.
    Walksat
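A minimal sketch of this kind of local search follows. It is written in the spirit of Walksat but is a simplified variant, not the actual Walksat algorithm or code: at each step it picks an unsatisfied clause and flips either a random variable from it (with probability `noise`) or the variable whose flip leaves the fewest unsatisfied clauses. The parameter names, defaults and clause convention (+v / -v integers) are assumptions.

```python
import random

def walksat_style(clauses, n_vars, max_flips=10_000, noise=0.5, seed=0):
    """Return a satisfying assignment (dict var -> bool) or None if none was
    found within max_flips. None is NOT a proof of unsatisfiability."""
    rng = random.Random(seed)
    value = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}  # A0

    def satisfied(clause):
        return any(value[abs(l)] == (l > 0) for l in clause)

    def unsat_after_flip(v):
        # Number of unsatisfied clauses if variable v were flipped.
        value[v] = not value[v]
        count = sum(not satisfied(c) for c in clauses)
        value[v] = not value[v]
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return value                          # As: solution found
        clause = rng.choice(unsat)
        if rng.random() < noise:
            var = abs(rng.choice(clause))         # random-walk move
        else:
            var = min((abs(l) for l in clause), key=unsat_after_flip)  # greedy move
        value[var] = not value[var]               # local transformation: one flip
    return None

model = walksat_style([[1, 2], [-1, -2], [2, -3]], 3)
print(model)
```

On an unsatisfiable formula the loop simply runs out of flips, which is why local search is an incomplete method.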

49
Solving SAT: Decimation
  • Search space: all 2^N truth assignments for F
  • Goal: attempt to construct a solution in
    one-shot by very carefully setting one variable
    at a time
  • Survey Inspired Decimation:
  • Estimate certain marginal probabilities of each
    variable being True, False, or undecided in
    each solution cluster using Survey Propagation
  • Fix the variable that is the most biased to its
    preferred value
  • Simplify F and repeat
  • A method rarely used by computer scientists
  • But has had tremendous success in the
    physics community on random k-SAT; can easily
    solve random instances with 1M variables!
  • No searching for solution
  • No proof of unsatisfiability [incomplete method]

50
The Next Two Lectures
  • Problems beyond SAT / searching for a single
    solution
  • #P-complete: count the number of solutions of a
    SAT instance
  • #P-hard: sample a solution uniformly at random
    for a SAT instance
  • PSPACE-complete: quantified Boolean formula (QBF)

51
Thank you for attending!
Slides: http://www.cs.cornell.edu/sabhar/tutorials/kitpc08-combinatorial-problems-I.ppt
Ashish Sabharwal: http://www.cs.cornell.edu/sabhar
Bart Selman: http://www.cs.cornell.edu/selman