Combinatorial Problems I: Finding Solutions


1
Combinatorial Problems I: Finding Solutions

Ashish Sabharwal
Cornell University
March 3, 2008
2nd Asian-Pacific School on Statistical Physics
and Interdisciplinary Applications
KITPC/ITP-CAS, Beijing, China

2
Cross-fertilization of ideas for the study and design of Intelligent Systems
Fields involved: Computer Science, Engineering, Mathematics, Operations Research, Economics, Physics, Cognitive Science
Diagram label: Phase transition
Research is part of Cornell's Intelligent Information Systems Institute (IISI); Director: Carla Gomes
3
Combinatorial Problems

Examples:
Routing: Given a partially connected network on N nodes, find the shortest path between X and Y
Traveling Salesperson Problem (TSP): Given a partially connected network on N nodes, find a path that visits every node of the network exactly once. Much harder!!
Scheduling: Given N tasks with earliest start times, completion deadlines, and a set of M machines on which they can execute, schedule them so that they all finish by their deadlines

4
Problem Instance, Algorithm

Problem instance: a specific instantiation of the problem
E.g. three instances for the routing problem with N=8 nodes
Algorithm: a sequence of steps, a recipe
Objective: a single, generic algorithm for the problem that can solve any instance of that problem
5
Measuring the Effectiveness of Algorithms

Capture scaling with input size N, rather than runtime on specific instances
The most common notion in Computer Science is worst-case complexity: What is the longest time (or number of steps) the algorithm might take on any input of size N?
Perhaps only N steps, or 100 N + 5 √N: linear time, O(N)
Maybe N^2 steps, or N^2 + 4 N + 6: quadratic, O(N^2)
Maybe N^3 + 1000 log N: cubic, O(N^3)
Maybe 2^N, or 2^N + N^1000: exponential, O(2^N)
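
To make these growth rates concrete, here is a small Python sketch that tabulates the four representative step counts for a few input sizes. The function name `step_counts` and the chosen sizes are illustrative assumptions, not part of the original slides.

```python
def step_counts(n):
    """Return representative worst-case step counts for input size n."""
    return {
        "linear": n,            # O(N)
        "quadratic": n ** 2,    # O(N^2)
        "cubic": n ** 3,        # O(N^3)
        "exponential": 2 ** n,  # O(2^N)
    }


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        c = step_counts(n)
        print(n, c["linear"], c["quadratic"], c["cubic"], c["exponential"])
```

Even at N = 80 the cubic count is still in the hundreds of thousands, while the exponential count already exceeds 10^24.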

6
Polynomial vs. Exponential Complexity
Polynomial time: tractable, can hope to solve very large problems with enough computing power. E.g. known routing / shortest path algorithms: O(N^3)
Exponential time: quickly run into scalability issues as N increases. E.g. best known algorithms for TSP
7
Are some problems inherently harder than others?
A large amount of work on answering this question: computational complexity theory
8
Computational Complexity Hierarchy
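(Classes listed from hard at the top to easy at the bottom)
EXP: EXP-complete problems include games like Go, ...
PSPACE: PSPACE-complete problems include QBF, adversarial planning, chess (bounded), ...
P^#P: #P-complete/hard problems include #SAT, sampling, probabilistic inference, ...
PH
NP: NP-complete problems include SAT, scheduling, graph coloring, puzzles, ...
P: P-complete problems include circuit-value, ...; also in P: sorting, shortest path, ...
Note: widely believed hierarchy; we know P ≠ EXP for sure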
9
NP-Completeness

P: class of problems for which a solution can be found in poly time; e.g. can find a shortest path in poly time
NP: class of problems for which a solution can be verified in poly time; e.g. can't find a TSP solution in poly time (as far as we know) but, given a candidate solution (a witness), can verify the correctness of the witness in poly time
N = non-deterministic, with the power of guessing; P = polynomial time
NP-complete: the hardest problems within NP
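
The "verify a witness in poly time" idea can be sketched in a few lines of Python. The check below follows the earlier slide's description of the problem (a path that visits every node of the network exactly once); the function name `verify_path` and the tiny example network are assumptions made for illustration.

```python
def verify_path(edges, nodes, path):
    """Polynomial-time check that `path` is a valid witness: it visits
    every node exactly once and only moves along (undirected) edges."""
    edge_set = {frozenset(e) for e in edges}
    if sorted(path) != sorted(nodes):        # every node exactly once
        return False
    return all(frozenset((u, v)) in edge_set
               for u, v in zip(path, path[1:]))


# Hypothetical partially connected network on 4 nodes
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]
nodes = ["A", "B", "C", "D"]

if __name__ == "__main__":
    print(verify_path(edges, nodes, ["A", "B", "C", "D"]))
    print(verify_path(edges, nodes, ["A", "D", "C", "B"]))
```

Checking a candidate takes time roughly linear in the size of the network, whereas finding one may require trying exponentially many orderings.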

10
NP-Completeness

One of the biggest discoveries in Computer Science:
All NP-complete problems are equally hard! (in terms of worst-case complexity)
An algorithm for any one NP-complete problem can be used to solve any other NP-complete problem with only a polynomial overhead!
There are catalogues of 10,000s of such problems, e.g. Boolean satisfiability or SAT, TSP, scheduling, (bounded) planning, chip verification, 0-1 integer programming, graph coloring, logical inference, ...
Similarly for PSPACE-complete, #P-complete, etc.

11
Can one design a single algorithm that can
efficiently solve thousands of different problems
of interest?
12
The Quest for Machine Reasoning
A cornerstone of Artificial Intelligence
Objective: Develop foundations and technology to enable effective, practical, large-scale automated reasoning.
Machine Reasoning (1960-90s): Computational complexity of reasoning appears to severely limit real-world applications
Current reasoning technology: Revisiting the challenge. Significant progress with new ideas / tools for dealing with complexity (scale-up), uncertainty, and multi-agent reasoning
13
General Automated Reasoning
Pipeline: Problem instance -> Model Generator (Encoder) -> General Inference Engine -> Solution
Problem instance: domain-specific, e.g. logistics, chess, planning, scheduling, ...
General Inference Engine: generic, applicable to all domains within range of modeling language
Research objective: Better reasoning and modeling technology
Impact: Faster solutions in several domains
14
Reasoning Complexity

Exponential complexity is inherent: A^N in the worst case (time/space)
N = no. of variables/objects; A = no. of object states
Higher granularity => more object states
Current implementations trade time with soundness
Steps: search for rules to apply; check contradictions
For N variables: 2^N cases drive complexity!
15
Exponential Complexity Growth: The Challenge of Complex Domains
Note: rough estimates, for propositional reasoning
[Chart: case complexity vs. number of variables and rules (constraints)]
Variables axis ticks: 100, 10K, 20K, 100K, 1M
Domains plotted, largest first (size pairs as listed on the chart): War Gaming (1M, 5M); VLSI Verification (0.5M, 1M); Military Logistics (100K, 450K); Chess, 20 steps deep (20K, 100K); Deep space mission control (10K, 50K); Car repair diagnosis (100, 200)
Case complexity values shown: 10^301,020; 10^150,500; 10^6020; 10^3010; 10^47; 10^30
Reference points shown: no. of atoms on the earth; seconds until heat death of sun; protein folding calculation (petaflop-year)
Credit: Kumar, DARPA. Cited in Computer World magazine
16
Progress in Last 15 Years

Focus: Combinatorial Search Spaces
Specifically, the Boolean satisfiability problem, SAT
Significant progress since the 1990s. How much?
Problem size: We went from 100 variables, 200 constraints (early 90s) to 1,000,000 vars. and 5,000,000 constraints in 15 years. Search space: from 10^15 to 10^300,000. Aside: one can encode quite a bit in 1M variables.
Tools: 50 competitive SAT solvers available
Overview of the state of the art: plenary talk at IJCAI-05 (Selman); Discrete App. Math. article (Kautz-Selman 06)

17
How Large are the Problems?
A bounded model checking problem
18
SAT Encoding

(automatically generated from problem
specification)
i.e., ((not x1) or x7) and ((not x1) or x6), etc.
x1, x2, x3, etc. are our Boolean variables (to be
set to True or False)
Should x1 be set to False??
19
10 Pages Later

i.e., (x177 or x169 or x161 or x153 or ... or x33 or x25 or x17 or x9 or x1 or (not x185))
Clauses / constraints are getting more interesting
Note: x1
20
4,000 Pages Later

21
Finally, 15,000 Pages Later

Search space of truth assignments
Current SAT solvers solve this instance in under
30 seconds!
22
SAT Solver Progress
Solvers have continually improved over time
Source: Marques-Silva 2002
23
How do SAT Solvers Keep Improving?

From academically interesting to practically relevant.
We now have regular SAT solver competitions. (Germany 89, Dimacs 93, China 96, SAT-02, SAT-03, ..., SAT-07)
E.g. at SAT-2006 (Seattle, Aug 06):
35 solvers submitted, most of them open source
500 industrial benchmarks
50,000 benchmark instances available on the www
This constant improvement in SAT solvers is the key to making, e.g., SAT-based planning very successful.

24
Current Automated Reasoning Tools

Most-successful fully automated methods: based on Boolean Satisfiability (SAT) / Propositional Reasoning
Problems modeled as rules / constraints over Boolean variables
SAT solver used as the inference engine
Applications: single-agent search
AI planning: SATPLAN-06, fastest optimal planner; ICAPS-06 competition (Kautz & Selman 06)
Verification (hardware and software): Major groups at Intel, IBM, Microsoft, and universities such as CMU, Cornell, and Princeton. SAT has become the dominant technology.
Many other domains: Test pattern generation, Scheduling, Optimal Control, Protocol Design, Routers, Multi-agent systems, E-Commerce (E-auctions and electronic trading agents), etc.

25
Recall General Automated Reasoning
Pipeline: Problem instance -> Model Generator (Encoder) -> General Inference Engine -> Solution
Problem instance: domain-specific, e.g. logistics, chess, planning, scheduling, ...
General Inference Engine: generic, applicable to all domains within range of modeling language
Research objective: Better reasoning and modeling technology
Impact: Faster solutions in several domains
26
Automated Reasoning with SAT

A simple but useful modeling language: Boolean formulas
Corresponding inference engine: Satisfiability or SAT algorithm (e.g. complete search, local search, message passing)
Numerous applications: hardware and software verification, planning, scheduling, e-commerce, circuit design, open problems in algebra, ...

27
Boolean Logic

Defined over Boolean (binary) variables a, b, c, ...
Each of these can be True (1, T) or False (0, F)
Variables connected together with logic operators: and (∧), or (∨), not (denoted ¬)
E.g. ((c ∧ ¬d) ∨ f) is True iff either c is True and d is False, or f is True
Fact: All other Boolean logic operators can be expressed with and, or, not
E.g. (a ⇒ b) is the same as (¬a or b)
Boolean formula, e.g. F = (a or b) and ¬(a and (b or c))
(Truth) Assignment: any setting of the variables to True or False
Satisfying assignment: an assignment where the formula evaluates to True
E.g. F has 3 satisfying assignments: (0,1,0), (0,1,1), (1,0,0)

28
Boolean Logic Example

F = (a or b) and ¬(a and (b or c))
Note: True is often written as 1, False as 0
There are 2^3 = 8 possible truth assignments to a, b, c:
(a=0,b=1,c=0), representing (a=False, b=True, c=False)
(a=0,b=0,c=1)
...
Exactly 3 truth assignments satisfy F:
(a=0,b=1,c=0)
(a=0,b=1,c=1)
(a=1,b=0,c=0)
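
The count of satisfying assignments for this example formula can be checked mechanically. The sketch below enumerates all 2^3 assignments; the names `F` and `satisfying` are chosen here for illustration only.

```python
from itertools import product


def F(a, b, c):
    """The example formula: (a or b) and not (a and (b or c))."""
    return (a or b) and not (a and (b or c))


# All assignments (a, b, c) over {0, 1} that make F true
satisfying = [bits for bits in product((0, 1), repeat=3) if F(*bits)]

if __name__ == "__main__":
    print(satisfying)
```

The list contains exactly the three assignments named on the slide.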

29
Boolean Logic Expressivity

All discrete single-agent search problems can be cast as a Boolean formula
Variables a, b, c, ... often represent states of the system, events, actions, etc. (more on this later, using Planning as an example)
Very general encoding language. E.g. can handle:
Numbers (k-bit binary representation)
Floating-point numbers
Arithmetic operators like +, x, exp(), log()
SAT encodings (generated automatically from high level languages) are routinely used in domains like planning, scheduling, verification, e-commerce, network design, ...

Recall Example
Variables (each one is an event, a state, or an action):
X1 = email_received, X2 = in_meeting, X3 = urgent, X4 = respond_to_email, X5 = near_deadline, X6 = postpone, X7 = air_ticket_info_request, X8 = travel_request, X9 = info_request

Rules (each rule is a constraint):
1. X1 and (not X2) and X3 ⇒ X4
2. X2 ⇒ not X4
3. X5 ⇒ X3 or X6
4. X7 ⇒ X8
5. X8 ⇒ X9
6. X8 ⇒ X5
7. X6 ⇒ not X9
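
As a rough illustration of how such rules support inference, the sketch below encodes the seven rules as Python predicates and asks, by brute force over all 2^9 assignments, whether a query variable is forced by some given facts. The helper names (`implies`, `rules_hold`, `entailed`) are hypothetical, and rule 1 is read as "X1 and (not X2) and X3 implies X4", which is an assumption about connectives lost in the transcript.

```python
from itertools import product


def implies(p, q):
    return (not p) or q


def rules_hold(x):
    # x[1]..x[9] hold the values of X1..X9 (x[0] is unused padding)
    return all([
        implies(x[1] and not x[2] and x[3], x[4]),  # rule 1 (assumed reading)
        implies(x[2], not x[4]),                    # rule 2
        implies(x[5], x[3] or x[6]),                # rule 3
        implies(x[7], x[8]),                        # rule 4
        implies(x[8], x[9]),                        # rule 5
        implies(x[8], x[5]),                        # rule 6
        implies(x[6], not x[9]),                    # rule 7
    ])


def entailed(facts, query):
    """True if X<query> is True in every assignment that satisfies the
    rules and agrees with `facts` (dict: variable index -> bool)."""
    models = []
    for bits in product((False, True), repeat=9):
        x = (None,) + bits
        if rules_hold(x) and all(x[i] == v for i, v in facts.items()):
            models.append(x)
    return bool(models) and all(m[query] for m in models)


if __name__ == "__main__":
    # Email received, not in a meeting, and it is an air ticket info request
    print(entailed({1: True, 2: False, 7: True}, 4))
```

With those three facts, the chain X7 ⇒ X8 ⇒ X9 and X5 rules out X6 and forces X3, so X4 (respond_to_email) follows; without X7 it does not.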
30
Boolean Logic Standard Representations

Each problem constraint is typically specified as (a set of) clauses (only or, not)
E.g. (a or b), (c or d or ¬f), (¬a or c or d), ...
Formula in conjunctive normal form, or CNF: a conjunction of clauses
E.g. F = (a or b) and ¬(a and (b or c)) changes to
F_CNF = (a or b) and (¬a or ¬b) and (b or ¬c)
Alternative, useful for QBF: specify each constraint as a term (only and, not)
E.g. (a and ¬d), (b and ¬a and f), (¬b and d and e), ...
Formula in disjunctive normal form, or DNF: a disjunction of terms
E.g. F_DNF = (¬a and b) or (a and ¬b and ¬c)
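
One way to convince oneself that the CNF and DNF forms above describe the same function as F is to compare them on every assignment. This is a small sketch; the function names are illustrative assumptions.

```python
from itertools import product


def f(a, b, c):
    return (a or b) and not (a and (b or c))


def f_cnf(a, b, c):
    return (a or b) and (not a or not b) and (b or not c)


def f_dnf(a, b, c):
    return (not a and b) or (a and not b and not c)


def equivalent(g, h, nvars=3):
    """Compare two Boolean functions on all 2^nvars assignments."""
    return all(bool(g(*v)) == bool(h(*v))
               for v in product((False, True), repeat=nvars))


if __name__ == "__main__":
    print(equivalent(f, f_cnf), equivalent(f, f_dnf))
```

This truth-table comparison is only practical for a handful of variables, which is exactly the scaling problem the later slides address.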
31
Boolean Satisfiability Testing

The Boolean Satisfiability Problem, or SAT
Given a Boolean formula F,
find a satisfying assignment for F
or prove that no such assignment exists.

A wide range of applications
Relatively easy to test for small formulas (e.g.
with a Truth Table)
However, very quickly becomes hard to solve
Search space grows exponentially with formula
size (more on this next)
SAT technology has been very successful in taming
this exponential blow up!

32
SAT Search Space
[Search tree figure; root node: all vars free]

SAT Problem: Find a path to a True leaf node.
For N Boolean variables, the raw search space is of size 2^N
Grows very quickly with N
Brute-force exhaustive search is unrealistic without efficient heuristics, etc.

33
SAT Solution
[Search tree figure; root node: all vars free]

A solution to a SAT problem can be seen as a path in the search tree that leads to the formula evaluating to True at the leaf.
Goal: Find such a path efficiently out of the exponentially many paths.
Note: this is a 4 variable example. Imagine a tree for 1,000,000 variables!

34
k-CNF, 3-CNF

k-CNF: all clauses have k literals
1-CNF SAT: trivial
2-CNF SAT: solvable in O(N^2) time (N = num. of variables)
3-CNF SAT: NP-complete
4-CNF SAT: NP-complete
...
Note: Any Boolean formula can be converted into CNF, with or without extra variables (without extra variables, the formula size can increase)
35
Worst-Case Complexity

SAT is an NP-complete problem
Worst-case: believed to be exponential (roughly 2^N for N variables)
10,000+ problems in CS are NP-complete (e.g. planning, scheduling, protein folding, reasoning)
P vs. NP: $1M Clay Prize
However, real-world instances are usually not pathological and can often be solved very quickly with the latest technology!
Typical-case complexity provides a more detailed understanding and a more positive picture.

36
Exponential Complexity Growth
Planning (single-agent): find the right sequence of actions
HARD: 10 actions, 10! ≈ 3.6 x 10^6 possible plans
Contingency planning (multi-agent): actions may or may not produce the desired effect!
REALLY HARD: 10 x 9^2 x 8^4 x 7^8 x ... x 2^256 ≈ 10^224 possible contingency plans!
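
The two counts on this slide can be reproduced with exact integer arithmetic. The sketch below assumes the product pattern is 10 x 9^2 x 8^4 x 7^8 x ... x 2^256 (the number of choices drops by one while the exponent doubles), which is a reading of the flattened superscripts in the transcript; the variable names are illustrative.

```python
import math

# Single-agent planning: orderings of 10 actions
sequential_plans = math.factorial(10)

# Contingency planning: 10 x 9^2 x 8^4 x 7^8 x ... x 2^256 (assumed pattern)
contingency_plans = 1
for k, choices in enumerate(range(10, 1, -1)):   # choices = 10, 9, ..., 2
    contingency_plans *= choices ** (2 ** k)     # exponent = 1, 2, 4, ..., 256

if __name__ == "__main__":
    print(sequential_plans)
    print(len(str(contingency_plans)) - 1)       # order of magnitude
```

The second number has 225 decimal digits, which matches the slide's rough figure of 10^224.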
37
Typical-Case Complexity
A key hardness parameter for k-SAT: the ratio of clauses to variables
Problems that are not critically constrained tend to be much easier in practice than the relatively few critically constrained ones
Mitchell, Selman, and Levesque 92; Kirkpatrick and Selman, Science 94
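
Experiments of this kind start from random formulas with a chosen clause-to-variable ratio. Below is a minimal generator sketch; the function name `random_ksat`, the signed-integer literal convention, and the seed parameter are assumptions for illustration, not the setup used in the cited papers.

```python
import random


def random_ksat(num_vars, ratio, k=3, seed=0):
    """Generate round(ratio * num_vars) random clauses of k literals each.
    A literal is a nonzero int: +v means variable v, -v means (not v)."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(round(ratio * num_vars)):
        variables = rng.sample(range(1, num_vars + 1), k)   # k distinct vars
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses


if __name__ == "__main__":
    formula = random_ksat(num_vars=20, ratio=4.25)
    print(len(formula), formula[:3])
```

Sweeping `ratio` from low (under-constrained) to high (over-constrained) and timing a solver on each batch is the usual way to observe the hard region in between.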
38
Typical-Case Complexity
SAT solvers are continually getting closer to tackling problems in the hardest region!
SP (survey propagation) now handles 1,000,000 variables very near the phase transition region
39
Tractable Sub-Structure Can Dominate and
Drastically Reduce Solution Cost!
2+p-SAT model: mix 2-SAT (tractable) and 3-SAT (intractable) clauses
[Plot: median runtime vs. number of variables]
> 40% 3-SAT clauses: exponential scaling
≤ 40% 3-SAT clauses: linear scaling!
(Monasson, Selman et al., Nature 99; Achlioptas 00)
40
How are other NP-complete problems translated into SAT instances?
SAT encoding
41
SAT Encoding Example Planning Domain

Planning Problem ⇒ Propositional CNF formula, by axiom schemas
Logistics planning: think of a number of trucks and planes that need to transport a bunch of packages from their origin to their destination
Discrete time, modeled by integers
State predicates: indexed by the time at which they hold
E.g. at_location(x,loc,i), free(x,i+1), route(cityA,cityB,i)
Action predicates: indexed by the time at which the action begins
E.g. fly(cityA,cityB,i), pickup(x,loc,i), drive_truck(loc1,loc2,i)
Each action takes 1 time step
Many actions may occur at the same step

42
Encoding Rules

Actions imply preconditions and effects:
fly(x,y,i) ⇒ at(x,i) and route(x,y,i) and at(y,i+1)
Conflicting actions cannot occur at the same time (A deletes a precondition of B):
fly(x,y,i) and y≠z ⇒ not fly(x,z,i)
If something changes, an action must have caused it (Explanatory Frame Axioms):
at(x,i) and not at(x,i+1) ⇒ ∃y . route(x,y) and fly(x,y,i)
Initial and final states hold:
at(NY,0) and ... and at(LA,9) and ...
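
As a sketch of how an axiom schema becomes propositional clauses, the code below grounds only the first rule (actions imply preconditions and effects) over a set of cities and a time horizon. The function name `instantiate_fly_axioms` and the (sign, atom) clause representation are assumptions made for this example; real encoders such as the one behind Satplan are considerably more involved.

```python
from itertools import permutations


def instantiate_fly_axioms(cities, horizon):
    """Ground  fly(x,y,i) => at(x,i) and route(x,y,i) and at(y,i+1).
    Each implication A => (B and C and D) becomes three clauses:
    (not A or B), (not A or C), (not A or D).
    A literal is a (sign, atom) pair; atoms are plain strings."""
    clauses = []
    for i in range(horizon):
        for x, y in permutations(cities, 2):      # ordered pairs, x != y
            fly = f"fly({x},{y},{i})"
            consequences = (f"at({x},{i})",
                            f"route({x},{y},{i})",
                            f"at({y},{i + 1})")
            for atom in consequences:
                clauses.append([(False, fly), (True, atom)])
    return clauses


if __name__ == "__main__":
    for clause in instantiate_fly_axioms(["NY", "LA"], horizon=1):
        print(clause)
```

The number of ground clauses grows with the number of city pairs times the number of time steps, which is why the instantiated formulas get large quickly.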

43
Using SAT Solvers for Planning
Modeling and Solving a Planning Problem
Inputs: problem description in high level language; axiom schemas (manual); plan length
Steps: instantiate -> instantiated propositional clauses -> SAT engine(s) -> satisfying model -> interpret (using the mapping) -> plan
Everything after the manual modeling step is fully automatic
44
Planning Benchmark Complexity

Logistics domain: a complex, highly-parallel transportation domain
E.g. logistics.d problem:
2,165 possible actions per time slot
optimal solution contains 74 distinct actions over 14 time slots
(out of 5 x 10^46 possible sequential plans of length 14)
Satplan [Selman et al.] approach is currently the fastest optimal planning approach. Winner of the ICAPS-05 & ICAPS-06 international planning competitions.

45
Solution Approaches to SAT
46
Solving SAT Systematic Search

One possibility: enumerate all truth assignments one-by-one, test whether any satisfies F
Note: testing is easy!
But too many truth assignments (e.g. for N=1000 variables, there are 2^1000 ≈ 10^301 truth assignments)
Enumeration order (2^N assignments in total): 00000000, 00000001, 00000010, 00000011, ..., 11111111
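
The enumeration idea can be written down directly. This is a sketch under an assumed clause representation (lists of signed integers, +v for variable v and -v for its negation); the function name `brute_force_sat` is illustrative.

```python
from itertools import product


def brute_force_sat(clauses, num_vars):
    """Try all 2^N assignments in counting order; return the first
    satisfying one as a tuple of bools, or None if there is none."""
    for bits in product((False, True), repeat=num_vars):
        # bits[v - 1] is the value of variable v
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return bits
    return None


if __name__ == "__main__":
    # (a or b) and (not a or not b) and (b or not c), with a=1, b=2, c=3
    print(brute_force_sat([[1, 2], [-1, -2], [2, -3]], 3))
```

Testing one assignment is cheap (linear in the formula size); the cost comes entirely from the 2^N iterations of the outer loop.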
47
Solving SAT Systematic Search

Smarter approach: the DPLL procedure, 1960s (Davis, Putnam, Logemann, Loveland)
Assign values to variables one at a time (partial assignments)
Simplify F
If contradiction (i.e. some clause becomes False), backtrack, flip the last unflipped variable's value, and continue search
Extended with many new techniques: 100s of research papers, yearly conference on SAT; e.g., extremely efficient data-structures (representation), randomization, restarts, learning reasons of failure
Provides proof of unsatisfiability if F is unsat.: a complete method
Forms the basis of dozens of very effective SAT solvers! E.g. minisat, zchaff, relsat, rsat, ... (open source, available on the www)
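
The core loop described above (assign, simplify, backtrack on contradiction) can be sketched compactly in Python. This is a bare-bones illustration with unit propagation and a naive branching rule; it omits the clause learning, restarts, and data structures that make the solvers named on the slide fast. Clauses are assumed to be non-empty lists of signed integers, and the function names are chosen here for illustration.

```python
def simplify(clauses, lit):
    """Set literal `lit` to True: drop satisfied clauses, shrink the rest.
    Returns None if some clause becomes empty (a contradiction)."""
    result = []
    for clause in clauses:
        if lit in clause:
            continue                          # clause already satisfied
        reduced = [l for l in clause if l != -lit]
        if not reduced:
            return None                       # clause became False
        result.append(reduced)
    return result


def dpll(clauses, assignment=None):
    """Return a satisfying partial assignment {var: bool}, or None if unsat."""
    assignment = dict(assignment or {})
    while True:                               # unit propagation
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            break
        clauses = simplify(clauses, units[0])
        if clauses is None:
            return None
        assignment[abs(units[0])] = units[0] > 0
    if not clauses:
        return assignment                     # every clause is satisfied
    var = abs(clauses[0][0])                  # naive branching choice
    for lit in (var, -var):                   # try True, then flip to False
        reduced = simplify(clauses, lit)
        if reduced is not None:
            model = dpll(reduced, {**assignment, var: lit > 0})
            if model is not None:
                return model
    return None                               # both values failed: backtrack


if __name__ == "__main__":
    print(dpll([[1, 2], [-1, -2], [2, -3]]))
    print(dpll([[1, 2], [1, -2], [-1, 2], [-1, -2]]))
```

Because the search only returns None after both values of a branching variable have failed, a None result amounts to a proof that the formula is unsatisfiable, which is what makes the method complete.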

48
Solving SAT Local Search

Search space: all 2^N truth assignments for F
Goal: starting from an initial truth assignment A0, compute assignments A1, A2, ..., As such that As is a satisfying assignment for F
Ai+1 is computed by a local transformation to Ai, e.g. (on the slide, the green bit flips to the red bit):
A1 = 000110111
A2 = 001110111
A3 = 001110101
A4 = 101110101
...
As = 111010000: solution found!
No proof of unsatisfiability if F is unsat.: an incomplete method
Several SAT solvers are based on this approach, e.g. Walksat
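
A simplified local search in the spirit of Walksat might look like the sketch below. It is not the actual Walksat algorithm or its code: the flip-selection rule, the `noise` parameter, the flip budget, and the function name `local_search` are assumptions chosen to keep the example short.

```python
import random


def local_search(clauses, num_vars, max_flips=10_000, noise=0.5, seed=0):
    """Repeatedly pick an unsatisfied clause and flip one of its variables.
    Returns a satisfying assignment (list of bools, index v - 1 for
    variable v) or None if the flip budget runs out."""
    rng = random.Random(seed)
    assign = [rng.random() < 0.5 for _ in range(num_vars)]

    def satisfied(clause):
        return any(assign[abs(l) - 1] == (l > 0) for l in clause)

    def unsat_after_flip(v):
        """Number of unsatisfied clauses if variable v were flipped."""
        assign[v - 1] = not assign[v - 1]
        count = sum(not satisfied(c) for c in clauses)
        assign[v - 1] = not assign[v - 1]
        return count

    for _ in range(max_flips):
        unsat = [c for c in clauses if not satisfied(c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)
        if rng.random() < noise:
            var = abs(rng.choice(clause))                    # random walk
        else:
            var = min((abs(l) for l in clause), key=unsat_after_flip)
        assign[var - 1] = not assign[var - 1]                # the bit flip
    return assign if all(satisfied(c) for c in clauses) else None


if __name__ == "__main__":
    print(local_search([[1, 2], [-1, -2], [2, -3]], 3))
```

When the function returns None it has only run out of flips, so nothing can be concluded about unsatisfiability; this is the incompleteness noted on the slide.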

49
Solving SAT Decimation

Search space: all 2^N truth assignments for F
Goal: attempt to construct a solution in one shot by very carefully setting one variable at a time
Survey Inspired Decimation:
Estimate certain marginal probabilities of each variable being True, False, or undecided in each solution cluster using Survey Propagation
Fix the variable that is the most biased to its preferred value
Simplify F and repeat
A method rarely used by computer scientists
But it has had tremendous success in the physics community on random k-SAT: can easily solve random instances with 1M variables!
No searching for solution
No proof of unsatisfiability: an incomplete method
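
The decimation loop itself (estimate marginals, fix the most biased variable, repeat) can be illustrated on tiny formulas. Note the substitution: the sketch below does not implement Survey Propagation. It computes exact marginals over all solutions by brute force, which is exponential and only serves to show the control flow. The function names and the signed-integer clause format are assumptions.

```python
from itertools import product


def solutions(clauses, num_vars, fixed):
    """All satisfying assignments consistent with `fixed` (brute force)."""
    out = []
    for bits in product((False, True), repeat=num_vars):
        if any(bits[v - 1] != val for v, val in fixed.items()):
            continue
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            out.append(bits)
    return out


def decimate(clauses, num_vars):
    """Fix the most biased free variable to its preferred value, repeat.
    Returns {var: bool} for all variables, or None if no solution exists."""
    fixed = {}
    while len(fixed) < num_vars:
        sols = solutions(clauses, num_vars, fixed)
        if not sols:
            return None
        best = None
        for v in range(1, num_vars + 1):
            if v in fixed:
                continue
            true_count = sum(s[v - 1] for s in sols)
            bias = abs(2 * true_count - len(sols))   # integer bias measure
            if best is None or bias > best[0]:
                best = (bias, v, 2 * true_count >= len(sols))
        fixed[best[1]] = best[2]                     # preferred value
    return fixed


if __name__ == "__main__":
    print(decimate([[1, 2], [-1, -2], [2, -3]], 3))
```

With exact marginals each fixing step keeps at least one solution alive, so no backtracking is needed; with estimated marginals, as in Survey Inspired Decimation, that guarantee is lost.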

50
The Next Two Lectures

Problems beyond SAT / searching for a single solution:
#P-complete: count the number of solutions of a SAT instance
#P-hard: sample a solution uniformly at random for a SAT instance
PSPACE-complete: quantified Boolean formula (QBF)

51
Thank you for attending!
Slides: http://www.cs.cornell.edu/sabhar/tutorials/kitpc08-combinatorial-problems-I.ppt
Ashish Sabharwal: http://www.cs.cornell.edu/sabhar
Bart Selman: http://www.cs.cornell.edu/selman
