Title: CS 4700: Foundations of Artificial Intelligence
1CS 4700: Foundations of Artificial Intelligence
- Carla P. Gomes
- gomes_at_cs.cornell.edu
- Module: Satisfiability
- (Reading: R&N Chapter 7)
2Proof methods
- Proof methods divide into (roughly) two kinds
- Application of inference rules
- Legitimate (sound) generation of new sentences from old
- Proof = a sequence of inference rule applications
- Can use inference rules as operators in a standard search algorithm
- Different types of proofs
- Model checking
- truth table enumeration (always exponential in n)
- improved backtracking, e.g., Davis-Putnam-Logemann-Loveland (DPLL) (including some inference rules)
- heuristic search in model space (sound but incomplete)
- e.g., min-conflicts-like hill-climbing algorithms
Previous module
3Satisfiability
4Propositional Satisfiability problem
- Satisfiability (SAT): Given a formula in propositional calculus, is there a model (i.e., a satisfying interpretation, an assignment to its variables) making it true?
- We consider clausal form, e.g.:
- ( a ∨ ¬b ∨ ¬c ) AND ( b ∨ ¬c ) AND ( a ∨ c )
2^n possible assignments
SAT: prototypical hard combinatorial search and reasoning problem. Problem is NP-Complete (Cook 1971). Surprising power of SAT for encoding computational problems.
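The clausal-form example above can be checked by plain enumeration of all 2^n assignments. The following is a minimal sketch, assuming clauses are written as lists of signed integers (a = 1, b = 2, c = 3, with a negative number for a negated variable); the function name `satisfying_assignments` is illustrative and not part of the slides.

```python
from itertools import product

def satisfying_assignments(clauses, n):
    """Enumerate all 2^n assignments and keep those that satisfy every clause."""
    models = []
    for values in product([False, True], repeat=n):
        # values[i] is the truth value of variable i + 1
        def lit_true(lit):
            return values[abs(lit) - 1] == (lit > 0)
        if all(any(lit_true(lit) for lit in clause) for clause in clauses):
            models.append(values)
    return models

# (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c), with a=1, b=2, c=3
clauses = [[1, -2, -3], [2, -3], [1, 3]]
models = satisfying_assignments(clauses, 3)
print("satisfiable:", len(models) > 0)
for m in models:
    print(dict(zip("abc", m)))
```

This is only practical for a handful of variables, which is exactly why the later slides turn to DPLL and local search.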
5Satisfiability as an Encoding Language
6Encoding Latin Square Problems in Propositional Logic
- Variables
- Each variable represents a color assigned to a cell.
- Clauses
- Some color must be assigned to each cell (clause of length n): n^2 clauses
- No color is repeated in the same row (sets of negative binary clauses): n(n-1)/2 × n × n clauses (example: row i, color k)
- No color is repeated in the same column (sets of negative binary clauses): n(n-1)/2 × n × n clauses (example: column j, color k)
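The three clause families above can be generated mechanically. This is a sketch under stated assumptions: the variable numbering in `var` (row-major over cells, then color) and the function names are chosen for illustration, and clauses are lists of signed integers.

```python
from itertools import combinations

def var(i, j, k, n):
    """Map (row i, column j, color k), all 0-based, to an integer index starting at 1."""
    return (i * n + j) * n + k + 1

def latin_square_2d(n):
    clauses = []
    # Some color must be assigned to each cell: n^2 clauses of length n.
    for i in range(n):
        for j in range(n):
            clauses.append([var(i, j, k, n) for k in range(n)])
    # No color is repeated in the same row: for each row i and color k,
    # one negative binary clause per pair of columns.
    for i in range(n):
        for k in range(n):
            for j1, j2 in combinations(range(n), 2):
                clauses.append([-var(i, j1, k, n), -var(i, j2, k, n)])
    # No color is repeated in the same column: for each column j and color k,
    # one negative binary clause per pair of rows.
    for j in range(n):
        for k in range(n):
            for i1, i2 in combinations(range(n), 2):
                clauses.append([-var(i1, j, k, n), -var(i2, j, k, n)])
    return clauses

n = 3
print(len(latin_square_2d(n)))  # equals n^2 + 2 * n * n * n(n-1)/2
```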
7 3D Encoding or Full Encoding
- This encoding is based on the cubic representation of the quasigroup: each line of the cube contains exactly one true variable
- Variables
- Same as 2D encoding.
- Clauses
- Same as the 2D encoding plus:
- Each color must appear at least once in each row
- Each color must appear at least once in each column
- No two colors are assigned to the same cell
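The "exactly one true variable per line of the cube" property can be checked directly on a concrete square. The sketch below assumes a cube x[i][j][k] that is 1 exactly when cell (i, j) has color k; the cyclic square and the function names are illustrative choices, not taken from the slides.

```python
def cyclic_latin_square(n):
    """Return the n x n square whose cell (i, j) holds color (i + j) mod n."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def cube_lines_ok(square):
    """Check that every axis-parallel line of the cube has exactly one true variable."""
    n = len(square)
    # x[i][j][k] is 1 iff cell (i, j) has color k.
    x = [[[1 if square[i][j] == k else 0 for k in range(n)]
          for j in range(n)] for i in range(n)]
    for a in range(n):
        for b in range(n):
            if sum(x[a][b][k] for k in range(n)) != 1:  # one color in cell (a, b)
                return False
            if sum(x[a][j][b] for j in range(n)) != 1:  # color b exactly once in row a
                return False
            if sum(x[i][a][b] for i in range(n)) != 1:  # color b exactly once in column a
                return False
    return True

print(cube_lines_ok(cyclic_latin_square(4)))
print(cube_lines_ok([[0, 0], [1, 1]]))  # not a Latin square: color 0 repeats in row 0
```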
8Dimacs format
- At the top of the file is a simple header.
- p cnf <variables> <clauses>
- Each variable should be assigned an integer index. Start at 1, as 0 is used to indicate the end of a clause. A positive integer represents a positive literal, whereas a negative integer represents a negative literal.
- Example
- -1 7 0 represents (¬x1 ∨ x7)
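A small writer and reader for this format might look as follows. This is a sketch: the function names are hypothetical, and skipping lines that start with "c" (comment lines) is an addition beyond what the slide describes.

```python
def to_dimacs(clauses, num_vars):
    """Render clauses (lists of signed integers) as DIMACS CNF text."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]
    for clause in clauses:
        # Each clause is terminated by 0.
        lines.append(" ".join(str(lit) for lit in clause) + " 0")
    return "\n".join(lines) + "\n"

def from_dimacs(text):
    """Parse DIMACS CNF text into (num_vars, clauses)."""
    num_vars, clauses, current = 0, [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("c"):  # assumption: 'c' marks a comment line
            continue
        if line.startswith("p"):
            num_vars = int(line.split()[2])
            continue
        for token in line.split():
            lit = int(token)
            if lit == 0:
                clauses.append(current)
                current = []
            else:
                current.append(lit)
    return num_vars, clauses

text = to_dimacs([[-1, 7]], 7)
print(text, end="")
print(from_dimacs(text))
```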
9Extended Latin Square 2x2
order 2
-1 -1
-1 -1
- p cnf 8 24
- -1 -2 0
- -3 -4 0
- -5 -6 0
- -7 -8 0
- -1 -5 0
- -2 -6 0
- -3 -7 0
- -4 -8 0
- -1 -3 0
- -2 -4 0
- -5 -7 0
- -6 -8 0
- 1 2 0
- 3 4 0
- 5 6 0
- 7 8 0
- 1 5 0
- 2 6 0
Cells hold variables 1/2, 3/4, 5/6, 7/8:
- 1: cell 11 is red
- 2: cell 11 is green
- 3: cell 12 is red
- 4: cell 12 is green
- 5: cell 21 is red
- 6: cell 21 is green
- 7: cell 22 is red
- 8: cell 22 is green
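The variable numbering above can be decoded arithmetically, and a candidate coloring can be checked against the clauses shown on the slide. A sketch, assuming the numbering is row-major over cells with red before green; `describe` and `satisfied` are illustrative helper names, and only the 18 clauses visible above are checked.

```python
COLORS = ["red", "green"]

def describe(v, n=2):
    """Decode variable v (1-based) into (row, column, color) with 1-based row/column."""
    idx = v - 1
    color = idx % n
    col = (idx // n) % n + 1
    row = idx // (n * n) + 1
    return row, col, COLORS[color]

# The clauses listed on the slide (without the terminating 0).
listed = [[-1, -2], [-3, -4], [-5, -6], [-7, -8],
          [-1, -5], [-2, -6], [-3, -7], [-4, -8],
          [-1, -3], [-2, -4], [-5, -7], [-6, -8],
          [1, 2], [3, 4], [5, 6], [7, 8], [1, 5], [2, 6]]

def satisfied(clause, true_vars):
    """A clause holds if some positive literal is true or some negated variable is false."""
    return any((lit > 0) == (abs(lit) in true_vars) for lit in clause)

# Candidate: red/green in row 1, green/red in row 2.
model = {1, 4, 6, 7}
print([describe(v) for v in sorted(model)])
print(all(satisfied(c, model) for c in listed))
```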
10Significant progress in Satisfiability Methods
Software and hardware verification: complete methods are critical, e.g., for verifying the correctness of chip design, using SAT encodings
Applications: hardware and software verification, planning, protocol design, etc.
Going from 50 variables and 200 constraints to 1,000,000 variables and 5,000,000 constraints in the last 10 years
Current methods can automatically verify the correctness of > 1/7 of a Pentium IV.
11Model Checking
12Turing Award
Source: Slashdot
13A real world example
14Bounded Model Checking instance
i.e., ((not x1) or x7) and ((not x1) or x6) and so on
15 10 pages later
(x177 or x169 or x161 or x153 or x17 or x9 or x1 or (not x185))
Clauses / constraints are getting more interesting
16 4000 pages later
!!! a 59-cnf clause
17Finally, 15,000 pages later
The MiniSAT solver solves this instance in less than one minute!
18Effective propositional inference
19Effective propositional inference
- Two families of algorithms for propositional inference (checking satisfiability) based on model checking (which are quite effective in practice):
- Complete backtracking search algorithms
- DPLL algorithm (Davis, Putnam, Logemann,
Loveland)
- Incomplete local search algorithms
- WalkSAT algorithm
20The DPLL algorithm
- Determine if an input propositional logic
sentence (in CNF) is satisfiable.
- Improvements over truth table enumeration
- Early termination
- A clause is true if any of its literals is true.
- A sentence is false if any clause is false.
- Pure symbol heuristic
- Pure symbol: always appears with the same "sign" in all clauses.
- e.g., in the three clauses (A ∨ ¬B), (¬B ∨ ¬C), (C ∨ A), A and B are pure, C is impure.
- Make a pure symbol literal true.
- Unit clause heuristic
- Unit clause: only one literal in the clause
- The only literal in a unit clause must be true.
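The three improvements above (early termination, pure symbols, unit clauses) fit into a short recursive procedure. This is a minimal sketch rather than the version used by production solvers; it assumes clauses are lists of signed integers, and the returned assignment may leave some variables unassigned when their value does not matter.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying (possibly partial) assignment {var: bool}, or None if unsatisfiable."""
    if assignment is None:
        assignment = {}
    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        remaining, satisfied = [], False
        for lit in clause:
            v = abs(lit)
            if v in assignment:
                if assignment[v] == (lit > 0):
                    satisfied = True  # a clause is true if any literal is true
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None  # early termination: some clause is false
        simplified.append(remaining)
    if not simplified:
        return assignment  # early termination: every clause is true
    # Unit clause heuristic: the only literal in a unit clause must be true.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            return dpll(simplified, {**assignment, abs(lit): lit > 0})
    # Pure symbol heuristic: a literal whose negation never occurs can be made true.
    literals = {lit for clause in simplified for lit in clause}
    for lit in literals:
        if -lit not in literals:
            return dpll(simplified, {**assignment, abs(lit): lit > 0})
    # Otherwise branch on a variable and backtrack on failure.
    v = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, v: value})
        if result is not None:
            return result
    return None

# (A or not B), (not B or not C), (C or A) with A=1, B=2, C=3
print(dpll([[1, -2], [-2, -3], [3, 1]]))
print(dpll([[1, 2], [1, -2], [-1, 2], [-1, -2]]))
```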
21The DPLL algorithm
22DPLL
- Basic algorithm for state-of-the-art SAT solvers
- Several enhancements:
- data structures
- clause learning
- randomization and restarts
Check http://www.satlive.org/
23Learning in Sat
24The WalkSAT algorithm
- Incomplete, local search algorithm
- Evaluation function: the min-conflict heuristic of minimizing the number of unsatisfied clauses
- Balance between greediness and randomness
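The balance between greediness and randomness can be made concrete as follows. This is a sketch in the spirit of WalkSAT, not a tuned implementation: the parameter names `p`, `max_flips`, and `seed` and their defaults are assumptions, and clauses are lists of signed integers over variables 1..n.

```python
import random

def walksat(clauses, n, p=0.5, max_flips=10000, seed=0):
    """Local search for a satisfying assignment; returns a dict or None if it gives up."""
    rng = random.Random(seed)
    assign = {v: rng.choice([True, False]) for v in range(1, n + 1)}

    def is_true(lit):
        return assign[abs(lit)] == (lit > 0)

    def num_unsat():
        return sum(1 for c in clauses if not any(is_true(l) for l in c))

    for _ in range(max_flips):
        unsat = [c for c in clauses if not any(is_true(l) for l in c)]
        if not unsat:
            return assign
        clause = rng.choice(unsat)  # pick an unsatisfied clause at random
        if rng.random() < p:
            v = abs(rng.choice(clause))  # random walk step
        else:
            # Greedy step: flip the variable in the clause that leaves
            # the fewest clauses unsatisfied (min-conflicts).
            best_v, best_score = None, None
            for lit in clause:
                cand = abs(lit)
                assign[cand] = not assign[cand]
                score = num_unsat()
                assign[cand] = not assign[cand]
                if best_score is None or score < best_score:
                    best_v, best_score = cand, score
            v = best_v
        assign[v] = not assign[v]
    return None  # incomplete: failure says nothing about unsatisfiability

clauses = [[-4, -2, 3], [2, -1, -3], [-3, -2, 5], [5, -4, 2], [2, 5, -3]]
print(walksat(clauses, 5))
```

Because the search is incomplete, a `None` result only means no model was found within `max_flips`, not that none exists.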
25The WalkSAT algorithm
26Lots of solvers and information about SAT, theory and practice
http://www.satlive.org/
27Computational Complexity of SAT
How does an algorithm scale?
Analyzable Realistic Spectrum of hardness
28SAT Complexity
- NP-Complete - worst-case complexity
- (2^n possible assignments)
- Average Case Complexity (I)
- Constant Probability Model (Goldberg 79; Goldberg et al. 82)
- N variables, L clauses
- p: fixed probability of a variable in a clause (literals: positive or negative with probability 0.5)
- (i.e., average clause length is pN)
- Eliminate empty and unit clauses
- Empirically, on average, SAT can be easily solved: O(n^2)
Key problem: easy distribution; random guesses find a solution in a constant number of tries (Franco 86; Franco and Ho 88)
29Hard satisfiability problems
- Consider random 3-CNF sentences, e.g.,
- (¬D ∨ ¬B ∨ C) ∧ (B ∨ ¬A ∨ ¬C) ∧ (¬C ∨ ¬B ∨ E) ∧ (E ∨ ¬D ∨ B) ∧ (B ∨ E ∨ ¬C)
- m = number of clauses
- n = number of symbols
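Random 3-CNF sentences like the one above are easy to generate. A sketch, assuming each clause draws 3 distinct symbols out of n and negates each with probability 0.5; the function name and the `seed` parameter are illustrative.

```python
import random

def random_k_cnf(n, m, k=3, seed=None):
    """Generate m random clauses over n symbols, each with k distinct symbols."""
    rng = random.Random(seed)
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), k)  # k distinct symbols
        # Negate each symbol with probability 0.5.
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

# m = 5 clauses over n = 5 symbols, the same shape as the example sentence
for clause in random_k_cnf(n=5, m=5, seed=1):
    print(clause)
```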
30SAT Complexity
- Average Case Complexity (II)
- Fixed-clause Length Model: Random K-SAT (Franco 86)
- N variables, L clauses, K literals per clause
- Randomly choose a set of K variables per clause (literals: positive or negative with probability 0.5)
- Expected time: O(2^n)
Can we provide a finer characterization beyond worst-case results?
31Typical-Case Complexity
- Typical-case complexity: a more detailed picture
- Characterization of the spectrum of hardness of instances as we vary certain interesting instance parameters
- e.g., for SAT: clause-to-variable ratio.
- Are some regimes easier than others?
- What about a majority of the instances?
32Typical Case Analysis: 3-SAT (all clauses have 3 literals)
Median runtime (Selman et al. 92, 96)
33Hard problems seem to cluster near m/n = 4.3 (critical point)
34Intuition
- At low ratios
- few clauses (constraints)
- many assignments
- easily found
- At high ratios
- many clauses
- inconsistencies easily detected
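The intuition above can be probed with a small experiment that estimates how often random 3-CNF instances are satisfiable at different clause-to-variable ratios. This is a sketch under strong simplifications: it uses brute-force enumeration on only 10 variables, so the change from mostly satisfiable to mostly unsatisfiable is far more gradual than on the large instances behind the published curves. All function names and parameters are illustrative.

```python
import random
from itertools import product

def random_3cnf(n, m, rng):
    """m random clauses over n symbols, 3 distinct symbols each, random signs."""
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), 3)
        clauses.append([v if rng.random() < 0.5 else -v for v in variables])
    return clauses

def is_satisfiable(clauses, n):
    """Brute-force check over all 2^n assignments (only viable for small n)."""
    for values in product([False, True], repeat=n):
        if all(any(values[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def fraction_satisfiable(n, ratio, trials, seed=0):
    """Fraction of random instances with m = ratio * n clauses that are satisfiable."""
    rng = random.Random(seed)
    m = round(ratio * n)
    hits = sum(is_satisfiable(random_3cnf(n, m, rng), n) for _ in range(trials))
    return hits / trials

for ratio in (2.0, 4.3, 6.0):
    print(ratio, fraction_satisfiable(10, ratio, trials=20))
```

At low ratios almost every instance has many models, and at high ratios almost none has any, which matches the two easy regimes listed above.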