Title: Satisfiability modulo the Theory of Bit Vectors
1Satisfiability modulo the Theory of Bit Vectors
- Alessandro Cimatti
- IRST, Trento, Italy
- cimatti_at_irst.itc.it
Joint work with R. Bruttomesso, A. Franzen, A.
Griggio, R. Sebastiani
We gratefully acknowledge support from the
Academic Research Program of Intel
2Index of the talk
- Satisfiability Modulo Theory
- The theory of Bit Vectors
- Satisfiability Modulo BV
- Bit blasting
- Eager encoding into Linear Integer Arithmetic
- A lazy approach
- Conclusions
- ( A preview of QF_UFBV32 at SMT-COMP )
3SMT in a nutshell
- Satisfiability Modulo Theory
- or, beyond boolean SAT
- Decide the satisfiability of a first-order formula with respect to a background theory
- Examples of relevant theories
- uninterpreted functions: x = y & f(x) != f(y)
- difference logic: x - y < 7
- linear arithmetic: 3x + 2y < 12
- arrays: read(write(M, a0, v0), a1)
- their combinations
- bit vectors
4Why SMT
- From SAT-based to SMT-based verification
- Representation of interesting problems
- timed automata
- hybrid automata
- pipelines
- software
- Efficient solving
- leverage availability of structural information
- hopefully retaining efficiency of boolean SAT
5Satisfiability Modulo Theory
- Satisfiability
- is there a truth-assignment to boolean variables
- and a valuation to individual variables
- such that formula evaluates to true?
- Standard semantics for FOL
- Assignment to individual variables
- Induces truth values to atoms
- Truth assignment to boolean atoms
- Induced value to whole formula
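6Propositional structure
[Figure: the propositional structure of a formula over boolean atoms (P) and theory atoms (TA); the theory atoms constrain the individual variables x, y, z, w.]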
7Two Main Approaches to SMT
- the eager approach
- the lazy approach
- theory independent view
- theory specific view
8Eager Approach to SMT
- Main idea: compilation to SAT
- STEP 1: theory part compiled to an equisatisfiable pure SAT problem
- STEP 2: run a propositional SAT solver
10Lifted theory
[Figure: propositional structure over boolean atoms (P), with the lifted theory atoms (TA) alongside.]
11The Lazy approach
- Ingredients
- a boolean SAT solver
- a theory solver
- The boolean solver is modified to enumerate boolean (partial) models
- The theory solver is used to check for theory consistency
12Propositional structure
[Figure: propositional structure over boolean atoms (P) and theory atoms (TA); the theory atoms constrain the individual variables x, y, z, w.]
13MathSAT intuitions
- Two ingredients: boolean search and theory reasoning
- find boolean model
- theory atoms treated as boolean atoms
- truth values to boolean and theory atoms
- model propositionally satisfies the formula
- check consistency wrt theory
- set of constraints induced by truth values to theory atoms
- existence of values to theory variables
- Example: (P v (x = 3)) & (Q v (x - y < 1)) & (y < 2) & (P xor Q)
- Boolean model
- !P, (x = 3), Q, (x - y < 1), (y < 2)
- Check (x = 3), (x - y < 1), (y < 2)
- Theory contradiction!
- Another boolean model
- P, !(x = 3), !Q, (x - y < 1), (y < 2)
- Check !(x = 3), (x - y < 1), (y < 2)
- Consistent, e.g. x = 0, y = 0
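The enumerate-then-check loop on this example can be sketched in a few lines of Python. This is only an illustration, not MathSAT's implementation: the names `ATOMS`, `formula`, `theory_check` and `lazy_smt` are hypothetical, boolean models are enumerated by brute force rather than by DPLL, and the "theory solver" is a bounded search over small integers, which is an assumption made to keep the sketch self-contained.

```python
from itertools import product

# Theory atoms of the example, as predicates over integer x, y.
ATOMS = {
    "A1": lambda x, y: x == 3,      # (x = 3)
    "A2": lambda x, y: x - y < 1,   # (x - y < 1)
    "A3": lambda x, y: y < 2,       # (y < 2)
}

def formula(P, Q, A1, A2, A3):
    """Boolean abstraction: (P v A1) & (Q v A2) & A3 & (P xor Q)."""
    return (P or A1) and (Q or A2) and A3 and (P != Q)

def theory_check(values, bound=5):
    """Toy theory solver (assumption): bounded search for integers x, y
    whose atom values match the requested truth values."""
    for x, y in product(range(-bound, bound + 1), repeat=2):
        if all(ATOMS[name](x, y) == value for name, value in values.items()):
            return (x, y)
    return None

def lazy_smt():
    """Enumerate boolean models; return the first theory-consistent one."""
    for P, Q, A1, A2, A3 in product([False, True], repeat=5):
        if not formula(P, Q, A1, A2, A3):
            continue  # not a boolean model
        witness = theory_check({"A1": A1, "A2": A2, "A3": A3})
        if witness is not None:
            return {"P": P, "Q": Q, "A1": A1, "A2": A2, "A3": A3}, witness
    return None
```

Calling `theory_check` with all three atoms true finds no witness, matching the "Theory contradiction!" step, while negating the first atom yields one.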
14Boolean SAT search space
[Figure: search tree branching on the variables P, Q, R, S, T; the leaves are marked "?" except one marked "T SAT!".]
- The DPLL procedure
- Incremental construction of satisfying assignment
- Backtrack/backjump on conflict
- Learn reason for conflict
- Splitting heuristics
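A minimal sketch of the core of DPLL, assuming clauses are given as lists of non-zero integers (DIMACS-style literals). It covers only incremental assignment, unit propagation, splitting and chronological backtracking; conflict learning, backjumping and splitting heuristics are deliberately left out, and the function name `dpll` is just a label for this illustration.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying assignment {var: bool}, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    # Simplify every clause under the current partial assignment.
    simplified = []
    for clause in clauses:
        remaining, satisfied = [], False
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == wanted:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None            # conflict: clause falsified, backtrack
        simplified.append(remaining)
    if not simplified:
        return assignment          # every clause is satisfied
    # Unit propagation: a one-literal clause forces its value.
    for clause in simplified:
        if len(clause) == 1:
            lit = clause[0]
            assignment[abs(lit)] = lit > 0
            return dpll(simplified, assignment)
    # Splitting: pick an unassigned variable and try both values.
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

Variables that never needed a value are simply absent from the returned dictionary.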
15MathSAT approach
- DPLL-based enumeration of boolean models
- Retain all propositional optimizations
- Conflict-directed backjumping, learning
- No overhead if no theory reasoning
- Tight integration between
- boolean reasoning and
- theory reasoning
16MathSAT search space
[Figure: search tree branching on the variables P, Q, R, S, T; the leaves are marked "Bool ?", "Bool T Math ?" or "Bool T Math T SAT!".]
- Many boolean models are not theory consistent!
17Early pruning
- Check theory consistency of partial assignments
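[Figure: search tree over P, Q, S, T, R with an early-pruning theory check ("EP Math") after each decision; one subtree is labelled "Pruned away in the EP step"; the leaves are marked "Bool ?" and "Bool T Math T SAT!".]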
18THEORY OF FIXED-WIDTH BIT VECTORS
19Bit Vectors Example
input a, b, c, d regN
- LTmp0 = a
- LTmp1 = 2 * b
- LTmp2 = LTmp0 + LTmp1
- LTmp3 = 4 * c
- LTmp4 = LTmp2 + LTmp3
- LTmp5 = 8 * d
- LOut = LTmp4 + LTmp5
- ((a + 2b) + 4c) + 8d
- RTmp0 = d
- RTmp1 = RTmp0 << 1
- RTmp2 = c + RTmp1
- RTmp3 = RTmp2 << 1
- RTmp4 = b + RTmp3
- RTmp5 = RTmp4 << 1
- ROut = a + RTmp5
- a + ((b + ((c + (d << 1)) << 1)) << 1)
- Are they equivalent?
I.e. LOut = ROut ?
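For a small width the question can be answered by brute force. The sketch below assumes unsigned N-bit registers with wrap-around arithmetic and checks all input combinations; the function names are hypothetical, and the exhaustive loop only works for small N, which is exactly why a decision procedure is needed for realistic widths.

```python
from itertools import product

def left(a, b, c, d, n):
    """((a + 2b) + 4c) + 8d on n-bit registers."""
    mask = (1 << n) - 1
    return (((a + 2 * b) + 4 * c) + 8 * d) & mask

def right(a, b, c, d, n):
    """a + ((b + ((c + (d << 1)) << 1)) << 1) on n-bit registers."""
    mask = (1 << n) - 1
    t = (d << 1) & mask
    t = ((c + t) << 1) & mask
    t = ((b + t) << 1) & mask
    return (a + t) & mask

def equivalent(n):
    """Exhaustively compare both data paths over all 2**(4n) inputs."""
    return all(
        left(a, b, c, d, n) == right(a, b, c, d, n)
        for a, b, c, d in product(range(1 << n), repeat=4)
    )
```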
20Fixed Width Bit Vectors
- Constants
- 0b00001111, 0xFFFF, ...
- Variables
- valued over bit vectors of corresponding width
- implicit restriction to finite domain
- Function symbols
- selection: x[15:0]
- concatenation: y :: z
- bitwise operators: x & y, z | w, ...
- arithmetic operators: x + y, z - w, ...
- shifting: x << 2, y >> 3
- Predicate symbols
- comparators: =, !=, >, <, >=, <=
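The function symbols above can be modelled on unsigned Python integers, with the width passed explicitly wherever it matters. This is a sketch of the intended semantics under that assumption; the helper names (`select`, `concat`, `bvadd`, `shl`) are made up for the illustration and are not taken from any solver.

```python
def select(x, hi, lo):
    """x[hi:lo]: bits hi down to lo (inclusive) of x."""
    return (x >> lo) & ((1 << (hi - lo + 1)) - 1)

def concat(y, z, z_width):
    """y :: z, where z occupies the low z_width bits of the result."""
    return (y << z_width) | z

def bvadd(x, y, width):
    """Addition modulo 2**width."""
    return (x + y) & ((1 << width) - 1)

def shl(x, k, width):
    """Left shift by k; bits shifted past the width are dropped."""
    return (x << k) & ((1 << width) - 1)
```

For instance, `bvadd(0xFF, 1, 8)` wraps around to 0, which is the behaviour that later makes the integer encoding non-trivial.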
21Fragments of BV theory
- Core
- selection
- concatenation
- Bitwise operators
- x & y, x | y, x xor y
- Arithmetic operators
- x + y, x - y, c * x
- Core + Bitwise + Arithmetic
- Complexity of equality between BV terms
- Core is in P
- Core + B + A is in NP
- Variable width bit vectors: not covered here
- core is in NP
- small additions yield undecidability
22Decision procedures for BV
- Many approaches
- Cyrluk, Moeller, Ruess
- Moeller, Ruess
- Bjørner, Pichora
- Barrett, Dill, Levitt
- Focus on deciding conjunctions of literals
- Emphasis on proof obligations in ITP
- some emphasis on variable width, generic wrt N
- Shostak-style integration
- canonization
- solving
23SATISFIABILITY MODULO THEORY OF BIT VECTORS
24Satisfiability modulo Bit Vectors
- Applications of interest
- RTL hardware descriptions essentially bit vectors
- assembly-level programs
- software with finite precision arithmetic
- Key feature
- combination of control flow and data flow
- In principle, boolean logic can be encoded into BV
- control (boolean logic) encoded into width 1 BVs.
- Likely inefficient in comparison to SAT
- More natural to keep them separate at modeling
- structural info can be exploited for verification
25Approaches to SMT(BV)
- Bit blasting
- Eager Encoding into LA
- Lazy approach
26SMT(BV) via Bit Blasting
27SMT(BV) via Bit Blasting
- Boolean variables untouched
- Bit vector variables as collections of (unrelated) boolean variables
- x0, x1, ..., x63
- Selection/concatenations are trivial
- static detection
- Equalities / Assignments: x = y
- (x0 <-> y0) & (x1 <-> y1) & ... & (x63 <-> y63)
- Bitwise operators: x & y
- x0 & y0, x1 & y1, ..., x63 & y63
- Arithmetic operators: x + y
- BVADD(x0, ..., x63, y0, ..., y63)
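To make the bit-level view concrete, the sketch below treats a bit vector as a list of booleans (least significant bit first) and builds equality, bitwise AND and addition from per-bit boolean operations only. It evaluates the circuits on concrete bits instead of emitting a SAT formula, and `bv_add` is a plain ripple-carry adder chosen for illustration; the slides do not say which adder BVADD denotes.

```python
def to_bits(x, width):
    """Unsigned integer to a list of booleans, least significant bit first."""
    return [bool((x >> i) & 1) for i in range(width)]

def from_bits(bits):
    """Inverse of to_bits."""
    return sum(1 << i for i, bit in enumerate(bits) if bit)

def bv_eq(xs, ys):
    """(x0 <-> y0) & (x1 <-> y1) & ..."""
    return all(x == y for x, y in zip(xs, ys))

def bv_and(xs, ys):
    """x0 & y0, x1 & y1, ..."""
    return [x and y for x, y in zip(xs, ys)]

def bv_add(xs, ys):
    """Ripple-carry adder; the final carry is dropped (modulo 2**width)."""
    carry, out = False, []
    for x, y in zip(xs, ys):
        out.append(x ^ y ^ carry)                 # sum bit
        carry = (x and y) or (carry and (x ^ y))  # carry into the next bit
    return out
```

The `carry` values are the "additional vars" a SAT encoding would have to introduce, one per bit position.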
28Comparison of Data Paths
input a, b, c, d regN
- LTmp0 = a
- LTmp1 = 2 * b
- LTmp2 = LTmp0 + LTmp1
- LTmp3 = 4 * c
- LTmp4 = LTmp2 + LTmp3
- LTmp5 = 8 * d
- LOut = LTmp4 + LTmp5
- ((a + 2b) + 4c) + 8d
- RTmp0 = d
- RTmp1 = RTmp0 << 1
- RTmp2 = c + RTmp1
- RTmp3 = RTmp2 << 1
- RTmp4 = b + RTmp3
- RTmp5 = RTmp4 << 1
- ROut = a + RTmp5
- a + ((b + ((c + (d << 1)) << 1)) << 1)
- Are they equivalent?
I.e. LOut = ROut ?
29Bit Blasting Words
- a, b, c, d
- blasted to a1,...,aN, b1,...,bN, c1,...,cN, d1,...,dN
- LTmp6 != RTmp6
- (LOut.1 != ROut.1) or ... or (LOut.N != ROut.N)
- LTmp1 = 2 * b
- formula in 2N vars, conjunction of N iffs
- LTmp2 = LTmp0 + LTmp1
- formula relating 3N vars
- possibly additional vars required (e.g. carries)
- N = 16 bits?
- 13 secs
- N = 32 bits?
- 170 secs
- But obviously N = 64 bits!
- stopped after 2h CPU time
Scalability with respect to N???
30Bit-Blasting Pros and Cons
- Bottlenecks
- dependency on word width
- wrong level of abstraction
- boolean synthesis of arithmetic circuits
- assignments are pervasive
- conflicts are very fine grained
- e.g. discover x < y
- Advantages
- let the SAT solver do all the work
- and nowadays SAT solvers are tough nuts to crack
- amalgamation of the decision process
- no distinction between control and data
- conflicts can be as fine grained as possible
- built-in capability to generate new atoms
31Enhancements to BitBlasting
- Tuning SAT solver on structural information
- e.g. splitting heuristic for adders
- Preprocessing + SAT [GBD05]
- rewrite and normalize bit vector terms
- bit blasting to SAT
32SMT(BV) via reduction to SMT(LA)
33From BV to LIA
- RTL-Datapath Verification using Integer Linear Programming [BD01]
- BV constants as integers
- 0b32_1111 as 15
- BV variables as integer-valued variables, with range constraints
- reg x[31:0] as x in range [0, 2^32)
- Assignments treated as equality, e.g. x = y
- Arithmetic, e.g. z = x + y
- Linear arithmetic? Not quite! BV arithmetic is modulo 2^N
- z = x + y - 2^N s, with z in [0, 2^N)
- Concatenation x :: y as 2^n x + y (n the width of y)
- Selection: relational encoding (based on integrity)
- x[23:16] as xm, where
- x = 2^24 xh + 2^16 xm + xl, with xl in [0, 2^16), xm in [0, 2^8), xh in [0, 2^8)
- Bitwise operators
- based on selection of individual bits
- SOLVER
- the omega test
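The integer encodings listed above can be checked numerically. The sketch below computes, for concrete values, the auxiliary quantities that the linear constraints leave to the solver (the overflow multiplier s and the slices xh, xm, xl). The function names are hypothetical, and the code only illustrates the arithmetic identities; it is not the encoding procedure of [BD01].

```python
def add_mod(x, y, n):
    """Return (z, s) with z = x + y - 2**n * s and z in [0, 2**n)."""
    total = x + y
    s = total >> n              # 0 or 1 when x and y are in [0, 2**n)
    z = total - (1 << n) * s
    return z, s

def concat_as_int(x, y, n):
    """x :: y as 2**n * x + y, where y is n bits wide."""
    return (1 << n) * x + y

def split_32(x):
    """Slices of a 32-bit x with x = 2**24 * xh + 2**16 * xm + xl."""
    xl = x % (1 << 16)          # x[15:0]
    xm = (x >> 16) % (1 << 8)   # x[23:16]
    xh = x >> 24                # x[31:24]
    return xh, xm, xl
```

With 8-bit operands, `add_mod(200, 100, 8)` gives z = 44 with s = 1, the wrap-around case that plain linear arithmetic cannot express without the extra variable.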
34From SMT(BV) into SMT(LIA)
- Generalizes BD01 to deal with boolean structure
- Eager encoding into SMT(LIA)
- Unfortunately, not very efficient
- More precisely, a failure
35Retrospective Analysis
- Crazy approach?
- Arithmetic
- Linear arithmetic? Not quite! BV arithmetic is modulo 2^N
- Selection and Concatenation
- an easy problem becomes expensive!
- Bitwise operators
- HARD!!!
- Available solvers not adequate
- integers with infinite precision
- reasoning with integers may be hard (e.g. BnB within real relaxation)
- Functional dependencies are lost!
- A clear culprit: static encoding
- depending on control flow, same signal is split in different parts
- z = if P then x[7:0] :: y[3:0] else x[5:2] :: y[10:3]
- x, y and also z are split more than needed
- the notion of maximal chunk depends on P !!!
36SMT(BV) via online BV reasoning
37A lazy approach
- Based on standard MathSAT schema
- DPLL-based model enumeration
- Dedicated solver for bit vectors
- The encoding leverages information resulting from decisions
- Given values to control variables, the data path is easier to deal with (e.g. maximal chunks are bigger)
- Layering in the theory solver
- equality reasoning
- limited simplification rules
- full-blown bit vector solver only at the end
38The architecture
Boolean enumeration
BV solver
EUF reasoning
LIA encoding
BV rewriter
39Rewriting rules
- evaluation of constant terms
- 0b8_01010101[4:2] becomes 0b3_101
- rules for equality
- x = y and Phi(x) becomes Phi(y)
- based on congruence closure
- splitting concatenations
- (x :: y) = z becomes x = zh_n & y = zl_n
40Rewriting rules
- pushing selections
- (x & y)[7:0] becomes (x[7:0] & y[7:0])
- (x :: y)[23:8] becomes (x[7:0] :: y[15:8])
- pigeon-hole rules
- from (x != 0 & x != 1 & x != 2 & x < 3) derive false
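The rewriting rules can be sanity-checked on concrete values. The sketch below assumes unsigned bit vectors, a 16-bit y in the concatenation rule, and unsigned comparison in the pigeon-hole rule; the helper names are hypothetical, and the checks test the soundness of each rule on samples rather than implementing the rewriter itself.

```python
def select(x, hi, lo):
    """x[hi:lo]: bits hi down to lo (inclusive) of the unsigned integer x."""
    return (x >> lo) & ((1 << (hi - lo + 1)) - 1)

def concat(x, y, y_width):
    """x :: y, where y is y_width bits wide."""
    return (x << y_width) | y

def push_through_and(x, y):
    """(x & y)[7:0] agrees with x[7:0] & y[7:0]."""
    return select(x & y, 7, 0) == (select(x, 7, 0) & select(y, 7, 0))

def push_through_concat(x, y):
    """(x :: y)[23:8] agrees with x[7:0] :: y[15:8], for a 16-bit y."""
    lhs = select(concat(x, y, 16), 23, 8)
    rhs = concat(select(x, 7, 0), select(y, 15, 8), 8)
    return lhs == rhs

def pigeonhole_is_unsat(width=8):
    """No unsigned x satisfies x != 0, x != 1, x != 2 and x < 3."""
    return not any(x not in (0, 1, 2) and x < 3 for x in range(1 << width))
```

The constant-evaluation rule is the same `select` applied to a literal: `select(0b01010101, 4, 2)` yields `0b101`.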
41BV rewriter
- Rules are applied until
- fix point reached
- contradiction found
- Implementation based on EUF reasoner
- rules as merges between eq classes
- Open issues
- incrementality/backtrackability
- selective rule activation
- conflict set reconstruction
- When it fails
42LIA encoding (the last hope)
- LIA encoding
- identification of maximal slices
- purification: separating out arithmetic and BW by introduction of additional variables
- NB: on resulting problems
- LIA encoding always superior to bit blasting!!!
- cfr [BD01]
43Status of Implementation
- Implementation still in prototypical state
- Does a lot of stupid things
- conflict minimization by deletion filtering
- checking that conflicts are in fact minimal
- unnecessary calls to LA for SAT clusters
- calling LA solver implemented as dump on file, and run external MathSAT
- huge conflict sets
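Deletion filtering, mentioned above as the way conflicts are minimized, can be sketched as follows: drop one constraint at a time and keep it out whenever the remainder is still inconsistent. The consistency check here is a toy bounded search over integers (an assumption standing in for the real theory solver), and all names are hypothetical.

```python
from itertools import product

def is_consistent(constraints, bound=5):
    """Toy check (assumption): bounded search for integers x, y."""
    return any(
        all(pred(x, y) for _, pred in constraints)
        for x, y in product(range(-bound, bound + 1), repeat=2)
    )

def deletion_filter(conflict, consistent=is_consistent):
    """Shrink an inconsistent constraint list to a minimal inconsistent one."""
    core = list(conflict)
    i = 0
    while i < len(core):
        trial = core[:i] + core[i + 1:]
        if consistent(trial):
            i += 1          # constraint i is needed for the conflict
        else:
            core = trial    # still inconsistent without it: drop it
    return core

# Labelled constraints; the last one plays no part in the contradiction.
CONFLICT = [
    ("x = 3", lambda x, y: x == 3),
    ("x - y < 1", lambda x, y: x - y < 1),
    ("y < 2", lambda x, y: y < 2),
    ("x >= 0", lambda x, y: x >= 0),
]
```

Each pass costs one solver call per constraint, which is one reason the slide lists it among the expensive things the prototype does.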
44A very very preliminary evaluation
45Competitors
- Run against MiniSAT 1.14
- winner of SAT competition in 2005
- KEY REMARK
- boolean methods are very mature
- A good reason for giving up?
46Test benches
- 74 benchmarks from industrial partner
- would have been ideal for SMT-COMP
- QF_UFBV32
- Unfortunately
- can not be disclosed
- will have to be destroyed after the collaboration
- hopefully our lives will be spared ?
49Conclusions
- A market need for SMT(BV) solvers
- Bit Blasting tough competitors
- After a failure,
- Preliminary results are encouraging
- Future challenges
- optimize BV solver
- better conflict sets
- tackle some RTL verification cases
- extension to memories
50A small digression on QF_UFBV32 at SMT-COMP
51QF_UFBV32 at SMT-COMP
- the MathSAT you will see there IS NOT the one I described
- We currently have no results for QF_UFBV
- Easy benchmarks
- QF_UFBV32 not particularly SMT
- the boolean component is nearly missing
- the BV part is easily solvable by bit blasting
- We entered SMT-COMP QF_UFBV32
- MathSAT based on BIT BLASTING to SAT
- NuSMV based on bit blasting to BDDs
52QF_UFBV Bit Blasting to SAT
- Preprocessing based on
- Ackermann's elimination of function symbols
- rewriting simplification
- bit blasting
- Core: call SAT solver underlying MathSAT
- every SAT problem in < 0.3 secs
- most UNSAT within seconds
- a handful of hard ones between 300 and 500 secs
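The first preprocessing step, Ackermann's elimination, replaces each application of an uninterpreted function by a fresh variable and adds constraints forcing equal arguments to give equal results. A minimal sketch for a unary function applied to distinct argument terms, with terms and constraints kept as plain strings; the function name and the output format are assumptions of this illustration, not MathSAT's.

```python
from itertools import combinations

def ackermannize(fname, args):
    """Eliminate the applications fname(arg) for each arg in args.

    Returns a map from argument term to fresh variable name, plus the
    functional-consistency constraints, one per pair of arguments.
    """
    fresh = {arg: f"{fname}_{i}" for i, arg in enumerate(args)}
    constraints = [
        f"({a} = {b}) -> ({fresh[a]} = {fresh[b]})"
        for a, b in combinations(args, 2)
    ]
    return fresh, constraints
```

The number of added constraints grows quadratically with the number of applications of each function symbol.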
53BDDs (???) on SMT-COMP tests
- Even NuSMV entered SMT-COMP
- Ackermann's elimination of function symbols
- Rewriting preprocessor
- Core solver
- based on BDDs
- conjunctively partitioned problem
- structural BDD-based ordering (bit interleaving)
- (almost) no dynamic reordering
- affinity-based clustering, threshold 100 nodes
- early quantification
- Seems to work well both on SAT and UNSAT instances
54RESULTS
- first STP
- then YICES
- then NuSMV
- then CVC3 (but no results on two samples)
- then MathSAT BITBLASTING
- 3rd on SAT
- last on UNSAT
55SAT instances
56UNSAT instances