Title: Software Model Checking: predicate abstraction
1 Software Model Checking: predicate abstraction
- Thomas Ball
- Testing, Verification and Measurement
- Microsoft Research
2 Today
- Formalizing SLAM
- Predicate abstraction of programs with procedures and pointers
- Symbolic model checking of boolean programs
3 Reachability
[Figure: state space showing the init states and the unsafe states]
4 Safe Forward Invariants
- Ψ is a safe forward invariant if
  - init ⊆ Ψ
  - F(Ψ) ⊆ Ψ
  - Ψ ⊆ safe
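
The three conditions can be checked directly when the state space is finite. The following is a minimal sketch, assuming states are small integers and the transfer function F is the image of a set under a successor function; the counter system and all names (`step`, `init`, `safe`) are hypothetical and chosen only for illustration.

```python
def is_safe_forward_invariant(inv, init, step, safe):
    """Check init ⊆ inv, F(inv) ⊆ inv and inv ⊆ safe over finite sets.

    F(inv) is taken to be the set of successors of states in inv.
    """
    post = {t for s in inv for t in step(s)}
    return init <= inv and post <= inv and inv <= safe


# Hypothetical system: a counter cycling 0 -> 1 -> 2 -> 0; state 3 is unsafe.
def step(s):
    return {(s + 1) % 3} if s < 3 else {3}


init = {0}
safe = {0, 1, 2}

print(is_safe_forward_invariant({0, 1, 2}, init, step, safe))  # closed under step and safe
print(is_safe_forward_invariant({0, 1}, init, step, safe))     # not closed: 1 steps to 2
```

A candidate that is closed under the transition relation but includes an unsafe state fails the third condition, which is why all three checks are needed.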
5 Abstraction = Overapproximation of Behavior
[Figure: Concrete State Space mapped to Abstract State Space; regions labeled x<10, x==1 or x==2, and x<5]
6 Abstraction = Overapproximation of Behavior
[Figure: state space showing init, the invariant Ψ, its image F(Ψ), and the unsafe states]
7 Counterexample-driven Refinement
  E := {}
  loop
    F# := predAbs(F, E)
    if unsafe ∩ lfp(F#, init) = ∅ then return SUCCESS
    else
      find min k s.t. unsafe ∩ F#^k(init) ≠ ∅
      if unsafe ∩ F^k(init) ≠ ∅ then return FAILURE
      else
        find E' s.t. unsafe ∩ G^k(init) = ∅ where G = predAbs(F, E')
        E := E ∪ E'
  forever
8 C-
  Types        τ ::= void | bool | int | ref τ
  Expressions  e ::= c | x | e1 op e2 | &x | *x
  LExpression  l ::= x | *x
  Declaration  d ::= τ x1,x2,...,xn
  Statements   s ::= skip | goto L1,L2,...,Ln | L: s
                   | assume(e)
                   | l = e | l = f(e1,e2,...,en)
                   | return x
                   | s1; s2; ...; sn
  Procedures   p ::= τ f(x1: τ1, x2: τ2, ..., xn: τn)
  Program      g ::= d1 d2 ... dn p1 p2 ... pn
9 C--
  Types        τ ::= void | bool | int
  Expressions  e ::= c | x | e1 op e2
  LExpression  l ::= x
  Declaration  d ::= τ x1,x2,...,xn
  Statements   s ::= skip | goto L1,L2,...,Ln | L: s
                   | assume(e)
                   | l = e | f(e1,e2,...,en) | return
                   | s1; s2; ...; sn
  Procedures   p ::= f(x1: τ1, x2: τ2, ..., xn: τn)
  Program      g ::= d1 d2 ... dn p1 p2 ... pn
10 BP
  Types        τ ::= void | bool
  Expressions  e ::= c | x | e1 op e2
  LExpression  l ::= x
  Declaration  d ::= τ x1,x2,...,xn
  Statements   s ::= skip | goto L1,L2,...,Ln | L: s
                   | assume(e)
                   | l = e | f(e1,e2,...,en) | return
                   | s1; s2; ...; sn
  Procedures   p ::= f(x1: τ1, x2: τ2, ..., xn: τn)
  Program      g ::= d1 d2 ... dn p1 p2 ... pn
11 What is Hard?
- Abstracting
  - from a language with pointers (C)
  - to one without pointers (boolean programs)
- All side effects need to be modeled by copying (as in dataflow)
12 What stayed fixed?
- Boolean program model
- Basic tool flow
- Repercussions
  - model side-effects by value-result
  - finite depth precision on the heap is all boolean programs can do
13 Syntactic sugar
The statement
  if (e) S1; else S2; S3;
is sugar for
  goto L1, L2;
  L1: assume(e); S1; goto L3;
  L2: assume(!e); S2; goto L3;
  L3: S3;
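
The desugaring is mechanical, so it can be written as a small text transformation. This is only an illustrative sketch: statements are treated as opaque strings, and the function name `desugar_if` and the default labels are assumptions made here, not part of any SLAM tool.

```python
def desugar_if(cond, s1, s2, s3, labels=("L1", "L2", "L3")):
    """Rewrite `if (cond) s1 else s2; s3` using goto and assume.

    The nondeterministic goto picks a branch; the assume at each target
    blocks the branch whose guard does not hold.
    """
    l1, l2, l3 = labels
    return [
        f"goto {l1}, {l2};",
        f"{l1}: assume({cond}); {s1} goto {l3};",
        f"{l2}: assume(!({cond})); {s2} goto {l3};",
        f"{l3}: {s3}",
    ]


for line in desugar_if("a == b", "g = 0;", "g = 1;", "skip;"):
    print(line)
```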
14 Example, in C
  int g;

  void cmp(int a, int b) {
    if (a == b)
      g = 0;
    else
      g = 1;
  }

  main(int x, int y) {
    cmp(x, y);
    if (!g) {
      if (x != y)
        assert(0);
    }
  }
15 Example, in C--
  int g;

  void cmp(int a, int b) {
    goto L1, L2;
    L1: assume(a == b); g = 0; return;
    L2: assume(a != b); g = 1; return;
  }

  main(int x, int y) {
    cmp(x, y);
    assume(!g); assume(x != y); assert(0);
  }
16 c2bp: Predicate Abstraction for C Programs
- Given
  - P: a C program
  - F = {e1,...,en}
    - each ei a pure boolean expression
    - each ei represents the set of states for which ei is true
- Produce a boolean program B(P,F)
  - same control-flow structure as P
  - boolean vars {b1,...,bn} to match {e1,...,en}
  - properties true of B(P,F) are true of P
17 Assumptions
- Given
  - P: a C program
  - F = {e1,...,en}
    - each ei a pure boolean expression
    - each ei represents the set of states for which ei is true
- Assume each ei uses either
  - only globals (global predicate)
  - local variables from some procedure (local predicate for that procedure)
- Mixed predicates
  - predicates using both local variables and global variables
  - complicate return processing
  - covered in advanced topics
18 C2bp Algorithm
- Performs modular abstraction
  - abstracts each procedure in isolation
- Within each procedure, abstracts each statement in isolation
  - no control-flow analysis
  - no need for loop invariants
19
C-- program:
  int g;
  void cmp(int a, int b) {
    goto L1, L2;
    L1: assume(a == b); g = 0; return;
    L2: assume(a != b); g = 1; return;
  }
  main(int x, int y) {
    cmp(x, y);
    assume(!g); assume(x != y); assert(0);
  }

Preds: {x==y}, {g==0}, {a==b}
20
C-- program:
  int g;
  void cmp(int a, int b) {
    goto L1, L2;
    L1: assume(a == b); g = 0; return;
    L2: assume(a != b); g = 1; return;
  }
  main(int x, int y) {
    cmp(x, y);
    assume(!g); assume(x != y); assert(0);
  }

Boolean program (declarations and signatures only):
  decl {g==0};
  void cmp({a==b})
  main({x==y})

Preds: {x==y}, {g==0}, {a==b}
21
C-- program:
  int g;
  void cmp(int a, int b) {
    goto L1, L2;
    L1: assume(a == b); g = 0; return;
    L2: assume(a != b); g = 1; return;
  }
  main(int x, int y) {
    cmp(x, y);
    assume(!g); assume(x != y); assert(0);
  }

Boolean program:
  decl {g==0};
  void cmp({a==b}) {
    goto L1, L2;
    L1: assume({a==b}); {g==0} := T; return;
    L2: assume(!{a==b}); {g==0} := F; return;
  }
  main({x==y}) {
    cmp({x==y});
    assume({g==0}); assume(!{x==y}); assert(0);
  }

Preds: {x==y}, {g==0}, {a==b}
22 C--
  Types        τ ::= void | bool | int
  Expressions  e ::= c | x | e1 op e2
  LExpression  l ::= x
  Declaration  d ::= τ x1,x2,...,xn
  Statements   s ::= skip | goto L1,L2,...,Ln | L: s
                   | assume(e)
                   | l = e | f(e1,e2,...,en)
                   | return
                   | s1; s2; ...; sn
  Procedures   p ::= f(x1: τ1, x2: τ2, ..., xn: τn)
  Program      g ::= d1 d2 ... dn p1 p2 ... pn
23 Abstracting Assigns via WP
- Statement y = y+1 and F = { y<4, y<5 }
  - {y<4}, {y<5} = ((!{y<5} || !{y<4}) ? F : *), {y<4}
- WP(x = e, Q) = Q[x -> e]
- WP(y = y+1, y<5)
  = (y<5)[y -> y+1]
  = (y+1 < 5)
  = (y < 4)
24 WP Problem
- WP(s, ei) not always expressible via {e1,...,en}
- Example
  - F = { x==0, x==1, x<5 }
  - WP( x = x+1 , x<5 ) = x<4
  - Best possible: x==0 || x==1
25 Abstracting Expressions via F
- F = { e1,...,en }
- ImpliesF(e)
  - best boolean function over F that implies e
- ImpliedByF(e)
  - best boolean function over F implied by e
  - ImpliedByF(e) = !ImpliesF(!e)
26 ImpliesF(e) and ImpliedByF(e)
27 Computing ImpliesF(e)
- minterm m = d1 && ... && dn
  - where di = ei or di = !ei
- ImpliesF(e)
  - disjunction of all minterms that imply e
- Naïve approach
  - generate all 2^n possible minterms
  - for each minterm m, use decision procedure to check validity of each implication m ⇒ e
- Many optimizations possible
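
The naïve enumeration can be sketched in a few lines. This sketch makes one loud assumption: instead of calling a decision procedure, it checks each implication by brute force over a small finite range of integers, which is only a stand-in and is not how c2bp works. The names `implies_f` and `valuation` are hypothetical. The example reuses F = { x==0, x==1, x<5 } and the expression x<4.

```python
from itertools import product


def implies_f(preds, e, states):
    """Return the minterms over preds (one bool per predicate) that imply e.

    Validity of m => e is checked by enumerating `states` (an assumption
    replacing the theorem prover). Unsatisfiable minterms imply e vacuously
    and are therefore included.
    """
    minterms = set()
    for m in product([True, False], repeat=len(preds)):
        models = [s for s in states if all(p(s) == v for p, v in zip(preds, m))]
        if all(e(s) for s in models):
            minterms.add(m)
    return minterms


def valuation(preds, s):
    """The minterm that a concrete state s satisfies."""
    return tuple(p(s) for p in preds)


states = range(-10, 11)
F = [lambda x: x == 0, lambda x: x == 1, lambda x: x < 5]
best = implies_f(F, lambda x: x < 4, states)

# Concrete states in which the resulting boolean function is true.
satisfying = [x for x in states if valuation(F, x) in best]
print(satisfying)  # [0, 1], i.e. the function is equivalent to x==0 || x==1
```

The loop visits all 2^n minterms, which is exactly the cost that the optimizations mentioned on the slide aim to avoid.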
28 Abstracting Assignments
- if ImpliesF(WP(s, ei)) is true before s then
  - ei is true after s
- if ImpliesF(WP(s, !ei)) is true before s then
  - ei is false after s
- {ei} = ImpliesF(WP(s, ei)) ? true
       : ImpliesF(WP(s, !ei)) ? false
       : *
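
The rule above can be exercised on the statement y = y+1 with F = { y<4, y<5 }. The sketch below is illustrative only: it again replaces the decision procedure with brute-force enumeration over a small integer range, and the helper names (`implies_f`, `abstract_assign`, `next_value`) are hypothetical. `None` plays the role of the nondeterministic value `*`.

```python
from itertools import product


def implies_f(preds, e, states):
    """Minterms over preds that imply e, checked by enumeration (assumption)."""
    minterms = set()
    for m in product([True, False], repeat=len(preds)):
        models = [s for s in states if all(p(s) == v for p, v in zip(preds, m))]
        if all(e(s) for s in models):
            minterms.add(m)
    return minterms


def abstract_assign(preds, update, states):
    """For each predicate e_i return (ImpliesF(WP(s, e_i)), ImpliesF(WP(s, !e_i))).

    WP of an assignment is computed by substitution: evaluate e_i in the
    state produced by `update`.
    """
    table = []
    for p in preds:
        wp_pos = lambda s, p=p: p(update(s))
        wp_neg = lambda s, p=p: not p(update(s))
        table.append((implies_f(preds, wp_pos, states),
                      implies_f(preds, wp_neg, states)))
    return table


def next_value(entry, b):
    """New value of one boolean variable given the current valuation b."""
    definitely_true, definitely_false = entry
    if b in definitely_true:
        return True
    if b in definitely_false:
        return False
    return None  # '*': the abstraction cannot tell


F = [lambda y: y < 4, lambda y: y < 5]
table = abstract_assign(F, lambda y: y + 1, range(-10, 11))

print(next_value(table[0], (True, True)))   # {y<4} after y=y+1 when y<4, y<5: unknown
print(next_value(table[0], (False, True)))  # {y<4} after y=y+1 when y==4: False
print(next_value(table[1], (True, True)))   # {y<5} after y=y+1 when y<4: True
```

Each boolean variable is updated independently of the others, so this sketch corresponds to a Cartesian-style update rather than the most precise relational one.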
29 Assignment Example
- Statement in P: y = y+1
- Predicates in E: {x==y}
- Weakest Precondition: WP(y = y+1, x==y) = (x == y+1)
  - ImpliesF( x==y+1 ) = false
  - ImpliesF( x!=y+1 ) = x==y
- Abstraction of assignment in B: {x==y} = {x==y} ? false : *
30 Abstracting Assumes
- WP( assume(e) , Q) = e ⇒ Q
- assume(e) is abstracted to
  - assume( ImpliedByF(e) )
- Example
  - F = { x==2, x<5 }
  - assume(x < 2) is abstracted to
    - assume( {x<5} && !{x==2} )
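
The example can be reproduced with the identity ImpliedByF(e) = !ImpliesF(!e). As before, this is a sketch under the assumption that brute-force enumeration over a small integer range stands in for the decision procedure; `implies_f` and `implied_by_f` are hypothetical helper names.

```python
from itertools import product


def implies_f(preds, e, states):
    """Minterms over preds that imply e, checked by enumeration (assumption)."""
    minterms = set()
    for m in product([True, False], repeat=len(preds)):
        models = [s for s in states if all(p(s) == v for p, v in zip(preds, m))]
        if all(e(s) for s in models):
            minterms.add(m)
    return minterms


def implied_by_f(preds, e, states):
    """ImpliedByF(e) = !ImpliesF(!e), represented as a set of minterms."""
    all_minterms = set(product([True, False], repeat=len(preds)))
    return all_minterms - implies_f(preds, lambda s: not e(s), states)


F = [lambda x: x == 2, lambda x: x < 5]  # order: {x==2}, {x<5}
guard = implied_by_f(F, lambda x: x < 2, range(-10, 11))
print(guard)  # {(False, True)}: !{x==2} && {x<5}
```

The single surviving minterm matches the abstracted guard on the slide: every state with x < 2 satisfies it, and no stronger function over F has that property.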
31 Abstracting Procedures
- Each predicate in F is annotated as being either global or local to a particular procedure
- Procedures abstracted in two passes
  - a signature is produced for each procedure in isolation
  - procedure calls are abstracted given the callees' signatures
32 Abstracting a procedure call
- Procedure call
  - a sequence of assignments from actuals to formals
  - see assignment abstraction
- Procedure return
  - NOP for C--, with the assumption that all predicates mention either only globals or only locals
  - with pointers and with mixed predicates
    - Most complicated part of c2bp
    - Covered in the advanced topics section
33
C-- program:
  int g;
  void cmp(int a, int b) {
    goto L1, L2;
    L1: assume(a == b); g = 0; return;
    L2: assume(a != b); g = 1; return;
  }
  main(int x, int y) {
    cmp(x, y);
    assume(!g); assume(x != y); assert(0);
  }

Boolean program:
  decl {g==0};
  void cmp({a==b}) {
    goto L1, L2;
    L1: assume({a==b}); {g==0} := T; return;
    L2: assume(!{a==b}); {g==0} := F; return;
  }
  main({x==y}) {
    cmp({x==y});
    assume({g==0}); assume(!{x==y}); assert(0);
  }

Preds: {x==y}, {g==0}, {a==b}
34 Precision
- For program P and E = {e1,...,en}, there exist two ideal abstractions
  - Boolean(P,E): most precise abstraction
  - Cartesian(P,E): less precise abstraction, where each boolean variable is updated independently
  - See Ball-Podelski-Rajamani, TACAS 00
- Theory
  - with an ideal theorem prover, c2bp can compute Cartesian(P,E)
- Practice
  - c2bp computes a less precise abstraction than Cartesian(P,E)
  - we use Das/Dill's technique to incrementally improve precision
  - with an ideal theorem prover, the combination of c2bp + Das/Dill can compute Boolean(P,E)
35 C-
  Types        τ ::= void | bool | int | ref τ
  Expressions  e ::= c | x | e1 op e2 | &x | *x
  LExpression  l ::= x | *x
  Declaration  d ::= τ x1,x2,...,xn
  Statements   s ::= skip | goto L1,L2,...,Ln | L: s
                   | assume(e)
                   | l = e | l = f(e1,e2,...,en)
                   | return x
                   | s1; s2; ...; sn
  Procedures   p ::= τ f(x1: τ1, x2: τ2, ..., xn: τn)
  Program      g ::= d1 d2 ... dn p1 p2 ... pn
36 Two Problems
- Extending SLAM tools for pointers
- Dealing with imprecision of alias analysis
37 Pointers and SLAM
- With pointers, C supports call by reference
  - Strictly speaking, C supports only call by value
  - With pointers and the address-of operator, one can simulate call-by-reference
- Boolean programs support only call-by-value-result
  - SLAM mimics call-by-reference with call-by-value-result
- Extra complications
  - address operator (&) in C
  - multiple levels of pointer dereference in C
38 Assignments + Pointers
- Statement in P: *p = 3
- Predicates in E: {x==5}
- Weakest Precondition: WP( *p = 3 , x==5 ) = x==5
  - What if *p and x alias?
- Correct Weakest Precondition: (p==&x and 3==5) or (p!=&x and x==5)
- We use Das's pointer analysis [PLDI 2000] to prune disjuncts representing infeasible alias scenarios.
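
The alias-aware precondition can be sanity-checked by executing the store on every state of a tiny model. This is a sketch under simplifying assumptions: memory is a dictionary from variable names to values, a pointer is represented by the name of the variable it points to, and the function names are hypothetical.

```python
from itertools import product


def exec_store(state, ptr, value):
    """Execute *ptr = value, where state[ptr] names the pointed-to variable."""
    new = dict(state)
    new[state[ptr]] = value
    return new


def wp_alias_aware(state, value):
    """(p == &x and value == 5) or (p != &x and x == 5)."""
    aliased = state["p"] == "x"
    return (aliased and value == 5) or (not aliased and state["x"] == 5)


def wp_naive(state):
    """Ignores aliasing: x == 5."""
    return state["x"] == 5


states = [{"p": target, "x": x, "y": y}
          for target, x, y in product(["x", "y"], range(3, 7), range(3, 7))]

# The alias-aware formula holds before the store exactly when x == 5 holds after it.
correct = all(wp_alias_aware(s, 3) == (exec_store(s, "p", 3)["x"] == 5)
              for s in states)

# The naive formula is wrong only in states where p points to x.
naive_counterexamples = [s for s in states
                         if wp_naive(s) != (exec_store(s, "p", 3)["x"] == 5)]
print(correct, len(naive_counterexamples))
```

When an alias analysis proves that p cannot point to x, the first disjunct is infeasible and the precondition collapses back to x==5.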
39 Abstracting Procedure Return
- Need to account for
  - lhs of procedure call
  - mixed predicates
  - side-effects of procedure
- Boolean programs support only call-by-value-result
  - C2bp models all side-effects using return processing
40 Abstracting Procedure Returns
- Let a be an actual at call-site P(...)
  - pre(a) = the value of a before transition to P
- Let f be a formal of a procedure P
  - pre(f) = the value of f upon entry to P
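41
Legend: predicate, call/return relation, call/return assign

C program:
  int R (int f) { int r = f+1; f = 0; return r; }
  Q() { int x = 1; x = R(x); }

Predicates:
  R: {f==pre(f)}, {r==pre(f)+1}
  Q: {x==1}, {x==2}

Call/return relation:
  f = x
  pre(f) == pre(x)
  x = r

Call/return assign:
  WP(f=x, f==pre(f)) = (x==pre(f))
  x==pre(f) is true at the call to R
  WP(x=r, x==2) = (r==2)
  pre(f)==pre(x) and pre(x)==1 and r==pre(f)+1 implies r==2

Boolean program:
  Q() {
    {x==1},{x==2} := T,F;
    s := R(T);
    {x==2} := s & {x==1};
  }
  bool R ( {f==pre(f)} ) {
    {r==pre(f)+1} := {f==pre(f)};
    {f==pre(f)} := *;
    return {r==pre(f)+1};
  }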
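42
Legend: predicate, call/return relation, call/return assign

C program:
  int R (int f) { int r = f+1; f = 0; return r; }
  Q() { int x = 1; x = R(x); }

Predicates:
  R: {f==pre(f)}, {r==pre(f)+1}
  Q: {x==1}, {x==2}

Call/return relation:
  f = x
  pre(f) == pre(x)
  x = r

Call/return assign:
  WP(f=x, f==pre(f)) = (x==pre(f))
  x==pre(f) is true at the call to R
  WP(x=r, x==2) = (r==2)
  pre(f)==pre(x) and pre(x)==1 and r==pre(f)+1 implies r==2

Boolean program:
  Q() {
    {x==1},{x==2} := T,F;
    s := R(T);
    {x==1},{x==2} := *, s & {x==1};
  }
  bool R ( {f==pre(f)} ) {
    {r==pre(f)+1} := {f==pre(f)};
    {f==pre(f)} := *;
    return {r==pre(f)+1};
  }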
43 Extending Pre-states
- Suppose formal parameter is a pointer
  - e.g. P(int *f)
- pre( f )
  - value of f upon entry to P
  - can't change during P
- *pre( f )
  - value of dereference of pre( f )
  - can change during P
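44
Legend: predicate, call/return relation, call/return assign

C program:
  Q() { int x = 1; R(&x); }
  int R (int *a) { *a = *a+1; }

Predicates:
  R: {a==pre(a)}, {*pre(a)==pre(*a)}, {*pre(a)==pre(*a)+1}
  Q: {x==1}, {x==2}

Call/return relation:
  a = &x
  pre(a) = &x
  *pre(a) == pre(*a)+1

Call/return assign:
  pre(x)==1 and pre(*a)==pre(x) and *pre(a)==pre(*a)+1 and pre(a)==&x implies x==2

Boolean program:
  Q() {
    {x==1},{x==2} := T,F;
    s := R(T,T);
    {x==2} := s & {x==1};
  }
  bool R ( {a==pre(a)}, {*pre(a)==pre(*a)} ) {
    {*pre(a)==pre(*a)+1} := {*pre(a)==pre(*a)};
    return {*pre(a)==pre(*a)+1};
  }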
45 Reachability in Boolean Programs
- Algorithm based on CFL reachability
  - [Sharir-Pnueli 81] [Reps-Sagiv-Horwitz 95]
- path edge of procedure P
  - <entry,d1> → <s2,d2>
  - if P's entry is reachable in state d1 then statement s2 of P is reachable in state d2
- summary edge of procedure P
  - <call Q,d1> → <ret,d2>
  - if P calls Q from state d1 then Q returns to P in state d2
46 Symbolic CFL reachability
- Partition path edges by their target
  - PE(v) = { <d1,d2> | <entry,d1> → <v,d2> }
- What is <d1,d2> for boolean programs?
  - A bit-vector!
- What is PE(v)?
  - A set of bit-vectors
- Use a BDD (attached to v) to represent PE(v)
47 Binary Decision Diagrams
  void cmp ( e2 ) {
  [5]  goto L1, L2;
  [6]  L1: assume( e2 );
  [7]      gz := T; goto L3;
  [8]  L2: assume( !e2 );
  [9]      gz := F; goto L3;
  [10] L3: return;
  }
- Canonical representation of
  - boolean functions
  - set of (fixed-length) bitvectors
  - binary relations over finite domains
- Efficient algorithms for common dataflow operations
  - transfer function
  - join/meet
  - subsumption test
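48
  decl gz;
  main( e ) {
  [1]  equal( e );
  [2]  assume( gz );
  [3]  assume( !e );
  [4]  assert(F);
  }

  Path-edge relations shown for main:
    gz=gz' & e=e'
    e=e' & gz'=e'
    e=e' & gz'=1 & e'=1
    e2=e

  void cmp ( e2 ) {
  [5]  goto L1, L2;
  [6]  L1: assume( e2 );
  [7]      gz := T; goto L3;
  [8]  L2: assume( !e2 );
  [9]      gz := F; goto L3;
  [10] L3: return;
  }

  Path-edge relations shown for cmp:
    gz=gz' & e2=e2'
    gz=gz' & e2=e2' & e2'=T
    e2=e2' & e2'=T & gz'=T
    gz=gz' & e2=e2' & e2'=F
    e2=e2' & e2'=F & gz'=F
    e2=e2' & gz'=e2'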
49 Reachability Summary
- Explicit representation of CFG
- Implicit representation of path edges and summary edges
- Generation of hierarchical error traces
- Complexity: O(E * 2^O(N))
  - E is the size of the CFG
  - N is the max. number of variables in scope