Title: Flow Analysis: Model Checking and Dataflow Analysis
1Flow Analysis: Model Checking and Dataflow Analysis
2Model Checking
- Model checking, narrowly interpreted
- Decision procedures for checking if a given
Kripke structure is a model for a given formula
of a modal logic.
3Why is this of interest?
- Because the dynamics of a discrete system are effectively captured as a discrete event system
- Because many useful dynamic properties of systems are captured by modal logics
- Thus, model checking ⇒ system verification
4- Model checking, generally interpreted
- Algorithms, rather than proof calculi, for
system verification which operate on a system
model (semantics), rather than a system
description (syntax).
5- There are many different model checking algorithms, depending on
- The system model
- The specification formalism
6A Specific Model Checking Problem
[Diagram: the Implementation (system model) is related to the Specification (system properties) by a satisfaction relation: models, implements, refines]
7A Specific Model Checking Problem
[Diagram: the Specification (system properties) is less detailed, the Implementation (system model) is more detailed; they are related by a satisfaction relation: models, implements, refines]
8Characteristics of system models which favor model checking over other verification techniques
- ongoing input/output behavior (not single input, single result)
- concurrency (not single control flow)
- control intensive (not lots of data manipulation)
9Examples
- Control logic of hardware designs
- Communication protocols
- Device drivers
10Paradigmatic Example Mutual Exclusion
P1:
  loop
    out: x1 := 1; last := 1
    req: await x2 = 0 or last = 2
    in:  x1 := 0
  end loop.
P2:
  loop
    out: x2 := 1; last := 2
    req: await x1 = 0 or last = 1
    in:  x2 := 0
  end loop.
11Model-checking problem
I ⊨ S
I: system model
S: system property
⊨: satisfaction relation
13Various factors influence choice of model
- State based vs. event based
- Concurrency model
While the choice of system model is important for ease of modeling in a given situation, the only thing that is important for model checking is that the system model can be translated into some form of state-transition graph.
14[Figure: state-transition graph with states q1 (observation a), q2 (observations a,b), q3 (observation b)]
15Semantics: State-Transition Graph
- Q: set of states, e.g. {q1, q2, q3}
- A: set of atomic observations, e.g. {a, b}
- → ⊆ Q × Q: transition relation, e.g. q1 → q2
- [ ]: Q → 2^A: observation function, e.g. [q1] = {a} (a set of observations)
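A minimal Python sketch of the state-transition graph defined above. The observation function follows the example graph; the edge set is an assumption (only q1 → q2 is named on the slide, and the remaining edges are chosen so that the run q1 q3 q1 q3 q1 q2 q2 exists), and the helper names `is_run` and `trace` are hypothetical.

```python
# Hypothetical encoding of the three-state example graph.
Q = {"q1", "q2", "q3"}                      # states
A = {"a", "b"}                              # atomic observations
# Assumed edges: only q1 -> q2 is stated explicitly in the slides.
T = {("q1", "q2"), ("q1", "q3"), ("q3", "q1"), ("q2", "q2")}
OBS = {"q1": {"a"}, "q2": {"a", "b"}, "q3": {"b"}}   # observation function Q -> 2^A


def is_run(states):
    """A finite run is a state sequence whose consecutive states are related by T."""
    return all(s in Q for s in states) and all(
        (s, t) in T for s, t in zip(states, states[1:])
    )


def trace(states):
    """The trace of a run is the sequence of observation sets along it."""
    return [OBS[s] for s in states]


run = ["q1", "q3", "q1", "q3", "q1", "q2", "q2"]
print(is_run(run), trace(run))
```

Keeping the graph as plain sets and a dictionary mirrors the tuple (Q, A, →, [ ]) directly; an enumerative model checker would usually replace the explicit edge set by a successor function.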
16Important Restriction
- Until notified, restrict attention to
- finite-state transition systems
- Q is finite
17Syntax Finite State Programs
- Parallel composition of C programs, without function calls
- Each variable has a finite range
- We'll write such programs as guarded commands
18Mutual-exclusion protocol
P1:
  loop
    out: x1 := 1; last := 1
    req: await x2 = 0 or last = 2
    in:  x1 := 0
  end loop.
P2:
  loop
    out: x2 := 1; last := 2
    req: await x1 = 0 or last = 1
    in:  x2 := 0
  end loop.
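19[Figure: fragment of the state-transition graph of the mutual-exclusion protocol, with states oo001, or012, ro101, io101, rr112, ir112]
State components: pc1 ∈ {o,r,i}, pc2 ∈ {o,r,i}, x1 ∈ {0,1}, x2 ∈ {0,1}, last ∈ {1,2}
3 · 3 · 2 · 2 · 2 = 72 states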
20State Explosion Problem
The translation from a system description to a
state-transition graph usually involves an
exponential blow-up!
e.g., n boolean variables ⇒ 2^n states
21Model-checking problem
I ⊨ S
I: system model
S: system property
⊨: satisfaction relation
22Three important decisions when choosing system properties
- operational vs. declarative: automata vs. logic
- may vs. must: branching vs. linear time
- prohibiting bad vs. desiring good behavior: safety vs. liveness
The three decisions are orthogonal, and they lead
to substantially different model-checking
problems.
23Safety vs. liveness
Safety: something bad will never happen
Liveness: something good will happen (but we don't know when)
24Safety vs. liveness for sequential programs
Safety: the program will never produce a wrong result (partial correctness); shown by induction on control flow
Liveness: the program will produce a result (termination); shown by well-founded induction on data
25Safety vs. liveness for state-transition graphs
Safety: those properties whose violation always has a finite witness (if something bad happens on an infinite run, then it happens already on some finite prefix)
Liveness: those properties whose violation never has a finite witness (no matter what happens along a finite run, something good could still happen later)
26[Figure: state-transition graph with states q1 (observation a), q2 (observations a,b), q3 (observation b)]
Run: q1 → q3 → q1 → q3 → q1 → q2 → q2 →
Trace: a → b → a → b → a → a,b → a,b →
27State-transition graph S = (Q, A, →, [ ])
Finite runs: finRuns(S) ⊆ Q*
Infinite runs: infRuns(S) ⊆ Q^ω
Finite traces: finTraces(S) ⊆ (2^A)*
Infinite traces: infTraces(S) ⊆ (2^A)^ω
28Safety: the properties that can be checked on finRuns
Liveness: the properties that cannot be checked on finRuns
29This is much easier.
Safety: the properties that can be checked on finRuns
Liveness: the properties that cannot be checked on finRuns (they need to be checked on infRuns)
30Example Mutual exclusion
It cannot happen that both processes are in their
critical sections simultaneously.
31Example Mutual exclusion
It cannot happen that both processes are in their
critical sections simultaneously.
Safety
32Example Bounded overtaking
Whenever process P1 wants to enter the critical
section, then process P2 gets to enter at most
once before process P1 gets to enter.
33Example Bounded overtaking
Whenever process P1 wants to enter the critical
section, then process P2 gets to enter at most
once before process P1 gets to enter.
Safety
34Example Starvation freedom
Whenever process P1 wants to enter the critical
section, provided process P2 never stays in the
critical section forever, P1 gets to enter
eventually.
35Example Starvation freedom
Whenever process P1 wants to enter the critical
section, provided process P2 never stays in the
critical section forever, P1 gets to enter
eventually.
Liveness
36[Figure: state-transition graph with states q1 (observation a), q2 (observations a,b), q3 (observation b)]
infRuns ? finRuns
37[Figure: state-transition graph with states q1 (observation a), q2 (observations a,b), q3 (observation b)]
infRuns ? finRuns
? closure
finite branching
38For state-transition graphs, all
properties are safety properties !
39Two remarks
The vast majority of properties to be verified
are safety.
While nobody will ever observe the violation of a
true liveness property, fairness is a useful
abstraction that turns complicated safety into
simple liveness.
40Safety Model Checking
- Requirement: The system should always stay within some safe region
- Input: A state transition graph
- Input: A set of good states (invariants)
- Output: Safe if all executions maintain the invariant, Unsafe otherwise (and a trace)
41Mutual-exclusion protocol
P1:
  loop
    out: x1 := 1; last := 1
    req: await x2 = 0 or last = 2
    in:  x1 := 0
  end loop.
P2:
  loop
    out: x2 := 1; last := 2
    req: await x1 = 0 or last = 1
    in:  x2 := 0
  end loop.
42Example Mutual exclusion
It cannot happen that both processes are in their
critical sections simultaneously. Invariant: ¬(pc1 = in ∧ pc2 = in)
43From Safety to Reachability
- Input: A state transition graph
- Input: A set of bad states
- Output: Safe if there is no run from an initial state to any bad state, Unsafe otherwise (and a trace)
44Model Checking Algorithm
- Graph Search
- Linear time in the size of the graph
- Exponential time in the size of the program
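Safety checking as graph search can be sketched in Python as a breadth-first search over the mutual-exclusion protocol. The state encoding (pc1, pc2, x1, x2, last), the assumption that each labeled statement executes atomically, and all function names are illustrative choices rather than part of the slides.

```python
from collections import deque

INIT = ("o", "o", 0, 0, 1)   # (pc1, pc2, x1, x2, last): the state written oo001


def successors(state):
    """Interleaving semantics: exactly one process takes one step."""
    pc1, pc2, x1, x2, last = state
    # Process P1
    if pc1 == "o":
        yield ("r", pc2, 1, x2, 1)            # out: x1 := 1; last := 1
    elif pc1 == "r":
        if x2 == 0 or last == 2:              # req: await x2 = 0 or last = 2
            yield ("i", pc2, x1, x2, last)
    else:
        yield ("o", pc2, 0, x2, last)         # in: x1 := 0
    # Process P2 (symmetric)
    if pc2 == "o":
        yield (pc1, "r", x1, 1, 2)            # out: x2 := 1; last := 2
    elif pc2 == "r":
        if x1 == 0 or last == 1:              # req: await x1 = 0 or last = 1
            yield (pc1, "i", x1, x2, last)
    else:
        yield (pc1, "o", x1, 0, last)         # in: x2 := 0


def reach(init, succ, is_bad):
    """BFS from init. Returns (visited states, trace to a bad state or None)."""
    parent = {init: None}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        if is_bad(s):
            path = []
            while s is not None:              # follow parent pointers back to init
                path.append(s)
                s = parent[s]
            return set(parent), path[::-1]
        for t in succ(s):
            if t not in parent:
                parent[t] = s
                queue.append(t)
    return set(parent), None


def both_critical(s):
    return s[0] == "i" and s[1] == "i"


reachable, trace = reach(INIT, successors, both_critical)
print(len(reachable), trace)
```

Under these assumptions the search visits 10 of the 72 possible states and finds none with both processes in their critical sections, so `trace` is `None`. The search is linear in the number of states and edges it touches, which is the quantity that blows up exponentially in the size of the program.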
45Enumerative Model Checking
- Provide access to each state
- For each state, provide access to neighboring
states - Implement classical graph algorithms
46State Space Explosion
- Biggest problem is state space explosion
- Many heuristics
- Search on-the-fly
- Do not store dead variables
- Bitstate hashing (unsound, but useful debugging tool)
- Tools: Spin, CMC, Zing
47Symbolic Model Checking
- Idea: Represent sets of states symbolically, using constraints
- E.g., 1 ≤ x ≤ 100 represents the 100 states x = 1, x = 2, ..., x = 100
- Represent both sets of initial states and transition relation implicitly
48Datatype symreg
- symreg ∈ 2^Q (each symreg value denotes a set of states)
- constant EmptySet
- with EmptySet denoting the empty set of states
- ∪, ∩: symreg × symreg → symreg
- =, ⊆: symreg × symreg → bool
49Symbolic Transition Graph
- A transition graph
- A symreg data structure
- Operations
- Init: symgraph → symreg
- Post: symreg × symgraph → symreg
- Pre: symreg × symgraph → symreg
50Symbolic Search
Input: symgraph G, region σT
Output: Answer to reachability problem (G, σT)

begin
  σR := Init(G)
  repeat forever
    if σR ∩ σT ≠ EmptySet then return yes
    if Post(σR, G) ⊆ σR then return no
    σR := σR ∪ Post(σR, G)
end
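The search loop above can be sketched in Python. Here regions are plain frozensets, a deliberately naive stand-in for the symreg datatype; the function name `symbolic_search` and the toy transition system (a counter modulo 8 that steps by 2) are hypothetical.

```python
def symbolic_search(init, post, target):
    """Return True iff some state in `target` is reachable from `init`.

    Only region operations are used: intersection, inclusion, union, Post.
    """
    region = init
    while True:
        if region & target:            # R ∩ T ≠ EmptySet
            return True
        image = post(region)
        if image <= region:            # Post(R, G) ⊆ R: fixpoint reached
            return False
        region = region | image        # R := R ∪ Post(R, G)


# Hypothetical toy graph: states 0..7, one transition x -> (x + 2) mod 8.
def post(region):
    return frozenset((x + 2) % 8 for x in region)


init = frozenset({0})
print(symbolic_search(init, post, frozenset({6})))   # 6 is reachable
print(symbolic_search(init, post, frozenset({3})))   # odd states are not
```

Because the loop touches states only through the region operations, swapping the frozenset for a more compact representation leaves the algorithm unchanged.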
51Symbolic Search
- Guaranteed to terminate for finite state systems
- Computes a fixpoint of reachable states
- How can we implement symreg?
52Predicates
- How about representing sets of states using formulas?
- Sets of states: Formula over X
- Transition relation: Formula over X and X'
- Boolean operations are easy
- Can compute post
- Post(S)(X') = ∃X. S(X) ∧ T(X, X')
53Mutual-exclusion protocol
P1:
  loop
    out: x1 := 1; last := 1
    req: await x2 = 0 or last = 2
    in:  x1 := 0
  end loop.
P2:
  loop
    out: x2 := 1; last := 2
    req: await x1 = 0 or last = 1
    in:  x2 := 0
  end loop.
54Not so nice
- Checking equality/implication of formulas is expensive
- More importantly, the size of Post^i(I) can grow with i, with no good heuristics for simplifications
55Additional Desirable Properties
- All operations must be efficient in practice
- Should maintain compactness whenever possible
- Canonical representations
- Representing initial states and transition
relation from the program description should be
efficient
56Binary Decision Diagrams
- Efficient representations of boolean functions
- Share commonalities
- Ordered BDDs
- Fix a linear ordering of the variables in X
- BDD: DAG, with nodes labeled with boolean variables
- Each variable occurs 0 or 1 times along a path
- Paths in the DAG encode assignments to variables
57More on BDDs
- A reduced OBDD is obtained by applying the following two transformations until neither applies
- Identify and merge isomorphic subgraphs
- Eliminate internal vertices with identical left and right children (no redundancy)
58Properties
- Given an ordering of variables, every function has a unique OBDD representation (canonicity)
- Isomorphism = Equivalence
- V(B) is valid iff B is the 1 BDD
- Ordering influences size
59Operations on BDDs
- Boolean operations, existential quantification can be done efficiently on BDDs
- BDDs provide a good symbolic representation for finite state spaces
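The two reduction rules and the resulting canonicity can be illustrated with a small Python sketch. It builds a reduced ordered BDD by exhaustive Shannon expansion, which is exponential and only suitable for illustration; practical BDD packages build diagrams with an apply algorithm instead. The class and method names are hypothetical, and the variable order is assumed to be the argument order of the Python function.

```python
class BDD:
    """Reduced ordered BDD; node ids 0 and 1 are the terminal nodes."""

    def __init__(self, num_vars):
        self.num_vars = num_vars
        self.unique = {}     # (var, low, high) -> node id
        self.nodes = {}      # node id -> (var, low, high)

    def mk(self, var, low, high):
        if low == high:                  # rule 2: drop a redundant test
            return low
        key = (var, low, high)
        if key not in self.unique:       # rule 1: share isomorphic subgraphs
            node = len(self.nodes) + 2
            self.unique[key] = node
            self.nodes[node] = key
        return self.unique[key]

    def build(self, f, var=0, env=()):
        """Expand f on variable `var`, then on the remaining variables in order."""
        if var == self.num_vars:
            return 1 if f(*env) else 0
        low = self.build(f, var + 1, env + (False,))
        high = self.build(f, var + 1, env + (True,))
        return self.mk(var, low, high)


bdd = BDD(3)
f1 = bdd.build(lambda a, b, c: (a and b) or (a and c))
f2 = bdd.build(lambda a, b, c: a and (b or c))
print(f1 == f2, len(bdd.nodes))
```

Since both formulas denote the same function, they map to the same node id, so equivalence checking reduces to comparing two integers; a tautology builds to the terminal 1.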
60Safety Properties
- Not all safety properties can be written as invariants on the program state space
- For example, if correctness depends on the order of events
- Locks can be acquired and released in alternation; it is an error to acquire/release a lock twice in succession without an intermediate release/acquire
61Monitors
- Write the ordering of events as an automaton (called the monitor)
- Take the product of the system with the monitor
- The monitor tracks the sequence of events
- It goes to a special bad state if a bad sequence occurs
- Now we can express the property as an invariant: the monitor state is never bad
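A small Python sketch of the monitor construction for the lock-alternation property. The event names, the monitor state names, and the two toy systems are assumptions made for illustration; the system is given as a function from a state to its (event, successor) pairs.

```python
# Monitor automaton: states "unlocked", "locked", "bad" (hypothetical names).
MONITOR = {
    ("unlocked", "acquire"): "locked",
    ("unlocked", "release"): "bad",
    ("locked", "release"): "unlocked",
    ("locked", "acquire"): "bad",
}


def step(m, event):
    """Advance the monitor; "bad" is a trap, unknown events are ignored."""
    if m == "bad":
        return "bad"
    return MONITOR.get((m, event), m)


def monitor_never_bad(sys_init, sys_succ):
    """Explore the product of system and monitor; check the invariant m != "bad"."""
    start = (sys_init, "unlocked")
    seen, stack = {start}, [start]
    while stack:
        s, m = stack.pop()
        if m == "bad":
            return False
        for event, s2 in sys_succ(s):
            nxt = (s2, step(m, event))
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True


def good(s):     # location 0 --acquire--> 1 --release--> 0
    return [("acquire", 1)] if s == 0 else [("release", 0)]


def buggy(s):    # at location 1 the system may acquire a second time
    return [("acquire", 1)] if s == 0 else [("release", 0), ("acquire", 1)]


print(monitor_never_bad(0, good), monitor_never_bad(0, buggy))
```

The ordering property has become a plain reachability question on pairs (system state, monitor state), so the same graph search used for invariants applies.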
62Infinite State
- Unfortunately, programs are not finite state
- Variables range over (formally) infinite domains
- Functions have recursion
- Can dynamically create data/processes
63Transition System Semantics
- Can construct an infinite state transition system from a program
- States: The state of the program
- (variable state, memory, CFA location)
- Transitions: q → q' iff in the operational semantics, there is a transition of the program from q to q'
- Initial state: Initial state of the program
64How do we extend to infinite state?
- Generalize symbolic model checking!
- What operators did we require?
- Empty region
- Boolean operations: ∨, ∧, ⇒
- Emptiness check
- Pre
65Symbolic Data Structures
- Why not use our assertion language as symbolic data structures?
- Empty region: false
- Boolean operations: (syntactic) boolean operations
- Emptiness: Decision procedure for satisfiability
- Pre: WP
- What is the problem with this representation?
66Termination
- Each operation can be computed
- But iterating Pre or Post operations may not terminate
- We have come back to the same problem as before: loop invariants helped us get around infinite iterations
- What do we do now?
67Before we proceed
- What is the sign of the following product?
- −12433454628 × 94329545771
68Lesson
- For a particular property, the exact state need not be tracked
- One can abstract the trace, and yet reason about the program
- Abstraction
- -ve × +ve = -ve
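The sign question can be answered without computing the product, as this Python sketch shows. The three-valued sign domain and the function names `sign` and `sign_mul` are illustrative assumptions.

```python
def sign(n):
    """Abstraction: map an integer to its sign."""
    return "+" if n > 0 else "-" if n < 0 else "0"


def sign_mul(a, b):
    """Abstract multiplication, defined on signs only."""
    if a == "0" or b == "0":
        return "0"
    return "+" if a == b else "-"


x, y = -12433454628, 94329545771
abstract_answer = sign_mul(sign(x), sign(y))   # never multiplies x and y
print(abstract_answer)
```

For multiplication the abstract result always agrees with the sign of the concrete product. Addition would need an extra "unknown" element, since the sign of a positive plus a negative number is not determined by the signs alone.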
69Lecture
70Model Checking Algorithm
- Graph Search
- Linear time in the size of the graph
- Exponential time in the size of the program
71Abstract Interpretation
- The state transition graph is infinite
- Suppose we put a finite grid on top
72Existential Abstraction
- Every time s → s', we have [s] → [s'], where [s] is the grid cell (abstract state) containing s
- This allows more behaviors
73Abstract Model Checking
- Search the abstract graph until fixpoint
74Simulation Relations
- A relation ⪯ ⊆ Q × Q is a simulation relation if s ⪯ s' implies
- Observation(s) = Observation(s')
- For all t such that s → t
- there exists t' such that s' → t'
- and t ⪯ t'
- Formally captures notion of more behaviors
- Implies trace containment and
- containment of reachable states
75Main Theorem
- s ⪯ [s] is a simulation relation
- If an error is unreachable in Abs(G) then it is unreachable in G
- Plan
- Find a suitable grid to make the graph finite state
- Run the finite-state model checking algorithm on this abstract graph
- If abstract graph is safe, say safe and stop
76What if the Abstract Graph says Unsafe?
- The error may or may not be reachable in the actual system
- Stop and say Don't know
77What if the Abstract Graph says Unsafe?
- Or, put a finer grid on the state space
- And try again
- The set of abstract reachable states is smaller
- Where do these grids come from?
78Grids Predicate Abstraction
- Suppose we fix a set of predicates on program variables
- E.g., old = new, lock = 0, lock = 1
- Grid: Two states of the program are equivalent if they agree on the values of all predicates
- N predicates: 2^N abstract states
- How do we compute the grid from the program?
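A minimal Python sketch of the grid induced by a set of predicates: each concrete state is mapped to the tuple of truth values of the predicates, and two states fall in the same cell exactly when the tuples agree. The dictionary-based state encoding and the name `abstract` are assumptions for illustration.

```python
# The three example predicates, as functions on a concrete program state.
PREDICATES = {
    "old = new": lambda s: s["old"] == s["new"],
    "lock = 0": lambda s: s["lock"] == 0,
    "lock = 1": lambda s: s["lock"] == 1,
}


def abstract(state):
    """Map a concrete state to its grid cell: one truth value per predicate."""
    return tuple(p(state) for p in PREDICATES.values())


s1 = {"old": 3, "new": 3, "lock": 0}
s2 = {"old": 7, "new": 7, "lock": 0}   # differs from s1, but in the same cell
s3 = {"old": 3, "new": 4, "lock": 1}

print(abstract(s1), abstract(s2), abstract(s3))
print(2 ** len(PREDICATES), "cells at most")
```

Infinitely many concrete states (all values of old and new) collapse into at most 2^3 = 8 cells, some of which are empty because lock = 0 and lock = 1 cannot both hold.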
79Predicate Abstraction
Region Representation formulas over predicates
Set of states
Abstract set: P1P2P4 ∨ P1P2P3P4
80Predicate Abstraction
- Box: abstract variable valuation
- BoxCover(S): Set of boxes covering S
- Theorem prover used to compute BoxCover
81Post, Pre
[Figure: post(S) and its abstract over-approximation post#(S)]
- pre(S, op) = {s | ∃s' ∈ S. s →op s'} (Weakest Precondition)
- post(S, op) = {s' | ∃s ∈ S. s →op s'} (Strongest Postcondition)
- Abstract operator: post#
- post(S, op) ⊆ post#(S, op)
- Concrete operator: pre
- Classical Weakest Precondition
82Computing Post
[Figure: post(S) and its abstract over-approximation post#(S)]
- For each predicate p, check if
- S ⇒ Pre(p, op): then have a conjunct p
- S ⇒ Pre(¬p, op): then have a conjunct ¬p
- Else have no conjunct corresponding to p
- Use a theorem prover for these queries
83Example
- I have predicates
- lock = 0, new = old, lock = 1
- My current region is lock = 0 ∧ new = old
- Consider the assignment new = new + 1
- What is abstract post?
84Example
- WP(new = new + 1, lock = 0) is lock = 0
- WP(new = new + 1, lock = 1) is lock = 1
- WP(new = new + 1, new = old) is new + 1 = old
- lock = 0 ∧ new = old ⇒ lock = 0: YES
- lock = 0 ∧ new = old ⇒ lock ≠ 0: NO
- lock = 0 ∧ new = old ⇒ lock = 1: NO
- lock = 0 ∧ new = old ⇒ lock ≠ 1: YES
- lock = 0 ∧ new = old ⇒ new + 1 = old: NO
- lock = 0 ∧ new = old ⇒ new + 1 ≠ old: YES
- So post is lock = 0 ∧ lock ≠ 1 ∧ new ≠ old
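The abstract post computation worked through above can be sketched in Python. As a loud assumption, the theorem prover is replaced by brute-force checking over a small finite sample of states, which is adequate for this example but is not a sound decision procedure in general. The names `STATES`, `PREDS`, and `abstract_post` are hypothetical.

```python
from itertools import product

# Finite sample of concrete states standing in for a theorem prover (assumption).
STATES = [
    dict(lock=l, new=n, old=o)
    for l, n, o in product((0, 1), range(-3, 4), range(-3, 4))
]

PREDS = {
    "lock = 0": lambda s: s["lock"] == 0,
    "lock = 1": lambda s: s["lock"] == 1,
    "new = old": lambda s: s["new"] == s["old"],
}


def abstract_post(region, op):
    """Keep p if region => WP(op, p); keep not-p if region => WP(op, not p)."""
    sample = [s for s in STATES if region(s)]
    conjuncts = []
    for name, p in PREDS.items():
        # For a deterministic assignment, WP(op, p) holds at s iff p holds at op(s).
        after = [p(op(s)) for s in sample]
        if all(after):
            conjuncts.append(name)
        elif not any(after):
            conjuncts.append("not (" + name + ")")
        # otherwise: no conjunct for this predicate
    return conjuncts


def region(s):                        # lock = 0 and new = old
    return s["lock"] == 0 and s["new"] == s["old"]


def increment(s):                     # the assignment new = new + 1
    return dict(s, new=s["new"] + 1)


print(abstract_post(region, increment))
```

The result lists the same three conjuncts as the worked example: lock = 0, not (lock = 1), and not (new = old).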
85Symbolic Search with Predicates
- Symreg: Boolean formulas of (fixed set of) predicates
- Boolean operations: easy
- Emptiness check: Decision procedures
- Post: The abstract post computation algorithm
- Can now implement symbolic reachability search!
86Symbolic Search
- Terminates because the state space is finite
- Where did loop invariants go?
- Loop invariants are synthesized in the reachability process
- Loop invariants are boolean combinations of the abstraction predicates
87Example
Example ( ) {
1:  do { lock();
         old = new;
2:       if (*) {
3:         unlock(); new++;
         }
4:  } while (new != old);
5:  unlock(); return;
}
Q: Is Error reachable?
88Example CFG
[Figure: control flow graph of the Example() program from slide 87, locations 1 to 5; the edge leaving location 1 is labeled lock(); old = new]
90Example
- Fix predicates
- lock = 0, lock = 1, new = old
- Assume that lock = 0 at the beginning
- Behavior of lock()
- If lock = 0 then lock := 1 else error
- Behavior of unlock()
- If lock = 1 then lock := 0 else error
91Symbolic Search
Set of predicates: LOCK = 0, LOCK = 1, new = old
LOCK0 Æ new old
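The abstract search for the locking example can be sketched in Python as follows. Everything about the encoding is an assumption for illustration: an abstract state is (location, lock, eq) with lock in {0, 1} and eq the truth value of new = old, the control flow graph is written by hand with a hypothetical exit location 6, and the abstract effect of each statement is hard-coded rather than computed by a theorem prover.

```python
ERROR = "error"


def do_lock(lock, eq):                 # lock(): if lock = 0 then lock := 1 else error
    return [(1, eq)] if lock == 0 else [ERROR]


def do_unlock(lock, eq):               # unlock(): if lock = 1 then lock := 0 else error
    return [(0, eq)] if lock == 1 else [ERROR]


def copy_new(lock, eq):                # old = new establishes new = old
    return [(lock, True)]


def incr_new(lock, eq):                # new++: from new = old we get new != old,
    return [(lock, False)] if eq else [(lock, False), (lock, True)]   # else unknown


def assume_eq(lock, eq):               # branch taken only when new = old
    return [(lock, eq)] if eq else []


def assume_neq(lock, eq):              # branch taken only when new != old
    return [(lock, eq)] if not eq else []


def skip(lock, eq):                    # nondeterministic branch if (*)
    return [(lock, eq)]


CFG = {
    1: [([do_lock, copy_new], 2)],
    2: [([skip], 3), ([skip], 4)],
    3: [([do_unlock, incr_new], 4)],
    4: [([assume_neq], 1), ([assume_eq], 5)],
    5: [([do_unlock], 6)],
    6: [],
}


def run_ops(ops, state):
    """Apply a sequence of abstract statements; ERROR is propagated."""
    current = {state}
    for op in ops:
        nxt = set()
        for s in current:
            if s == ERROR:
                nxt.add(ERROR)
            else:
                nxt.update(op(*s))
        current = nxt
    return current


def error_reachable(cfg):
    init = [(1, 0, True), (1, 0, False)]      # lock = 0; new = old unknown
    seen, work = set(init), list(init)
    while work:
        loc, lock, eq = work.pop()
        for ops, target in cfg[loc]:
            for result in run_ops(ops, (lock, eq)):
                if result == ERROR:
                    return True
                nxt = (target,) + result
                if nxt not in seen:
                    seen.add(nxt)
                    work.append(nxt)
    return False


print(error_reachable(CFG))
```

With these three predicates the search terminates and reports that the error is unreachable. The predicate new = old is what rules out leaving the loop right after the unlock() at location 3, which would otherwise lead to a second unlock().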
92Big Question
- Who gives us these predicates?
- Answer 1: The user
- Manual abstractions
- Dataflow analysis
93Lattice Lingo
- A lattice is a set S, together with binary operators ∨ and ∧
- Elements Top and Bottom
- Idea: Each lattice element denotes some set of program states
- Can frame abstract reachability based analysis for any lattice
- Provided we have a transfer function
- δ: Lattice × Command → Lattice
- This is what we have been doing in model checking
94Abstract Interpretation
- Lattice based analysis is usually called abstract interpretation or dataflow analysis
- Formalizes and unifies a lot of program analyses
- In the context of model checking, it is called abstract model checking
95Some formalism
- Define functions
- α: Program State → Lattice (abstraction)
- γ: Lattice → Set of program states (concretization)
- Simulation relation
- s ⪯ α(s)
- s → t in the program ⇒ α(s) → α(t) in the lattice world
- That is, α(t) ∈ δ(α(s))
- This ensures the analysis is sound
- Once the lattice, abstraction function, and transfer function are defined, flow analysis is computing reachability on the lattice by iterating the transfer function
96Approximate Analysis
- Many program dataflow analyses do not really compute exact reachability with the lattice
- Exact reachability = path-sensitive analysis
- Use the structure of the control flow graph to approximate the result
- Get an over-approximation of the set of reachable program states
97Example Flow Sensitive Analysis
- For each control flow node, keep track of the set of reachable states (along any program path) to that node
- Information may be lost at merge points
- Assumption: All paths of the control flow graph can be executed
- Ignore conditional statements
98Example Constant Propagation
- Constant lattice (T on top, the integer constants in the middle, ⊥ at the bottom):
               T
  ... -3 -2 -1 0 1 2 3 ...
               ⊥
- Dataflow lattice: Map each program variable to an element of the constant lattice
99Transfer Function
- δ(σ, x := n) = σ[x ↦ n]
- Dataflow analysis
- For each CFG node, keep a lattice element.
- Initially, the lattice element is T (denoting we have no information)
- At the node n', we take
- σ(n') = ∧ δ(σ(n), c(n, n')), where c(n, n') is the command on the edge from n to n'
- Over all predecessors n of n'
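A minimal Python sketch of constant propagation as an iterated meet over predecessors. The command encoding (None for a skipped conditional, (x, n) for x := n, (x, y, n) for x := y + n), the example program, and all names are assumptions for illustration; TOP means no information yet and BOT means not a constant.

```python
TOP, BOT = "TOP", "BOT"


def meet(a, b):
    """Meet on the constant lattice: TOP is neutral, unequal constants give BOT."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOT


def meet_env(e1, e2):
    return {v: meet(e1[v], e2[v]) for v in e1}


def transfer(env, cmd):
    """Abstract effect of one command on a map from variables to lattice values."""
    if cmd is None:                       # conditionals are ignored
        return env
    out = dict(env)
    if len(cmd) == 2:                     # x := n
        x, n = cmd
        out[x] = n
    else:                                 # x := y + n
        x, y, n = cmd
        out[x] = env[y] + n if isinstance(env[y], int) else env[y]
    return out


def analyze(nodes, edges, variables):
    """Iterate sigma(dst) := sigma(dst) meet transfer(sigma(src), cmd) to a fixpoint."""
    sigma = {n: {v: TOP for v in variables} for n in nodes}
    changed = True
    while changed:
        changed = False
        for src, cmd, dst in edges:
            new = meet_env(sigma[dst], transfer(sigma[src], cmd))
            if new != sigma[dst]:
                sigma[dst] = new
                changed = True
    return sigma


# Hypothetical program: x := 1; if (*) y := 2 else y := 3; z := x + 1
EDGES = [
    (0, ("x", 1), 1),
    (1, None, 2), (1, None, 3),
    (2, ("y", 2), 4), (3, ("y", 3), 4),
    (4, ("z", "x", 1), 5),
]
sigma = analyze(range(6), EDGES, ["x", "y", "z"])
print(sigma[5])
```

At the merge node 4 the two branches disagree on y, so y drops to BOT, while x and z remain known constants. This is the loss of information at merge points mentioned in the slides.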
100Constant Propagation
- Approximate! Loses information
- (1) in merges
- (2) in disregarding conditionals
- But faster than model checking
- Examples?
101Back to Locking Example
- Let the predicates Lock = 0 and Lock = 1 denote the information lattice
- Show that flow sensitive dataflow analysis cannot prove that the program is correct
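One way to see this is a short Python sketch of a flow-sensitive analysis that tracks only the lock value and ignores conditionals. The CFG encoding (locations 1 to 5 of the locking example plus a hypothetical exit location 6), the lattice values TOP and BOT, and the function names are assumptions for illustration.

```python
TOP, BOT = "TOP", "BOT"        # TOP: not yet reached, BOT: lock value unknown


def meet(a, b):
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOT


# Edges (source, effect, target); the branch conditions are dropped.
EDGES = [
    (1, "lock", 2), (2, None, 3), (2, None, 4),
    (3, "unlock", 4), (4, None, 1), (4, None, 5), (5, "unlock", 6),
]


def apply(effect, v):
    """A failed call goes to error, so executions that continue have the new value."""
    if effect is None or v == TOP:
        return v
    return 1 if effect == "lock" else 0


def analyze():
    value = {n: TOP for n in range(1, 7)}
    value[1] = 0                                   # lock = 0 at the beginning
    changed = True
    while changed:
        changed = False
        for src, effect, dst in EDGES:
            new = meet(value[dst], apply(effect, value[src]))
            if new != value[dst]:
                value[dst], changed = new, True
    # lock() is provably safe only where lock = 0, unlock() only where lock = 1
    required = {"lock": 0, "unlock": 1}
    alarms = sorted({src for src, effect, _ in EDGES
                     if effect and value[src] != required[effect]})
    return value, alarms


value, alarms = analyze()
print(value, alarms)
```

At location 4 the facts lock = 1 (from the else branch) and lock = 0 (after unlock()) are merged into BOT, so the analysis raises alarms at locations 1 and 5 even though the error is unreachable in the program. Tracking new = old per path, as the predicate-based search does, is what removes them.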
102Flow Insensitive Analysis
- Even more approximate
- Keep one lattice element for the entire program
- Effectively disregard the order of commands in the program!
- Much faster analysis than flow sensitive
- But results are much cruder of course!
103- When I run a model checker, it goes to compute
the result and never comes back. When I run a
dataflow analysis, it comes back immediately and
says Don't know!