Title: Structural Invariants
1Structural Invariants
- Ranjit Jhala Rupak Majumdar Ru-Gang Xu
2Generating Invariants
- Verification Conditions (VC)
  - Generic: Applicable to any user-specified assertion
  - Precise: Captures all path correlations
  - Manual intervention: Requires annotations
- Dataflow Analysis and Abstract Interpretation
  - Specialized: Uses a fixed abstraction
  - Imprecise: Merges paths
  - Automatic: No user intervention
3Structural Invariants
- Lightweight VC Generation Technique
  - Generic: Proves a wide range of safety properties
  - Precise: Does not capture all path correlations, but captures structural idioms
  - Automatic: Simple approximations of loop invariants
  - Scalable: Leverages well-optimized compiler techniques
4Plan
- 1. Preliminaries
- 2. Dominator Invariants
- 3. φ-Strengthening
- 4. Extensions
- 5. Experiments
5Example
- Conditional locking on a predicate p
    0: lock = 0;
    1: if (p) {
    2:   assert(lock == 0);
    3:   lock = 1;
    4: }
    5: . . .
- Control flow information: Stmt 0 dominates Stmt 2. All paths going to stmt 2 must go through stmt 0.
- Data flow information: Between stmt 0 and stmt 2, lock does not get modified, so the value of lock after stmt 0 is the same as the value of lock at stmt 2.
6Two Compiler Algorithms
- Dominator Tree: captures control flow information
  - For two stmts n, n' we say n' dominates n, n' ∈ D(n), if every path to n goes through n'
  - n' is the immediate dominator of n, n' = Idom(n), iff n' is a strict dominator of n and every strict dominator of n is also a dominator of n'
  - A dominator tree is a tree whose nodes are statements, where each parent immediately dominates its children
- Static Single Assignment: captures dataflow information
  - Each variable is syntactically assigned once
  - φ-assignments deal with joins
  - x = φ(x1, x2, . . . , xn)
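The dominator definitions on this slide can be sketched in a few lines of Python. This is a minimal illustration using the textbook iterative algorithm, not the implementation from the talk. The CFG dictionary is my reading of the conditional-locking example (branch outcomes are modeled as nodes `n1_true` and `n1_false`), and the function names are hypothetical.

```python
def dominators(cfg, entry):
    """Return {node: set of its dominators}; assumes every node is reachable."""
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)
    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def immediate_dominator(dom, n):
    """The strict dominator of n that every other strict dominator of n dominates."""
    strict = dom[n] - {n}
    for d in strict:
        if all(other in dom[d] for other in strict):
            return d
    return None  # the entry node has no immediate dominator


# Assumed CFG for: lock = 0; if (p) { assert(lock == 0); lock = 1; } ...
CFG = {
    "n0": ["n1_true", "n1_false"],
    "n1_true": ["n2"],
    "n2": ["n3"],
    "n3": ["n5"],
    "n1_false": ["n5"],
    "n5": [],
}
DOM = dominators(CFG, "n0")
```

Here `DOM["n2"]` contains `n0`, so stmt 0 dominates stmt 2, while the join node `n5` is dominated only by `n0` and itself because it can be reached along either branch.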
7Example
Original program
    0: lock = 0;
    1: if (p) {
    2:   assert(lock == 0);
    3:   lock = 1;
    4: }
    5: . . .
SSA Form
    0: lock0 = 0;
    1: if (p) {
    2:   assert(lock0 == 0);
    3:   lock1 = 1;
    4: }
    5: lock2 = φ(lock0, lock1);
    6: . . .
Dominator Tree (figure): nodes n0, n1,true, n1,false, n2, n3, n5
8Dominator Invariants
Dominator tree annotated with node predicates (figure):
n0: lock0 = 0
n1,true: p
n1,false: ¬p
n2: assert(lock0 == 0)
n3: lock1 = 1
n5: lock2 = φ(lock0, lock1)
n0 ∧ n1,true = (lock0 = 0) ∧ p ⇒ (lock0 = 0)
9Dominator Invariants
- Theorem
  - For a node n, DInv(n) = n ∧ (∧_{n' dominates n} n') is an n-invariant, where each node name stands for the predicate that node establishes
- After executing a node n', n' holds
- If n' dominates n, then along every path to n there is a point where n' holds
- After the last occurrence of n', the only nodes visited are those that are dominated by n'
- None of the variables in n' are modified afterwards, since the program is in SSA form
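As a rough sketch of the theorem, the dominator invariant of a node can be assembled by walking up the dominator tree and conjoining the predicate of each node on the way. The `IDOM` and `PRED` tables below are my transcription of the running example, and `dinv` is a hypothetical helper name. Asserts and φ-nodes are assumed to contribute no predicate of their own.

```python
# Immediate dominators for the running example (assumed from the slide's tree).
IDOM = {
    "n0": None,
    "n1,true": "n0",
    "n1,false": "n0",
    "n2": "n1,true",
    "n3": "n2",
    "n5": "n0",
}

# Predicate established by each node; nodes missing here contribute "true".
PRED = {
    "n0": "(lock0 = 0)",
    "n1,true": "p",
    "n1,false": "¬p",
    "n3": "(lock1 = 1)",
}


def dinv(node):
    """Conjoin the predicates of node and of all its dominators."""
    conjuncts = []
    while node is not None:
        if node in PRED:
            conjuncts.append(PRED[node])
        node = IDOM[node]
    return " ∧ ".join(conjuncts) if conjuncts else "true"
```

For the assert at n2, `dinv("n2")` yields `p ∧ (lock0 = 0)`, which is enough to discharge `lock0 = 0`. At the join node n5 only `(lock0 = 0)` survives, because neither branch dominates the join.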
10Dominator Invariants are Insufficient
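    00: lock0 = 0;
    01: if (p) {
    02:   assert(lock0 == 0);
    03:   lock1 = 1;
    04: }
    05: lock2 = φ(lock0, lock1);
    06: . . .
    07: if (p) {
    08:   assert(lock2 == 1);
    09:   lock3 = 0;
    10: }
    11: lock4 = φ(lock2, lock3);
Dominator tree (figure): nodes n0, n1,true, n1,false, n2, n3, n5, n7,true, n7,false, n8, n9, n11
(lock0 = 0) ∧ (lock2 = φ(lock0, lock1)) ∧ p ⇏ (lock2 = 1)
The implication fails because the φ-assignment is uninterpreted and leaves lock2 unconstrained.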
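To see concretely why the dominator invariant is too weak for the second assert, one can enumerate states by brute force. This is only a sketch: the enumeration over lock values in {0, 1} stands in for a theorem prover, and treating the φ-assignment as adding no constraint on lock2 is my reading of the slide.

```python
from itertools import product


def dominator_invariant(state):
    # (lock0 = 0) ∧ p. The φ-assignment to lock2 is uninterpreted,
    # so it places no constraint on lock2 here.
    return state["lock0"] == 0 and state["p"]


def counterexamples():
    """States that satisfy the dominator invariant but violate lock2 == 1."""
    found = []
    for p, lock0, lock1, lock2 in product([False, True], [0, 1], [0, 1], [0, 1]):
        state = {"p": p, "lock0": lock0, "lock1": lock1, "lock2": lock2}
        if dominator_invariant(state) and state["lock2"] != 1:
            found.append(state)
    return found
```

Any state with `p` true, `lock0 == 0` and `lock2 == 0` is a counterexample, so the assertion at statement 08 cannot be proved from dominator information alone.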
12φ-Strengthening
Figure: CFG from entry through Idom(n) to the φ-node n with predecessors n' and n'', alongside the dominator tree.
n: x3 = φ(x1, x2)
(((x3 = x1) ∧ DInv(Idom(n), n')) ∨ ((x3 = x2) ∧ DInv(Idom(n), n''))) ∧ SI(entry, Idom(n))
132-SI is Sufficient
(Program and dominator tree as on slide 10.)
p ∧ (lock0 = 0) ∧ (lock2 = φ(lock0, lock1)) ⇒ (lock2 = 1)
142-SI is Sufficient
(Program and dominator tree as on slide 10.)
p ∧ (lock0 = 0) ∧ (((lock2 = lock0) ∧ ¬p) ∨ ((lock2 = lock1) ∧ p ∧ (lock1 = 1))) ⇒ (lock2 = 1)
15k-Structural Invariants (k-SI)
- k-SI unfolds the nesting structure of the program
- k is the branch-width sensitivity of the analysis
Dominator Tree (figure): subtree below Idom(n) leading to n, with nodes labeled k, k-1, k-2
16k-Structural Invariants (k-SI)
(n' ≠ n)  Ψ(n', n, k) = n' ∧ (∧_{n'' ∈ D(n) \ D(n')} n'' ∧ Γ(n'', k))
Dealing with φ-nodes:
Γ(n, k) = ∨_{nj ∈ pred(n)} (Φ(n, j) ∧ Ψ(Idom(n), nj, k-1))
Dominator Tree (figure): subtree below Idom(n) leading to n, with nodes labeled k, k-1, k-2
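The recursion between Ψ and Γ can be sketched as two mutually recursive Python functions evaluated over concrete states. Several things here are assumptions on my part rather than statements from the slides: the reading of Ψ(n', n, k) as the predicate of n' conjoined with the predicates and Γ terms of the dominators of n that are not dominators of n'; the base case Γ(n, k) = true for k ≤ 1 and for non-φ nodes; asserts contributing no predicate; and the brute-force check over lock values in {0, 1}, which stands in for a theorem prover.

```python
from itertools import product

# Immediate dominators for the part of the example that reaches the assert at n8.
IDOM = {"n0": None, "n1_true": "n0", "n1_false": "n0", "n2": "n1_true",
        "n3": "n2", "n5": "n0", "n7_true": "n5", "n8": "n7_true"}

# φ-node -> (target variable, [(CFG predecessor, source variable), ...])
PHI = {"n5": ("lock2", [("n1_false", "lock0"), ("n3", "lock1")])}

NODE_PRED = {
    "n0": lambda s: s["lock0"] == 0,
    "n1_true": lambda s: s["p"],
    "n1_false": lambda s: not s["p"],
    "n3": lambda s: s["lock1"] == 1,
    "n7_true": lambda s: s["p"],
}


def dominators(n):
    chain = []
    while n is not None:
        chain.append(n)
        n = IDOM[n]
    return chain


def node_pred(n, s):
    return NODE_PRED.get(n, lambda _s: True)(s)


def psi(top, n, k, s):
    """Psi(top, n, k) in state s, where top dominates n."""
    between = [m for m in dominators(n) if m not in dominators(top)]
    return node_pred(top, s) and all(
        node_pred(m, s) and gamma(m, k, s) for m in between)


def gamma(n, k, s):
    """Disjunction over the predecessors of a φ-node, with depth budget k."""
    if n not in PHI or k <= 1:
        return True
    target, args = PHI[n]
    return any(s[target] == s[src] and psi(IDOM[n], pred, k - 1, s)
               for pred, src in args)


def proves_lock2_is_1(k):
    """Does Psi(n0, n8, k) imply lock2 == 1 in every enumerated state?"""
    for p, l0, l1, l2 in product([False, True], [0, 1], [0, 1], [0, 1]):
        s = {"p": p, "lock0": l0, "lock1": l1, "lock2": l2}
        if psi("n0", "n8", k, s) and s["lock2"] != 1:
            return False
    return True
```

Under these assumptions `proves_lock2_is_1(1)` is false, matching the dominator-invariant case, and `proves_lock2_is_1(2)` is true, since unfolding the φ-node at n5 once recovers the correlation between p and lock2.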
17Plan
- 1. Preliminaries
- 2. Dominator Invariants
- 3. φ-Strengthening
- 4. Extensions
- Interprocedural k-SI
- Pointers in k-SI
- 5. Experiments
18Interprocedural k-SI Callees
For assertions within a function g that calls f
- When f is called, we define the predicate of l = f(e1, e2, . . . en) as follows:
  - Recursively construct the k-SI for the exit node of f
  - Rename all local variables of f
  - Substitute formals with actuals
  - Substitute the return value
- (∃L. Ψ(nfe, nfx, k)) [l/ret, e1/x1, e2/x2, . . . en/xn]
If f is recursive, the predicate of l = f(e1, e2, . . . en) is true
19Interprocedural k-SI Callers
For assertions within a function f that is called by g
- When f has callers, we generalize dominators by adding edges from every call site x = f(. . .) to the entry node of function f.
- If n' dominates n, then every path from the entry node of main to n passes through n'
- The k-SI algorithm for transitive callers is the same as the intraprocedural algorithm
20k-SI with Pointers
    *p = *q + 5
Points-to analysis: q -> {a, b}, p -> {c, d}
    if (q == &a) tmp = a;
    if (q == &b) tmp = b;
    if (p == &c) c = tmp + 5;
    if (p == &d) d = tmp + 5;
- Run may-points-to analysis
- Substitute dereferences with the possible memory being pointed to
- Run k-SI
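The substitution step can be sketched as a small source-to-source rewrite. The function below is hypothetical and handles only the one statement shape shown on the slide, which I read as `*p = *q + 5` (the operator is my assumption, since it is not legible in the slide text). The points-to sets are passed in as a plain dictionary rather than computed.

```python
def rewrite_deref_assignment(dst_ptr, src_ptr, const, points_to):
    """Rewrite `*dst_ptr = *src_ptr + const` into guarded scalar assignments.

    points_to maps each pointer name to the list of variables it may point to.
    """
    # One guarded load per possible target of the source pointer.
    lines = [f"if ({src_ptr} == &{v}) tmp = {v};" for v in points_to[src_ptr]]
    # One guarded store per possible target of the destination pointer.
    lines += [f"if ({dst_ptr} == &{v}) {v} = tmp + {const};"
              for v in points_to[dst_ptr]]
    return lines


REWRITTEN = rewrite_deref_assignment(
    "p", "q", 5, {"q": ["a", "b"], "p": ["c", "d"]})
```

After this rewrite the program contains only scalar assignments under branch conditions, which is the form the k-SI construction already handles.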
21Limitations
Dealing with loops: if n is a φl-node (a φ-node at a loop head) then Γ(n, k) = true
    x0 = 1;
    while () {
    L:  x1 = φl(x0, x3);
        if (x1 == 1) x2 = 1;
        x3 = φ(x1, x2);
    }
    x4 = φ(x0, x3);
    assert(x4 == 1);
k-SI will lose the value of x at L, making x unconstrained at the assert.
Dataflow analysis that tracks (x = 1) will prove the assertion.
However, only 13 false positives out of 653 total asserts were due to this limitation.
22Implementation
- psi: an assertion checker for C programs using structural invariants
  - CIL Library
  - Flow-insensitive May Alias Analysis
  - Simplify Theorem Prover
23Experiments
Tagged Unions: a predicate must hold when a field is accessed
    assert(ip->proto == TCP);
    TCP *tcphdr = (TCP *) ip->h;
Locks: lock / unlock in strict alternation
Privilege Levels: permissions set before syscalls. For suid programs, setuid or seteuid is called before system or execv
24Experiments
Tagged Unions: a predicate must hold when a field is accessed
Locks: lock / unlock in strict alternation
Privilege Levels: permissions set before syscalls
25Experiments
Scalable: k-SI runs at least an order of magnitude faster than complex tools such as BLAST
Effective: For simple properties, k-SI has a similar number of false positives (FP) to BLAST
Chart (figure): BLAST vs. 2-SI
26Experiments
- Precision Tradeoffs
  - Path sensitivity: 2-SI captures the relevant structural idioms
  - Past k = 2, FP does not decrease. Complex control flow is rare and usually irrelevant
Although 2-SI is simple, 2-SI is sufficient.
27Summary
- k-SI is a scalable, lightweight algorithm for finding invariants that prove useful properties of programs
  - Transform to SSA
  - Find dominators
  - Handle φ-nodes as disjunctions
  - Depth of that disjunction is a tunable parameter
  - Use an automatic theorem prover to check whether the assertion holds.
28Questions?
29Extra slides
30Example Conditional Locking
Dominator tree (figure): nodes n0, n1,true, n1,false, n2, n3, n5, n7,true, n7,false, n8, n9, n11
Ψ(n0, n8, 2) = n8 ∧ n7,true ∧ n5 ∧ Γ(n5, 2) ∧ n0
Γ(n5, 2) = (Φ(n5, 1) ∧ Ψ(n0, n1,false, 1)) ∨ (Φ(n5, 2) ∧ Ψ(n0, n3, 1))
Ψ(n0, n8, 2) = p ∧ (((lock2 = lock0) ∧ ¬p) ∨ ((lock2 = lock1) ∧ p)) ∧ (lock0 = 0)
31Abstract Summarization
An abstract summary S of a function f is a subset of P × P' such that, whenever an execution of f from a start state satisfying p can end in a state satisfying p', we have (p, p') ∈ S
Using summaries: Replace calls l = f(e1, e2, . . . en) with (∃L. ∨_{(p,p') ∈ S} (p ∧ p')) [l/ret, e1/x1, e2/x2, . . . en/xn]
Making summaries: S contains each (p, p') ∈ P × P' such that p ∧ Ψ(nfe, nfx, k) ∧ p' is satisfiable
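The "making summaries" rule can be illustrated with a toy example. Everything concrete below is hypothetical: the function being summarized (one that returns 1 - x), the predicate sets, and the finite domain used to decide satisfiability by enumeration in place of a theorem prover. The `relation` argument plays the role of the function's k-SI from entry to exit.

```python
from itertools import product


def make_summary(in_preds, out_preds, relation, inputs, outputs):
    """Keep each pair (p, p') for which p ∧ relation ∧ p' is satisfiable."""
    summary = set()
    for (p_name, p), (q_name, q) in product(in_preds.items(), out_preds.items()):
        if any(p(x) and relation(x, r) and q(r) for x in inputs for r in outputs):
            summary.add((p_name, q_name))
    return summary


# Hypothetical function f(x) = 1 - x over the domain {0, 1}.
IN_PREDS = {"x = 0": lambda x: x == 0, "x = 1": lambda x: x == 1}
OUT_PREDS = {"ret = 0": lambda r: r == 0, "ret = 1": lambda r: r == 1}
SUMMARY = make_summary(IN_PREDS, OUT_PREDS,
                       lambda x, r: r == 1 - x, [0, 1], [0, 1])
```

The resulting summary keeps only the two consistent pairs, so a call site can be replaced by the disjunction of those pairs instead of the full invariant of the callee.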
32Abstract Summarization
- If ∨P and ∨P' are not both equivalent to true, we add
  - the assertion (∃L. ∨P) [l/ret, e1/x1, e2/x2, . . . en/xn] in front of each call l = f(e1, e2, . . . en), to check preconditions
  - the assertion ∨P' at the exit nodes of functions, to check the postcondition
33φ-Strengthening
    0: lock0 = 0;
    1: if (p) {
    2:   assert(lock0 == 0);
    3:   lock1 = 1;
    4: }
    5: lock2 = φ(lock0, lock1);
    6: . . .
Recursively compute the dominator invariant of each predecessor of a φ-node:
((lock2 = lock1) ∧ DInv(n3)) ∨ ((lock2 = lock0) ∧ DInv(n1,false))
34φ-Strengthening
n0: lock0 = 0
n1,true: p
n2: assert(lock0 == 0)
n3: lock1 = 1
n1,false: ¬p
n5: lock2 = φ(lock0, lock1)
DInv(n3) = (lock1 = 1) ∧ p ∧ (lock0 = 0)
DInv(n1,false) = ¬p ∧ (lock0 = 0)
n0 is the immediate dominator of n5, so it dominates both n3 and n1,false
DInv(n, n') is the conjunction of dominators between n and n'
35Scalability Abstract Summarization
An abstract summary of a function is a relation on input predicates P and output predicates P'
Using summaries: Replace calls with summaries. Instead of using the k-SI of the function, each call is replaced with a more concise summary.
Making summaries: S contains each (p, p') ∈ P × P' such that p ∧ Ψ(nfe, nfx, k) ∧ p' is satisfiable
If ∨P and ∨P' are not both equivalent to true, we add assertions to check preconditions and postconditions.