Title: Static Analysis of Heap-manipulating Low-level software
1. Static Analysis of Heap-manipulating Low-level Software
- Sumit Gulwani (MSR, Redmond)
- Ashish Tiwari (SRI International)
2. Related Work
- Alias/Pointer Analysis: work done in the early 90s
  - Must/May equalities
  - Considered not expressive enough
- Shape Analysis: work that followed
  - Fancy predicates
  - Need to provide transfer functions for each of them
- This work
  - Must/May equalities extended with quantifiers (provides the expressiveness of an infinite class of predicates and avoids the need to provide transfer functions)
3. Example 1
  struct List { int len, *data; List* next; }
  ListOfPtrArray(struct List* x) {
    for (y := x; y ≠ null; y := y→next) {
      t := ?;
      y→len := t;
      y→data := malloc(4t);
    }
    for (y := x; y ≠ null; y := y→next)
      for (z := 0; z < y→len; z := z+1)
        y→data→(4z) := ...;
  }
Invariant required after the first loop for proving memory safety:
  ∃i: List(x,i,next) ∧ ∀j: (0 ≤ j < i) ⇒ Array(x→next^j→data, 4×(x→next^j→len))
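The example can be sketched as compilable C. This is a minimal sketch under stated assumptions: element indexing (`data[z]`) replaces the slide's byte offsets, a caller-supplied array `lens` stands in for the nondeterministic value `t`, `make_list` is a hypothetical helper that is not on the slide, and the stored value `0` is only a placeholder because the slide elides the right-hand side.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct List {
    int len;              /* number of ints in data */
    int *data;            /* array allocated by the first loop */
    struct List *next;
};

/* Hypothetical helper (not on the slide): build a list of n empty nodes. */
struct List *make_list(int n) {
    struct List *head = NULL;
    for (int k = 0; k < n; k++) {
        struct List *node = malloc(sizeof *node);
        if (node == NULL) abort();
        node->len = 0;
        node->data = NULL;
        node->next = head;
        head = node;
    }
    return head;
}

/* lens[k] (assumed non-negative) plays the role of t for the k-th node.
   Returns the total number of array cells written by the second loop. */
int list_of_ptr_array(struct List *x, const int *lens) {
    int k = 0, writes = 0;
    for (struct List *y = x; y != NULL; y = y->next, k++) {
        int t = lens[k];
        y->len = t;
        y->data = malloc((size_t)t * sizeof(int));
        if (t > 0 && y->data == NULL) abort();
    }
    /* Safe because every node reached here owns an array of len cells. */
    for (struct List *y = x; y != NULL; y = y->next)
        for (int z = 0; z < y->len; z++) {
            y->data[z] = 0;   /* placeholder value, elided on the slide */
            writes++;
        }
    return writes;
}
```

The second loop is memory safe only because the first loop established, for every node, that `data` points to `len` valid cells; that per-node fact is what the quantified invariant records.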
4. Example 2
  struct List { int data; List* next; }
  List2Array(struct List* x) {
    (π1)
    n := 0;
    for (y := x; y ≠ null; y := y→next) {
      (π2)
      n := n+1;
    }
    (π3)
    A := malloc(4n);
    (π4)
    y := x;
    for (k := 0; k < n; k++) {
      A→(4k) := y→data;
      y := y→next;
    }
    (π5)
    return A;
  }
Prog. Point | Invariant
π1 | ∃i: List(x,i,next)
π2 | ∃i: List(x,i,next), 0 ≤ n < i, y = x→next^n
π3 | List(x,n,next)
π4 | List(x,n,next), Array(A,4n)
π5 | List(x,n,next), Array(A,4n), ∀j: (0 ≤ j < n) ⇒ A→(4j) = x→next^j→data
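A compilable C rendering of this example, offered as a sketch: it assumes element indexing (`A[k]`) instead of byte offsets, and `check_final_invariant` is a hypothetical helper that tests at run time the property the analysis is meant to establish statically at the last program point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct List {
    int data;
    struct List *next;
};

/* Copies the list into a fresh array; *out_n receives the length n. */
int *list2array(const struct List *x, int *out_n) {
    int n = 0;
    for (const struct List *y = x; y != NULL; y = y->next)
        n = n + 1;                        /* after the loop: List(x, n, next) */
    int *A = malloc((size_t)n * sizeof(int));
    if (n > 0 && A == NULL) abort();
    const struct List *y = x;
    for (int k = 0; k < n; k++) {
        A[k] = y->data;                   /* y is non-null because k < n */
        y = y->next;
    }
    *out_n = n;
    return A;
}

/* Hypothetical runtime check: A[j] equals the data of the j-th node for
   0 <= j < n, and the n-th successor of x is null. */
int check_final_invariant(const struct List *x, const int *A, int n) {
    const struct List *y = x;
    for (int j = 0; j < n; j++, y = y->next)
        if (y == NULL || A[j] != y->data) return 0;
    return y == NULL;
}
```

The dereference `y->data` in the second loop is the access whose safety depends on knowing that the list has exactly `n` nodes, which is the fact carried from the first loop.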
5. Outline
- Abstract Domain
- Implies Algorithm
- Join Algorithm
- Meet Algorithm
- PostAssignment Algorithm
6. Abstract Domain
- Elements have the form ∃V: Cons ∧ Must ∧ May
- Must ::= true | Must ∧ ∀V (Cons ⇒ e1 = e2)
- May ::= true | May ∧ ∀V (Cons ⇒ e1 ≠ e2)
- e ::= y | c | e1 ± e2 | c·e | e1→e2^e3 | valid | null
- Cons represents constraints over the base abstract domain, e.g. a combination of linear arithmetic and uninterpreted functions
7. Expressiveness
- List(x,i,next) ≡ i ≥ 0 ∧ x→next^i = null ∧ ∀j: (0 ≤ j < i) ⇒ Valid(x→next^j)
- Valid(e) ≡ e→w = valid
- Array(x,k) ≡ ∀j: (0 ≤ j < k) ⇒ Valid(x+j)
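To make the List predicate concrete, here is a small C sketch that evaluates it on an actual heap. It assumes that "valid" can be approximated by "non-null", which is a simplification of the slide's Valid predicate; `struct Node` and `is_list` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct Node { struct Node *next; };

/* List(x, i, next) on a concrete heap: i >= 0, the i-th next-successor of x
   is null, and every earlier successor is non-null (standing in for Valid). */
int is_list(const struct Node *x, int i) {
    if (i < 0) return 0;
    for (int j = 0; j < i; j++) {
        if (x == NULL) return 0;    /* Valid(x->next^j) fails */
        x = x->next;
    }
    return x == NULL;               /* x->next^i == null */
}
```

For a three-node list, `is_list` holds only for `i == 3`: the existentially quantified length in the invariants is determined by the heap.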
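8. Abstract Interpreter
[Figure: three flowchart node types, each annotated with the invariants on its incoming and outgoing edges]
- Join Node (inputs F1, F2; output F): F = Join(F1, F2)
- Assignment Node (input F; statement s; output F'): F' = Post(F, s), where s may be x := e, *x := e, x := malloc(e), or free(x)
- Conditional Node (input F; predicate p; output F1 on True, F2 on False): F1 = Meet(F, p), F2 = Meet(F, ¬p)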
9. Implies Algorithm
- Implies(F1, F2) returns true only if F1 ⇒ F2
- Key idea for checking F ⇒ e1 = e2
  - Check if e2 ∈ MustAliases(e1,F)
- Key idea for checking F ⇒ e1 ≠ e2
  - Check if ¬(e2 ∈ MayAliases(e1,F))
10. MustAliases and MayAliases
- F1: x = x→next^j
- F2: ∀i: (0 ≤ i < j) ⇒ x→next^i = x→next^(i+1)→prev
- MustAliases
  - Key idea: apply k instantiations of each equality
  - MustAliases(x,F1) = { x→next^j, x→next^(2j) }
  - MustAliases(x,F2) = { x→next→prev, x→next→prev→next→prev }
- MayAliases
  - Key idea: represent aliases by expressions of size k
  - MayAliases(x,F1) = { x→next^t | t ≥ j }
  - MayAliases(x,F2) = { x→(next→prev)^t | t ≥ 0 }
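The must-aliases derived from the doubly-linked-list fact can be checked on a concrete heap. The following C sketch is illustrative only: `struct DNode`, `satisfies_f2`, and `follow_next_prev` are hypothetical names, and the code tests the aliases at run time rather than deriving them symbolically as the analysis does.

```c
#include <assert.h>
#include <stddef.h>

struct DNode { struct DNode *next, *prev; };

/* Checks the quantified fact on a concrete heap:
   for 0 <= i < j, the i-th successor equals the prev of the (i+1)-th. */
int satisfies_f2(struct DNode *x, int j) {
    for (int i = 0; i < j; i++) {
        if (x == NULL || x->next == NULL) return 0;
        if (x->next->prev != x) return 0;
        x = x->next;
    }
    return 1;
}

/* Follows next then prev, m times; each round corresponds to one
   instantiation of the equality at i = 0. Returns NULL if a link is missing. */
struct DNode *follow_next_prev(struct DNode *x, int m) {
    for (int n = 0; n < m && x != NULL; n++) {
        if (x->next == NULL) return NULL;
        x = x->next->prev;
    }
    return x;
}
```

When the fact holds with j > 0, following `next` then `prev` once or twice leads back to `x`, which matches the two must-aliases obtained from two instantiations.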
14. Join Algorithm
- Join(F1, F2) returns an overapproximation of F1 ∨ F2
- Example 1
  - Input 1: i = 1 ∧ A[0] = 0
  - Input 2: i = 2 ∧ A[0] = 0 ∧ A[1] = 1
  - Output: ∀j: (0 ≤ j < i) ⇒ A[j] = j
- Example 2
  - Let S(k) ≡ Array(x→next^k→data, x→next^k→len)
  - Input 1: y = x→next ∧ S(0)
  - Input 2: y = x→next^2 ∧ S(0) ∧ S(1)
  - Output: ∃i: 1 ≤ i ≤ 2 ∧ y = x→next^i ∧ ∀j: (0 ≤ j < i) ⇒ S(j)
15. Join Algorithm: Key Idea
- Input 1: y = x→next ∧ S(0)
- Input 2: y = x→next^2 ∧ S(0) ∧ S(1)
- After normalization, we get
  - Input 1: ∃i: i = 1 ∧ y = x→next^i ∧ ∀j: (0 ≤ j < 1) ⇒ S(j)
  - Input 2: ∃i: i = 2 ∧ y = x→next^i ∧ ∀j: (0 ≤ j < 2) ⇒ S(j)
- Now we use the following rule
  - Join(∃V: E1 ∧ ∀U (C1 ⇒ S), ∃V: E2 ∧ ∀U (C2 ⇒ S)) = ∃V: E3 ∧ ∀U (C3 ⇒ S)
  - where E3 = Join(E1, E2)
  - and C3 = an underapproximation of (E1 ⇒ C1) ∧ (E2 ⇒ C2)
- The simpler choice, C3 = an underapproximation of C1 ∧ C2, is also sound but yields a weaker result
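The join rule can be illustrated on a deliberately restricted case. In this C sketch each normalized input fixes i to a constant and carries a guard 0 ≤ j < bound. The types `Input` and `Joined` and the strategy of generalizing the bound to "i plus a common offset" are assumptions made for illustration; they are not the general algorithm from the talk.

```c
#include <assert.h>

/* Normalized input: i == i_val, and S(j) holds for all 0 <= j < bound. */
struct Input { int i_val; int bound; };

/* Joined element: i_lo <= i <= i_hi, and S(j) holds for 0 <= j < B, where
   B is i + offset if relative_to_i is 1, or the constant offset otherwise. */
struct Joined { int i_lo, i_hi; int relative_to_i; int offset; };

struct Joined join(struct Input a, struct Input b) {
    struct Joined r;
    /* E3 = Join(E1, E2): interval hull of the two constants. */
    r.i_lo = a.i_val < b.i_val ? a.i_val : b.i_val;
    r.i_hi = a.i_val > b.i_val ? a.i_val : b.i_val;
    int da = a.bound - a.i_val;
    int db = b.bound - b.i_val;
    if (da == db) {
        /* Both guards agree once written relative to i, so the guard
           j < i + da satisfies (E1 => C1) and (E2 => C2). */
        r.relative_to_i = 1;
        r.offset = da;
    } else {
        /* Fallback: the guard implied by both C1 and C2. */
        r.relative_to_i = 0;
        r.offset = a.bound < b.bound ? a.bound : b.bound;
    }
    return r;
}

/* Exclusive upper bound on j that the joined guard gives for a concrete i. */
int guard_bound(struct Joined r, int i) {
    return r.relative_to_i ? i + r.offset : r.offset;
}
```

On the slide's inputs (i = 1 with bound 1, and i = 2 with bound 2) the sketch produces 1 ≤ i ≤ 2 with guard 0 ≤ j < i, whereas the fallback would only retain 0 ≤ j < 1.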
17. Meet Algorithm
- Meet(F, p) returns an overapproximation of F ∧ p
- Key idea: reason about the interaction between equalities and disequalities
- Example 1
  - Input 1: ∃i: len ≤ i ∧ List(x,i,next) ∧ y = x→next^len
  - Input 2: y = null
  - Output: ∃i: len = i ∧ List(x,i,next) ∧ y = x→next^len
- Example 2
  - Input 1: ∃i: len ≤ i ∧ List(x,i,next) ∧ y = x→next^len
  - Input 2: y ≠ null
  - Output: ∃i: len < i ∧ List(x,i,next) ∧ y = x→next^len
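The refinement in the two Meet examples can be mimicked with a tiny C sketch that tracks only the relation between len and i. It assumes the context of the examples (y is the len-th successor of x in a list of length i, with len ≤ i, and valid cells are non-null), so that y is null exactly when len = i. The enum names and `meet_len_i` are hypothetical.

```c
#include <assert.h>

enum Rel  { REL_LE, REL_EQ, REL_LT, REL_BOTTOM };  /* len <= i, len == i, len < i, unreachable */
enum Pred { Y_EQ_NULL, Y_NE_NULL };

/* Refines the relation between len and i by the branch predicate on y. */
enum Rel meet_len_i(enum Rel rel, enum Pred p) {
    if (rel == REL_BOTTOM) return REL_BOTTOM;
    if (p == Y_EQ_NULL)
        /* y == null forces len == i; contradicts len < i. */
        return rel == REL_LT ? REL_BOTTOM : REL_EQ;
    /* y != null forces len < i; contradicts len == i. */
    return rel == REL_EQ ? REL_BOTTOM : REL_LT;
}
```

The point of the sketch is the interaction: a fact about pointers (y compared with null) sharpens an arithmetic constraint in the base domain.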
18. PostAssignment Algorithm
- Post(F, s) returns an overapproximation of the strongest postcondition of F w.r.t. s
- Key idea: transitive closure; invalidate Must; invalidate May; add new fact
[Figure: heap diagram with labels result, y, tmp, and null]
- Input 1 (F): List(y,i,next) ∧ List(result,j,next) ∧ y+next = x ∧ *x = tmp
- Input 2 (s): *x := result
- Output: List(tmp,i−1,next) ∧ List(result,j,next) ∧ y+next = x ∧ *x = result
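The example can be replayed on a concrete heap. This C sketch reads the statement as a store through x, where x holds the address of y's next field; that reading, and the names `list_len` and `store_through`, are assumptions. It executes the store and lets a test confirm the facts listed in the output.

```c
#include <assert.h>
#include <stddef.h>

struct Node { struct Node *next; };

/* Returns n such that List(p, n, next) holds, taking Valid to mean non-null. */
int list_len(const struct Node *p) {
    int n = 0;
    while (p != NULL) {
        n++;
        p = p->next;
    }
    return n;
}

/* Executes the store *x := result and returns the overwritten value,
   which plays the role of tmp. */
struct Node *store_through(struct Node **x, struct Node *result) {
    struct Node *tmp = *x;
    *x = result;
    return tmp;
}
```

Before the store, the fact that tmp heads a list of length i−1 is only implicit in List(y,i,next); it has to be derived first, because the store destroys the link from y to tmp on which that derivation relies.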
19. Experiments
Program | Base Constraint Domain Required | Property Discovered (apart from memory safety) | Precondition Provided
List2Array | Generalized Difference Constraints | Corresponding array and list elements are the same | Input is a list
ListReverse | Generalized Difference Constraints | Reversed list has length 100 | Input is a list of size 100
ArrayPtrArray | Generalized Difference Constraints, Uninterpreted Functions | (none listed) | Input array has length Len (where Len is also an input)
21. Conclusion and Future Work
- Quantified abstract domain for pointer analysis
  - Expressive enough to reason about rich properties
  - Amenable to automated deduction
- Extend the analysis to the inter-procedural setting
- Add disjunction and richer quantification support in the abstract domain