Shape Analysis via 3-Valued Logic

Description:

Title: Program Analysis via Graph Reachability. Author: Thomas Reps. Last modified by: sagiv. Created: 3/24/1998 3:26:02 AM. Format: PowerPoint presentation.

Slides: 94

1
Shape Analysis via 3-Valued Logic

Mooly Sagiv
Tel Aviv University

http://www.cs.tau.ac.il/msagiv/toplas02.ps
www.cs.tau.ac.il/tvla
2
Topics

A new abstract domain for static analysis
Abstract dynamically allocated memory
TVLA: a system for generating abstract interpreters
Applications

3
Motivation

Dynamically allocated storage and pointers are
essential programming tools
Object oriented
Modularity
Data structure
But
Error prone
Inefficient
Static analysis can be very useful here

4
A Pathological C Program
    a = malloc();
    b = a;
    free(a);
    c = malloc();
    if (b == c) printf("unexpected equality");
5
Dereference of NULL pointers

    typedef struct element {
        int value;
        struct element *next;
    } *Elements;

    bool search(int value, Elements c) {
        Elements elem;
        for (elem = c; c != NULL; elem = elem->next)
            if (elem->value == value)
                return TRUE;
        return FALSE;
    }
6
Dereference of NULL pointers

(search as on slide 5)

potential null dereference: the loop tests c rather than elem, so elem is dereferenced after it becomes NULL
7
Memory leakage
    typedef struct element {
        int value;
        struct element *next;
    } *Elements;

    Elements reverse(Elements c) {
        Elements h, g;
        h = NULL;
        while (c != NULL) {
            g = c->next;
            h = c;
            c->next = h;
            c = g;
        }
        return h;
    }

8
Memory leakage
(reverse as on slide 7)

leakage of the address pointed to by h
9
Memory leakage
(reverse as on slide 7)

✓ No memory leaks
10
Example List Creation
    typedef struct node {
        int val;
        struct node *next;
    } *List;

    List create() {
        List x, t;
        x = NULL;
        while (...) do {
            t = malloc();
            t->next = x;
            x = t;
        }
        return x;
    }

✓ No null dereferences
✓ No memory leaks
✓ Returns acyclic list
11
Example Collecting Interpretation
12
Example Abstract Interpretation
13
Challenge 1 - Memory Allocation

The number of allocated objects/threads is not
known
Concrete state space is infinite
How to guarantee termination?

14
Challenge 2 - Destructive Updates

The program manipulates states using destructive
updates
    e->next = t
Hard to define concrete interpretation
Harder to define abstract interpretation

15
Challenge 2 - Destructive Update
Unsound ?
16
Challenge 2 - Destructive Update
Imprecise ?
17
Challenge 3 - Re-establishing Data Structure Invariants

Data-structure invariants typically only hold at
the beginning and end of ADT operations
Need to verify that data-structure invariants are
re-established

18
Challenge 3 - Re-establishing Data Structure Invariants

    rotate(List first, List last) {
        if (first != NULL) {
            last->next = first;
            first = first->next;
            last = last->next;
            last->next = NULL;
        }
    }

19
Plan

Concrete interpretation
Canonical abstraction
Abstract interpretation using canonical
abstraction
The TVLA system

20
Traditional Heap Interpretation

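States = two-level stores
$Env : Var \to Values$
$fields : Loc \to Values$
$Values = Loc \cup Atoms$
Example
$Env = [x \mapsto 30,\ p \mapsto 79]$
$next = [30 \mapsto 40,\ 40 \mapsto 50,\ 50 \mapsto 79,\ 79 \mapsto 90]$
$val = [30 \mapsto 1,\ 40 \mapsto 2,\ 50 \mapsto 3,\ 79 \mapsto 4,\ 90 \mapsto 5]$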

21
Predicate Logic

Vocabulary
A finite set of predicate symbols P, each with a fixed arity
Logical structures S provide meaning for predicates
A set of individuals (nodes) U
$p^S : (U^S)^k \to \{0, 1\}$
FOTC: first-order formulae over TC, $\forall$, $\exists$, $\neg$, $\wedge$, $\vee$ express logical structure properties

22
Representing Stores as Logical Structures

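Locations $\approx$ Individuals
Program variables $\approx$ Unary predicates
Fields $\approx$ Binary predicates
Example
$U = \{u_1, u_2, u_3, u_4, u_5\}$
$x = \{u_1\},\ p = \{u_3\}$
$n = \{\langle u_1, u_2\rangle, \langle u_2, u_3\rangle, \langle u_3, u_4\rangle, \langle u_4, u_5\rangle\}$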

23
Formal Semantics of First Order Formulae

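For a structure $S = \langle U^S, p^S \rangle$
Formulae $\varphi$ with LVar free variables
Assignment $z : LVar \to U^S$
$[\![\varphi]\!]^S(z) \in \{0, 1\}$

$[\![1]\!]^S(z) = 1$
$[\![0]\!]^S(z) = 0$
$[\![p(v_1, v_2, \dots, v_k)]\!]^S(z) = p^S(z(v_1), z(v_2), \dots, z(v_k))$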
24
Formal Semantics of First Order Formulae

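(structure S, formulae $\varphi$, and assignment z as on slide 23)

$[\![\varphi_1 \vee \varphi_2]\!]^S(z) = \max([\![\varphi_1]\!]^S(z), [\![\varphi_2]\!]^S(z))$
$[\![\varphi_1 \wedge \varphi_2]\!]^S(z) = \min([\![\varphi_1]\!]^S(z), [\![\varphi_2]\!]^S(z))$
$[\![\neg\varphi_1]\!]^S(z) = 1 - [\![\varphi_1]\!]^S(z)$
$[\![\exists v : \varphi_1]\!]^S(z) = \max \{ [\![\varphi_1]\!]^S(z[v \mapsto u]) : u \in U^S \}$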
25
Formal Semantics of Transitive Closure

(structure S, formulae $\varphi$, and assignment z as on slide 23)

$[\![p^*(v_1, v_2)]\!]^S(z) = \max_{u_1, \dots, u_k \in U,\ z(v_1) = u_1,\ z(v_2) = u_k} \ \min_{1 \le i < k} p^S(u_i, u_{i+1})$
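For a 2-valued structure, the max-over-paths, min-over-edges reading of transitive closure amounts to graph reachability. The following is a minimal sketch under assumptions that are not in the slides: the universe has a fixed size N, the predicate is stored as a boolean matrix, and paths of length zero count (so the closure is reflexive). The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N 5  /* number of individuals; an assumption for this sketch */

/* tc[i][j] becomes true iff there is a path of zero or more p-edges
   from individual i to individual j (Warshall's algorithm). */
void reflexive_transitive_closure(bool p[N][N], bool tc[N][N]) {
    assert(p != NULL && tc != NULL);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            tc[i][j] = (i == j) || p[i][j];
    for (int k = 0; k < N; k++)
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (tc[i][k] && tc[k][j])
                    tc[i][j] = true;
}
```

Applied to the n predicate of a five-cell list, the closure relates the first cell to the last one but not the reverse.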
26
Concrete Interpretation Rules
Statement        Update formula
x = NULL         $x'(v) = 0$
x = malloc()     $x'(v) = IsNew(v)$
x = y            $x'(v) = y(v)$
x = y->next      $x'(v) = \exists w : y(w) \wedge n(w, v)$
x->next = y      $n'(v, w) = (\neg x(v) \wedge n(v, w)) \vee (x(v) \wedge y(w))$
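The update formulas can be read as recipes for computing the new predicate tables from the old ones. The sketch below applies two of the rules to a 2-valued structure; it assumes a fixed universe of N individuals, boolean arrays for the predicates x, y and n, and hypothetical function names. It is an illustration of the rules, not how TVLA itself is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define N 5  /* size of the universe; an assumption for this sketch */

/* A 2-valued structure with pointer variables x, y and field n (next). */
typedef struct {
    bool x[N];     /* x[v] is true iff x points to individual v */
    bool y[N];     /* y[v] is true iff y points to individual v */
    bool n[N][N];  /* n[v][w] is true iff the next field of v points to w */
} structure;

/* x = y->next :  x'(v) = exists w: y(w) and n(w, v) */
void assign_x_y_next(structure *s) {
    bool nx[N];
    assert(s != NULL);
    for (int v = 0; v < N; v++) {
        nx[v] = false;
        for (int w = 0; w < N; w++)
            if (s->y[w] && s->n[w][v])
                nx[v] = true;
    }
    memcpy(s->x, nx, sizeof nx);
}

/* x->next = y :  n'(v, w) = (not x(v) and n(v, w)) or (x(v) and y(w)) */
void assign_x_next_y(structure *s) {
    bool nn[N][N];
    assert(s != NULL);
    for (int v = 0; v < N; v++)
        for (int w = 0; w < N; w++)
            nn[v][w] = (!s->x[v] && s->n[v][w]) || (s->x[v] && s->y[w]);
    memcpy(s->n, nn, sizeof nn);
}
```

Both functions compute the primed predicate into a temporary before overwriting the old one, because the right-hand side of each formula refers to the values before the statement.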
27
Invariants

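No memory leaks: $\forall v : \bigvee_{x \in PVar} \exists w : x(w) \wedge n^*(w, v)$
Acyclic list(x): $\forall v, w : x(v) \wedge n^*(v, w) \rightarrow \neg n^+(w, v)$
Reverse(x): $\forall v, w, r : x(v) \wedge n^*(v, w) \rightarrow (n(w, r) \leftrightarrow n'(r, w))$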

28
Why use logical structures?

Naturally model pointers and dynamic allocation
No a priori bound on number of locations
Use formulas to express semantics
Indirect store updates using quantifiers
Can model other features
Concurrency
Abstract fields

29
Why use logical structures?

Behaves well under abstraction
Enables automatic construction of abstract
interpreters from concrete interpretation rules
(TVLA)

30
Collecting Interpretation

The set of reachable logical structures in every
program point
Statements operate on sets of logical structures
Cannot be directly computed for programs with
unbounded store and loops

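    x = NULL;
    while (...) do {
        t = malloc();
        t->next = x;
        x = t;
    }

[Figure: the sets of structures arising at each program point, starting from the empty structure]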
31
Plan

Concrete interpretation
Canonical abstraction
TVLA

32
Canonical Abstraction

Convert logical structures of unbounded size into
bounded size
Guarantees that the number of logical structures at every program point is finite
Every first-order formula can be conservatively
interpreted

33
Kleene Three-Valued Logic

1 True
0 False
1/2 Unknown
A join semi-lattice: $0 \sqcup 1 = 1/2$

Logical order
34
Boolean Connectives Kleene
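The Kleene connectives follow the same min, max, and 1 - x pattern as the 2-valued semantics, extended with the value 1/2. A minimal sketch, assuming the three values are stored doubled as integers (0, 1, 2) so that 1/2 is representable; the type and function names are hypothetical.

```c
#include <assert.h>

/* Kleene truth values, stored doubled:
   K_FALSE = 0, K_HALF = 1 (the value 1/2, unknown), K_TRUE = 2 (the value 1). */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } kleene;

/* Conjunction is the minimum in the logical order 0 < 1/2 < 1. */
kleene k_and(kleene a, kleene b) { return a < b ? a : b; }

/* Disjunction is the maximum in the logical order. */
kleene k_or(kleene a, kleene b) { return a > b ? a : b; }

/* Negation is 1 - a, so 1/2 stays 1/2. */
kleene k_not(kleene a) {
    assert(a <= K_TRUE);
    return (kleene)(K_TRUE - a);
}

/* Join in the information order: equal values are kept,
   different values are merged into 1/2. */
kleene k_join(kleene a, kleene b) { return a == b ? a : K_HALF; }
```

Note that the logical order (used by and/or) and the information order (used by join) are different orderings over the same three values.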
35
3-Valued Logical Structures

A set of individuals (nodes) U
Predicate meaning
$p^S : (U^S)^k \to \{0, 1, 1/2\}$

36
Canonical Abstraction

Partition the individuals into equivalence
classes based on the values of their unary
predicates
Every individual is mapped into its equivalence
class
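Collapse predicates via $\sqcup$
$p^S(u'_1, \dots, u'_k) = \bigsqcup \{ p^B(u_1, \dots, u_k) \mid f(u_1) = u'_1, \dots, f(u_k) = u'_k \}$
At most $2^{|A|}$ abstract individuals, where A is the set of unary predicates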
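The steps above can be sketched in a few lines. The code assumes a concrete structure with N individuals, two unary predicates x and t (so at most 2^2 = 4 abstract individuals), and one binary predicate n; all type and function names are hypothetical, and Kleene values are stored doubled as 0, 1, 2. It also records eq(u, u) = 1/2 for classes with more than one member, which is one way to mark summary nodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N 3        /* concrete individuals; an assumption for this sketch */
#define CLASSES 4  /* 2^|A| with A = {x, t} */

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } kleene;

typedef struct {
    bool x[N], t[N];  /* unary predicates */
    bool n[N][N];     /* binary predicate for the next field */
} concrete;

typedef struct {
    bool   present[CLASSES];     /* which abstract individuals exist */
    kleene n[CLASSES][CLASSES];  /* joined values of n */
    kleene eq[CLASSES];          /* eq(u, u); 1/2 marks a summary node */
} abstract;

/* f(u): the equivalence class of u, encoded from its unary predicate values. */
int canonical_name(const concrete *c, int u) {
    assert(u >= 0 && u < N);
    return (c->x[u] ? 1 : 0) | (c->t[u] ? 2 : 0);
}

static kleene join(kleene a, kleene b) { return a == b ? a : K_HALF; }

void canonical_abstraction(const concrete *c, abstract *a) {
    bool seen[CLASSES][CLASSES] = {{false}};
    int members[CLASSES] = {0};
    assert(c != NULL && a != NULL);

    for (int i = 0; i < CLASSES; i++)
        a->present[i] = false;
    for (int u = 0; u < N; u++) {
        int fu = canonical_name(c, u);
        a->present[fu] = true;
        members[fu]++;
    }
    for (int i = 0; i < CLASSES; i++) {
        /* only meaningful for classes that are present */
        a->eq[i] = members[i] > 1 ? K_HALF : K_TRUE;
        for (int j = 0; j < CLASSES; j++)
            a->n[i][j] = K_FALSE;
    }
    /* Collapse n: join the values of all concrete pairs mapped to a pair. */
    for (int u = 0; u < N; u++)
        for (int w = 0; w < N; w++) {
            int fu = canonical_name(c, u), fw = canonical_name(c, w);
            kleene val = c->n[u][w] ? K_TRUE : K_FALSE;
            a->n[fu][fw] = seen[fu][fw] ? join(a->n[fu][fw], val) : val;
            seen[fu][fw] = true;
        }
}
```

For a three-cell list whose head is pointed to by both x and t, the two remaining cells fall into one class, and the n edges into and inside that class become 1/2.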

37
Canonical Abstraction
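    x = NULL;
    while (...) do {
        t = malloc();
        t->next = x;
        x = t;
    }

[Figure: a concrete list with nodes u1, u2, u3 and its canonical abstraction with nodes u1 and u2,3; x and t label the pointer variables]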
38
Canonical Abstraction
    x = NULL;
    while (...) do {
        t = malloc();
        t->next = x;
        x = t;
    }

[Figure: nodes u1, u2, u3 with n edges; x and t label the pointer variables]
39
Canonical Abstraction and Equality

Summary nodes may represent more than one
element
(In)equality need not be preserved under
abstraction
Explicitly record equality
Summary nodes are nodes with eq(u, u) = 1/2

40
Canonical Abstraction and Equality
    x = NULL;
    while (...) do {
        t = malloc();
        t->next = x;
        x = t;
    }

[Figure: the list u1, u2, u3 with n and eq edges, and its abstraction with nodes u1 and u2,3; x and t label the pointer variables]
41
Canonical Abstraction
    x = NULL;
    while (...) do {
        t = malloc();
        t->next = x;
        x = t;
    }

[Figure: nodes u1, u2, u3 with n edges; x and t label the pointer variables]
42
Challenges: Heap & Concurrency [Yahav, POPL'01]

Concurrency with the heap is evil
Java threads are just heap allocated objects
Data and control are strongly related
Thread-scheduling info may require understanding
of heap structure (e.g., scheduling queue)
Heap analysis requires information about thread
scheduling

    Thread t1 = new Thread();
    Thread t2 = new Thread();
    t = t1;
    t.start();
43
Configurations Example
[Figure: a configuration over the predicates held_by, blocked, at[l_0], at[l_1], at[l_C], and rval[myLock]]

    l_0: while (true) {
    l_1:     synchronized (myLock) {
    l_C:         // critical actions
    l_2:     }
    l_3: }
44
Concrete Configuration
[Figure: a concrete configuration over the predicates held_by, blocked, at[l_0], at[l_1], at[l_C], and rval[myLock]]
45
Abstract Configuration
[Figure: the abstract configuration over the predicates held_by, blocked, at[l_0], at[l_1], at[l_C], and rval[myLock]]
46
Examples Verified
Program                                       Property
twoLock Q                                     No interference; no memory leaks; partial correctness
Producer/consumer                             No interference; no memory leaks
Apprentice Challenge                          Counter increasing
Dining philosophers with resource ordering    Absence of deadlock
Mutex                                         Mutual exclusion
Web Server                                    No interference
47
Summary

Canonical abstraction guarantees finite number of
structures
The concrete location of an object has no significance
But what is the significance of 3-valued logic?

48
Topics

Embedding
Instrumentation
Abstract Interpretation
Extensions

49
Embedding
50
Embedding

$B \sqsubseteq^f S$
onto function f
$p^B(u_1, \dots, u_k) \sqsubseteq p^S(f(u_1), \dots, f(u_k))$
S is a tight embedding of B with respect to f if
$p^S(u'_1, \dots, u'_k) = \bigsqcup \{ p^B(u_1, \dots, u_k) \mid f(u_1) = u'_1, \dots, f(u_k) = u'_k \}$
Canonical abstraction is a tight embedding

51
Embedding (cont)

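$S_1 \sqsubseteq^f S_2 \Rightarrow$ every concrete state represented by $S_1$ is also represented by $S_2$
The set of nodes in $S_1$ and $S_2$ may be different
No meaning for node names (abstract locations)
$\gamma(S) = \{ S^\natural : S^\natural \text{ is a 2-valued structure},\ S^\natural \sqsubseteq^f S \}$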

52
Embedding Theorem

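Assume $B \sqsubseteq^f S$, i.e. $p^B(u_1, \dots, u_k) \sqsubseteq p^S(f(u_1), \dots, f(u_k))$
Then every formula $\varphi$ is preserved
If $[\![\varphi]\!] = 1$ in S, then $[\![\varphi]\!] = 1$ in B
If $[\![\varphi]\!] = 0$ in S, then $[\![\varphi]\!] = 0$ in B
If $[\![\varphi]\!] = 1/2$ in S, then $[\![\varphi]\!]$ could be 0 or 1 in B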
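The theorem can be illustrated by evaluating one fixed formula on an abstract structure and on a concrete structure that embeds into it. The sketch below hard-codes the formula "exists v, w: x(v) and n(v, w)" with min for conjunction and max for the existential quantifier; the function name, the doubled integer encoding of Kleene values, and the row-by-row matrix layout are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Kleene values stored doubled: 0, 1 (meaning 1/2), 2 (meaning 1). */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } kleene;

/* Evaluates  exists v, w: x(v) and n(v, w)  over a structure with `size`
   individuals; n is a size-by-size matrix stored row by row. */
kleene exists_x_and_n(int size, const kleene *x, const kleene *n) {
    kleene result = K_FALSE;
    assert(size > 0 && x != NULL && n != NULL);
    for (int v = 0; v < size; v++)
        for (int w = 0; w < size; w++) {
            kleene xv = x[v], nvw = n[v * size + w];
            kleene conj = xv < nvw ? xv : nvw;  /* conjunction is min */
            if (conj > result)
                result = conj;                  /* exists is max */
        }
    return result;
}
```

On a two-node abstract list (a head pointed to by x and a summary node, with n edges of value 1/2) the formula evaluates to 1/2, while on a concrete three-cell list that embeds into it the formula evaluates to 1, which is one of the outcomes the theorem permits.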

53
Embedding Theorem

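Every formula $\varphi$ is preserved
If $[\![\varphi]\!] = 1$ in S, then $[\![\varphi]\!] = 1$ for all $B \in \gamma(S)$
If $[\![\varphi]\!] = 0$ in S, then $[\![\varphi]\!] = 0$ for all $B \in \gamma(S)$
If $[\![\varphi]\!] = 1/2$ in S, then $[\![\varphi]\!]$ could be 0 or 1 in $\gamma(S)$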

54
Challenge 2 - Destructive Update
[Figure: a heap with pointer variables x, p, y and n edges]

    y->next = NULL

$n'(v, w) = \neg y(v) \wedge n(v, w)$
Sound ?
55
Challenge 2 - Destructive Update
[Figure: a heap with pointer variables x, p, y and n edges]

    y->next = NULL

$n'(v, w) = \neg y(v) \wedge n(v, w)$
Sound ?
56
Embedding Theorem
?v x(v)
1Yes
?v x(v)?t(v)
1Yes
?v x(v)?y(v)
0No
?v,w x(v)?n(v, w)
½Maybe
?v, w x(v)?n(v, w) ?n(v, w)
0No
?v,w x(v) ? n(v,w) ? n(w, w)
1/2Maybe
57
Summary

The embedding theorem eliminates the need for proving near-commutativity
Guarantees soundness
Applied to arbitrary logics
But can be imprecise

58
Limitations

Information on summary nodes is lost
Leads to useless verification

59
Increasing Precision

User (Programming Language) supplied global
invariants
Naturally expressed in FOTC
Record extra information in the concrete
interpretation
Tune the abstraction
Refine concretization

60
Cyclicity predicate
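$c[x]() = \exists v_1, v_2 : x(v_1) \wedge n^*(v_1, v_2) \wedge n^+(v_2, v_2)$

[Figure: a list u1, u2, ..., un with n edges and pointer variables x and t, and its abstraction with nodes u1 and u2..n; c[x]() = 0 in both]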
61
Cyclicity predicate
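$c[x]() = \exists v_1, v_2 : x(v_1) \wedge n^*(v_1, v_2) \wedge n^+(v_2, v_2)$

[Figure: a list u1, u2, ..., un with n edges and pointer variables x and t, and its abstraction with nodes u1 and u2..n; c[x]() = 1 in both]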
62
Heap Sharing predicate
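$is(v) = \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \neq v_2$

[Figure: a list u1, u2, ..., un with n edges and pointer variables x and t, and its abstraction with nodes u1 and u2..n; every node is labeled is(v) = 0]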
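On a concrete structure, the defining formula of the heap-sharing predicate reduces to counting incoming n edges. A minimal sketch, assuming a fixed universe of N individuals and a boolean matrix for n; the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define N 4  /* number of individuals; an assumption for this sketch */

/* is(v): true iff two distinct individuals v1 != v2 both have an
   n edge into v, that is, v has at least two incoming n edges. */
bool is_shared(bool n[N][N], int v) {
    int incoming = 0;
    assert(v >= 0 && v < N);
    for (int u = 0; u < N; u++)
        if (n[u][v])
            incoming++;
    return incoming >= 2;
}
```

In an unshared list every cell has at most one predecessor, so is(v) is 0 everywhere; adding a second edge into a cell flips is(v) to 1 for that cell only.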
63
Heap Sharing predicate
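$is(v) = \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \neq v_2$

[Figure: a list u1, u2, ..., un with n edges and pointer variables x and t; the nodes are labeled is(v) = 0, is(v) = 1, is(v) = 0]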
64
Concrete Interpretation Rules
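Statement        Update formula
x = NULL         $x'(v) = 0$
x = malloc()     $x'(v) = IsNew(v)$
x = y            $x'(v) = y(v)$
x = y->next      $x'(v) = \exists w : y(w) \wedge n(w, v)$
x->next = NULL   $n'(v, w) = \neg x(v) \wedge n(v, w)$
                 $is'(v) = is(v) \wedge \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge \neg x(v_1) \wedge \neg x(v_2) \wedge \neg eq(v_1, v_2)$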
65
Reachability predicate
$t[n](v_1, v_2) = n^*(v_1, v_2)$

[Figure: a list u1, u2, ..., un with n edges and pointer variables x and t, and its abstraction with nodes u1 and u2..n]
66
Additional Instrumentation predicates

reachable-from-variable-x(v)
$c[f,b](v) = \forall v_1 : f(v, v_1) \rightarrow b(v_1, v)$
tree(v)
dag(v)
$inOrder(v) = \forall v_1 : n(v, v_1) \rightarrow dle(v, v_1)$
Weakest Precondition [Ramalingam, PLDI'02]

67
Instrumentation (Summary)

Refines the abstraction
Adds global invariants
But requires update-formulas (generated automatically in TVLA2)

$is(v) = \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \neq v_2$
$is(v) \leftrightarrow \exists v_1, v_2 : n(v_1, v) \wedge n(v_2, v) \wedge v_1 \neq v_2$
?(S)S S ? ?, S ?f S
68
Plan

Embedding Theorem
Instrumentation
Abstract interpretation using canonical
abstraction
TVLA

69
Best Conservative Interpretation (CC79)
70
Best Transformer (x = x->n)
[Figure labels: inverse embedding]
71
Focus-Based Transformer (x = x->n)
[Figure labels: pointer variables x and y; inverse embedding; canonical abstraction]
72
Focus-Based Transformer (x = x->n)
[Figure labels: pointer variables x and y]
73
Semantic Reduction

Improve the precision by recovering properties of
the program semantics
A Galois connection $(L_1, \alpha, \gamma, L_2)$
An operation $op : L_2 \to L_2$ is a semantic reduction if
$\forall l \in L_2 : op(l) \sqsubseteq l$
$\gamma(op(l)) = \gamma(l)$
Can be applied before and after basic operations

74
Three-Valued Logic Analysis (TVLA) [T. Lev-Ami, R. Manevich]

Input (FOTC)
Concrete interpretation rules
Definition of instrumentation predicates
Definition of safety properties
First Order Transition System (TVP)
Output
Warnings (text)
The 3-valued structure at every node (invariants)

75
Null Dereferences
    typedef struct element {
        int value;
        struct element *n;
    } Element;

    bool search(int value, Element *x) {
        Element *c = x;
        while (x != NULL) {
            if (c->value == value)
                return TRUE;
            c = c->n;
        }
        return FALSE;
    }

Demo
76
TVLA inputs

TVP - Three Valued Program
Predicate declaration
Action definitions SOS
Control flow graph
TVS - Three Valued Structure

Demo
77
Challenge 1

Write a C procedure on which TVLA reports false
null dereference

78
Proving Correctness of Sorting Implementations
(Lev-Ami, Reps, S, Wilhelm ISSTA 2000)

Partial correctness
The elements are sorted
The list is a permutation of the original list
Termination
At every loop iteration, the set of elements reachable from the head decreases

79
Example InsertSort
    typedef struct list_cell {
        int data;
        struct list_cell *n;
    } *List;

    List InsertSort(List x) {
        List r, pr, rn, l, pl;
        r = x;
        pr = NULL;
        while (r != NULL) {
            l = x;
            rn = r->n;
            pl = NULL;
            while (l != r) {
                if (l->data > r->data) {
                    pr->n = rn;
                    r->n = l;
                    if (pl == NULL) x = r;
                    else pl->n = r;
                    r = pr;
                    break;
                }
                pl = l;
                l = l->n;
            }
            pr = r;
            r = rn;
        }
        return x;
    }
pred.tvp
actions.tvp
Run Demo
80
Example InsertSort
    typedef struct list_cell {
        int data;
        struct list_cell *n;
    } *List;

    List InsertSort(List x) {
        if (x == NULL) return NULL;
        pr = x;
        r = x->n;
        while (r != NULL) {
            pl = x;
            rn = r->n;
            l = x->n;
            while (l != r) {
                pr->n = rn;
                r->n = l;
                pl->n = r;
                r = pr;
                break;
                pl = l;
                l = l->n;
            }
            pr = r;
            r = rn;
        }
    }

Run Demo
81
Example Reverse
    typedef struct list_cell {
        int data;
        struct list_cell *n;
    } *List;

    List reverse(List x) {
        List y, t;
        y = NULL;
        while (x != NULL) {
            t = y;
            y = x;
            x = x->next;
            y->next = t;
        }
        return y;
    }
Run Demo
82
Challenge

Write a sorting C procedure on which TVLA fails
to prove sortedness or permutation

83
Example Mark and Sweep
    void Sweep() {
        unexplored = Universe;
        collected = ∅;
        while (unexplored ≠ ∅) {
            x = SelectAndRemove(unexplored);
            if (x ∉ marked)
                collected = collected ∪ {x};
        }
        assert(collected == Universe − Reachset(root));
    }

    void Mark(Node root) {
        if (root != NULL) {
            pending = ∅;
            pending = pending ∪ {root};
            marked = ∅;
            while (pending ≠ ∅) {
                x = SelectAndRemove(pending);
                marked = marked ∪ {x};
                t = x->left;
                if (t ≠ NULL)
                    if (t ∉ marked)
                        pending = pending ∪ {t};
                t = x->right;
                if (t ≠ NULL)
                    if (t ∉ marked)
                        pending = pending ∪ {t};
            }
        }
        assert(marked == Reachset(root));
    }
pred.tvp
Run Demo
84
Challenge 2

Use TVLA to show termination of markAndSweep

85
Verification of Safety Properties (PLDI 02, 04)

The Canvas Project (with IBM Watson)
(Component Annotation, Verification and Stuff)

Component: a library with cleanly encapsulated state
Client: a program that uses the library

Lightweight Specification
"correct usage" rules a client must follow
"call open() before read()"

Certification: does the client program satisfy the lightweight specification?
86
Prototype Implementation

Applied to several example programs
Up to 5000 lines of Java
Used to verify
Absence of concurrent modification exception
JDBC API conformance
IOStreams API conformance

90
Scaling

Staged analysis
Controlled complexity
Coarser abstractions [Manevich, SAS'04]
Handle libraries
Use procedure specifications [Yorsh, TACAS'04]
Decision procedures for linked data structures [Immerman, CAV'04; Lev-Ami, CADE'05]
Handling procedures
Compute procedure summaries [Jeannet, SAS'04]
Local heaps [Rinetzky, POPL'05]

91
Local heaps [Rinetzky, POPL'05]
[Figure: a call p(x) with pointer variables y, g, t]
92
Why is Heap Analysis Difficult?

Destructive updating through pointers
    p->next = q
Produces complicated aliasing relationships
Track aliasing on 3-valued structures
Dynamic storage allocation
No bound on the size of run-time data structures
Canonical abstraction ⇒ finite-sized 3-valued structures
Data-structure invariants typically only hold at
the beginning and end of operations
Need to verify that data-structure invariants are
re-established
Query the 3-valued structures that arise at the
exit

93
Summary