Title: Ranjit Jhala Rupak Majumdar
1 Interprocedural Analysis of Asynchronous Programs
- Ranjit Jhala, Rupak Majumdar
2 Conclusions
- Boost your pet dataflow analysis to work on asynchronous programs
Let's begin at the beginning.
3 Asynchronous Programs
[Figure: control-flow graphs of three procedures over the global request_list r: main (v0-v3), reqs (v4-v9), and client (v11-v15). Edge labels: r == 0, r != 0, rc = malloc(), rc != 0, rc == 0, c->id = id, r = r->next, and calls to reqs.]
5 Asynchronous Programs
- Dispatch location v3: calls all other functions
[Figure: control-flow graphs of main (v0-v3), reqs (v4-v9), and client (v11-v15), with calls to reqs and client and the dispatch location v3 highlighted]
6 Asynchronous Program Execution
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
7 Asynchronous Program Execution
- Async calls stored in set
Pending Calls: reqs
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
8 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
Pending Calls: reqs
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
9 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
Pending Calls: (none)
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
10 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
- Sync calls execute at call site
Pending Calls: client()
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
11 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
- Sync calls execute at call site
Pending Calls: client(), client()
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
12 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
- Sync calls execute at call site
Pending Calls: client(), client(), reqs
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
13 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
- Order is non-deterministic
- Sync calls execute at call site
Pending Calls: client(), client(), reqs
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
14 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
- Order is non-deterministic
- Sync calls execute at call site
Pending Calls: client(), reqs
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
15 Asynchronous Program Execution
- Async calls stored in set
- Execute at dispatch loop
- Order is non-deterministic
- Sync calls execute at call site
Pending Calls: reqs
[Figure: control-flow graphs of main, reqs, and client, with the current PC marked]
16 Asynchronous Programs
- Async calls stored in set
- Execute at dispatch loop
- Sync calls execute at call site
- Why? Latency hiding and parallelism
- Domains: distributed systems, web servers, embedded systems, discrete-event simulation
- Languages and libraries: Java atomic methods, LibAsync, LibEvent, NesC
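The execution model summarized on this slide can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not code from the talk: the names `run`, `post`, and the toy `main`/`reqs`/`client` bodies are hypothetical, and a seeded `random.Random` stands in for the non-deterministic dispatcher.

```python
import random
from collections import Counter

def run(main, seed=0):
    """Run main, then the dispatch loop, and return the order tasks ran in."""
    rng = random.Random(seed)
    pending = Counter()          # multiset of pending async calls
    trace = []

    def post(fn, *args):
        # An async call is only recorded here; it runs later at dispatch.
        pending[(fn, args)] += 1

    main(post)
    # Dispatch loop: pick any pending call and run it to completion.
    while pending:
        choices = sorted(pending, key=lambda t: (t[0].__name__, t[1]))
        fn, args = rng.choice(choices)
        pending[(fn, args)] -= 1
        if pending[(fn, args)] == 0:
            del pending[(fn, args)]
        trace.append(fn.__name__)
        fn(post, *args)
    return trace

def client(post, cid):
    pass                         # a handler that posts nothing further

def reqs(post, remaining):
    if remaining > 0:
        post(client, remaining)  # async call to client
        post(reqs, remaining - 1)  # async call to reqs itself

def main(post):
    post(reqs, 2)

trace = run(main)
```

Different seeds may interleave `client` and `reqs` differently, which is the non-determinism the analysis has to account for.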
17 Q: How to Analyze Async Programs?
- Prove the dereference of c is safe, i.e. c is not null at v13
- Dataflow facts, tracked for each of r, rc, c: (Must) Non-null, (May) Null
[Figure: control-flow graphs of main, reqs, and client, annotated with facts for r, rc, and c]
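The two dataflow facts on this slide form a very small lattice. A minimal sketch, assuming the usual join at control-flow merge points; the names `NONNULL`, `MAYBE_NULL`, `join`, and `join_env` are illustrative and do not come from the talk.

```python
NONNULL = "must-non-null"
MAYBE_NULL = "may-null"

def join(a, b):
    """Merge facts from two incoming paths: non-null only if both agree."""
    return NONNULL if a == NONNULL and b == NONNULL else MAYBE_NULL

def join_env(e1, e2):
    """Pointwise join of two environments mapping variables to facts."""
    return {v: join(e1[v], e2[v]) for v in e1}

# Facts for r, rc, c on two paths that merge at one location.
left = {"r": NONNULL, "rc": NONNULL, "c": NONNULL}
right = {"r": NONNULL, "rc": MAYBE_NULL, "c": NONNULL}
merged = join_env(left, right)
```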
18 Verification via Dataflow Analysis
- Prove: the flow fact "c is non-null" holds at v13
- Dataflow facts, tracked for each of r, rc, c: (Must) Non-null, (May) Null
[Figure: control-flow graphs of main, reqs, and client, annotated with facts for r, rc, and c]
19 Dataflow Analysis
- 1st attempt: treat asynchronous calls as synchronous [Sharir-Pnueli 80, Reps-Horwitz-Sagiv 95]
- Verification works, but unsoundly deduces that global r is non-null!
[Figure: control-flow graphs of main, reqs, and client, annotated with the facts computed for r, rc, and c]
20 Dataflow Analysis
- 1st attempt: treat asynchronous calls as synchronous
- Unsound: global r may change between call and dispatch
- Idea: separately track local and global facts
[Figure: control-flow graphs of main, reqs, and client, annotated with facts for r and c]
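The unsoundness can be seen in a tiny concrete run. This is a hypothetical sketch (the `demo` function and its state layout are assumptions made for illustration): the global r is non-null when the async call to client is made, but it has changed by the time client is dispatched.

```python
def demo():
    state = {"r": ["req0"]}      # global r; a list stands for a non-null pointer
    pending = []                 # pending async calls
    seen = {}

    def client():
        # Runs only at dispatch time, so it observes the current global.
        seen["at_dispatch"] = state["r"]

    def reqs():
        if state["r"] is not None:
            seen["at_call"] = state["r"]
            pending.append(client)   # async call: r is non-null here
            state["r"] = None        # r = r->next reaches the end of the list

    reqs()
    while pending:                   # dispatch loop
        pending.pop()()
    return seen

seen = demo()
```

An analysis that inlines client at the call site would carry the fact "r is non-null" into client, which this run contradicts.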
21 Dataflow Analysis
- 2nd attempt: only execute async calls from the dispatch location
[Figure: control-flow graphs of main, reqs, and client]
22 Dataflow Analysis
- 2nd attempt: only execute async calls from the dispatch location
- Imprecise: what is the initial value of the formals? All values (⊤) is too coarse
- Idea: track pending calls with formals at call site
[Figure: control-flow graphs of main, reqs, and client]
23 Encoding Pending Calls as Flow Facts
Idea: Counters
- For each kind of (async call, input fact) pair: count the number of pending calls of that kind
- Expanded DFA facts = dataflow facts × counters
- Example counters: reqs: 1; client (arg non-null): 5; client (arg null): 0
Idea: Track pending calls with formals at call site
24 Key: Combining two Analyses
- Expanded DFA facts = dataflow facts × counters
- Example: r not null, rc maybe null, and pending calls: 1 to reqs, 5 to client (arg non-null), 0 to client (arg null)
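One way to picture an expanded fact is as a pair of ordinary dataflow facts and a counter map. The sketch below is an assumption-laden illustration, not the talk's implementation: the `Kind` encoding and the helpers `post` and `dispatch` are hypothetical names, and the counters here are exact (unbounded) integers.

```python
from typing import Dict, Optional, Tuple

Kind = Tuple[str, str]           # (callee, input fact at the call site)
Counters = Dict[Kind, int]

def post(counters: Counters, kind: Kind) -> Counters:
    """An async call adds one pending call of the given kind."""
    new = dict(counters)
    new[kind] = new.get(kind, 0) + 1
    return new

def dispatch(counters: Counters, kind: Kind) -> Optional[Counters]:
    """Dispatch one pending call, or return None if none is pending."""
    if counters.get(kind, 0) == 0:
        return None              # infeasible path: nothing to dispatch
    new = dict(counters)
    new[kind] -= 1
    return new

# The example from the slide: dataflow facts paired with counters.
facts = {"r": "non-null", "rc": "maybe-null"}
counters = {
    ("reqs", ""): 1,
    ("client", "arg non-null"): 5,
    ("client", "arg null"): 0,
}
expanded = (facts, counters)
```

Returning `None` from `dispatch` is how the counters restrict the analysis to feasible sequences of async calls and dispatches.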
25 Key: Combining two Analyses
- Expanded DFA facts = dataflow facts × counters
- Counters: restrict analysis to valid interprocedural paths, i.e. feasible sequences of async calls / dispatches
- Dataflow facts: perform desired analysis over restricted paths
26 Dataflow Analysis
- 3rd attempt: count pending calls of each kind
- Example counters: reqs: 1; client (arg non-null): 5; client (arg null): 0
[Figure: control-flow graphs of main, reqs, and client, annotated with facts for r, rc, and c]
27 Dataflow Analysis
- 3rd attempt: count pending calls of each kind
- Non-terminating: pending calls are unbounded due to recursion and loops
- Idea: approximate via abstract counters
[Figure: control-flow graphs of main, reqs, and client]
28 Dealing with Unbounded Async Calls
Over-approximation: k∞-abstract counters
- For each (async call, input fact) pair: abstractly count the number of pending calls of each kind
- Values > k are abstracted to infinity (∞)
- Finite counter values: {0, 1, …, k, ∞}
- Finite DFA facts = dataflow facts × k∞-abstract counters
- Analysis terminates
E.g. k = 1
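The k∞-abstract counter can be sketched in a few lines. This is a minimal sketch, assuming that infinity stays infinity under decrement; `inc_over` and `dec_over` are hypothetical names, and `float("inf")` stands in for ∞.

```python
INF = float("inf")

def inc_over(v, k):
    """Abstract increment: any value above k is abstracted to infinity."""
    return v + 1 if v + 1 <= k else INF

def dec_over(v):
    """Abstract decrement: None if nothing is pending; infinity stays infinity."""
    if v == 0:
        return None
    return INF if v == INF else v - 1

# With k = 1: two async calls, then three dispatches.
v = 0
for _ in range(2):
    v = inc_over(v, k=1)         # 0 -> 1 -> INF
dispatches = 0
for _ in range(3):
    v = dec_over(v)              # INF -> INF each time
    dispatches += 1
```

The third dispatch has no matching async call, so this abstraction admits some invalid paths in addition to all valid ones.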
29 Recall: Combining two Analyses
- Which interprocedural paths do k∞-abstractions consider?
- Expanded DFA facts = dataflow facts × counters
- Counters: restrict analysis to valid interprocedural paths, i.e. feasible sequences of async calls / dispatches
- Dataflow facts: perform desired analysis over restricted paths
30 Example: (k=1)∞ Abstraction
[Figure: example path, steps 1-8]
32 Example: (k=1)∞ Abstraction
[Figure: example paths, steps 1-10; one path marked valid]
33 Example: (k=1)∞ Abstraction
Over-approximation: k∞-abstraction
- Considers all valid paths
- Plus some invalid paths
- DFA on a superset of valid paths
[Figure: example paths, steps 1-10; one path marked valid, one marked invalid (no matching async call)]
34 Dealing with Unbounded Async Calls
Over-approximation: k∞-abstraction
- Considers all valid paths
- Plus some invalid paths
- DFA on a superset of valid paths
Over-approximate / sound
- Works for the example but imprecise in general
- How to do exact DFA over the set of valid paths?
Idea: How bad is the over-approximation? Find out using under-approximation!
35 Computing Under-Approximate Solutions
Under-approximation: k-abstract counters
- For each (async call, input fact) pair: abstractly count the number of pending calls of each kind
- Values > k are abstracted to k
- Effect: all calls after the k-th are dropped
- Finite counter values: {0, 1, …, k}
- Finite dataflow facts × k-abstract counters ⇒ termination
E.g. k = 1
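The k-abstract counter differs from the over-approximation only in how it saturates. A minimal sketch under the same caveats as before; `inc_under` and `dec_under` are hypothetical names chosen for illustration.

```python
def inc_under(v, k):
    """Abstract increment: the counter saturates at k, so extra calls are dropped."""
    return min(v + 1, k)

def dec_under(v):
    """Abstract decrement: None if the abstract pending set is empty."""
    return None if v == 0 else v - 1

# With k = 1: two async calls, then two dispatches.
v = 0
for _ in range(2):
    v = inc_under(v, k=1)        # 0 -> 1 -> 1 (second call dropped)
after_first = dec_under(v)       # 1 -> 0
after_second = dec_under(after_first)  # nothing left to dispatch
```

The second dispatch is valid in the concrete program (two calls were made) but is ignored here, which is why the result is an under-approximation.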
36 Key: Combining two Analyses
- Which interprocedural paths do k-abstractions consider?
- Expanded DFA facts = dataflow facts × counters
- Counters: restrict analysis to valid interprocedural paths, i.e. feasible sequences of async calls / dispatches
- Dataflow facts: perform desired analysis over restricted paths
37 Example: (k=1) Abstraction
- Already (k=1) pending calls with the given input fact
- Call at step 5 is dropped
- Only one pending call!
[Figure: example path, steps 1-5]
39 Example: (k=1) Abstraction
- Only one call in the (k=1)-abstract pending set
- None remain after this dispatch
[Figure: example path, steps 1-8]
40 Example: (k=1) Abstraction
- Only one call in the (k=1)-abstract pending set
- None remain after this dispatch
- A matching async call exists, but there are no more calls in the (k=1)-abstract pending set!
[Figure: example path, steps 1-8]
41 Example: (k=1) Abstraction
- Valid but ignored by k-abstraction
- A matching async call exists, but there are no more calls in the (k=1)-abstract pending set!
[Figure: example path, steps 1-9, with the PC marked]
42 Example: (k=1) Abstraction
- Valid but ignored by k-abstraction
Under-approximation: k-abstraction
- Ignores all invalid paths
- And some valid paths
- DFA on a subset of valid paths
- Under-approximate DFA solution
[Figure: example path, steps 1-9]
43 What we have: for all K
- Both approximations are computable via standard DFA [Sharir-Pnueli 80, Reps-Horwitz-Sagiv 95]
- Require: exact DFA on valid paths
44 Increase K to Increase Precision
- Require: exact DFA on valid paths
- But how to compute the exact DFA solution?
[Figure: approximations for increasing K]
45 Theorem: There exists a magic K
- K∞-abstract DFA: over-approximation
- K-abstract DFA: under-approximation
- The approximations converge to the exact DFA on valid paths!
- Require: exact DFA on valid paths
46 Algorithm
AsyncDFA()
  k := 0
  repeat
    over := DFA(k∞-Counter)
    under := DFA(k-Counter)
    k := k + 1
  until (over = under)
  return over
- Require: exact DFA on valid paths
- DFA: interprocedural analysis via summaries [Sharir-Pnueli 80, Reps-Horwitz-Sagiv 95]
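The loop above can be exercised on a toy model. This is a sketch under stated assumptions rather than the talk's implementation: a plain reachability search over (fact, counters) states stands in for the summary-based DFA, the handler table format and the names `reachable`, `async_dfa`, and `HANDLERS` are hypothetical, and a pending call with no handler entry is assumed to leave the fact unchanged.

```python
INF = float("inf")

def reachable(handlers, init_fact, init_pending, k, over):
    """Facts reachable with k-infinity (over=True) or k (over=False) counters."""
    def cap(v):
        if v <= k:
            return v
        return INF if over else k

    def freeze(c):
        return tuple(sorted((kind, n) for kind, n in c.items() if n))

    start = (init_fact, freeze({kd: cap(n) for kd, n in init_pending.items()}))
    seen, work = {start}, [start]
    while work:
        fact, counters = work.pop()
        for kind, n in counters:
            # Dispatch one pending call of this kind (default: no effect).
            for new_fact, posts in handlers.get((kind, fact), [(fact, [])]):
                c = dict(counters)
                c[kind] = n - 1              # INF - 1 is still INF
                for p in posts:
                    c[p] = cap(c.get(p, 0) + 1)
                state = (new_fact, freeze(c))
                if state not in seen:
                    seen.add(state)
                    work.append(state)
    return {fact for fact, _ in seen}

def async_dfa(handlers, init_fact, init_pending):
    """Increase k until the over- and under-approximations agree."""
    k = 0
    while True:
        over = reachable(handlers, init_fact, init_pending, k, over=True)
        under = reachable(handlers, init_fact, init_pending, k, over=False)
        if over == under:
            return over, k
        k += 1

# Toy program: start posts two calls to a; each a advances the global fact.
HANDLERS = {
    ("start", "s0"): [("s1", ["a", "a"])],
    ("a", "s1"): [("s2", [])],
    ("a", "s2"): [("s3", [])],
}
result = async_dfa(HANDLERS, "s0", {"start": 1})
```

On this toy input the under-approximation misses s3 until k reaches 2, at which point both sides agree.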
47 Proof
- Obvious?
- Finitely many solutions + monotonicity implies a computable fixpoint
- Alas, the over- and under-approximations could converge to different fixpoints
48 Proof: Buzzwords
- Counters are well-quasi-ordered
- Pre exists: the initial configurations reaching a location
- Constructible via a complex backward algorithm: Petri nets [Esparza-Finkel-Mayr 98], async programs [Sen-Viswanathan 06]
- Magic k exists due to the existence of Pre
- Simple forward algorithm [PLDI 04, Geeraerts-Raskin-Van Begin 04]
49 Application: Safety Verification
- Ground dataflow facts: predicate abstraction
- Implemented on the BLAST framework: lazy interprocedural DFA [POPL 02]
- Predicates automatically found via counterexample formula interpolation [POPL 04]
- Reduced product of predicate abstraction and counter lattice [FSE 05]
50 Preliminary Experiments
- C LibEvent programs: load balancer, network simulator
- Properties: buffer overflow, null pointer dereference, protocol state
- Several proved, bugs found
51 A Few Fun Facts
- For async calls (events) the exact solution is computable, unlike threads [Ramalingam 00]
- Optimizations directly carry over: procedure summarization, on-the-fly exploration, demand-driven
- Proof messy but algorithm very simple
- EXPSPACE-hard, but early experiments are cause for optimism: magic k = 1
52 Conclusions
- Boost your pet dataflow analysis to work on asynchronous programs
Just add counters.