Speeding Up Dataflow Analysis Using Flow-Insensitive Pointer Analysis

1
Speeding Up Dataflow Analysis Using
Flow-Insensitive Pointer Analysis
  • Stephen Adams, Tom Ball, Manuvir Das
  • Sorin Lerner, Mark Seigle
  • Westley Weimer

Microsoft Research, University of Washington, UC Berkeley
2
Motivation
  • Static analysis for program verification
  • Complex dataflow analyses are popular
  • SLAM, ESP, BLAST, CQual, ...
  • Flow-Sensitive
  • Interprocedural
  • Expensive!
  • Cut down on data flow facts
  • Without losing anything important

3
General Idea
  • If complex analysis is worse than O(N)
  • And you have a cheap analysis that
  • Is O(N)
  • Reduces N
  • Then composing them saves time
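The argument on this slide can be written as a small cost comparison. This is an illustrative sketch only: the slides give no cost model, so the polynomial exponent k, the constants a and c, and the reduced size N' are all assumptions introduced here.

```latex
% Assumed cost of the complex analysis alone (worse than linear):
T_{\mathrm{complex}}(N) = c\,N^{k}, \qquad k > 1
% Assumed cost of running the cheap O(N) analysis first, then the
% complex analysis on the reduced input N' \le N:
T_{\mathrm{composed}}(N) = a\,N + c\,(N')^{k}
% Composition pays off whenever
a\,N < c\,\bigl(N^{k} - (N')^{k}\bigr)
% Example: k = 2 and N' = N/10 give
T_{\mathrm{composed}}(N) = a\,N + \tfrac{c}{100}\,N^{2}
```

Under these assumptions the linear term is quickly dominated, so even a modest reduction of N yields a large saving for large inputs.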

4
Value Flow Graph (VFG)
  • Variant of a points-to graph
  • Encodes the flow of values in the program
  • Conservative approximation
  • Lightweight, fast to compute and query
  • Early queries can safely reduce
  • data-flow facts considered
  • program points considered
  • Like slicing a program with respect to value flow

5
Computing a VFG
  • Use a subtyping-based pointer analysis
  • We used One-Level Flow (Das)
  • Process all assignments
  • Not just those involving pointers
  • Represent constant values explicitly
  • Put them in the graph
  • Label graph with source locations
  • Encodes program slices
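The construction steps above can be sketched as a toy graph builder. This is a minimal sketch under stated assumptions, not the One-Level Flow algorithm itself: the statement encoding (`const`, `copy`, `addr`, `store`), the node naming (`&a`, `*x`), and the function `build_vfg` are hypothetical names chosen for illustration. It shows three of the listed ideas: every assignment is processed, constants become explicit nodes, and each flow edge is labeled with the source lines that created it.

```python
def build_vfg(stmts):
    """Build labeled flow edges from a list of (line, kind, lhs, rhs) tuples.

    Returns a dict mapping (source_node, destination_node) to the set of
    source line numbers that produced that edge. Toy encoding only.
    """
    edges = {}
    for line, kind, lhs, rhs in stmts:
        if kind == "const":      # lhs = <constant>; constant is its own node
            src, dst = str(rhs), lhs
        elif kind == "copy":     # lhs = rhs
            src, dst = rhs, lhs
        elif kind == "addr":     # lhs = &rhs
            src, dst = "&" + rhs, lhs
        elif kind == "store":    # *lhs = rhs (rhs is a constant or variable)
            src, dst = str(rhs), "*" + lhs
        else:
            raise ValueError("unknown statement kind: " + kind)
        edges.setdefault((src, dst), set()).add(line)
    return edges


# Hypothetical encoding of a three-line program:
#   1: int a, *x;   2: x = &a;   3: *x = 7;
PROGRAM = [(2, "addr", "x", "a"), (3, "store", "x", 7)]
VFG = build_vfg(PROGRAM)
```

Because each edge carries line labels, the set of labels collected while answering a query doubles as a slice of the program relevant to that query.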

6
Example Points-To Graph
  • 1: int a, *x;
  • 2: x = &a;
  • 3: *x = 7;

[Figure: points-to graph for this program. Legend: points-to edge; source address node (&a); expr node (*x).]
7
One Level Flow Graph
1: int a, *x;  2: x = &a;  3: *x = 7;
[Figure: one-level flow graph for this program. Legend: flow edge; points-to edge; source address node (&a); expr node (*x).]
8
Value Flow Graph
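1: int a, *x;  2: x = &a;  3: *x = 7;
[Figure: value flow graph for this program, with a node for the constant 7 and edges labeled with source lines (2, 3, and 2,3). Legend: flow edge; points-to edge; source address node (&a); expr node (*x).]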
9
VFG Properties
  • Computed in almost-linear time
  • Get points-to sets from VFG in linear time
  • Backwards reachability via flow edges
  • Gather up all variables
  • Get value flow from VFG in linear time
  • Backwards reachability via flow edges
  • Follow points-to edges up one
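Both queries can be sketched as backward reachability over a small hand-written graph. This is a simplified illustration, assuming a toy encoding in which `&a` is an address node and `*x` is a dereference node; the names `FLOW_PREDS`, `backward`, `points_to`, and `value_flow_into` are hypothetical, and loads through pointers are deliberately left out.

```python
# Reversed flow edges for the program  x = &a;  *x = 7;
# (destination node -> set of nodes whose values flow into it)
FLOW_PREDS = {"x": {"&a"}, "*x": {"7"}}


def backward(flow_preds, starts):
    """Return every node that reaches any start node via flow edges."""
    seen, stack = set(starts), list(starts)
    while stack:
        node = stack.pop()
        for pred in flow_preds.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen


def points_to(flow_preds, var):
    """Gather the variables whose addresses flow into var."""
    return {n[1:] for n in backward(flow_preds, [var]) if n.startswith("&")}


def value_flow_into(flow_preds, var):
    """Values that may flow into var, directly or through a pointer to it."""
    starts = [var]
    for node in flow_preds:
        # "Up one" points-to level: writes through *p reach var if p may point to var.
        if node.startswith("*") and var in points_to(flow_preds, node[1:]):
            starts.append(node)
    return backward(flow_preds, starts) - set(starts)
```

On this graph, `points_to(FLOW_PREDS, "x")` yields `{"a"}` and `value_flow_into(FLOW_PREDS, "a")` yields `{"7"}`, matching the intuition that the store on line 3 writes 7 into a.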

10
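VFG Query: Points-To of x
1: int a, *x;  2: x = &a;  3: *x = 7;
[Figure: value flow graph for this program, with a node for the constant 7 and edges labeled with source lines (2, 3, and 2,3), used to answer the points-to query for x. Legend: flow edge; points-to edge; source address node (&a); expr node (*x).]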
11
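VFG Query: Value Flow into a
1: int a, *x;  2: x = &a;  3: *x = 7;
[Figure: value flow graph for this program, with a node for the constant 7 and edges labeled with source lines (2, 3, and 2,3), used to answer the value-flow query for a. Legend: flow edge; points-to edge; source address node (&a); expr node (*x).]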
12
VFG Summary
  • Computed in almost-linear time
  • Queries complete in linear time
  • Approximates flow of values in program
  • Show two applications that benefit
  • ESP
  • SLAM

13
Application 1: ESP
  • Verification tool for large C programs
  • Tracks typestate of values
  • Encoded as Finite State Machine
  • Special Error state
  • Core interprocedural data-flow engine
  • Flow sensitive state at every point
  • Performed bottom-up on call graph
  • Requires function summaries

14
ESP Function Summaries
  • Consider stateful memory locations
  • Summarize function behavior for each loc
  • Reducing number of locs would be good!
  • But C has evil casts, so types cannot be used
  • Worst case set of locations
  • All globals and formal parameters
  • Everything transitively reachable from there

15
Reduce Location Set
  • Location L needs to be considered in F if
  • Some exp E has its state changed in F
  • Value held by L at entry to F can flow into E
  • Assuming state-changing ops are known
  • Query VFG to find values that flow in

16
ESP Example
  • FILE *e, *f, *g, *h;
  • void foo() {
  • FILE *p;
  • int a = (int)h;
  • if (...) p = e;
  • else p = f;
  • p = fopen(...);
  • }

Locations to consider for foo() summary: e, *e, f, *f, g, *g, h, *h
17
ESP Example
  • FILE *e, *f, *g, *h;
  • void foo() {
  • FILE *p;
  • int a = (int)h;
  • if (...) p = e;
  • else p = f;
  • p = fopen(...);
  • }
  • (1) Compute VFG
  • (2) Query value flow on p
  • (3) Reduced locations to consider for foo() summary: e, f
  • (4) Reduce lines to consider for dataflow
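The reduction rule (a location matters only if its entry value can flow into an expression whose state changes) can be sketched for foo() as follows. This is a minimal sketch with assumptions: the flow edges in `FOO_FLOW` are written by hand rather than derived by a real pointer analysis, and `flows_into` and `reduced_locations` are hypothetical helper names.

```python
from collections import defaultdict


def flows_into(flow_edges, targets):
    """Backward reachability: the targets plus everything that may flow into them."""
    preds = defaultdict(set)
    for src, dst in flow_edges:
        preds[dst].add(src)
    seen, stack = set(targets), list(targets)
    while stack:
        node = stack.pop()
        for pred in preds[node]:
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return seen


def reduced_locations(flow_edges, state_changed, candidates):
    """Keep only candidate locations whose value can reach a state-changing expression."""
    return flows_into(flow_edges, state_changed) & candidates


# Hand-written flow edges (source, destination) assumed for foo():
FOO_FLOW = [("h", "a"), ("e", "p"), ("f", "p"), ("fopen", "p")]
GLOBALS = {"e", "f", "g", "h"}

# p is the only expression whose typestate changes in foo().
FOO_SUMMARY_LOCS = reduced_locations(FOO_FLOW, {"p"}, GLOBALS)
```

Here g never flows anywhere and h flows only into the integer a, so both drop out, leaving e and f as on the slide.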

18
ESP Results
  • FILE output in GCC
  • 140 KLOC, 2149 functions, 66 files, 1068 globals
  • VFG Queries take 200 seconds
  • Reduce average number of locations per function
    summary from 1100 to < 1
  • Median of 15 for functions with > 0
  • Verification takes 15 minutes
  • Infeasible otherwise

19
Application 2: SLAM
  • Validates temporal safety properties
  • Boolean abstraction
  • Interprocedural dataflow analysis
  • Counterexample-driven refinement
  • Convert C program to Boolean program
  • Exhaustive dataflow analysis
  • No errors? Program is safe.
  • Real error? Program has a bug.
  • False error? Add predicates, repeat.
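The loop described by the last five bullets can be sketched as a generic driver. This is only a control-flow skeleton under stated assumptions: it is not SLAM's implementation or API, and `abstract`, `check`, `is_real`, and `new_predicates` are hypothetical callbacks that a caller would supply.

```python
def refine_loop(abstract, check, is_real, new_predicates, predicates, max_iters=10):
    """Counterexample-driven refinement skeleton.

    abstract(predicates)   -> Boolean program built from the predicate set
    check(boolean_program) -> an error trace, or None if no error is reachable
    is_real(trace)         -> True if the trace is feasible in the C program
    new_predicates(trace)  -> predicates that rule out a spurious trace
    """
    for _ in range(max_iters):
        boolean_program = abstract(predicates)
        trace = check(boolean_program)
        if trace is None:
            return "safe", predicates          # no errors: program is safe
        if is_real(trace):
            return "bug", predicates           # real error: program has a bug
        predicates = predicates | new_predicates(trace)  # false error: refine
    return "unknown", predicates


# Stub callbacks: the abstraction becomes precise enough once "q" is tracked.
RESULT = refine_loop(
    abstract=lambda preds: frozenset(preds),
    check=lambda prog: None if "q" in prog else "spurious-trace",
    is_real=lambda trace: False,
    new_predicates=lambda trace: {"q"},
    predicates=set(),
)
```

Each pass through the loop repeats the abstraction and the exhaustive dataflow analysis, which is why starting with a good predicate set matters for running time.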

20
Boolean Programs
  • int x, y;
  • x = 5;
  • y = 6;
  • x = x + 2;
  • y = y + 2;
  • assert(x < y);

Boolean Program (abstraction of the C program above)
  • bool p, q;
  • p = 1;
  • q = 1;
  • p = 0;
  • q = 0;
  • q = 1;
  • assert(q);

Predicates (important!): p means x == 5, q means x < y
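The relationship between the two programs can be checked by evaluating the predicates on a concrete run. This is a small sketch with one loud assumption: the arithmetic operator in the two update statements is not legible in the transcript, so the code assumes `x = x + 2` and `y = y + 2`. The function names are hypothetical.

```python
def concrete_trace():
    """Run the C program by hand and record (x, y) after each update.

    Assumption: the updates are x = x + 2 and y = y + 2.
    """
    x, y = 5, 6
    states = [(x, y)]
    x = x + 2
    states.append((x, y))
    y = y + 2
    states.append((x, y))
    return states


def abstract(state):
    """Map a concrete state to the predicate values (p, q)."""
    x, y = state
    return (x == 5, x < y)   # p means x == 5, q means x < y


def abstract_trace():
    return [abstract(s) for s in concrete_trace()]
```

Under that assumption the abstract states are (p, q) = (1, 1), then (0, 0), then (0, 1), which lines up with the assignments in the Boolean program, and q holds at the assert.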
21
SLAM Predicates
  • Hard to come up with good predicates
  • Counterexample-driven refinement
  • Picks good predicates
  • Is very slow
  • Taking all possible predicates
  • Is even slower
  • Want all the useful predicates

22
Speeding Up SLAM
  • For a simple subset of C
  • Similar to Copy Constants
  • Use VFG to find a sufficient set of predicates
  • Provably sufficient for this subset
  • If this set fails to prove the real program
  • Fall back on counterexample-driven refinement

23
A Simple Language
  • s ::= vi = n // constants
  • | vi = vj // variable copy
  • | if (...) s1 else s2 // condition ignored
  • | vi = fun(vj, ...) // function call
  • | return(vi) // function return
  • | assert(vi == vj) // safety property

24
Predicate Discovery
  • High-level idea
  • Each flow edge in the VFG means values may flow
    from X to Y
  • Add predicates to see if they do
  • For each assert(vi == vj)
  • Consider the chain of values flowing to vi, vj
  • Add an equality predicate for each link
  • Use constants to resolve scoping
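The "one equality predicate per link" step can be sketched as a walk over reversed flow edges. This is a minimal sketch, assuming hand-written flow edges for the sel/main program used in the SLAM example slides; `FLOW_PREDS` and `chain_predicates` are hypothetical names, and the scoping fix-up with constants is intentionally not modeled.

```python
# Reversed flow edges assumed for the sel/main example:
#   a = 1;  f = a (call);  r = f;  r = 3;  b = r (return);  c = 2;  c = 4
FLOW_PREDS = {
    "a": ["1"],
    "f": ["a"],
    "r": ["f", "3"],
    "b": ["r"],
    "c": ["2", "4"],
}


def chain_predicates(flow_preds, var):
    """Emit one equality predicate per flow edge on the chains leading to var."""
    predicates, seen, work = [], {var}, [var]
    while work:
        dst = work.pop(0)
        for src in flow_preds.get(dst, []):
            predicates.append(f"{dst} == {src}")
            if src not in seen:        # visit each node once, so cycles terminate
                seen.add(src)
                work.append(src)
    return predicates


PREDS_FOR_B = chain_predicates(FLOW_PREDS, "b")
PREDS_FOR_C = chain_predicates(FLOW_PREDS, "c")
```

For b this produces b == r, r == f, r == 3, f == a, and a == 1, the same set listed on the first "Predicates For b" slide before scoping is resolved.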

25
SLAM Example
  • int sel(int f) {
  • int r;
  • if (...) r = f;
  • else r = 3;
  • return(r);
  • }
  • void main() {
  • int a, b, c;
  • a = 1;
  • b = sel(a);
  • if (...) c = 2;
  • else c = 4;
  • assert(b > c);
  • }

[Figure: VFG for this program with nodes a, 1, f, r, 3, b, 4, c, 2.]
26
Predicates For b
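  • int sel(int f) {
  • int r;
  • if (...) r = f;
  • else r = 3;
  • return(r);
  • }
  • void main() {
  • int a, b, c;
  • a = 1;
  • b = sel(a);
  • if (...) c = 2;
  • else c = 4;
  • assert(b > c);
  • }

[Figure: part of the VFG relevant to b, with nodes a, 1, f, r, 3, b.]
Predicates: b == r, r == 3, r == f, f == a, a == 1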
27
Predicates For b
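  • int sel(int f) {
  • int r;
  • if (...) r = f;
  • else r = 3;
  • return(r);
  • }
  • void main() {
  • int a, b, c;
  • a = 1;
  • b = sel(a);
  • if (...) c = 2;
  • else c = 4;
  • assert(b > c);
  • }

[Figure: part of the VFG relevant to b, with nodes a, 1, f, r, 3, b.]
Predicates: b == r, r == 3, r == f, f == a (no scope!), a == 1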
28
Predicates For b
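  • int sel(int f) {
  • int r;
  • if (...) r = f;
  • else r = 3;
  • return(r);
  • }
  • void main() {
  • int a, b, c;
  • a = 1;
  • b = sel(a);
  • if (...) c = 2;
  • else c = 4;
  • assert(b > c);
  • }

[Figure: part of the VFG relevant to b, with nodes a, 1, f, r, 3, b.]
Predicates before: b == r, r == 3, r == f, f == a (no scope!), a == 1
Predicates after: b == r, r == 3, r == f, f == 1, f == 3, a == 1, a == 3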
29
Why does this work?
  • Simple language
  • No arithmetic, etc.
  • Just copying around initial values
  • Knowing final values of variables
  • Completely decides safety condition
  • Still related to real life
  • Cannot do arithmetic on locks, FILE *s, device
    driver status codes, etc.

30
Some SLAM Results
Program    LOC    Runtime (original -> improved)   Generated Predicates   Missing Predicates
apmbatt    2207   229s -> 22s                      85                     0
pnpmem     3849   1132s -> 125s                    143                    4
floppy     7562   1063s -> 600s                    154                    33
iscsiprt   4543   729s                             146                    42
Generated predicates account for between two-thirds and all of
the necessary predicates. However, since SLAM must iterate
once for every 3-7 missing predicates, the net performance
increase is more than linear. Predicates can be specialized or
simplified if the assert() condition is a common relational
operator (e.g., x == y, x < y, x == 5).
31
Conclusions
  • Complex interprocedural analyses can benefit from
    inexpensive value-flow
  • VFG encodes value flow
  • Constructed and queried quickly
  • Prune the set of dataflow facts and program
    points considered
  • Large net performance increase