Program Analysis via Graph Reachability

Author: Thomas Reps. Created: 3/24/1998.

1
Program Analysis via Graph Reachability
  • Thomas Reps
  • University of Wisconsin

http://www.cs.wisc.edu/~reps/
See http://www.cs.wisc.edu/wpis/papers/tr1386.ps
2
PLDI '00 Registration Form
  • PLDI '00: ____
  • Tutorial (morning): ____
  • Tutorial (afternoon): ____
  • Tutorial (evening): 0

3
[Timeline figure: 1987, 1993, 1994, 1995, 1996, 1997, 1998]
4
Applications
  • Program optimization
  • Software engineering
  • Program understanding
  • Reengineering
  • Static bug-finding
  • Security (information flow)

5
Collaborators
  • Susan Horwitz
  • Mooly Sagiv
  • Genevieve Rosay
  • David Melski
  • David Binkley
  • Michael Benedikt
  • Patrice Godefroid

6
Themes
  • Harnessing CFL-reachability
  • Exhaustive alg. → Demand alg.

7
Program Slicing
  • The backward slice w.r.t. variable v at program
    point p: the program subset that may influence
    the value of variable v at point p.
  • The forward slice w.r.t. variable v at program
    point p: the program subset that may be influenced
    by the value of variable v at point p.

8
Backward Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
9
Backward Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
Backward slice with respect to printf("%d\n", i)
10
Slice Extraction
int main() {
    int i = 1;
    while (i < 11) {
        i = i + 1;
    }
    printf("%d\n", i);
}
Backward slice with respect to printf("%d\n", i)
11
Forward Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
12
Forward Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
Forward slice with respect to sum = 0
13
What Are Slices Useful For?
  • Understanding Programs
  • What is affected by what?
  • Restructuring Programs
  • Isolation of separate computational threads
  • Program Specialization and Reuse
  • Slices = specialized programs
  • Only reuse needed slices
  • Program Differencing
  • Compare slices to identify changes
  • Testing
  • What new test cases would improve coverage?
  • What regression tests must be rerun after a
    change?

14
Line-Character-Count Program
void line_char_count(FILE *f) {
    int lines = 0;
    int chars;
    BOOL eof_flag = FALSE;
    int n;
    extern void scan_line(FILE *f, BOOL *bptr, int *iptr);
    scan_line(f, &eof_flag, &n);
    chars = n;
    while (eof_flag == FALSE) {
        lines = lines + 1;
        scan_line(f, &eof_flag, &n);
        chars = chars + n;
    }
    printf("lines = %d\n", lines);
    printf("chars = %d\n", chars);
}
15
Character-Count Program
void char_count(FILE *f) {
    int lines = 0;
    int chars;
    BOOL eof_flag = FALSE;
    int n;
    extern void scan_line(FILE *f, BOOL *bptr, int *iptr);
    scan_line(f, &eof_flag, &n);
    chars = n;
    while (eof_flag == FALSE) {
        lines = lines + 1;
        scan_line(f, &eof_flag, &n);
        chars = chars + n;
    }
    printf("lines = %d\n", lines);
    printf("chars = %d\n", chars);
}
16
Line-Character-Count Program
void line_char_count(FILE *f) {
    int lines = 0;
    int chars;
    BOOL eof_flag = FALSE;
    int n;
    extern void scan_line(FILE *f, BOOL *bptr, int *iptr);
    scan_line(f, &eof_flag, &n);
    chars = n;
    while (eof_flag == FALSE) {
        lines = lines + 1;
        scan_line(f, &eof_flag, &n);
        chars = chars + n;
    }
    printf("lines = %d\n", lines);
    printf("chars = %d\n", chars);
}
17
Line-Count Program
void line_count(FILE *f) {
    int lines = 0;
    int chars;
    BOOL eof_flag = FALSE;
    int n;
    extern void scan_line2(FILE *f, BOOL *bptr, int *iptr);
    scan_line2(f, &eof_flag, &n);
    chars = n;
    while (eof_flag == FALSE) {
        lines = lines + 1;
        scan_line2(f, &eof_flag, &n);
        chars = chars + n;
    }
    printf("lines = %d\n", lines);
    printf("chars = %d\n", chars);
}
18
Specialization Via Slicing
wc -lc
19
How are Slices Computed?
  • Reachability in a Dependence Graph
  • Program Dependence Graph (PDG)
  • Dependences within one procedure
  • Intraprocedural slicing is reachability in one
    PDG
  • System Dependence Graph (SDG)
  • Dependences within entire system
  • Interprocedural slicing is reachability in the SDG
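
The idea that an intraprocedural slice is plain reachability in one PDG can be sketched in a few lines of Python. The PDG below is written out by hand for the sum program shown earlier; its edge list is an assumption made for illustration (control and flow dependences are merged into one successor map), and the function names are hypothetical.

```python
from collections import deque

# Hand-built PDG for the sum program (an illustrative assumption).
# An edge u -> v means "v is control- or flow-dependent on u".
PDG = {
    "Enter": ["sum = 0", "i = 1", "while (i < 11)", "printf(sum)", "printf(i)"],
    "sum = 0": ["sum = sum + i", "printf(sum)"],
    "i = 1": ["while (i < 11)", "sum = sum + i", "i = i + 1", "printf(i)"],
    "while (i < 11)": ["sum = sum + i", "i = i + 1"],
    "sum = sum + i": ["sum = sum + i", "printf(sum)"],
    "i = i + 1": ["while (i < 11)", "sum = sum + i", "i = i + 1", "printf(i)"],
    "printf(sum)": [],
    "printf(i)": [],
}


def reachable(graph, start):
    """Return every node reachable from start (including start itself)."""
    seen = {start}
    work = deque([start])
    while work:
        node = work.popleft()
        for succ in graph[node]:
            if succ not in seen:
                seen.add(succ)
                work.append(succ)
    return seen


def reverse(graph):
    """Return the same graph with every edge flipped."""
    rev = {node: [] for node in graph}
    for src, dsts in graph.items():
        for dst in dsts:
            rev[dst].append(src)
    return rev


def forward_slice(pdg, node):
    """Nodes that may be influenced by node: forward reachability."""
    return reachable(pdg, node)


def backward_slice(pdg, node):
    """Nodes that may influence node: reachability along reversed edges."""
    return reachable(reverse(pdg), node)
```

Under these assumptions, `backward_slice(PDG, "printf(i)")` yields the Enter node, `i = 1`, the while test, `i = i + 1`, and `printf(i)`, which matches the extracted slice shown on the earlier Slice Extraction slide.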

20
How is a PDG Created?
  • Control Flow Graph (CFG)
  • PDG is union of
  • Control Dependence Graph
  • Flow Dependence Graph
  • computed from CFG

21
Control Flow Graph
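int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
[Figure: control-flow graph with nodes Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i); the while node has T and F branches]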
22
Control Dependence Graph
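int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
Control dependence: an edge from p to q labeled T means q is reached from p if condition p is true (T), not otherwise. Similarly for false (F).
[Figure: control dependence graph over the nodes Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i), with T-labeled edges leaving Enter and while (i < 11)]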
23
Flow Dependence Graph
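int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
Flow dependence: an edge from p to q means the value of a variable assigned at p may be used at q.
[Figure: flow dependence graph over the nodes Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i)]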
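
One common way to obtain flow dependences from a CFG is a reaching-definitions computation; the slides do not say which method was used, so the following is only a sketch under that assumption. The CFG encoding and the `DEFS`/`USES` tables are written by hand for the sum program, and all names are hypothetical.

```python
# CFG of the sum program: node -> successors (hand-written assumption).
CFG = {
    "sum = 0": ["i = 1"],
    "i = 1": ["while (i < 11)"],
    "while (i < 11)": ["sum = sum + i", "printf(sum)"],
    "sum = sum + i": ["i = i + 1"],
    "i = i + 1": ["while (i < 11)"],
    "printf(sum)": ["printf(i)"],
    "printf(i)": [],
}
# Variable defined at each assignment, and variables used at each node.
DEFS = {"sum = 0": "sum", "i = 1": "i", "sum = sum + i": "sum", "i = i + 1": "i"}
USES = {
    "while (i < 11)": {"i"},
    "sum = sum + i": {"sum", "i"},
    "i = i + 1": {"i"},
    "printf(sum)": {"sum"},
    "printf(i)": {"i"},
}


def flow_dependences(cfg, defs, uses):
    """Return the set of edges (p, q): a value assigned at p may be used at q."""
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    # reach_in / reach_out hold (variable, defining node) pairs.
    reach_in = {n: set() for n in cfg}
    reach_out = {n: set() for n in cfg}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for n in cfg:
            new_in = set().union(*(reach_out[p] for p in preds[n]))
            new_out = set(new_in)
            if n in defs:
                var = defs[n]
                # A new definition kills earlier definitions of the same variable.
                new_out = {(v, d) for (v, d) in new_in if v != var}
                new_out.add((var, n))
            if new_in != reach_in[n] or new_out != reach_out[n]:
                reach_in[n], reach_out[n] = new_in, new_out
                changed = True
    return {(d, n) for n in cfg for (v, d) in reach_in[n] if v in uses.get(n, ())}
```

For example, `("sum = 0", "printf(sum)")` is in the result because the initial value of sum can reach the first printf when the loop body never runs, while `("sum = 0", "printf(i)")` is not.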
24
Program Dependence Graph (PDG)
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
[Figure: PDG combining control dependence edges (labeled T) and flow dependence edges over the nodes Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i)]
25
Program Dependence Graph (PDG)
int main() {
    int i = 1;
    int sum = 0;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
[Figure: PDG over the nodes Enter, sum = 0, i = 1, while (i < 11), sum = sum + i, i = i + 1, printf(sum), printf(i)]
26
Backward Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = sum + i;
        i = i + 1;
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}
[Figure: PDG of the sum program; backward slice, step 1]
27
Backward Slice (2)
[Figure: PDG of the sum program; backward slice, step 2]
28
Backward Slice (3)
[Figure: PDG of the sum program; backward slice, step 3]
29
Backward Slice (4)
[Figure: PDG of the sum program; backward slice, step 4]
30
Slice Extraction
int main() {
    int i = 1;
    while (i < 11) {
        i = i + 1;
    }
    printf("%d\n", i);
}
[Figure: PDG of the extracted slice, with nodes Enter, i = 1, while (i < 11), i = i + 1, printf(i)]
31
CodeSurfer
33
CodeSurfer
35
Browsing a Dependence Graph
Pretend this is your favorite browser. What does
clicking on a link do?
39
Interprocedural Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
40
Interprocedural Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
Backward slice with respect to printf("%d\n", i)
41
Interprocedural Slice
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
Superfluous components included by Weiser's slicing algorithm [TSE 84].
Left out by the algorithm of Horwitz, Reps, and Binkley [PLDI 88; TOPLAS 90].
42
How is an SDG Created?
  • Each PDG has nodes for
  • entry point
  • procedure parameters and function result
  • Each call site has nodes for
  • call
  • arguments and function result
  • Appropriate edges
  • entry node to parameters
  • call node to arguments
  • call node to entry node
  • arguments to parameters

43
System Dependence Graph (SDG)
[Figure: SDG with nodes Enter main, two Call p sites, and Enter p]
44
SDG for the Sum Program
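[Figure: SDG for the sum program.
 main: Enter main, sum = 0, i = 1, while (i < 11), printf(sum), printf(i).
 First Call add: x_in = sum, y_in = i, sum = x_out.
 Second Call add: x_in = i, y_in = 1, i = x_out.
 add: Enter add, x = x_in, y = y_in, x = x + y, x_out = x.]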
45
Interprocedural Backward Slice
[Figure: SDG with nodes Enter main, two Call p sites, and Enter p; backward slice, step 1]
46
Interprocedural Backward Slice (2)
[Figure: same SDG; backward slice, step 2]
47
Interprocedural Backward Slice (3)
[Figure: same SDG; backward slice, step 3]
48
Interprocedural Backward Slice (4)
[Figure: same SDG; backward slice, step 4]
49
Interprocedural Backward Slice (5)
[Figure: same SDG; backward slice, step 5]
50
Interprocedural Backward Slice (6)
[Figure: same SDG; backward slice, step 6]
51
Matched-Parenthesis Path
52
Interprocedural Backward Slice (6)
[Figure: SDG with nodes Enter main, two Call p sites, and Enter p; backward slice, step 6]
53
Interprocedural Backward Slice (7)
[Figure: same SDG; backward slice, step 7]
54
Slice Extraction
[Figure: extracted slice of the SDG, with nodes Enter main, one Call p site, and Enter p]
55
Slice of the Sum Program
[Figure: slice of the SDG for the sum program.
 main: Enter main, i = 1, while (i < 11), printf(i).
 Call add: x_in = i, y_in = 1, i = x_out.
 add: Enter add, x = x_in, y = y_in, x = x + y, x_out = x.]
56
CFL-Reachability [Yannakakis 90]
  • G: Graph (N nodes, E edges)
  • L: A context-free language
  • L-path from s to t iff the edge labels along the
    path spell a word in L
  • Running time: O(N^3)
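
A minimal worklist sketch of CFL-reachability follows. It assumes the grammar has been normalized so that every production has the form A → ε, A → B, or A → B C; the function name, the rule encoding, and the small parenthesis graph are all hypothetical choices made for illustration, not taken from the slides.

```python
from collections import deque


def cfl_reach(nodes, edges, eps_rules, unary_rules, binary_rules):
    """Close a labeled graph under a normalized context-free grammar.

    edges:        iterable of (src, label, dst)
    eps_rules:    nonterminals A with A -> epsilon
    unary_rules:  list of (A, B) for A -> B
    binary_rules: list of (A, B, C) for A -> B C
    Returns the set of all derivable (src, label, dst) facts.
    """
    facts = set()
    work = deque()
    out_by = {}  # (src, label) -> set of dst
    in_by = {}   # (dst, label) -> set of src

    def add(src, label, dst):
        fact = (src, label, dst)
        if fact not in facts:
            facts.add(fact)
            out_by.setdefault((src, label), set()).add(dst)
            in_by.setdefault((dst, label), set()).add(src)
            work.append(fact)

    for src, label, dst in edges:
        add(src, label, dst)
    for a in eps_rules:
        for n in nodes:
            add(n, a, n)  # the empty path at every node

    while work:
        u, b, v = work.popleft()
        for a, rhs in unary_rules:
            if rhs == b:
                add(u, a, v)
        for a, left, right in binary_rules:
            if left == b:   # u -b-> v followed by v -right-> w
                for w in list(out_by.get((v, right), ())):
                    add(u, a, w)
            if right == b:  # w -left-> u followed by u -b-> v
                for w in list(in_by.get((u, left), ())):
                    add(w, a, v)
    return facts


# Balanced parentheses: M -> epsilon | M M | open M close,
# normalized with a helper nonterminal P -> open M.
NODES = range(6)
EDGES = [(0, "open", 1), (1, "open", 2), (2, "close", 3),
         (3, "close", 4), (4, "close", 5)]
FACTS = cfl_reach(
    NODES, EDGES,
    eps_rules={"M"},
    unary_rules=[],
    binary_rules=[("M", "M", "M"), ("P", "open", "M"), ("M", "P", "close")],
)
```

In this toy graph there is an M-path from node 0 to node 4 (the word `open open close close`), but none from 0 to 5, because the last `close` has no matching `open`.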

57
Interprocedural Slicing via CFL-Reachability
  • Graph: System dependence graph
  • L: L(matched) [roughly]
  • Node m is in the slice w.r.t. n iff there is an
    L(matched)-path from m to n
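
To see why the matching condition matters, the sketch below compares plain backward reachability with a context-sensitive traversal on a hand-written SDG for the sum program with `add`. It tracks an explicit stack of call sites, which only terminates for non-recursive programs; this is a simplification for illustration and is not the summary-edge algorithm of Horwitz, Reps, and Binkley. The node names and edge lists are assumptions.

```python
# Hand-written SDG for the sum program that calls add (an assumption).
INTRA = [  # dependence edges inside one procedure: (src, dst)
    ("Enter main", "sum = 0"), ("Enter main", "i = 1"),
    ("Enter main", "while (i < 11)"), ("Enter main", "printf(sum)"),
    ("Enter main", "printf(i)"),
    ("while (i < 11)", "call1"), ("while (i < 11)", "call2"),
    ("call1", "x_in1 = sum"), ("call1", "y_in1 = i"), ("call1", "sum = x_out1"),
    ("call2", "x_in2 = i"), ("call2", "y_in2 = 1"), ("call2", "i = x_out2"),
    ("sum = 0", "x_in1 = sum"), ("sum = 0", "printf(sum)"),
    ("i = 1", "while (i < 11)"), ("i = 1", "y_in1 = i"),
    ("i = 1", "x_in2 = i"), ("i = 1", "printf(i)"),
    ("sum = x_out1", "x_in1 = sum"), ("sum = x_out1", "printf(sum)"),
    ("i = x_out2", "while (i < 11)"), ("i = x_out2", "y_in1 = i"),
    ("i = x_out2", "x_in2 = i"), ("i = x_out2", "printf(i)"),
    ("Enter add", "x = x_in"), ("Enter add", "y = y_in"),
    ("Enter add", "x = x + y"), ("Enter add", "x_out = x"),
    ("x = x_in", "x = x + y"), ("y = y_in", "x = x + y"),
    ("x = x + y", "x_out = x"),
]
CALL = [  # call and parameter-in edges, i.e. "(" labeled with the call site
    ("call1", "Enter add", 1), ("x_in1 = sum", "x = x_in", 1),
    ("y_in1 = i", "y = y_in", 1),
    ("call2", "Enter add", 2), ("x_in2 = i", "x = x_in", 2),
    ("y_in2 = 1", "y = y_in", 2),
]
RET = [  # parameter-out edges, i.e. ")" labeled with the call site
    ("x_out = x", "sum = x_out1", 1), ("x_out = x", "i = x_out2", 2),
]


def backward_slice(target, context_sensitive):
    """Backward traversal over (node, call-site stack) states."""
    start = (target, ())
    seen = {start}
    work = [start]
    while work:
        node, stack = work.pop()
        nexts = [(s, stack) for s, d in INTRA if d == node]
        for s, d, site in RET:
            if d == node:
                # Stepping back into the callee: remember the site we came from.
                pushed = stack + (site,) if context_sensitive else stack
                nexts.append((s, pushed))
        for s, d, site in CALL:
            if d == node:
                if not stack:
                    nexts.append((s, ()))          # no pending site: any caller
                elif stack[-1] == site:
                    nexts.append((s, stack[:-1]))  # must return to the matching site
        for state in nexts:
            if state not in seen:
                seen.add(state)
                work.append(state)
    return {node for node, _ in seen}
```

With `context_sensitive=False` the slice with respect to `printf(i)` picks up `sum = 0` through the body of `add`; with `context_sensitive=True` it does not, because leaving `add` through call site 1 does not match entering it through the return of call site 2.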

58
Asymptotic Running Time [Reps, Horwitz, Sagiv, Rosay 94]
  • CFL-reachability
  • System dependence graph: N nodes, E edges
  • Running time: O(N^3)
  • System dependence graph: special structure

Running time: O(E + CallSites × MaxParams^3)
60
Regular-Language Reachability [Yannakakis 90]
  • G: Graph (N nodes, E edges)
  • L: A regular language
  • L-path from s to t iff the edge labels along the
    path spell a word in L
  • Running time: O(NE)
  • Ordinary reachability (= transitive closure)
  • Label each edge with e
  • L is e*
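
Regular-language reachability can be sketched as a search over pairs (graph node, automaton state). The DFA encoding and the function name below are hypothetical; the example shows ordinary reachability as the special case where every edge is labeled e and the language is e*.

```python
from collections import deque


def regular_reach(edges, dfa, dfa_start, dfa_accept, source):
    """Nodes t such that some path source -> t spells a word the DFA accepts.

    edges: list of (src, label, dst)
    dfa:   dict mapping (state, label) -> next state
    """
    seen = {(source, dfa_start)}
    work = deque(seen)
    while work:
        node, state = work.popleft()
        for src, label, dst in edges:
            if src == node and (state, label) in dfa:
                nxt = (dst, dfa[(state, label)])
                if nxt not in seen:
                    seen.add(nxt)
                    work.append(nxt)
    return {node for node, state in seen if state in dfa_accept}


# A chain 0 -> 1 -> 2 -> 3 with every edge labeled "e".
CHAIN = [(0, "e", 1), (1, "e", 2), (2, "e", 3)]

# L = e*: one accepting state that loops on "e" (ordinary reachability).
E_STAR = {("q", "e"): "q"}

# L = (ee)*: two states, accepting only after an even number of edges.
EVEN = {("even", "e"): "odd", ("odd", "e"): "even"}
```

Here `regular_reach(CHAIN, E_STAR, "q", {"q"}, 0)` returns every node of the chain, while the `(ee)*` automaton keeps only the nodes at even distance from the source.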

61
CFL-Reachability via Dynamic Programming
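[Figure: a graph fragment with consecutive edges labeled B and C, shown next to the grammar production that combines them]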
62
Degenerate Case CFL-Recognition
exp → id | exp + exp | exp * exp | ( exp )
(a + b) * c ∈ L(exp) ?
63
Degenerate Case CFL-Recognition
exp → id | exp + exp | exp * exp | ( exp )
a + b) * c ∈ L(exp) ?
64
Program Chopping
Given source S and target T, what program points
transmit effects from S to T?
Intersect forward slice from S with backward
slice from T, right?
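
Within a single procedure, the intersection described here can be written directly. The sketch below uses a hand-written PDG edge list for the original sum program (an assumption) and hypothetical function names; it only illustrates the intersection, and the slides that follow question whether the same recipe is precise across procedure calls.

```python
# PDG edges of the sum program as (src, dst) pairs (hand-written assumption).
PDG_EDGES = [
    ("Enter", "sum = 0"), ("Enter", "i = 1"), ("Enter", "while (i < 11)"),
    ("Enter", "printf(sum)"), ("Enter", "printf(i)"),
    ("while (i < 11)", "sum = sum + i"), ("while (i < 11)", "i = i + 1"),
    ("sum = 0", "sum = sum + i"), ("sum = 0", "printf(sum)"),
    ("i = 1", "while (i < 11)"), ("i = 1", "sum = sum + i"),
    ("i = 1", "i = i + 1"), ("i = 1", "printf(i)"),
    ("sum = sum + i", "sum = sum + i"), ("sum = sum + i", "printf(sum)"),
    ("i = i + 1", "while (i < 11)"), ("i = i + 1", "sum = sum + i"),
    ("i = i + 1", "i = i + 1"), ("i = i + 1", "printf(i)"),
]


def reach(edges, start, backward=False):
    """Nodes reachable from start, following edges forward or backward."""
    seen = {start}
    work = [start]
    while work:
        node = work.pop()
        for src, dst in edges:
            if backward:
                src, dst = dst, src
            if src == node and dst not in seen:
                seen.add(dst)
                work.append(dst)
    return seen


def chop(edges, source, target):
    """Forward slice from source intersected with backward slice from target."""
    return reach(edges, source) & reach(edges, target, backward=True)
```

For instance, `chop(PDG_EDGES, "sum = 0", "printf(i)")` is empty, since nothing computed from sum ever flows into i in this program.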
65
Non-Transitivity and Slicing
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
66
Non-Transitivity and Slicing
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
Forward slice with respect to sum = 0
67
Non-Transitivity and Slicing
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
68
Non-Transitivity and Slicing
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
Backward slice with respect to printf("%d\n", i)
69
Non-Transitivity and Slicing
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
Forward slice with respect to sum = 0
∩
Backward slice with respect to printf("%d\n", i)
70
Non-Transitivity and Slicing
int main() {
    int sum = 0;
    int i = 1;
    while (i < 11) {
        sum = add(sum, i);
        i = add(i, 1);
    }
    printf("%d\n", sum);
    printf("%d\n", i);
}

int add(int x, int y) {
    return x + y;
}
?
Chop with respect to sum = 0 and printf("%d\n", i)
71
Non-Transitivity and Slicing
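[Figure: SDG for the sum program.
 main: Enter main, sum = 0, i = 1, while (i < 11), printf(sum), printf(i).
 First Call add: x_in = sum, y_in = i, sum = x_out.
 Second Call add: x_in = i, y_in = 1, i = x_out.
 add: Enter add, x = x_in, y = y_in, x = x + y, x_out = x.]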
72
Program Chopping
Given source S and target T, what program points
transmit effects from S to T?
Precise interprocedural chopping [Reps & Rosay, FSE 95]
73
[Timeline figure: 1987, 1993, 1994, 1995, 1996, 1997, 1998]
74
Dataflow Analysis
  • Goal: For each point in the program, determine a
    superset of the facts that could possibly hold
    during execution
  • Examples:
  • Constant propagation
  • Reaching definitions
  • Live variables
  • Possibly uninitialized variables
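
As a small illustration of the last example, the sketch below runs an iterative worklist analysis for possibly uninitialized variables on a made-up five-statement program. The program, the transfer-function helpers, and all names are hypothetical; the assignment rule used (the target becomes possibly uninitialized exactly when some operand is) is one common formulation and is assumed here.

```python
from collections import deque


def assign(target, used):
    """Transfer function for `target = expr`, where expr reads the variables in used."""
    def transfer(uninit):
        if uninit & set(used):
            return uninit | {target}   # an operand may be uninitialized
        return uninit - {target}       # target now holds a defined value
    return transfer


def identity(uninit):
    return uninit


def possibly_uninitialized(cfg, transfer, entry, variables):
    """Return, for each node, the variables possibly uninitialized before it."""
    before = {n: set() for n in cfg}
    before[entry] = set(variables)     # nothing is initialized on entry
    work = deque([entry])
    while work:
        n = work.popleft()
        after = transfer[n](before[n])
        for succ in cfg[n]:
            if not after <= before[succ]:
                before[succ] |= after  # join over paths is set union
                work.append(succ)
    return before


# Hypothetical program: start; x = 5; if (...) y = x; else y = w; print(y)
CFG = {
    "start": ["x = 5"],
    "x = 5": ["if"],
    "if": ["y = x", "y = w"],
    "y = x": ["print(y)"],
    "y = w": ["print(y)"],
    "print(y)": [],
}
TRANSFER = {
    "start": identity,
    "x = 5": assign("x", []),
    "if": identity,
    "y = x": assign("y", ["x"]),
    "y = w": assign("y", ["w"]),
    "print(y)": identity,
}
RESULT = possibly_uninitialized(CFG, TRANSFER, "start", {"w", "x", "y"})
```

In this example `RESULT["print(y)"]` contains w and y: y is initialized on the `y = x` branch but may still be uninitialized on the `y = w` branch, so the superset reported at the join includes it.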

75
Possibly Uninitialized Variables

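[Figure: control-flow graph annotated with sets of possibly uninitialized variables: {w,x,y}, {w,y}, {w,y}, {w,y}, {w}, {w,y}]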

76
Precise Intraprocedural Analysis
pf_p = f_k ∘ f_{k-1} ∘ … ∘ f_2 ∘ f_1
MOP_n = ⊓_{p ∈ PathsTo_n} pf_p(C)
77
if . . .
78
Precise Interprocedural Analysis
MOMP_n = ⊓_{p ∈ MatchedPathsTo_n} pf_p(C)
[Sharir & Pnueli 81]
[Figure labels: start, n, ret, C]
79
Representing Dataflow Functions
[Figure: graph representations, over facts a, b, c, of the identity function and of a constant function]
80
Representing Dataflow Functions
[Figure: graph representations, over facts a, b, c, of a gen/kill function and of a non-gen/kill function]
81
if . . .
82
Composing Dataflow Functions
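
One way to make the graph representation concrete is to store each distributive function as a relation over the facts plus one extra node standing for the empty set, so that composing functions becomes composing relations. The sketch below follows that idea; the node name `ZERO`, the helper names, and the two sample functions are assumptions for illustration.

```python
ZERO = "0"  # extra node standing for "no fact needed" (the empty set)


def represent(f, domain):
    """Relation representing a distributive function f on sets of facts."""
    base = f(frozenset())
    rel = {(ZERO, ZERO)}
    rel |= {(ZERO, y) for y in base}                       # facts generated from nothing
    rel |= {(x, y) for x in domain
            for y in f(frozenset([x])) if y not in base}   # facts that depend on x
    return rel


def apply_rel(rel, facts):
    """Evaluate the represented function on a set of facts."""
    sources = set(facts) | {ZERO}
    return {y for (x, y) in rel if x in sources and y != ZERO}


def compose(first, second):
    """Relation for 'apply first, then second': join on the middle node."""
    return {(x, z) for (x, y1) in first for (y2, z) in second if y1 == y2}


DOMAIN = {"a", "b", "c"}


def gen_kill(s):
    """Gen/kill function: kill a, generate b."""
    return (s - {"a"}) | {"b"}


def non_gen_kill(s):
    """Not gen/kill: b survives or appears only if a is present."""
    return s | {"b"} if "a" in s else s - {"b"}


R_GK = represent(gen_kill, DOMAIN)
R_NGK = represent(non_gen_kill, DOMAIN)
```

With these definitions, `apply_rel(compose(R_GK, R_NGK), X)` agrees with `non_gen_kill(gen_kill(X))` on the sample inputs checked, which is the property that lets paths in a larger graph stand for compositions of transfer functions.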
83
x
y
a
b
if . . .
84
matched → matched matched
        | (_i matched )_i     for 1 ≤ i ≤ CallSites
        | edge
        | ε
85
unbalLeft → matched unbalLeft
          | (_i unbalLeft     for 1 ≤ i ≤ CallSites
          | ε
86
Interprocedural Dataflow Analysis via
CFL-Reachability
  • Graph: Exploded control-flow graph
  • L: L(unbalLeft)
  • Fact d holds at n iff there is an
    L(unbalLeft)-path from

87
Asymptotic Running Time [Reps, Horwitz, Sagiv 95]
  • CFL-reachability
  • Exploded control-flow graph: ND nodes
  • Running time: O(N^3 D^3)
  • Exploded control-flow graph: special structure

Running time: O(ED^3)
Typically E ≈ N, hence O(ED^3) ≈ O(ND^3)
Gen/kill problems: O(ED)
88
Why Bother? "We're only interested in
million-line programs"
  • Know thy enemy!
  • Any algorithm must do these operations
  • Avoid pitfalls (e.g., claiming O(N^2) algorithm)
  • The essence of context sensitivity
  • Special cases
  • Gen/kill problems: O(ED)
  • Compression techniques
  • Basic blocks
  • SSA form, sparse evaluation graphs
  • Demand algorithms

89
Unifying Conceptual Model for Dataflow-Analysis
Literature
  • Linear-time gen-kill [Hecht 76, Kou 77]
  • Path-constrained DFA [Holley & Rosen 81]
  • Linear-time GMOD [Cooper & Kennedy 88]
  • Flow-sensitive MOD [Callahan 88]
  • Linear-time interprocedural gen-kill
    [Knoop & Steffen 93]
  • Linear-time bidirectional gen-kill [Dhamdhere 94]
  • Relationship to interprocedural DFA
    [Sharir & Pnueli 81, Knoop & Steffen 92]

90
Themes
  • Harnessing CFL-reachability
  • Exhaustive alg. → Demand alg.

91
Exhaustive Versus Demand Analysis
  • Exhaustive analysis All facts at all points
  • Optimization Concentrate on inner loops
  • Program-understanding tools Only some facts are
    of interest

92
Exhaustive Versus Demand Analysis
  • Demand analysis
  • Does a given fact hold at a given point?
  • Which facts hold at a given point?
  • At which points does a given fact hold?
  • Demand analysis via CFL-reachability
  • single-source/single-target CFL-reachability
  • single-source/multi-target CFL-reachability
  • multi-source/single-target CFL-reachability
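
A single-source/single-target demand query can be sketched as a backward search that starts at the node of interest and stops as soon as it meets the source. The example below uses plain reachability on a hand-written exploded graph for a made-up single-procedure program, so no parenthesis matching is needed; the graph, the convention that a node (point, fact) means "fact may hold before point", and the function name are all assumptions.

```python
from collections import deque

# Exploded-graph edges for a hypothetical program
#   start; x = 5; if (...) y = x; else y = w; print(y)
# analyzed for possibly uninitialized variables. Fact "0" is the dummy fact.
EDGES = [
    (("start", "0"), ("x = 5", "0")), (("start", "0"), ("x = 5", "w")),
    (("start", "0"), ("x = 5", "x")), (("start", "0"), ("x = 5", "y")),
    (("x = 5", "0"), ("if", "0")), (("x = 5", "w"), ("if", "w")),
    (("x = 5", "y"), ("if", "y")),
]
for branch in ("y = x", "y = w"):
    EDGES += [(("if", f), (branch, f)) for f in ("0", "w", "x", "y")]
EDGES += [
    (("y = x", "0"), ("print(y)", "0")), (("y = x", "w"), ("print(y)", "w")),
    (("y = x", "x"), ("print(y)", "x")), (("y = x", "x"), ("print(y)", "y")),
    (("y = w", "0"), ("print(y)", "0")), (("y = w", "w"), ("print(y)", "w")),
    (("y = w", "w"), ("print(y)", "y")), (("y = w", "x"), ("print(y)", "x")),
]


def holds_at(edges, source, point, fact):
    """Demand query: may `fact` hold before `point`? Searches backward only."""
    preds = {}
    for src, dst in edges:
        preds.setdefault(dst, []).append(src)
    target = (point, fact)
    seen = {target}
    work = deque([target])
    while work:
        node = work.popleft()
        if node == source:
            return True            # stop as soon as the source is reached
        for pred in preds.get(node, ()):
            if pred not in seen:
                seen.add(pred)
                work.append(pred)
    return False
```

Asking `holds_at(EDGES, ("start", "0"), "print(y)", "x")` explores only the few nodes that can lead to that one fact and answers no, without computing the facts at every other program point.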

93
if . . .
94
Experimental Results [Horwitz, Reps, Sagiv 1995]
  • 53 C programs (200-6,700 lines)
  • For a single fact of interest
  • demand always better than exhaustive
  • All appropriate demands beats exhaustive when
    percentage of "yes" answers is high
  • Live variables
  • Truly live variables
  • Constant predicates
  • . . .

95
A Related Result [Sagiv, Reps, Horwitz 1996]
  • Uses a generalized analysis technique
  • 38 C programs (300-6,000 lines)
  • copy-constant propagation
  • linear-constant propagation
  • All appropriate demands always beats exhaustive
  • factor of 1.14 to about 6

96
Exhaustive Versus Demand Analysis
  • Demand algorithms for
  • Interprocedural dataflow analysis
  • Set constraints
  • Points-to analysis

97
Most Significant Contributions 1987-2000
  • Asymptotically fastest algorithms
  • Interprocedural slicing
  • Interprocedural dataflow analysis
  • Demand algorithms
  • Interprocedural dataflow analysis [CC 94, FSE 95]
  • All appropriate demands beats exhaustive
  • Tool for slicing and browsing ANSI C
  • Slices programs as large as 75,000 lines
  • University research distribution
  • Commercial product CodeSurfer (GrammaTech,
    Inc.)

98
References
  • Papers by Reps and collaborators
  • http://www.cs.wisc.edu/~reps/
  • CFL-reachability
  • Yannakakis, M., Graph-theoretic methods in
    database theory, PODS 90.
  • Reps, T., Program analysis via graph
    reachability, Inf. and Softw. Tech. 98.

99
References
  • Slicing, chopping, etc.
  • Horwitz, Reps, Binkley, TOPLAS 90
  • Reps, Horwitz, Sagiv, Rosay, FSE 94
  • Reps & Rosay, FSE 95
  • Dataflow analysis
  • Reps, Horwitz, Sagiv, POPL 95
  • Horwitz, Reps, Sagiv, FSE 95, TR-1283