Title: Prioritizing Constraint Evaluation for Efficient Points-to Analysis
1. Prioritizing Constraint Evaluation for Efficient Points-to Analysis
- Rupesh Nasre, R. Govindarajan
- Department of Computer Science and Automation
- Indian Institute of Science, Bangalore, India
- CGO 2011
- Apr 06, 2011
2. Placement of Pointer Analysis
[Diagram: Pointer Analysis at the center, surrounded by its clients and the benefits they bring.]
- Clients: parallelizing compiler, lock synchronizer, memory leak detector, string vulnerability finder, data flow analyzer, affine expression analyzer, type analyzer, program slicer
- Benefits: improved runtime, secure code, better compile time, better debugging
3. Points-to Analysis as a Graph Problem
Each pointer is a node; a directed edge between p and q indicates that the points-to set of q is a subset of that of p.
- Input: set C of points-to constraints
- Process address-of constraints
- Add edges to constraint graph G using copy constraints
- repeat
  - Propagate points-to information in G (the literature focuses here)
  - Add edges to G using load and store constraints (our work deals here)
- until fixpoint
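The loop above can be sketched in Python as follows. This is a minimal illustration, not the authors' LLVM implementation: the function name `solve`, the tuple encodings of the constraints, and the edge convention (an edge src -> dst means pts(src) is a subset of pts(dst)) are assumptions made for the example. The usage part encodes the five-constraint example from the following slides, assuming the constraints read *e = c, c = *a, e = d, b = a, *a = p.

```python
from collections import defaultdict


def solve(address_of, copies, loads, stores):
    """Round-based inclusion-style points-to solver (illustrative sketch).

    address_of: (p, x) pairs for p = &x
    copies:     (p, q) pairs for p = q
    loads:      (p, q) pairs for p = *q
    stores:     (p, q) pairs for *p = q
    Returns the points-to sets and the number of rounds that changed something.
    """
    pts = defaultdict(set)   # pointer -> points-to set
    succ = defaultdict(set)  # edge src -> dst: pts[src] must be a subset of pts[dst]

    for p, x in address_of:
        pts[p].add(x)
    for p, q in copies:
        succ[q].add(p)

    rounds = 0
    while True:
        changed = False
        # Step 1: propagate points-to information along the current edges.
        work = [n for n in list(pts) if pts[n]]
        while work:
            n = work.pop()
            for m in succ[n]:
                if not pts[n] <= pts[m]:
                    pts[m] |= pts[n]
                    work.append(m)
                    changed = True
        # Step 2: add edges using load and store constraints.
        for p, q in loads:
            for x in list(pts[q]):
                if p not in succ[x]:
                    succ[x].add(p)
                    changed = True
        for p, q in stores:
            for x in list(pts[p]):
                if x not in succ[q]:
                    succ[q].add(x)
                    changed = True
        if not changed:
            return dict(pts), rounds
        rounds += 1


# Assumed reading of the slide example: *e = c, c = *a, e = d, b = a, *a = p
pts, rounds = solve(
    address_of=[("a", x) for x in "aqrst"] + [("p", x) for x in "bcd"],
    copies=[("e", "d"), ("b", "a")],
    loads=[("c", "a")],
    stores=[("e", "c"), ("a", "p")],
)
print(rounds, sorted(pts["e"]), sorted(pts["p"]))
```

Under these assumptions the sketch needs five changing rounds, and every node except p ends with the full set {a, b, c, d, q, r, s, t}, which agrees with the fixpoint shown on slide 10.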
4. Points-to Analysis as a Graph Problem
- Constraints: *e = c, c = *a, e = d, b = a, *a = p
- Initially, a → {a, q, r, s, t}, p → {b, c, d}
Node        Points-to set
a           {a, q, r, s, t}
b           {}
q, r, s, t  {}
c           {}
p           {b, c, d}
d           {}
e           {}
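5. Points-to Analysis as a Graph Problem
- Constraints: *e = c, c = *a, e = d, b = a, *a = p
- Initially, a → {a, q, r, s, t}, p → {b, c, d}
Iteration 0
Fixed processing order: e = d, b = a (copy constraints), then *e = c, c = *a, *a = p
Node        Points-to set
a           {a, q, r, s, t}
b           {}
q, r, s, t  {}
c           {}
p           {b, c, d}
d           {}
e           {}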
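6. Points-to Analysis as a Graph Problem
- Constraints: *e = c, c = *a, e = d, b = a, *a = p
- Initially, a → {a, q, r, s, t}, p → {b, c, d}
Iteration 1
Fixed processing order: e = d, b = a (copy constraints), then *e = c, c = *a, *a = p
Node        Points-to set
a           {a, q, r, s, t}
b           {a, q, r, s, t}
q, r, s, t  {}
c           {}
p           {b, c, d}
d           {}
e           {}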
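7. Points-to Analysis as a Graph Problem
- Constraints: *e = c, c = *a, e = d, b = a, *a = p
- Initially, a → {a, q, r, s, t}, p → {b, c, d}
Iteration 2
Fixed processing order: e = d, b = a (copy constraints), then *e = c, c = *a, *a = p
Node        Points-to set
a           {a, b, c, d, q, r, s, t}
b           {a, b, c, d, q, r, s, t}
q, r, s, t  {b, c, d}
c           {a, b, c, d, q, r, s, t}
p           {b, c, d}
d           {}
e           {}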
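8. Points-to Analysis as a Graph Problem
- Constraints: *e = c, c = *a, e = d, b = a, *a = p
- Initially, a → {a, q, r, s, t}, p → {b, c, d}
Iteration 3
Fixed processing order: e = d, b = a (copy constraints), then *e = c, c = *a, *a = p
Node        Points-to set
a           {a, b, c, d, q, r, s, t}
b           {a, b, c, d, q, r, s, t}
q, r, s, t  {b, c, d}
c           {a, b, c, d, q, r, s, t}
p           {b, c, d}
d           {b, c, d}
e           {b, c, d}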
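9. Points-to Analysis as a Graph Problem
- Constraints: *e = c, c = *a, e = d, b = a, *a = p
- Initially, a → {a, q, r, s, t}, p → {b, c, d}
Iteration 4
Fixed processing order: e = d, b = a (copy constraints), then *e = c, c = *a, *a = p
Node        Points-to set
a           {a, b, c, d, q, r, s, t}
b           {a, b, c, d, q, r, s, t}
q, r, s, t  {b, c, d}
c           {a, b, c, d, q, r, s, t}
p           {b, c, d}
d           {a, b, c, d, q, r, s, t}
e           {a, b, c, d, q, r, s, t}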
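10. Points-to Analysis as a Graph Problem
- Constraints: *e = c, c = *a, e = d, b = a, *a = p
- Initially, a → {a, q, r, s, t}, p → {b, c, d}
Iteration 5 (fixpoint)
Fixed processing order: e = d, b = a (copy constraints), then *e = c, c = *a, *a = p
Node        Points-to set
a           {a, b, c, d, q, r, s, t}
b           {a, b, c, d, q, r, s, t}
q, r, s, t  {a, b, c, d, q, r, s, t}
c           {a, b, c, d, q, r, s, t}
p           {b, c, d}
d           {a, b, c, d, q, r, s, t}
e           {a, b, c, d, q, r, s, t}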
11. Related Work
- Deep and Wave Propagation, Pereira and Berlin, CGO 2009.
- Worklist Management Strategies for Dataflow Analysis, Kanamori and Weise, MSR-TR-94-12, 1994.
These works focus on how points-to information is propagated after an edge is added. Our work deals with which constraints should be evaluated, so that useful edges get added.
12. Our Contributions
- Optimal constraint ordering is NP-Complete.
- A greedy heuristic for constraint ordering that results in an efficient points-to analysis.
- A prioritization framework for priority-based analysis.
- Detailed experimental evaluation to illustrate the effect of prioritization.
13. Order of Constraint Evaluation
- Optimal ordering is NP-Complete.
- Reduction from Set-Cover problem.
- The problem is hard even when there are no
complex constraints.
Need to depend upon heuristics.
What would be a good heuristic?
14. Constraint Priority
- Priority of a constraint in iteration i is the amount of new points-to information it adds in iteration (i-1).
- Constraints are grouped into priority levels, which are ordered by priority value.
- A constraint may jump across multiple priority levels during the analysis.
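One way to picture priority levels is a small bucketing routine. The sketch below is only an illustration under stated assumptions: the slides do not say how a priority value maps to a level, so the fixed `bucket_width`, the cap at `num_levels`, and the names `bucketize` and `evaluation_order` are hypothetical. The gains are the per-constraint counts from the worked example on the Prioritized Points-to Analysis slides, assuming those constraints read *a = p, c = *a, and *e = c.

```python
from collections import defaultdict


def bucketize(gains, bucket_width=4, num_levels=6):
    """Group constraints into levels by the facts they added last iteration.

    The linear mapping gain // bucket_width is an assumption for illustration.
    """
    levels = defaultdict(list)
    for constraint, gain in gains.items():
        level = min(num_levels - 1, gain // bucket_width)
        levels[level].append(constraint)
    return dict(levels)


def evaluation_order(levels):
    """Visit the highest level first; keep insertion order inside a level."""
    return [c for level in sorted(levels, reverse=True) for c in levels[level]]


# Gains observed in iteration 1 decide the order used in iteration 2.
order_2 = evaluation_order(bucketize({"*a = p": 18, "c = *a": 8, "*e = c": 0}))
# Gains observed in iteration 2 decide the order used in iteration 3.
order_3 = evaluation_order(bucketize({"*a = p": 6, "c = *a": 0, "*e = c": 10}))
print(order_2)
print(order_3)
```

Here *e = c moves from level 0 to level 2 between the two calls, which is the kind of jump across levels the last bullet describes.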
15-18. Bucketization
[Figure, built up over four slides: constraints C1 to C6 placed in priority levels 0 to 5 for iterations 1, 2, 3, and n.]
Level   Iteration 1      Iteration 2   Iteration 3   Iteration n
5                                      C5
4                        C5, C6        C1
3                        C2            C2
2                        C1, C4        C4, C6
1       C1, C2                                       C1, C2
0       C3, C4, C5, C6   C3            C3            C3, C4, C5, C6
19-20. Skewed Evaluation
[Figure, shown on two slides: the bucket layout after iterations 1 and 2; the columns for iterations 3 and n are empty.]
Level   Iteration 1      Iteration 2
4                        C5, C6
3                        C2
2                        C1, C4
1       C1, C2
0       C3, C4, C5, C6   C3
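21. Prioritized Points-to Analysis
Processing order: *a = p (18), c = *a (8), *e = c (0)
Fixed processing order: *e = c, c = *a, *a = p
Node        Andersen iteration 1      Priority iteration 1
a           {a, q, r, s, t}           {a, b, c, d, q, r, s, t}
b           {a, q, r, s, t}           {a, b, c, d, q, r, s, t}
q, r, s, t  {}                        {b, c, d}
c           {}                        {a, b, c, d, q, r, s, t}
p           {b, c, d}                 {b, c, d}
d           {}                        {}
e           {}                        {}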
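22. Prioritized Points-to Analysis
Processing order: *a = p (6), c = *a (0), *e = c (10)
Fixed processing order: *e = c, c = *a, *a = p
Node        Andersen iteration 2      Priority iteration 2
a           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
b           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
q, r, s, t  {b, c, d}                 {b, c, d}
c           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
p           {b, c, d}                 {b, c, d}
d           {}                        {a, b, c, d, q, r, s, t}
e           {}                        {a, b, c, d, q, r, s, t}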
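23. Prioritized Points-to Analysis
Processing order: *e = c (20), *a = p (0), c = *a (0)
Fixed processing order: *e = c, c = *a, *a = p
Node        Andersen iteration 3      Priority iteration 3
a           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
b           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
q, r, s, t  {b, c, d}                 {a, b, c, d, q, r, s, t}
c           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
p           {b, c, d}                 {b, c, d}
d           {b, c, d}                 {a, b, c, d, q, r, s, t}
e           {b, c, d}                 {a, b, c, d, q, r, s, t}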
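24. Prioritized Points-to Analysis
Processing order: *e = c (0), *a = p (0), c = *a (0)
Fixed processing order: *e = c, c = *a, *a = p
Node        Andersen iteration 4      Priority fixpoint
a           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
b           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
q, r, s, t  {b, c, d}                 {a, b, c, d, q, r, s, t}
c           {a, b, c, d, q, r, s, t}  {a, b, c, d, q, r, s, t}
p           {b, c, d}                 {b, c, d}
d           {b, c, d}                 {a, b, c, d, q, r, s, t}
e           {b, c, d}                 {a, b, c, d, q, r, s, t}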
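The prioritized walkthrough on these slides can be reproduced with a short Python sketch. It is a hedged illustration rather than the authors' implementation: the name `prioritized_solve`, the tuple encoding of constraints, the choice to propagate fully after each constraint, and the stable sort on the previous iteration's gain are all assumptions. The constraints are assumed to read *a = p, c = *a, *e = c (complex) and e = d, b = a (copy), and the first-iteration order is taken from slide 21.

```python
from collections import defaultdict


def prioritized_solve(address_of, copies, complex_constraints):
    """Greedy prioritized evaluation (illustrative sketch).

    complex_constraints: ("load", p, q) for p = *q, or ("store", p, q) for *p = q,
    listed in the order to use for the first iteration.
    """
    pts = defaultdict(set)   # pointer -> points-to set
    succ = defaultdict(set)  # edge src -> dst: pts[src] must be a subset of pts[dst]

    def propagate(sources):
        """Push facts onward from the given nodes; return how many facts were added."""
        gained, work = 0, list(sources)
        while work:
            n = work.pop()
            for m in succ[n]:
                new = pts[n] - pts[m]
                if new:
                    pts[m] |= new
                    gained += len(new)
                    work.append(m)
        return gained

    def evaluate(kind, p, q):
        """Add the edges this constraint implies right now, then propagate."""
        if kind == "load":
            edges = [(x, p) for x in pts[q]]
        else:
            edges = [(q, x) for x in pts[p]]
        touched = []
        for src, dst in edges:
            if dst not in succ[src]:
                succ[src].add(dst)
                touched.append(src)
        return propagate(touched)

    for p, x in address_of:
        pts[p].add(x)
    for p, q in copies:
        succ[q].add(p)
    propagate(list(pts))  # copy edges are handled before any complex constraint

    order, history = list(complex_constraints), []
    while True:
        gains = [evaluate(*c) for c in order]
        if not any(gains):
            return dict(pts), history
        history.append(list(zip(order, gains)))
        # Stable sort: higher gain first, ties keep their previous order.
        order = [c for _, c in sorted(zip(gains, order), key=lambda gc: -gc[0])]


pts, history = prioritized_solve(
    address_of=[("a", x) for x in "aqrst"] + [("p", x) for x in "bcd"],
    copies=[("e", "d"), ("b", "a")],
    complex_constraints=[("store", "a", "p"), ("load", "c", "a"), ("store", "e", "c")],
)
for i, rnd in enumerate(history, start=1):
    print(i, rnd)
```

Under these assumptions the per-iteration gains come out as 18, 8, 0, then 6, 0, 10, then 20, 0, 0, matching the numbers in parentheses on slides 21 to 23, and the fourth pass adds nothing.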
25. Salient Features
- Our prioritization framework allows for plugging in a priority mechanism.
- Constraints at a priority level are evaluated repeatedly, resulting in a skewed evaluation, which achieves fixpoint faster.
- Prioritized analysis is a general technique and can be applied to other analyses.
26. Evaluation
- Analysis dimensions: context-sensitive, flow-insensitive
- Framework: LLVM
- Benchmarks: SPEC 2000, httpd, sendmail, ghostscript, gdb, wine-server
- Analyses prioritized:
  - Andersen's inclusion-based analysis
  - BDD-based Lazy Cycle Detection (LCD)
  - Bloom-filter based approximate analysis
  - Deep Propagation (context-insensitive)
27. Results
Analysis time (seconds)
Benchmark     Andersen   Prioritized Andersen
gcc           329        286
perlbmk       143        98
equake        24         17
art           26         19
ghostscript   4384       3183
gdb           9338       5847
average       737        495
33% improvement
28. Effect of Prioritized Points-to Analysis
[Figure: results for the vortex benchmark.]
Prioritized analysis needs fewer iterations to reach fixpoint than the original analysis. It also adds more facts earlier than the original analysis.
29. Results
Analysis time (seconds)
Benchmark     BDD LCD    Prioritized BDD LCD
gcc           17411      7984
perlbmk       5879       3159
equake        4          3
art           7          4
ghostscript   20612      12371
wine-server   36         23
average       3693       1468
44% improvement
30. Results
Analysis time (seconds)
Benchmark     Bloom Filter   Prioritized Bloom Filter
gcc           10237          8534
perlbmk       2632           2144
equake        1.1            0.86
art           2.4            2.0
ghostscript   2597           2101
wine-server   23             18
average       2006           1610
20% improvement
31. Results
Deep Propagation is context-insensitive.
Analysis time (seconds)
Benchmark     Deep       Prioritized Deep
gcc           1.74       1.17
perlbmk       1.74       1.39
equake        0.004      0.004
art           0.004      0.004
ghostscript   207        126
wine-server   8.1        5.4
average       42         22
47% improvement
32. Effect of Bucketization
Increasing the number of buckets improves analysis time until it saturates.
33. Summary
- Prioritizing constraint evaluation order to add more useful edges.
- Optimal constraint ordering is NP-Complete.
- Amount of new points-to information added as a greedy heuristic to determine priority. Other heuristics can be plugged in.
- A generalized framework applicable to several analyses.
35. Trading off Flow-sensitivity
- Hind, Pioli, Which Pointer Analysis should I Use?, ISSTA 2000
"The use of flow-sensitive pointer analysis (as described in this paper) does not seem justified because it offers only a minimum increase in precision over the analyses of Andersen and Burke et al."
36. What is Pointer Analysis?
[Code example annotated with: a points to x; a and b are aliases; is this condition always satisfied?]
Pointer analysis is a mechanism to statically find out the run-time values of a pointer.
We focus on C/C++ programs and deal with may points-to analysis.
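A may-alias query over points-to sets can be sketched in a few lines of Python. The helper name `may_alias` and the three-statement program in the comment are hypothetical and only illustrate the idea that two pointers may alias when their points-to sets overlap.

```python
def may_alias(pts, p, q):
    """p and q may alias if some location appears in both points-to sets."""
    return bool(pts.get(p, set()) & pts.get(q, set()))


# Hypothetical program: a = &x; b = a; c = &y;
pts = {"a": {"x"}, "b": {"x"}, "c": {"y"}}
print(may_alias(pts, "a", "b"), may_alias(pts, "a", "c"))
```

Because the analysis is a may analysis, a positive answer means aliasing is possible on some execution, not that it always holds.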
37. Why Pointer Analysis?
- For Parallelization.
- fun(p) fun(q)
- For Optimization.
- a p 2
- b q 2
- For Bug-Finding.
- For Program Understanding.
- ...
Clients of Pointer Analysis.