Pointer analysis - PowerPoint PPT Presentation

About This Presentation
Title:

Pointer analysis

Slides: 91
Provided by: csewe4
Learn more at: https://cseweb.ucsd.edu

Transcript and Presenter's Notes

Title: Pointer analysis


1
Pointer analysis
2
Flow insensitive loss of precision
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: points-to graphs over t, p, l, S1 and S2 after each statement, comparing the flow-insensitive solution (Andersen) with the flow-sensitive solution]
3
Flow insensitive loss of precision
  • Flow-insensitive analysis leads to loss of
    precision!

main() { x := &y; ... x := &z; }
Flow-insensitive analysis tells us that x may
point to z at the "..." point, before x := &z has run!
  • However, flow-insensitive analysis
  • uses less memory (memory can be a big bottleneck
    when running on large programs)
  • runs faster
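
The precision gap can be sketched in a few lines of Python. This is only an illustration, assuming the slide's program is x := &y; ...; x := &z and modelling nothing but address-of assignments; the function names are hypothetical.

```python
def flow_insensitive(stmts):
    """One points-to map for the whole program; statement order is ignored."""
    pts = {}
    for var, target in stmts:
        pts.setdefault(var, set()).add(target)
    return pts


def flow_sensitive(stmts):
    """One points-to map per program point; an assignment overwrites the old set."""
    current = {}
    after = []
    for var, target in stmts:
        current = {**current, var: {target}}
        after.append(current)
    return after


# Each pair (var, target) stands for the statement `var := &target`.
program = [("x", "y"), ("x", "z")]

insensitive = flow_insensitive(program)   # x may point to y or z everywhere
sensitive = flow_sensitive(program)       # x points only to y between the two statements
```

The flow-insensitive version keeps a single map, which is where the memory and speed advantage comes from; the flow-sensitive version keeps one map per program point.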

4
Worst case complexity of Andersen
[Figure: points-to graph before and after the statement on x and y; nodes x, y, a, b, c, d, e, f]
  • Worst case N^2 per statement, so at least N^3 for
    the whole program. Andersen is in fact O(N^3).
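
A minimal sketch of a subset-based (Andersen-style) analysis, written as a naive fixpoint loop rather than an optimized worklist. The four statement kinds and the tuple encoding are assumptions made for this illustration, and the sample program is the Cons example with its operators reconstructed. In the "store" case every target of lhs receives every target of rhs, which is the quadratic per-statement work mentioned above.

```python
def andersen(stmts):
    """Flow-insensitive, subset-based points-to sets by naive fixpoint iteration."""
    pts = {}

    def get(v):
        return pts.setdefault(v, set())

    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":        # lhs := &rhs (also models lhs := new rhs)
                targets, new = [lhs], {rhs}
            elif kind == "copy":      # lhs := rhs
                targets, new = [lhs], set(get(rhs))
            elif kind == "load":      # lhs := *rhs
                targets = [lhs]
                new = set().union(*(get(o) for o in list(get(rhs))))
            elif kind == "store":     # *lhs := rhs
                targets, new = list(get(lhs)), set(get(rhs))
            else:
                raise ValueError(kind)
            for t in targets:
                if not new <= get(t):
                    get(t).update(new)
                    changed = True
    return pts


program = [
    ("addr", "l", "S1"),    # S1: l := new Cons
    ("copy", "p", "l"),     # p := l
    ("addr", "t", "S2"),    # S2: t := new Cons
    ("store", "p", "t"),    # *p := t
    ("copy", "p", "t"),     # p := t
]
result = andersen(program)
```

Because statement order is ignored, p ends up pointing to both S1 and S2, so the store also makes S2 point to itself, an edge that a flow-sensitive analysis would not report.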

5
New idea: one successor per node
  • Make each node have only one successor.
  • This is an invariant that we want to maintain.

[Figure: the same statement on x and y, with the successors merged into the single nodes a,b,c and d,e,f]
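6
More general case for x y
[Figure: points-to graph for x and y before and after the statement]
7
More general case for x y
8
Handling x y
[Figure: points-to graph for x and y before and after the statement]
9
Handling x y
10
Handling x y (what about y x?)
[Figure: points-to graph for x and y before and after the statement]
Handling x y
[Figure: points-to graph for x and y before and after the statement]
11
Handling x y (what about y x?)
get the same for y x
Handling x y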
12
Our favorite example, once more!
S1: l := new Cons    (1)
p := l               (2)
S2: t := new Cons    (3)
*p := t              (4)
p := t               (5)
13
Our favorite example, once more!
[Figure: the same program with its unification-based points-to graph over l, t, p, S1 and S2 after each of steps 1-5; in the final graph S1 and S2 are merged into a single node S1,S2]
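
The one-successor invariant can be sketched with a union-find structure. This is a simplified illustration in the style of Steensgaard's analysis, not the original implementation: the class name, method names and "$tmp" placeholder nodes are hypothetical, and the program is the Cons example with its operators reconstructed.

```python
import itertools


class Unifier:
    """Unification-based points-to graph: every class has at most one successor."""

    def __init__(self):
        self.parent = {}                 # union-find forest over node names
        self.succ = {}                   # class representative -> successor node
        self._ids = itertools.count()

    def find(self, n):
        self.parent.setdefault(n, n)
        while self.parent[n] != n:
            self.parent[n] = self.parent[self.parent[n]]   # path halving
            n = self.parent[n]
        return n

    def pointee(self, n):
        """Class that n points to, created on demand as a placeholder."""
        r = self.find(n)
        if r not in self.succ:
            self.succ[r] = f"$tmp{next(self._ids)}"
        return self.find(self.succ[r])

    def join(self, a, b):
        """Merge two classes, then merge their successors to keep the invariant."""
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        sa, sb = self.succ.pop(a, None), self.succ.pop(b, None)
        self.parent[a] = b
        if sa is not None and sb is not None:
            self.succ[b] = sb
            self.join(sa, sb)
        elif sa is not None or sb is not None:
            self.succ[b] = sa if sa is not None else sb

    def addr_of(self, x, y):             # x := &y, also x := new y
        self.join(self.pointee(x), y)

    def copy(self, x, y):                # x := y
        self.join(self.pointee(x), self.pointee(y))

    def store(self, x, y):               # *x := y
        self.join(self.pointee(self.pointee(x)), self.pointee(y))

    def points_to(self, x):
        r = self.find(x)
        if r not in self.succ:
            return set()
        target = self.find(self.succ[r])
        return {n for n in list(self.parent)
                if not n.startswith("$") and self.find(n) == target}


u = Unifier()
u.addr_of("l", "S1")    # S1: l := new Cons
u.copy("p", "l")        # p := l
u.addr_of("t", "S2")    # S2: t := new Cons
u.store("p", "t")       # *p := t
u.copy("p", "t")        # p := t
```

After the last statement p, l and t all point to one merged class containing S1 and S2. Each statement costs a handful of near-constant-time union-find operations, which is the price paid in precision for speed.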
14
Flow insensitive loss of precision
S1: l := new Cons
p := l
S2: t := new Cons
*p := t
p := t
[Figure: points-to graphs over t, p, l, S1 and S2 for three analyses: flow-sensitive subset-based, flow-insensitive subset-based, and flow-insensitive unification-based (where S1 and S2 are merged into one node S1,S2)]
15
Another example
bar() {
  i := &a;    (1)
  j := &b;    (2)
  foo(&i);    (3)
  foo(&j);    (4)
  // i pnts to what?
  *i := ...;
}
void foo(int* p) { printf("%d", *p); }
16
Another example
[Figure: the same code with its unification-based points-to graph after each of steps 1-4; at the end p points to a merged node i,j, which points to a merged node a,b]
17
Steensgaard & beyond
  • A well-engineered implementation of Steensgaard
    ran on Word97 (2.1 MLOC) in 1 minute.
  • One Level Flow (Das, PLDI '00) is an extension to
    Steensgaard that gets more precision and runs in
    2 minutes on Word97.

18
Correctness
19
Compilers have many bugs
Searched for "incorrect" and "wrong" in the
gcc-bugs mailing list. Some of the results:
  • [Bug middle-end/19650] New: miscompilation of
    correct code
  • [Bug c/19731] arguments incorrectly named in
    static member specialization
  • [Bug rtl-optimization/13300] Variable incorrectly
    identified as a biv
  • [Bug rtl-optimization/16052] strength reduction
    produces wrong code
  • [Bug tree-optimization/19633] local address
    incorrectly thought to escape
  • [Bug target/19683] New: MIPS wrong-code for
    64-bit multiply
  • [Bug c/19605] Wrong member offset in inherited
    classes
  • [Bug java/19295] [4.0 regression] Incorrect
    bytecode produced for bitwise AND

Total of 545 matches. And this is only for one
month! On a mature compiler!
20
Compiler bugs cause problems
[Figure: Compiler → Exec]
  • They lead to buggy executables
  • They rule out having strong guarantees about
    executables

21
The focus: compiler optimizations
  • A key part of any optimizing compiler

22
The focus: compiler optimizations
  • A key part of any optimizing compiler
  • Hard to get optimizations right
  • Lots of infrastructure-dependent details
  • There are many corner cases in each optimization
  • There are many optimizations and they interact in
    unexpected ways
  • It is hard to test all these corner cases and all
    these interactions

23
Goals
  • Make it easier to write compiler optimizations
      • a student in an undergrad compiler course should be
        able to write optimizations
  • Provide strong guarantees about the correctness
    of optimizations
      • automatically (no user intervention at all)
      • statically (before the opts are even run once)
  • Expressive enough for realistic optimizations

24
The Rhodium work
  • A domain-specific language for writing
    optimizations: Rhodium
  • A correctness checker for Rhodium optimizations
  • An execution engine for Rhodium optimizations
  • Implemented and checked the correctness of a
    variety of realistic optimizations

25
Broader implications
  • Many other kinds of program manipulators: code
    refactoring tools, static checkers
  • Rhodium work is about program analyses and
    transformations, the core of any program
    manipulator
  • Enables safe extensible program manipulators
  • Allow end programmers to easily and safely extend
    program manipulators
  • Improve programmer productivity

26
Outline
  • Introduction
  • Overview of the Rhodium system
  • Writing Rhodium optimizations
  • Checking Rhodium optimizations
  • Discussion

27
Rhodium system overview
[Figure: system diagram with the labels "Rhodium execution engine" and "Checker" (written by the Rhodium team) and "written by programmer"]
28
Rhodium system overview
29
Rhodium system overview
[Figure: three Rhodium optimizations (Rdm Opt)]
30
Rhodium system overview
[Figure: Compiler with the Rhodium execution engine and three Rdm Opts, producing Exec]
31
The technical problem
  • Tension between
      • Expressiveness
      • Automated correctness checking
  • Challenge: develop techniques
      • that will go a long way in terms of
        expressiveness
      • that allow correctness to be checked

32
Solution: three techniques
[Figure: Rdm Opt → Checker → Verification Task]
Verification task: show that, for any original program,
behavior of original program = behavior of optimized program
33
Solution: three techniques
[Figure: Rdm Opt, Verification Task]
34
Solution: three techniques
35
Solution: three techniques
  • Rhodium is declarative
      • declare intent using rules
      • execution engine takes care of the rest

36
Solution: three techniques
37
Solution: three techniques
[Figure: the Rdm Opt is divided into "heuristics not affecting correctness" and "part that must be reasoned about"]
  • Rhodium is declarative
  • Factor out heuristics
      • legal transformations
      • vs. profitable transformations

38
Solution: three techniques
39
Solution: three techniques
[Figure: the verification task is divided into an opt-dependent part and an opt-independent part]
  • Rhodium is declarative
  • Factor out heuristics
  • Split verification task
      • opt-dependent
      • vs. opt-independent

40
Solution: three techniques
41
Solution: three techniques
42
Solution: three techniques
  1. Rhodium is declarative
  2. Factor out heuristics
  3. Split verification task
  • Result
      • Expressive language
      • Automated correctness checking

43
Outline
  • Introduction
  • Overview of the Rhodium system
  • Writing Rhodium optimizations
  • Checking Rhodium optimizations
  • Discussion

44
MustPointTo analysis
a := &b
c := a
d := *c     (rewritten to d := b)
45
MustPointTo info in Rhodium
a := &b
c := a
d := *c
46
MustPointTo info in Rhodium
[Figure: the same three statements]
47
MustPointTo info in Rhodium
define fact mustPointTo(X:Var, Y:Var) with
meaning X == &Y
Fact correct on edge if:
whenever program execution reaches edge, meaning
of fact evaluates to true in the program state
a := &b
c := a
d := *c
48
Propagating facts
define fact mustPointTo(X:Var, Y:Var) with
meaning X == &Y
a := &b
c := a
d := *c
49
Propagating facts
if currStmt = [X := &Y] then mustPointTo(X,Y)@out
50
Propagating facts
51
Propagating facts
if currStmt = [X := &Y] then mustPointTo(X,Y)@out
if mustPointTo(X,Y)@in ∧ currStmt = [Z := X]
then mustPointTo(Z,Y)@out
a := &b
    mustPointTo(a, b)
c := a
    mustPointTo(a, b)
    mustPointTo(c, b)
d := *c
52
Propagating facts
53
Transformations
define fact mustPointTo(X:Var, Y:Var) with
meaning X == &Y
if currStmt = [X := &Y] then mustPointTo(X,Y)@out
if mustPointTo(X,Y)@in ∧ currStmt = [Z := X]
then mustPointTo(Z,Y)@out
if mustPointTo(X,Y)@in ∧ currStmt = [Z := *X]
then transform to Z := Y
a := &b
c := a
    mustPointTo(a, b)
    mustPointTo(c, b)
d := *c     (transformed to d := b)
54
Transformations
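
The propagation and transformation rules can be mimicked on straight-line code in Python. This is a sketch only, not the Rhodium engine: the tuple encoding of statements and the function name are hypothetical, the statements assume the reconstruction a := &b; c := a; d := *c, and dropping facts about a reassigned variable is an added assumption.

```python
def optimize(stmts):
    """Propagate mustPointTo facts forward and rewrite Z := *X into Z := Y."""
    must = {}        # must[x] == y encodes the fact mustPointTo(x, y) on the current edge
    out = []
    for op, lhs, rhs in stmts:
        # Transformation rule: mustPointTo(X,Y)@in and currStmt = [Z := *X]
        if op == "load" and rhs in must:
            op, rhs = "copy", must[rhs]          # transform to Z := Y
        # Assumption: facts about the reassigned variable are dropped.
        new = {x: y for x, y in must.items() if x != lhs}
        if op == "addr":                         # currStmt = [X := &Y]
            new[lhs] = rhs
        elif op == "copy" and rhs in must:       # currStmt = [Z := X]
            new[lhs] = must[rhs]
        must = new
        out.append((op, lhs, rhs))
    return out


program = [
    ("addr", "a", "b"),    # a := &b
    ("copy", "c", "a"),    # c := a
    ("load", "d", "c"),    # d := *c
]
optimized = optimize(program)
```

On this program the last statement is rewritten to a plain copy from b, matching the slide.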
55
Profitability heuristics
Legal transformations
(identified by the Rhodium rules)
Profitability Heuristics
Subset of legal transformations
(actually performed)
56
Profitability heuristic example 1
  • Inlining
  • Many heuristics to determine when to inline a
    function
  • compute function sizes, estimate code-size
    increase, estimate performance benefit
  • maybe even use AI techniques to make the decision
  • However, these heuristics do not affect the
    correctness of inlining
  • They are just used to choose which of the correct
    set of transformations to perform

57
Profitability heuristic example 2
  • Partial redundancy elimination (PRE)

a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
} else {
  ...
}
x := a + b;
58
Profitability heuristic example 2
  • PRE as code duplication followed by CSE
  • Code duplication

a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
  x := a + b;
} else {
  ...
  x := a + b;
}
59
Profitability heuristic example 2
  • PRE as code duplication followed by CSE
  • Code duplication
  • CSE

[Figure: in the first branch the duplicated x := a + b becomes x := x, since a + b is already available in x]
60
Profitability heuristic example 2
  • PRE as code duplication followed by CSE
  • Code duplication
  • CSE
  • self-assignment removal

[Figure: the self-assignment x := x is removed]
61
Profitability heuristic example 2
[Figure: legal placements of x := a + b versus the profitable placement]
a := ...;
b := ...;
if (...) {
  a := ...;
  x := a + b;
} else {
  ...
  x := a + b;
}
62
Semantics of a Rhodium opt
  • Run propagation rules in a loop until there are
    no more changes (optimistic iterative analysis)
  • Then run transformation rules to identify the set
    of legal transformations
  • Then run profitability heuristics to determine
    set of transformations to perform
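
The three phases can be sketched as follows, using mustPointTo facts on a small control-flow graph with a loop. This is an illustration only: the CFG encoding, the flow function and all names are assumptions, not the Rhodium execution engine.

```python
from itertools import product


def flow(stmt, facts):
    """Flow function for mustPointTo facts, stored as (pointer, target) pairs."""
    op, lhs, rhs = stmt
    # Assumption: facts about a reassigned variable are dropped.
    out = {(x, y) for (x, y) in facts if x != lhs}
    if op == "addr":                       # lhs := &rhs
        out.add((lhs, rhs))
    elif op == "copy":                     # lhs := rhs
        out |= {(lhs, y) for (x, y) in facts if x == rhs}
    return out


def facts_in(cfg, out, node):
    """Facts on the incoming edge: those that hold on every predecessor."""
    preds = cfg[node][1]
    return set.intersection(*(out[p] for p in preds)) if preds else set()


def propagate(cfg, variables):
    """Phase 1: start optimistically with every fact, iterate until no changes."""
    out = {n: set(product(variables, repeat=2)) for n in cfg}
    changed = True
    while changed:
        changed = False
        for node, (stmt, _) in cfg.items():
            new_out = flow(stmt, facts_in(cfg, out, node))
            if new_out != out[node]:
                out[node], changed = new_out, True
    return {n: facts_in(cfg, out, n) for n in cfg}


def legal_transformations(cfg, facts):
    """Phase 2: every rewrite Z := *X to Z := Y justified by mustPointTo(X, Y)."""
    legal = []
    for node, ((op, lhs, rhs), _) in cfg.items():
        if op == "load":
            legal += [(node, ("copy", lhs, y))
                      for (x, y) in sorted(facts[node]) if x == rhs]
    return legal


def run_opt(cfg, variables, profitable):
    facts = propagate(cfg, variables)
    legal = legal_transformations(cfg, facts)
    return [t for t in legal if profitable(t)]       # Phase 3: heuristics choose


# node -> (statement, predecessors); node 3 jumps back to node 2, forming a loop.
cfg = {
    0: (("addr", "a", "b"), []),
    1: (("copy", "c", "a"), [0]),
    2: (("load", "d", "c"), [1, 3]),
    3: (("copy", "e", "d"), [2]),
}
variables = ["a", "b", "c", "d", "e"]
chosen = run_opt(cfg, variables, profitable=lambda t: True)
```

Starting from "all facts hold" lets the fact about c survive the loop back edge. The profitability callback only filters the legal set, so it cannot introduce an incorrect transformation.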

63
More facts
define fact mustNotPointTo(X:Var, Y:Var) with
meaning X ≠ &Y
define fact doesNotPointIntoHeap(X:Var) with
meaning X == null ∨ ∃ Y:Var . X == &Y
define fact hasConstantValue(X:Var, C:Const) with
meaning X == C
64
More rules
if currStmt = [X := *A] ∧ mustNotPointToHeap(A)@in
∧ ∀ B:Var . mayPointTo(A,B)@in ⇒ mustNotPointTo(B,Y)
then mustNotPointTo(X,Y)@out
if currStmt = [Y := I + BE] ∧ varEqualArray(X,A,J)@in
∧ equalsPlus(J,I,BE)@in ∧ ¬mayDef(X) ∧ ¬mayDefArray(A)
∧ unchanged(BE)
then varEqualArray(X,A,Y)@out
65
More in Rhodium
  • More powerful pointer analyses
      • Heap summaries
  • Analyses across procedures
      • Interprocedural analyses
  • Analyses that don't care about the order of
    statements
      • Flow-insensitive analyses

66
Outline
  • Introduction
  • Overview of the Rhodium system
  • Writing Rhodium optimizations
  • Checking Rhodium optimizations
  • Discussion

67
Rhodium correctness checker
[Figure: Rdm Opt]
68
Rhodium correctness checker
69
Rhodium correctness checker
[Figure: Rdm Opt, Checker, automatic theorem prover]
70
Rhodium correctness checker
[Figure: a Rhodium optimization (define fact ..., if ... then transform ..., if ... then ..., profitability heuristics), Checker, automatic theorem prover]
71
Rhodium correctness checker
[Figure: as on the previous slide, without the profitability heuristics]
72
Rhodium correctness checker
[Figure: checker pipeline with the labels Rhodium optimization (define fact ..., if ... then ..., if ... then transform ...), VCGen, LocalVC and automatic theorem prover; parts are marked opt-independent and opt-dependent]
73
Local verification conditions
define fact mustPointTo(X,Y) with meaning X == &Y
Local VCs (generated and proven automatically)
74
Local correctness of prop. rules
Local VC (generated and proven automatically)
define fact mustPointTo(X,Y) with meaning X == &Y
if mustPointTo(X,Y)@in ∧ currStmt = [Z := X]
then mustPointTo(Z,Y)@out
Assume: all incoming facts are correct
Show: propagated fact is correct
75
Local correctness of prop. rules
Local VC (generated and proven automatically)
define fact mustPointTo(X,Y) with meaning X == &Y
[Figure: the statement Z := X executing between an incoming and an outgoing program state]
76
Local correctness of trans. rules
Local VC (generated and proven automatically)
define fact mustPointTo(X,Y) with meaning X == &Y
if mustPointTo(X,Y)@in ∧ currStmt = [Z := *X]
then transform to Z := Y
[Figure: the original statement Z := *X next to the transformed statement Z := Y]
77
Local correctness of trans. rules
[Figure: Z := *X and Z := Y executed side by side, asking whether their outgoing program states agree]
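
The two local conditions can be illustrated by brute force over a tiny state space. This is a bounded check in Python, not a proof and not what the checker does (the slides describe an automatic theorem prover); the three-variable store model and all names are assumptions of the sketch.

```python
from itertools import product

VARS = ["x", "y", "z"]
# A variable holds either the integer 0 or the address of some variable.
VALUES = [0] + [("addr", v) for v in VARS]


def states():
    """Every store that maps each variable to one of the candidate values."""
    for vals in product(VALUES, repeat=len(VARS)):
        yield dict(zip(VARS, vals))


def copy(s, z, x):
    """Execute z := x."""
    return {**s, z: s[x]}


def load(s, z, x):
    """Execute z := *x; only called when x holds an address."""
    return {**s, z: s[s[x][1]]}


def must_point_to(s, x, y):
    """Meaning of mustPointTo(x, y): x == &y in state s."""
    return s[x] == ("addr", y)


def check_propagation():
    # Assume the incoming fact holds; show the propagated fact holds after Z := X.
    return all(must_point_to(copy(s, Z, X), Z, Y)
               for s in states()
               for X, Y, Z in product(VARS, repeat=3)
               if must_point_to(s, X, Y))


def check_transformation():
    # Assume the incoming fact holds; show Z := *X and Z := Y give the same state.
    return all(load(s, Z, X) == copy(s, Z, Y)
               for s in states()
               for X, Y, Z in product(VARS, repeat=3)
               if must_point_to(s, X, Y))
```

Both checks range over every choice of X, Y and Z, including aliased ones such as Z and X being the same variable, which is where hand-written rules tend to go wrong.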
78
Outline
  • Introduction
  • Overview of the Rhodium system
  • Writing Rhodium optimizations
  • Checking Rhodium optimizations
  • Discussion

79
Topics of Discussion
  • Correctness guarantees
  • Usefulness of the checker
  • Expressiveness

80
Correctness guarantees
  • Guarantees
  • Usefulness
  • Expressiveness
  • Once checked, optimizations are guaranteed to be
    correct
  • Caveat: trusted computing base
      • execution engine
      • checker implementation
      • proofs done by hand once
  • Adding a new optimization does not increase the
    size of the trusted computing base

81
Usefulness of the checker
  • Found subtle bugs in my initial implementation of
    various optimizations

define fact equals(X:Var, E:Expr) with
meaning X == E
if currStmt = [X := E] then equals(X,E)@out
82
Usefulness of the checker
  • Found subtle bugs in my initial implementation of
    various optimizations

define fact equals(X:Var, E:Expr) with
meaning X == E
Initial rule:
if currStmt = [X := E] then equals(X,E)@out
Corrected rule:
if currStmt = [X := E] ∧ "X does not appear in E"
then equals(X,E)@out
83
Usefulness of the checker
  • Found subtle bugs in my initial implementation of
    various optimizations

define fact equals(X:Var, E:Expr) with
meaning X == E
x := x + 1     the initial rule would give equals(x, x + 1), which is false
x := y + 1     gives equals(x, y + 1)
if currStmt = [X := E] ∧ "X does not appear in E"
then equals(X,E)@out
if currStmt = [X := E] ∧ "E does not use X"
then equals(X,E)@out
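
The bug can be reproduced concretely. The sketch below is an illustration in Python, assuming the two statements are x := x + 1 and x := y + 1; expressions are restricted to the form "variable + constant", and every name is hypothetical.

```python
def execute(state, lhs, expr):
    """Run `lhs := var + k`, where expr is the pair (var, k)."""
    var, k = expr
    new_state = dict(state)
    new_state[lhs] = state[var] + k
    return new_state


def meaning_holds(state, lhs, expr):
    """Meaning of equals(lhs, expr): lhs == var + k evaluated in `state`."""
    var, k = expr
    return state[lhs] == state[var] + k


def initial_rule(lhs, expr):
    """Initial rule: always emit equals(lhs, expr) after `lhs := expr`."""
    return True


def corrected_rule(lhs, expr):
    """Corrected rule: emit the fact only when expr does not use lhs."""
    return expr[0] != lhs


start = {"x": 0, "y": 5}

after_self = execute(start, "x", ("x", 1))     # x := x + 1
after_other = execute(start, "x", ("y", 1))    # x := y + 1

# The initial rule emits equals(x, x + 1), but its meaning is false afterwards.
unsound = initial_rule("x", ("x", 1)) and not meaning_holds(after_self, "x", ("x", 1))
```

After x := x + 1 the value of x has moved on, so x == x + 1 cannot hold in the new state. The side condition rules out exactly this case while still accepting x := y + 1.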
84
Rhodium expressiveness
  • Traditional optimizations
      • const prop and folding, branch folding, dead
        assignment elim, common sub-expression elim,
        partial redundancy elim, partial dead assignment
        elim, arithmetic invariant detection, and integer
        range analysis
  • Pointer analyses
      • must-point-to analysis, Andersen's may-point-to
        analysis with heap summaries
  • Loop opts
      • loop-induction-variable strength reduction, code
        hoisting, code sinking
  • Array opts
      • constant propagation through array elements,
        redundant array load elimination
85
Expressiveness limitations
  • May not be able to express your optimization in
    Rhodium
      • opts that build complicated data structures
      • opts that perform complicated many-to-many
        transformations (e.g. loop fusion, loop
        unrolling)
  • A correct Rhodium optimization may be rejected by
    the correctness checker
      • limitations of the theorem prover
      • limitations of first-order logic

86
Lessons learned (discussion)
87
Lessons learned (my answers)
  • Capture structure of problem
      • Rhodium: flow functions, rewrite rules, prof.
        heuristics
      • Restricts the programmer, but can lead to better
        reasoning abilities
  • Split correctness-critical code from rest
  • Split verification task
      • meta-level vs. per-verification
      • between analysis tool and theorem prover
      • between human and theorem prover

88
Lessons learned (my answers)
  • DSL design is an iterative process
  • Hard to see best design without trying something
    first
  • Previous version of Rhodium was called Cobalt
  • Cobalt was based on temporal logic
  • Stepping stone towards Rhodium

89
Lessons learned (my answers)
  • One of the gotchas is efficient execution
  • easier to reason about automatically does not
    always mean easier to execute efficiently
  • can possibly recover efficiency with hints from
    users
  • how can you trust a complex execution engine?
  • Rely on annotations?
  • meanings in Rhodium
  • May be ok, especially if annotations simply state
    what the programmer is already thinking

90
Conclusion
  • Rhodium system
  • makes it easier to write optimizations
  • provides correctness guarantees
  • is expressive enough for realistic optimizations
  • Rhodium is an example of using a DSL to allow
    more precise reasoning