A Roadmap - PowerPoint PPT Presentation

Provided by: csr8
1
A Roadmap
  • Traditional Static Program Analysis
  • Theory
  • Compiler Optimizations: Control Flow Graphs
  • Data-flow Analysis: Data-flow framework (today's
    class)
  • Classic analyses and applications
  • Software Testing
  • Formal Static Program Analysis

2
Outline
  • Data-flow frameworks
  • Lattice theoretic foundations
  • Monotone frameworks
  • The Maximal Fixed Point (MFP) solution
  • The Meet Over all Paths (MOP) solution
  • Reading: Compilers: Principles, Techniques, and
    Tools, by Aho, Lam, Sethi and Ullman, Chapter 9.3

3
Four Classical Dataflow Problems: Similarities
  • There is a finite set U of dataflow facts
  • Reaching Definitions: the set of all definitions
    in the program
  • Available Expressions and Very Busy Expressions:
    the set of all expressions in the program
  • The solution at a program point i (i.e., in(i),
    out(i)) is a subset of U (e.g., for each
    definition it either reaches program point i or
    does not).

4
Similarities, continued
  • Dataflow equations are of the form
  • out(i) = fi(in(i)) = (in(i) - kill(i)) ∪ gen(i)
           = (in(i) ∩ pres(i)) ∪ gen(i)
  • Also, for all four classical data-flow problems,
    the sets pres(i) and gen(i) have constant values,
    i.e., they do not depend on in(i). This is not
    true in general.
  • Set union and set intersection can be implemented
    as logical OR and AND, respectively

5
Lattice Theory
  • Partial ordering (denoted by ≤ or ⊑)
  • Relation between pairs of elements
  • Reflexive: x ≤ x
  • Anti-symmetric: x ≤ y, y ≤ x implies x = y
  • Transitive: x ≤ y, y ≤ z implies x ≤ z
  • Poset (set S, ≤)
  • 0 Element: 0 ≤ x, for every x in S
  • 1 Element: x ≤ 1, for every x in S

We don't necessarily need 0 and 1 elements.
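
For a finite set, the three partial-order axioms can be checked by brute force. The sketch below assumes a relation given as a Python predicate; `is_partial_order` and the sample relations are names chosen for this example only.

```python
from itertools import product

def is_partial_order(elems, leq):
    """Check reflexivity, anti-symmetry and transitivity of leq on elems."""
    elems = list(elems)
    reflexive = all(leq(x, x) for x in elems)
    antisymmetric = all(x == y or not (leq(x, y) and leq(y, x))
                        for x, y in product(elems, repeat=2))
    transitive = all(leq(x, z) or not (leq(x, y) and leq(y, z))
                     for x, y, z in product(elems, repeat=3))
    return reflexive and antisymmetric and transitive

def divides(a, b):
    """a is below b when a divides b."""
    return b % a == 0

def strictly_less(a, b):
    """Not a partial order: it is not reflexive."""
    return a < b

print(is_partial_order([1, 2, 3, 6], divides))        # divisibility
print(is_partial_order([1, 2, 3, 6], strictly_less))  # fails reflexivity
```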
6
Poset Example
U = {a,b,c}. The poset is 2^U; ≤ is set inclusion.
[Hasse diagram, top to bottom]
{a,b,c}
{a,b} {b,c} {a,c}
{a} {b} {c}
{}

7
Lattice Theory
  • Greatest lower bound (glb): g of elements l1,
    l2 is an element in S such that
  • (1) g ≤ l1, (2) g ≤ l2
  • (3) for any b in S, b ≤ l1, b ≤ l2 implies b ≤ g
  • If glb exists, it is unique. Why? It is called
    the meet (denoted by ∧ or ⊓) of l1 and l2.
  • Least upper bound (lub): l of elements l1, l2 is
    an element in S such that
  • (1) l ≥ l1, (2) l ≥ l2
  • (3) for any d in S, d ≥ l1, d ≥ l2 implies d ≥ l
  • If lub exists, it is unique. It is called the
    join (denoted by ∨ or ⊔) of l1 and l2.

8
Definition of a Lattice
  • A lattice, L, is a poset under ≤ such that every
    pair of elements has a glb (meet) and lub (join)
  • Not every poset is a lattice
  • A lattice need not contain a 0 or 1 element
  • A finite lattice must contain 0 and 1 elements
  • If a ≤ x for every x in L, then a is the 0
    element of L
  • If x ≤ a for every x in L, then a is the 1
    element of L
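
The glb, lub, and lattice definitions translate directly into a brute-force check on a finite poset. This is a minimal sketch; the function names and the divisibility posets are assumptions picked for illustration, and elements are assumed never to be `None`.

```python
def glb(elems, leq, a, b):
    """Greatest lower bound of a and b in elems, or None if it does not exist."""
    lower = [g for g in elems if leq(g, a) and leq(g, b)]
    greatest = [g for g in lower if all(leq(x, g) for x in lower)]
    return greatest[0] if greatest else None

def lub(elems, leq, a, b):
    """Least upper bound of a and b in elems, or None if it does not exist."""
    upper = [u for u in elems if leq(a, u) and leq(b, u)]
    least = [u for u in upper if all(leq(u, x) for x in upper)]
    return least[0] if least else None

def is_lattice(elems, leq):
    """Every pair of elements must have both a glb and a lub."""
    return all(glb(elems, leq, a, b) is not None and
               lub(elems, leq, a, b) is not None
               for a in elems for b in elems)

def divides(a, b):
    return b % a == 0

print(is_lattice([1, 2, 3, 6], divides))       # every pair has a meet and a join
print(is_lattice([1, 2, 3], divides))          # 2 and 3 have no upper bound
print(is_lattice([1, 2, 3, 12, 18], divides))  # 2 and 3 have two minimal upper bounds
```

The last poset shows that having upper bounds is not enough: 12 and 18 are both upper bounds of 2 and 3, but neither is below the other, so no least upper bound exists.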

9
A poset but not a lattice
[Hasse diagram with elements e4, e3, e1, e2, and 0]
There is no lub(e3,e4) in this poset, so it is not
a lattice. Even if we add a lub(e3,e4), is it
going to be a lattice?
10
Examples of Lattices
  • H = (2^U, ∩, ∪) where U is a finite set
  • Partial order is the subset relation
  • glb(s1,s2) = s1 ∧ s2 = s1 ∩ s2
  • lub(s1,s2) = s1 ∨ s2 = s1 ∪ s2
  • J = (N1, gcd, lcm)
  • Partial order is integer divide on N1
  • glb(n1,n2) = n1 ∧ n2 = gcd(n1,n2)
  • lub(n1,n2) = n1 ∨ n2 = lcm(n1,n2)
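
In both lattices the order can be recovered from meet and join: a ≤ b holds exactly when a ∧ b = a, and exactly when a ∨ b = b. The following sketch checks this on small finite pieces of H and J; the universe `U` and the restriction to the divisors of 30 are assumptions made to keep the check finite.

```python
from itertools import combinations
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

U = ["a", "b", "c"]  # illustrative finite universe for H
POWERSET = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]
DIVISORS_OF_30 = [1, 2, 3, 5, 6, 10, 15, 30]  # a finite piece of J

# H: s1 is a subset of s2 exactly when the meet is s1 and the join is s2.
h_consistent = all(
    (s1 <= s2) == ((s1 & s2) == s1) and (s1 <= s2) == ((s1 | s2) == s2)
    for s1 in POWERSET for s2 in POWERSET
)

# J: n1 divides n2 exactly when gcd gives n1 and lcm gives n2.
j_consistent = all(
    (n2 % n1 == 0) == (gcd(n1, n2) == n1) and (n2 % n1 == 0) == (lcm(n1, n2) == n2)
    for n1 in DIVISORS_OF_30 for n2 in DIVISORS_OF_30
)

print(h_consistent, j_consistent)
```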

11
Chain
  • A poset C where for every pair of elements c1, c2
    in C, either c1 ≤ c2 or c2 ≤ c1.
  • E.g., {a} ≤ {a,b} ≤ {a,b,c}
  • And from the lattice J as shown here,
  • 1 ≤ 2 ≤ 6 ≤ 30
  • 1 ≤ 3 ≤ 15 ≤ 30

[Hasse diagram of the divisors of 30, top to bottom]
30
6 15 10
2 5 3
1
Lattices are used in dataflow analysis to reason
about the solution obtainable through fixed-point
iteration.
12
Dataflow Lattices: Reaching Definitions
U = all definitions = {(x,1),(x,4),(a,3)}. The poset
is 2^U; ≤ is the subset relation.
Program:
1. x = ab
2. if y < ab
3. a = a+1
4. x = ab
5. goto 3
[Hasse diagram, top to bottom]
1 = {(x,1),(x,4),(a,3)}
{(x,1),(x,4)} {(x,4),(a,3)} {(x,1),(a,3)}
{(x,1)} {(x,4)} {(a,3)}
0 = {}

13
Dataflow Lattices: Available Expressions
U = all expressions = {ab, a+1, yz}. The poset is
2^U; ≤ is the superset relation.
Program:
1. x = ab
2. if yz < ab
3. a = a+1
4. x = ab
5. goto 2
[Hasse diagram, top to bottom]
1 = {}
{ab} {yz} {a+1}
{ab,yz} {ab,a+1} {a+1,yz}
0 = {ab,a+1,yz}
14
Monotone Dataflow Frameworks
  • Generic data-flow equations
  • in(i) = ∨ out(m), joined over all m in pred(i)
  • out(i) = fi(in(i))
  • Parameters
  • Property space: in(i), out(i) are elements of a
    property space
  • Combination operator ∨: ∪ for may problems and ∩
    for must problems
  • Transfer functions: fi is associated with node i
  • If we instantiate these parameters in a certain
    way, then our analysis is an instance of the
    monotone dataflow framework

15
Monotone Frameworks: Requirements
  • The property space
  • Is a complete lattice L under partial order ≤,
  • where L satisfies the Ascending Chain Condition
  • (i.e., all ascending chains are finite)
  • The combination operator ∨
  • Is the join (∨, pronounced vee) of L
  • Reaching Definitions: Property space? Combination
    operator?
  • Available Expressions: Property space?
    Combination operator?

16
Monotone Frameworks: Requirements
  • The transfer functions fi: L → L
  • Formally, there is a space F such that
  • F contains all fi
  • F contains the identity function id(x) = x
  • F is closed under composition
  • Each fi is monotone

17
Monotonicity
  • It is defined as
  • (1) a ≤ b ⇒ f(a) ≤ f(b)
  • An equivalent definition is (2) f(x) ∨ f(y) ≤
    f(x ∨ y)
  • Lemma: The two definitions are equivalent.
  • First, we show that (1) implies (2).
  • Second, we show that (2) implies (1).
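
The lemma can be sanity-checked exhaustively on a small lattice before proving it. The sketch below enumerates every function on the four-element lattice 2^{a,b} ordered by inclusion and confirms that definitions (1) and (2) accept exactly the same functions. It is an experiment on one assumed lattice, not a proof, and all names are chosen for this example.

```python
from itertools import combinations, product

U = ["a", "b"]  # illustrative universe; the lattice is its powerset
L = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def monotone_1(f):
    """Definition (1): a ≤ b implies f(a) ≤ f(b)."""
    return all(f[a] <= f[b] for a in L for b in L if a <= b)

def monotone_2(f):
    """Definition (2): f(x) ∨ f(y) ≤ f(x ∨ y), with ∨ as set union."""
    return all((f[x] | f[y]) <= f[x | y] for x in L for y in L)

# Every function L -> L, represented as a dict from element to image.
all_functions = [dict(zip(L, images)) for images in product(L, repeat=len(L))]
agree = all(monotone_1(f) == monotone_2(f) for f in all_functions)
print(len(all_functions), agree)
```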

18
The four classical dataflow problems
Let Def denote all definitions in the program. Let
2^Def denote the powerset of Def.
Let AExp denote all expressions in the
program. Let 2^AExp denote the powerset of AExp.
Reaching Definitions
Available Expressions
19
Distributivity
  • A distributive framework: A monotone framework
    with distributive transfer functions: f(x) ∨ f(y)
    = f(x ∨ y).

20
Distributivity
  • Each of the four problems is an instance of a
    distributive framework.
  • First, prove monotonicity
  • Second, prove distributivity of the functions

21
Distributivity
  • Each of the four problems is an instance of a
    distributive framework.
  • First, prove monotonicity
  • if in(i) ≤ in'(i) then out(i) ≤ out'(i)
  • Have to show
  • if in(i) ⊆ in'(i) then
  • (in(i) ∩ pres(i)) ∪ gen(i) ⊆ (in'(i) ∩ pres(i)) ∪
    gen(i)
  • Second, prove distributivity
  • ((in(i) ∪ in'(i)) ∩ pres(i)) ∪ gen(i)
  • = ((in(i) ∩ pres(i)) ∪ gen(i)) ∪ ((in'(i) ∩ pres(i))
    ∪ gen(i))
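
Both properties can be checked by brute force for every choice of pres(i) and gen(i) over a small universe. This sketch assumes a three-element universe `U` and hypothetical helper names; it tests distributivity over union (may problems) and over intersection (must problems).

```python
from itertools import combinations

U = ["d1", "d2", "d3"]  # illustrative universe of dataflow facts
SUBSETS = [frozenset(c) for r in range(len(U) + 1) for c in combinations(U, r)]

def make_transfer(pres, gen):
    """Build f(s) = (s ∩ pres) ∪ gen for constant sets pres and gen."""
    return lambda s: (s & pres) | gen

def is_monotone(f):
    return all(f(a) <= f(b) for a in SUBSETS for b in SUBSETS if a <= b)

def distributes_over_union(f):
    return all(f(a | b) == f(a) | f(b) for a in SUBSETS for b in SUBSETS)

def distributes_over_intersection(f):
    return all(f(a & b) == f(a) & f(b) for a in SUBSETS for b in SUBSETS)

all_ok = all(
    is_monotone(f) and distributes_over_union(f) and distributes_over_intersection(f)
    for pres in SUBSETS for gen in SUBSETS
    for f in [make_transfer(pres, gen)]
)
print(all_ok)
```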

22
Points-to Analysis: Monotone, Non-distributive
Analysis
  • Lattice: The set of all points-to graphs Pt
  • ≤ is inclusion: Pt1 ≤ Pt2 if Pt1 is a subgraph of
    Pt2
  • ∨ is union: Pt1 ∨ Pt2 = Pt1 ∪ Pt2
  • Transfer functions are defined on four kinds of
    statements
  • (1) f(p=&q): kill all points-to edges from p,
    and generate a new points-to edge from p to q
  • (2) f(p=q): kill all points-to edges from p,
    and generate new points-to edges from p to
    every x such that q points to x
  • (3) f(p=*q): kill all points-to edges from p,
    and generate new points-to edges from p to
    every x, such that there exists y and q points to
    y and y points to x
  • (4) f(*p=q): Do not perform kill. Can you think of
    a reason why? Generate new points-to edges from
    every y to every x, such that p points to y and q
    points to x.
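
The four transfer functions can be sketched over points-to graphs represented as sets of (source, target) edges. The representation and the function names `address_of`, `copy_ptr`, `load`, and `store` are assumptions made for this illustration.

```python
def targets(pt, v):
    """All x such that v points to x in graph pt."""
    return {dst for src, dst in pt if src == v}

def kill(pt, p):
    """Remove every points-to edge leaving p."""
    return {(s, d) for s, d in pt if s != p}

def address_of(pt, p, q):
    """(1) p = &q: kill edges from p, add p -> q."""
    return frozenset(kill(pt, p) | {(p, q)})

def copy_ptr(pt, p, q):
    """(2) p = q: kill edges from p, add p -> x for every x that q points to."""
    return frozenset(kill(pt, p) | {(p, x) for x in targets(pt, q)})

def load(pt, p, q):
    """(3) p = *q: kill edges from p, add p -> x when q -> y and y -> x."""
    return frozenset(kill(pt, p) |
                     {(p, x) for y in targets(pt, q) for x in targets(pt, y)})

def store(pt, p, q):
    """(4) *p = q: no kill; add y -> x when p -> y and q -> x."""
    return frozenset(set(pt) |
                     {(y, x) for y in targets(pt, p) for x in targets(pt, q)})

g = address_of(frozenset({("q", "x")}), "p", "q")  # q -> x, then p = &q
g = load(g, "r", "p")                              # r = *p adds r -> x
print(sorted(g))
```

The generated edges are always computed from the incoming graph, so a statement such as p = p leaves the edges from p unchanged.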

23
Monotone non-distributive Analysis
  • First, we show that the framework is monotone,
  • I.e., for each of the four transfer functions we
    have to show that if Pt1 ≤ Pt2, then f(Pt1) ≤
    f(Pt2)
  • Second, we show that the framework is not
    distributive
  • It is easy to show f(Pt1 ∨ Pt2) ≠ f(Pt1) ∨ f(Pt2)
  • Another example is constant propagation

24
Non-distributivity of Points-to Analysis
Pt1: p → x, q → y
Pt2: p → z, q → w
Pt1 ∨ Pt2: p → x, p → z, q → y, q → w
Statement: *p = q
What f does: Adds edges from each variable that
p points to (i.e., x and z), to each variable
that q points to (i.e., y and w). In f(Pt1 ∨ Pt2)
this yields 4 new edges: from x to y and w, and
from z to y and w.
[Diagrams of f(Pt1) ∨ f(Pt2) and f(Pt1 ∨ Pt2)]
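
The example can be replayed in code. This sketch assumes points-to graphs are sets of (source, target) edges and uses a hypothetical `store` helper for the statement *p = q.

```python
def targets(pt, v):
    """All x such that v points to x in graph pt."""
    return {dst for src, dst in pt if src == v}

def store(pt, p, q):
    """*p = q: no kill; add y -> x when p -> y and q -> x."""
    return frozenset(pt) | {(y, x) for y in targets(pt, p) for x in targets(pt, q)}

pt1 = frozenset({("p", "x"), ("q", "y")})
pt2 = frozenset({("p", "z"), ("q", "w")})

f_then_join = store(pt1, "p", "q") | store(pt2, "p", "q")  # f(Pt1) ∨ f(Pt2)
join_then_f = store(pt1 | pt2, "p", "q")                   # f(Pt1 ∨ Pt2)

# Edges that appear only when the graphs are merged before applying f.
spurious = join_then_f - f_then_join
print(sorted(spurious))
```

Applying f to each graph separately adds only x → y and z → w; merging first also adds x → w and z → y, so f(Pt1) ∨ f(Pt2) is strictly smaller than f(Pt1 ∨ Pt2).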
25
The Maximal Fixed Point (MFP) [1]
Generic algorithm, with the Reaching Definitions
instance on the right:
  /* Initialize to initial values */
  in(1) = InitialValue                inRD(1) = UNDEF
  for m = 2 to n do in(m) = 0         inRD(m) = Ø
  W = {1,2,...,n} /* put every node on the worklist */
  while W ≠ Ø do
    remove i from W
    out(i) = fi(in(i))                outRD(i) = (inRD(i) ∩ pres(i)) ∪ gen(i)
    for j in successors(i)
      if out(i) ≰ in(j) then          if outRD(i) not subset of inRD(j)
        in(j) = out(i) ∨ in(j)        inRD(j) = outRD(i) ∪ inRD(j)
        if j not in W do add j to W

[1] The Least Fixed Point (LFP), actually
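
A direct Python rendering of the worklist algorithm might look as follows. It is a sketch under stated assumptions: the lattice is passed in as `join`, `leq`, and `bottom` arguments (names invented here), the initial value at the entry is taken to be the empty set rather than UNDEF, and the control-flow edges of the five-statement Reaching Definitions program (1→2, 2→3, 3→4, 4→5, 5→3) are inferred, with the exit branch of statement 2 omitted.

```python
def mfp(nodes, succ, transfer, join, leq, entry, init_value, bottom):
    """Worklist iteration; returns the in and out maps at the fixed point."""
    in_ = {n: bottom for n in nodes}
    in_[entry] = init_value
    out = {}
    worklist = list(nodes)  # put every node on the worklist
    while worklist:
        i = worklist.pop(0)
        out[i] = transfer(i, in_[i])
        for j in succ.get(i, ()):
            if not leq(out[i], in_[j]):
                in_[j] = join(out[i], in_[j])
                if j not in worklist:
                    worklist.append(j)
    return in_, out

# Reaching Definitions instance (CFG edges are an assumption, see above).
X1, X4, A3 = ("x", 1), ("x", 4), ("a", 3)
ALL_DEFS = frozenset({X1, X4, A3})
SUCC = {1: [2], 2: [3], 3: [4], 4: [5], 5: [3]}
PRES = {1: frozenset({A3}), 2: ALL_DEFS, 3: frozenset({X1, X4}),
        4: frozenset({A3}), 5: ALL_DEFS}
GEN = {1: frozenset({X1}), 2: frozenset(), 3: frozenset({A3}),
       4: frozenset({X4}), 5: frozenset()}

in_rd, out_rd = mfp(
    nodes=[1, 2, 3, 4, 5], succ=SUCC,
    transfer=lambda i, s: (s & PRES[i]) | GEN[i],
    join=lambda a, b: a | b,    # may problem: join is set union
    leq=lambda a, b: a <= b,    # partial order is the subset relation
    entry=1, init_value=frozenset(), bottom=frozenset(),
)
for n in sorted(in_rd):
    print(n, sorted(in_rd[n]), sorted(out_rd[n]))
```

The worklist is a FIFO list here for simplicity; any removal order reaches the same fixed point, although the number of iterations can differ.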
26
Properties of the algorithm
  • Lemma 1: The algorithm terminates.
  • Sketch of the proof:
  • We have in_k(j) ≤ in_(k+1)(j), and since L has ACC,
    in(j) changes at most O(h) times. Thus, each j is
    put on W at most O(h) times (h is the height of
    the lattice L).
  • Complexity: At each iteration, the analysis
    examines e(j)out edges. Thus, the number of basic
    operations is bounded by
    h · (e(1)out + ... + e(N)out) = O(h · E).
  • We can do better on reducible graphs.

27
Properties of the Algorithm
  • Lemma 2: The algorithm computes the least solution
    of the dataflow equations.
  • For every node i, MFP computes a solution MFP(i) =
    {in(i), out(i)}, such that every other solution
    {in'(i), out'(i)} of the dataflow equations is
    larger than the MFP
  • Lemma 3: The algorithm computes a correct (safe)
    solution.

28
Example
Program:
1. z = xy
2. if (z > 500)
3. skip
Equations:
inAE(1) = Ø
outAE(1) = (inAE(1) - Ez) ∪ {xy}
inAE(2) = outAE(1) ∨ outAE(3)
outAE(2) = inAE(2)
inAE(3) = outAE(2)
outAE(3) = inAE(3)
Two solutions:
            Solution1   Solution2
inAE(1)     Ø           Ø
outAE(1)    {xy}        {xy}
inAE(2)     {xy}        Ø
outAE(2)    {xy}        Ø
The equation for inAE(2) is equivalent to
inAE(2) = {xy} ∨ inAE(2), and recall that ∨ is ∩
(i.e., set intersection). That is why we needed to
initialize inAE(2) and the other initial values to
the universal set of expressions (0 of the Available
Expressions lattice), rather than to the more
intuitive empty set.
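
The effect of the initial value can be reproduced with a small iteration over the three-node loop. This sketch treats the single expression as an opaque name `"xy"`, assumes the edges 1→2, 2→3, 3→2, and takes the kill set at node 1 to be empty within this universe; `solve` and `satisfies` are hypothetical helper names.

```python
U = frozenset({"xy"})                      # universal set of expressions
PRED = {1: [], 2: [1, 3], 3: [2]}          # assumed CFG: 1 -> 2, 2 -> 3, 3 -> 2
GEN = {1: frozenset({"xy"}), 2: frozenset(), 3: frozenset()}
KILL = {1: frozenset(), 2: frozenset(), 3: frozenset()}

def out_ae(i, in_):
    """outAE(i) = (inAE(i) - kill(i)) ∪ gen(i)."""
    return (in_[i] - KILL[i]) | GEN[i]

def meet_of_preds(i, in_):
    """Intersection of outAE(m) over all predecessors m of node i."""
    result = U
    for m in PRED[i]:
        result = result & out_ae(m, in_)
    return result

def solve(initial):
    """Iterate to a fixed point, starting nodes 2 and 3 at `initial`."""
    in_ = {1: frozenset(), 2: initial, 3: initial}
    changed = True
    while changed:
        changed = False
        for i in (2, 3):
            new = meet_of_preds(i, in_)
            if new != in_[i]:
                in_[i], changed = new, True
    return in_

def satisfies(in_):
    """True if in_ solves the dataflow equations."""
    return in_[1] == frozenset() and all(in_[i] == meet_of_preds(i, in_) for i in (2, 3))

from_universal = solve(U)
from_empty = solve(frozenset())
print(sorted(from_universal[2]), sorted(from_empty[2]))
```

Both results satisfy the equations, but only the run that starts from the universal set finds that the expression is available at node 2.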
29
Meet Over All Paths (MOP) Solution
[Diagram: a path from the entry node through n1, n2, ..., nk to n]
  • Desired dataflow information at n is obtained by
    traversing ALL PATHS from the entry node to n. For
    every path p = (entry, n1, n2, ..., nk) we compute
    fnk(...fn2(fn1(init(entry))))
  • The MOP at entry of n is
    ∨ fnk(...fn2(fn1(init(entry)))), joined over all
    paths p from the entry node to n
  • The MOP is the best summary of dataflow facts
    possible to compute with this static analysis

30
MOP vs. MFP
  • For distributive functions the dataflow analysis
    can merge paths (p1, p2), without loss of
    precision!
  • E.g., fp1(0) need not be calculated explicitly
  • MFP = MOP
  • Due to Kam and Ullman, 1976, 1977. This is not
    true for monotone functions.
  • Lemma 3: The MFP approximates the MOP for general
    monotone functions: MFP ≥ MOP
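
The two solutions can be compared on a small acyclic graph, where all paths can be enumerated. The sketch below is an illustration under assumptions: a diamond-shaped CFG (entry → A, entry → B, A → C, B → C, C → exit), join taken as set union, a gen/kill function at C for the distributive case, and the *p = q points-to function at C for the non-distributive case. All names are invented for the example.

```python
def targets(pt, v):
    return {dst for src, dst in pt if src == v}

def store(pt, p, q):
    """*p = q: no kill; add y -> x when p -> y and q -> x."""
    return frozenset(pt) | {(y, x) for y in targets(pt, p) for x in targets(pt, q)}

def gen_kill(gen, kill=()):
    """Distributive transfer function f(s) = (s - kill) ∪ gen."""
    return lambda s: (frozenset(s) - frozenset(kill)) | frozenset(gen)

SUCC = {"entry": ["A", "B"], "A": ["C"], "B": ["C"], "C": ["exit"], "exit": []}
ORDER = ["entry", "A", "B", "C", "exit"]  # a topological order of the CFG

BRANCHES = {
    "entry": lambda s: s,
    "A": gen_kill({("p", "x"), ("q", "y")}),
    "B": gen_kill({("p", "z"), ("q", "w")}),
    "exit": lambda s: s,
}
DISTRIBUTIVE = dict(BRANCHES, C=gen_kill({("r", "v")}, kill={("p", "x")}))
POINTS_TO = dict(BRANCHES, C=lambda s: store(s, "p", "q"))

def paths(src, dst):
    """Enumerate all paths from src to dst (the graph is acyclic)."""
    if src == dst:
        yield [src]
        return
    for nxt in SUCC[src]:
        for rest in paths(nxt, dst):
            yield [src] + rest

def mop(transfer, dst, init=frozenset()):
    """Join, over all paths, of the composed transfer functions along the path."""
    result = frozenset()
    for path in paths("entry", dst):
        value = init
        for node in path[:-1]:  # nodes strictly before dst
            value = transfer[node](value)
        result |= value
    return result

def mfp(transfer, dst, init=frozenset()):
    """One pass in topological order reaches the fixed point on an acyclic CFG."""
    in_ = {n: frozenset() for n in ORDER}
    in_["entry"] = init
    for node in ORDER:
        out = transfer[node](in_[node])
        for nxt in SUCC[node]:
            in_[nxt] = in_[nxt] | out
    return in_[dst]

print(mop(DISTRIBUTIVE, "exit") == mfp(DISTRIBUTIVE, "exit"))
print(sorted(mfp(POINTS_TO, "exit") - mop(POINTS_TO, "exit")))
```

With the gen/kill function at C the two solutions coincide. With the points-to function, merging at C before applying f adds the edges x → w and z → y, which no single path produces, so the MFP is strictly larger (less precise) than the MOP.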