Data Flow Analysis 3 15-411 Compiler Design - PowerPoint PPT Presentation

Provided by: Peter591

Learn more at: http://www.cs.cmu.edu

1
Data Flow Analysis 3: 15-411 Compiler Design
Nov. 8, 2005
2
Key Reference on Global Optimization

Gary A. Kildall, A Unified Approach to Global
Program Optimization, ACM Symposium on Principles
of Programming Languages, 1973, pages 194-206.
From the abstract:
A technique is presented for global analysis of
object code generated for expressions. The global
expression optimization presented includes
constant propagation, common sub-expression
elimination, elimination of redundant register
load operations and live expression analysis. A
general purpose program flow analysis algorithm
is developed which depends on an optimizing
function. The algorithm is defined formally using
a directed graph model of program flow structure
and is shown to be correct.

3
Kildall's Contribution

A number of techniques had been developed for
compile-time optimization to
- locate redundant computations,
- perform constant computations,
- reduce the number of store-load
sequences, etc.
Some provided analysis of only straight-line
sequences of instructions; others tried to take
program branching into account.
Kildall gave a single unified flow analysis
algorithm which extended all the straight-line
techniques to include branching.
He stated the algorithm formally and proved it
correct in his POPL paper.

4
Constant Propagation Example program

begin
  integer i, a, b, c, d, e;
  a := 1; c := 0;
  for i := 1 step 1 until 10 do
  begin b := 2;
    d := a + b;
    e := b + c;
    c := 4
  end
end

5
Directed Graph Representation
Nodes represent sequences of instructions with no
branches. Edges represent control flow between
nodes.
6
Constant Propagation

It is convenient to associate a pool of propagated
constants with each node in the graph.
The pool is a set of ordered pairs indicating which
variables have constant values when the node is
encountered.
The pool at node B, denoted $P_B$, consists of the
single element $(a, 1)$, since the assignment a := 1
must occur before B.

7
Constant Propagation (cont.)

Fundamental problem of constant propagation is to
determine the pool of constants for each node in
an arbitrary program graph.
By inspection of the program graph for the
example, the pool of constants at each node is
$P_A = \emptyset$
$P_B = \{(a, 1)\}$
$P_C = \{(a, 1)\}$
$P_D = \{(a, 1), (b, 2)\}$
$P_E = \{(a, 1), (b, 2), (d, 3)\}$
$P_F = \{(a, 1), (b, 2), (d, 3)\}$

8
Constant Propagation (cont.)

$P_N$ may be determined for each node N in the graph
as follows:
- Consider each path $(A, p_1, p_2, \ldots, p_n, N)$. Apply
constant propagation along the path to obtain the set
of constants at node N.
- The intersection of these sets over all paths to N is
the set of constants which can be assumed for optimization.
(It is unknown which path will be taken at
execution time, so the intersection is the conservative
choice.)

9
Global Analysis Algorithm--Informal

Start with an entry node in the program graph,
along with a given entry pool corresponding to
this entry node.
Process the entry node and produce optimization
information for all immediate successors of the
entry node.
Intersect incoming optimizing pools with the already
established pools at the successor nodes.
(The first time a node is encountered, assume the incoming
pool is the first approximation and continue
processing.)
For each successor, if the amount of optimizing
information is reduced by this intersection, then
process the successor like the initial entry node.

10
Global Analysis Algorithm (cont)

It is useful to define an optimizing function f
which maps an input pool together with a
particular node to a new output pool.
Given a set of propagated constants, it is
possible to examine the operation of a particular
node and determine the set of constants that can
be assumed after the node is executed.
In the case of constant propagation, let $V$ be a
set of variables, $C$ be a set of constants, and $\mathcal{N}$
be the set of nodes in the graph.
The set $U = V \times C$ represents ordered pairs which
may appear in any constant pool.
In fact, all constant pools are elements of the
power set of $U$, denoted $\mathcal{P}(U)$.
Thus, $f : \mathcal{N} \times \mathcal{P}(U) \to \mathcal{P}(U)$, where $(v, c) \in f(N, P)$
if and only if either
(cont.)

11
Global Analysis Algorithm (cont.)

1. $(v, c) \in P$ and the operation at node N
does not assign a new value to the variable v, or
2. the operation at N assigns an
expression to the variable v, and the expression
evaluates to the constant c.
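
The two conditions can be sketched in Python as follows. This is a minimal illustration, not the paper's formulation: the encoding of a node as a single assignment `(target, expr)`, the tuple form `("+", left, right)` for sums, and the helper name `evaluate` are all assumptions made for this example.

```python
from typing import FrozenSet, Optional, Tuple, Union

# Hypothetical encoding (not from the slides): an expression is an int
# literal, a variable name, or a tuple ("+", left, right); a node is a
# single assignment (target, expr). A pool is a set of (variable, constant).
Expr = Union[int, str, tuple]
Pool = FrozenSet[Tuple[str, int]]


def evaluate(expr: Expr, env: dict) -> Optional[int]:
    """Return the constant value of expr under env, or None if unknown."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr)
    op, left, right = expr
    lhs, rhs = evaluate(left, env), evaluate(right, env)
    if op == "+" and lhs is not None and rhs is not None:
        return lhs + rhs
    return None


def f(node: Tuple[str, Expr], pool: Pool) -> Pool:
    """Optimizing function: the pool after node, given the pool before it."""
    target, expr = node
    # Condition 1: pairs for variables the node does not assign survive.
    out = {(v, c) for (v, c) in pool if v != target}
    # Condition 2: the assigned expression evaluates to a constant.
    value = evaluate(expr, dict(pool))
    if value is not None:
        out.add((target, value))
    return frozenset(out)


# Applying f to the assignments a := 1 and then c := 0.
after_a = f(("a", 1), frozenset())
after_c = f(("c", 0), after_a)
```

Pools are frozensets so they can be intersected and compared directly; an assignment whose right-hand side is not a known constant simply drops the target from the pool.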

12
Constant Propagation (cont.)

Successively longer paths from A to D can be
evaluated, resulting in $P_{D,3}, P_{D,4}, \ldots, P_{D,n}$ for
arbitrarily large n.
The pool of constants that can be assumed no
matter what flow of control occurs is the set of
constants common to all $P_{D,i}$, i.e.
$\bigcap_i P_{D,i}$
This procedure is not effective since the number
of such paths may have no finite bound, and the
procedure would not halt.

13
Optimization Function for Example

The optimizing function can be applied to node A
with an empty constant pool, resulting in
$f(A, \emptyset) = \{(a, 1)\}$.
The function can be applied to B with $\{(a, 1)\}$ as
the constant pool, yielding
$f(B, \{(a, 1)\}) = \{(a, 1), (c, 0)\}$.

14
Extending f to Paths in the Graph

Given a path from entry node A to an arbitrary
node N, the optimizing pool for the path is determined by
composing the function f.
For example,
$f(C, f(B, f(A, \emptyset))) = \{(a, 1), (c, 0), (b, 2)\}$
is the constant pool for D for this path.

15
Constant Propagation (cont.)

The pool of propagated constants at node D can be
determined as follows.
A path from entry node A to the node D is (A, B,
C, D). For this path the first approximation to
the pool for D is
$P_{D,1} = \{(a, 1), (b, 2), (c, 0)\}$.
A longer path from A to D is (A, B, C, D, E, F,
C, D), which results in the pool
$P_{D,2} = \{(a, 1), (b, 2), (c, 4), (d, 3), (e, 2)\}$.
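
The two path pools can be reproduced with a short Python sketch. The node labelling below (A: a := 1, B: c := 0, C: b := 2, D: d := a + b, E: e := b + c, F: c := 4) is an assumption inferred from the pools quoted on the slides, and `NODES` and `pool_along` are hypothetical names introduced for this example.

```python
from functools import reduce

# Assumed node labelling, inferred from the pools quoted on the slides.
# Each node is one assignment (target, operands); the right-hand side is
# the sum of the operands, each an int literal or a variable name.
NODES = {
    "A": ("a", (1,)),
    "B": ("c", (0,)),
    "C": ("b", (2,)),
    "D": ("d", ("a", "b")),
    "E": ("e", ("b", "c")),
    "F": ("c", (4,)),
}


def f(name, pool):
    """Compact optimizing function for the example nodes."""
    target, operands = NODES[name]
    env = dict(pool)
    values = [x if isinstance(x, int) else env.get(x) for x in operands]
    # Keep constants for variables this node does not assign.
    out = {(v, c) for (v, c) in pool if v != target}
    # Record the target only if every operand is a known constant.
    if None not in values:
        out.add((target, sum(values)))
    return frozenset(out)


def pool_along(path, entry_pool=frozenset()):
    """Compose f over every node of the path except the last.

    The result is the pool on arrival at the last node of the path.
    """
    return reduce(lambda pool, name: f(name, pool), path[:-1], entry_pool)


p_d1 = pool_along("ABCD")        # path (A, B, C, D)
p_d2 = pool_along("ABCDEFCD")    # path (A, B, C, D, E, F, C, D)
common = p_d1 & p_d2             # constants valid on both paths
```

Intersecting just these two path pools already leaves only (a, 1) and (b, 2), since c takes different values on the two paths and d and e are not yet defined on the shorter one.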

16
Computing the Pool of Optimizing Information.

The pool of optimizing information which can be
assumed at node N in the graph, independent of
the path taken at execution time, is
$P_N = \bigcap_{x \in F_N} x$.
Here
$F_N = \{ f(p_n, f(p_{n-1}, \ldots, f(p_1, P))) \mid (p_1, p_2, \ldots, p_n, N)$ is a path from an entry node $p_1$
with corresponding entry pool $P$ to node $N \}$.

17
Directed Graphs and Paths

A finite directed graph $G = \langle \mathcal{N}, E \rangle$ is an arbitrary
finite set of nodes $\mathcal{N}$ and edges $E \subseteq \mathcal{N} \times \mathcal{N}$.
A path from node A to node B in G is a sequence
$(p_1, p_2, \ldots, p_k)$ such that $p_1 = A$ and $p_k = B$,
where $(p_i, p_{i+1}) \in E$ for $1 \le i < k$.
The length of the path is $k - 1$.

18
Program Graphs

A program graph is a finite directed graph G with
a non-empty set of entry nodes $\mathcal{I} \subseteq \mathcal{N}$.
Given $N \in \mathcal{N}$ we assume there exists a path $(p_1,
p_2, \ldots, p_n)$ such that $p_1 \in \mathcal{I}$ and $p_n = N$
(i.e., there is a path to every node in the graph
from an entry node).

19
Successors and Predecessors of a Node

The set of immediate successors of a node N is
given by
$I(N) = \{ N' \in \mathcal{N} \mid (N, N') \in E \}$.
The set of immediate predecessors of N is given
by
$I^{-1}(N) = \{ N' \in \mathcal{N} \mid (N', N) \in E \}$.
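
Both sets can be read straight off the edge set, as in this small Python sketch. The edge set `E` is an assumption: it models the example program as a straight line A through F plus the loop back edge from F to C, and the function names are hypothetical.

```python
# Assumed edge set for the example graph: a straight line A..F plus the
# loop back edge F -> C (inferred from the path A, B, C, D, E, F, C, D).
E = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")}


def successors(node, edges):
    """Immediate successors: every N' with an edge (node, N')."""
    return {dst for (src, dst) in edges if src == node}


def predecessors(node, edges):
    """Immediate predecessors: every N' with an edge (N', node)."""
    return {src for (src, dst) in edges if dst == node}


succ_f = successors("F", E)
pred_c = predecessors("C", E)
```

Node C has two predecessors, B and F, which is why its pool is the meet of two incoming pools.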

20
Meet-Semilattices

Let the finite set L be the set of all possible
optimizing pools for a given application.
Let $\wedge$ be a meet operation with the properties
$\wedge : L \times L \to L$
$x \wedge y = y \wedge x$
$x \wedge (y \wedge z) = (x \wedge y) \wedge z$
where $x, y, z \in L$. The set $L$ and the $\wedge$ operation
define a finite meet-semilattice.

21
Ordering on Meet-Semilattices

The $\wedge$ operation defines a partial ordering on $L$
by
$x \le y$ if and only if $x \wedge y = x$.
Similarly,
$x < y$ if and only if $x \le y$ and $x \ne y$.
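
As a worked instance, assume the meet is set intersection on constant pools, as in the constant propagation example. The ordering then coincides with set inclusion:

```latex
\[
x = \{(a,1)\}, \qquad y = \{(a,1),(b,2)\}
\]
\[
x \wedge y \;=\; x \cap y \;=\; \{(a,1)\} \;=\; x
\quad\Longrightarrow\quad x \le y,
\qquad\text{and since } x \ne y,\quad x < y .
\]
```

A smaller pool carries less optimizing information, so moving down in the ordering corresponds to losing information.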

22
Generalized Meet Operation

If $X \subseteq L$, the generalized meet operation $\bigwedge X$ is
defined as the pairwise application of $\wedge$ to the
elements of $X$.
$L$ is assumed to have a zero element $\mathbf{0}$ such that
$\mathbf{0} \le x$ for all $x \in L$.
An augmented set $L'$ is constructed from $L$ by
adding a unit element $\mathbf{1}$ such that $\mathbf{1}$ is not in $L$
and $\mathbf{1} \wedge x = x$ for all $x$ in $L$.
The set $L' = L \cup \{\mathbf{1}\}$. It follows that $x < \mathbf{1}$ for
all $x$ in $L$.
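
A minimal Python sketch of the augmented set, assuming pools are frozensets and the meet is set intersection (the constant propagation instance). The sentinel `ONE`, the class `Unit`, and the function names are hypothetical; the empty pool plays the role of the zero element under this assumption.

```python
from functools import reduce


class Unit:
    """Hypothetical sentinel for the unit element 1 of the augmented set."""

    def __repr__(self):
        return "1"


ONE = Unit()


def meet(x, y):
    """Meet on the augmented set: 1 is the identity, otherwise intersect."""
    if x is ONE:
        return y
    if y is ONE:
        return x
    return x & y


def generalized_meet(pools):
    """Pairwise application of meet to all pools (1 if there are none)."""
    return reduce(meet, pools, ONE)


def leq(x, y):
    """x <= y if and only if meet(x, y) == x."""
    return meet(x, y) == x


p = frozenset({("a", 1), ("b", 2)})
q = frozenset({("a", 1), ("c", 0)})
both = generalized_meet([p, q])
```

Returning `ONE` for an empty collection is a convention chosen here so that `reduce` has a starting value; it matches the use of 1 as the initial pool of a node that has not been reached yet.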

23
Optimizing Function

An optimizing function f is defined
$f : \mathcal{N} \times L \to L$.
It must have the homomorphism property
$f(N, x \wedge y) = f(N, x) \wedge f(N, y)$ for all $N \in \mathcal{N}$ and
$x, y \in L$.
Note that $f(N, x) < \mathbf{1}$ for all $N \in \mathcal{N}$ and $x \in L$.

24
Global Analysis Algorithm

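Global analysis starts with an entry pool set $EP
\subseteq \mathcal{I} \times L$, where $(e, x) \in EP$ if $e \in \mathcal{I}$ is an entry
node with optimizing pool $x \in L$.
(In steps A1 to A6 below, $L$ names the work list of
(node, pool) pairs still to be processed.)
A1 [initialize] $L \leftarrow EP$.
A2 [terminate?] If $L = \emptyset$ then halt.
A3 [select node] Let $L' \in L$, $L' = (N, P_i)$ for
some $N \in \mathcal{N}$ and pool $P_i$.
Then $L \leftarrow L - \{L'\}$.
A4 [traverse?] Let $P_N$ be the current
approximate pool for node N
(initially $P_N = \mathbf{1}$). If
$P_N \le P_i$ then go to step A2.
A5 [set pool] $P_N \leftarrow P_N \wedge P_i$, $L \leftarrow L \cup \{ (N',
f(N, P_N)) \mid N' \in I(N) \}$.
A6 [loop] Go to step A2.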
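
Steps A1 to A6 can be sketched in Python for the constant propagation example. This is an illustration under stated assumptions, not a definitive implementation: the node labelling and successor map are inferred from the pools quoted on the slides, the meet is set intersection, the work list is processed first-in first-out (the algorithm allows any selection order), and `TOP`, `NODES`, `SUCC`, and `analyze` are hypothetical names.

```python
from collections import deque

TOP = object()  # stands in for the unit element 1 (node not yet reached)

# Assumed example graph: one assignment per node, right-hand side is the
# sum of the operands (int literals or variable names).
NODES = {
    "A": ("a", (1,)), "B": ("c", (0,)), "C": ("b", (2,)),
    "D": ("d", ("a", "b")), "E": ("e", ("b", "c")), "F": ("c", (4,)),
}
SUCC = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["F"], "F": ["C"]}


def f(name, pool):
    """Optimizing function for constant propagation on the example nodes."""
    target, operands = NODES[name]
    env = dict(pool)
    values = [x if isinstance(x, int) else env.get(x) for x in operands]
    out = {(v, c) for (v, c) in pool if v != target}
    if None not in values:
        out.add((target, sum(values)))
    return frozenset(out)


def analyze(entry_pools):
    """Return the pool on entry to each node, following steps A1 to A6."""
    work = deque(entry_pools)                    # A1: work list <- EP
    pools = {name: TOP for name in NODES}        # every P_N starts at 1
    while work:                                  # A2: halt when empty
        node, incoming = work.popleft()          # A3: select and remove
        current = pools[node]
        # A4: skip if P_N <= P_i (subset test, since meet is intersection).
        if current is not TOP and current <= incoming:
            continue
        # A5: P_N <- P_N meet P_i, then propagate f(N, P_N) to successors.
        pools[node] = incoming if current is TOP else current & incoming
        outgoing = f(node, pools[node])
        work.extend((succ, outgoing) for succ in SUCC[node])
    return pools                                 # A6 is the loop itself


result = analyze([("A", frozenset())])
```

On this graph the loop body is traversed twice before the pools stop shrinking, and the final pools agree with the ones found by inspection: for example, the pool at D is (a, 1), (b, 2).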