Title: Data Flow Analysis 3 15-411 Compiler Design
1 Data Flow Analysis 3, 15-411 Compiler Design
Nov. 8, 2005
2 Key Reference on Global Optimization
- Gary A. Kildall, A Unified Approach to Global
Program Optimization, ACM Symposium on Principles
of Programming Languages, 1973, pages 194-206.
- From the abstract:
  - A technique is presented for global analysis of
object code generated for expressions. The global
expression optimization presented includes
constant propagation, common sub-expression
elimination, elimination of redundant register
load operations and live expression analysis. A
general purpose program flow analysis algorithm
is developed which depends on an optimizing
function. The algorithm is defined formally using
a directed graph model of program flow structure
and is shown to be correct.
3 Kildall's Contribution
- A number of techniques had been developed for
compile-time optimization to
  - locate redundant computations,
  - perform constant computations,
  - reduce the number of store-load sequences, etc.
- Some provided analysis of only straight-line
sequences of instructions; others tried to take
program branching into account.
- Kildall gave a single unified flow analysis
algorithm which extended all the straight-line
techniques to include branching.
- He stated the algorithm formally and proved it
correct in his POPL paper.
4 Constant Propagation Example Program
- begin
-   integer i, a, b, c, d, e;
-   a := 1; c := 0;
-   for i := 1 step 1 until 10 do
-   begin b := 2;
-     d := a + b;
-     e := b + c;
-     c := 4;
-   end
- end
5 Directed Graph Representation
Nodes represent sequences of instructions with no
branches. Edges represent control flow between
nodes.
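As a rough illustration, the example's program graph can be written down as a small adjacency structure. This is only a sketch: the node labels A to F and the one-assignment-per-node layout are inferred from the pools and paths quoted elsewhere in these slides, the names `NODES`, `EDGES` and `SUCC` are made up for illustration, and the loop counter `i` and the loop-exit edge are left out.

```python
# Each node is a branch-free instruction sequence (here: one assignment).
# Labels and statements are inferred from the pools quoted in the slides.
NODES = {
    "A": "a := 1",
    "B": "c := 0",
    "C": "b := 2",
    "D": "d := a + b",
    "E": "e := b + c",
    "F": "c := 4",
}

# Each edge is a possible transfer of control; F -> C is the loop back edge.
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")}

# Adjacency view: node -> sorted list of nodes that control can reach next.
SUCC = {n: sorted(dst for src, dst in EDGES if src == n) for n in NODES}

print(SUCC["F"])  # ['C']
```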
6 Constant Propagation
- It is convenient to associate a pool of propagated
constants with each node in the graph.
- A pool is a set of ordered pairs which indicate
variables that have constant values when the node is
encountered.
- The pool at node B, denoted by $P_B$, consists of a
single element $(a, 1)$, since the assignment a := 1
must occur before B.
7 Constant Propagation (cont.)
- The fundamental problem of constant propagation is to
determine the pool of constants for each node in
an arbitrary program graph.
- By inspection of the program graph for the
example, the pool of constants at each node is
  - $P_A = \emptyset$, $P_B = \{(a, 1)\}$, $P_C = \{(a, 1)\}$, $P_D = \{(a, 1), (b, 2)\}$
  - $P_E = \{(a, 1), (b, 2), (d, 3)\}$, $P_F = \{(a, 1), (b, 2), (d, 3)\}$
8 Constant Propagation (cont.)
- $P_N$ may be determined for each node N in the graph
as follows:
  - Consider each path $(A, p_1, p_2, \ldots, p_n, N)$. Apply
  constant propagation along the path to obtain the set
  of constants at node N.
  - The intersection over all paths to N is the set of
  constants which can be assumed for optimization.
- (It is unknown which path will be taken at
execution time, so the intersection is the conservative
choice.)
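The intersection step can be sketched with ordinary set operations. The two pools below are the ones these slides derive for node D along the paths (A, B, C, D) and (A, B, C, D, E, F, C, D). Representing a pool as a `frozenset` of pairs, and the helper name `safe_pool`, are assumptions made for illustration.

```python
# Pools reaching node D along two different paths (values from the example).
pool_short = frozenset({("a", 1), ("b", 2), ("c", 0)})
pool_long = frozenset({("a", 1), ("b", 2), ("c", 4), ("d", 3), ("e", 2)})

def safe_pool(pools):
    """Keep only the (variable, constant) pairs present on every path.

    Expects at least one pool; an empty list raises TypeError.
    """
    return frozenset.intersection(*pools)

print(sorted(safe_pool([pool_short, pool_long])))  # [('a', 1), ('b', 2)]
```

The pair for c is dropped because the two paths disagree on its value, and the pairs for d and e are dropped because the shorter path has not assigned them yet.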
9 Global Analysis Algorithm--Informal
- Start with an entry node in the program graph,
along with a given entry pool corresponding to
this entry node.
- Process the entry node and produce optimization
information for all immediate successors of the
entry node.
- Intersect incoming optimizing pools with already
established pools at the successor nodes.
- (The first time a node is encountered, assume the incoming
pool is the first approximation and continue
processing.)
- For each successor, if the amount of optimizing
information is reduced by this intersection, then
process the successor like the initial entry node.
10 Global Analysis Algorithm (cont.)
- It is useful to define an optimizing function f
which maps an input pool together with a
particular node to a new output pool.
- Given a set of propagated constants, it is
possible to examine the operation of a particular
node and determine the set of constants that can
be assumed after the node is executed.
- In the case of constant propagation, let V be a
set of variables, C be a set of constants, and $\mathcal{N}$
be the set of nodes in the graph.
- The set $U = V \times C$ represents ordered pairs which
may appear in any constant pool.
- In fact, all constant pools are elements of the
power set of U, denoted $\mathcal{P}(U)$.
- Thus, $f : \mathcal{N} \times \mathcal{P}(U) \to \mathcal{P}(U)$, where $(v, c) \in f(N, P)$
if and only if one of the following holds (cont.):
11 Global Analysis Algorithm (cont.)
- 1. $(v, c) \in P$ and the operation at node N
does not assign a new value to the variable v.
- 2. The operation at N assigns an
expression to the variable v, and the expression
evaluates to the constant c.
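The two conditions translate fairly directly into code. The following is a minimal sketch, not the formulation from the paper. It assumes that a node's operation is a single assignment encoded as a hypothetical `(target, expression)` pair, that an expression is an integer, a variable name, or `("+", left, right)`, and that a pool is a `frozenset` of `(variable, constant)` pairs.

```python
def evaluate(expr, env):
    """Return the constant value of expr under env, or None if it is unknown."""
    if isinstance(expr, int):          # literal constant
        return expr
    if isinstance(expr, str):          # variable: constant only if in the pool
        return env.get(expr)
    _, left, right = expr              # ("+", left, right) is the only operator here
    lv, rv = evaluate(left, env), evaluate(right, env)
    return None if lv is None or rv is None else lv + rv

def f(operation, pool):
    """Optimizing function for constant propagation at one node."""
    target, expr = operation
    # Condition 1: keep pairs whose variable is not reassigned by this node.
    result = {(v, c) for v, c in pool if v != target}
    # Condition 2: add the assigned variable if its expression is constant.
    value = evaluate(expr, dict(pool))
    if value is not None:
        result.add((target, value))
    return frozenset(result)

print(sorted(f(("a", 1), frozenset())))                  # [('a', 1)]
print(sorted(f(("c", 0), frozenset({("a", 1)}))))        # [('a', 1), ('c', 0)]
```

If the assigned expression is not constant under the input pool, the target variable simply drops out of the output pool.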
12 Constant Propagation (cont.)
- Successively longer paths from A to D can be
evaluated, resulting in $P_{D,3}, P_{D,4}, \ldots, P_{D,n}$ for
arbitrarily large n.
- The pool of constants that can be assumed no
matter what flow of control occurs is the set of
constants common to all $P_{D,i}$, i.e.
  - $\bigcap_i P_{D,i}$
- This procedure is not effective since the number
of such paths may have no finite bound, and the
procedure would not halt.
13 Optimization Function for Example
- The optimizing function can be applied to node A
with an empty constant pool, resulting in
  - $f(A, \emptyset) = \{(a, 1)\}$.
- The function can be applied to B with $\{(a, 1)\}$ as
the constant pool, yielding
  - $f(B, \{(a, 1)\}) = \{(a, 1), (c, 0)\}$.
14 Extending f to Paths in the Graph
- Given a path from entry node A to an arbitrary
node N, the optimizing pool for the path is determined by
composing the function f.
- For example, $f(C, f(B, f(A, \emptyset))) = \{(a, 1), (c, 0), (b, 2)\}$
is the constant pool for D for this
path.
15 Constant Propagation (cont.)
- The pool of propagated constants at node D can be
determined as follows:
- A path from entry node A to the node D is (A, B,
C, D). For this path the first approximation to
the pool for D is
  - $P_{D,1} = \{(a, 1), (b, 2), (c, 0)\}$.
- A longer path from A to D is (A, B, C, D, E, F,
C, D), which results in the pool
  - $P_{D,2} = \{(a, 1), (b, 2), (c, 4), (d, 3), (e, 2)\}$.
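Both pools can be reproduced by composing f along each path. The sketch below makes a few assumptions: every node holds one assignment, the table `OPS` encodes each assignment as a hypothetical `(target, expression)` pair, and the helper names `evaluate` and `pool_along` are chosen for illustration.

```python
from functools import reduce

# One assignment per node: (target, expression). An expression is an int,
# a variable name, or ("+", left, right). This encoding is an assumption.
OPS = {
    "A": ("a", 1), "B": ("c", 0), "C": ("b", 2),
    "D": ("d", ("+", "a", "b")), "E": ("e", ("+", "b", "c")), "F": ("c", 4),
}

def evaluate(expr, env):
    """Constant value of expr under env, or None if it is not known."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr)
    _, left, right = expr
    lv, rv = evaluate(left, env), evaluate(right, env)
    return None if lv is None or rv is None else lv + rv

def f(node, pool):
    """Optimizing function: pool after executing the assignment at node."""
    target, expr = OPS[node]
    value = evaluate(expr, dict(pool))
    kept = {(v, c) for v, c in pool if v != target}
    if value is not None:
        kept.add((target, value))
    return frozenset(kept)

def pool_along(path, entry_pool=frozenset()):
    """Pool on arrival at path[-1]: f applied to every earlier node, in order."""
    return reduce(lambda pool, node: f(node, pool), path[:-1], entry_pool)

print(sorted(pool_along("ABCD")))
# [('a', 1), ('b', 2), ('c', 0)]
print(sorted(pool_along("ABCDEFCD")))
# [('a', 1), ('b', 2), ('c', 4), ('d', 3), ('e', 2)]
```

The last node of the path is not executed, because the pool describes what is known on entry to that node.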
16 Computing the Pool of Optimizing Information
- The pool of optimizing information which can be
assumed at node N in the graph, independent of
the path taken at execution time, is
  - $P_N = \bigwedge \{ x \mid x \in F_N \}$.
- Here $F_N = \{ f(p_n, f(p_{n-1}, \ldots, f(p_1, P))) \mid (p_1, p_2, \ldots, p_n, N)$
is a path from an entry node $p_1$
with corresponding entry pool P to node N $\}$.
17 Directed Graphs and Paths
- A finite directed graph $G = \langle \mathcal{N}, E \rangle$ is an arbitrary
finite set of nodes $\mathcal{N}$ and edges $E \subseteq \mathcal{N} \times \mathcal{N}$.
- A path from node A to node B in G is a sequence
$(p_1, p_2, \ldots, p_k)$ such that $p_1 = A$ and $p_k = B$,
where $(p_i, p_{i+1}) \in E$ for $1 \le i < k$.
- The length of the path is $k - 1$.
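The path definition can be checked mechanically. This sketch assumes edges are stored as a set of (source, destination) pairs, with the edge set of the running example used as test data. The function names are illustrative.

```python
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")}

def is_path(seq, edges):
    """True if seq is non-empty and every consecutive pair (p_i, p_{i+1}) is an edge."""
    return len(seq) > 0 and all(step in edges for step in zip(seq, seq[1:]))

def path_length(seq):
    """A path with k nodes has length k - 1."""
    return len(seq) - 1

short = ("A", "B", "C", "D")
print(is_path(short, EDGES), path_length(short))  # True 3
print(is_path(("A", "C", "D"), EDGES))            # False
```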
18 Program Graphs
- A program graph is a finite directed graph G with
a non-empty set of entry nodes $I \subseteq \mathcal{N}$.
- Given $N \in \mathcal{N}$, we assume there exists a path $(p_1,
p_2, \ldots, p_n)$ such that $p_1 \in I$ and $p_n = N$.
- (I.e., there is a path to every node in the graph
from an entry node.)
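One way to test the reachability assumption is a breadth-first search from the entry nodes. This is a sketch under the assumption that the graph is given as a node set plus a set of edge pairs. `reachable_from` and `is_program_graph` are hypothetical helper names.

```python
from collections import deque

def reachable_from(entries, edges):
    """All nodes reachable from some entry node (entries included)."""
    seen = set(entries)
    queue = deque(entries)
    while queue:
        node = queue.popleft()
        for src, dst in edges:
            if src == node and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen

def is_program_graph(nodes, edges, entries):
    """Non-empty entry set inside nodes, and every node reachable from it."""
    entries = set(entries)
    return (bool(entries)
            and entries <= set(nodes)
            and reachable_from(entries, edges) == set(nodes))

NODES = set("ABCDEF")
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")}
print(is_program_graph(NODES, EDGES, {"A"}))  # True
print(is_program_graph(NODES, EDGES, {"C"}))  # False: A and B are unreachable
```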
19 Successors and Predecessors of a Node
- The set of immediate successors of a node N is
given by
  - $I(N) = \{ N' \in \mathcal{N} \mid \exists\, (N, N') \in E \}$.
- The set of immediate predecessors of N is given
by
  - $I^{-1}(N) = \{ N' \in \mathcal{N} \mid \exists\, (N', N) \in E \}$.
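Both sets are one-line comprehensions over the edge set. The sketch below uses the edges of the running example as sample data, and the function names are illustrative.

```python
EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "C")}

def successors(node, edges):
    """I(N): nodes N' such that (N, N') is an edge."""
    return {dst for src, dst in edges if src == node}

def predecessors(node, edges):
    """I^{-1}(N): nodes N' such that (N', N) is an edge."""
    return {src for src, dst in edges if dst == node}

print(sorted(successors("C", EDGES)))    # ['D']
print(sorted(predecessors("C", EDGES)))  # ['B', 'F']
```

Node C has two predecessors because it is the loop header: control arrives from B on entry and from F on each later iteration.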
20 Meet-Semilattices
- Let the finite set L be the set of all possible
optimizing pools for a given application.
- Let $\wedge$ be a meet operation with the properties
  - $\wedge : L \times L \to L$
  - $x \wedge y = y \wedge x$
  - $x \wedge (y \wedge z) = (x \wedge y) \wedge z$
- where $x, y, z \in L$. The set L and the $\wedge$ operation
define a finite meet-semilattice.
21 Ordering on Meet-Semilattices
- The $\wedge$ operation defines a partial ordering on L
by
  - $x \le y$ if and only if $x \wedge y = x$.
- Similarly,
  - $x < y$ if and only if $x \le y$ and $x \ne y$.
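For constant propagation the meet can be taken to be set intersection, which matches the intersection of pools used earlier in these slides. The sketch below assumes pools are `frozenset` values and spot-checks the stated properties on three sample pools. It is an illustration, not a proof.

```python
def meet(x, y):
    """Meet of two constant pools: the pairs both pools agree on."""
    return x & y

def leq(x, y):
    """x <= y  if and only if  meet(x, y) == x."""
    return meet(x, y) == x

def lt(x, y):
    """x < y  if and only if  x <= y and x != y."""
    return leq(x, y) and x != y

x = frozenset({("a", 1)})
y = frozenset({("a", 1), ("b", 2)})
z = frozenset({("b", 2), ("c", 0)})

assert meet(x, y) == meet(y, x)                     # commutative
assert meet(x, meet(y, z)) == meet(meet(x, y), z)   # associative
print(leq(x, y), lt(x, y), leq(y, z))  # True True False
```

With this choice, a smaller pool in the ordering is one that asserts fewer constants, so moving down the ordering loses information.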
22 Generalized Meet Operation
- If $X \subseteq L$, the generalized meet operation $\bigwedge X$ is
defined as the pairwise application of $\wedge$ to the
elements of X.
- L is assumed to have a zero element $\mathbf{0}$ such that
$\mathbf{0} \le x$ for all $x \in L$.
- An augmented set $L'$ is constructed from L by
adding a unit element $\mathbf{1}$ such that $\mathbf{1}$ is not in L
and $\mathbf{1} \wedge x = x$ for all x in L.
- The set $L' = L \cup \{\mathbf{1}\}$. It follows that $x < \mathbf{1}$ for
all x in L.
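The unit element gives the generalized meet a natural starting value. The following sketch folds the binary meet over a collection of pools. The sentinel `ONE`, the constant `ZERO`, and the use of set intersection as the meet are assumptions that fit constant propagation.

```python
from functools import reduce

ONE = object()       # hypothetical sentinel for the unit element 1 (not a pool)
ZERO = frozenset()   # zero element for constant pools: meet(ZERO, x) == ZERO

def meet(x, y):
    """Binary meet on the augmented set: ONE is the identity."""
    if x is ONE:
        return y
    if y is ONE:
        return x
    return x & y

def meet_all(pools):
    """Generalized meet: fold the binary meet over X, starting from ONE."""
    return reduce(meet, pools, ONE)

p = frozenset({("a", 1), ("b", 2), ("c", 0)})
q = frozenset({("a", 1), ("b", 2), ("c", 4)})
print(sorted(meet_all([p, q])))  # [('a', 1), ('b', 2)]
print(meet_all([]) is ONE)       # True
```

Starting the fold from `ONE` means a node with no incoming information yet is handled uniformly, without a special case for the first pool.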
23 Optimizing Function
- An optimizing function f is defined as
  - $f : \mathcal{N} \times L \to L$.
- It must have the homomorphism property
  - $f(N, x \wedge y) = f(N, x) \wedge f(N, y)$ for all $N \in \mathcal{N}$ and
  $x, y \in L$.
- Note that $f(N, x) < \mathbf{1}$ for all $N \in \mathcal{N}$ and $x \in L$.
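24 Global Analysis Algorithm
- Global analysis starts with an entry pool set $EP
\subseteq I \times L$, where $(e, x) \in EP$ if $e \in I$ is an entry
node with optimizing pool $x \in L$. The algorithm maintains a work list
$\mathbf{L}$ of (node, pool) pairs, which is distinct from the set of pools L.
- A1 [initialize] $\mathbf{L} \leftarrow EP$.
- A2 [terminate?] If $\mathbf{L} = \emptyset$ then halt.
- A3 [select node] Select an element $(N, P_i) \in \mathbf{L}$, for
some $N \in \mathcal{N}$ and $P_i \in L$. Then $\mathbf{L} \leftarrow \mathbf{L} - \{(N, P_i)\}$.
- A4 [traverse] Let $P_N$ be the current
approximate pool for node N (initially $P_N = \mathbf{1}$). If
$P_N \le P_i$ then go to step A2.
- A5 [set pool] $P_N \leftarrow P_N \wedge P_i$, $\mathbf{L} \leftarrow \mathbf{L} \cup \{ (N',
f(N, P_N)) \mid N' \in I(N) \}$.
- A6 [loop] Go to step A2.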
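Steps A1 to A6 can be sketched as a work-list loop and run on the constant propagation example. Everything concrete here is an assumption made for illustration: one assignment per node, the `OPS` and `SUCC` tables, `None` standing in for the unit element, set intersection as the meet, and a stack as the selection policy in step A3 (the algorithm leaves the choice open).

```python
ONE = None  # stands in for the unit element 1 (an assumption of this sketch)

# (target, expression) per node; successor lists give I(N).
OPS = {
    "A": ("a", 1), "B": ("c", 0), "C": ("b", 2),
    "D": ("d", ("+", "a", "b")), "E": ("e", ("+", "b", "c")), "F": ("c", 4),
}
SUCC = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["E"], "E": ["F"], "F": ["C"]}

def evaluate(expr, env):
    """Constant value of expr under env, or None if it is not known."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr)
    _, left, right = expr
    lv, rv = evaluate(left, env), evaluate(right, env)
    return None if lv is None or rv is None else lv + rv

def f(node, pool):
    """Optimizing function: pool after executing the assignment at node."""
    target, expr = OPS[node]
    value = evaluate(expr, dict(pool))
    kept = {(v, c) for v, c in pool if v != target}
    if value is not None:
        kept.add((target, value))
    return frozenset(kept)

def meet(x, y):
    """Set intersection, with ONE acting as the identity."""
    if x is ONE:
        return y
    if y is ONE:
        return x
    return x & y

def analyze(entry_pools):
    work = list(entry_pools)                   # A1: work list <- EP
    pools = {n: ONE for n in SUCC}             # P_N = 1 initially
    while work:                                # A2: halt when the list is empty
        node, p_i = work.pop()                 # A3: select and remove an element
        if meet(pools[node], p_i) == pools[node]:
            continue                           # A4: P_N <= P_i, nothing new
        pools[node] = meet(pools[node], p_i)   # A5: P_N <- P_N meet P_i ...
        out = f(node, pools[node])
        work.extend((s, out) for s in SUCC[node])  # ... and queue successors
    return pools                               # A6 is the while loop itself

result = analyze([("A", frozenset())])
for n in "ABCDEF":
    print(n, sorted(result[n]))
```

On this input the loop visits C, D, E and F twice and then stops, and the resulting pools agree with the ones obtained by inspection for the example.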