Title: Introduction to Scalar Flow Analysis
1Introduction to Scalar Flow Analysis
- Basic Blocks
- Basic Block Optimizations
- (Control) Flow Graph
2Why Flow Analysis?
      if ( a == 3 ) go to next
      b = a + d
      x = 6
      a = 3
next: b = a + d
- How can the compiler determine whether a will always be 3 in the last statement?
- Do we really need to recompute the value of b?
- Is x ever used, or could we delete this assignment?
3Why Flow Analysis?
- Get precise information on the usage of variables and the availability of values throughout a procedure.
- Precise: we can be sure that the facts it provides are correct.
- It may not find out everything that is true and relevant.
- Then we can decide whether we can replace a with the value 3, remove the code that recomputes b, or remove the assignment to x.
Scalar: we do not investigate details of array element usage.
4Flow Analysis
- Control flow analysis
  - discover the hierarchical flow of control through a procedure
- Data flow analysis
  - discover how data is manipulated throughout a procedure
Intraprocedural analysis. We look at
interprocedural analysis much later. It is a good
deal harder.
5Call Graph
6Control Flow Graph
7Example
What variables can we analyze?
      program main
      real a ( 5 ), b, c
      integer i1, i2
      common i1, i2
      .
      a ( i1 ) = 1
      call sub ( b, c )
      i1 = a ( i2 + 1 ) - 1
      call sub ( c, b )
      end

      subroutine sub ( f1, f2 )
      real f1, f2
      integer i1, i2, i3, i4
      common i1, i2
      .
      i3 = ...
      i2 = i3 + 2
      .
      end
8Basic Blocks
A basic block is a sequence of consecutive
intermediate language statements in which flow
of control can only enter at the beginning and
leave at the end.
- Only the last statement of a basic block can be a branch statement.
- Only the first statement of a basic block can be the target of a branch.
- Procedure calls may occur within a basic block.
9Finding Basic Blocks
1. Identify leaders (i.e. the first statements
of basic blocks) by using the following rules
(i) The first statement in the program is a leader
(ii) Any statement that is the target of a branch
statement is a leader (for most
intermediate languages these are statements
with an associated label)
(iii) Any statement that immediately follows a
branch or return statement is a leader
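The three rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a production pass: the (op, target) instruction encoding and the name find_leaders are hypothetical, and statements are numbered from 1.

```python
def find_leaders(instrs):
    """Return the sorted 1-based statement numbers that are leaders.

    instrs is a list of (op, target) pairs (an assumed encoding):
    op is "goto", "if_goto", "return" or "other"; target is the 1-based
    number of the branch target, or None when there is none.
    """
    n = len(instrs)
    leaders = set()
    if n:
        leaders.add(1)                           # rule (i): first statement
    for idx, (op, target) in enumerate(instrs, start=1):
        if op in ("goto", "if_goto") and target is not None:
            if 1 <= target <= n:
                leaders.add(target)              # rule (ii): branch target
        if op in ("goto", "if_goto", "return") and idx < n:
            leaders.add(idx + 1)                 # rule (iii): follows a branch
    return sorted(leaders)


# Thirteen statements; statement 12 is a conditional branch back to statement 3.
code = [("other", None)] * 11 + [("if_goto", 3)] + [("other", None)]
print(find_leaders(code))  # [1, 3, 13]
```

Collecting leaders in a set means a statement that qualifies under more than one rule is still reported once.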
10Example Finding Leaders
The following code computes the inner product of
two vectors.
Source code:
    begin
        prod := 0;
        i := 1;
        do begin
            prod := prod + a[i] * b[i];
            i := i + 1
        end
        while i <= 20
    end
13Example Finding Leaders
The following code computes the inner product of
two vectors.
Three-address code, with the rule that makes each leader a leader:
    (1)  prod = 0                  leader by Rule (i)
    (2)  i = 1
    (3)  t1 = 4 * i                leader by Rule (ii)
    (4)  t2 = a[t1]
    (5)  t3 = 4 * i
    (6)  t4 = b[t3]
    (7)  t5 = t2 * t4
    (8)  t6 = prod + t5
    (9)  prod = t6
    (10) t7 = i + 1
    (11) i = t7
    (12) if i <= 20 goto 3
    (13)                           leader by Rule (iii)
14Forming the Basic Blocks
Now that we know the leaders, how do we form the
basic blocks associated with each leader?
2. The basic block corresponding to a leader
consists of the leader, plus all statements up to
but not including the next leader or up to the
end of the program.
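Step 2 amounts to cutting the statement list at each leader. The sketch below assumes the leaders are already known and that statements are numbered 1 to num_stmts; the function name form_basic_blocks is made up for this illustration.

```python
def form_basic_blocks(num_stmts, leaders):
    """Split statements 1..num_stmts into basic blocks.

    Each block runs from a leader up to, but not including, the next
    leader, or up to the end of the program for the last leader.
    """
    starts = sorted(set(leaders))
    blocks = []
    for k, start in enumerate(starts):
        if k + 1 < len(starts):
            end = starts[k + 1] - 1      # stop just before the next leader
        else:
            end = num_stmts              # last block runs to the end
        blocks.append(list(range(start, end + 1)))
    return blocks


blocks = form_basic_blocks(13, [1, 3, 13])
print(blocks)
```

With leaders 1, 3 and 13 in a 13-statement program this yields three blocks: statements 1 to 2, statements 3 to 12, and statement 13.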
15Example Forming the Basic Blocks
Basic blocks:
B1:
    (1)  prod = 0
    (2)  i = 1
B2:
    (3)  t1 = 4 * i
    (4)  t2 = a[t1]
    (5)  t3 = 4 * i
    (6)  t4 = b[t3]
    (7)  t5 = t2 * t4
    (8)  t6 = prod + t5
    (9)  prod = t6
    (10) t7 = i + 1
    (11) i = t7
    (12) if i <= 20 goto 3
B3:
    (13)
16Examples of Transformations
Common subexpression elimination within a block
Goal: eliminate redundant (multiple) computations. We can eliminate the computation of an expression if an equivalent expression has already been computed and its value is still available.
17Common Subexpression Elimination
- Two expressions are equivalent only if they produce the same result.
- This is true if they are identical and none of their operands is redefined in the intervening code.
Before:
    t3 = a * t2
    t4 = t3 * t1
    t5 = t4 + b
    t6 = t3 * t1
    t7 = t6 + b
    c = t5 * t7
After:
    t3 = a * t2
    t4 = t3 * t1
    t5 = t4 + b
    t6 = t4
    t7 = t6 + b
    c = t5 * t7
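A minimal Python sketch of common subexpression elimination within one block follows. It rests on several assumptions: statements are encoded as (dest, op, arg1, arg2) tuples, a "copy" pseudo-operation stands for a plain assignment, expressions are compared textually (so b + a is not matched with a + b), and the operators in the sample block are illustrative only. The name local_cse is hypothetical.

```python
def local_cse(block):
    """Replace recomputed expressions in a basic block by copies.

    block is a list of (dest, op, arg1, arg2) tuples. A statement whose
    expression is still available becomes (dest, "copy", holder, None).
    """
    available = {}   # (op, arg1, arg2) -> variable currently holding the value
    out = []
    for dest, op, a, b in block:
        key = (op, a, b)
        if op != "copy" and key in available:
            out.append((dest, "copy", available[key], None))
        else:
            out.append((dest, op, a, b))
        # dest is redefined here: forget every expression that reads dest
        # and every expression whose value was held in dest.
        available = {k: v for k, v in available.items()
                     if dest not in (k[1], k[2]) and v != dest}
        # Record the expression unless it reads the variable it just overwrote.
        if op != "copy" and dest not in (a, b):
            available.setdefault(key, dest)
    return out


block = [
    ("t3", "*", "a", "t2"),
    ("t4", "*", "t3", "t1"),
    ("t5", "+", "t4", "b"),
    ("t6", "*", "t3", "t1"),   # same expression as t4, operands unchanged
    ("t7", "+", "t6", "b"),
    ("c", "*", "t5", "t7"),
]
for stmt in local_cse(block):
    print(stmt)
```

Only the statement defining t6 changes: it becomes a copy of t4. The statement for t7 is left alone at this stage because it reads t6, not t4, so its expression is not yet textually identical to the one stored in t5.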
18Copy Propagation
- Goal: save memory (and perhaps enable other optimizations) by removing equivalent variables.
- If the statement a = b appears in the code, then a and b are equivalent. References to a may be replaced by references to b, at least until the next assignment to a or b. If we can remove all references to a, it is dead and can be eliminated.
Before:
    t5 = t4 + b
    t6 = t4
    t7 = t6 + b
    c = t5 * t7
After:
    t5 = t4 + b
    t6 = t4
    t7 = t4 + b
    c = t5 * t7
We can now reapply common subexpression elimination!
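Copy propagation within a block can be sketched as a single forward pass. The tuple encoding (dest, op, arg1, arg2), the "copy" pseudo-operation and the name copy_propagate are assumptions made for this example, and the operators in the sample block are illustrative.

```python
def copy_propagate(block):
    """Replace uses of a by b after a statement a = b, within one block."""
    copies = {}   # a -> b for every copy "a = b" that is still valid
    out = []
    for dest, op, a, b in block:
        # Rewrite the operands through the currently valid copies.
        a = copies.get(a, a)
        b = copies.get(b, b)
        out.append((dest, op, a, b))
        # dest is reassigned: a copy into dest or out of dest is no longer valid.
        copies = {k: v for k, v in copies.items() if k != dest and v != dest}
        if op == "copy" and a != dest:
            copies[dest] = a
    return out


block = [
    ("t5", "+", "t4", "b"),
    ("t6", "copy", "t4", None),
    ("t7", "+", "t6", "b"),
    ("c", "*", "t5", "t7"),
]
for stmt in copy_propagate(block):
    print(stmt)
```

In the sample block the use of t6 in the third statement is rewritten to t4. Note that a copy is also invalidated when its source is reassigned, not only when its destination is.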
19Examples of Transformations
Dead code elimination
If x is never referenced after the statement x = y + z, the statement can be safely eliminated.
We can apply this to the example on the previous slide to eliminate the assignment to t6, since t6 is no longer used.
Before:
    t5 = t4 + b
    t6 = t4
    t7 = t4 + b
    c = t5 * t7
After eliminating the assignment to t6:
    t5 = t4 + b
    t7 = t4 + b
    c = t5 * t7
After reapplying common subexpression elimination:
    t5 = t4 + b
    t7 = t5
    c = t5 * t7
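One way to sketch dead code elimination inside a block is a backward pass that tracks which variables are still needed. The assumptions here are loud ones: statements are (dest, op, arg1, arg2) tuples with no side effects, the set of variables used after the block (live_out) is supplied by the caller, and the name eliminate_dead_code is hypothetical.

```python
def eliminate_dead_code(block, live_out):
    """Drop assignments whose result is never read afterwards.

    live_out is the set of variables that are still needed after the block.
    Statements are assumed to have no side effects.
    """
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(block):
        if dest not in live:
            continue                     # result is never read: drop it
        live.discard(dest)               # this statement defines dest
        live.update(x for x in (a, b) if x is not None)   # and reads a, b
        kept.append((dest, op, a, b))
    kept.reverse()
    return kept


block = [
    ("t5", "+", "t4", "b"),
    ("t6", "copy", "t4", None),
    ("t7", "+", "t4", "b"),
    ("c", "*", "t5", "t7"),
]
for stmt in eliminate_dead_code(block, live_out={"c"}):
    print(stmt)
```

Walking backwards matters: removing one dead statement can make the statements that fed it dead as well, and the backward pass picks those up in the same sweep.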
20Constant Folding
- Goal: save execution time by performing simple arithmetic computations at compile time.
Before:
    t1 = 4 - 2
    t2 = t1 / 2
    t3 = a * t2
    t4 = t3 * t1
After:
    t1 = 2
    t2 = t1 / 2
    t3 = a * t2
    t4 = t3 * t1
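A small sketch of constant folding is given below. It assumes statements are (dest, op, arg1, arg2) tuples in which integer constants are Python ints and variables are strings, and it introduces a "const" pseudo-operation for the folded result; both the encoding and the name fold_constants are made up for this example. Division is deliberately left out so that the sketch does not have to decide how to treat division by zero or rounding.

```python
import operator

# Operators this sketch is willing to evaluate at compile time.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(block):
    """Evaluate operations whose operands are both integer constants."""
    out = []
    for dest, op, a, b in block:
        if op in FOLDABLE and isinstance(a, int) and isinstance(b, int):
            out.append((dest, "const", FOLDABLE[op](a, b), None))
        else:
            out.append((dest, op, a, b))
    return out


block = [
    ("t1", "-", 4, 2),
    ("t2", "/", "t1", 2),
]
for stmt in fold_constants(block):
    print(stmt)
```

This pass only folds; it does not propagate the new constant into later statements. Running a constant propagation pass afterwards and folding again would simplify the second statement too.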
21Strength Reduction
- Goal: save time by performing faster arithmetic operations than those originally specified.
- Multiplication and division are much slower than addition and subtraction.
- It may be worthwhile to replace the former by an alternative series of computations using only the latter.
Before:
    x = x + 0
    x = x * 1
    x = y ** 2
    z = 2 * x
After:
    x = y * y
    z = x + x
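These rewrites can be sketched as a simple peephole pass. The (dest, op, arg1, arg2) tuple encoding, the "**" operator name and the function name reduce_strength are assumptions for this illustration, and only the handful of patterns shown on the slide are covered.

```python
def reduce_strength(block):
    """Apply a few algebraic simplifications and strength reductions."""
    out = []
    for dest, op, a, b in block:
        if op == "+" and a == dest and b == 0:
            continue                             # x = x + 0 does nothing
        if op == "*" and a == dest and b == 1:
            continue                             # x = x * 1 does nothing
        if op == "**" and b == 2:
            out.append((dest, "*", a, a))        # y ** 2  ->  y * y
        elif op == "*" and a == 2:
            out.append((dest, "+", b, b))        # 2 * x   ->  x + x
        elif op == "*" and b == 2:
            out.append((dest, "+", a, a))        # x * 2   ->  x + x
        else:
            out.append((dest, op, a, b))
    return out


block = [
    ("x", "+", "x", 0),
    ("x", "*", "x", 1),
    ("x", "**", "y", 2),
    ("z", "*", 2, "x"),
]
for stmt in reduce_strength(block):
    print(stmt)
```

Whether such a rewrite actually pays off depends on the target machine, so a real compiler would consult a cost model before applying it.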
22Examples of Transformations
Interchange of statements
Renaming temporary variables
If there is a statement t = b + c, we can change it to u = b + c and change all uses of t to u. This may help us to reuse temporaries and thus save space.
23Control-flow Graph
- Basic blocks are usually short
- It is more powerful to optimize over a whole procedure than within a single basic block
- Model the transfer of control in the procedure
- Nodes in the graph are basic blocks
- Edges in the graph represent control flow
- Example
24Transformations on Basic Blocks
- Structure-Preserving Transformations
- Common subexpression elimination (13.1)
- Unreachable, dead code elimination (18.1,10)
- Constant propagation (12.6)
- Code straightening (18.2)
- Code hoisting (13.5)
- Statement interchange
25Flow Analysis
- Control Flow Analysis: determine the control structure of a program and build the Control Flow Graph.
- Data Flow Analysis: determine the flow of scalar values and build the Data Flow Graph.
- Solution for the Flow Analysis Problem: propagate data flow information along the flow graph.
26Flowgraph
- Flowgraph G = (N, E, s), where
  - N is the set of basic blocks
  - E is the set of directed edges between nodes in N
  - s is the initial node. There is a path from s to every node in N
- Some compilers use extended basic blocks.
- The flowgraph represents the control structure of a procedure
27Dominators
n dominates m (written n <= m) iff every path from s to m contains n
- Every node dominates itself: n <= n
- n directly (immediately) dominates m iff n <= m, n is not m, and every other dominator of m except m itself also dominates n
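The definition can be turned into a simple iterative computation: a node is dominated by itself plus whatever dominates all of its predecessors. The sketch below is one common way to do this, not necessarily the method used in the course. It assumes the flowgraph is given as a successor map in which every node appears as a key and is reachable from the entry node; the name compute_dominators is hypothetical.

```python
def compute_dominators(succ, entry):
    """Return dom, where dom[m] is the set of nodes that dominate m."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for n, targets in succ.items():
        for t in targets:
            pred[t].add(n)

    # Start from "everything dominates everything" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            new = set(nodes)
            for p in pred[n]:
                new &= dom[p]            # must dominate every predecessor
            new.add(n)                   # every node dominates itself
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


# A four-block graph with one loop between B2 and B3.
succ = {"B1": ["B2"], "B2": ["B3", "B4"], "B3": ["B2"], "B4": []}
dom = compute_dominators(succ, "B1")
for block in sorted(dom):
    print(block, sorted(dom[block]))
```

In this graph B2 dominates both B3 and B4, because every path from B1 to either of them passes through B2.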
28Flow Graph and Dominance Tree
[Figure: a flow graph over nodes 1 to 9 and a, shown next to its dominance tree.]
29Example
    unsigned int fib(m)
        unsigned int m;
    {
        unsigned int f0 = 0, f1 = 1, f2, i;
        if ( m <= 1 ) return m;
        else {
            for ( i = 2; i <= m; i++ ) {
                f2 = f0 + f1;
                f0 = f1;
                f1 = f2;
            }
            return f2;
        }
    }
30Example
    unsigned int fib(m)
        unsigned int m;
    {
        unsigned int f0 = 0, f1 = 1, f2, i = 1;
        f2 = m;
    loop:
        if ( m <= i ) return f2;
        i++;
        f2 = f0 + f1;
        f0 = f1;
        f1 = f2;
        goto loop;
    }
31Loops
- A subflowgraph G' = (N', E', s') of G = (N, E, s) is a loop with entry point s' iff
  - for each edge (n,m) in E with m in N', either n is in N' or m = s'
  - for each pair of nodes n, m in N', there are non-trivial paths from n to m and from m to n
- The first condition ensures a single entry point. The second condition ensures the graph is strongly connected and non-trivial.
32Loop Entry Points
We can use the dominance relationship to identify
all the loops in a procedure
- A backward edge is one whose direction is inverse to the direction of dominance
- So it goes from a node to a dominator of that node
- A loop contains a backward edge to the node that is its entry point.
- In fact, this characterizes loop entry points
A node s in N is the entry point of a loop in G iff there is a node n in N such that (n,s) is in E and s dominates n, written s <= n (backward edge)
33Loop Entry Points
- Once we have the entry point, we can identify the nodes in the loop
- The maximum loop with entry point s is
  - the subflowgraph G' with initial node s that is generated by the set N'
  - N' = { m in N | there is a path from m to s that only contains nodes dominated by s }
34Algorithm to Find Loops
- First step
  - compute the dominance relation for the flowgraph.
- Second step
  - find the set of nodes s in the flowgraph for which there is a node n such that
    - s dominates n (s <= n in the dominance relationship) and
    - (n,s) is an edge in the flowgraph.
Each such node s is a loop entry point and (n,s) is a backward edge
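The second step is a direct filter over the edges once dominance is known. In the sketch below the dominance relation is assumed to have been computed already and is passed in as a map from each node to the set of its dominators; the sample graph, its dominator sets and the name find_back_edges are illustrative.

```python
def find_back_edges(edges, dom):
    """Return the edges (n, s) whose target s dominates their source n.

    dom[n] is assumed to be the set of all nodes that dominate n,
    including n itself.
    """
    return [(n, s) for (n, s) in edges if s in dom[n]]


edges = [("B1", "B2"), ("B2", "B3"), ("B2", "B4"), ("B3", "B2")]
dom = {
    "B1": {"B1"},
    "B2": {"B1", "B2"},
    "B3": {"B1", "B2", "B3"},
    "B4": {"B1", "B2", "B4"},
}
back = find_back_edges(edges, dom)
entry_points = sorted({s for (_, s) in back})
print(back, entry_points)
```

Because every node dominates itself, an edge from a node to itself is also reported as a backward edge, which matches the use of the reflexive relation s <= n above.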
35Algorithm to Find Loops
- Third (construction) step
  - construct the natural loop associated with s and (n,s). It is characterized by its set of nodes N', a subset of N
  - Enter the entry point s and the node n at the other end of the backward edge into N'.
  - Then include the predecessors of n.
  - Continue by adding all the predecessors of nodes that are in the loop, except for the predecessors of the entry point, until there are no more such nodes.
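The construction step can be sketched with a worklist. The sketch assumes the flowgraph is available as a predecessor map and that (n, s) is already known to be a backward edge; the name natural_loop and the sample graphs are made up for this example.

```python
def natural_loop(pred, n, s):
    """Return the node set of the natural loop of the backward edge (n, s).

    pred maps a node to an iterable of its predecessors.
    """
    loop = {s, n}
    # s is never put on the worklist, so its predecessors are not followed.
    worklist = [n] if n != s else []
    while worklist:
        m = worklist.pop()
        for p in pred.get(m, ()):
            if p not in loop:
                loop.add(p)
                worklist.append(p)
    return loop


# Backward edge (B3, B2) in a four-block graph.
pred = {"B2": ["B1", "B3"], "B3": ["B2"], "B4": ["B2"]}
print(sorted(natural_loop(pred, "B3", "B2")))
```

Seeding the set with the entry point before the search starts is what stops the walk from escaping the loop: the search reaches s, finds it already present, and never looks at the blocks in front of it.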
36Loop Example
    1. StmtA
    2. I = 1
    3. if ( I > N ) goto 7
    4. StmtB
    5. I = I + 1
    6. goto 3
    7. StmtC
[Figure: flowgraph and dominator tree over the basic blocks B1, B2, B3 and B4.]
The graph has a backward edge (B3,B2); B2 is the entry point.