Title: Code Generation I: Finish SSA Form, Register Allocation
- EECS 483 Lecture 22
- University of Michigan
- Monday, December 1, 2003
2Class Problem From Last Time
Optimize this applying everything?

Before:
  entry:  r1 = 0; r2 = 0
  loop:   r5 = r7 op 3; r11 = r5; r10 = r11 op 9; r9 = r1; r4 = r9 * 4
          r3 = load(r4); r3 = r3 op r10; r12 = r3; r3 = r3 op r10
          r8 = r2; r6 = r8 << 2; store(r6, r3); r13 = r12 - 1
          r1 = r1 + 1; r2 = r2 + 1
  exit:   store(r12, r2)

After (apply forward/backward copy prop and dead code elimination):
  entry:  r1 = 0; r2 = 0
  loop:   r5 = r7 op 3; r10 = r5 op 9; r4 = r1 * 4
          r3 = load(r4); r12 = r3 op r10; r3 = r12 op r10
          r6 = r2 << 2; store(r6, r3)
          r1 = r1 + 1; r2 = r2 + 1
  exit:   store(r12, r2)
3Class Problem From Last Time (2)
Before:
  entry:  r1 = 0; r2 = 0
  loop:   r5 = r7 op 3; r10 = r5 op 9; r4 = r1 * 4
          r3 = load(r4); r12 = r3 op r10; r3 = r12 op r10
          r6 = r2 << 2; store(r6, r3)
          r1 = r1 + 1; r2 = r2 + 1
  exit:   store(r12, r2)

After (loop invariant code elim, IV strength reduction, copy propagation, dead code elimination):
  entry:  r1 = 0; r2 = 0; r5 = r7 op 3; r10 = r5 op 9
          r100 = r1 * 4; r101 = r2 << 2
  loop:   r3 = load(r100); r12 = r3 op r10; r3 = r12 op r10
          store(r101, r3)
          r1 = r1 + 1; r2 = r2 + 1; r100 = r100 + 4; r101 = r101 + 4
  exit:   store(r12, r2)
4Class Problem From Last Time (3)
Before:
  entry:  r1 = 0; r2 = 0; r5 = r7 op 3; r10 = r5 op 9
          r100 = r1 * 4; r101 = r2 << 2
  loop:   r3 = load(r100); r12 = r3 op r10; r3 = r12 op r10
          store(r101, r3)
          r1 = r1 + 1; r2 = r2 + 1; r100 = r100 + 4; r101 = r101 + 4
  exit:   store(r12, r2)

After (constant prop, constant folding, IV elimination, dead code elim):
  entry:  r2 = 0; r5 = r7 op 3; r10 = r5 op 9; r100 = 0
  loop:   r3 = load(r100); r12 = r3 op r10; r3 = r12 op r10
          store(r100, r3)
          r2 = r2 + 1; r100 = r100 + 4
  exit:   store(r12, r2)
5Where We Left Off ...Static Single Assignment
(SSA) Form
- Static single assignment
  - Each assignment to a variable is given a unique name
  - All of the uses reached by that assignment are renamed
  - DU chains become obvious based on the register name!

Original:  if ( ... ) x = -1 else x = 5;  y = x
SSA form:  if ( ... ) x0 = -1 else x1 = 5;  x2 = Phi(x0,x1);  y = x2
6Dominator Tree and Dominance Frontiers
[Figure: CFG and dominator tree over BB0 through BB7]

BB   DF
0    -
1    -
2    7
3    7
4    6
5    6
6    7
7    1
Computing the dominance frontier:
- For each join point X in the CFG
  - For each predecessor of X in the CFG
    - Run up to the IDOM(X) in the dominator tree, adding X to DF(N) for each N between X and IDOM(X)
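The walk described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's reference code: the function name `dominance_frontiers`, the dictionary-based CFG encoding, and the four-block diamond CFG are assumptions made for the example, and the immediate dominators are supplied by hand rather than computed.

```python
def dominance_frontiers(preds, idom):
    """preds: block -> list of CFG predecessors; idom: block -> immediate dominator."""
    df = {b: set() for b in preds}
    for x, ps in preds.items():
        if len(ps) < 2:          # only join points contribute
            continue
        for p in ps:
            runner = p
            # Walk up the dominator tree from the predecessor to IDOM(X),
            # adding X to the frontier of every block passed on the way.
            while runner != idom[x]:
                df[runner].add(x)
                runner = idom[runner]
    return df


# Hypothetical diamond CFG: entry -> then, entry -> else, then -> join, else -> join
preds = {"entry": [], "then": ["entry"], "else": ["entry"], "join": ["then", "else"]}
idom = {"entry": None, "then": "entry", "else": "entry", "join": "entry"}
df = dominance_frontiers(preds, idom)
```

Here `join` is the only join point, so it lands in the frontier of `then` and `else` and nowhere else.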
7Class Problem
Draw the dominator tree, calculate the dominance
frontier for each BB
[Figure: CFG with blocks BB0 through BB5]
8Phi Node Insertion Algorithm
- Compute dominance frontiers
- Find global names (aka virtual registers)
  - Global if name live on entry to some block
  - For each name, build a list of blocks that define it
- Insert Phi nodes
  - For each global name n
    - For each BB b in which n is defined
      - For each BB d in b's dominance frontier
        - Insert a Phi node for n in d
        - Add d to n's list of defining BBs
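As a rough sketch of the insertion loop, the Python below runs a worklist over the defining blocks of each name. The function name `insert_phis` and the data layout are assumptions for illustration; the DF table and definition sites are the ones given in the lecture's BB0 through BB7 example, and the "global name" filtering step is skipped (every name is treated as global).

```python
def insert_phis(df, def_blocks):
    """df: block -> dominance frontier; def_blocks: name -> blocks that define it.
    Returns name -> set of blocks that need a Phi node for that name."""
    phis = {}
    for name, blocks in def_blocks.items():
        defining = set(blocks)
        worklist = list(blocks)
        placed = set()
        while worklist:
            b = worklist.pop()
            for d in df.get(b, ()):
                if d not in placed:
                    placed.add(d)            # insert a Phi node for name in d
                    if d not in defining:    # the Phi is itself a new definition
                        defining.add(d)
                        worklist.append(d)
        phis[name] = placed
    return phis


# DF table and definition sites taken from the lecture example
df = {0: set(), 1: set(), 2: {7}, 3: {7}, 4: {6}, 5: {6}, 6: {7}, 7: {1}}
def_blocks = {
    "a": [0, 1, 3],
    "b": [0, 2, 6],
    "c": [0, 1, 2, 5],
    "d": [2, 3, 4],
    "i": [0, 7],   # BB0 also defines i in the figure; the result is the same with [7]
}
phis = insert_phis(df, def_blocks)
```

With this input the sketch places Phi nodes for a and b in BB7 and BB1, for c and d in BB6, BB7, and BB1, and for i in BB1, which agrees with the example.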
9Phi Node Insertion - Example
BB   DF
0    -
1    -
2    7
3    7
4    6
5    6
6    7
7    1

- a is defined in 0,1,3: need Phi in 7; then a is defined in 7: need Phi in 1
- b is defined in 0,2,6: need Phi in 7; then b is defined in 7: need Phi in 1
- c is defined in 0,1,2,5: need Phi in 6,7; then c is defined in 7: need Phi in 1
- d is defined in 2,3,4: need Phi in 6,7; then d is defined in 7: need Phi in 1
- i is defined in BB7: need Phi in BB1

Definitions per block (Phi nodes first):
BB0: a, b, c, i
BB1: a = Phi(a,a), b = Phi(b,b), c = Phi(c,c), d = Phi(d,d), i = Phi(i,i); a, c
BB2: b, c, d
BB3: a, d
BB4: d
BB5: c
BB6: c = Phi(c,c), d = Phi(d,d); b
BB7: a = Phi(a,a), b = Phi(b,b), c = Phi(c,c), d = Phi(d,d); i
10Class Problem
Insert the Phi nodes
[Figure: CFG with blocks BB0 through BB5; the blocks define a, b, and c]
11SSA Step 2 Renaming Variables
- Use an array of stacks, one stack per global variable (VR)
- Algorithm sketch
  - For each BB b in a preorder traversal of the dominator tree
    - Generate unique names for each Phi node
    - Rewrite each operation in the BB
      - Uses of global name: current name from stack
      - Defs of global name: create and push new name
    - Fill in Phi node parameters of successor blocks
    - Recurse on b's children in the dominator tree
    - On exit from b, pop names generated in b from stacks
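The renaming walk can be sketched as a recursive Python function. Everything here is illustrative: the operation encoding `(dest, [uses])`, the Phi record layout, the helper names, and the small if/else CFG are assumptions, chosen so the result can be compared with the x0/x1/x2 example from the SSA introduction.

```python
from collections import defaultdict


def rename(blocks, phis, succs, dom_children, entry):
    """Rename variables in place. blocks: block -> list of (dest, [uses])."""
    counter = defaultdict(int)   # next subscript per variable
    stacks = defaultdict(list)   # current-name stack per variable

    def new_name(var):
        name = f"{var}{counter[var]}"
        counter[var] += 1
        stacks[var].append(name)
        return name

    def top(var):
        return stacks[var][-1] if stacks[var] else var

    def visit(b):
        pushed = []
        for phi in phis.get(b, []):              # unique names for Phi results
            phi["dest"] = new_name(phi["var"])
            pushed.append(phi["var"])
        renamed = []
        for dest, uses in blocks[b]:             # rewrite each operation
            uses = [top(u) for u in uses]
            if dest is not None:
                pushed.append(dest)
                dest = new_name(dest)
            renamed.append((dest, uses))
        blocks[b] = renamed
        for s in succs.get(b, []):               # fill Phi parameters of successors
            for phi in phis.get(s, []):
                phi["args"][b] = top(phi["var"])
        for c in dom_children.get(b, []):        # recurse on dominator-tree children
            visit(c)
        for var in pushed:                       # pop names generated in b
            stacks[var].pop()

    visit(entry)


# Hypothetical CFG for: if (...) x = -1 else x = 5; y = x
blocks = {"entry": [], "then": [("x", [])], "else": [("x", [])], "join": [("y", ["x"])]}
phis = {"join": [{"var": "x", "dest": None, "args": {}}]}
succs = {"entry": ["then", "else"], "then": ["join"], "else": ["join"]}
dom_children = {"entry": ["then", "else", "join"]}
rename(blocks, phis, succs, dom_children, "entry")
```

After the call, the two branches define x0 and x1, and the join block holds a Phi whose result is x2 with arguments x0 and x1, followed by a use of x2.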
12Renaming Example (Initial State)
Definitions per block (Phi nodes first):
BB0: a, b, c, i
BB1: a = Phi(a,a), b = Phi(b,b), c = Phi(c,c), d = Phi(d,d), i = Phi(i,i); a, c
BB2: b, c, d
BB3: a, d
BB4: d
BB5: c
BB6: c = Phi(c,c), d = Phi(d,d); b
BB7: a = Phi(a,a), b = Phi(b,b), c = Phi(c,c), d = Phi(d,d); i

var   a    b    c    d    i
ctr   0    0    0    0    0
stk   a0   b0   c0   d0   i0
13Renaming Example (After BB0)
BB0: a0, b0, c0, i0
BB1: a = Phi(a0,a), b = Phi(b0,b), c = Phi(c0,c), d = Phi(d0,d), i = Phi(i0,i); a, c
BB2: b, c, d
BB3: a, d
BB4: d
BB5: c
BB6: c = Phi(c,c), d = Phi(d,d); b
BB7: a = Phi(a,a), b = Phi(b,b), c = Phi(c,c), d = Phi(d,d); i

var   a    b    c    d    i
ctr   1    1    1    1    1
stk   a0   b0   c0   d0   i0
14Renaming Example (After BB1)
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a), b1 = Phi(b0,b), c1 = Phi(c0,c), d1 = Phi(d0,d), i1 = Phi(i0,i); a2, c2
BB2: b, c, d
BB3: a, d
BB4: d
BB5: c
BB6: c = Phi(c,c), d = Phi(d,d); b
BB7: a = Phi(a,a), b = Phi(b,b), c = Phi(c,c), d = Phi(d,d); i

var   a    b    c    d    i
ctr   3    2    3    2    2
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2        c2
15Renaming Example (After BB2)
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a), b1 = Phi(b0,b), c1 = Phi(c0,c), d1 = Phi(d0,d), i1 = Phi(i0,i); a2, c2
BB2: b2, c3, d2
BB3: a, d
BB4: d
BB5: c
BB6: c = Phi(c,c), d = Phi(d,d); b
BB7: a = Phi(a2,a), b = Phi(b2,b), c = Phi(c3,c), d = Phi(d2,d); i

var   a    b    c    d    i
ctr   3    3    4    3    2
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2   b2   c2   d2
                c3
16Renaming Example (Before BB3)
This just updates the stack to remove the stuff
from the left path out of BB1
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a), b1 = Phi(b0,b), c1 = Phi(c0,c), d1 = Phi(d0,d), i1 = Phi(i0,i); a2, c2
BB2: b2, c3, d2
BB3: a, d
BB4: d
BB5: c
BB6: c = Phi(c,c), d = Phi(d,d); b
BB7: a = Phi(a2,a), b = Phi(b2,b), c = Phi(c3,c), d = Phi(d2,d); i

var   a    b    c    d    i
ctr   3    3    4    3    2
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2        c2
17Renaming Example (After BB3)
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a), b1 = Phi(b0,b), c1 = Phi(c0,c), d1 = Phi(d0,d), i1 = Phi(i0,i); a2, c2
BB2: b2, c3, d2
BB3: a3, d3
BB4: d
BB5: c
BB6: c = Phi(c,c), d = Phi(d,d); b
BB7: a = Phi(a2,a), b = Phi(b2,b), c = Phi(c3,c), d = Phi(d2,d); i

var   a    b    c    d    i
ctr   4    3    4    4    2
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2        c2   d3
      a3
18Renaming Example (After BB4)
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a), b1 = Phi(b0,b), c1 = Phi(c0,c), d1 = Phi(d0,d), i1 = Phi(i0,i); a2, c2
BB2: b2, c3, d2
BB3: a3, d3
BB4: d4
BB5: c
BB6: c = Phi(c2,c), d = Phi(d4,d); b
BB7: a = Phi(a2,a), b = Phi(b2,b), c = Phi(c3,c), d = Phi(d2,d); i

var   a    b    c    d    i
ctr   4    3    4    5    2
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2        c2   d3
      a3             d4
19Renaming Example (After BB5)
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a), b1 = Phi(b0,b), c1 = Phi(c0,c), d1 = Phi(d0,d), i1 = Phi(i0,i); a2, c2
BB2: b2, c3, d2
BB3: a3, d3
BB4: d4
BB5: c4
BB6: c = Phi(c2,c4), d = Phi(d4,d3); b
BB7: a = Phi(a2,a), b = Phi(b2,b), c = Phi(c3,c), d = Phi(d2,d); i

var   a    b    c    d    i
ctr   4    3    5    5    2
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2        c2   d3
      a3        c4
20Renaming Example (After BB6)
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a), b1 = Phi(b0,b), c1 = Phi(c0,c), d1 = Phi(d0,d), i1 = Phi(i0,i); a2, c2
BB2: b2, c3, d2
BB3: a3, d3
BB4: d4
BB5: c4
BB6: c5 = Phi(c2,c4), d5 = Phi(d4,d3); b3
BB7: a = Phi(a2,a3), b = Phi(b2,b3), c = Phi(c3,c5), d = Phi(d2,d5); i

var   a    b    c    d    i
ctr   4    4    6    6    2
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2   b3   c2   d3
      a3        c5   d5
21Renaming Example (After BB7)
BB0: a0, b0, c0, i0
BB1: a1 = Phi(a0,a4), b1 = Phi(b0,b4), c1 = Phi(c0,c6), d1 = Phi(d0,d6), i1 = Phi(i0,i2); a2, c2
BB2: b2, c3, d2
BB3: a3, d3
BB4: d4
BB5: c4
BB6: c5 = Phi(c2,c4), d5 = Phi(d4,d3); b3
BB7: a4 = Phi(a2,a3), b4 = Phi(b2,b3), c6 = Phi(c3,c5), d6 = Phi(d2,d5); i2

var   a    b    c    d    i
ctr   5    5    7    7    3
stk   a0   b0   c0   d0   i0
      a1   b1   c1   d1   i1
      a2   b4   c2   d6   i2
      a4        c6

Fin!
22Class Problem
Rename the variables so this code is in SSA form
[Figure: CFG with blocks BB0 through BB5; the blocks define a, b, and c]
23New Topic Register Allocation
- Through optimization, assume an infinite number of virtual registers
  - Now, must allocate these infinite virtual registers to a limited supply of hardware registers
  - Want most frequently accessed variables in registers
    - Speed, registers much faster than memory
    - Direct access as an operand
- Any VR that cannot be mapped into a physical register is said to be spilled
- If there are not enough physical registers, which virtual registers get spilled?
24Questions to Answer
- What is the minimum number of registers needed to avoid spilling?
- Given n registers, is spilling necessary?
  - Find an assignment of virtual registers to physical registers
- If there are not enough physical registers, which virtual registers get spilled?
25Live Range of a Virtual Register
- Value = defn of a register
- Live range = set of operations
  - 1 or more values connected by common uses
  - A single VR may have several live ranges
- Live ranges are constructed by taking the intersection of reaching defs and liveness
  - Initially, a live range consists of a single definition and all ops in a function in which that definition is live
26Example Constructing Live Ranges
Notation: {liveness}, {rdefs}

Each definition is the seed of a live range. Ops are added to the LR where both the defn reaches and the variable is live.

[Figure: CFG of operations 1 through 8 that define and use x, annotated with {liveness}, {rdefs} sets]

- LR1 for def 1 = {1,3,4}
- LR2 for def 2 = {2,4}
- LR3 for def 5 = {5,7,8}
- LR4 for def 6 = {6,7,8}
27Merging Live Ranges
- If 2 live ranges for the same VR overlap, they must be merged to ensure correctness
  - LRs replaced by a new LR that is the union of the LRs
  - Multiple defs reaching a common use
- Conservatively, all LRs for the same VR could be merged
  - Makes LRs larger than need be, but done for simplicity
  - We will not assume this!

Example: ops 1 and 2 both define r1 and both reach the use of r1 in op 3, so r1 LR = {1,2,3}
28Example Merging Live Ranges
Notation: {liveness}, {rdefs}

[Figure: CFG of operations 1 through 8 that define and use x, annotated with {liveness}, {rdefs} sets]

- LR1 for def 1 = {1,3,4}
- LR2 for def 2 = {2,4}
- LR3 for def 5 = {5,7,8}
- LR4 for def 6 = {6,7,8}

Merge LR1 and LR2, LR3 and LR4:
- LR5 = {1,2,3,4}
- LR6 = {5,6,7,8}
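The merge step amounts to repeatedly taking the union of any live ranges that share an operation. The Python below is a minimal sketch of that idea; the function name `merge_live_ranges` is an assumption, and all input ranges are assumed to belong to the same virtual register. The input is the four per-definition live ranges from the example.

```python
def merge_live_ranges(ranges):
    """Merge overlapping live ranges (sets of op numbers) of one virtual register."""
    merged = []                      # invariant: pairwise disjoint
    for lr in ranges:
        lr = set(lr)
        disjoint = []
        for other in merged:
            if lr & other:           # shares an op: absorb it
                lr |= other
            else:
                disjoint.append(other)
        merged = disjoint + [lr]
    return merged


# LR1..LR4 from the example
merged = merge_live_ranges([{1, 3, 4}, {2, 4}, {5, 7, 8}, {6, 7, 8}])
```

On this input the sketch yields the two merged ranges {1,2,3,4} and {5,6,7,8}, matching LR5 and LR6.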
29Class Problem
- Compute the LRs
- for each def
- merge overlapping
1 y 2 x y
3 x
4 y 5 y
6 y 7 z
8 x 9 y
10 z
30Interference
- Two live ranges interfere if they share one or more operations in common
  - Thus, they cannot occupy the same physical register
  - Or a live value would be lost
- Interference graph
  - Undirected graph where
    - Nodes are live ranges
    - There is an edge between 2 nodes if the live ranges interfere
  - What's not represented by this graph
    - Extent of interference between the LRs
    - Where in the program is the interference
31Example Interference Graph
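1: a = load()
2: b = load()
3: c = load()
4: d = b op c
5: e = d - 3
6: f = a op b
7: e = f op c
8: g = a op e
9: store(g)

lr(a) = {1,2,3,4,5,6,7,8}
lr(b) = {2,3,4,6}
lr(c) = {1,2,3,4,5,6,7,8,9}
lr(d) = {4,5}
lr(e) = {5,7,8}
lr(f) = {6,7}
lr(g) = {8,9}

[Figure: interference graph with nodes a, b, c, d, e, f, g]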
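Building the graph from the live ranges is a pairwise intersection test. The sketch below assumes the live ranges are given as Python sets keyed by register name; the function name `interference_graph` is illustrative. The data is lr(a) through lr(g) as listed for this example.

```python
from itertools import combinations


def interference_graph(live_ranges):
    """live_ranges: name -> set of ops. Returns name -> set of interfering names."""
    graph = {name: set() for name in live_ranges}
    for u, v in combinations(live_ranges, 2):
        if live_ranges[u] & live_ranges[v]:   # share at least one operation
            graph[u].add(v)
            graph[v].add(u)
    return graph


live_ranges = {
    "a": {1, 2, 3, 4, 5, 6, 7, 8},
    "b": {2, 3, 4, 6},
    "c": {1, 2, 3, 4, 5, 6, 7, 8, 9},
    "d": {4, 5},
    "e": {5, 7, 8},
    "f": {6, 7},
    "g": {8, 9},
}
graph = interference_graph(live_ranges)
```

For instance, g interferes only with a, c, and e, while b and e share no operation and so could occupy the same physical register.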
32Graph Coloring
- A graph is n-colorable if every node in the graph can be colored with one of the n colors such that 2 adjacent nodes do not have the same color
- Model register allocation as graph coloring
  - Use the fewest colors (physical registers)
  - Spilling is necessary if the graph is not n-colorable where n is the number of physical registers
- Optimal graph coloring is NP-complete for n > 2
  - Use heuristics proposed by compiler developers
  - Observation: a node with degree < n in the interference graph can always be successfully colored given its neighbors' colors
33Coloring Algorithm (1)
- 1. While any node, x, has < n neighbors
  - Remove x and its edges from the graph
  - Push x onto a stack
- 2. If the remaining graph is non-empty
  - Compute cost of spilling each node (live range)
    - For each reference to the register in the live range
      - Cost += (execution frequency * spill cost)
  - Let NB(x) = number of neighbors of x
  - Remove node x that has the smallest cost(x) / NB(x)
    - Push x onto a stack (mark as spilled)
  - Go back to step 1
34Coloring Algorithm (2)
- While stack is non-empty
  - Pop x from the stack
  - If x's neighbors are assigned fewer than R colors (R is the number of physical registers, n above), then assign x any unassigned color, else leave x uncolored
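Both phases of the coloring algorithm fit in a short Python sketch. This is an illustration under assumptions, not the lecture's code: the function name `color_graph`, the adjacency-set encoding, the precomputed `cost` dictionary, the alphabetical tie-breaking, and the four-node example graph are all made up for the example, and n is assumed to be at least 1.

```python
def color_graph(adj, n, cost):
    """adj: node -> set of neighbors; n: number of colors; cost: node -> spill cost.
    Returns (colors, spilled)."""
    work = {v: set(ns) for v, ns in adj.items()}   # mutable copy of the graph
    stack = []
    while work:
        # Step 1: prefer a node with fewer than n neighbors.
        easy = [v for v in sorted(work) if len(work[v]) < n]
        if easy:
            x = easy[0]
        else:
            # Step 2: spill candidate with the smallest cost(x) / NB(x).
            x = min(sorted(work), key=lambda v: cost[v] / len(work[v]))
        for u in work.pop(x):                      # remove x and its edges
            work[u].discard(x)
        stack.append(x)

    colors = {}
    while stack:
        x = stack.pop()
        used = {colors[u] for u in adj[x] if u in colors}
        free = [c for c in range(n) if c not in used]
        if free:
            colors[x] = free[0]                    # else: leave x uncolored
    spilled = [v for v in adj if v not in colors]
    return colors, spilled


# Hypothetical graph: triangle A-B-C plus D attached to A
adj = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
cost = {v: 1.0 for v in adj}
colors3, spilled3 = color_graph(adj, 3, cost)
colors2, spilled2 = color_graph(adj, 2, cost)
```

With three colors every node is colored. With two colors the triangle cannot be colored, so one node (A under this tie-breaking) is left uncolored and would be spilled.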
35Example Finding Number of Colors
How many colors are needed to color this graph?
[Figure: graph with nodes A, B, C, D, E]

- Try n = 1: no, cannot remove any nodes
- Try n = 2: no again, cannot remove any nodes
- Try n = 3: remove B (since degree < 3), then can remove A, C, then can remove D, E
- Thus it is 3-colorable