Title: Code Generation II Register Allocation Exam 2 Review
1Code Generation II: Register Allocation, Exam 2 Review
- EECS 483 Lecture 23
- University of Michigan
- Wednesday, December 3, 2003
2Graph Coloring
- A graph is n-colorable if every node in the graph can be colored with one of the n colors such that no 2 adjacent nodes have the same color
- Model register allocation as graph coloring
- Use the fewest colors (physical registers)
- Spilling is necessary if the graph is not n-colorable, where n is the number of physical registers
- Optimal graph coloring is NP-complete for n > 2
- Use heuristics proposed by compiler developers
- Observation: a node with degree < n in the interference graph can always be successfully colored given its neighbors' colors
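The observation can be checked with a short sketch. The function name `free_color` and the 3-node graph are hypothetical, chosen only for illustration: a node with fewer than n neighbors can block at most n - 1 colors, so one color is always left over.

```python
def free_color(node, graph, coloring, n):
    """Return a color in range(n) unused by node's colored neighbors, or None."""
    used = {coloring[v] for v in graph[node] if v in coloring}
    for color in range(n):
        if color not in used:
            return color
    return None  # only possible when the neighbors already use all n colors


# Hypothetical graph: x interferes with y and z, so degree(x) = 2 < n = 3.
graph = {"x": {"y", "z"}, "y": {"x"}, "z": {"x"}}
coloring = {"y": 0, "z": 1}
print(free_color("x", graph, coloring, 3))
```

With n = 2 the same node has degree equal to n, and the guarantee no longer holds.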
3Example Finding Number of Colors
How many colors are needed to color this graph?
[Figure: interference graph with nodes A, B, C, D, E]
- Try n = 1: no, cannot remove any nodes
- Try n = 2: no again, cannot remove any nodes
- Try n = 3: remove B (since degree < 3), then can remove A and C, then can remove D and E
- Thus it is 3-colorable
4Coloring Algorithm (1)
- 1. While any node, x, has < n neighbors
- Remove x and its edges from the graph
- Push x onto a stack
- 2. If the remaining graph is non-empty
- Compute cost of spilling each node (live range)
- For each reference to the register in the live range
- Cost += (execution frequency * spill cost)
- Let NB(x) = number of neighbors of x
- Remove node x that has the smallest cost(x) / NB(x)
- Push x onto a stack (mark as spilled)
- Go back to step 1
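The two steps above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `simplify`, the dict-of-sets graph representation, and the alphabetical tie-breaking are assumptions, and NB(x) is taken from the original graph (the slides do not say whether it should be recomputed as nodes are removed).

```python
def simplify(graph, cost, n):
    """Return a list of (node, spilled) pairs; the stack top is the list's end."""
    nb = {x: len(adj) for x, adj in graph.items()}    # NB(x), original graph (assumption)
    work = {x: set(adj) for x, adj in graph.items()}  # working copy, input is not mutated
    stack = []
    while work:
        # Step 1: prefer any node with fewer than n remaining neighbors.
        low = sorted(x for x in work if len(work[x]) < n)
        if low:
            x, spilled = low[0], False
        else:
            # Step 2: no such node, so pick the smallest cost(x) / NB(x).
            x = min(sorted(work), key=lambda v: cost[v] / nb[v])
            spilled = True
        for y in work.pop(x):          # remove x and its edges
            work[y].discard(x)
        stack.append((x, spilled))
    return stack


# Hypothetical 4-clique: every node has 3 neighbors, so with n = 3 one must spill.
k4 = {v: set("pqrs") - {v} for v in "pqrs"}
k4_cost = {"p": 10, "q": 20, "r": 30, "s": 40}
print(simplify(k4, k4_cost, 3))
```

Here p has the lowest cost/NB ratio, so it is pushed as spilled; the remaining triangle then simplifies without further spills.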
5Coloring Algorithm (2)
- While stack is non-empty
- Pop x from the stack
- If x's neighbors are assigned fewer than R colors (R = n, the number of physical registers), then assign x any unassigned color, else leave x uncolored
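A minimal sketch of this popping phase, assuming the stack is a Python list with its top at the end and colors are the integers 0..n-1 (both are illustration choices, as is the name `select`). It follows the rule above literally: a node stays uncolored only when its neighbors already use all n colors.

```python
def select(graph, stack, n):
    """Pop nodes off the stack and return {node: color or None}."""
    pending = list(stack)  # copy so the caller's stack is left untouched
    coloring = {}
    while pending:
        x = pending.pop()
        # Colors already taken by neighbors that have been popped and colored.
        used = {coloring[y] for y in graph[x] if coloring.get(y) is not None}
        free = [c for c in range(n) if c not in used]
        coloring[x] = free[0] if free else None
    return coloring


# Hypothetical triangle with only 2 colors: the last node popped cannot be colored.
triangle = {"u": {"v", "w"}, "v": {"u", "w"}, "w": {"u", "v"}}
print(select(triangle, ["u", "v", "w"], 2))
```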
6Example Do a 3-Coloring
lr(a) = {1,2,3,4,5,6,7,8}      refs(a) = {1,6,8}
lr(b) = {2,3,4,6}              refs(b) = {2,4,6}
lr(c) = {1,2,3,4,5,6,7,8,9}    refs(c) = {3,4,7}
lr(d) = {4,5}                  refs(d) = {4,5}
lr(e) = {5,7,8}                refs(e) = {5,7,8}
lr(f) = {6,7}                  refs(f) = {6,7}
lr(g) = {8,9}                  refs(g) = {8,9}
[Figure: interference graph with nodes a, b, c, d, e, f, g]
Profile frequencies of operations: {1,2} = 100, {3,4,5} = 75, {6,7} = 25, {8,9} = 100
Assume each spill requires 1 operation
cost = sum of frequencies of references to variable
cost(a) = Sum(refs(a)) = Sum(1,6,8) = 100 + 25 + 100 = 225

            a      b      c      d      e      f      g
cost        225    200    175    150    200    50     200
neighbors   6      4      5      4      3      4      2
cost/n      37.5   50     35     37.5   66.7   12.5   100
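The cost table can be recomputed from the reference sets and profile frequencies given on this slide. A short sketch; the variable names are illustrative, and the neighbor counts are copied from the slide's table rather than derived.

```python
# Execution frequency of each operation, from the profile on this slide.
freq = {1: 100, 2: 100, 3: 75, 4: 75, 5: 75, 6: 25, 7: 25, 8: 100, 9: 100}
refs = {
    "a": [1, 6, 8], "b": [2, 4, 6], "c": [3, 4, 7], "d": [4, 5],
    "e": [5, 7, 8], "f": [6, 7], "g": [8, 9],
}
neighbors = {"a": 6, "b": 4, "c": 5, "d": 4, "e": 3, "f": 4, "g": 2}
SPILL_OPS = 1  # each spill requires 1 operation

# cost(v) = sum over references of (execution frequency * spill cost)
cost = {v: sum(freq[op] * SPILL_OPS for op in ops) for v, ops in refs.items()}
ratio = {v: cost[v] / neighbors[v] for v in cost}

for v in sorted(cost):
    print(v, cost[v], neighbors[v], round(ratio[v], 1))
```

The node with the smallest ratio, f, is the first spill candidate in the slides that follow.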
7Example Do a 3-Coloring (2)
Remove all nodes with < 3 neighbors. So, g can be removed.
Stack (top first): g
[Figure: interference graph before and after removing g]
8Example Do a 3-Coloring (3)
Now must spill a node. Choose the one with the smallest cost/NB -> f is chosen.
Stack (top first): f (spilled), g
[Figure: interference graph before and after removing f]
9Example Do a 3-Coloring (4)
Remove all nodes with < 3 neighbors. So, e can be removed.
Stack (top first): e, f (spilled), g
[Figure: interference graph before and after removing e]
10Example Do a 3-Coloring (5)
Now must spill another node. Choose the one with the smallest cost/NB -> c is chosen.
Stack (top first): c (spilled), e, f (spilled), g
[Figure: interference graph before and after removing c]
11Example Do a 3-Coloring (6)
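Remove all nodes with < 3 neighbors. So, a, b, d can be removed.
Stack (top first): d, b, a, c (spilled), e, f (spilled), g
[Figure: remaining graph with nodes a, b, d, which reduces to the empty (Null) graph]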
12Example Do a 3-Coloring (7)
Stack (top first): d, b, a, c (spilled), e, f (spilled), g
[Figure: full interference graph with nodes a, b, c, d, e, f, g]
Have 3 colors: red, green, blue. Pop off the stack assigning colors; only consider conflicts with non-spilled nodes already popped off the stack.
- d -> red
- b -> green (cannot choose red)
- a -> blue (cannot choose red or green)
- c -> no color (spilled)
- e -> green (cannot choose red or blue)
- f -> no color (spilled)
- g -> red (cannot choose blue)
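The whole 3-coloring example can be replayed with a sketch like the one below. Several things here are assumptions rather than facts from the slides: the edge list is reconstructed from the live-range overlaps and the neighbor counts in the cost table (the interference graph figure itself is not reproduced in the text), ties are broken alphabetically, cost/NB uses the original neighbor counts, and nodes marked as spilled are simply left uncolored, as in the example.

```python
# Reconstructed edge list (assumption); it yields the neighbor counts 6,4,5,4,3,4,2.
EDGES = ["ab", "ac", "ad", "ae", "af", "ag", "bc", "bd", "bf",
         "cd", "cf", "cg", "de", "ef"]
COST = {"a": 225, "b": 200, "c": 175, "d": 150, "e": 200, "f": 50, "g": 200}
COLORS = ["red", "green", "blue"]


def build_graph(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def allocate(graph, cost, colors):
    n = len(colors)
    nb = {x: len(adj) for x, adj in graph.items()}
    work = {x: set(adj) for x, adj in graph.items()}
    stack, spilled = [], set()
    while work:
        low = sorted(x for x in work if len(work[x]) < n)
        if low:
            x = low[0]                      # removable: fewer than n neighbors
        else:
            x = min(sorted(work), key=lambda v: cost[v] / nb[v])
            spilled.add(x)                  # smallest cost/NB is spilled
        for y in work.pop(x):
            work[y].discard(x)
        stack.append(x)
    order = stack[::-1]                     # pop order, top of stack first
    assignment = {}
    for x in order:
        if x in spilled:
            assignment[x] = None            # spilled nodes get no color
            continue
        used = {assignment[y] for y in graph[x] if assignment.get(y)}
        assignment[x] = next(c for c in colors if c not in used)
    return order, assignment


order, assignment = allocate(build_graph(EDGES), COST, COLORS)
print(order)
print(assignment)
```

Under these assumptions the pop order and the colors match the slide: d, b, a, c, e, f, g with c and f left uncolored.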
13Example Do a 3-Coloring (8)
d -> red, b -> green, a -> blue, c -> no color, e -> green, f -> no color, g -> red

1: blue = load()
2: green = load()
3: spill1 = load()
4: red = green op spill1
5: green = red - 3
6: spill2 = blue op green
7: green = spill2 op spill1
8: red = blue op green
9: store(red)
(op stands for a binary operator)

Notes: no spills in the blocks executed 100 times. Most spills are in the block executed 25 times. The longest lifetime (c) is also spilled.
14Class Problem
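Do a 2-coloring:
- Compute cost matrix
- Draw interference graph
- Color graph
[Figure: control flow graph of operations 1-10 with edge frequencies 1, 10, 90, 99, 1]
Variables referenced per operation: 1: y; 2: x, y; 3: x; 4: y; 5: y; 6: y; 7: z; 8: x; 9: y; 10: z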
15Caller/Callee Save Preference
- Processors generally divide regs: ½ caller-save, ½ callee-save
- Caller/callee save is a programming convention
- Not part of architecture or microarchitecture
- When you are assigning colors, need to choose caller/callee
- Using a register may have save/restore overhead
16Caller/Callee Cost Calculation
- Caller save/restore cost
- For each subroutine call a live range spans
- Cost += (save_cost + restore_cost) * frequency
- Variable not live across a subroutine call has 0 caller cost
- Leaf routines are an ideal place to use caller-save registers!
- Callee save/restore cost
- Cost = (save_cost + restore_cost) * procedure_entry_freq
- When subroutine calls are in the live range, callee usually better
- Compare these costs with just spilling the variable
- If cheaper to spill, then just spill it
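The comparison described above can be sketched as a small cost model. The function names, the default save/restore cost of 1 operation each, and the frequencies in the usage lines are hypothetical; only the formulas come from the bullets.

```python
def caller_save_cost(call_freqs, save_cost=1, restore_cost=1):
    """Sum (save + restore) * frequency over every call the live range spans."""
    return sum((save_cost + restore_cost) * f for f in call_freqs)


def callee_save_cost(entry_freq, save_cost=1, restore_cost=1):
    """One save/restore pair per procedure entry."""
    return (save_cost + restore_cost) * entry_freq


def cheapest(call_freqs, entry_freq, spill_cost):
    """Return the cheapest option's name and the full cost table."""
    options = {
        "caller-save": caller_save_cost(call_freqs),
        "callee-save": callee_save_cost(entry_freq),
        "spill": spill_cost,
    }
    return min(options, key=options.get), options


# Hypothetical live range spanning two calls (frequencies 50 and 25)
# in a procedure entered 10 times, with a spill cost of 60.
print(cheapest([50, 25], entry_freq=10, spill_cost=60))
# A live range that spans no calls has 0 caller cost.
print(cheapest([], entry_freq=10, spill_cost=60))
```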
17Alternate Priority Scheme
- Chaitin priority (what we have used)
- priority = spill cost / number of neighbors
- Hennessy and Chow priority
- priority = spill cost / size of live range
- Intuition
- Small live ranges with high spill cost are ideal candidates to be allocated a register
- As the size of a live range grows, it becomes less attractive for register allocation
- Ties up a register for a long time!
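To see how the two priorities can rank the same variables differently, the sketch below reuses the costs, neighbor counts, and live-range sizes from the 3-coloring example earlier in the lecture. Measuring live-range size as the number of operations in lr(x) is an assumption made for illustration.

```python
cost = {"a": 225, "b": 200, "c": 175, "d": 150, "e": 200, "f": 50, "g": 200}
neighbors = {"a": 6, "b": 4, "c": 5, "d": 4, "e": 3, "f": 4, "g": 2}
# Size of each live range = number of operations in lr(x) (assumed measure).
lr_size = {"a": 8, "b": 4, "c": 9, "d": 2, "e": 3, "f": 2, "g": 2}

chaitin = {v: cost[v] / neighbors[v] for v in cost}
hennessy_chow = {v: cost[v] / lr_size[v] for v in cost}

# Lowest priority first: these are the variables least worth a register.
print(sorted(chaitin, key=chaitin.get))
print(sorted(hennessy_chow, key=hennessy_chow.get))
```

With these numbers the Chaitin priority ranks f lowest, while the live-range-size priority ranks c lowest, since c ties up a register across all nine operations.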
18Exam 2 Review
- Note: you should consider the HW 1 problems as a
good set of practice problems in addition to
those in these slides
19Logistics
- When, Where
- Next Monday, Dec 8, 10:40am - 12:30pm
- 1005 EECS
- Type
- Open book/note
- What to bring
- Text book, reference books, lecture notes
- Pencils
- No laptops or cell phones
20Topics Covered (1)
- Intermediate representation
- Translating high-level constructs to assembly
- Storage management
- Stack frame
- Data layout
- Control flow analysis
- CFGs, dominator / post dominator analysis, immediate dominator, dominator tree, dominance frontier
- Loop detection, trip count, induction variables
21Topics Covered (2)
- Dataflow analysis/SSA form
- GEN, KILL, IN, OUT, up/down, all/any paths
- Liveness, reaching defs, DU, available exprs, ...
- Phi functions, putting code into SSA form
- Optimization
- Control
- Loop unrolling, acyclic optimizations (branch to branch, unreachable code elim, etc.)
- Data
- Local, global, loop optimizations
- Register allocation: graph coloring
22Not Covered
- This is NOT a cumulative test
- Exam 1 -> frontend, this exam -> backend
- No parsing or type analysis
- But you will be expected to be familiar with some earlier topics that carry over (e.g., AST)
- No MIRV specific stuff
- But concepts of projects 3/4 are fair game
23Textbook
- What have we covered: nominally Chs 7-10
- 2nd half of class more loosely followed book
- Things you should know / can ignore
- Ch 7: 7.2, 7.3 is what we covered, ignore rest
- Ch 8: 8.1-8.5 is what we covered, but not that closely
- Ch 9: we covered 9.3, 9.4, 9.7, 9.9
- Ignore all code generation from DAG stuff
- Ch 10: this is the most important of all the book chapters
- Ignore all of 10.8, interval graph stuff in 10.9, all of 10.10, 10.12, 10.13
24Exam Format
- Similar to Exam 1
- Short answer: 33%
- Explain something
- Short problems to work out
- Longer design problems: 66%
- E.g., compute reaching defn gen/kill/in/out sets
- Range of questions
- Simple: Were you conscious in class?
- Grind it out: Can you solve problems?
- Challenging: How well do you really understand things?
25A Few Sample Problems Stack Frames
void foo(int a, int b) {
    static int c;
    int d[2];
    ...
    bar(c, d[0]);
    ...
}
Draw the stack frame for foo() at the point right
before it calls bar(). You should assume all
parameters and the return addr are passed on the
stack. Further assume 2 physical registers are
used in foo, both are callee-save and there are
no spilled registers.
26Optimization
i = 4;
while (i < 25) {
    ptr = A[i];
    ptr = ptr->next;
}
Of the 3 loop unrolling types discussed in class,
which techniques can be used to unroll this loop?
Explain your answer. Unroll it 3x using the
most effective form that is legal.
27SSA Form
Put the following code into SSA form. Note: don't try to optimize the code or get rid of instructions that may look trivial.
[Figure: control flow graph with basic blocks BB0-BB5 containing assignments to a and b; BB0 sets a = 0 and b = 0]
28Register Allocation
What is the minimum number of colors required to color the above interference graph such that no spills are necessary?
29Dominators/Loop Detection
HW problem 1 with edge from BB5 -> BB7
- Compute DOM sets
- Identify all natural loops
[Figure: control flow graph with nodes Entry, BB1-BB9, Exit]
30Dominators/Loop Detection
HW problem 1 without edge from BB5 -> BB7
- Compute DOM sets
- Identify all natural loops
[Figure: control flow graph with nodes Entry, BB1-BB9, Exit]