Title: ECE540S Optimizing Compilers
1ECE540S Optimizing Compilers
- http://www.eecg.toronto.edu/voss/ece540/
- March 11, 2002
2Register Allocation (Muchnick, Chapter 16)
3Register Allocation
- We've been using pseudo registers so far
  - assumed there was an infinite number of them
- You need to allocate real space to them
  - memory: big but slow
  - registers: only a few, but fast
- And most RISCs only operate on registers

Source (op stands for a binary operator that the slide text does not preserve):
  int x, y, z
  x = 200
  y = x op a
  z = y op x
  x = 10

Naive code (every variable lives in memory):
  [addr x] = 200
  r1 = [addr x]
  r2 = [addr a]
  r3 = r1 op r2
  [addr y] = r3
  r1 = [addr y]
  r2 = [addr x]
  r3 = r1 op r2
  [addr z] = r3
  [addr x] = 10

With 4 registers (x -> r1, y -> r2, z -> r3, a -> r4):
  r1 = 200
  r2 = r1 op r4
  r3 = r2 op r1
  r1 = 10
4Register Allocation
- Register allocation improves code
  - accessing faster memory
  - fewer instructions
- But
  - There are a limited number of machine registers
  - Some registers can only hold certain types of data
- So, which variables do we allocate to registers?
- Register allocation is extremely important
  - has a huge impact on performance (just look at the previous slide)
  - must be done!
  - it's NP-complete (no polynomial-time algorithm is known)
  - Use heuristics
5Approaches to Register Allocation
- Global Register Allocation Using Usage Counts
  - Assume R registers are available. For each loop nest, allocate registers to the R variables that show the largest estimated benefit from being kept in a register. Little or no cross-nest allocation is done.
- Register Allocation by Graph Coloring
  - our main focus and currently the most common method
  - known about since 1971 but was impractical in early compilers
  - Chaitin came up with the 1st implementation in 1981
  - Briggs proposed an optimistic extension to it around 1989
  - express overlap of the lifetimes of vars with an interference graph
  - try to color this graph with R colors
  - generate spill code when necessary to make the graph R-colorable
6Global Register Allocation Using Usage Counts
- Look at loops or loop nests independently
- For each of these, determine which of the variables in it should be allocated to a register based on
  - netsave(v,i) = u x usesave + d x defsave - l x ldcost - s x stcost
  - where u and d count the uses and definitions of v in block i, and l and s count the loads and stores needed for v in that block
  - usesave is the saving for using a value in a register vs. memory
  - defsave is the saving for defining a value in a register vs. memory
  - ldcost is the execution-time cost of a load instruction
  - stcost is the execution-time cost of a store instruction
- For a loop
  - Allocate the R variables that would benefit the most
7Global Register Allocation Using Usage Counts
  L1: i = i + 1
      j = j op i
      k = 0
  L2: k = k op j
      j = j op k
      if ( k < n ) GOTO L2
      if ( i < n ) GOTO L1

netsave(i,1) = 10 x (2U + 1D - 1L - 1S)
netsave(j,1) = 10 x (1U + 1D - 1L - 1S)
netsave(k,1) = 10 x (0U + 1D - 0L - 1S)
netsave(k,2) = 10^2 x (3U + 1D - 1L - 1S)
netsave(j,2) = 10^2 x (2U + 1D - 1L - 1S)
netsave(n,2) = 10^2 x (1U + 0D - 1L - 0S)
netsave(i,3) = 10 x (1U + 0D - 1L - 0S)
netsave(n,3) = 10 x (1U + 0D - 1L - 0S)

Assuming all costs = 1: i -> 10, j -> 100, k -> 200, n -> 0
If R = 2, allocate j and k. If R = 3, allocate i, j and k.
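The totals in this example can be reproduced with a few lines of Python. This is only a sketch: the function and table names are hypothetical, the block weights (10 and 100) and counts are copied from the slide, and the signs (savings added, load and store costs subtracted) are inferred from the totals the slide reports.

```python
def netsave(weight, uses, defs, loads, stores,
            usesave=1, defsave=1, ldcost=1, stcost=1):
    """Estimated benefit of keeping one variable in a register in one block."""
    return weight * (uses * usesave + defs * defsave
                     - loads * ldcost - stores * stcost)

# (variable, block) -> (weight, U, D, L, S), copied from the example
counts = {
    ("i", 1): (10, 2, 1, 1, 1),
    ("j", 1): (10, 1, 1, 1, 1),
    ("k", 1): (10, 0, 1, 0, 1),
    ("k", 2): (100, 3, 1, 1, 1),
    ("j", 2): (100, 2, 1, 1, 1),
    ("n", 2): (100, 1, 0, 1, 0),
    ("i", 3): (10, 1, 0, 1, 0),
    ("n", 3): (10, 1, 0, 1, 0),
}

# Sum the per-block savings for each variable over the loop nest.
totals = {}
for (var, _block), row in counts.items():
    totals[var] = totals.get(var, 0) + netsave(*row)

def allocate(totals, R):
    """Pick the R variables with the largest total estimated benefit."""
    return sorted(totals, key=totals.get, reverse=True)[:R]

print(totals)
print(allocate(totals, 2), allocate(totals, 3))
```

Setting every L and S entry to 0 models the variant in which loads and stores are hoisted out of the loop nest.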
8Global Register Allocation Using Usage Counts
  L1: i = i + 1
      j = j op i
      k = 0
  L2: k = k op j
      j = j op k
      if ( k < n ) GOTO L2
      if ( i < n ) GOTO L1

netsave(i,1) = 10 x (2U + 1D - 0L - 0S)
netsave(j,1) = 10 x (1U + 1D - 0L - 0S)
netsave(k,1) = 10 x (0U + 1D - 0L - 0S)
netsave(k,2) = 10^2 x (3U + 1D - 0L - 0S)
netsave(j,2) = 10^2 x (2U + 1D - 0L - 0S)
netsave(n,2) = 10^2 x (1U + 0D - 0L - 0S)
netsave(i,3) = 10 x (1U + 0D - 0L - 0S)
netsave(n,3) = 10 x (1U + 0D - 0L - 0S)

Just load and store before and after the loop nest.
Assuming all costs = 1: i -> 40, j -> 320, k -> 410, n -> 110
If R = 2, allocate j and k. If R = 3, allocate n, j and k.
9Register Allocation by Graph Coloring
- Observation: We cannot allocate two values to the same register if they are needed at the same time at some point in the program. They are said to interfere.

s1 to s6 are called the candidates. We wish to allocate each to one of 3 machine registers. We assume they are no longer needed at the end.

  s1 = 2
  s2 = 4
  s3 = s1 op s2
  s4 = s1 op 1
  s5 = s1 op s2
  s6 = s4 op 2

A register holding a variable whose value is no longer needed may be de-allocated and allocated to another variable.
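One way to find which candidates interfere in straight-line code like this is a backward liveness scan. The sketch below is illustrative only: it records just the candidate defined and the candidates used by each instruction (the operators do not matter for liveness), it assumes nothing is live at the end, and it adds an edge when a candidate is live just after another one is defined. The names `code` and `interference_edges` are hypothetical.

```python
# Each instruction: (candidate defined, candidates used)
code = [
    ("s1", []),
    ("s2", []),
    ("s3", ["s1", "s2"]),
    ("s4", ["s1"]),
    ("s5", ["s1", "s2"]),
    ("s6", ["s4"]),
]

def interference_edges(code):
    """Scan backwards, tracking the set of candidates that are still needed."""
    live = set()          # nothing is needed after the last instruction
    edges = set()
    for dst, srcs in reversed(code):
        # dst interferes with everything else that is live after this point
        for other in live - {dst}:
            edges.add(frozenset((dst, other)))
        live.discard(dst)  # the value does not exist before its definition
        live.update(srcs)  # operands must be live before the instruction
    return edges

edges = interference_edges(code)
for e in sorted(tuple(sorted(e)) for e in edges):
    print(e)
```

Under these assumptions s5 does not interfere with s1 or s2, because their last use is the instruction that defines s5, so s5 may reuse one of their registers.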
10Register Allocation - Overview
- We encode interference information in a graph called the interference graph.
- It is an undirected graph in which each vertex is a candidate or a machine register.
- There is an edge between two vertices if the corresponding two candidates interfere (i.e., they cannot occupy the same register), or if a candidate cannot be allocated to a specific machine register. Machine registers interfere with one another.
[Figure: interference graph with vertices s1 to s6 and r1 to r3]
11Register Allocation - Overview
- The register allocation problem is now to assign each vertex in the graph a register such that no two vertices that are connected by an edge are assigned the same register.
- This problem is an instance of a problem known as Graph Coloring: Given R colors, is it possible to assign each vertex one color such that no two connected vertices have the same color?
- If yes, the graph is said to be R-colorable.
[Figure: interference graph with vertices s1 to s6 and r1 to r3. 3-colorable?]
12Register Allocation - Overview
3-colorable?
Yes!
13Register Allocation - Overview
- Assigning candidates to registers is now easy
once the graph has been colored.
  r3 = 2
  r2 = 4
  r1 = r3 op r2
  r1 = r3 op 1
  r3 = r3 op r2
  r1 = r1 op 2
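The assignment used in this rewritten code (s1 to r3, s2 to r2, s3 to r1, s4 to r1, s5 to r3, s6 to r1) can be checked mechanically. In the sketch below the edge list is an assumption, derived by hand from the six-instruction example rather than read from the slide's figure, and all function names are hypothetical.

```python
# Assumed interference edges among the candidates (hand-derived, not from the slide).
edges = [("s1", "s2"), ("s1", "s3"), ("s2", "s3"),
         ("s1", "s4"), ("s2", "s4"), ("s4", "s5")]

# Register assignment read off the rewritten code.
assignment = {"s1": "r3", "s2": "r2", "s3": "r1",
              "s4": "r1", "s5": "r3", "s6": "r1"}

def is_valid_coloring(edges, assignment):
    """True if no edge joins two candidates placed in the same register."""
    return all(assignment[a] != assignment[b] for a, b in edges)

def rename(code, assignment):
    """Replace candidate names by register names; constants pass through."""
    return [(assignment[dst], [assignment.get(s, s) for s in srcs])
            for dst, srcs in code]

# (destination, operands); operators are omitted because they are not needed here.
code = [("s1", ["2"]), ("s2", ["4"]), ("s3", ["s1", "s2"]),
        ("s4", ["s1", "1"]), ("s5", ["s1", "s2"]), ("s6", ["s4", "2"])]

print(is_valid_coloring(edges, assignment))
for dst, srcs in rename(code, assignment):
    print(dst, "<-", srcs)
```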
14Register Allocation - Overview
- When an interference graph cannot be R-colored, we cannot assign all candidates to registers.
- We store some candidates in memory, load before each use and store after each definition.
- This is referred to as spilling.
- We select the smallest number of candidates whose spilling to memory will make the graph R-colorable.
- We re-write the code to include spills, rebuild the interference graph, and color it.
15Register Allocation - Overview
- In our example, assume that we have only 2 registers. Hence, we want to find out if the graph is 2-colorable.
[Figure: interference graph with vertices s1 to s6, r1 and r2]
Not 2-colorable!
16Register Allocation - Overview
  s1 = 2
  store s1 to Mx
  s2 = 4
  load s7 from Mx
  s3 = s7 op s2
  load s8 from Mx
  s4 = s8 op 1
  load s9 from Mx
  s5 = s9 op s2
  s6 = s4 op 2

Add edge between s4 and s5
Not 2-colorable!
17Register Allocation - Overview
  s1 = 2
  store s1 to Mx
  s2 = 4
  load s7 from Mx
  s3 = s7 op s2
  load s8 from Mx
  s4 = s8 op 1
  store s4 to My
  load s9 from Mx
  s5 = s9 op s2
  load s10 from My
  s6 = s10 op 2
18Register Allocation - Overview
It is 2-colorable!
- We can now generate register assignments and code.
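The spill rewrite used in this example (a store after each definition of the victim, a load into a fresh candidate before each use) can be sketched as follows. The tuple encoding, the helper names `spill` and `show`, and the `f(...)` printing of operands are all assumptions made for illustration; the slot names Mx and My and the fresh names s7 to s10 follow the slides.

```python
import itertools

def spill(code, victim, slot, fresh):
    """Rewrite straight-line code so that `victim` lives in memory `slot`.

    Instructions are (kind, dst, srcs, mem) with kind in {"op", "load", "store"}.
    `fresh` yields unused candidate names.
    """
    out = []
    for kind, dst, srcs, mem in code:
        if kind != "op":                  # earlier spill code is left alone
            out.append((kind, dst, srcs, mem))
            continue
        new_srcs = []
        for s in srcs:
            if s == victim:               # load before each use
                tmp = next(fresh)
                out.append(("load", tmp, (), slot))
                new_srcs.append(tmp)
            else:
                new_srcs.append(s)
        out.append(("op", dst, tuple(new_srcs), None))
        if dst == victim:                 # store after each definition
            out.append(("store", None, (dst,), slot))
    return out

def show(instr):
    kind, dst, srcs, mem = instr
    if kind == "load":
        return f"load {dst} from {mem}"
    if kind == "store":
        return f"store {srcs[0]} to {mem}"
    return f"{dst} = f({', '.join(srcs)})"

# Only candidate operands are listed; constants are irrelevant to spilling.
code = [("op", "s1", (), None), ("op", "s2", (), None),
        ("op", "s3", ("s1", "s2"), None), ("op", "s4", ("s1",), None),
        ("op", "s5", ("s1", "s2"), None), ("op", "s6", ("s4",), None)]

fresh = (f"s{n}" for n in itertools.count(7))
after_s1 = spill(code, "s1", "Mx", fresh)
after_s4 = spill(after_s1, "s4", "My", fresh)
print("\n".join(show(i) for i in after_s4))
```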
19Register Allocation - Overview
- Hence, the steps involved in register allocation to R registers are:
  - Identify candidates.
  - Build the interference graph.
  - Color the interference graph.
  - If the graph is R-colorable, then done; else
    - select a victim for spilling.
    - re-write code.
    - repeat.
- We will now examine each of these steps in some detail.
- Register allocation is iterative and time-consuming.
- There are other complications (e.g., calling/return conventions, register windows, etc.) that we will ignore.
20Identifying Candidates
- Using variable names as candidates is not a good approach
  - all uses of a name will be allocated to the same register, but the name may be defined multiple times, and hence take different values. No reason to hold them in the same register!
- Hence, we define the notion of webs as equivalence classes for name uses: the same name in two different webs implies that the name takes an independent value in each web.
  - webs separate the lifetimes of a variable.
  - each web may be allocated to a register!
[Figure: flow graph containing the statements i = 0, i = i+1, i = 6 and i = i-2]
21Webs (or Life-time Separation)
- A definition and all its reachable uses are in the same web.
- All definitions that reach the same use are in the same web.
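These two rules amount to taking the connected components of the definition-use relation, which a union-find structure computes directly. The sketch below assumes the def-use chains are already available from reaching-definitions analysis; the labels d1 to d3 and u1 to u3 are a hypothetical example, not the one in the slide's figure.

```python
def build_webs(du_chains):
    """du_chains maps each definition label to the use labels it reaches."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    for d, uses in du_chains.items():
        find(d)                  # a definition with no uses is its own web
        for u in uses:
            union(d, u)          # rule 1: a definition joins the uses it reaches

    # Rule 2 follows automatically: two definitions reaching the same use
    # were both unioned with that use.
    webs = {}
    for x in list(parent):
        webs.setdefault(find(x), set()).add(x)
    return sorted(webs.values(), key=sorted)

# Hypothetical chains for one variable: d1 and d2 both reach u1, d3 is independent.
du_chains = {"d1": ["u1"], "d2": ["u1", "u2"], "d3": ["u3"]}
webs = build_webs(du_chains)
print(webs)
```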
23Life Ranges and Interference
[Figure: flow graph with basic blocks BB 1 to BB 7; x is defined in BB 1, z in BB 2 and y in BB 5]
LR(web 1) = {BB 1, BB 2, BB 3, BB 4, BB 5, BB 6, BB 7} (x)
LR(web 2) = {BB 2, BB 3} (z)
LR(web 3) = {BB 5, BB 6} (y)
24Life Ranges and Interference
- A set S of basic blocks is said to be convex if, whenever BB a and BB b are in S and c is a BB on a path from a to b, then c is in S.
- The live range of a web is the minimal convex set of instructions that includes all the definitions and uses in the web.
- Intuitively, the live range of a web is the region of BBs in which the web is live.
- Two webs interfere if their live ranges intersect.
- Two webs that interfere must be allocated to different registers.
- The interference of the webs is captured using the interference graph described earlier.
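At basic-block granularity, the minimal convex set can be computed from reachability: a block c belongs to the live range if some block of the web reaches c and c reaches some block of the web. The sketch below uses a small hypothetical CFG (not the one in the slides) and hypothetical function names.

```python
def reachable(cfg, start):
    """Blocks reachable from `start` by following at least one edge."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        for m in cfg[n]:
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen

def live_range(cfg, blocks):
    """Smallest convex set of blocks containing `blocks`."""
    reach = {n: reachable(cfg, n) for n in cfg}
    result = set(blocks)
    for c in cfg:
        on_path = (any(c in reach[a] for a in blocks)
                   and any(b in reach[c] for b in blocks))
        if on_path:
            result.add(c)
    return result

def interfere(cfg, web_a, web_b):
    """Two webs interfere if their live ranges intersect."""
    return bool(live_range(cfg, web_a) & live_range(cfg, web_b))

# Hypothetical diamond-shaped CFG: 1 -> {2, 3} -> 4 -> 5
cfg = {1: [2, 3], 2: [4], 3: [4], 4: [5], 5: []}
print(live_range(cfg, {1, 4}))    # definition in block 1, use in block 4
print(live_range(cfg, {2, 5}))
print(interfere(cfg, {2, 5}, {3}))
```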
25Practical Note Intersection of Live Ranges
- It is sufficient to include an arc if one of the nodes is live at the definition of the other.
[Figure: flow graph from ENTRY to EXIT; one block defines a1 ... an and then b1 ... bn, a later block uses a1 ... an and then b1 ... bn]
26Graph Coloring
- Given a graph G = (V,E), is it possible to assign each vertex in the graph a color such that no two adjacent vertices have the same color?
  - What is the smallest number of colors?
  - Is the graph R-colorable?
- The problem is NP-complete for R ≥ 3.
- However, there exists a good heuristic.
27Graph Coloring - Degree < R rule
- Given a graph G = (V,E) with a vertex v of degree < R, G is R-colorable if and only if the graph G' obtained by removing v and its edges is R-colorable.
- If G' is R-colorable, then adding the vertex v back results in a colorable graph: we pick a color for v that is different from the colors of the vertices connected to v. We know we have enough colors because D(v) < R.
[Figure: a vertex v with D(v) < R]
28Graph Coloring Heuristic
- Remove vertices v (and associated edges) with D(v) < R.
  - remove one at a time and push on a stack.
- When the graph is empty, start to color
  - pop a vertex from the stack.
  - assign it a color different from the vertices it is connected to.
  - a color always exists!
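A minimal sketch of this heuristic is shown below. It assumes the graph is a dictionary from each vertex to its set of neighbours and gives up (returns None) as soon as every remaining vertex has degree ≥ R. The example graph is an assumed interference graph for candidates s1 to s6, not taken from the slide's figure.

```python
def color_graph(adj, R):
    """Return {vertex: color index} or None if the degree < R rule gets stuck."""
    work = {v: set(ns) for v, ns in adj.items()}   # mutable copy of the graph
    stack = []
    while work:
        v = next((u for u in work if len(work[u]) < R), None)
        if v is None:
            return None            # every remaining vertex has degree >= R
        for n in work.pop(v):      # remove v and its edges
            work[n].discard(v)
        stack.append(v)
    colors = {}
    while stack:
        v = stack.pop()
        used = {colors[n] for n in adj[v] if n in colors}
        # v had fewer than R neighbours when it was removed, so a color is free
        colors[v] = min(c for c in range(R) if c not in used)
    return colors

adj = {
    "s1": {"s2", "s3", "s4"},
    "s2": {"s1", "s3", "s4"},
    "s3": {"s1", "s2"},
    "s4": {"s1", "s2", "s5"},
    "s5": {"s4"},
    "s6": set(),
}
print(color_graph(adj, 3))
print(color_graph(adj, 2))
```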
29Graph Coloring Heuristic - Example
3-colorable?
30Graph Coloring - Degree < R rule
[Figure: two graphs, one labeled "3-colorable?" and one labeled "2-colorable?"]
- Neither can be colored using the degree < R rule alone!
- Yet, they are both R-colorable!
31Graph Coloring - Heuristic (Take 2)
- Remove vertices v (and associated edges) with D(v) < R.
  - remove one at a time and push on a stack.
- When all remaining vertices v have D(v) ≥ R (Step 2),
  - select a vertex to spill, mark accordingly.
  - remove the vertex and push on the stack.
- When the graph is empty, start to color
  - pop a vertex from the stack.
  - assign it a color different from the vertices it is connected to.
  - a color always exists if Step 2 was not applied!
  - otherwise, there may or may not be a color available.
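The optimistic variant can be sketched by changing what happens when no vertex has degree < R: a spill candidate is pushed anyway, and whether it really spills is decided only when it is popped. The choice of the highest-degree vertex as the candidate is an assumption made here for illustration, and the two test graphs are hypothetical.

```python
def optimistic_color(adj, R):
    """Return (colors, spilled): spilled lists vertices that found no free color."""
    work = {v: set(ns) for v, ns in adj.items()}
    stack = []
    while work:
        v = next((u for u in work if len(work[u]) < R), None)
        if v is None:
            # Step 2: no low-degree vertex left; push a spill candidate anyway.
            # Assumption: take the vertex of highest remaining degree.
            v = max(work, key=lambda u: len(work[u]))
        for n in work.pop(v):
            work[n].discard(v)
        stack.append(v)
    colors, spilled = {}, []
    while stack:
        v = stack.pop()
        used = {colors[n] for n in adj[v] if n in colors}
        free = [c for c in range(R) if c not in used]
        if free:
            colors[v] = free[0]
        else:
            spilled.append(v)      # the candidate really has to be spilled
    return colors, spilled

# A 4-cycle: every vertex has degree 2, yet two colors suffice.
square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
# A triangle genuinely needs three colors.
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}

print(optimistic_color(square, 2))
print(optimistic_color(triangle, 2))
```

On the 4-cycle the plain degree < R rule is stuck immediately for R = 2, while the optimistic version finds that the pushed candidate's neighbours happen to share a color.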
32Graph Coloring - Example II
2-colorable?
33Graph Coloring - Example III
3-colorable?
[Figure: graph with vertices a, b, c, d and e]
34When no color is available?
- Re-generate the code, spilling a node (which node?)
  - or split a candidate into multiple candidates
- Try to color the new code
- Continue to repeat this process until R-colorable
35Spilling
- How to select a web to spill to memory?
  - One whose corresponding vertex v has D(v) ≥ R.
  - One with minimal spill cost.
- The spill cost is the cost of the extra loads and stores used to retrieve and store the web to memory.

Before spilling:
  s1 = 2
  s2 = 4
  s3 = s1 op s2
  s4 = s1 op 1
  s5 = s1 op s2
  s6 = s4 op 2

After spilling s1:
  s1 = 2
  store s1 to Mx
  s2 = 4
  load s7 from Mx
  s3 = s7 op s2
  load s8 from Mx
  s4 = s8 op 1
  load s9 from Mx
  s5 = s9 op s2
  s6 = s4 op 2
36Spill Cost
- The spill cost is determined by the dynamic cost of the extra loads and stores.
- This is not possible to compute because
  - we do not know how often branches are taken in the CFG,
  - we do not know how many times a loop iterates, and
  - the dynamic cost may be input-dependent and will vary from one execution to the next.
- Hence, we statically estimate (read: approximate) the spill cost based on the structure of the CFG.
  - loops play an important role.
  - assume loops execute 10 times.
  - each definition or use at loop nesting depth d contributes cdef x 10^d or cuse x 10^d.
  - may divide by vertex degree to favor vertices with high degree.
37Graph Coloring - Heuristic (Take 3)
- Remove vertices v (and associated edges) with D(v) < R.
  - remove one at a time and push on a stack.
- When all remaining vertices v have D(v) ≥ R (Step 2),
  - select a vertex to spill (smallest cost / D)
  - remove the vertex and push on the stack.
- When the graph is empty, start to color
  - pop a vertex from the stack.
  - assign it a color different from the vertices it is connected to.
  - a color always exists if Step 2 was not applied!
  - otherwise, there may or may not be a color available.
38Spill Cost - Example
two webs: x and i
spill cost (x) = 10^0 + 10^1 + 10^1 + 10^0 = 22
spill cost (i) = 10^0 + 10^1 + 10^1 + 10^1 + 10^0 = 32
spill x to memory!
only one register! cdef = cuse = 1
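The two sums in this example follow from weighting each definition or use by 10 raised to its loop nesting depth. The sketch below assumes the depths listed (read off the exponents in the sums, since the code figure is not available) and a degree of 1 for each web because the two webs interfere only with each other; the function and variable names are hypothetical.

```python
def spill_cost(depths, cost=1, loop_iterations=10):
    """Static estimate: each reference at loop depth d costs cost * 10**d."""
    return sum(cost * loop_iterations ** d for d in depths)

# Loop nesting depth of every definition and use of each web (assumed).
references = {
    "x": [0, 1, 1, 0],
    "i": [0, 1, 1, 1, 0],
}
degree = {"x": 1, "i": 1}   # the two webs interfere only with each other

costs = {web: spill_cost(d) for web, d in references.items()}
# Selection metric from the heuristic: smallest cost / degree.
victim = min(costs, key=lambda w: costs[w] / degree[w])

print(costs)
print("spill", victim)
```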
39Splitting
- Break a web into multiple webs to reduce interference in the interference graph. This is referred to as splitting.
- Insert instructions to spill the value to memory and load it from memory at the point of the split.
[Figure: graph with vertices 1, 2 and 3. 2-colorable?]
40Representing the Interference Graph
- Usually represented in 2 forms
- Adjacency Matrix
  - a lower triangular matrix such that, for j < i, AdjMatrix[i,j] = true if the ith and jth values are adjacent and false otherwise.
  - allows you to quickly identify if 2 nodes are adjacent
  - good for register coalescing
- Adjacency Lists
  - an array of lists of adjacent nodes. Each element is a record holding information about the node, e.g. color chosen for the node, spill location, spill cost, ...
  - used for graph coloring
  - is easily built from the Adjacency Matrix representation
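The two representations can be sketched together as follows. The class and method names are hypothetical, nodes are assumed to be numbered from 0, and the per-node record of the adjacency lists is reduced to a plain list of neighbours.

```python
class InterferenceGraph:
    """Lower-triangular adjacency matrix: row i stores columns 0 .. i-1."""

    def __init__(self, n):
        self.n = n
        self.rows = [[False] * i for i in range(n)]

    def add_edge(self, i, j):
        if i != j:
            self.rows[max(i, j)][min(i, j)] = True

    def adjacent(self, i, j):
        """Constant-time adjacency test, handy during coalescing."""
        return i != j and self.rows[max(i, j)][min(i, j)]

    def adjacency_lists(self):
        """Build the list form used by the coloring phase."""
        lists = [[] for _ in range(self.n)]
        for i in range(self.n):
            for j in range(i):
                if self.rows[i][j]:
                    lists[i].append(j)
                    lists[j].append(i)
        return lists

g = InterferenceGraph(4)
g.add_edge(0, 1)
g.add_edge(3, 1)
g.add_edge(2, 3)
print(g.adjacent(1, 3), g.adjacent(0, 2))
print(g.adjacency_lists())
```

Storing only the lower triangle halves the space of a full matrix, and the list form gives each node's degree directly as the length of its list.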
41Register Allocation Overview
- Hence, the steps involved in register allocation to R registers are:
  - Identify candidates.
  - Build the Adjacency Matrix representation of the interference graph.
  - Do register coalescing.
  - Build the Adjacency List representation of the interference graph.
  - Color the interference graph.
  - If the graph is R-colorable, then done; else
    - select a victim for spilling.
    - re-write code.
    - repeat.
42Register Coalescing
- Eliminates copies from 1 register to another
  - remove unnecessary copies from SSA
  - remove moves to required register locations
- Search the IR for a copy sj = si such that si and sj do not interfere with each other, or neither si nor sj is stored to between the copy assignment and the end of the routine.
- Find the instructions that wrote si and replace si with sj.
- Update the interference graph
  - anything that interfered with si or sj now interferes with sj
  - however, if using definitions to determine interference, the definition of sj at sj = si is now gone. This may remove some interference.
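The graph update can be sketched as a node merge. This is a simplified illustration: it coalesces a copy only when the two candidates do not interfere (the second condition above is not modelled), it does not recompute interference after the copy disappears, and the graph, the copy list and the function name are hypothetical.

```python
def coalesce(adj, copies):
    """Merge the source of each copy 'dst = src' into dst when they do not interfere.

    adj    : dict node -> set of neighbours (modified in place)
    copies : list of (dst, src) pairs
    Returns (copies that must stay, rename map from merged node to survivor).
    """
    rename = {}

    def rep(x):
        while x in rename:       # follow earlier merges
            x = rename[x]
        return x

    kept = []
    for dst, src in copies:
        d, s = rep(dst), rep(src)
        if d == s:
            continue             # already the same node; the copy is redundant
        if s in adj[d]:
            kept.append((dst, src))   # they interfere, so the copy is needed
            continue
        # Anything that interfered with s now interferes with d.
        for n in adj.pop(s):
            adj[n].discard(s)
            adj[n].add(d)
            adj[d].add(n)
        rename[s] = d
    return kept, rename

adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}, "d": set()}
copies = [("c", "a"), ("b", "c")]
kept, rename = coalesce(adj, copies)
print(kept, rename)
print(adj)
```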