ECE540S Optimizing Compilers

Provided by: Michae7

1
ECE540S Optimizing Compilers

http://www.eecg.toronto.edu/voss/ece540/
March 11, 2002

2
Register Allocation (Muchnick, Chapter 16)
3
Register Allocation

We've been using pseudo registers so far
assumed there was an infinite amount of them
You need to allocate real space to them
memory big but slow
registers only a few but fast
And, most RISCs only operate on registers

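Source code (op denotes an arithmetic operator):
int x, y, z
x = 200
y = x op a
z = y op x
x = 10

Naive code, with every variable kept in memory:
[addr x] = 200
r1 = [addr x]
r2 = [addr a]
r3 = r1 op r2
[addr y] = r3
r1 = [addr y]
r2 = [addr x]
r3 = r1 op r2
[addr z] = r3
[addr x] = 10

With 4 registers (x -> r1, y -> r2, z -> r3, a -> r4):
r1 = 200
r2 = r1 op r4
r3 = r2 op r1
r1 = 10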
4
Register Allocation

Register allocation improves code
accessing faster memory
fewer instructions
But
There are a limited number of machine registers
Some registers can only hold certain types of
data
So, which variables do we allocate to registers?
Register allocation is extremely important
has huge impact on performance (just look at the
previous slide)
must be done!
it's NP-complete (no polynomial-time algorithm is
known)
Use heuristics

5
Approaches to Register Allocation

Global Register Allocation Using Usage Counts
Assume R registers are available. For each loop
nest, allocate registers to the R variables which
show the largest estimated benefit from being
kept in a register. Little or no cross nest
allocation is done.
Register Allocation by Graph Coloring
our main focus and currently the most common
method
known about since 1971 but was impractical in
early compilers
Chaitin came up with 1st implementation in 1981
Briggs proposed an optimistic extension to it
around 1989
express overlap of the lifetimes of vars with an
interference graph
try to color this graph with R colors
generate spill code when necessary to make the
graph R-colorable

6
Global Register Allocation Using Usage Counts

Look at loops or loop nests independently
For each of these determine which of the
variables in it should be allocated to a register
based on
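netsave(v,i) = u * usesave + d * defsave - l * ldcost - s * stcost
where u and d are the numbers of uses and
definitions of v in block i,
l and s are 1 if block i needs a load of v at its
start or a store of v at its end, and 0 otherwise,
usesave is the saving for using a value
in a register vs. memory,
defsave is the saving for defining a
value in a register vs. memory,
ldcost is the execution-time cost
of a load instruction,
stcost is the execution-time cost of a store
instruction.
For a loop
sum 10^depth(i) * netsave(v,i) over the blocks i of
the loop to estimate the benefit of keeping v in a
register.
Allocate the R variables that would benefit the
most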

7
Global Register Allocation Using Usage Counts
L1: i = i + 1
    j = j op i
    k = 0
L2: k = k op j
    j = j op k
    if ( k < n ) GOTO L2
    if ( i < n ) GOTO L1
(op denotes an arithmetic operator)
10 * netsave(i,1) = 10 * (2U + 1D - 1L - 1S)
10 * netsave(j,1) = 10 * (1U + 1D - 1L - 1S)
10 * netsave(k,1) = 10 * (0U + 1D - 0L - 1S)
10^2 * netsave(k,2) = 10^2 * (3U + 1D - 1L - 1S)
10^2 * netsave(j,2) = 10^2 * (2U + 1D - 1L - 1S)
10^2 * netsave(n,2) = 10^2 * (1U + 0D - 1L - 0S)
10 * netsave(i,3) = 10 * (1U + 0D - 1L - 0S)
10 * netsave(n,3) = 10 * (1U + 0D - 1L - 0S)
Assuming all costs = 1, the benefits are i: 10, j: 100,
k: 200, n: 0
If R = 2, allocate j and k. If R = 3, allocate
i, j and k.
8
Global Register Allocation Using Usage Counts
L1: i = i + 1
    j = j op i
    k = 0
L2: k = k op j
    j = j op k
    if ( k < n ) GOTO L2
    if ( i < n ) GOTO L1
(op denotes an arithmetic operator)
10 * netsave(i,1) = 10 * (2U + 1D - 0L - 0S)
10 * netsave(j,1) = 10 * (1U + 1D - 0L - 0S)
10 * netsave(k,1) = 10 * (0U + 1D - 0L - 0S)
10^2 * netsave(k,2) = 10^2 * (3U + 1D - 0L - 0S)
10^2 * netsave(j,2) = 10^2 * (2U + 1D - 0L - 0S)
10^2 * netsave(n,2) = 10^2 * (1U + 0D - 0L - 0S)
10 * netsave(i,3) = 10 * (1U + 0D - 0L - 0S)
10 * netsave(n,3) = 10 * (1U + 0D - 0L - 0S)
Just load and store before and after the loop
nest. Assuming all costs = 1, the benefits are i: 40,
j: 320, k: 410, n: 110
If R = 2, allocate j and k. If R = 3, allocate
n, j and k.
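
The totals on the two usage-count slides can be re-derived mechanically. The following is a minimal Python sketch, assuming the use/definition counts and loop depths read off the example and all costs equal to 1. The names COUNTS, netsave, benefits, and allocate are illustrative and do not come from the course material.

```python
# (variable, loop depth, uses, defs) per block, read off the example:
# blocks 1 and 3 sit in the outer loop only, block 2 in the inner loop.
COUNTS = [
    ("i", 1, 2, 1), ("j", 1, 1, 1), ("k", 1, 0, 1),  # block 1
    ("k", 2, 3, 1), ("j", 2, 2, 1), ("n", 2, 1, 0),  # block 2
    ("i", 1, 1, 0), ("n", 1, 1, 0),                  # block 3
]


def netsave(u, d, ld, st, usesave=1, defsave=1, ldcost=1, stcost=1):
    """Saving for one variable in one block."""
    return u * usesave + d * defsave - ld * ldcost - st * stcost


def benefits(load_store_per_block):
    """Sum 10**depth * netsave over all blocks for each variable."""
    totals = {}
    for var, depth, u, d in COUNTS:
        # Per-block case: load on entry if the block uses the variable,
        # store on exit if it defines it. Otherwise loads and stores happen
        # once outside the loop nest and are not charged here.
        ld = 1 if load_store_per_block and u > 0 else 0
        st = 1 if load_store_per_block and d > 0 else 0
        totals[var] = totals.get(var, 0) + 10 ** depth * netsave(u, d, ld, st)
    return totals


def allocate(totals, R):
    """Pick the R variables with the largest estimated benefit."""
    return sorted(sorted(totals, key=totals.get, reverse=True)[:R])


print(benefits(True))                 # {'i': 10, 'j': 100, 'k': 200, 'n': 0}
print(allocate(benefits(False), 3))   # ['j', 'k', 'n']
```

Charging loads and stores per block makes n worthless to allocate, while hoisting them out of the nest lets n overtake i, which is the difference between the two slides.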
9
Register Allocation by Graph Coloring

Observation: We cannot allocate two values to the
same register if they are needed at the same time
at some point in the program. They are said to
interfere.

s1 to s6 are called the candidates. We wish to
allocate each to one of 3 machine registers. We
assume that none of them is needed at the end.
s1 = 2
s2 = 4
s3 = s1 op s2
s4 = s1 op 1
s5 = s1 op s2
s6 = s4 op 2
(op denotes an arithmetic operator)
A register holding a variable whose value is no
longer needed may be de-allocated and allocated
to another variable.
10
Register Allocation - Overview

We encode interference information in a graph
called the interference graph.
It is an undirected graph in which each vertex is
a candidate or a machine register.
There is an edge between two vertices if the
corresponding two candidates interfere (i.e.,
they cannot occupy the same register), or if a
candidate cannot be allocated to a specific
machine register. Machine registers interfere
with one another

[Figure: interference graph over the candidates s1 to s6 and the machine registers r1 to r3]
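
For straight-line code such as the s1 to s6 example, the candidate-to-candidate edges can be found with one backward liveness scan. This is a minimal Python sketch, assuming nothing is live at the end of the code and using the rule that a newly defined value interferes with everything live after its definition. Machine-register vertices are left out, the names CODE and interference_edges are illustrative, and the resulting edge set may differ in detail from the figure.

```python
# Each statement is (defined candidate, candidates it uses).
CODE = [
    ("s1", []), ("s2", []), ("s3", ["s1", "s2"]),
    ("s4", ["s1"]), ("s5", ["s1", "s2"]), ("s6", ["s4"]),
]


def interference_edges(code):
    """Backward liveness scan over straight-line code."""
    live = set()    # nothing is assumed live at the end
    edges = set()
    for target, uses in reversed(code):
        # The value written to `target` cannot share a register with
        # anything that is still live after this statement.
        for other in live - {target}:
            edges.add(frozenset((target, other)))
        live.discard(target)
        live.update(uses)
    return edges


for edge in sorted(sorted(e) for e in interference_edges(CODE)):
    print(edge)
```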
11
Register Allocation - Overview

The register allocation problem is now to assign
each vertex in the graph a register such that no
two vertices that are connected by an edge are
assigned the same register.
This problem is an instance of a problem known as
Graph Coloring: given R colors, is it possible to
assign each vertex one color such that no two
connected vertices have the same color?
If yes, the graph is said to be R-colorable.

[Figure: interference graph over the candidates s1 to s6 and the machine registers r1 to r3]
3-colorable?
12
Register Allocation - Overview

3-colorable?
Yes!
13
Register Allocation - Overview

Assigning candidates to registers is now easy
once the graph has been colored.

r3 = 2
r2 = 4
r1 = r3 op r2
r1 = r3 op 1
r3 = r3 op r2
r1 = r1 op 2
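
Once a coloring is known, rewriting the code is a plain substitution. The sketch below, in Python, checks the register assignment from this slide against an assumed edge list (derived with the live-at-definition rule, not copied from the figure) and then renames the candidates. The names EDGES, ASSIGN, is_valid, and rewrite are illustrative, and op is a placeholder token for an arithmetic operator.

```python
# Interference edges assumed for the example (live-at-definition rule).
EDGES = [("s1", "s2"), ("s1", "s3"), ("s2", "s3"),
         ("s1", "s4"), ("s2", "s4"), ("s4", "s5")]
# Register assignment used on the slide.
ASSIGN = {"s1": "r3", "s2": "r2", "s3": "r1",
          "s4": "r1", "s5": "r3", "s6": "r1"}
CODE = ["s1 = 2", "s2 = 4", "s3 = s1 op s2",
        "s4 = s1 op 1", "s5 = s1 op s2", "s6 = s4 op 2"]


def is_valid(assign, edges):
    """True if no two interfering candidates share a register."""
    return all(assign[a] != assign[b] for a, b in edges)


def rewrite(code, assign):
    """Replace every candidate name by its assigned register."""
    return [" ".join(assign.get(tok, tok) for tok in line.split())
            for line in code]


assert is_valid(ASSIGN, EDGES)
print("\n".join(rewrite(CODE, ASSIGN)))
```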
14
Register Allocation - Overview

When an interference graph cannot be R-colored,
we cannot assign all candidates to registers.
We store some candidates in memory, load before
each use and store after each definition.
This is referred to as spilling.
We select the smallest number of candidates whose
spilling to memory will make the graph
R-colorable.
We re-write the code to include spills and
rebuild the interference graph and color it.
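
The spill rewrite (store after each definition, load into a fresh candidate before each use) can be sketched in Python as follows. It assumes straight-line code written as strings of the form "target = operands", passes existing load and store lines through untouched, and takes fresh names from a caller-supplied iterator. The function name spill and the string encoding are illustrative assumptions.

```python
import itertools


def spill(code, victim, slot, fresh):
    """Rewrite `code` so that `victim` lives in memory location `slot`."""
    out = []
    for line in code:
        toks = line.split()
        if "=" not in toks:
            out.append(line)            # earlier load/store: keep as is
            continue
        target, operands = toks[0], toks[2:]
        if victim in operands:
            tmp = next(fresh)           # short-lived candidate for this use
            out.append(f"load {tmp} from {slot}")
            operands = [tmp if t == victim else t for t in operands]
        out.append(" ".join([target, "="] + operands))
        if target == victim:
            out.append(f"store {victim} to {slot}")
    return out


code = ["s1 = 2", "s2 = 4", "s3 = s1 op s2",
        "s4 = s1 op 1", "s5 = s1 op s2", "s6 = s4 op 2"]
fresh = (f"s{n}" for n in itertools.count(7))
after_s1 = spill(code, "s1", "Mx", fresh)
after_s4 = spill(after_s1, "s4", "My", fresh)
print("\n".join(after_s4))
```

Each load defines a new candidate with a very short live range, which is why spilling tends to make the rebuilt graph easier to color.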

15
Register Allocation - Overview

In our example, assume that we have only 2
registers. Hence, we want to find out if the
graph is 2-colorable.

[Figure: interference graph over the candidates s1 to s6 and the machine registers r1 and r2]
Not 2-colorable!

Spill s1 to memory.

16
Register Allocation - Overview
s1 = 2
store s1 to Mx
s2 = 4
load s7 from Mx
s3 = s7 op s2
load s8 from Mx
s4 = s8 op 1
load s9 from Mx
s5 = s9 op s2
s6 = s4 op 2
Add edge between s4 and s5
Not 2-colorable!

Spill s4 to memory.

17
Register Allocation - Overview
s1 = 2
store s1 to Mx
s2 = 4
load s7 from Mx
s3 = s7 op s2
load s8 from Mx
s4 = s8 op 1
store s4 to My
load s9 from Mx
s5 = s9 op s2
load s10 from My
s6 = s10 op 2
18
Register Allocation - Overview
It is 2-colorable!

We can now generate register assignments and code.

19
Register Allocation - Overview

Hence, the steps involved in register allocation
to R registers are
Identify candidates.
build interference graph.
color the interference graph.
if the graph is R-colorable, then done; else
select a victim for spilling.
re-write code.
repeat.
We will now examine each of these steps in some
detail.
Register allocation is iterative and is
time-consuming.
There are other complications (e.g.,
calling/return conventions, register windows,
etc) that we will ignore.

20
Identifying Candidates

Using variable names as candidates is not a good
approach
all uses of a name will be allocated to the same
register, but the name may be defined multiple
times, and hence takes different values. No
reason to hold in the same register!
Hence, we define webs as equivalence classes of
name uses: the same name in two different webs
takes an independent value in each web.
webs separate the life-times of a variable.
each web may be allocated to a register!

i = 0
i = i + 1
i = 6
i = i - 2
21
Webs (or Life-time Separation)

A definition and all its reachable uses are in the
same web.
All definitions that reach the same use are in
the same web.
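
The two rules above amount to taking connected components over definition-use links, which a union-find structure handles directly. The following Python sketch assumes reaching-definitions analysis has already produced, for each use site of one variable, the set of definition sites that reach it. The site labels (d1, u1, ...) and the function name build_webs are hypothetical.

```python
def build_webs(reaching):
    """Group definition and use sites of one variable into webs.

    `reaching` maps each use site to the definition sites that reach it.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for use, defs in reaching.items():
        for d in defs:
            # A definition and a use it reaches belong to the same web;
            # two definitions reaching one use are merged transitively.
            parent[find(use)] = find(d)

    webs = {}
    for site in list(parent):
        webs.setdefault(find(site), set()).add(site)
    return sorted(sorted(w) for w in webs.values())


# u1 sees only d1; u2 sees d2 and d3; u3 sees d3.
print(build_webs({"u1": {"d1"}, "u2": {"d2", "d3"}, "u3": {"d3"}}))
# [['d1', 'u1'], ['d2', 'd3', 'u2', 'u3']]
```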

22
Webs (or Life-time Separation)

A definition and all its reachable uses are in the
same web.
All definitions that reach the same use are in
the same web.

23
Life Ranges and Interference
[Figure: control-flow graph of basic blocks BB 1 to BB 7 containing definitions and uses of x, y, and z]
LR(web 1) = {BB 1, BB 2, BB 3, BB 4, BB 5, BB 6, BB 7}
LR(web 2) = {BB 2, BB 3}
LR(web 3) = {BB 5, BB 6}
24
Life Ranges and Interference

A set S of basic blocks is said to be convex if,
whenever BB a and BB b are in S and c is a BB on a
path from a to b, c is also in S.
The live range of a web is the minimal convex set
of instructions that includes all the definitions
and uses in the web.
Intuitively, the live range of a web is the
region of BBs in which the web is live.
Two webs interfere if their live ranges
intersect.
Two webs that interfere must be allocated to
different registers.
The interference of the webs is captured using
the interference graph described earlier.

25
Practical Note: Intersection of Live Ranges

It is sufficient to include an arc between two nodes
if one of them is live at the definition of the other.

[Figure: control-flow graph from ENTRY to EXIT with blocks "define a1 ... an", "define b1 ... bn", "use a1 ... an", and "use b1 ... bn"]
26
Graph Coloring

Given a graph G(V,E), is it possible to assign
each vertex in the graph a color such that no two
adjacent vertices have the same color?
What is the smallest number of colors?
Is the graph R-colorable?
The problem is NP-complete for R ≥ 3.
However, there exists a good heuristic.

27
Graph Coloring - Degree < R rule

Given a graph G(V,E) with a vertex v with degree
< R, G is R-colorable if and only if the graph
G'(V-v,E'), obtained by removing v and its edges,
is R-colorable.
If G' is R-colorable, then adding the vertex v
back results in an R-colorable graph. We pick a color
for v that is different from the colors of the vertices
connected to v. We know we have enough colors
because D(v) < R.

[Figure: a vertex v with D(v) < R]
28
Graph Coloring Heuristic

Remove vertices v (and associated edges) with
D(v) < R.
remove one at a time and push on a stack.
When graph is empty, start to color
pop a vertex from the stack.
assign it a color different from vertices it is
connected to.
a color always exists!
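
The heuristic above can be sketched in a few lines of Python. This is an illustrative version, assuming the graph is given as a dictionary from vertex to neighbor set and colors are the integers 0 to R-1. It gives up (returns None) as soon as no vertex of degree < R remains. The example graph ADJ uses candidate-to-candidate edges assumed for the s1 to s6 code and is not copied from the slides.

```python
def color_graph(adj, R):
    """Degree < R heuristic. Returns {vertex: color} or None if stuck."""
    work = {v: set(ns) for v, ns in adj.items()}
    stack = []
    while work:
        v = next((v for v in sorted(work) if len(work[v]) < R), None)
        if v is None:
            return None                 # every remaining vertex has D(v) >= R
        for n in work.pop(v):           # remove v and its edges
            work[n].discard(v)
        stack.append(v)
    colors = {}
    while stack:
        v = stack.pop()
        used = {colors[n] for n in adj[v] if n in colors}
        # At most D(v) < R neighbors are colored, so a free color exists.
        colors[v] = min(c for c in range(R) if c not in used)
    return colors


ADJ = {"s1": {"s2", "s3", "s4"}, "s2": {"s1", "s3", "s4"},
       "s3": {"s1", "s2"}, "s4": {"s1", "s2", "s5"},
       "s5": {"s4"}, "s6": set()}
print(color_graph(ADJ, 3))
print(color_graph(ADJ, 2))   # None: s1, s2, s4 form a triangle
```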

29
Graph Coloring Heuristic - Example
3-colorable?
30
Graph Coloring - Degree < R rule
3-colorable?
2-colorable?

Neither can be shown to be R-colorable using the degree < R rule!
Yet, they are both R-colorable!
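
The graphs in the figure are not reproduced here, but a 4-cycle is a standard case of this effect: every vertex has degree 2, so the degree < 2 rule never fires, yet the graph is 2-colorable. The Python sketch below confirms that by exhaustive search, which is only practical for tiny graphs. The function name is_colorable and the example graphs are assumptions for illustration.

```python
from itertools import product


def is_colorable(vertices, edges, R):
    """Exhaustively test whether the graph can be colored with R colors."""
    for assignment in product(range(R), repeat=len(vertices)):
        color = dict(zip(vertices, assignment))
        if all(color[a] != color[b] for a, b in edges):
            return True
    return False


square = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
triangle = [("a", "b"), ("b", "c"), ("c", "a")]
print(is_colorable("abcd", square, 2))    # True
print(is_colorable("abc", triangle, 2))   # False
```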

31
Graph Coloring - Heuristic (Take 2)

1. Remove vertices v (and associated edges) with
D(v) < R.
remove one at a time and push on a stack.
2. When all remaining vertices v have D(v) ≥ R,
select a vertex to spill, mark accordingly.
remove the vertex and push on the stack
3. When graph is empty, start to color
pop a vertex from the stack.
assign it a color different from vertices it is
connected to.
a color always exists if Step 2 was not applied!
otherwise, there may or may not be a color
available.
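
A Python sketch of this optimistic variant follows. It assumes the same dictionary-of-neighbor-sets representation, picks the highest-degree vertex when no vertex has D(v) < R (the slide does not fix this choice at this point, so that criterion is an assumption), and reports vertices that end up without a color as spills. The name color_optimistic is illustrative.

```python
def color_optimistic(adj, R):
    """Returns (colors, spilled) for an undirected graph adj."""
    work = {v: set(ns) for v, ns in adj.items()}
    stack = []
    while work:
        v = next((v for v in sorted(work) if len(work[v]) < R), None)
        if v is None:
            # Spill candidate: assumed choice is the highest degree.
            v = max(sorted(work), key=lambda x: len(work[x]))
        for n in work.pop(v):
            work[n].discard(v)
        stack.append(v)             # pushed either way; colored if lucky
    colors, spilled = {}, []
    while stack:
        v = stack.pop()
        used = {colors[n] for n in adj[v] if n in colors}
        free = [c for c in range(R) if c not in used]
        if free:
            colors[v] = free[0]
        else:
            spilled.append(v)       # no color left: a real spill
    return colors, spilled


square = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
print(color_optimistic(square, 2))     # ({'d': 0, 'c': 1, 'b': 0, 'a': 1}, [])
print(color_optimistic(triangle, 2))   # ({'c': 0, 'b': 1}, ['a'])
```

On the 4-cycle the spill candidate still receives a color because its two neighbors happen to share one, which is the case the plain degree < R rule misses.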
32
Graph Coloring - Example II
2-colorable?
33
Graph Coloring - Example III
3-colorable?
[Figure: graph with vertices a, b, c, d, and e]
34
When no color is available?

Re-generate the code, spilling a node (which
node?)
or split a candidate into multiple candidates
Try to color the new code
Continue to repeat this process until R-colorable

35
Spilling

How do we select a web to spill to memory?
One whose corresponding vertex v has D(v) ≥ R.
One with minimal spill cost.
The spill cost is the cost of the extra loads and
stores used to retrieve and store the web to
memory.

Original code:
s1 = 2
s2 = 4
s3 = s1 op s2
s4 = s1 op 1
s5 = s1 op s2
s6 = s4 op 2

After spilling s1:
s1 = 2
store s1 to Mx
s2 = 4
load s7 from Mx
s3 = s7 op s2
load s8 from Mx
s4 = s8 op 1
load s9 from Mx
s5 = s9 op s2
s6 = s4 op 2
36
Spill Cost

The spill cost is determined by the dynamic cost
of the extra loads and stores.
The exact cost cannot be computed at compile time because
we do not know how often branches are taken in the
CFG,
we do not know how many times a loop iterates,
and
the dynamic cost may be input-dependent and will
vary from one execution to the next.
Hence, we statically estimate (read: approximate)
the spill cost based on the structure of the CFG.
loops play an important role.
assume loops execute 10 times.
may divide the cost by the vertex degree to favor
spilling vertices with high degree.

37
Graph Coloring - Heuristic (Take 3)

1. Remove vertices v (and associated edges) with
D(v) < R.
remove one at a time and push on a stack.
2. When all remaining vertices v have D(v) ≥ R,
select a vertex to spill (smallest cost / D )
remove the vertex and push on the stack
3. When graph is empty, start to color
pop a vertex from the stack.
assign it a color different from vertices it is
connected to.
a color always exists if Step 2 was not applied!
otherwise, there may or may not be a color
available.
38
Spill Cost - Example
two webs x and i
only one register! cdef = cuse = 1
spill cost (x) = 10^0 + 10^1 + 10^1 + 10^0 = 22
spill cost (i) = 10^0 + 10^1 + 10^1 + 10^1 + 10^0 = 32
spill x to memory!
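
The static estimate charges each definition or use 10^depth, where depth is the loop nesting depth of the reference. The Python sketch below reproduces the two totals and then applies the cost / D selection from Take 3. The reference lists are hypothetical (only the depths matter when cdef = cuse = 1), the degrees assume x and i interfere only with each other, and the names spill_cost and pick_victim are illustrative.

```python
def spill_cost(refs, cdef=1, cuse=1):
    """Sum of cdef or cuse times 10**depth over all references of a web."""
    return sum((cdef if kind == "def" else cuse) * 10 ** depth
               for kind, depth in refs)


def pick_victim(costs, degree):
    """Web with the smallest cost / degree (degrees are >= 1 here)."""
    return min(sorted(costs), key=lambda w: costs[w] / degree[w])


# (kind, loop depth) per reference; the kinds are assumed for illustration.
X_REFS = [("def", 0), ("use", 1), ("def", 1), ("use", 0)]
I_REFS = [("def", 0), ("use", 1), ("use", 1), ("def", 1), ("use", 0)]

costs = {"x": spill_cost(X_REFS), "i": spill_cost(I_REFS)}
degree = {"x": 1, "i": 1}
print(costs)                        # {'x': 22, 'i': 32}
print(pick_victim(costs, degree))   # x
```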
39
Splitting

Break a web into multiple webs to reduce
interference in the interference graph. This is
referred to as splitting.
Insert instructions to spill value to memory and
load it from memory at point of split.

[Figure: graph with vertices labeled 1, 2, and 3]
2-colorable?
40
Representing the Interference Graph

Usually represented in 2 forms
Adjacency Matrix
a lower triangular matrix such that, if i < j,
AdjMatrix[i,j] is true if the ith and jth values
are adjacent and is false otherwise.
allows you to quickly identify if 2 nodes are
adjacent
good for register coalescing
Adjacency Lists
an array of lists of adjacent nodes. Each
element is a record holding information about the
node, e.g. color chosen for the node, spill
location, spill cost, ...
used for graph coloring
is easily built from the Adjacency Matrix
representation
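
Both representations fit in a small Python class. This is a sketch under the assumption that nodes are numbered 0 to n-1 and that the triangular matrix is indexed with the larger node number as the row. The per-node records (color, spill location, spill cost) are left out, and the class and method names are illustrative.

```python
class InterferenceGraph:
    def __init__(self, n):
        self.n = n
        # Row i holds columns 0..i-1 only: strictly lower triangular.
        self.rows = [[False] * i for i in range(n)]

    def add_edge(self, i, j):
        if i != j:
            self.rows[max(i, j)][min(i, j)] = True

    def adjacent(self, i, j):
        """Constant-time adjacency test, the query coalescing needs."""
        return i != j and self.rows[max(i, j)][min(i, j)]

    def adjacency_lists(self):
        """Derive the list form used by the coloring phase."""
        lists = [[] for _ in range(self.n)]
        for i in range(self.n):
            for j in range(i):
                if self.rows[i][j]:
                    lists[i].append(j)
                    lists[j].append(i)
        return lists


g = InterferenceGraph(4)
g.add_edge(0, 2)
g.add_edge(3, 1)
print(g.adjacent(2, 0), g.adjacent(0, 1))   # True False
print(g.adjacency_lists())                  # [[2], [3], [0], [1]]
```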

41
Register Allocation Overview

Hence, the steps involved in register allocation
to R registers are
Identify candidates.
build Adjacency Matrix representation of the
interference graph
do register coalescing
build the Adjacency List representation of the
interference graph
color the interference graph.
if the graph is R-colorable, then done; else
select a victim for spilling.
re-write code.
repeat.

42
Register Coalescing

Eliminates copies from 1 register to another
remove unnecessary copies from SSA
remove moves to required register locations
Search the IR for copies sj = si such that si and sj
do not interfere with each other, or neither si nor sj
is stored to between the copy assignment and the
end of the routine.
find the instructions that wrote si and replace si
with sj.
update the interference graph
anything that interfered with si or sj now
interferes with sj
however, if using definitions to determine
interference, the definition of sj at sj = si is now
gone. This may remove some interference.
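
The graph update for one coalescing step can be sketched in Python as below. It assumes the dictionary-of-neighbor-sets representation, refuses to merge two nodes that interfere, and covers only the conservative part of the update (everything adjacent to si becomes adjacent to sj). Rewriting the IR and recomputing definition-based interference are not shown, and the name coalesce is illustrative.

```python
def coalesce(adj, si, sj):
    """Merge si into sj. Returns the new graph, or None if they interfere."""
    if si in adj[sj]:
        return None
    merged = {v: set(ns) for v, ns in adj.items() if v != si}
    for n in adj[si]:
        merged[n].discard(si)   # n no longer sees si ...
        merged[n].add(sj)       # ... it sees sj instead
        merged[sj].add(n)
    return merged


adj = {"s1": {"x"}, "s2": {"y"}, "x": {"s1"}, "y": {"s2"}}
print(coalesce(adj, "s1", "s2"))
print(coalesce(adj, "s1", "x"))   # None: s1 and x interfere
```

The input graph is left unmodified, so a caller can try several candidate copies and keep only the merges that leave the graph colorable.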