Code Generation

1
Code Generation

The target machine
Instruction selection and register allocation
Basic blocks and flow graphs
A simple code generator
Peephole optimization
Instruction selector generator
Graph-coloring register allocator

2
The Target Machine

A byte-addressable machine with four bytes to a
word and n general-purpose registers
Two-address instructions of the form
op source, destination
Six addressing modes

Mode               Form    Address                     Added cost
absolute           M       M                           1
register           R       R                           0
indexed            c(R)    c + contents(R)             1
indirect register  *R      contents(R)                 0
indirect indexed   *c(R)   contents(c + contents(R))   1
literal            #c      c                           1

3
Examples
MOV R0, M
MOV 4(R0), M
MOV *R0, M
MOV *4(R0), M
MOV #1, R0
4
Instruction Costs

Cost of an instruction = 1 + the added costs of its
source and destination addressing modes
This cost corresponds to the length (in words) of
the instruction
Minimizing instruction length also tends to minimize
the instruction execution time
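The cost rule can be sketched as a small function. The mode names and the ADDED_COST table are hypothetical labels for the six addressing modes of the target machine; they are chosen for this illustration only.

```python
# Added cost of each addressing mode, as listed for the target machine.
# The mode names are hypothetical labels chosen for this sketch.
ADDED_COST = {
    "absolute": 1,           # M
    "register": 0,           # R
    "indexed": 1,            # c(R)
    "indirect_register": 0,  # *R
    "indirect_indexed": 1,   # *c(R)
    "literal": 1,            # #c
}

def instruction_cost(source_mode, dest_mode):
    """Cost = 1 + added cost of the source mode + added cost of the destination mode."""
    return 1 + ADDED_COST[source_mode] + ADDED_COST[dest_mode]

print(instruction_cost("register", "register"))  # register-to-register move
print(instruction_cost("register", "absolute"))  # register to memory
print(instruction_cost("literal", "register"))   # constant to register
```

A register-to-register move therefore costs one word, while each memory address or constant adds one more.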

5
Examples
Instruction              Cost
MOV R0, R1               1
MOV R0, M                2
MOV #1, R0               2
MOV 4(R0), 12(R1)        3
6
An Example
Consider a := b + c

1. MOV b, R0
   ADD c, R0
   MOV R0, a

2. MOV b, a
   ADD c, a

3. R0, R1, R2 contain the addresses of a, b, c
   MOV *R1, *R0
   ADD *R2, *R0

4. R1, R2 contain the values of b, c
   ADD R2, R1
   MOV R1, a
7
Instruction Selection

Code skeleton

x := y + z      MOV y, R0
                ADD z, R0
                MOV R0, x

a := b + c      MOV b, R0
d := a + e      ADD c, R0
                MOV R0, a
                MOV a, R0
                ADD e, R0
                MOV R0, d

Multiple choices

a := a + 1      MOV a, R0       or      INC a
                ADD #1, R0
                MOV R0, a

8
Register Allocation

Register allocation: select the set of variables
that will reside in registers
Register assignment: pick the specific register
that a variable will reside in
The problem is NP-complete

9
An Example
t := a + b      MOV a, R1        t := a + b      MOV a, R0
t := t * c      ADD b, R1        t := t + c      ADD b, R0
t := t / d      MUL c, R0        t := t / d      ADD c, R0
                DIV d, R0                        SRDA R0, 32
                MOV R1, t                        DIV d, R0
                                                 MOV R1, t
10
Basic Blocks

A basic block is a sequence of consecutive
statements in which control enters at the
beginning and leaves at the end without halt or
possibility of branching except at the end

11
An Example
(1) prod := 0
(2) i := 1
(3) t1 := 4 * i
(4) t2 := a[t1]
(5) t3 := 4 * i
(6) t4 := b[t3]
(7) t5 := t2 * t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
12
Flow Graphs

A flow graph is a directed graph
The nodes in the graph are basic blocks
There is an edge from B1 to B2 iff B2 can immediately
follow B1 in some execution sequence, that is, if either
B2 immediately follows B1 in the program text, or
there is a jump from B1 to B2
B1 is a predecessor of B2, B2 is a successor of B1

13
An Example
B0
(1) prod := 0
(2) i := 1

B1
(3) t1 := 4 * i
(4) t2 := a[t1]
(5) t3 := 4 * i
(6) t4 := b[t3]
(7) t5 := t2 * t4
(8) t6 := prod + t5
(9) prod := t6
(10) t7 := i + 1
(11) i := t7
(12) if i <= 20 goto (3)
14
Construction of Basic Blocks

Determine the set of leaders
the first statement is a leader
the target of a jump is a leader
any statement immediately following a jump is a
leader
For each leader, its basic block consists of the
leader and all statements up to but not including
the next leader or the end of the program
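The leader rules can be sketched as follows. The statement encoding, a list of (text, jump_target) pairs with 0-based targets, is an assumption made for this illustration.

```python
def find_leaders(stmts):
    """stmts: list of (text, jump_target) pairs; jump_target is a 0-based
    index into stmts, or None for a statement that is not a jump."""
    leaders = set()
    if stmts:
        leaders.add(0)                      # rule 1: the first statement
    for i, (_, target) in enumerate(stmts):
        if target is not None:
            leaders.add(target)             # rule 2: the target of a jump
            if i + 1 < len(stmts):
                leaders.add(i + 1)          # rule 3: the statement after a jump
    return sorted(leaders)

def basic_blocks(stmts):
    """Split stmts into (start, end) index ranges, end exclusive."""
    leaders = find_leaders(stmts)
    ends = leaders[1:] + [len(stmts)]
    return list(zip(leaders, ends))

# Twelve statements; the last one is a conditional jump back to statement (3).
program = [("stmt%d" % (n + 1), None) for n in range(11)]
program.append(("conditional jump to (3)", 2))
print(basic_blocks(program))
```

For the twelve-statement program this yields two blocks, statements (1)-(2) and statements (3)-(12).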

15
Representation of Basic Blocks

Each basic block is represented by a record
consisting of
a count of the number of statements
a pointer to the leader
a list of predecessors
a list of successors

16
Define and Use

A three-address statement x := y + z is said
to define x and to use y and z
A name is live in a basic block at a given point
if its value is used after that point, perhaps in
another basic block

17
Next-Use Information
i: x := ...
   (no assignment to x)
j: y := ... x ...

Statement j uses the value of x defined at i
18
An Example
                      b(1), c(1,4), d(2)
(1) a := b + c        a(2,3,5), c(4), d(2)
(2) e := a + d        a(3,5), c(4), e(3)
(3) f := e - a        a(5), c(4), f(4)
(4) e := f + c        a(5), e(5)
(5) g := e - a        g(?)

b, c, d are live at the beginning of the block
19
Computing Next Uses

Scan the statements i: x := y op z of the block backward
Attach to statement i the information currently
found in the symbol table regarding the next uses
and liveness of x, y, and z
In the symbol table, set x to not live and
clear the next uses of x
In the symbol table, set y and z to live and
add i to the next uses of y and z

among blocks
within blocks
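A minimal sketch of the backward scan within one block. It assumes statements are given as (x, y, z) triples for x := y op z, and it records only the nearest next use of each name rather than a full list; both choices are simplifications made for this example.

```python
def next_use_info(block, live_on_exit):
    """block: list of (x, y, z) triples for statements x := y op z.
    Returns (info, table). info[i] maps x, y, z of statement i to a pair
    (live, next_use) as found in the table when statement i is reached;
    next_use is a 0-based statement index or None."""
    table = {name: (True, None) for name in live_on_exit}
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):      # backward scan
        x, y, z = block[i]
        # Attach the current table entries to statement i.
        info[i] = {n: table.get(n, (False, None)) for n in (x, y, z)}
        # x is defined here: not live, no next use, just before statement i.
        table[x] = (False, None)
        # y and z are used here. Doing this after the update for x keeps
        # statements such as x := x op z correct.
        table[y] = (True, i)
        table[z] = (True, i)
    return info, table

block = [("a", "b", "c"), ("e", "a", "d"), ("f", "e", "a"),
         ("e", "f", "c"), ("g", "e", "a")]
info, table = next_use_info(block, live_on_exit={"g"})
print(sorted(n for n, (live, _) in table.items() if live))
```

After the scan, the names still marked live in the table are exactly those live at the beginning of the block.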
20
A Simple Code Generator

Consider each statement in a basic block in turn,
remembering if operands are in registers
Assume that
each operator has a corresponding target language
operator
computed results can be left in registers as long
as possible, unless
out of registers
at the end of a basic block

21
Register and Address Descriptors

A register descriptor keeps track of what is
currently in each register
An address descriptor keeps track of the
location(s) where the current value of the name
can be found at run time

22
An Example
d := (a - b) + (a - c) + (a - c)

Statement     Code          Register descriptor   Address descriptor
t := a - b    MOV a, R0
              SUB b, R0     R0(t)                 t(R0)
u := a - c    MOV a, R1
              SUB c, R1     R0(t), R1(u)          t(R0), u(R1)
v := t + u    ADD R1, R0    R0(v), R1(u)          v(R0), u(R1)
d := v + u    ADD R1, R0    R0(d)                 d(R0)
              MOV R0, d
23
Code Generation Algorithm

Consider an instruction of the form x := y op z
Invoke getreg to determine the location L where
the result of y op z will be placed
Determine a current location y' of y from the
address descriptor (a register location is preferred).
If y' is not L, generate MOV y', L
Generate op z', L, where z' is a current
location of z from the address descriptor
Update the address and register descriptors for
x, y, z, and L

24
Code Generation Algorithm

Consider an instruction of the form x := y
If y is in a register, change the register and
address descriptors to record that the value of x
is now found only in the register holding y
If y is in memory,
if x has next use in the block, invoke getreg to
find a register r, generate MOV y, r, and
make r the location of x
otherwise, generate MOV y, x

25
Code Generation Algorithm

Once all statements in the basic block are
processed, we store those names that are live on
exit and not in their memory locations

26
The Function getreg

Consider an instruction of the form x := y op z
If y is in a register r that holds the value of
no other names, and y is not live and has no next
use after this statement, return r
Otherwise, return an empty register r if there is
one
Otherwise, if x has a next use in the block, or
op is an operator requiring a register, find an
occupied register r. Store the value of r, update
the address descriptor, and return r
If x has no next use, or no suitable occupied
register can be found, return the memory location
of x
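The descriptors and getreg can be combined into a small generator for one basic block. This is a simplified sketch under stated assumptions, not the full algorithm: statements are (x, op, y, z) tuples with op already a target mnemonic, getreg never returns a memory location (the last rule is omitted), the spill victim is simply the first register, and operands loaded from memory are not recorded in the descriptors.

```python
def gen_block(block, live_out, nregs=2):
    reg = {"R%d" % i: set() for i in range(nregs)}  # register descriptor
    addr = {}                                       # address descriptor
    code = []

    def needed_after(name, i):
        # True if name is used later in the block or is live on exit.
        for x, _, y, z in block[i + 1:]:
            if name in (y, z):
                return True
            if name == x:
                return False
        return name in live_out

    def where(name):
        # Prefer a register location; fall back to the name's memory cell.
        return next((r for r in reg if name in reg[r]), name)

    def getreg(i, y):
        for r in reg:                               # reuse y's register
            if reg[r] == {y} and not needed_after(y, i):
                return r
        for r in reg:                               # an empty register
            if not reg[r]:
                return r
        r = next(iter(reg))                         # spill (victim choice assumed)
        for n in reg[r]:
            if n not in addr[n]:
                code.append("MOV %s, %s" % (r, n))
                addr[n].add(n)
            addr[n].discard(r)
        reg[r] = set()
        return r

    for i, (x, op, y, z) in enumerate(block):
        L = getreg(i, y)
        if where(y) != L:
            code.append("MOV %s, %s" % (where(y), L))
        code.append("%s %s, %s" % (op, where(z), L))
        for n in reg[L]:                            # L no longer holds its old names
            addr[n].discard(L)
        for r in reg:
            reg[r].discard(x)
        reg[L] = {x}
        addr[x] = {L}

    for n in sorted(live_out):                      # store names live on exit
        if n in addr and n not in addr[n]:
            code.append("MOV %s, %s" % (where(n), n))
    return code

example = [("t", "SUB", "a", "b"), ("u", "SUB", "a", "c"),
           ("v", "ADD", "t", "u"), ("d", "ADD", "v", "u")]
print("\n".join(gen_block(example, live_out={"d"})))
```

On the block for d := (a - b) + (a - c) + (a - c), with only d live on exit, the sketch emits seven instructions ending in MOV R0, d.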

27
An Example
d := (a - b) + (a - c) + (a - c)

Statement     Code          Register descriptor   Address descriptor
t := a - b    MOV a, R0
              SUB b, R0     R0(t)                 t(R0)
u := a - c    MOV a, R1
              SUB c, R1     R0(t), R1(u)          t(R0), u(R1)
v := t + u    ADD R1, R0    R0(v), R1(u)          v(R0), u(R1)
d := v + u    ADD R1, R0    R0(d)                 d(R0)
              MOV R0, d
28
Indexing and Pointer Operations
              i in Ri          i in Mi          i in Si(A)
a := b[i]     MOV b(Ri), R     MOV Mi, R        MOV Si(A), R
                               MOV b(R), R      MOV b(R), R
a[i] := b     MOV b, a(Ri)     MOV Mi, R        MOV Si(A), R
                               MOV b, a(R)      MOV b, a(R)

              p in Rp          p in Mp          p in Sp(A)
a := *p       MOV *Rp, R       MOV Mp, R        MOV Sp(A), R
                               MOV *R, R        MOV *R, R
*p := a       MOV a, *Rp       MOV Mp, R        MOV a, R
                               MOV a, *R        MOV R, *Sp(A)
29
Conditional Statements

Condition codes

if x < y goto z     CMP x, y
                    CJ< z

Condition code descriptors

x := y + z          MOV y, R0
if x < 0 goto z     ADD z, R0
                    MOV R0, x
                    CJ< z

30
Global Register Allocation

Keep live variables in registers across block
boundaries
Keep variables frequently used in inner loops in
registers

31
Loops

A loop is a collection of nodes such that
all nodes in the collection are strongly
connected
the collection of nodes has a unique entry
An inner loop is one that contains no other loops

32
Variable Usage Counts

Savings
Count a saving of one for each use of x in loop L
that is not preceded by an assignment to x in the
same block
Save two units if we can avoid a store of x at
the end of a block
Costs
Cost two units if x is live at the entry or exit
of the inner loop

33
An Example
34
An Example
use(a, B1) = 0, use(a, B2) = 1, use(a, B3) = 1, use(a, B4) = 0
live(a, B1) = 1, live(a, B2) = 0, live(a, B3) = 0, live(a, B4) = 0
save(a) = (0 + 1 + 1 + 0) + 2 * (1 + 0 + 0 + 0) = 4
save(b) = 5
save(c) = 3
save(d) = 6
save(e) = 4
save(f) = 4
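The savings figure for a name is the sum of its use counts plus twice its live counts over the blocks of the loop. A minimal sketch, assuming the per-block counts are supplied as dictionaries keyed by block name:

```python
def savings(use, live):
    """Approximate benefit of keeping a name in a register throughout a loop.

    use[B]  : uses of the name in block B not preceded by an assignment in B
    live[B] : 1 if a store of the name at the end of B can be avoided, else 0
    """
    return sum(use[b] + 2 * live[b] for b in use)

use_a = {"B1": 0, "B2": 1, "B3": 1, "B4": 0}
live_a = {"B1": 1, "B2": 0, "B3": 0, "B4": 0}
print(savings(use_a, live_a))
```

With the counts given for a, the function reproduces save(a) = 4.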
35
An Example
36
Register Assignment for Outer Loops

Apply the same idea for inner loops to
progressively larger loops
If an outer loop L1 contains an inner loop L2, a
name allocated a register in L2 need not be
allocated a register in L1-L2
If name x is allocated a register in L1 but not
in L2, we must store x on entrance to L2 and load x
on exit from L2
If name x is allocated a register in L2 but not
in L1, we must load x on entrance to L2 and store x
on exit from L2

37
Peephole Optimization

Improve the performance of the target program by
examining and transforming a short sequence of
target instructions
May need repeated passes over the code
Can also be applied directly after intermediate
code generation

38
Examples

Redundant loads and stores
MOV R0, a
MOV a, R0

Algebraic simplification
x := x + 0
x := x * 1

Constant folding
x := 2 + 3    =>    x := 5
y := x + 3    =>    y := 8
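The redundant-load case can be sketched as a single pass over the instruction list. The (label, op, src, dst) tuple encoding is an assumption of this example. The load is dropped only when it carries no label, since a labeled instruction may be reached by a jump that bypasses the preceding store.

```python
def drop_redundant_loads(code):
    """code: list of (label, op, src, dst) tuples; label is None when absent.
    Removes 'MOV a, R' when it immediately follows 'MOV R, a'."""
    out = []
    for instr in code:
        label, op, src, dst = instr
        if out and label is None and op == "MOV":
            _, prev_op, prev_src, prev_dst = out[-1]
            if prev_op == "MOV" and (prev_src, prev_dst) == (dst, src):
                continue            # the value is already in dst
        out.append(instr)
    return out

example = [(None, "MOV", "R0", "a"),
           (None, "MOV", "a", "R0"),
           (None, "ADD", "b", "R0")]
print(drop_redundant_loads(example))
```

Since one removal can expose another opportunity, a pass like this is typically repeated until the code stops changing.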

39
Examples

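Unreachable code

#define debug 0
if (debug) {
    print debugging information
}

is translated to

if 0 <> 1 goto L1
print debugging information
L1:

which constant evaluation turns into

if 1 goto L1
print debugging information
L1:

so the print statement is unreachable and can be removed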

40
Examples

Flow-of-control optimization

goto L1                         goto L2
...                      =>     ...
L1: goto L2                     L1: goto L2

goto L1                         if a < b goto L2
...                      =>     goto L3
L1: if a < b goto L2            ...
L3:                             L3:

41
Examples

Reduction in strength: replace expensive
operations by cheaper ones
x^2 → x * x
fixed-point multiplication and division by a
power of 2 → shift
floating-point division by a constant →
floating-point multiplication by a constant

42
Examples

Use of machine idioms: hardware instructions for
certain specific operations
auto-increment and auto-decrement addressing modes
(pushing or popping a stack in parameter passing)

43
DAG Representation of Blocks

Easy to determine
common subexpressions
names used in the block but evaluated outside the
block
names whose values could be used outside the block

44
DAG Representation of Blocks

Leaves labeled by unique identifiers
Interior nodes labeled by operator symbols
Nodes optionally given a sequence of identifiers,
having the value represented by the nodes

45
An Example
(1) t1 := 4 * i
(2) t2 := a[t1]
(3) t3 := 4 * i
(4) t4 := b[t3]
(5) t5 := t2 * t4
(6) t6 := prod + t5
(7) prod := t6
(8) t7 := i + 1
(9) i := t7
(10) if i <= 20 goto (1)
46
Constructing a DAG

Consider x := y op z. Other statements can be
handled similarly
If node(y) is undefined, create a leaf labeled y
and let node(y) be this leaf. If node(z) is
undefined, create a leaf labeled z and let
node(z) be that leaf

47
Constructing a DAG

Determine if there is a node labeled op, whose
left child is node(y) and its right child is
node(z). If not, create such a node. Let n be the
node found or created.
Delete x from the list of attached identifiers
for node(x). Append x to the list of attached
identifiers for the node n and set node(x) to n
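The construction steps for x := y op z can be sketched as below. The tuple encoding of statements and the dictionary layout of nodes are assumptions of this example, and array assignments, pointer assignments, and procedure calls, which need extra care, are not handled.

```python
def build_dag(block):
    """block: list of (x, op, y, z) tuples for statements x := y op z."""
    nodes = []     # node records; leaves have op None
    node_of = {}   # node(name): index of the node currently holding name's value
    interior = {}  # (op, left, right) -> index, used to find an existing node

    def node(name):
        # Create a leaf for a name that has no node yet.
        if name not in node_of:
            nodes.append({"op": None, "leaf": name, "ids": []})
            node_of[name] = len(nodes) - 1
        return node_of[name]

    for x, op, y, z in block:
        left, right = node(y), node(z)
        key = (op, left, right)
        if key not in interior:              # no such node yet: create it
            nodes.append({"op": op, "left": left, "right": right, "ids": []})
            interior[key] = len(nodes) - 1
        n = interior[key]
        old = node_of.get(x)
        if old is not None and x in nodes[old]["ids"]:
            nodes[old]["ids"].remove(x)      # x no longer names its old node
        nodes[n]["ids"].append(x)
        node_of[x] = n
    return nodes, node_of

block = [("t1", "*", "4", "i"), ("t2", "[]", "a", "t1"),
         ("t3", "*", "4", "i"), ("t4", "[]", "b", "t3")]
nodes, node_of = build_dag(block)
print(node_of["t1"] == node_of["t3"])
```

Because 4 * i is computed twice in the block, t1 and t3 end up attached to the same node, which is how the dag exposes the common subexpression.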

48
Reconstructing Quadruples

Evaluate the interior nodes in topological order
Assign the evaluated value to one of the node's
attached identifiers x, preferring one whose value is
needed outside the block
If there is no attached identifier, create a new
temp to hold the value
If there are additional attached identifiers y1,
y2, ..., yk whose values are also needed outside
the block, add y1 := x, y2 := x, ..., yk := x

49
An Example
(1) t1 := 4 * i
(2) t2 := a[t1]
(3) t3 := b[t1]
(4) t4 := t2 * t3
(5) prod := prod + t4
(6) i := i + 1
(7) if i <= 20 goto (1)

[Figure: dag for the block, built over the leaves prod0, i0, a, b, 4, 1, and 20]
50
Arrays, Pointers, Procedure Calls
x := a[i]            x := a[i]
a[j] := y            z := x
z := a[i]            a[j] := y         => range analysis

*p := w                                => aliasing analysis

side effects caused by procedure calls => inter-procedural analysis
51
Ordering Rules

Any evaluation of or assignment to an element of
array a must follow the previous assignment to an
element of that array if there is one
Any assignment to an element of array a must
follow any previous evaluation of a

52
Ordering Rules

Any use of any identifier must follow the
previous procedure call or indirect assignment
through a pointer if there is one
Any procedure call or indirect assignment through
a pointer must follow all previous evaluations of
any identifier

53
Generating Code From DAGs
t1 := a + b
t2 := c + d
t3 := e - t2
t4 := t1 - t3

(1) MOV a, R0
(2) ADD b, R0
(3) MOV c, R1
(4) ADD d, R1
(5) MOV R0, t1
(6) MOV e, R0
(7) SUB R1, R0
(8) MOV t1, R1
(9) SUB R0, R1
(10) MOV R1, t4
54
Rearranging the Order
t2 := c + d
t3 := e - t2
t1 := a + b
t4 := t1 - t3

(1) MOV c, R0
(2) ADD d, R0
(3) MOV e, R1
(4) SUB R0, R1
(5) MOV a, R0
(6) ADD b, R0
(7) SUB R1, R0
(8) MOV R0, t4
55
A Heuristic Ordering for DAG

Attempt as far as possible to make the evaluation
of a node immediately follow the evaluation of
its leftmost argument

56
Node Listing Algorithm
while unlisted interior nodes remain do begin
    select an unlisted node n, all of whose parents have been listed;
    list n;
    while the leftmost child m of n has no unlisted parents
          and is not a leaf do begin
        list m;
        n := m
    end
end
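The listing loop can be sketched as follows, assuming the dag is given as a dictionary from each interior node to its children (leftmost first). The node names and the operand shapes in the sample dag are illustrative. Evaluating the nodes in the reverse of the returned list gives the heuristic order.

```python
def node_listing(children):
    """children: dict mapping each interior node to a tuple of its children,
    leftmost first. Names that are not keys are leaves."""
    parents = {}
    for n, kids in children.items():
        for k in kids:
            parents.setdefault(k, set()).add(n)
    listed, done = [], set()

    def ready(n):
        # Unlisted, and every parent has already been listed.
        return n not in done and parents.get(n, set()) <= done

    while len(done) < len(children):
        n = next(m for m in children if ready(m))
        listed.append(n)
        done.add(n)
        m = children[n][0]
        while m in children and ready(m):   # follow leftmost interior children
            listed.append(m)
            done.add(m)
            m = children[m][0]
    return listed

dag = {"t1": ("t2", "t3"), "t2": ("t6", "t4"), "t3": ("t4", "e"),
       "t4": ("t5", "t7"), "t5": ("t6", "c"), "t6": ("a", "b"),
       "t7": ("d", "e")}
order = node_listing(dag)
print(list(reversed(order)))
```

For this dag the listing is t1 through t7, so the evaluation order is t7, t6, t5, t4, t3, t2, t1.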
57
An Example
t7 := d + e
t6 := a + b
t5 := t6 - c
t4 := t5 * t7
t3 := t4 - e
t2 := t6 + t4
t1 := t2 * t3
58
Generating Code From Trees

There exists an algorithm that determines the
optimal order in which to evaluate statements in
a block when the dag representation of the block
is a tree
Optimal order here means the order that yields
the shortest instruction sequence

59
Optimal Ordering for Trees

Label each node of the tree bottom-up with an
integer denoting the fewest number of registers
required to evaluate the tree with no stores of
intermediate results
Generate code during a tree traversal by first
evaluating the operand requiring more registers
60
The Labeling Algorithm
if n is a leaf then
    if n is the leftmost child of its parent then label(n) := 1
    else label(n) := 0
else begin
    let n1, n2, ..., nk be the children of n ordered by label,
        so that label(n1) >= label(n2) >= ... >= label(nk);
    label(n) := max over 1 <= i <= k of (label(ni) + i - 1)
end
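A direct transcription of the labeling rule, assuming a tree is written as nested tuples (op, child1, ..., childk) with strings as leaves. For a binary node with child labels l1 and l2 the formula reduces to max(l1, l2) when they differ and l1 + 1 when they are equal.

```python
def label(tree, is_leftmost=True):
    """tree: a string (leaf) or a tuple (op, child1, ..., childk)."""
    if isinstance(tree, str):
        return 1 if is_leftmost else 0
    kids = [label(c, i == 0) for i, c in enumerate(tree[1:])]
    kids.sort(reverse=True)      # label(n1) >= label(n2) >= ... >= label(nk)
    # enumerate counts from 0, so l + i is label(ni) + i - 1 with i 1-based.
    return max(l + i for i, l in enumerate(kids))

t1 = ("-", ("+", "a", "b"), ("-", "e", ("+", "c", "d")))
print(label(t1))
```

The sample tree needs two registers: each sum needs one, and e - (c + d) needs two because both of its operands need one.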
61
An Example
For binary interior nodes
62
Code Generation From a Labeled Tree

Use a stack rstack to allocate registers R0, R1,
..., R(r-1)
The value of a tree is always computed in the top
register on rstack
The function swap(rstack) interchanges the top
two registers on rstack
Use a stack tstack to allocate temporary memory
locations T0, T1, ...

63
Cases Analysis
name
name
64
The Function gencode
procedure gencode(n)
begin
    if n is a left leaf representing operand name and
       n is the leftmost child of its parent then
        print 'MOV' name ',' top(rstack)                /* case 0 */
    else if n is an interior node with operator op,
            left child n1, and right child n2 then
        if label(n2) = 0 then /* case 1 */
        else if 1 <= label(n1) < label(n2) and label(n1) < r then /* case 2 */
        else if 1 <= label(n2) <= label(n1) and label(n2) < r then /* case 3 */
        else /* case 4, both labels >= r */
end
65
The Function gencode
/* case 1 */
begin
    let name be the operand represented by n2;
    gencode(n1);
    print op name ',' top(rstack)
end

/* case 2 */
begin
    swap(rstack);
    gencode(n2);
    R := pop(rstack);
    gencode(n1);
    print op R ',' top(rstack);
    push(rstack, R);
    swap(rstack)
end
66
The Function gencode
/* case 3 */
begin
    gencode(n1);
    R := pop(rstack);      /* n1 was evaluated into register R */
    gencode(n2);
    print op top(rstack) ',' R;
    push(rstack, R)
end

/* case 4 */
begin
    gencode(n2);
    T := pop(tstack);
    print 'MOV' top(rstack) ',' T;
    gencode(n1);
    push(tstack, T);
    print op T ',' top(rstack)
end
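The five cases can be put together in a short runnable sketch. It assumes binary trees written as (op, left, right) tuples with strings as leaves, uses target mnemonics directly as operators, and draws temporaries from a hypothetical pool T0, T1, ... In case 3 it emits op top(rstack), R, so that the left operand's register R receives the result; this is the form required to produce SUB R0, R1 for t3 := e - t2 when e is in R1 and t2 in R0.

```python
def label(t, leftmost=True):
    # Binary form of the labeling rule.
    if isinstance(t, str):
        return 1 if leftmost else 0
    l1, l2 = label(t[1], True), label(t[2], False)
    return max(l1, l2) if l1 != l2 else l1 + 1

def generate(tree, r=2):
    rstack = ["R%d" % i for i in reversed(range(r))]   # top is the last element (R0)
    tstack = ["T%d" % i for i in reversed(range(16))]  # hypothetical temporaries
    code = []

    def swap():
        rstack[-1], rstack[-2] = rstack[-2], rstack[-1]

    def gencode(n):
        if isinstance(n, str):                  # case 0: a left leaf
            code.append("MOV %s, %s" % (n, rstack[-1]))
            return
        op, n1, n2 = n
        l1, l2 = label(n1, True), label(n2, False)
        if l2 == 0:                             # case 1: right operand is a leaf
            gencode(n1)
            code.append("%s %s, %s" % (op, n2, rstack[-1]))
        elif 1 <= l1 < l2 and l1 < r:           # case 2: right subtree first
            swap()
            gencode(n2)
            R = rstack.pop()
            gencode(n1)
            code.append("%s %s, %s" % (op, R, rstack[-1]))
            rstack.append(R)
            swap()
        elif 1 <= l2 <= l1 and l2 < r:          # case 3: left subtree first
            gencode(n1)
            R = rstack.pop()
            gencode(n2)
            code.append("%s %s, %s" % (op, rstack[-1], R))
            rstack.append(R)
        else:                                   # case 4: both labels >= r
            gencode(n2)
            T = tstack.pop()
            code.append("MOV %s, %s" % (rstack[-1], T))
            gencode(n1)
            tstack.append(T)
            code.append("%s %s, %s" % (op, T, rstack[-1]))

    gencode(tree)
    return code

t4 = ("SUB", ("ADD", "a", "b"), ("SUB", "e", ("ADD", "c", "d")))
print("\n".join(generate(t4, r=2)))
```

For t4 := (a + b) - (e - (c + d)) with two registers, the sketch emits seven instructions and no stores to temporaries.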
67
An Example
gencode(t4)                  [R1, R0]   /* case 2 */
    gencode(t3)              [R0, R1]   /* case 3 */
        gencode(e)           [R0, R1]   /* case 0 */
            print MOV e, R1
        gencode(t2)          [R0]       /* case 1 */
            gencode(c)       [R0]       /* case 0 */
                print MOV c, R0
            print ADD d, R0
        print SUB R0, R1
    gencode(t1)              [R0]       /* case 1 */
        gencode(a)           [R0]       /* case 0 */
            print MOV a, R0
        print ADD b, R0
    print SUB R1, R0

68
Multiregister Operations

Some operations like multiplication, division, or
a function call normally require more than one
register
The labeling algorithm needs to ensure that
label(n) is always at least the number of
registers required by the operation

69
Algebraic Properties
commutative
associative
commutative
largest
70
Common Subexpressions

Nodes with more than one parent in a dag are
called shared nodes
Optimal code generation for dags is NP-complete,
both on a one-register machine and on a machine
with an unlimited number of registers

71
Partitioning a DAG into Trees

Partition a dag into a set of trees by finding
for each root and shared node n, the maximal
subtree with n as root that includes no other
shared nodes, except as leaves
Determine a code generation ordering for the
trees
Generate code for each tree using the algorithms
for generating code from trees

72
An Example
[Figure: a dag partitioned into trees; only the node numbers 1-7 and the leaves a0, b0, c0, d0, e0 survive in the transcript]
73
Dynamic Programming Code Generation

The dynamic programming algorithm applies to a
broad class of register machines with complex
instruction sets
The machine has r interchangeable registers
The machine has instructions of the form Ri := E,
where E is any expression containing operators,
registers, and memory locations. If E involves
registers, then Ri must be one of them

74
Dynamic Programming

The dynamic programming algorithm partitions the
problem of generating optimal code for an
expression into sub-problems of generating
optimal code for the sub-expressions of the given
expression

75
Contiguous Evaluation

We say a program P evaluates a tree T
contiguously if
it first evaluates those subtrees of T that need
to be computed into memory
it then evaluates the subtrees of the root in
either order
it finally evaluates the root

76
Optimally Contiguous Program

For the machines defined above, given any program
P to evaluate an expression tree T, we can find
an equivalent program P' such that
P' is of no higher cost than P
P' uses no more registers than P
P' evaluates the tree in a contiguous fashion
This implies that every expression tree can be
evaluated optimally by a contiguous program

77
Dynamic Programming Algorithm

Phase 1: compute bottom-up, for each node n of the
expression tree T, an array C of costs, in which
the ith component C[i] is the optimal cost of
computing the subtree S rooted at n into a
register, assuming i registers are available for
the computation. C[0] is the optimal cost of
computing the subtree S into memory

78
Dynamic Programming Algorithm

To compute C[i] at node n, consider each machine
instruction R := E whose expression E matches the
subexpression rooted at node n
Determine the costs of evaluating the operands of
E by examining the cost vectors at the
corresponding descendants of n

79
Dynamic Programming Algorithm

For those operands of E that are registers,
consider all possible orders in which the
corresponding subtrees of T can be evaluated into
registers
In each ordering, the first subtree corresponding
to a register operand can be evaluated using i
available registers, the second using i-1
registers, and so on

80
Dynamic Programming Algorithm

For node n, add in the cost of the instruction
R := E that was used to match node n
The value C[i] is then the minimum cost over all
possible orders
At each node, store the instruction used to
achieve the best cost for C[i] for each i
The smallest cost in the vector gives the minimum
cost of evaluating T

81
Dynamic Programming Algorithm

Phase 2: traverse T and use the cost vectors to
determine which subtrees of T must be computed
into memory
Phase 3: traverse T and use the cost vectors and
associated instructions to generate the final
target code

82
An Example
Consider a machine with two registers R0 and R1
and the instructions
Ri := Mj
Mi := Ri
Ri := Rj
Ri := Ri op Rj
Ri := Ri op Mj
83
An Example
R0 := c
R1 := d
R1 := R1 / e
R0 := R0 * R1
R1 := a
R1 := R1 - b
R1 := R1 + R0
84
Code Generator Generators

A tool to automatically construct the instruction
selection phase of a code generator
Such tools may use tree grammars or context free
grammars to describe the target machines
Register allocation will be implemented as a
separate mechanism
Graph coloring is one of the approaches for
register allocation

85
Tree Rewriting

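a[i] := b + 1

[Figure: intermediate-code tree for the assignment, with root :=, left subtree ind(+(+(consta, regsp), ind(+(consti, regsp)))) and right subtree +(memb, const1)]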
86
Tree Rewriting

The code is generated by reducing the input tree
into a single node using a sequence of
tree-rewriting rules
Each tree-rewriting rule is of the form
replacement ← template { action }, where
replacement is a single node
template is a tree
action is a code fragment
A set of tree-rewriting rules is called a
tree-translation scheme

87
An Example
regi ← +(regi, regj)    { ADD Rj, Ri }
Each tree template represents a computation
performed by the sequence of machine
instructions emitted by the associated action
88
Tree Rewriting Rules
89
Tree Rewriting Rules
(6) regi ← +(regi, ind(+(constc, regj)))    { ADD c(Rj), Ri }
(7) regi ← +(regi, regj)                    { ADD Rj, Ri }
(8) regi ← +(regi, const1)                  { INC Ri }
90
An Example

[Tree figure: the input tree for a[i] := b + 1; the leaf consta is matched by rule (1)]
(1) MOV #a, R0
91
An Example

[Tree figure: consta has been rewritten to reg0; the subtree + reg0 regsp is matched by rule (7)]
(7) ADD SP, R0
92
An Example

[Tree figure: the subtree + reg0 (ind (+ consti regsp)) can be matched by rule (6), or its ind subtree by rule (5)]
(6) ADD i(SP), R0
(5) MOV i(SP), R1
93
An Example

[Tree figure: the left operand has been reduced to ind reg0; the leaf memb is matched by rule (2)]
(2) MOV b, R1
94
An Example

[Tree figure: the subtree + reg1 const1 is matched by rule (8)]
(8) INC R1
95
An Example

[Tree figure: the remaining tree := (ind reg0) reg1 is matched by rule (4)]
(4) MOV R1, *R0
96
Tree Pattern Matching

The tree pattern matching algorithm can be
implemented by extending the multiple-keyword
pattern matching algorithm
Each tree template is represented by a set of
strings, each of which represents a path from the
root to a leaf
Each rule is associated with cost information
The dynamic programming algorithm can be used to
select an optimal sequence of matches

97
Semantic Predicates

regi ← +(regi, constc)    { if c = 1 then INC Ri else ADD #c, Ri }
The general use of semantic actions and
predicates can provide greater flexibility and
ease of description than a purely grammatical
specification
98
Pattern Matching by Parsing

Use an LR parser to do the pattern matching
The input tree can be treated as a string by
using its prefix representation
:= ind + + consta regsp ind + consti regsp + memb const1
The tree-translation scheme can be converted into
a syntax-directed translation scheme by replacing
the tree templates with their prefix
representations
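Producing the prefix representation is a preorder walk of the tree. A minimal sketch, assuming trees are written as nested tuples (operator, child, ...) with strings as leaves:

```python
def prefix(tree):
    """Return the prefix (preorder) string of a tree given as nested tuples."""
    if isinstance(tree, str):
        return tree
    return " ".join([tree[0]] + [prefix(child) for child in tree[1:]])

# Tree for a[i] := b + 1, with a and i addressed relative to the register SP.
assign = (":=",
          ("ind", ("+", ("+", "consta", "regsp"),
                        ("ind", ("+", "consti", "regsp")))),
          ("+", "memb", "const1"))
print(prefix(assign))
```

The resulting string is the input an LR parser would see, one token per tree node.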

99
Syntax-Directed Translation Scheme
(1) regi → constc                        { MOV #c, Ri }
(2) regi → mema                          { MOV a, Ri }
(3) mem  → := mema regi                  { MOV Ri, a }
(4) mem  → := ind regi regj              { MOV Rj, *Ri }
(5) regi → ind + constc regj             { MOV c(Rj), Ri }
(6) regi → + regi ind + constc regj      { ADD c(Rj), Ri }
(7) regi → + regi regj                   { ADD Rj, Ri }
(8) regi → + regi const1                 { INC Ri }
100
Advantages of Syntax-Directed Translation Scheme

The parsing method is efficient and well
understood
It is relatively easy to retarget the code
generator
The code generator can be made more efficient by
adding special-case productions

101
Disadvantages of Syntax-Directed Translation
Scheme

A left-to-right order of evaluation is fixed
The machine description grammar can become
inordinately large
The context-free grammar is usually highly ambiguous

102
Graph Coloring

In the first pass, target machine instructions
are selected as though there were an infinite
number of symbolic registers
In the second pass, physical registers are
assigned to symbolic registers using graph
coloring algorithms
During the second pass, if a register is needed
when all available registers are used, some of
the used registers must be spilled

103
Interference Graph

For each procedure, a register-interference graph
is constructed
The nodes in the graph are symbolic registers
An edge connects two nodes if one is live at a
point where the other is defined

104
K-Colorable Graphs

A graph is said to be k-colorable if each node
can be assigned one of the k colors such that no
two adjacent nodes have the same color
A color represents a register
The problem of determining whether a graph is
k-colorable is NP-complete

105
A Graph Coloring Algorithm

Remove a node n and its edges if it has fewer
than k neighbors
Repeat the removing step above until we end up
with the empty graph or a graph in which each
node has k or more adjacent nodes
In the latter case, a node is selected and
spilled by deleting that node and its edges, and
the removing step above continues

106
A Graph Coloring Algorithm

The nodes in the graph can be colored in the
reverse order in which they are removed
Each node can be assigned a color not assigned to
any of its neighbors
Spilled nodes can be assigned any color
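The remove-and-color procedure can be sketched as below. The graph encoding (a dictionary of neighbor sets) and the choice of spill candidate (the node with the most remaining neighbors) are assumptions of this example. Spilled nodes are left uncolored here, on the assumption that their values are kept in memory.

```python
def color_graph(graph, k):
    """graph: dict node -> set of neighbours (symmetric).
    Returns (colors, spilled); colors maps each unspilled node to 0..k-1."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spilled = [], []
    while work:
        # Remove a node with fewer than k neighbours if there is one.
        n = next((m for m in work if len(work[m]) < k), None)
        if n is None:
            # Every node has k or more neighbours: pick one to spill.
            n = max(work, key=lambda m: len(work[m]))
            spilled.append(n)
        else:
            stack.append(n)
        for m in work.pop(n):
            work[m].discard(n)
    colors = {}
    for n in reversed(stack):                # color in reverse removal order
        used = {colors[m] for m in graph[n] if m in colors}
        colors[n] = next(c for c in range(k) if c not in used)
    return colors, spilled

graph = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
colors, spilled = color_graph(graph, 3)
print(colors, spilled)
```

A free color always exists when a node is colored, because at the time it was removed it had fewer than k neighbors left, and only those neighbors have been colored before it.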

107
An Example
108
An Example
