Title: Intermediate Code Generation
1Intermediate Code Generation
- Saumya Debray
- Dept. of Computer Science
- The University of Arizona, Tucson, AZ 85721
2Constructing Abstract Syntax Trees
- General Idea: construct bottom-up using synthesized attributes.
  - E → E + E { $$ = mkTree(PLUS, $1, $3); }
  - S → if ( E ) S OptElse { $$ = mkTree(IF, $3, $5, $6); }
  - OptElse → else S { $$ = $2; }
    | /* epsilon */ { $$ = NULL; }
  - S → while ( E ) S { $$ = mkTree(WHILE, $3, $5); }
- mkTree(NodeType, Child1, Child2, …) allocates space for the tree node and fills in its node type as well as its children.
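A minimal sketch of what mkTree might look like in C. The slide does not say how mkTree learns how many children it was given; the explicit count parameter `n`, the `struct Tree` layout, the `LEAF` node kind, and the `MAX_CHILDREN` limit are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>

enum NodeType { PLUS, IF, WHILE, LEAF };   /* LEAF is a hypothetical extra kind */
#define MAX_CHILDREN 3                     /* assumed upper bound on children */

struct Tree {
    enum NodeType type;
    int nchildren;
    struct Tree *child[MAX_CHILDREN];
};

/* Allocate a node of the given type and fill in its n children.
   Unused child slots are set to NULL. */
struct Tree *mkTree(enum NodeType type, int n, ...)
{
    assert(n >= 0 && n <= MAX_CHILDREN);
    struct Tree *t = malloc(sizeof *t);
    assert(t != NULL);
    t->type = type;
    t->nchildren = n;

    va_list ap;
    va_start(ap, n);
    for (int i = 0; i < MAX_CHILDREN; i++)
        t->child[i] = (i < n) ? va_arg(ap, struct Tree *) : NULL;
    va_end(ap);
    return t;
}
```

With this signature, the action for E → E + E would read `$$ = mkTree(PLUS, 2, $1, $3);`.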
3Constructing DAGs Value Numbering
- In compilers, nodes are often implemented as records stored in an array or list, and referred to by index or position.
- For historical reasons, the integer index of a node is often called a value number.
- Example: x = y + 10 (in the original figure, value numbers are shown in blue)
4Algorithm for Constructing a DAG
- Goal: given an expression x op y, use value numbers to (efficiently) find whether it is available.
- Method:
  - search the list of records for a node with label op and children x and y.
  - If found, return the record; otherwise, create a new record for it, and return that.
- Implementation: Use a hash table that can be searched based on ⟨op, x, y⟩.
5Value Numbering and DAGs
- Algorithm for Constructing a DAG
- Given an expression e ≡ e1 op e2:
  - Get the value numbers for e1 and e2, say n1 and n2 respectively.
  - Construct a hash key k = hash(op, n1, n2).
  - If an entry with key k is found in the hash table, with value number m:
    - replace e with a reference to the node with value number m.
  - else:
    - create a new node for e, with value number n
    - Insert hash key k into the hash table, with associated value number n.
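The algorithm above can be sketched in C as follows. This is an illustration under assumptions, not the slides' implementation: the fixed table sizes, the linear-probing scheme, the particular hash function, and the name `value_number` are all hypothetical. Unlike the slide, the sketch compares the full ⟨op, n1, n2⟩ triple on a hit, so that two different expressions whose keys collide are not merged.

```c
#include <assert.h>

#define MAX_NODES  256
#define TABLE_SIZE 509            /* prime, larger than MAX_NODES */
#define NO_NODE    (-1)

struct Node { int op; int left; int right; };  /* children are value numbers */

static struct Node nodes[MAX_NODES];  /* value number = index into this array */
static int num_nodes = 0;
static int table[TABLE_SIZE];         /* hash slot -> value number, or NO_NODE */
static int table_ready = 0;

static unsigned hash(int op, int n1, int n2)
{
    return ((unsigned)op * 31u + (unsigned)n1) * 31u + (unsigned)n2;
}

/* Return the value number for (op, n1, n2), creating a node only if no
   identical node already exists. */
int value_number(int op, int n1, int n2)
{
    if (!table_ready) {
        for (int i = 0; i < TABLE_SIZE; i++)
            table[i] = NO_NODE;
        table_ready = 1;
    }
    unsigned slot = hash(op, n1, n2) % TABLE_SIZE;
    while (table[slot] != NO_NODE) {             /* linear probing */
        const struct Node *p = &nodes[table[slot]];
        if (p->op == op && p->left == n1 && p->right == n2)
            return table[slot];                  /* already available: reuse it */
        slot = (slot + 1) % TABLE_SIZE;
    }
    assert(num_nodes < MAX_NODES);
    nodes[num_nodes] = (struct Node){ op, n1, n2 };
    table[slot] = num_nodes;                     /* remember the new node */
    return num_nodes++;
}
```

Leaves can be entered the same way, for example with an identifier opcode, a symbol index as the first child, and `NO_NODE` as the second.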
6Three Address Code Representation
- Each instruction is represented as a structure called a quadruple (or quad):
  - contains info about the operation, up to 3 operands.
  - for operands: use a bit to indicate whether constant or Symbol Table pointer.
- E.g.:
  - x = y + z
  - if ( x relop y ) goto L
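One way the quad and its tagged operands might be declared in C. All names here (`struct quad`, `struct operand`, the opcode list, `mk_const`, `mk_sym`) are hypothetical, and an enum tag stands in for the single "bit" mentioned above.

```c
#include <assert.h>
#include <stddef.h>

struct symtab_entry { const char *name; };   /* hypothetical minimal ST entry */

enum opnd_kind { OPND_NONE, OPND_CONST, OPND_SYM };

struct operand {
    enum opnd_kind kind;            /* tells a constant from an ST pointer */
    union {
        long value;                 /* valid when kind == OPND_CONST */
        struct symtab_entry *sym;   /* valid when kind == OPND_SYM   */
    } u;
};

enum opcode { OP_PLUS, OP_IF_RELOP, OP_GOTO, OP_LABEL };

struct quad {
    enum opcode op;
    struct operand src1, src2, dest;   /* up to 3 operands */
    struct quad *next;                 /* instructions are kept in a list */
};

struct operand mk_const(long v)
{
    struct operand o = { OPND_CONST, { .value = v } };
    return o;
}

struct operand mk_sym(struct symtab_entry *s)
{
    assert(s != NULL);
    struct operand o = { OPND_SYM, { .sym = s } };
    return o;
}
```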
7Three Address Code Generation
- Represented as a list of instructions.
- "⊕" denotes concatenation of instruction sequences.
- Attributes for Expressions:
  - E.place: location that holds the value of E
  - E.code: instruction sequence to evaluate E.
- Attributes for Statements:
  - S.begin: first instruction in the code for S
  - S.after: first instruction after the code for S.
8Intermediate Code Generation
- Auxiliary Routines:
  - struct symtab_entry *newtemp(typename t)
    - creates a symbol table entry for a new temporary variable each time it is called, and returns a pointer to this ST entry.
  - struct instr *newlabel()
    - returns a new label instruction each time it is called.
  - struct instr *newinstr(arg1, arg2, …)
    - creates a new instruction, fills it in with the arguments supplied, and returns a pointer to the result.
9Intermediate Code Generation
- struct symtab_entry *newtemp( t )
- {
-   struct symtab_entry *ntmp = malloc( … ); /* check: ntmp == NULL? */
-   Name(ntmp) = <create a new name>;
-   Type(ntmp) = t;
-   Scope(ntmp) = LOCAL;
-   return ntmp;
- }
- struct instr *newinstr(opType, src1, src2, dest)
- {
-   struct instr *ninstr = malloc( … ); /* check: ninstr == NULL? */
-   Op(ninstr) = opType;
-   Src1(ninstr) = src1; Src2(ninstr) = src2; Dest(ninstr) = dest;
-   return ninstr;
- }
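A compilable sketch of the three auxiliary routines. The struct layouts, the `_tmpN` naming scheme, the use of `void *` for operands, and the `label_no` field are assumptions made so the example is self-contained; the slides access fields through macros such as Name() and Op() whose definitions are not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

enum scope { GLOBAL, LOCAL };
enum optype { PLUS, MULT, GOTO, LABEL };

struct symtab_entry {
    char name[16];
    int type;                  /* type code; representation is an assumption */
    enum scope scope;
};

struct instr {
    enum optype op;
    void *src1, *src2, *dest;  /* ST entries or label instructions */
    int label_no;              /* meaningful only when op == LABEL */
    struct instr *next;
};

/* Create a symbol table entry for a fresh temporary of type t. */
struct symtab_entry *newtemp(int t)
{
    static int count = 0;
    struct symtab_entry *ntmp = malloc(sizeof *ntmp);
    assert(ntmp != NULL);
    snprintf(ntmp->name, sizeof ntmp->name, "_tmp%d", count++);
    ntmp->type = t;
    ntmp->scope = LOCAL;
    return ntmp;
}

/* Create an instruction and fill it in with the arguments supplied. */
struct instr *newinstr(enum optype op, void *src1, void *src2, void *dest)
{
    struct instr *ninstr = malloc(sizeof *ninstr);
    assert(ninstr != NULL);
    ninstr->op = op;
    ninstr->src1 = src1;
    ninstr->src2 = src2;
    ninstr->dest = dest;
    ninstr->label_no = -1;
    ninstr->next = NULL;
    return ninstr;
}

/* Return a new, distinct label instruction on every call. */
struct instr *newlabel(void)
{
    static int count = 0;
    struct instr *lbl = newinstr(LABEL, NULL, NULL, NULL);
    lbl->label_no = count++;
    return lbl;
}
```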
10Simple Expressions
- E returns a struct with the following synthesized attributes:
  - type: the type of the expression (or error)
  - place: a symbol table entry indicating the location where the expression value will be kept at runtime
  - code: a list of intermediate code instructions for evaluating the expression.
11Accessing Array Elements
- Accessing A[i] for an array A[lo..hi] starting at address b, where each element is w bytes wide:
  - Address of A[i] is b + ( i - lo ) * w
    = (b - lo * w) + i * w
    = kA + i * w.
  - kA = b - lo * w depends only on A: known at compile time.
- Code generated:
  - t1 = i * w
  - t2 = kA + t1 /* address of A[i] */
  - t3 = *t2
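The address arithmetic can be checked with a small C sketch. Addresses are modelled as plain integers, and the function names `array_const` and `elem_addr` are hypothetical.

```c
#include <assert.h>

/* Compile-time part: kA = b - lo*w for an array A[lo..hi] based at address b
   whose elements are w bytes wide. */
long array_const(long b, long lo, long w)
{
    assert(w > 0);
    return b - lo * w;
}

/* Run-time part: address of A[i] = kA + i*w. */
long elem_addr(long kA, long i, long w)
{
    long t1 = i * w;     /* t1 = i * w   */
    long t2 = kA + t1;   /* t2 = kA + t1 */
    return t2;
}
```

For an array A[5..10] of 4-byte elements at address 1000, kA is 980 and the address of A[7] is 980 + 28 = 1008, which agrees with b + (i - lo) * w = 1000 + 2 * 4.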
12Accessing Structure Fields
- Use the symbol table to store information about the order and type of each field within the structure.
  - Hence determine the distance from the start of a struct to each field.
- For code generation, add the displacement to the base address of the structure to get the address of the field.
- Example: Given
  - struct s *p;
  - …
  - x = p->a; /* a is at displacement δa within struct s */
- The generated code has the form:
  - t1 = p + δa /* address of p->a */
  - x = *t1
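A sketch of how the displacements might be computed from the per-field information in the symbol table. The `struct field` record and `layout_fields` are hypothetical names, and the alignment padding rule is an assumption (the slides do not discuss alignment); it follows the common convention that each field starts at a multiple of its alignment.

```c
#include <assert.h>
#include <stddef.h>

struct field {
    const char *name;
    size_t size, align;   /* taken from the field's type */
    size_t offset;        /* displacement from the start of the struct (output) */
};

/* Assign each field its displacement in declaration order, padding so that
   every field starts at a multiple of its alignment. Returns the total
   struct size, rounded up to the largest field alignment. */
size_t layout_fields(struct field *f, size_t n)
{
    size_t off = 0, max_align = 1;
    for (size_t i = 0; i < n; i++) {
        assert(f[i].align > 0);
        off = (off + f[i].align - 1) / f[i].align * f[i].align;
        f[i].offset = off;
        off += f[i].size;
        if (f[i].align > max_align)
            max_align = f[i].align;
    }
    return (off + max_align - 1) / max_align * max_align;
}
```

The code generator then emits `t1 = p + offset` using the stored offset of the field being accessed.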
13Logical Expressions 1
- Production: B → E1 relop E2
- Naïve but Simple Code (TRUE=1, FALSE=0):
  - t1 = { evaluate E1 }
  - t2 = { evaluate E2 }
  - t3 = 1 /* TRUE */
  - if ( t1 relop t2 ) goto L
  - t3 = 0 /* FALSE */
  - L: …
- Disadvantage: lots of unnecessary memory references.
14Logical Expressions 2
- Observation: Logical expressions are used mainly to direct flow of control.
- Intuition: "tell" the logical expression where to branch based on its truth value.
  - The nonterminal B now takes two inherited attributes, true and false. Each is (a pointer to) a label instruction.
  - E.g.: for a statement if ( B ) S1 else S2:
    - B.true = S1.begin
    - B.false = S2.begin
  - The code generated for B jumps to the appropriate label.
15Logical Expressions 2 contd
- Production:
  - B → E1 relop E2: B.code = E1.code ⊕ E2.code ⊕ newinstr(relop, E1.place, E2.place, B.true) ⊕ newinstr(GOTO, B.false, NULL, NULL)
- Example: B ≡ x+y > 2*z.
  - Suppose B.true = Lbl1, B.false = Lbl2.
  - E1 ≡ x+y, E1.place = tmp1, E1.code = ⟨ 'Plus(x, y, tmp1)' ⟩ /* tmp1 = x + y */
  - E2 ≡ 2*z, E2.place = tmp2, E2.code = ⟨ 'Mult(2, z, tmp2)' ⟩ /* tmp2 = 2 * z */
  - B.code = E1.code ⊕ E2.code ⊕ 'gt(tmp1, tmp2, Lbl1)' ⊕ 'goto Lbl2'
    = ⟨ 'Plus(x, y, tmp1)', 'Mult(2, z, tmp2)', 'gt(tmp1, tmp2, Lbl1)', 'goto Lbl2' ⟩
16Short Circuit Evaluation
- B → B1 && B2:
  - L = newlabel( ); B1.true = L; B1.false = B.false; B2.true = B.true; B2.false = B.false
  - B.code = B1.code ⊕ L ⊕ B2.code
- B → B1 || B2:
  - L = newlabel( ); B1.true = B.true; B1.false = L; B2.true = B.true; B2.false = B.false
  - B.code = B1.code ⊕ L ⊕ B2.code
17Short Circuit Evaluation Examples
- ( x > 0 && x < 25 )
  - B: true = Lt, false = Lf
  - B1: L_true = L1, L_false = Lf
    - if x > 0 goto L1
    - goto Lf
    - L1:
  - B2: L_true = Lt, L_false = Lf
    - if (x < 25) goto Lt
    - goto Lf
- ( x > 0 || x < 25 )
  - B: true = Lt, false = Lf
  - B1: L_true = Lt, L_false = L1
    - if x > 0 goto Lt
    - goto L1
    - L1:
  - B2: L_true = Lt, L_false = Lf
    - if (x < 25) goto Lt
    - goto Lf
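The rules for relational tests, && and || can be sketched as a recursive generator that passes the true and false targets down as inherited attributes. This is an illustration under assumptions: it emits instructions as text into a caller-supplied buffer rather than building instruction lists, and `struct bexpr`, `gen_bool` and `next_label` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum bkind { REL, AND, OR };

struct bexpr {
    enum bkind kind;
    const char *cond;           /* REL only, e.g. "x > 0" */
    struct bexpr *b1, *b2;      /* AND / OR only */
};

static int next_label = 1;      /* counter behind newlabel() */

/* Append the code for b to the string in out (capacity cap). Control reaches
   label t if b is true and label f if b is false. */
void gen_bool(const struct bexpr *b, const char *t, const char *f,
              char *out, size_t cap)
{
    size_t len = strlen(out);
    assert(len < cap);
    if (b->kind == REL) {
        snprintf(out + len, cap - len, "if %s goto %s\ngoto %s\n",
                 b->cond, t, f);
        return;
    }
    char l[16];
    snprintf(l, sizeof l, "L%d", next_label++);   /* L = newlabel() */
    if (b->kind == AND)
        gen_bool(b->b1, l, f, out, cap);   /* B1.true = L, B1.false = B.false */
    else
        gen_bool(b->b1, t, l, out, cap);   /* B1.true = B.true, B1.false = L */
    len = strlen(out);
    snprintf(out + len, cap - len, "%s:\n", l);   /* place label L */
    gen_bool(b->b2, t, f, out, cap);       /* B2 inherits both of B's targets */
}
```

Called on the two example expressions with targets Lt and Lf, the generator produces the same instruction sequences as listed above, apart from the label numbering.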
18Conditionals
- Production:
  - S → if ( B ) S1 else S2
- Code Structure:
  - code to evaluate B
  - code for S1
  - goto Lafter
  - Lelse: code for S2
  - Lafter: …
- Semantic Rules:
  - Lelse = newlabel()
  - Lafter = newlabel()
  - B.true = S1.begin
  - B.false = Lelse
  - S.code = B.code ⊕ S1.code ⊕ newinstr(GOTO, Lafter) ⊕ newinstr(LABEL, Lelse) ⊕ S2.code ⊕ newinstr(LABEL, Lafter)
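A text-emitting sketch of the if-else rule. It makes several simplifying assumptions: B is a single relational test passed as a string, the code for S1 and S2 has already been generated as text, and an explicit label stands in for S1.begin. The names `gen_if_else` and `label_count` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static int label_count = 0;
static int newlabel(void) { return ++label_count; }

/* Write the code for  if ( cond ) S1 else S2  into out.
   Returns the number of characters snprintf would have written. */
int gen_if_else(char *out, size_t cap, const char *cond,
                const char *s1_code, const char *s2_code)
{
    assert(out != NULL && cap > 0);
    int l_then  = newlabel();    /* stands in for S1.begin */
    int l_else  = newlabel();
    int l_after = newlabel();
    return snprintf(out, cap,
        "if %s goto L%d\n"       /* B.code: B.true = S1.begin */
        "goto L%d\n"             /*         B.false = Lelse   */
        "L%d:\n%s"               /* S1.code                   */
        "goto L%d\n"             /* newinstr(GOTO, Lafter)    */
        "L%d:\n%s"               /* LABEL Lelse, then S2.code */
        "L%d:\n",                /* LABEL Lafter              */
        cond, l_then, l_else, l_then, s1_code,
        l_after, l_else, s2_code, l_after);
}
```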
19Loops 1
- Production:
  - S → while ( B ) S1
- Code Structure:
  - Ltop: code to evaluate B
  - if ( !B ) goto Lafter
  - code for S1
  - goto Ltop
  - Lafter: …
- Semantic Rules:
  - Ltop = newlabel()
  - Lafter = newlabel()
  - B.true = S1.begin
  - B.false = Lafter
  - S.code = newinstr(LABEL, Ltop) ⊕ B.code ⊕ S1.code ⊕ newinstr(GOTO, Ltop) ⊕ newinstr(LABEL, Lafter)
20Loops 2
- Production:
  - S → while ( B ) S1
- Code Structure:
  - goto Leval
  - Ltop:
  - code for S1
  - Leval: code to evaluate B
  - if ( B ) goto Ltop
  - Lafter: …
- This code executes fewer branch operations per loop iteration: one conditional branch, instead of a conditional branch plus an unconditional goto.
- Semantic Rules:
  - Leval = newlabel()
  - Lafter = newlabel()
  - B.true = S1.begin
  - B.false = Lafter
  - S.code = newinstr(GOTO, Leval) ⊕ S1.code ⊕ newinstr(LABEL, Leval) ⊕ B.code ⊕ newinstr(LABEL, Lafter)
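The two loop layouts can be compared with a short sketch that emits each as text and counts the branch instructions executed for a loop that runs n iterations. The counts are derived from the "Code Structure" listings, treating the test of B as a single conditional branch. The function names are hypothetical, and the labels are fixed strings for brevity, so the emitters as written do not handle nested loops.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Layout 1: test at the top of the loop. */
int gen_while_top(char *out, size_t cap, const char *cond, const char *body)
{
    assert(out != NULL && cap > 0);
    return snprintf(out, cap,
        "Ltop:\n"
        "if !(%s) goto Lafter\n"
        "%s"
        "goto Ltop\n"
        "Lafter:\n", cond, body);
}

/* Layout 2: test at the bottom, entered through an initial goto. */
int gen_while_bottom(char *out, size_t cap, const char *cond, const char *body)
{
    assert(out != NULL && cap > 0);
    return snprintf(out, cap,
        "goto Leval\n"
        "Ltop:\n"
        "%s"
        "Leval:\n"
        "if %s goto Ltop\n"
        "Lafter:\n", body, cond);
}

/* Branch instructions executed when the body runs n times. */
long branches_top(long n)    { return 2 * n + 1; }  /* test + goto per iteration, plus exit test */
long branches_bottom(long n) { return n + 2; }      /* initial goto, then n + 1 tests */
```

For n = 10 the first layout executes 21 branches and the second 12. The two are equal at n = 1, and the second layout costs one extra branch when the body never runs.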
21Assignments
- Production:
  - S → Lhs = Rhs
- Semantic Rule:
  - S.code = Lhs.code ⊕ Rhs.code ⊕ gen(Lhs.place = Rhs.place)
- Evaluation Order:
  - As described, Lhs is always evaluated before Rhs.
  - If there are no language requirements for this, a compiler may be able to generate better code by choosing the evaluation order based on what Lhs and Rhs look like.
22Multi-way Branches switch statements
- Goal:
  - generate code to (efficiently) choose amongst a fixed set of alternatives based on the value of an expression.
- Implementation Choices:
  - linear search
    - best for a small number of case labels (≈ 3 or 4)
    - cost increases with no. of case labels; later cases are more expensive.
  - binary search
    - best for a moderate number of case labels (≈ 4 to 8)
    - cost increases with no. of case labels.
  - jump tables
    - best for a large no. of case labels (≥ 8)
    - may take a large amount of space if the labels are not well-clustered.
23Background Jump Tables
- A jump table is an array of code addresses:
  - Tbl[ i ] is the address of the code to execute if the expression evaluates to i.
  - if the set of case labels has "holes", the corresponding jump table entries point to the default case.
- Bounds checks:
  - Before indexing into a jump table, we must check that the expression value is within the proper bounds (if not, jump to the default case).
  - The check lower_bound ≤ exp_value ≤ upper_bound can be implemented using a single unsigned comparison.
24Jump Tables contd
- Given a switch with max. and min. case labels cmax and cmin, the jump table is accessed as follows:
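The access sequence from the original slide did not survive extraction. The following is one plausible sketch, not a reconstruction of it: subtract cmin, do the bounds check with a single unsigned comparison, then jump indirectly through the table. Function pointers stand in for code addresses, and the case labels 3, 5 and 7, the handler functions, and `dispatch` are all hypothetical.

```c
#include <assert.h>

enum { CMIN = 3, CMAX = 7 };           /* hypothetical case-label range */

static int case3(void) { return 30; }
static int case5(void) { return 50; }
static int case7(void) { return 70; }
static int dflt(void)  { return -1; }

/* Tbl[i - CMIN] is the code address for case i; holes point to the default. */
static int (*const Tbl[CMAX - CMIN + 1])(void) = {
    case3, dflt, case5, dflt, case7
};

int dispatch(int x)
{
    /* One unsigned comparison covers CMIN <= x <= CMAX: if x < CMIN the
       subtraction wraps around to a very large unsigned value. */
    unsigned t = (unsigned)x - (unsigned)CMIN;
    if (t > (unsigned)(CMAX - CMIN))
        return dflt();                 /* out of bounds: default case */
    assert(t <= (unsigned)(CMAX - CMIN));
    return Tbl[t]();                   /* indirect jump through the table */
}
```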
25Jump Tables Space Costs
- A jump table with max. and min. case labels cmax and cmin needs ≈ cmax - cmin entries (exactly cmax - cmin + 1).
- This can be wasteful if the entries aren't dense enough, e.g.:
  - switch (x) {
  - case 1: …
  - case 1000: …
  - case 1000000: …
  - }
- Define the density of a set of case labels as
  - density = no. of case labels / (cmax - cmin)
- Compilers will not generate a jump table if the density is below some threshold (typically, 0.5).
26Switch Statements Overall Algorithm
- if no. of case labels is small (≤ 8), use linear or binary search.
  - use no. of case labels to decide between the two.
- if density ≥ threshold (≈ 0.5):
  - generate a jump table
- else:
  - divide the set of case labels into sub-ranges s.t. each sub-range has density ≥ threshold
  - generate code to use binary search to choose amongst the sub-ranges
  - handle each sub-range recursively.
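The top-level decision can be sketched as follows. The cut-offs of 4 and 8 labels and the 0.5 threshold follow the rough figures quoted in these slides and would be tuned in a real compiler. The enum, the function names, and the use of the exact entry count (cmax - cmin + 1) in the density are assumptions of this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum strategy { LINEAR_SEARCH, BINARY_SEARCH, JUMP_TABLE, SPLIT_RANGES };

/* density = no. of case labels / no. of table entries.
   labels[] must be sorted in increasing order and contain no duplicates. */
double density(const long *labels, size_t n)
{
    assert(n > 0);
    return (double)n / (double)(labels[n - 1] - labels[0] + 1);
}

enum strategy choose(const long *labels, size_t n)
{
    if (n <= 4)
        return LINEAR_SEARCH;          /* small number of labels */
    if (n <= 8)
        return BINARY_SEARCH;          /* moderate number of labels */
    if (density(labels, n) >= 0.5)
        return JUMP_TABLE;             /* dense enough for a single table */
    return SPLIT_RANGES;               /* binary search over dense sub-ranges */
}
```

For the labels 1, 1000 and 1000000 the density is 3 / 1000000, far below the threshold, which is why a single jump table would be wasteful there.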
27Function Calls
- Caller:
  - evaluate actual parameters, place them where the callee expects them:
    - param x, k /* x is the kth actual parameter of the call */
  - save appropriate machine state (e.g., return address) and transfer control to the callee:
    - call p
- Callee:
  - allocate space for activation record, save callee-saved registers as needed, update stack/frame pointers:
    - enter p
28Function Returns
- Callee:
  - restore callee-saved registers; place return value (if any) where caller can find it; update stack/frame pointers:
    - retval x
    - leave p
  - transfer control back to caller:
    - return
- Caller:
  - save value returned by callee (if any) into x:
    - retrieve x
29Function Call/Return Example
- Source: x = f(0, y+1) + 1
- Intermediate Code: Caller:
  - t1 = y+1
  - param t1, 2
  - param 0, 1
  - call f
  - retrieve t2
  - x = t2+1
- Intermediate Code: Callee:
  - enter f /* set up activation record */
  - … /* code for f's body */
  - retval t27 /* return the value of t27 */
  - leave f /* clean up activation record */
  - return
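The caller-side sequence in the example can be produced by a small emitter such as the one below. It is a sketch under assumptions: the arguments are taken to be already evaluated into named places, parameters are emitted last-to-first as in the example, and `gen_call` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the caller-side code for  result = f(args[0], ..., args[n-1])
   into out (capacity cap) as text. */
void gen_call(char *out, size_t cap, const char *f,
              const char *args[], int n, const char *result)
{
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (int k = n; k >= 1; k--) {     /* kth actual parameter, last one first */
        size_t len = strlen(out);
        snprintf(out + len, cap - len, "param %s, %d\n", args[k - 1], k);
    }
    size_t len = strlen(out);
    snprintf(out + len, cap - len, "call %s\nretrieve %s\n", f, result);
}
```

Called with f, the argument places "0" and "t1", and the result place t2, it yields the four caller instructions between `t1 = y+1` and `x = t2+1` in the example.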
30Reusing Temporaries
- Storage usage can be reduced considerably by reusing space for temporaries:
  - For each type T, keep a "free list" of temporaries of type T
  - newtemp(T) first checks the appropriate free list to see if it can reuse any temps; allocates new storage if not.
- Putting temps on the free list:
  - distinguish between user variables (not freed) and compiler-generated temps (freed)
  - free a temp after the point of its last use (i.e., when its value is no longer needed).
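A minimal sketch of per-type free lists for temporaries. The `struct temp` record, the two type codes, and the `freetemp` routine are hypothetical; deciding where a temp's last use occurs is left to the code generator.

```c
#include <assert.h>
#include <stdlib.h>

enum type { T_INT, T_DOUBLE, NUM_TYPES };    /* hypothetical type codes */

struct temp {
    enum type type;
    int id;
    struct temp *next_free;      /* link used only while on a free list */
};

static struct temp *free_list[NUM_TYPES];    /* one free list per type */
static int temps_allocated = 0;              /* storage actually allocated */

struct temp *newtemp(enum type t)
{
    struct temp *tmp = free_list[t];
    if (tmp != NULL) {                       /* reuse a freed temp of type t */
        free_list[t] = tmp->next_free;
        tmp->next_free = NULL;
        return tmp;
    }
    tmp = malloc(sizeof *tmp);               /* otherwise allocate new storage */
    assert(tmp != NULL);
    tmp->type = t;
    tmp->id = temps_allocated++;
    tmp->next_free = NULL;
    return tmp;
}

/* Call after the last use of tmp. User variables are never passed here. */
void freetemp(struct temp *tmp)
{
    assert(tmp != NULL);
    tmp->next_free = free_list[tmp->type];
    free_list[tmp->type] = tmp;
}
```

A freed int temp is handed out again only to a later request for an int temp, so temps of different types never share storage.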