Title: RTL Representation
1 RTL Representation
- Kevin Klues
- Department of Computer Science and Engineering
- Fall 2005 Programming Language Seminar
2 Structure of Presentation
- Introduction
- RTL Example
- RTL Structure and Use in GCC
- RTL Compiler Passes
- Reading RTL
- Interfacing with GCC using RTL
- Exercise Suggestions
3 Introduction
- What is RTL?
- RTL = Register Transfer Language
- Produced directly from the tree representation
- Breakdown of the program into individual instructions
- Algebraic representation of each instruction describing exactly what it does
- Abstract low-level intermediate representation that will eventually be mapped into assembler code
- Has both an internal and a textual form
- Why is it important to learn about RTL?
- Most of the work done by the compiler is done on the RTL representation
- Implementation of a back end is tightly coupled with RTL
4 Example Textual RTL
int add(int x, int y) { return x + y; }
int sub(int x, int y) { return x - y; }
int main(int argc, char **argv)
{
    int x, y, z;
    y = 5;
    x = 6;
    z = add(x, y);
    x = sub(z, y);
    return 0;
}
(call_insn 21 20 22 (nil) (set (reg:SI 0 eax)
        (call (mem:QI (symbol_ref:SI ("add")) [0 S1 A8])
            (const_int 8 [0x8]))) -1 (nil)
    (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (nil))
(call_insn 29 28 30 (nil) (set (reg:SI 0 eax)
        (call (mem:QI (symbol_ref:SI ("sub")) [0 S1 A8])
            (const_int 8 [0x8]))) -1 (nil)
    (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (nil))
5 RTL Structure and Use in GCC
- RTL Object Types
- RTL Classes and Formats
- Access to Operands
- Flags in RTL Expressions
- Machine Modes
- RTL Representation of Functions
6 RTL Object Types
- Integers
- Wide Integers
- Strings
- Expressions (RTX)
- Classified by expression codes (rtl.def)
- Code of an RTX set with PUT_CODE(x, code)
- Code of an RTX extracted with GET_CODE(x)
- Vectors
- Arbitrary number of pointers to expressions
7 RTL Classes and Formats
- Every RTL expression is associated with an expression code
- Each expression code is in turn associated with a given class
- Can determine the class of an RTX code with GET_RTX_CLASS(code)
- Classes: o, <, 1, c, 3, 2, b, i, m, g, a, x (see rtl.def file)
- Each expression code also has a specified format
- Specifies the number of contained objects and their types
- Format characters: e, i, w, s, E, u, n, S, V, 0, *, T, t, B, b (see rtl.c file)
- An RTX code is defined by giving its format and associating a class
- DEF_RTL_EXPR(CONST, "const", "e", 'o')
- DEF_RTL_EXPR(JUMP_INSN, "jump_insn", "iuueiee0", 'i')
8 Access to RTX operands
- XEXP, XINT, XWINT, XSTR
- Can see how used in rtl.c
- Any operand can be accessed as any type, so must be careful
- Type of each operand is based on how the code is defined in rtl.def
- Three ways to access vector operands
- XVEC(exp, idx): access to the vector pointer at idx in exp
- XVECLEN(exp, idx): access the length of the vector at idx in exp
- XVECEXP(exp, idx, elnum): access the RTX at elnum in the vector at idx in exp
- Access to special operands
- MEM, REG, SYMBOL_REF
- Flags in RTL Expressions
- One-bit bit-fields used in certain types of expressions
- Can see how used in rtlanal.c, varasm.c
9 Machine Modes
- Describes the size of a data object and the representation used for it
- Declared in enum machine_mode (machmode.h file)
- Very few explicit references to an RTX's machine mode
- Divided into classes
- Declared in enum mode_class (machmode.h file)
- Macros exist for getting the mode, setting a mode, etc.
- All RTXs have room for a machine mode declaration
- Some tree expressions did too (declarations, types)
- Some machine modes MUST be supported by every machine
- QImode: Quarter Integer, represents a single byte as an integer
- Modes corresponding to BITS_PER_WORD, FLOAT_TYPE_SIZE, DOUBLE_TYPE_SIZE
10 Function Representation
- Doubly-linked chain of insn objects
- An insn is nothing more than an RTX with a special expression code
- Every insn contains at least three extra fields
- Unique id number, pointer to preceding insn, pointer to next insn
- Occupy the same position in all insns, independent of expression code
- Accessed using the INSN_UID, PREV_INSN, NEXT_INSN macros
- First insn in a function obtained by calling get_insns
- Certain expression codes contain even more extra fields
- insn, jump_insn, call_insn (see rtl.def file)
- Special expression codes for functions calling subroutines
- call_insn: caller of a subroutine
- call: subroutine callee
- Machine mode of an insn is normally VOIDmode
- May change during different passes of the compiler
11 RTL Compiler Passes
- Source files for RTL generation are
- stmt.c, calls.c, expr.c, explow.c, expmed.c, function.c, optabs.c, emit-rtl.c, insn-emit.c, expr.h, insn-flags.h, insn-codes.h
- Passes
- Generate exception handling landing pads (except.c)
- Cleanup control flow graph (cfgcleanup.c, cfgrtl.c, jump.c)
- Common subexpression elimination (cse.c)
- Global common subexpression elimination (gcse.c, lcm.c)
- 2 Loop Optimization passes (loop.c, dependence.c, loop-*.c)
- Jump bypassing (gcse.c)
- If conversion (ifcvt.c)
- Web construction (web.c)
- Life Analysis (flow.c)
- Instruction Combination (combine.c)
12 RTL Compiler Passes (cont)
- Passes
- Register Movement (regmove.c)
- Optimize mode switching (mode-switching.c)
- Modulo-scheduling (modulo-sched.c)
- Instruction scheduling (haifa-sched.c, sched-*.c)
- 4 Register allocation passes (regclass.c, local-alloc.c, global.c, reload.c, reload1.c, reload.h)
- Basic Block Reordering (bb-reorder.c, predict.c)
- Variable Tracking
- Delayed branch scheduling (reorg.c)
- Branch Shortening
- Register-to-stack conversion (reg-stack.c)
- Final (final.c, insn-output.c)
- Debug information output (dbxout.c, sdbout.c, dwarf.c, vmsdbgout.c)
13 Reading RTL
- To dump an RTL file when compiling your program, add the -dr flag
- gcc -dr foo.c
- Creates foo.c.rtl file with textual RTL dump
- If egypt and graphviz are installed, can get a graphical representation
- egypt foo.c.rtl | dotty -
- egypt foo.c.rtl | dot -Grotate=90 -Gsize=8.5,11 -Tps -o callgraph.ps
- egypt foo.c.rtl | dot -Gpage=8.5,11 -Tps -o callgraph.ps
- To read an RTL object from a file within C code, call read_rtx()
- Takes a single stdio stream argument
- Defined in read-rtl.c file
- Not available in the compiler itself, but in the programs that generate the back end
14 Interfacing with GCC using RTL
- Common misconception about RTL
- People want to store the RTL representation in a text file and use it as the interface between a language front end and GCC
- INFEASIBLE because:
- GCC is designed to use RTL internally only
- Correct RTL is VERY dependent on the particular target machine
- RTL does not contain a complete description of all information contained within a program
- Proper way to interface GCC to a new language is through the tree data structure already discussed
- RTL for a particular machine is then generated automatically from this representation
- See tree.h, tree.def, and notes from previous classes
15 Exercise Suggestions