CSCI 435 Compiler Design - PowerPoint PPT Presentation

1
CSCI 435 Compiler Design

Week 6 Class 3
Section 4 to Section 4.1.2
(279-290)
Ray Schneider

2
Topics of the Day

Processing the intermediate code
Interpretation
Recursive Interpretation
Iterative Interpretation

3
Where we are ...

Now we have an annotated syntax tree (AST), either
actually in memory as a data structure (Broad
Compiler) or implicitly available during parsing
(Narrow Compiler).
The annotated syntax tree bears traces of its
origin: the language constructs and the like are
represented by nodes and subtrees, despite the
relative paradigm independence of the methods
being used.
NOW THE NEXT STEP: Transforming the AST into
Intermediate Code

4
Status of various modules in compiler construction
The AST is full of nodes reflecting the specific
semantic concepts of the source language.
Intermediate Code Generation reduces the set of
specific node types to a small set of general
concepts easily implemented on actual machines.
FIND and REWRITE: Intermediate Code Generation
finds the language-characteristic nodes and
subtrees in the AST and rewrites them into
subtrees that use only a small number of
features, each corresponding closely to a set of
machine instructions.
5
After FIND and REWRITE

The resulting tree should be called THE
INTERMEDIATE CODE TREE but is usually still
called the AST.
Features of the Intermediate Code Tree are:
expressions, including assignments
routine calls, procedure headings, and return
statements
conditional and unconditional jumps
IN ADDITION:
administrative features, e.g. memory allocation
for global variables, activation record
allocation, and module linkage information
The entire range of high-level concepts of the
language is replaced by a few rather low-level
concepts.

6
Processing the Intermediate Code

Involves either:
a little pre-processing followed by execution on
an Interpreter, or
a lot of pre-processing in the form of machine
code generation followed by execution on hardware
Whatever the processing system, writing the
Run-Time and Library system is the majority of
the work and is primarily just brute-force
coding.
We will begin by looking at Interpretation

7
Simplest way ...process AST using an ...

INTERPRETER
An Interpreter considers the nodes of the AST in
the correct order and performs the actions
prescribed for them by the semantics of the
language
NOTE: unlike compilation, interpretation requires
the program's input data
The Interpreter performs actions similar to those
of a CPU, except that it works on AST nodes rather
than Machine Instructions
A CPU, by contrast, is given Machine Instructions
in the correct order and performs the actions
required by the semantics of the machine; the
semantics of the language have already been
translated into those instructions
TWO KINDS OF INTERPRETERS
RECURSIVE (works directly on the AST), and
ITERATIVE (works on a linearized version of the
AST)

8
Simple Recursive Compiler from Section 1.2.8,
Fig. 1.19 (p. 21)
9
Recursive Interpretation

An interpreting routine is provided for each node
type in the AST
Each such routine calls other similar routines
The meaning of a language construct is
defined as a function of the meanings of its
components
Interpretation starts by calling the
interpretation routine for Program with the top
node of the AST as a parameter
An important ingredient of a Recursive
Interpreter is the UNIFORM SELF-IDENTIFYING DATA
REPRESENTATION
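The "one routine per node type" structure can be sketched in C as follows. This is a minimal illustration, assuming a tiny expression-only AST; the node kinds, the Node struct, and all function names are hypothetical and are not taken from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node types for a tiny expression language. */
typedef enum { NODE_CONSTANT, NODE_ADD, NODE_MULTIPLY } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                  /* used by NODE_CONSTANT only */
    const struct Node *left;    /* used by the operator nodes */
    const struct Node *right;
} Node;

int interpret_expression(const Node *node);

/* One interpreting routine per node type. */
int interpret_constant(const Node *node) {
    return node->value;
}

/* The meaning of an addition is a function of the meanings of its parts. */
int interpret_add(const Node *node) {
    return interpret_expression(node->left) + interpret_expression(node->right);
}

int interpret_multiply(const Node *node) {
    return interpret_expression(node->left) * interpret_expression(node->right);
}

/* Dispatch on the node type and call the matching routine. */
int interpret_expression(const Node *node) {
    assert(node != NULL);
    switch (node->kind) {
    case NODE_CONSTANT: return interpret_constant(node);
    case NODE_ADD:      return interpret_add(node);
    case NODE_MULTIPLY: return interpret_multiply(node);
    }
    assert(!"unknown node kind");
    return 0;
}
```

Interpretation starts by handing the top node to interpret_expression; for the tree of 2 + 3 * 4 the recursion bottoms out in the constant nodes and the result is 14.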

10
Uniform Self-Identifying Data Representation

The Interpreter has to manipulate data values
whose types and sizes are not known when
the Interpreter is written
Implementation requires a generic model:
values are implemented as variable-size records
that specify the type of the run-time value,
its size, and the run-time value itself
A POINTER to such a record serves as the VALUE
during Interpretation

11
Example: Complex Numbers

Two Parts of Data Representation
Actual Values, vary from entity to entity
Type of Value, things in common

(Figure: a value record of type complex_number.
The actual values are specific to the given value
of type complex_number; the field names "re" and
"im" and their type "real" are common to all
values of type complex_number.)
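A self-identifying value record along these lines might look as follows in C. This is only a sketch under stated assumptions: the Type and Value structs, their field layout, and the helper functions are hypothetical, chosen to mirror the split between the part common to all complex_number values (the type descriptor) and the part specific to one value (the payload).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { TYPE_REAL, TYPE_COMPLEX } TypeKind;

/* Type descriptor: everything common to all values of one type. */
typedef struct Type {
    TypeKind kind;
    const char *name;
    size_t field_count;               /* 0 for scalar types */
    const char *const *field_names;   /* e.g. "re", "im" */
    const struct Type *field_type;    /* type shared by the fields */
} Type;

/* Value record: type, size, and the run-time value itself. */
typedef struct Value {
    const Type *type;       /* identifies the run-time type */
    size_t size;            /* payload size in bytes */
    unsigned char data[];   /* variable-size payload (C99) */
} Value;

const Type real_type = { TYPE_REAL, "real", 0, NULL, NULL };
const char *const complex_field_names[] = { "re", "im" };
const Type complex_type = {
    TYPE_COMPLEX, "complex_number", 2, complex_field_names, &real_type
};

/* Allocate a value record of any type; returns NULL if out of memory. */
Value *new_value(const Type *type, const void *payload, size_t size) {
    Value *v = malloc(sizeof *v + size);
    if (v == NULL) return NULL;
    v->type = type;
    v->size = size;
    memcpy(v->data, payload, size);
    return v;
}

Value *new_complex(double re, double im) {
    double parts[2] = { re, im };
    return new_value(&complex_type, parts, sizeof parts);
}

/* Read field `index` (0 = re, 1 = im) after checking the type tag. */
double complex_field(const Value *v, size_t index) {
    double field;
    assert(v->type->kind == TYPE_COMPLEX && index < v->type->field_count);
    memcpy(&field, v->data + index * sizeof field, sizeof field);
    return field;
}
```

The interpreter passes a Value pointer around as "the value"; any routine can inspect v->type before touching the payload, which is what makes the representation self-identifying.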
12
Status Indicator: another important feature

Used to direct the flow of control
Primary Component
Mode of operation of the Interpreter: an
enumeration value, normally something
like "Normal Mode" indicating sequential flow of
control; other values indicate Jumps, Exceptions,
Function Returns
Second Component
A value in the wider sense that supplies more
information about the non-sequential flow of
control, e.g. the value for Return Mode, the names
and values of Exceptions, the label for Jump Mode,
etc.
The Status Indicator should also contain the file
name and line number of the text where the status
indicator was created, and possibly other
debugging information
Each interpreting routine checks the status
indicator after each call to another routine to
see how to carry on
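A status indicator of this shape, and the "check after every call" discipline, can be sketched in C. The Status and Statement layouts, the two statement kinds, and the global counter are assumptions made for the illustration only.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NORMAL_MODE, JUMP_MODE, EXCEPTION_MODE, RETURN_MODE } Mode;

typedef struct {
    Mode mode;          /* primary component: mode of operation */
    int value;          /* second component: return value, label, ... */
    const char *file;   /* source position where the status was created */
    int line;
} Status;

Status status = { NORMAL_MODE, 0, NULL, 0 };

/* Hypothetical statement kinds: one sequential, one non-sequential. */
typedef enum { STMT_SIMPLE, STMT_RETURN } StatementKind;

typedef struct {
    StatementKind kind;
    int operand;        /* return value for STMT_RETURN */
    const char *file;   /* source position of the statement */
    int line;
} Statement;

int statements_executed = 0;

void elaborate_statement(const Statement *s) {
    statements_executed++;
    if (s->kind == STMT_RETURN) {
        /* Leave normal mode and record why and where. */
        status.mode = RETURN_MODE;
        status.value = s->operand;
        status.file = s->file;
        status.line = s->line;
    }
}

void elaborate_sequence(const Statement *list, size_t count) {
    assert(list != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        elaborate_statement(&list[i]);
        /* Check the status after each call to see how to carry on. */
        if (status.mode != NORMAL_MODE) return;
    }
}
```

A routine that elaborates a function call would run the body with elaborate_sequence, find RETURN_MODE in the status, take status.value as the result, and reset the mode to NORMAL_MODE before returning to its own caller.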

13
Outline of a routine for recursive interpretation
of an if-statement
PROCEDURE Elaborate if statement (If node):
    SET Result TO Evaluate condition (If node .condition);
    IF Status .mode /= Normal mode: RETURN;
    IF Result .type /= Boolean:
        ERROR "Condition in if-statement is not of type Boolean";
        RETURN;
    IF Result .boolean .value = True:
        Elaborate statement (If node .then part);
    ELSE Result .boolean .value = False:
        // Check if there is an else-part at all:
        IF If node .else part /= No node:
            Elaborate statement (If node .else part);
        ELSE If node .else part = No node:
            SET Status .mode TO Normal mode;
14
Typical Handling of the Symbol Table

Variables, named constants, and other named
entities are handled by the Symbol Table. For
example, for a variable V of type T the entry is
a record, say one called "Declarable", containing:
a pointer to the name V
the file name and line number of its declaration
an indication of its kind (variable, constant,
field selector, etc.)
a pointer to the type T
a pointer to newly allocated room for the value
of V
a bit telling whether or not V has been
initialized, if known
one or more scope- and stack-related pointers,
depending on the language
other data as required (language dependent)
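The record listed above could be declared in C roughly as follows. This is a sketch, not the layout used by the text: the field names, the minimal Type struct, and the single "enclosing" scope pointer are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef enum { KIND_VARIABLE, KIND_CONSTANT, KIND_FIELD_SELECTOR } DeclarableKind;

/* Minimal stand-in for a type descriptor. */
typedef struct Type {
    const char *name;
    size_t size;        /* bytes needed for one value of this type */
} Type;

typedef struct Declarable {
    const char *name;               /* pointer to the name V */
    const char *file;               /* where V was declared */
    int line;
    DeclarableKind kind;            /* variable, constant, ... */
    const Type *type;               /* pointer to the type T */
    void *value;                    /* newly allocated room for the value */
    bool initialized;               /* has V been assigned yet? */
    struct Declarable *enclosing;   /* scope-related pointer */
} Declarable;

/* Create the symbol table entry for a variable; NULL if out of memory. */
Declarable *declare_variable(const char *name, const Type *type,
                             const char *file, int line,
                             Declarable *enclosing) {
    assert(name != NULL && type != NULL && type->size > 0);
    Declarable *d = malloc(sizeof *d);
    if (d == NULL) return NULL;
    d->value = calloc(1, type->size);
    if (d->value == NULL) { free(d); return NULL; }
    d->name = name;
    d->file = file;
    d->line = line;
    d->kind = KIND_VARIABLE;
    d->type = type;
    d->initialized = false;
    d->enclosing = enclosing;
    return d;
}

void free_declarable(Declarable *d) {
    if (d != NULL) {
        free(d->value);
        free(d);
    }
}
```

Keeping the "initialized" bit next to the value lets the recursive interpreter report a read of an uninitialized variable together with the declaration's file name and line number.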

15
Summary

A Recursive Interpreter can generally be written
quickly, so it is useful for rapid prototyping
Not the best architecture for heavy-duty
interpreting, but good for debugging language
concepts and features
Big Disadvantage: very slow, as much as 1000
times slower than compiled code for the same
language
This can be improved somewhat by doing as much
static context checking as possible in the
pre-interpretive phase (see Memoization, p. 286)

16
Iterative Interpretation

The structure of an Iterative Interpreter is much
closer to that of a CPU than that of a Recursive
Interpreter is.
It consists of a flat loop over a case statement
which contains a code segment for each node type;
the code segment for a node type implements the
semantics of that node type
It requires a fully annotated and threaded AST
and maintains an ACTIVE NODE POINTER which points
to the node being interpreted, i.e. the ACTIVE
NODE
The interpreter runs the code for the Active Node
and then sets the Active Node Pointer to another
node, the successor node.

17
    #include <stdio.h>      /* for printf() */
    #include "parser.h"     /* for types AST_node and Expression */
    #include "thread.h"     /* for Thread_AST() and Thread_start */
    #include "stack.h"      /* for Push() and Pop() */
    #include "backend.h"    /* for self check */

    static AST_node *Active_node_pointer;

    static void Interpret_iteratively(void) {
        while (Active_node_pointer != 0) {
            /* there is only one node type, Expression: */
            Expression *expr = Active_node_pointer;
            switch (expr->type) {
            case 'D':
                Push(expr->value);
                break;
            case 'P': {
                int e_left = Pop(); int e_right = Pop();
                switch (expr->oper) {
                case '+': Push(e_left + e_right); break;
                case '*': Push(e_left * e_right); break;
                }}
                break;
            }
            Active_node_pointer = Active_node_pointer->successor;
        }
        printf("%d\n", Pop());    /* print the result */
    }

    void Process(AST_node *icode) {
        Thread_AST(icode); Active_node_pointer = Thread_start;
        Interpret_iteratively();
    }

An iterative interpreter for the demo compiler of
Section 1.2: JUST A BIG SWITCH STATEMENT
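The listing on this slide depends on the demo compiler's own headers. A self-contained variant of the same loop, assuming the thread is built by hand rather than by Thread_AST(), might look like this; the Expression layout, the fixed-size stack, and the lower-case function names are hypothetical simplifications.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Expression {
    char type;                      /* 'D' = digit, 'P' = operation */
    int value;                      /* for 'D' nodes */
    char oper;                      /* for 'P' nodes: '+' or '*' */
    struct Expression *successor;   /* the thread through the AST */
} Expression;

enum { STACK_SIZE = 64 };
int stack[STACK_SIZE];
int stack_top = 0;

void push(int v) { assert(stack_top < STACK_SIZE); stack[stack_top++] = v; }
int pop(void)    { assert(stack_top > 0); return stack[--stack_top]; }

/* Flat loop over a switch; returns the value left on the stack. */
int interpret_iteratively(Expression *start) {
    Expression *active_node = start;
    while (active_node != NULL) {
        switch (active_node->type) {
        case 'D':
            push(active_node->value);
            break;
        case 'P': {
            int right = pop();
            int left = pop();
            switch (active_node->oper) {
            case '+': push(left + right); break;
            case '*': push(left * right); break;
            default:  assert(!"unknown operator");
            }
            break;
        }
        default:
            assert(!"unknown node type");
        }
        /* Follow the thread to the successor node. */
        active_node = active_node->successor;
    }
    return pop();
}
```

For (2 + 3) * 4 the thread visits the nodes in postfix order: 2, 3, +, 4, *. Starting the loop at the node for 2 leaves 20 on the stack.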
18
the Iterative Interpreter 1

Data Structures resemble those inside a compiled
program more than those in a Recursive
Interpreter
e.g. an array holding the global data; if the
source language is stack oriented, the iterative
interpreter maintains a stack.
Variables and Entities have an address, which is
generally an offset into a memory array
The Symbol Table is no longer needed for
execution, but is useful for generating better
error messages

19
the Iterative Interpreter 2

An Iterative Interpreter has more information
about run-time events than a compiled program but
less than a recursive interpreter
One can make up for the lack of a symbol table in
an iterative interpreter by using SHADOW MEMORY
parallel to the memory arrays maintained by the
interpreter. Each byte of Shadow Memory holds
properties of the corresponding byte in memory,
e.g. "is uninitialized", "is a non-first byte of
a pointer", "belongs to a read-only array"; the
different properties can be encoded as byte codes
Some Iterative Interpreters store the AST in a
single array for several reasons:
easier to write it to file
more compact representation
reusable without regenerating the AST
20
Three Forms of Storing an AST: a Graph
21
Storing an AST in an array or as
pseudo-instructions
(Figure: the same if-statement stored two ways)
Array: condition, IF, statement 1, statement 2,
statement 3, statement 4
Pseudo-Instructions: condition, IF_FALSE,
statement 1, JUMP, statement 2, statement 3,
statement 4
22
AST Constructions and interpretation

The array form usually puts the successor of a
node right after the node
It may even omit the successor pointer altogether,
making the next array element the default, and
include pointers only when the next node is NOT
the successor node
Historically, an Iterative Interpreter mimics a
CPU working on a compiled program, and the AST
array mimics the compiled program
Iterative Interpreters are easier to write even
than recursive interpreters and much easier than
compilers
The only serious deficiency is speed: even the
best interpreter is typically 30 times slower
than optimized compiled code
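The pseudo-instruction form with an implicit successor can be sketched in C. The opcodes OP_PUSH and OP_HALT, the Instruction struct, and the run function are hypothetical; IF_FALSE and JUMP follow the names used on the earlier slide. Each instruction's successor is the next array element unless the instruction says otherwise.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_PUSH, OP_IF_FALSE, OP_JUMP, OP_HALT } Opcode;

typedef struct {
    Opcode op;
    int operand;    /* value for OP_PUSH, target index for jumps */
} Instruction;

/* Run the array and return the value on top of the stack at the end. */
int run(const Instruction *code, size_t length) {
    int stack[32];
    size_t depth = 0;
    size_t pc = 0;
    while (pc < length) {
        const Instruction *in = &code[pc];
        size_t next = pc + 1;    /* default successor: the next element */
        switch (in->op) {
        case OP_PUSH:
            assert(depth < 32);
            stack[depth++] = in->operand;
            break;
        case OP_IF_FALSE:        /* explicit successor when condition is 0 */
            assert(depth > 0);
            if (stack[--depth] == 0) next = (size_t)in->operand;
            break;
        case OP_JUMP:            /* always an explicit successor */
            next = (size_t)in->operand;
            break;
        case OP_HALT:
            next = length;
            break;
        }
        pc = next;
    }
    assert(depth > 0);
    return stack[depth - 1];
}
```

An if-else then takes the shape shown on the earlier slide: condition, IF_FALSE to the else part, the then part, JUMP past the else part, the else part. Only the two jump instructions carry a successor index; every other instruction relies on the default.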

23
Next time ...

Next time we'll start looking at Code Generation
We'll spend about two or three classes on it.

24
Homework for Week 8

Bison Familiarization
Read the entire 39 pages of "A Compact Guide To
Lex and Yacc" // you can skim through it the
first time
THEN concentrate first on getting the lex example
on page 10 running
THEN after you have that running go on to
Practice, Part 1 and strive to get the primitive
calculator running (pages 14 through 17)
HINTS: the lex input on page 10 can be made to
run by extending it with the line (cribbed from
our text)
int yywrap(void) { return 1; } // at the end, and
you have to put
#include <stdlib.h> // at the top of your lex.yy.c
output. Then the code will add line numbers to a
text file, reading the file name from the
command line and sending the output to stdout.

25
References

Text: Modern Compiler Design (figures)
Lex: A Lexical Analyzer Generator, by M.E. Lesk
and E. Schmidt
Yacc: Yet Another Compiler-Compiler, by Stephen C.
Johnson
see http://dinosaur.compilertools.net/yacc/index.html
and http://dinosaur.compilertools.net/lex/index.html
