CSCI 435 Compiler Design - PowerPoint PPT Presentation

Provided by: OwenAst9

1
CSCI 435 Compiler Design
  • Week 5 Class 3
  • Sections 3.1.7 to 3.2.1
  • (pages 230 to 244)
  • Ray Schneider

2
Topics of the Day
  • L-attributed grammars
  • . . . and top-down parsers
  • . . . and bottom-up parsers
  • S-attributed grammars
  • Equivalence of L-attributed and S-attributed
    grammars
  • Manual Methods
  • Threading the AST

3
L-attributed grammars
  • The parsing process creates nodes in the syntax
    tree from left to right,
  • either TOP DOWN, parent to children, or
  • BOTTOM UP, children to parent
  • L-attributed grammars
  • are attribute grammars which allow the attributes
    to be evaluated in one left-to-right traversal of
    the syntax tree
  • They are characterized by the fact that no
    dependency graph of any of their production rules
    has a data-flow arrow that points from a child to
    that child or to a child to the left of it.

4
example(s)
  • Many programming language grammars are
    L-attributed, such as the Constant_Definition
    grammar whose dependency graph we saw previously

Constant_Definition is L-attributed.
Number is an example of a non-L-attributed
grammar.
5
[Figure: syntax tree with root A, children B1 to B4,
and nodes C1 to C3 below B3. Synthesized attributes
flow upward, inherited attributes flow downward,
and processing runs left to right.]
An important consequence of the L-attributed
property is that once work on a node has started,
no part of the compiler will need to return to
one of the node's siblings on the left to do
processing there. Two sets of attributes play a
role in processing a node such as C2:
1) those of the nodes on the path to the top:
   C2, B3, A
2) those of the siblings to the left at all tiers
6
The point ...
  • the attributes of the node being processed and
    of the nodes on its path to the top are stored in
    the corresponding nodes
  • nodes to the left are complete except for the
    effects of their synthesized attributes, which
    can be stored in the parent node
  • so at all times we can restrict ourselves to the
    nodes that lie on the path from the top to the
    node being processed
  • this is like top-down parsing: the code can be
    generated as the parser passes through

7
L-attributed grammars and top down parsers
  • We saw LLgen earlier, which injects parameters
    and code directly into the parser
  • Coordinating parsing and attribute evaluation
    is a great simplification, but it is applicable
    to a smaller set of attribute grammars
  • On the next slide we look at an LLgen example. It
    includes an inherited attribute, the symbol
    table, which is produced as a synthesized
    attribute of one non-terminal and then passed
    down through the processing of the production
    rules. Each result is passed up as a value
    assigned to a memory location such as t.

8
LLgen code for L-attributed grammar for simple
expressions

    {
    #include "symbol_table.h"
    }
    %lexical get_next_token_class;
    %token IDENTIFIER;
    %token DIGIT;
    %start Main_Program, main;

    main {symbol_table sym_tab; int result;}:
        {init_symbol_table(&sym_tab);}
        declarations(sym_tab)
        expression(sym_tab, &result)
        {printf("result = %d\n", result);}
    ;

    declarations(symbol_table sym_tab):
        declaration(sym_tab) [ ',' declaration(sym_tab) ]* ';'
    ;

    declaration(symbol_table sym_tab) {symbol_entry *sym_ent;}:
        IDENTIFIER {sym_ent = look_up(sym_tab, Token.repr);}
        '=' DIGIT {sym_ent->value = Token.repr - '0';}
    ;

    expression(symbol_table sym_tab; int *e) {int t;}:
        term(sym_tab, &t) {*e = t;}
        expression_tail_option(sym_tab, e)
    ;

    expression_tail_option(symbol_table sym_tab; int *e) {int t;}:
        '-' term(sym_tab, &t) {*e -= t;}
        expression_tail_option(sym_tab, e)
    |
    ;

    term(symbol_table sym_tab; int *t):
        IDENTIFIER {*t = look_up(sym_tab, Token.repr)->value;}
    ;

The symbol table, containing representations of
some identifiers and their integer values, is
produced as a synthesized attribute by
declarations in main. It is then an inherited
attribute: the symbol table is passed down through
expression → expression_tail_option → term.
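The same attribute flow can be written by hand as a recursive-descent parser, which may make the roles of the parameters easier to see. The following is only a sketch under assumptions that are not in the slides: identifiers are single lower-case ASCII letters, the symbol table is a plain array, the lexer is a character cursor, and the names match_char and evaluate are hypothetical. The symbol table pointer plays the inherited attribute, and the int * out-parameters play the synthesized ones.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical symbol table: one integer value per lower-case letter. */
typedef struct { int value[26]; } symbol_table;

/* Hypothetical one-character "lexer": a cursor into the input text. */
static const char *input;

static int match_char(char c) {
    if (*input == c) { input++; return 1; }
    return 0;
}

/* term -> IDENTIFIER ; sym_tab is inherited, *t is synthesized. */
static int term(const symbol_table *sym_tab, int *t) {
    if (!islower((unsigned char)*input)) return 0;
    *t = sym_tab->value[*input - 'a'];
    input++;
    return 1;
}

/* expression_tail_option -> '-' term expression_tail_option | empty */
static int expression_tail_option(const symbol_table *sym_tab, int *e) {
    int t;
    if (!match_char('-')) return 1;        /* empty alternative */
    if (!term(sym_tab, &t)) return 0;
    *e -= t;
    return expression_tail_option(sym_tab, e);
}

/* expression -> term expression_tail_option */
static int expression(const symbol_table *sym_tab, int *e) {
    int t;
    if (!term(sym_tab, &t)) return 0;
    *e = t;
    return expression_tail_option(sym_tab, e);
}

/* declaration -> IDENTIFIER '=' DIGIT */
static int declaration(symbol_table *sym_tab) {
    int slot;
    if (!islower((unsigned char)*input)) return 0;
    slot = *input++ - 'a';
    if (!match_char('=') || !isdigit((unsigned char)*input)) return 0;
    sym_tab->value[slot] = *input++ - '0';
    return 1;
}

/* declarations -> declaration [ ',' declaration ]* ';' */
static int declarations(symbol_table *sym_tab) {
    if (!declaration(sym_tab)) return 0;
    while (match_char(','))
        if (!declaration(sym_tab)) return 0;
    return match_char(';');
}

/* Plays the role of the main rule: returns 1 on success and
   stores the value of the expression in *result. */
int evaluate(const char *text, int *result) {
    symbol_table sym_tab = {{0}};
    assert(text != NULL && result != NULL);
    input = text;
    return declarations(&sym_tab) && expression(&sym_tab, result)
        && *input == '\0';
}
```

For example, evaluate("a=5,b=2;a-b-b", &r) should succeed and leave 1 in r, since 5 - 2 - 2 is evaluated left to right as the parser passes through.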
9
A more technical definition of a 'narrow' compiler
  • A narrow compiler is a compiler based formally or
    informally on some form of L-attributed grammar.
  • It does not save substantially more information
    than that which is present on the path from the
    top to the node being processed
  • Generally the length of that path is proportional
    to ln(n), where n is the length of the program,
    while the size of the entire AST is proportional
    to n

10
L-attributed grammars and bottom-up parsers
  • Attribute evaluation in bottom-up parsers is not
    so obvious. The problem is that inherited
    attributes must be passed down, and the proper
    path is not known until the node is resolved
    through handle detection
  • Yacc, the most famous LALR(1) parser generator,
    does it anyway. How?
  • To the stack of shifted terminals and reduced
    non-terminals in a bottom-up parser we add an
    attribute stack containing the attributes of each
    stack element in the same order
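The parallel attribute stack can be sketched in a few lines of C. This is an illustration only, not yacc's actual implementation: the symbol names, the fixed stack size, and the functions push_entry, reduce_subtraction, and demo_attribute_stack are all hypothetical, and the parser moves for the input "7 - 2 - 1" are replayed by hand instead of being driven by a parse table.

```c
#include <assert.h>

#define STACK_MAX 64   /* hypothetical fixed bound for this sketch */

/* Hypothetical grammar symbols for: expression -> expression '-' term */
enum symbol { SYM_EXPRESSION, SYM_TERM, SYM_MINUS };

/* Two parallel stacks: entry i of attr_stack holds the attribute of
   the grammar symbol stored in entry i of symbol_stack. */
static enum symbol symbol_stack[STACK_MAX];
static int attr_stack[STACK_MAX];
static int stack_top = 0;   /* number of entries on both stacks */

/* A shift (or the push at the end of a reduce) adds to both stacks. */
static void push_entry(enum symbol sym, int attr) {
    assert(stack_top < STACK_MAX);
    symbol_stack[stack_top] = sym;
    attr_stack[stack_top] = attr;
    stack_top++;
}

/* Reduce by expression -> expression '-' term: the parent reads the
   attributes of its children from the attribute stack, pops the
   three entries, and pushes its own synthesized attribute. */
static void reduce_subtraction(void) {
    int left, right;
    assert(stack_top >= 3);
    assert(symbol_stack[stack_top - 3] == SYM_EXPRESSION);
    assert(symbol_stack[stack_top - 2] == SYM_MINUS);
    assert(symbol_stack[stack_top - 1] == SYM_TERM);
    left = attr_stack[stack_top - 3];
    right = attr_stack[stack_top - 1];
    stack_top -= 3;
    push_entry(SYM_EXPRESSION, left - right);
}

/* Replays by hand the moves a parser would make for "7 - 2 - 1". */
int demo_attribute_stack(void) {
    stack_top = 0;
    push_entry(SYM_EXPRESSION, 7);  /* first term, already reduced */
    push_entry(SYM_MINUS, 0);       /* '-' carries no attribute */
    push_entry(SYM_TERM, 2);
    reduce_subtraction();           /* expression now carries 5 */
    push_entry(SYM_MINUS, 0);
    push_entry(SYM_TERM, 1);
    reduce_subtraction();           /* expression now carries 4 */
    return attr_stack[stack_top - 1];
}
```

Because both stacks grow and shrink together, a reduce can always find the attributes of the right-hand side at fixed offsets from the top.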

11
Code in the middle
  • A → B C becomes
  • A → B {C.inh_attr := f(B.syn_attr);} C
  • where B.syn_attr is a synthesized attribute of B
    and
  • C.inh_attr is an inherited attribute of C
  • the code is attached to an ε-rule, say A_action1:
  • A → B A_action1 C
  • A_action1 → ε {C.inh_attr := f(B.syn_attr);}
  • The code is now at the end of an alternative and
    can be executed when that alternative is reduced
  • This only works if A → B • C is the only
    hypothesis at that point
  • Another trick:
  • lay out the attribute stack so that the one and
    only synthesized attribute of one node is in the
    same position as the one and only inherited
    attribute of the next node (no code needs to be
    executed in between, i.e. no code in the middle
    of a grammar rule)

12
S-attributed grammars
  • S-attributed grammars have no inherited
    attributes at all (seems like a tough
    restriction)
  • BUT anything that can be done in an L-attributed
    grammar can be done in an S-attributed grammar
  • Now we just stack the synthesized attributes, and
    at the end of the alternative the parent scoops
    up all the attributes and processes them
  • Example yacc action:
    $$ = new_expr(); $$->type = '-';
    $$->expr = $1; $$->term = $3;
  • yacc grammar rules require exactly one
    synthesized attribute. If more than one is
    required, a record is returned, forming the only
    attribute

13
Equivalence of L and S attributed grammars
  • L-attributed grammars are rather easily converted
    to S-attributed grammars
  • The method: delay any computation that cannot be
    done now to a later moment when it can be done
  • Implementation: create a data structure which is
    populated with the computations specified for
    inherited attributes plus synthesized attributes,
    and pass it up the levels until the computation
    can be made
  • Simple in principle but only practically feasible
    for small problems
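As a small illustration of the delay idea, consider the expression tail from the LLgen example. In the L-attributed version the tail receives the left operand from above. In the sketch below, which is an assumption-laden illustration and not taken from the text, the tail instead synthesizes a list of pending subtractions, and the parent performs them once the left operand is known. The type pending_sub and the functions empty_tail, make_tail, apply_pending, and demo_delayed_evaluation are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* A delayed computation: "subtract operand from the left operand,
   whatever that turns out to be". */
struct pending_sub {
    int operand;
    const struct pending_sub *next;  /* computations to perform afterwards */
};

/* Synthesized attribute of the empty tail: nothing is pending. */
static const struct pending_sub *empty_tail(void) {
    return NULL;
}

/* Synthesized attribute of  tail -> '-' term tail : record the
   subtraction instead of performing it. The caller supplies the
   storage for the new list cell. */
static const struct pending_sub *make_tail(struct pending_sub *cell,
                                           int term_value,
                                           const struct pending_sub *rest) {
    assert(cell != NULL);
    cell->operand = term_value;
    cell->next = rest;
    return cell;
}

/* expression -> term tail : only here is the left operand known, so
   the delayed computations can finally be carried out, in order. */
static int apply_pending(int left_operand, const struct pending_sub *tail) {
    int value = left_operand;
    for (; tail != NULL; tail = tail->next)
        value -= tail->operand;
    return value;
}

/* Bottom-up order for "9 - 3 - 4": the innermost tail is built first. */
int demo_delayed_evaluation(void) {
    struct pending_sub first_cell, second_cell;
    const struct pending_sub *inner = make_tail(&second_cell, 4, empty_tail());
    const struct pending_sub *outer = make_tail(&first_cell, 3, inner);
    return apply_pending(9, outer);
}
```

No attribute flows downward here. The price is the extra data structure, which hints at why the conversion becomes impractical for larger problems.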

14
Manual Methods
  • Much context processing is still done manually by
    writing code in traditional languages like C or
    C++
  • Two methods to collect context information from
    the AST are
  • Symbolic Interpretation, and
  • Data-Flow Equations
  • Start with AST developed by syntax analysis
  • Add to each node flow of control information
  • typically in the form of successor pointers
    linking the nodes of the AST to form the
    additional data structure, THE CONTROL FLOW GRAPH

15
Constructing the Control Flow Graph
  • Done statically by THREADING the tree
  • The threading routine for a node type N (a
    non-terminal) gets a pointer to the node,
    determines which production rule of N describes
    the node, and calls the threading routines for
    its children
  • The routines build the thread recursively, using
    a variable Last node pointer
  • Last node pointer points to the dynamically last
    node
  • each new node N is stored in Last node pointer
    .successor, and then Last node pointer is made
    to point to N.

16
Example: threading routine for a binary expression

    PROCEDURE Thread binary expression (Expr node pointer):
        Thread expression (Expr node pointer .left operand);
        Thread expression (Expr node pointer .right operand);
        // link this node to the dynamically last node
        SET Last node pointer .successor TO Expr node pointer;
        // make this node the new dynamically last node
        SET Last node pointer TO Expr node pointer;

[Figure: control-flow graph of b*b - 4*a*c]
Statically the '-' node is the first node,
but dynamically, at run time, the left-most b is
the first node.
17
    #include "parser.h"   /* for types AST_node and Expression */
    #include "thread.h"   /* for self check */

    /* PRIVATE */
    static AST_node *Last_node;

    static void Thread_expression(Expression *expr) {
        switch (expr->type) {
        case 'D':
            Last_node->successor = expr; Last_node = expr;
            break;
        case 'P':
            Thread_expression(expr->left);
            Thread_expression(expr->right);
            Last_node->successor = expr; Last_node = expr;
            break;
        }
    }

    AST_node *Thread_start;

    void Thread_AST(AST_node *icode) {
        AST_node Dummy_node;

        Last_node = &Dummy_node; Thread_expression(icode);
        Last_node->successor = (AST_node *)0;
        Thread_start = Dummy_node.successor;
    }
Global variable
Threading code for demo compiler from Section 1.2
18
Complications caused by flow of control
  • Example: the IF statement (two problems)
  • 1. The node corresponding to the run-time decision
    has two successors, not one
  • 2. When we get to the node following the IF
    statement, its address must be recorded at the end
    of both the then-part and the else-part, so a
    single Last node pointer is no longer sufficient
  • Solution to 1: store two successor pointers in the
    IF node
  • Solutions to 2:
  • replace Last node pointer by a set of last nodes,
    or
  • construct a special join node to merge the
    diverging flow of control (such a node is part of
    the Control Flow Graph but not of the AST)

19
Simple Threading Routine for if-statements
    PROCEDURE Thread if statement (If node pointer):
        Thread expression (If node pointer .condition);
        SET Last node pointer .successor TO If node pointer;
        SET End if join node TO Generate join node ();

        SET Last node pointer TO address of a local node Aux last node;
        Thread block (If node pointer .then part);
        SET If node pointer .true successor TO Aux last node .successor;
        SET Last node pointer .successor TO address of End if join node;

        SET Last node pointer TO address of Aux last node;
        Thread block (If node pointer .else part);
        SET If node pointer .false successor TO Aux last node .successor;
        SET Last node pointer .successor TO address of End if join node;

        SET Last node pointer TO address of End if join node;
20
AST and control flow graph after threading
Note that the Last node pointer has been moved to
point to the end-if join node.
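The if-statement threading routine translates fairly directly into C. The sketch below is self-contained and makes several assumptions that are not in the slides: the node layout, the names thread_node, thread_if, generate_join_node, and thread_program are hypothetical, each part of the if-statement is a single node instead of a block, and the join node is allocated with calloc and never freed. It also deviates from the pseudocode in one detail: the link to the join node is made before the true or false successor is read, so that an empty then-part or else-part leads straight to the join node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum node_kind { SIMPLE_NODE, IF_NODE, JOIN_NODE };

struct node {
    enum node_kind kind;
    struct node *condition, *then_part, *else_part; /* AST links, IF_NODE only */
    struct node *successor;                         /* control-flow links */
    struct node *true_successor, *false_successor;  /* IF_NODE only */
};

static struct node *last_node;   /* the dynamically last node */

static void thread_node(struct node *n);

/* Join nodes belong to the control flow graph, not to the AST. */
static struct node *generate_join_node(void) {
    struct node *join = calloc(1, sizeof *join);
    assert(join != NULL);
    join->kind = JOIN_NODE;
    return join;
}

static void thread_if(struct node *if_node) {
    struct node aux_last_node;   /* local stand-in for "the node before" */
    struct node *end_if_join;

    thread_node(if_node->condition);
    last_node->successor = if_node;
    end_if_join = generate_join_node();

    aux_last_node.successor = NULL;
    last_node = &aux_last_node;
    thread_node(if_node->then_part);
    last_node->successor = end_if_join;
    if_node->true_successor = aux_last_node.successor;

    aux_last_node.successor = NULL;
    last_node = &aux_last_node;
    thread_node(if_node->else_part);
    last_node->successor = end_if_join;
    if_node->false_successor = aux_last_node.successor;

    last_node = end_if_join;
}

static void thread_node(struct node *n) {
    if (n == NULL) return;                 /* empty part: nothing to thread */
    if (n->kind == IF_NODE) { thread_if(n); return; }
    last_node->successor = n;              /* link to the dynamically last node */
    last_node = n;                         /* and become the new last node */
}

/* Threads the tree under root and returns the first node to execute. */
struct node *thread_program(struct node *root) {
    struct node dummy = {0};
    last_node = &dummy;
    thread_node(root);
    last_node->successor = NULL;
    return dummy.successor;
}
```

After threading a single if-statement, the then-part and the else-part should both point to the same join node, and the join node ends the thread.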
21
Using an attribute grammar
  • Threading the AST can be expressed using an
    attribute grammar
  • successor pointers are implemented as inherited
    attributes, and
  • each node has an additional synthesized
    attribute that is set by the evaluation rules to
    the pointer to the first node to be executed in
    its subtree.

    If_statement (INH successor, SYN first) →
        'IF' Condition 'THEN' Then_part 'ELSE' Else_part 'END' 'IF'
        ATTRIBUTE RULES:
            SET If_statement .first TO Condition .first;
            SET Condition .true successor TO Then_part .first;
            SET Condition .false successor TO Else_part .first;
            SET Then_part .successor TO If_statement .successor;
            SET Else_part .successor TO If_statement .successor;
22
Doubly-linked list representation of CFG
A doubly linked implementation of the control
flow graph provides two sets of pointers for each
node: dynamic successors and dynamic
predecessors. This gives algorithms traversing
the control flow graph great freedom of movement
and proves particularly useful when processing
data-flow equations.
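One way to keep both pointer sets consistent is to record every edge in both directions at the moment it is created. The following sketch assumes a fixed small bound on the number of edges per node and uses hypothetical names (cfg_node, cfg_add_edge, cfg_distance_to_entry). The backward walk assumes an acyclic graph, which would not hold once loops are threaded.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EDGES 4   /* hypothetical fixed bound, enough for this sketch */

struct cfg_node {
    const char *name;
    struct cfg_node *successors[MAX_EDGES];    /* dynamic successors */
    size_t n_successors;
    struct cfg_node *predecessors[MAX_EDGES];  /* dynamic predecessors */
    size_t n_predecessors;
};

/* Record the edge from -> to in both directions. */
void cfg_add_edge(struct cfg_node *from, struct cfg_node *to) {
    assert(from != NULL && to != NULL);
    assert(from->n_successors < MAX_EDGES);
    assert(to->n_predecessors < MAX_EDGES);
    from->successors[from->n_successors++] = to;
    to->predecessors[to->n_predecessors++] = from;
}

/* Walk backwards along first predecessors and count the steps needed
   to reach a node without predecessors. Assumes an acyclic graph. */
size_t cfg_distance_to_entry(const struct cfg_node *node) {
    size_t steps = 0;
    assert(node != NULL);
    while (node->n_predecessors > 0) {
        node = node->predecessors[0];
        steps++;
    }
    return steps;
}
```

With the predecessor pointers in place, an algorithm can move against the flow of control as easily as along it, as the backward walk shows.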
23
Next time
  • We'll start looking at two manual methods for
    context handling
  • SYMBOLIC INTERPRETATION and
  • DATA FLOW EQUATIONS
  • Then we'll start looking at the process of
    Intermediate Code Generation

24
Homework for Week 7
  • Get the Lex/Flex generated code from page 95,
    figure 2.41, to run under Visual C++
  • Hints: resolve function conflicts by adding
    #include <appropriate library> in the generated
    C file, e.g. exit, malloc, realloc, and free are
    in <stdlib.h>, and the strcpy() function is in
    <string.h>
  • You can include the default main() by setting
    YY_MAIN to 1 in the line #define YY_MAIN 1
  • Then extend the default main() to call
    get_next_token() and print out appropriate results

25
References
  • Text: Modern Compiler Design
