1
Abstract Syntax
  • CMSC
  • CS431

2
Abstract Syntax Trees
  • So far a parser traces the derivation of a
    sequence of tokens
  • The rest of the compiler needs a structural
    representation of the program
  • Abstract syntax trees
  • Like parse trees but ignore some details
  • Abbreviated as AST

3
Abstract Syntax Tree (Cont.)
  • Consider the grammar
  • E → int | ( E ) | E + E
  • And the string
  • 5 + (2 + 3)
  • After lexical analysis (a list of tokens)
  • int5 '+' '(' int2 '+' int3 ')'
  • During parsing we build a parse tree

4
AST Covered
  • We built an AST by hand in the first project
  • Let's see what the Galles text has to say about
    ASTs
  • Let's also look at some code

5
Example of Abstract Syntax Tree
[Figure: AST for 5 + (2 + 3): a PLUS node whose children are 5 and a nested PLUS node with children 2 and 3]
  • Also captures the nesting structure
  • But abstracts from the concrete syntax
  • => more compact and easier to use
  • An important data structure in a compiler

6
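Example of Parse Tree
[Figure: parse tree for int5 '+' '(' int2 '+' int3 ')': the root E has two E children; the left E derives int5; the right E derives ( E ), and that inner E has two E children deriving int2 and int3]
  • Traces the operation of the parser
  • Does capture the nesting structure
  • But too much info
  • Parentheses
  • Single-successor nodes
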
7
Semantic Actions
  • This is what we'll use to construct ASTs
  • Each grammar symbol may have attributes
  • For terminal symbols (lexical tokens) attributes
    can be calculated by the lexer
  • Each production may have an action
  • Written as X → Y1 … Yn { action }
  • That can refer to or compute symbol attributes

8
Semantic Actions: An Example
  • Consider the grammar
  • E → int | E + E | ( E )
  • For each symbol X define an attribute X.val
  • For terminals, val is the associated lexeme
  • For non-terminals, val is the expression's value
    (and is computed from values of subexpressions)
  • We annotate the grammar with actions
  • E → int { E.val = int.val }
  •   | E1 + E2 { E.val = E1.val + E2.val }
  •   | ( E1 ) { E.val = E1.val }

9
Semantic Actions: An Example (Cont.)
  • String: 5 + (2 + 3)
  • Tokens: int5 '+' '(' int2 '+' int3 ')'
  • Productions and equations
  • E → E1 + E2      E.val = E1.val + E2.val
  • E1 → int5        E1.val = int5.val = 5
  • E2 → ( E3 )      E2.val = E3.val
  • E3 → E4 + E5     E3.val = E4.val + E5.val
  • E4 → int2        E4.val = int2.val = 2
  • E5 → int3        E5.val = int3.val = 3
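
The equations above can be sketched in Java as a bottom-up computation over a hand-built parse tree. This is only an illustration: the class names ENode, IntNode, SumNode, ParenNode and ValDemo are assumptions made for the example and do not come from the course code.

```java
// Hypothetical parse-tree node types; only the synthesized attribute val is modeled.
abstract class ENode {
    // Computes E.val from the val attributes of the children (bottom-up).
    abstract int val();
}

// E -> int          { E.val = int.val }
class IntNode extends ENode {
    private final int lexval;
    IntNode(int lexval) { this.lexval = lexval; }
    int val() { return lexval; }
}

// E -> E1 + E2      { E.val = E1.val + E2.val }
class SumNode extends ENode {
    private final ENode e1, e2;
    SumNode(ENode e1, ENode e2) { this.e1 = e1; this.e2 = e2; }
    int val() { return e1.val() + e2.val(); }
}

// E -> ( E1 )       { E.val = E1.val }
class ParenNode extends ENode {
    private final ENode e1;
    ParenNode(ENode e1) { this.e1 = e1; }
    int val() { return e1.val(); }
}

class ValDemo {
    // Parse tree for 5 + (2 + 3), built by hand instead of by a parser.
    static ENode example() {
        return new SumNode(new IntNode(5),
                new ParenNode(new SumNode(new IntNode(2), new IntNode(3))));
    }

    public static void main(String[] args) {
        System.out.println(example().val()); // prints 10
    }
}
```

Each val() call asks its children first, so the subexpression values are always available before the parent's equation is applied.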

10
Semantic Actions: Notes
  • Semantic actions specify a system of equations
  • Order of resolution is not specified
  • Example
  • E3.val = E4.val + E5.val
  • Must compute E4.val and E5.val before E3.val
  • We say that E3.val depends on E4.val and E5.val
  • The parser must find the order of evaluation

11
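Dependency Graph
  • Each node labeled E has one slot for the val
    attribute
  • Note the dependencies
[Figure: the parse tree for 5 + (2 + 3) with a val slot on each node: E depends on E1 and E2; E1 on int5 (5); E2 on E3; E3 on E4 and E5; E4 on int2 (2); E5 on int3 (3)]
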
12
Evaluating Attributes
  • An attribute must be computed after all its
    successors in the dependency graph have been
    computed
  • In the previous example, attributes can be computed
    bottom-up
  • Such an order exists when there are no cycles
  • Cyclically defined attributes are not legal
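
One way to picture this ordering rule is a small evaluator that computes an attribute only after everything it depends on, and rejects cyclic definitions. The sketch below is an assumption-laden illustration, not how a parser generator does it: AttrGraph, define, eval and AttrGraphDemo are hypothetical names, and attributes are identified by plain strings.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.ToIntFunction;

class AttrGraph {
    // One equation: the attribute equals rule(values of the attributes it depends on).
    private static final class Equation {
        final List<String> deps;
        final ToIntFunction<int[]> rule;
        Equation(List<String> deps, ToIntFunction<int[]> rule) {
            this.deps = deps;
            this.rule = rule;
        }
    }

    private final Map<String, Equation> equations = new HashMap<>();
    private final Map<String, Integer> values = new HashMap<>();
    private final Set<String> inProgress = new HashSet<>();

    void define(String attr, ToIntFunction<int[]> rule, String... deps) {
        equations.put(attr, new Equation(Arrays.asList(deps), rule));
    }

    // Depth-first evaluation: every dependency is computed before the attribute itself.
    int eval(String attr) {
        Integer known = values.get(attr);
        if (known != null) return known;
        Equation eq = equations.get(attr);
        if (eq == null) throw new IllegalArgumentException("undefined attribute: " + attr);
        if (!inProgress.add(attr)) {
            throw new IllegalStateException("cyclic attribute definition: " + attr);
        }
        int[] args = new int[eq.deps.size()];
        for (int i = 0; i < args.length; i++) args[i] = eval(eq.deps.get(i));
        inProgress.remove(attr);
        int v = eq.rule.applyAsInt(args);
        values.put(attr, v);
        return v;
    }
}

class AttrGraphDemo {
    // The equations for 5 + (2 + 3) from the earlier example.
    static AttrGraph example() {
        AttrGraph g = new AttrGraph();
        g.define("E.val", a -> a[0] + a[1], "E1.val", "E2.val");
        g.define("E1.val", a -> 5);
        g.define("E2.val", a -> a[0], "E3.val");
        g.define("E3.val", a -> a[0] + a[1], "E4.val", "E5.val");
        g.define("E4.val", a -> 2);
        g.define("E5.val", a -> 3);
        return g;
    }
}
```

Calling AttrGraphDemo.example().eval("E.val") visits E4.val and E5.val before E3.val, matching the dependency order on the slide; two attributes defined in terms of each other raise an IllegalStateException.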

13
Semantic Actions: Notes (Cont.)
  • Synthesized attributes
  • Calculated from attributes of descendants in the
    parse tree
  • E.val is a synthesized attribute
  • Can always be calculated in a bottom-up order
  • Grammars with only synthesized attributes are
    called S-attributed grammars
  • The most common kind of attribute grammar

14
Semantic Actions: Top-down Approach
  • Recursive-descent interpreter
  • Consider this grammar
  • S -> E
  • E -> T E'    E' -> + T E'    E' -> - T E'    E' -> ε
  • T -> F T'    T' -> * F T'    T' -> / F T'    T' -> ε
  • F -> id    F -> num    F -> ( E )
  • Needs the types of the non-terminals and tokens

15
Recursive-descent interpreter
  • int T() { switch (tok.kind) {
  •   case ID: case NUM: case LPAREN:
  •     return Tprime(F());
  •   default: print("expected ID, NUM, or left-paren");
  •     skipto(T_follow); return 0;
  • }}
  • int Tprime(int a) { switch (tok.kind) {
  •   case TIMES: eat(TIMES); return Tprime(a*F());
  •   case DIVIDE: eat(DIVIDE); return Tprime(a/F());
  •   case PLUS: case MINUS: case RPAREN: case EOF:
  •     return a;
  •   default: /* error handling */
  • }}
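
A self-contained version of the same idea can be sketched as follows. It is a simplification made under stated assumptions: the class RDInterpreter and its helpers are hypothetical, it reads characters directly instead of using a lexer with tok.kind, it omits identifiers, and it throws an exception instead of doing skipto-style error recovery.

```java
class RDInterpreter {
    private final String src;
    private int pos = 0;

    private RDInterpreter(String src) { this.src = src; }

    // Evaluates an expression such as "2 + 3 * 4".
    static int evaluate(String src) {
        RDInterpreter p = new RDInterpreter(src);
        int v = p.E();
        if (p.peek() != '$') {
            throw new IllegalArgumentException("unexpected input at " + p.pos);
        }
        return v;
    }

    // Returns the next non-blank character, or '$' at end of input.
    private char peek() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
        return pos < src.length() ? src.charAt(pos) : '$';
    }

    private void eat(char c) {
        if (peek() != c) throw new IllegalArgumentException("expected '" + c + "' at " + pos);
        pos++;
    }

    // E -> T E'
    private int E() { return Eprime(T()); }

    // E' -> + T E' | - T E' | epsilon ; a is the value accumulated so far
    private int Eprime(int a) {
        switch (peek()) {
            case '+': eat('+'); return Eprime(a + T());
            case '-': eat('-'); return Eprime(a - T());
            default: return a;
        }
    }

    // T -> F T'
    private int T() { return Tprime(F()); }

    // T' -> * F T' | / F T' | epsilon
    private int Tprime(int a) {
        switch (peek()) {
            case '*': eat('*'); return Tprime(a * F());
            case '/': eat('/'); return Tprime(a / F());
            default: return a;
        }
    }

    // F -> num | ( E )
    private int F() {
        char c = peek();
        if (c == '(') { eat('('); int v = E(); eat(')'); return v; }
        if (Character.isDigit(c)) {
            int start = pos;
            while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
            return Integer.parseInt(src.substring(start, pos));
        }
        throw new IllegalArgumentException("expected number or '(' at " + pos);
    }
}
```

Passing the accumulated value a into Eprime and Tprime is what keeps subtraction and division left-associative even though the grammar itself is right-recursive.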

16
JavaCC version
  • Grammar
  • S -> E
  • E -> T ( "+" T | "-" T )*
  • T -> F ( "*" F | "/" F )*
  • F -> id | num | "(" E ")"
  • Note
  • E -> T E'    E' -> + T E' | - T E' | ε

17
JavaCC version
  • void Start() :
  • { int i; }
  • { i=Exp() <EOF> { System.out.println(i); } }
  • int Exp() :
  • { int a, i; }
  • { a=Term() ( "+" i=Term() { a=a+i; }
  •           | "-" i=Term() { a=a-i; } )*
  •   { return a; } }
  • int Factor() :
  • { Token t; int i; }
  • { t=<IDENTIFIER> { return lookup(t.image); }
  • | t=<INTEGER_LITERAL> { return Integer.parseInt(t.image); }
  • | "(" i=Exp() ")" { return i; } }

18
Semantic Actions: Reduce and Shift
  • We can now illustrate how semantic actions are
    implemented for LR parsing
  • Keep attributes on the stack
  • On shift a, push the attribute for a on the stack
  • On reduce X → α
  • pop the attributes for α
  • compute the attribute for X
  • and push it on the stack

19
Performing Semantic Actions: Example
  • Recall the example from the previous lecture
  • E → T + E1 { E.val = T.val + E1.val }
  •   | T { E.val = T.val }
  • T → int * T1 { T.val = int.val * T1.val }
  •   | int { T.val = int.val }
  • Consider the parsing of the string 3 * 5 + 8

20
Performing Semantic Actions: Example
  • (| separates the stack from the remaining input)
  • | int * int + int          shift
  • int3 | * int + int         shift
  • int3 * | int + int         shift
  • int3 * int5 | + int        reduce T → int
  • int3 * T5 | + int          reduce T → int * T
  • T15 | + int                shift
  • T15 + | int                shift
  • T15 + int8 |               reduce T → int
  • T15 + T8 |                 reduce E → T
  • T15 + E8 |                 reduce E → T + E
  • E23 |                      accept
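
The trace can be replayed with an explicit attribute stack. The sketch below does not contain an LR parser: the shift and reduce calls are issued by hand in the order shown above, and the class AttrStackDemo with its method names is a hypothetical stand-in for what a generated parser would do internally. Operator tokens carry a dummy attribute of 0.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class AttrStackDemo {
    // Attribute stack kept in step with the parser's symbol stack.
    private final Deque<Integer> attrs = new ArrayDeque<>();

    // On shift, push the attribute of the shifted token.
    void shift(int attr) { attrs.push(attr); }

    // reduce T -> int : pop int.val, push T.val = int.val
    void reduceTInt() {
        int intVal = attrs.pop();
        attrs.push(intVal);
    }

    // reduce T -> int * T1 : pop T1.val, '*', int.val; push int.val * T1.val
    void reduceTIntTimesT() {
        int t1 = attrs.pop();
        attrs.pop(); // attribute slot of '*'
        int intVal = attrs.pop();
        attrs.push(intVal * t1);
    }

    // reduce E -> T : pop T.val, push E.val = T.val
    void reduceET() {
        int t = attrs.pop();
        attrs.push(t);
    }

    // reduce E -> T + E1 : pop E1.val, '+', T.val; push T.val + E1.val
    void reduceETPlusE() {
        int e1 = attrs.pop();
        attrs.pop(); // attribute slot of '+'
        int t = attrs.pop();
        attrs.push(t + e1);
    }

    int top() { return attrs.peek(); }

    // Replays the shift/reduce sequence for 3 * 5 + 8.
    static int run() {
        AttrStackDemo p = new AttrStackDemo();
        p.shift(3); p.shift(0); p.shift(5); // int3 * int5
        p.reduceTInt();                     // int3 * T5
        p.reduceTIntTimesT();               // T15
        p.shift(0); p.shift(8);             // T15 + int8
        p.reduceTInt();                     // T15 + T8
        p.reduceET();                       // T15 + E8
        p.reduceETPlusE();                  // E23
        return p.top();
    }
}
```

AttrStackDemo.run() leaves 23 on top of the stack, the value attached to E in the accept step.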

21
Inherited Attributes
  • Another kind of attribute
  • Calculated from attributes of parent and/or
    siblings in the parse tree
  • Example: a line calculator

22
A Line Calculator
  • Each line contains an expression
  • E → int | E + E
  • Each line is terminated with the = sign
  • L → E = | + E =
  • In the second form the value of the previous line is used
    as the starting value
  • A program is a sequence of lines
  • P → ε | P L

23
Attributes for the Line Calculator
  • Each E has a synthesized attribute val
  • Calculated as before
  • Each L has a synthesized attribute val
  • L → E = { L.val = E.val }
  •   | + E = { L.val = E.val + L.prev }
  • We need the value of the previous line
  • We use an inherited attribute L.prev

24
Attributes for the Line Calculator (Cont.)
  • Each P has a synthesized attribute val
  • The value of its last line
  • P → ε { P.val = 0 }
  •   | P1 L { P.val = L.val;
  •     L.prev = P1.val }
  • Each L has an inherited attribute prev
  • L.prev is inherited from sibling P1.val
  • Example
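
In code, an inherited attribute typically shows up as a parameter and a synthesized attribute as a return value. The following sketch of the line calculator assumes a simplified text format (lines separated by "=", integers separated by "+") and uses hypothetical names (LineCalculator, evalE, evalL, evalP); it is not taken from the course material.

```java
class LineCalculator {
    // E -> int | E + E : E.val is synthesized as the sum of the integer lexemes.
    static int evalE(String e) {
        int sum = 0;
        for (String part : e.split("\\+")) {
            sum += Integer.parseInt(part.trim());
        }
        return sum;
    }

    // L -> E = | + E = : prev is the inherited attribute L.prev,
    // and the return value is the synthesized attribute L.val.
    static int evalL(String line, int prev) {
        String body = line.trim();
        if (body.startsWith("+")) {
            return evalE(body.substring(1)) + prev; // L.val = E.val + L.prev
        }
        return evalE(body);                         // L.val = E.val
    }

    // P -> epsilon | P1 L : P.val starts at 0 and each line receives
    // the value of the program so far as L.prev.
    static int evalP(String program) {
        int val = 0; // P -> epsilon { P.val = 0 }
        for (String line : program.split("=")) {
            if (line.trim().isEmpty()) continue;
            val = evalL(line, val); // L.prev = P1.val; P.val = L.val
        }
        return val;
    }
}
```

The loop in evalP plays the role of the left-recursive production P → P1 L: each iteration hands the previous P.val down to the next line.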

25
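Example of Inherited Attributes
  • val synthesized
  • prev inherited
  • All can be computed in depth-first order
[Figure: parse tree for a one-line program: P has children P (deriving ε, with val 0) and L; L contains E3; E3 has children E4 (int2, value 2) and E5 (int3, value 3)]
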
26
Semantic Actions: Notes (Cont.)
  • Semantic actions can be used to build ASTs
  • And many other things as well
  • Also used for type checking, code generation, …
  • Process is called syntax-directed translation
  • Substantial generalization over CFGs

27
Constructing An AST
  • We first define the AST data type
  • Supplied by us for the project
  • Consider an abstract tree type with two
    constructors
  • mkleaf(n) = a leaf node labeled n
  • mkplus(T1, T2) = a PLUS node with subtrees T1 and T2

28
Constructing a Parse Tree
  • We define a synthesized attribute ast
  • Values of ast attributes are ASTs
  • We assume that int.lexval is the value of the
    integer lexeme
  • Computed using semantic actions
  • E → int { E.ast = mkleaf(int.lexval) }
  •   | E1 + E2 { E.ast = mkplus(E1.ast, E2.ast) }
  •   | ( E1 ) { E.ast = E1.ast }

29
Parse Tree Example
  • Consider the string int5 '+' '(' int2 '+' int3 ')'
  • A bottom-up evaluation of the ast attribute
  • E.ast = mkplus(mkleaf(5),
             mkplus(mkleaf(2), mkleaf(3)))
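
A minimal Java rendering of the two constructors might look like this. Only the names mkleaf and mkplus come from the slides; the classes Ast, Leaf, Plus and AstDemo and the toString format are assumptions made for the illustration.

```java
// Hypothetical AST data type with the two constructors from the slides.
abstract class Ast {
    static Ast mkleaf(int n) { return new Leaf(n); }
    static Ast mkplus(Ast t1, Ast t2) { return new Plus(t1, t2); }
}

class Leaf extends Ast {
    final int n;
    Leaf(int n) { this.n = n; }
    @Override public String toString() { return String.valueOf(n); }
}

class Plus extends Ast {
    final Ast left, right;
    Plus(Ast left, Ast right) { this.left = left; this.right = right; }
    @Override public String toString() { return "PLUS(" + left + ", " + right + ")"; }
}

class AstDemo {
    // The value of E.ast for 5 + (2 + 3); note that the parentheses leave no node behind.
    static Ast example() {
        return Ast.mkplus(Ast.mkleaf(5),
                Ast.mkplus(Ast.mkleaf(2), Ast.mkleaf(3)));
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints PLUS(5, PLUS(2, 3))
    }
}
```

The action for ( E1 ) simply passes E1.ast upward, which is why the resulting tree is smaller than the parse tree.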

30
Review
  • We can specify language syntax using a CFG
  • A parser will answer whether s ∈ L(G)
  • and will build a parse tree
  • which we convert to an AST
  • and pass on to the rest of the compiler

31
Abstract Parse Trees: Expression Grammar
  • Abstract Syntax
  • E -> E + E
  • E -> E - E
  • E -> E * E
  • E -> E / E
  • E -> id
  • E -> num

32
AST Node types
  • public abstract class Exp {
  •   public abstract int eval(); }
  • public class PlusExp extends Exp {
  •   private Exp e1, e2;
  •   public PlusExp(Exp a1, Exp a2) { e1=a1; e2=a2; }
  •   public int eval() {
  •     return e1.eval()+e2.eval(); } }
  • public class Identifier extends Exp {
  •   private String f0;
  •   public Identifier(String n0) { f0 = n0; }
  •   public int eval() {
  •     return lookup(f0); } }
  • public class IntegerLiteral extends Exp {
  •   private String f0;
  •   public IntegerLiteral(String n0) { f0 = n0; }
  •   public int eval() {
  •     return Integer.parseInt(f0); } }

33
JavaCC Example for AST construction
  • Exp Start() :
  • { Exp e; }
  • { e=Exp() { return e; } }
  • Exp Exp() :
  • { Exp e1, e2; }
  • { e1=Term() ( "+" e2=Term() { e1=new PlusExp(e1,e2); }
  •            | "-" e2=Term() { e1=new MinusExp(e1,e2); } )*
  •   { return e1; } }
  • Exp Factor() :
  • { Token t; Exp e; }
  • { t=<IDENTIFIER> { return new Identifier(t.image); }
  • | t=<INTEGER_LITERAL>
  •     { return new IntegerLiteral(t.image); }
  • | "(" e=Exp() ")" { return e; } }

34
Positions
  • Must remember the position in the source file
  • Lexical analysis, parsing and semantic analysis
    are not done simultaneously.
  • Necessary for error reporting
  • The AST must keep pos fields, which indicate the
    position within the original source file.
  • The lexer must pass the information to the parser.
  • AST node constructors must be augmented to initialize
    the pos fields.
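
As a rough sketch of what carrying positions could look like, the node below stores a line and column that a later phase uses in an error message. The classes PosNode, PosIdentifier and PosDemo are hypothetical; in a JavaCC parser the two numbers would plausibly be copied from a token's beginLine and beginColumn fields.

```java
// Hypothetical base class: every AST node remembers where it came from.
abstract class PosNode {
    final int line, column;

    PosNode(int line, int column) {
        this.line = line;
        this.column = column;
    }

    String where() { return "line " + line + ", column " + column; }
}

// An identifier node whose constructor is augmented with the position.
class PosIdentifier extends PosNode {
    final String name;

    PosIdentifier(String name, int line, int column) {
        super(line, column);
        this.name = name;
    }
}

class PosDemo {
    // A later phase (for example semantic analysis) can still report the location.
    static String undefined(PosIdentifier id) {
        return "undefined identifier '" + id.name + "' at " + id.where();
    }
}
```

For example, PosDemo.undefined(new PosIdentifier("x", 3, 7)) yields "undefined identifier 'x' at line 3, column 7".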

35
JavaCC Class Token
  • Each Token object has the following fields
  • int kind
  • int beginLine, beginColumn, endLine, endColumn
  • String image
  • Token next
  • Token specialToken
  • static final Token newToken(int ofKind)
  • Unfortunately, .

36
Visitors
  • "Syntax separate from interpretation" style of
    programming
  • Vs. object-oriented style of programming
  • Visitor pattern
  • A visitor implements an interpretation.
  • A visitor object contains a visit method for each
    syntax-tree class.
  • Syntax-tree classes contain accept methods.
  • The visitor calls accept ("what is your class?"). Then
    accept calls the matching visit method of the visitor.

37
Example Expression Classes
  • public abstract class Exp {
  •   public abstract int accept(Visitor v); }
  • public class PlusExp extends Exp {
  •   public Exp e1, e2;
  •   public PlusExp(Exp a1, Exp a2) { e1=a1; e2=a2; }
  •   public int accept(Visitor v) { return
      v.visit(this); } }
  • public class Identifier extends Exp {
  •   public String f0;
  •   public Identifier(String n0) { f0 = n0; }
  •   public int accept(Visitor v) { return
      v.visit(this); } }
  • public class IntegerLiteral extends Exp {
  •   public String f0;
  •   public IntegerLiteral(String n0) { f0 = n0; }
  •   public int accept(Visitor v) { return
      v.visit(this); } }

38
An interpreter visitor
  • public interface Visitor {
  •   public int visit(PlusExp n);
  •   public int visit(Identifier n);
  •   public int visit(IntegerLiteral n);
  • }
  • public class Interpreter implements Visitor {
  •   public int visit(PlusExp n) {
  •     return n.e1.accept(this) + n.e2.accept(this); }
  •   public int visit(Identifier n) {
  •     return lookup(n.f0); }
  •   public int visit(IntegerLiteral n) {
  •     return Integer.parseInt(n.f0); }
  • }
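
Put together, the expression classes and the interpreter visitor can be run as a single file. This sketch follows the slides closely but makes two assumptions: the lookup function is replaced by a Map passed to the interpreter, and the VisitorDemo class is added only to show a call.

```java
import java.util.HashMap;
import java.util.Map;

interface Visitor {
    int visit(PlusExp n);
    int visit(Identifier n);
    int visit(IntegerLiteral n);
}

abstract class Exp {
    abstract int accept(Visitor v);
}

class PlusExp extends Exp {
    final Exp e1, e2;
    PlusExp(Exp a1, Exp a2) { e1 = a1; e2 = a2; }
    int accept(Visitor v) { return v.visit(this); }
}

class Identifier extends Exp {
    final String f0;
    Identifier(String n0) { f0 = n0; }
    int accept(Visitor v) { return v.visit(this); }
}

class IntegerLiteral extends Exp {
    final String f0;
    IntegerLiteral(String n0) { f0 = n0; }
    int accept(Visitor v) { return v.visit(this); }
}

class Interpreter implements Visitor {
    // Assumed stand-in for the lookup() function used on the slides.
    private final Map<String, Integer> env;

    Interpreter(Map<String, Integer> env) { this.env = env; }

    public int visit(PlusExp n) {
        return n.e1.accept(this) + n.e2.accept(this);
    }

    public int visit(Identifier n) {
        Integer v = env.get(n.f0);
        if (v == null) throw new IllegalStateException("undefined identifier: " + n.f0);
        return v;
    }

    public int visit(IntegerLiteral n) {
        return Integer.parseInt(n.f0);
    }
}

class VisitorDemo {
    // Evaluates x + (1 + 2) with x bound to 4.
    static int run() {
        Map<String, Integer> env = new HashMap<>();
        env.put("x", 4);
        Exp e = new PlusExp(new Identifier("x"),
                new PlusExp(new IntegerLiteral("1"), new IntegerLiteral("2")));
        return e.accept(new Interpreter(env));
    }
}
```

Each accept method is identical in text, but overload resolution picks a different visit method in each class because the static type of this differs. Adding another interpretation (for example a pretty-printer) would mean writing a new Visitor implementation without touching the syntax-tree classes.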

39
Abstract Syntax for MiniJava (I)
  • package syntaxtree
  • Program(MainClass m, ClassDeclList cl)
  • MainClass(Identifier i1, Identifier i2, Statement s)
  • ----------------------------
  • abstract class ClassDecl
  • ClassDeclSimple(Identifier i, VarDeclList vl,
    MethodDeclList ml)
  • ClassDeclExtends(Identifier i, Identifier j,
    VarDeclList vl, MethodDeclList ml)
  • -----------------------------
  • VarDecl(Type t, Identifier i)
  • MethodDecl(Type t, Identifier i, FormalList fl,
    VarDeclList vl, StatementList sl, Exp e)
  • Formal(Type t, Identifier i)

40
Abstract Syntax for MiniJava (II)
  • abstract class Type
  • IntArrayType()
  • BooleanType()
  • IntegerType()
  • IdentifierType(String s)
  • ---------------------------
  • abstract class Statement
  • Block(StatementList sl)
  • If(Exp e, Statement s1, Statement s2)
  • While(Exp e, Statement s)
  • Print(Exp e)
  • Assign(Identifier i, Exp e)
  • ArrayAssign(Identifier i, Exp e1, Exp e2)
  • -------------------------------------------

41
Abstract Syntax for MiniJava (III)
  • abstract class Exp
  • And(Exp e1, Exp e2)    LessThan(Exp e1, Exp e2)
  • Plus(Exp e1, Exp e2)    Minus(Exp e1, Exp e2)
  • Times(Exp e1, Exp e2)    Not(Exp e)
  • ArrayLookup(Exp e1, Exp e2)    ArrayLength(Exp e)
  • Call(Exp e, Identifier i, ExpList el)
  • IntegerLiteral(int i)
  • True()    False()
  • IdentifierExp(String s)
  • This()
  • NewArray(Exp e)    NewObject(Identifier i)
  • -------------------------------------------------
  • Identifier(String s)
  • --list classes-------------------------
  • ClassDeclList()    ExpList()    FormalList()    MethodDeclList()
  • StatementList()    VarDeclList()

42
Syntax Tree Nodes - Details
  • package syntaxtree;
  • import visitor.Visitor;
  • import visitor.TypeVisitor;
  • public class Program {
  •   public MainClass m;
  •   public ClassDeclList cl;
  •   public Program(MainClass am, ClassDeclList acl) {
  •     m=am; cl=acl; }
  •   public void accept(Visitor v) {
  •     v.visit(this); }
  •   public Type accept(TypeVisitor v) {
  •     return v.visit(this); }
  • }

43
ClassDecl.java
  • package syntaxtree;
  • import visitor.Visitor;
  • import visitor.TypeVisitor;
  • public abstract class ClassDecl {
  •   public abstract void accept(Visitor v);
  •   public abstract Type accept(TypeVisitor v);
  • }

44
ClassDeclExtends.java
  • package syntaxtree;
  • import visitor.Visitor;
  • import visitor.TypeVisitor;
  • public class ClassDeclExtends extends ClassDecl {
  •   public Identifier i;
  •   public Identifier j;
  •   public VarDeclList vl;
  •   public MethodDeclList ml;
  •   public ClassDeclExtends(Identifier ai, Identifier aj,
  •       VarDeclList avl, MethodDeclList aml) {
  •     i=ai; j=aj; vl=avl; ml=aml; }
  •   public void accept(Visitor v) {
  •     v.visit(this); }
  •   public Type accept(TypeVisitor v) {
  •     return v.visit(this); }
  • }

45
StatementList.java
  • package syntaxtree;
  • import java.util.Vector;
  • public class StatementList {
  •   private Vector list;
  •   public StatementList() {
  •     list = new Vector(); }
  •   public void addElement(Statement n) {
  •     list.addElement(n); }
  •   public Statement elementAt(int i) {
  •     return (Statement)list.elementAt(i); }
  •   public int size() {
  •     return list.size(); }
  • }

46
Package visitor: Visitor.java
  • package visitor;
  • import syntaxtree.*;
  • public interface Visitor {
  •   public void visit(Program n);  public void visit(MainClass n);
  •   public void visit(ClassDeclSimple n);  public void visit(ClassDeclExtends n);
  •   public void visit(VarDecl n);  public void visit(MethodDecl n);
  •   public void visit(Formal n);  public void visit(IntArrayType n);
  •   public void visit(BooleanType n);  public void visit(IntegerType n);
  •   public void visit(IdentifierType n);  public void visit(Block n);
  •   public void visit(If n);  public void visit(While n);
  •   public void visit(Print n);  public void visit(Assign n);
  •   public void visit(ArrayAssign n);  public void visit(And n);
  •   public void visit(LessThan n);  public void visit(Plus n);
  •   public void visit(Minus n);  public void visit(Times n);
  •   public void visit(ArrayLookup n);  public void visit(ArrayLength n);
  •   public void visit(Call n);  public void visit(IntegerLiteral n);
  •   public void visit(True n);  public void visit(False n);
  •   public void visit(IdentifierExp n);  public void visit(This n);

47
x = y.m(1,4+5)
  • Statement -> AssignmentStatement
  • AssignmentStatement -> Identifier1 "=" Expression
  • Identifier1 -> <IDENTIFIER>
  • Expression -> Expression1 "." Identifier2 "(" ( ExpList )? ")"
  • Expression1 -> IdentifierExp
  • IdentifierExp -> <IDENTIFIER>
  • Identifier2 -> <IDENTIFIER>
  • ExpList -> Expression2 ( "," Expression3 )*
  • Expression2 -> <INTEGER_LITERAL>
  • Expression3 -> PlusExp -> Expression "+" Expression
  •   -> <INTEGER_LITERAL> "+" <INTEGER_LITERAL>

48
AST
[Figure: AST for x = y.m(1,4+5)]
  • Statement s -> Assign(Identifier, Exp)
  •   Identifier(x)
  •   Call(Exp, Identifier, ExpList)
  •     IdentifierExp(y)
  •     Identifier(m)
  •     ExpList el (init, then two add steps)
  •       IntegerLiteral(1)
  •       Plus(Exp, Exp)
  •         IntegerLiteral(4)
  •         IntegerLiteral(5)

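
Building this tree by hand shows the "init" and "add" steps on the ExpList. The classes below are cut-down stand-ins that reuse the MiniJava node names; the toString methods, the ArrayList inside ExpList, and the CallDemo class are assumptions added so that the result can be printed.

```java
import java.util.ArrayList;
import java.util.List;

abstract class Exp {}

class IntegerLiteral extends Exp {
    final int i;
    IntegerLiteral(int i) { this.i = i; }
    @Override public String toString() { return "IntegerLiteral(" + i + ")"; }
}

class IdentifierExp extends Exp {
    final String s;
    IdentifierExp(String s) { this.s = s; }
    @Override public String toString() { return "IdentifierExp(" + s + ")"; }
}

class Plus extends Exp {
    final Exp e1, e2;
    Plus(Exp e1, Exp e2) { this.e1 = e1; this.e2 = e2; }
    @Override public String toString() { return "Plus(" + e1 + ", " + e2 + ")"; }
}

class Identifier {
    final String s;
    Identifier(String s) { this.s = s; }
    @Override public String toString() { return "Identifier(" + s + ")"; }
}

class ExpList {
    private final List<Exp> list = new ArrayList<>();
    void addElement(Exp e) { list.add(e); }
    Exp elementAt(int i) { return list.get(i); }
    int size() { return list.size(); }
    @Override public String toString() { return list.toString(); }
}

class Call extends Exp {
    final Exp e; final Identifier i; final ExpList el;
    Call(Exp e, Identifier i, ExpList el) { this.e = e; this.i = i; this.el = el; }
    @Override public String toString() { return "Call(" + e + ", " + i + ", " + el + ")"; }
}

abstract class Statement {}

class Assign extends Statement {
    final Identifier i; final Exp e;
    Assign(Identifier i, Exp e) { this.i = i; this.e = e; }
    @Override public String toString() { return "Assign(" + i + ", " + e + ")"; }
}

class CallDemo {
    // AST for x = y.m(1,4+5), built the way the parser's actions would build it.
    static Statement build() {
        ExpList el = new ExpList();                 // init
        el.addElement(new IntegerLiteral(1));       // add
        el.addElement(new Plus(new IntegerLiteral(4), new IntegerLiteral(5))); // add
        Exp call = new Call(new IdentifierExp("y"), new Identifier("m"), el);
        return new Assign(new Identifier("x"), call);
    }
}
```

Note that y appears as an IdentifierExp (an expression) while x and m appear as plain Identifier nodes, matching the distinction in the abstract syntax.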
49
MiniJava Grammar (I)
  • Program -> MainClass ClassDecl*
  • Program(MainClass, ClassDeclList)
  • Program Goal() :
  • { MainClass m; ClassDeclList cl = new
    ClassDeclList();
  •   ClassDecl c; }
  • { m = MainClass() ( c = ClassDecl()
    { cl.addElement(c); } )*
  •   <EOF> { return new Program(m,cl); } }

50
MiniJava Grammar (II)
  • MainClass -> class id { public static void main
    ( String [] id ) { Statement } }
  • MainClass(Identifier, Identifier, Statement)
  • ClassDecl -> class id { VarDecl* MethodDecl* }
  •   -> class id extends id { VarDecl* MethodDecl* }
  • ClassDeclSimple(…), ClassDeclExtends(…)
  • VarDecl -> Type id ;
  • VarDecl(Type, Identifier)
  • MethodDecl -> public Type id ( FormalList )
    { VarDecl* Statement* return Exp ; }
  • MethodDecl(Type, Identifier, FormalList, VarDeclList,
    StatementList, Exp)

51
MiniJava Grammar (III)
  • FormalList -> Type id FormalRest*
  •   -> ε
  • FormalRest -> , Type id
  • Type -> int [ ]
  •   -> boolean
  •   -> int
  •   -> id

52
MiniJava Grammar (IV)
  • Statement -> { Statement* }
  •   -> if ( Exp ) Statement else Statement
  •   -> while ( Exp ) Statement
  •   -> System.out.println ( Exp ) ;
  •   -> id = Exp ;
  •   -> id [ Exp ] = Exp ;
  • ExpList -> Exp ExpRest*
  •   -> ε
  • ExpRest -> , Exp

53
MiniJava Grammar (V)
  • Exp -> Exp op Exp
  •   -> Exp [ Exp ]
  •   -> Exp . length
  •   -> Exp . id ( ExpList )
  •   -> INTEGER_LITERAL
  •   -> true
  •   -> false
  •   -> id
  •   -> this
  •   -> new int [ Exp ]
  •   -> new id ( )
  •   -> ! Exp
  •   -> ( Exp )

54
References
  • Andrew W. Appel, Modern Compiler Implementation
    in Java (2nd Edition), Cambridge University
    Press, 2002
  • http://compiler.kaist.ac.kr/courses/cs420/classtps/Chapter05.pps
  • David Galles, Modern Compiler Design, Scott/Jones