Abstract Syntax - PowerPoint PPT Presentation

Slides: 21
Provided by: thoma423

Transcript and Presenter's Notes

Title: Abstract Syntax


1
Abstract Syntax
  • Mooly Sagiv
  • Schreiber 317
  • 03-640-7606
  • Wed 10:00-12:00
  • http://www.cs.tau.ac.il/msagiv/courses/wcc02.html

2
Outline
  • The general idea
  • Bison
  • Motivating example: interpreter for arithmetic
    expressions
  • The need for abstract syntax
  • Abstract syntax for Straight-line code
  • Abstract syntax for Tiger (Targil)

3
Semantic Analysis during Recursive Descent
Parsing
  • Scanner returns semantic values for some tokens
  • The function of every non-terminal returns the
    corresponding subtree value
  • When A → B C D is applied, the function for A
    can use the values returned by B, C, and D
  • The function can also pass parameters, e.g., to
    D(), reflecting left contexts

4
int E() {
    switch (tok) {
    case num: temp = tok.val; eat(num); return EP(temp);
    default:  error();
    }
}

E  → num E'
E' → empty-string
E' → + num E'

int EP(int left) {
    switch (tok) {
    case $: return left;
    case +: eat(+); temp = tok.val; eat(num); return EP(left + temp);
    default: error();
    }
}
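
The recursive-descent scheme above can be tried out with a small, self-contained C sketch. It is only an illustration under stated assumptions: the token kinds, the string-based scanner `advance()`, the `had_error` flag, and the entry point `eval_sum` are hypothetical names that do not appear on the slides, and `error()` records a failure instead of aborting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical token kinds; the slides only name num, + and $. */
enum token_kind { TOK_NUM, TOK_PLUS, TOK_END, TOK_BAD };

static const char *src;       /* remaining input */
static enum token_kind tok;   /* current token kind */
static int tok_val;           /* semantic value when tok == TOK_NUM */
static int had_error;         /* set by error() instead of aborting */

static void error(void) { had_error = 1; }

/* Scanner: sets tok and, for numbers, the semantic value tok_val. */
static void advance(void) {
    while (isspace((unsigned char)*src)) src++;
    if (*src == '\0') {
        tok = TOK_END;
    } else if (*src == '+') {
        tok = TOK_PLUS;
        src++;
    } else if (isdigit((unsigned char)*src)) {
        char *end;
        tok_val = (int)strtol(src, &end, 10);
        src = end;
        tok = TOK_NUM;
    } else {
        tok = TOK_BAD;
    }
}

/* eat(k): check that the current token is k, then move to the next one. */
static void eat(enum token_kind k) {
    if (tok == k) advance(); else error();
}

/* E' -> + num E' | empty.  'left' is the value of the left context. */
static int EP(int left) {
    switch (tok) {
    case TOK_END:
        return left;
    case TOK_PLUS: {
        eat(TOK_PLUS);
        int temp = tok_val;
        eat(TOK_NUM);
        return had_error ? 0 : EP(left + temp);
    }
    default:
        error();
        return 0;
    }
}

/* E -> num E' */
static int E(void) {
    if (tok == TOK_NUM) {
        int temp = tok_val;
        eat(TOK_NUM);
        return EP(temp);
    }
    error();
    return 0;
}

/* Parse and evaluate a whole string; returns 0 on success, -1 on error. */
int eval_sum(const char *text, int *result) {
    assert(text != NULL && result != NULL);
    src = text;
    had_error = 0;
    advance();
    *result = E();
    return had_error ? -1 : 0;
}
```

Passing the running sum as the parameter `left` is what lets the right-recursive grammar still compute a left-to-right sum.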
5
Semantic Analysis during Bottom-Up Parsing
  • Scanner returns semantic values for some tokens
  • Use parser stack to store the corresponding
    subtree values
  • When A → B C D is reduced, the function for A
    can use the values returned by B, C, and D
  • No action in the middle of the rule
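
To make the idea of a value stack concrete, here is a minimal hand-written shift-reduce sketch in C for the grammar E → E + num | num. It is not how Bison is implemented (Bison generates a table-driven LR parser); the names `sym_stack`, `val_stack`, `try_reduce`, and `parse_sum` are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

enum symbol { SYM_NUM, SYM_PLUS, SYM_E };

#define STACK_MAX 8   /* this grammar never needs more than 3 entries */

static enum symbol sym_stack[STACK_MAX];  /* grammar symbols */
static int val_stack[STACK_MAX];          /* semantic value of each symbol */
static int top;                           /* number of entries on the stack */

static void push(enum symbol s, int v) {
    assert(top < STACK_MAX);
    sym_stack[top] = s;
    val_stack[top] = v;
    top++;
}

/* Try one reduction at the top of the stack; return 1 if one happened. */
static int try_reduce(void) {
    /* E -> E + num    with action  $$ = $1 + $3 */
    if (top >= 3 && sym_stack[top - 3] == SYM_E &&
        sym_stack[top - 2] == SYM_PLUS && sym_stack[top - 1] == SYM_NUM) {
        int v = val_stack[top - 3] + val_stack[top - 1];
        top -= 3;
        push(SYM_E, v);
        return 1;
    }
    /* E -> num        with action  $$ = $1 (value slot is reused) */
    if (top == 1 && sym_stack[0] == SYM_NUM) {
        sym_stack[0] = SYM_E;
        return 1;
    }
    return 0;
}

/* Parse and evaluate p; returns 0 on success, -1 on a syntax error. */
int parse_sum(const char *p, int *result) {
    top = 0;
    for (;;) {
        while (try_reduce()) { /* reduce as long as a handle is on top */ }
        while (isspace((unsigned char)*p)) p++;
        if (*p == '\0') break;
        if (*p == '+') {
            /* shift '+': only legal directly on top of a lone E */
            if (!(top == 1 && sym_stack[0] == SYM_E)) return -1;
            push(SYM_PLUS, 0);
            p++;
        } else if (isdigit((unsigned char)*p)) {
            /* shift num: legal on an empty stack or right after '+' */
            if (!(top == 0 || sym_stack[top - 1] == SYM_PLUS)) return -1;
            char *end;
            int v = (int)strtol(p, &end, 10);
            p = end;
            push(SYM_NUM, v);
        } else {
            return -1;
        }
    }
    if (top == 1 && sym_stack[0] == SYM_E) {
        *result = val_stack[0];
        return 0;
    }
    return -1;
}
```

Each reduction pops the values of the right-hand side and pushes one value for the left-hand side, which is the role `$1`, `$3`, and `$$` play in a Bison action.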

6
Example
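Grammar: E → E + num | num
(Figure: annotated parse tree. The inner E has value 7, num has value 5, and the root E has value 12.)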
7
Bison Specification
Declarations
%%
Productions
%%
C-Routines
8
Interpreter (in Bison)
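%{
/* declarations of yylex() and yyerror() */
%}
%union {
    int num;
    string id;
}
%token <id> ID
%token <num> NUM
%type <num> e f t
%start e
%%
e : e '+' t      { $$ = $1 + $3; }
  | e '-' t      { $$ = $1 - $3; }
  | t            { $$ = $1; }
  ;
t : t '*' f      { $$ = $1 * $3; }
  | t '/' f      { $$ = $1 / $3; }
  | f            { $$ = $1; }
  ;
f : NUM          { $$ = $1; }
  | ID           { $$ = lookup($1); }
  | '-' e        { $$ = - $2; }
  | '(' e ')'    { $$ = $2; }
  ;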
9
Interpreter (compact spec.)
%{
/* declarations of yylex() and yyerror() */
%}
%union {
    int num;
    string id;
}
%token <id> ID
%token <num> NUM
%type <num> e
%start e
%left PLUS MINUS
%left MUL DIV
%right UMINUS
%%
e : e PLUS e              { $$ = $1 + $3; }
  | e MINUS e             { $$ = $1 - $3; }
  | e MUL e               { $$ = $1 * $3; }
  | e DIV e               { $$ = $1 / $3; }
  | NUM
  | ID                    { $$ = lookup($1); }
  | MINUS e %prec UMINUS  { $$ = - $2; }
  | '(' e ')'             { $$ = $2; }
  ;
11
stack (top first): e 11, +, e 7
input: + 17
action: reduce e → e + e
12
stack (top first): e 17, +, e 18
input: (empty)
action: reduce e → e + e
13
So why can't we write all the compiler code in
Bison?
14
%{
typedef struct table *Table_;
struct table { string id; int value; Table_ tail; };
Table_ Table(string id, int value, struct table *tail);
Table_ table = NULL;

int lookup(Table_ table, string id)
{
    assert(table != NULL);
    /* pointer comparison, as on the slide; this is only correct if
       identifier strings are shared (interned) by the scanner */
    if (id == table->id) return table->value;
    else return lookup(table->tail, id);
}

void update(Table_ *tabptr, string id, int value)
{
    *tabptr = Table(id, value, *tabptr);
}
%}
%union {
    int num;
    string id;
}
%token <num> INT
%token <id> ID
%token ASSIGN PRINT LPAREN RPAREN
%type <num> exp
%left SEMICOLUMN COMMA
%left PLUS MINUS
%left TIMES DIV
%start prog
%%
prog : stm
     ;
stm  : stm SEMICOLUMN stm
     | ID ASSIGN exp               { update(&table, $1, $3); }
     | PRINT LPAREN exps RPAREN    { printf("\n"); }
     ;
exps : exp                         { printf("%d", $1); }
     | exps COMMA exp              { printf("%d", $3); }
     ;
exp  : INT                         { $$ = $1; }
     | ID                          { $$ = lookup(table, $1); }
     | exp PLUS exp                { $$ = $1 + $3; }
     | exp MINUS exp               { $$ = $1 - $3; }
     | exp TIMES exp               { $$ = $1 * $3; }
     | exp DIV exp                 { $$ = $1 / $3; }
     | stm COMMA exp               { $$ = $3; }
     | '(' exp ')'                 { $$ = $2; }
     ;
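
The `lookup`/`update` helpers in the specification above form a small linked-list environment. The following standalone C sketch shows how such a table behaves outside of Bison. Two details are assumptions made for this illustration and are not taken from the slide: `string` is defined as `const char *`, and identifiers are compared with `strcmp` rather than by pointer. Memory is never freed, to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef const char *string;          /* assumption: plain C strings */
typedef struct table *Table_;
struct table { string id; int value; Table_ tail; };

/* Constructor: put a new binding in front of an existing list. */
Table_ Table(string id, int value, Table_ tail) {
    Table_ t = malloc(sizeof *t);
    assert(t != NULL);
    t->id = id;
    t->value = value;
    t->tail = tail;
    return t;
}

/* Return the value of the most recent binding of id.
   Looking up an unbound identifier trips the assertion. */
int lookup(Table_ table, string id) {
    assert(table != NULL);
    if (strcmp(id, table->id) == 0) return table->value;
    return lookup(table->tail, id);
}

/* Bind id to value by prepending; older bindings stay but are shadowed. */
void update(Table_ *tabptr, string id, int value) {
    *tabptr = Table(id, value, *tabptr);
}
```

Because `update` prepends instead of overwriting, a second assignment to the same identifier shadows the first one, and `lookup` always finds the newest binding first.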
15
Historical Perspective
  • Originally parsers were written without tools
  • yacc, bison, ... make tools acceptable
  • But it is still difficult to write compilers in
    parser actions (top-down and bottom-up)
  • Natural grammars are ambiguous
  • No modularity principle
  • Many useful programming language features prevent
    code generation while parsing:
      • Use before declaration
      • gotos

16
Abstract Syntax
  • Intermediate program representation
  • Defines a tree that preserves the program hierarchy
  • Generated by the parser
  • Declared using an (ambiguous) context-free
    grammar (relatively flat); not meant for parsing
  • Keywords and punctuation symbols are not stored
    (not relevant once the tree exists)
  • Big programs can also be handled (possibly via
    virtual memory)

17
Issues
  • Concrete vs. Abstract syntax tree
  • Need to store concrete source position
  • Abstract syntax can be defined by:
      • Ambiguous context-free grammar
      • C recursive data type
      • Constructor functions
  • Debugging routines linearize the tree
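
The points above (a C recursive data type, constructor functions, stored source positions, and a debugging routine that linearizes the tree) can be sketched for expressions as follows. All names here (`A_exp`, `A_NumExp`, `A_OpExp`, `A_linearize`, the `pos` field as a plain integer offset) are illustrative assumptions and not necessarily the declarations used in the course's `absyn.h`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { A_plus, A_minus, A_times, A_div } A_binop;
typedef struct A_exp_ *A_exp;

/* Recursive data type: one struct, a kind tag, and a union of variants. */
struct A_exp_ {
    enum { A_numExp, A_opExp } kind;
    int pos;                              /* concrete source position */
    union {
        int num;
        struct { A_exp left; A_binop oper; A_exp right; } op;
    } u;
};

/* Constructor functions: one per variant. */
A_exp A_NumExp(int pos, int num) {
    A_exp e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = A_numExp;
    e->pos = pos;
    e->u.num = num;
    return e;
}

A_exp A_OpExp(int pos, A_exp left, A_binop oper, A_exp right) {
    A_exp e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = A_opExp;
    e->pos = pos;
    e->u.op.left = left;
    e->u.op.oper = oper;
    e->u.op.right = right;
    return e;
}

/* Append s to buf if it still fits; otherwise drop it silently. */
static void put(char *buf, size_t cap, size_t *len, const char *s) {
    size_t n = strlen(s);
    if (*len + n < cap) {
        memcpy(buf + *len, s, n + 1);
        *len += n;
    }
}

static void lin(char *buf, size_t cap, size_t *len, A_exp e) {
    static const char *op_name[] = { "plus", "minus", "times", "div" };
    char tmp[32];
    switch (e->kind) {
    case A_numExp:
        snprintf(tmp, sizeof tmp, "NumExp(%d)", e->u.num);
        put(buf, cap, len, tmp);
        break;
    case A_opExp:
        put(buf, cap, len, "OpExp(");
        lin(buf, cap, len, e->u.op.left);
        put(buf, cap, len, ", ");
        put(buf, cap, len, op_name[e->u.op.oper]);
        put(buf, cap, len, ", ");
        lin(buf, cap, len, e->u.op.right);
        put(buf, cap, len, ")");
        break;
    }
}

/* Debugging routine: write the tree as one line of text into buf. */
void A_linearize(char *buf, size_t cap, A_exp e) {
    size_t len = 0;
    assert(buf != NULL && cap > 0);
    buf[0] = '\0';
    lin(buf, cap, &len, e);
}
```

Note that parentheses and operator tokens from the source text do not appear in the tree; grouping is expressed purely by nesting.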

18
Abstract Syntax for Straight-line Program
19
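%{
#include "absyn.h"
%}
%union {
    int num;
    string id;
    A_stm stm;
    A_exp exp;
    A_expList expList;
}
%token <num> INT
%token <id> ID
%token ASSIGN PRINT LPAREN RPAREN
%type <stm> prog stm
%type <exp> exp
%type <expList> exps
%left SEMICOLUMN COMMA
%left PLUS MINUS
%left TIMES DIV
%start prog
%%
prog : stm                         { $$ = $1; }
     ;
stm  : stm SEMICOLUMN stm          { $$ = A_CompoundStm($1, $3); }
     | ID ASSIGN exp               { $$ = A_AssignStm($1, $3); }
     | PRINT LPAREN exps RPAREN    { $$ = A_PrintStm($3); }
     ;
exps : exp                         { $$ = A_ExpList($1, NULL); }
     /* as on the slide; if A_ExpList takes (head, tail), a
        right-recursive rule exp COMMA exps would type-check */
     | exps COMMA exp              { $$ = A_ExpList($1, $3); }
     ;
exp  : INT                         { $$ = A_NumExp($1); }
     | ID                          { $$ = A_IdExp($1); }
     | exp PLUS exp                { $$ = A_OpExp($1, A_Plus, $3); }
     | exp MINUS exp               { $$ = A_OpExp($1, A_Minus, $3); }
     | exp TIMES exp               { $$ = A_OpExp($1, A_Time, $3); }
     | exp DIV exp                 { $$ = A_OpExp($1, A_Div, $3); }
     | exp COMMA exp               { $$ = A_EseqExp($1, $3); }
     | '(' exp ')'                 { $$ = $2; }
     ;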
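
Once the parser only builds a tree, interpretation becomes a separate pass over that tree. The sketch below shows this for a subset of the straight-line language (assignments and compound statements only; `print` and `EseqExp` are left out for brevity). The struct layouts, the fixed-size `struct env`, and the functions `interp_exp`, `interp_stm`, and `env_get` are assumptions made for this illustration, and the operator constant is spelled `A_Times` here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct A_exp_ *A_exp;
typedef struct A_stm_ *A_stm;
typedef enum { A_Plus, A_Minus, A_Times, A_Div } A_binop;

struct A_exp_ {
    enum { A_idExp, A_numExp, A_opExp } kind;
    union {
        const char *id;
        int num;
        struct { A_exp left; A_binop oper; A_exp right; } op;
    } u;
};

struct A_stm_ {
    enum { A_compoundStm, A_assignStm } kind;
    union {
        struct { A_stm stm1, stm2; } compound;
        struct { const char *id; A_exp exp; } assign;
    } u;
};

static void *checked_malloc(size_t n) {
    void *p = malloc(n);
    assert(p != NULL);
    return p;
}

/* Constructor functions, one per tree node variant. */
A_exp A_IdExp(const char *id) {
    A_exp e = checked_malloc(sizeof *e);
    e->kind = A_idExp; e->u.id = id;
    return e;
}
A_exp A_NumExp(int num) {
    A_exp e = checked_malloc(sizeof *e);
    e->kind = A_numExp; e->u.num = num;
    return e;
}
A_exp A_OpExp(A_exp left, A_binop oper, A_exp right) {
    A_exp e = checked_malloc(sizeof *e);
    e->kind = A_opExp;
    e->u.op.left = left; e->u.op.oper = oper; e->u.op.right = right;
    return e;
}
A_stm A_CompoundStm(A_stm s1, A_stm s2) {
    A_stm s = checked_malloc(sizeof *s);
    s->kind = A_compoundStm;
    s->u.compound.stm1 = s1; s->u.compound.stm2 = s2;
    return s;
}
A_stm A_AssignStm(const char *id, A_exp exp) {
    A_stm s = checked_malloc(sizeof *s);
    s->kind = A_assignStm;
    s->u.assign.id = id; s->u.assign.exp = exp;
    return s;
}

/* Hypothetical fixed-size environment mapping identifiers to values. */
#define MAX_VARS 16
struct env { const char *ids[MAX_VARS]; int vals[MAX_VARS]; int n; };

int env_get(const struct env *env, const char *id) {
    for (int i = 0; i < env->n; i++)
        if (strcmp(env->ids[i], id) == 0) return env->vals[i];
    assert(!"undefined variable");
    return 0;
}

static void env_set(struct env *env, const char *id, int v) {
    for (int i = 0; i < env->n; i++)
        if (strcmp(env->ids[i], id) == 0) { env->vals[i] = v; return; }
    assert(env->n < MAX_VARS);
    env->ids[env->n] = id;
    env->vals[env->n] = v;
    env->n++;
}

/* Evaluate an expression tree in the given environment. */
int interp_exp(A_exp e, const struct env *env) {
    switch (e->kind) {
    case A_idExp:  return env_get(env, e->u.id);
    case A_numExp: return e->u.num;
    case A_opExp: {
        int l = interp_exp(e->u.op.left, env);
        int r = interp_exp(e->u.op.right, env);
        switch (e->u.op.oper) {
        case A_Plus:  return l + r;
        case A_Minus: return l - r;
        case A_Times: return l * r;
        case A_Div:   assert(r != 0); return l / r;
        }
    }
    }
    assert(!"unreachable");
    return 0;
}

/* Execute a statement tree, updating the environment in place. */
void interp_stm(A_stm s, struct env *env) {
    switch (s->kind) {
    case A_compoundStm:
        interp_stm(s->u.compound.stm1, env);
        interp_stm(s->u.compound.stm2, env);
        break;
    case A_assignStm:
        env_set(env, s->u.assign.id, interp_exp(s->u.assign.exp, env));
        break;
    }
}
```

The parser actions and the interpreter now share only the tree type and its constructors, so either side can change without touching the other.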
20
Summary
  • Flex and Bison simplify the task of writing
    compiler/interpreter front-ends
  • Abstract syntax provides a clear interface with
    other compiler phases
  • Supports general programming languages
  • But the design of an abstract syntax for a given
    PL may take some time