Grammars for Syntax Definition - PowerPoint PPT Presentation

Title:

Grammars for Syntax Definition

Slides: 28
Provided by: steve805

Transcript and Presenter's Notes

Title: Grammars for Syntax Definition


1
Grammars for Syntax Definition
  • A Context-free Grammar (CFG) Is Utilized to
    Describe the Syntactic Structure of a Language
  • A CFG Is Characterized By
  • 1. A Set of Tokens or Terminal Symbols
  • 2. A Set of Non-terminals
  • 3. A Set of Production Rules: Each Rule Has the
    Form NT → {T, NT}*
  • 4. A Non-terminal Designated As the Start Symbol

2
Grammars for Syntax Definition: Example CFG
list → list + digit
list → list - digit
list → digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
(the | means OR)
(So we could have written list → list + digit | list - digit | digit)
3
Grammars are Used to Derive Strings
Using the CFG defined on the previous slide, we
can derive the string 9 - 5 + 2 as follows:
list ⇒ list + digit             (P1: list → list + digit)
     ⇒ list - digit + digit     (P2: list → list - digit)
     ⇒ digit - digit + digit    (P3: list → digit)
     ⇒ 9 - digit + digit        (P4: digit → 9)
     ⇒ 9 - 5 + digit            (P4: digit → 5)
     ⇒ 9 - 5 + 2                (P4: digit → 2)
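The derivation can be replayed mechanically. The following is a minimal sketch, assuming the productions are list → list + digit (P1), list → list - digit (P2), list → digit (P3) and digit → 0 | ... | 9 (P4), and that each step rewrites the leftmost non-terminal. The helper names `apply_leftmost` and `derive` are illustrative only.

```python
# Productions of the list grammar, keyed by the labels used on the slide.
# Each entry is (left-hand non-terminal, right-hand side as a list of symbols).
PRODUCTIONS = {
    "P1": ("list", ["list", "+", "digit"]),
    "P2": ("list", ["list", "-", "digit"]),
    "P3": ("list", ["digit"]),
}
NONTERMINALS = {"list", "digit"}


def apply_leftmost(form, head, body):
    """Replace the leftmost non-terminal of `form`, which must be `head`, with `body`."""
    for i, symbol in enumerate(form):
        if symbol in NONTERMINALS:
            if symbol != head:
                raise ValueError(f"leftmost non-terminal is {symbol}, not {head}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no non-terminal left to expand")


def derive(steps):
    """Return every sentential form produced by applying `steps` in order."""
    form = ["list"]
    forms = [form]
    for head, body in steps:
        form = apply_leftmost(form, head, body)
        forms.append(form)
    return forms


STEPS = [
    PRODUCTIONS["P1"],
    PRODUCTIONS["P2"],
    PRODUCTIONS["P3"],
    ("digit", ["9"]),  # P4 with digit -> 9
    ("digit", ["5"]),  # P4 with digit -> 5
    ("digit", ["2"]),  # P4 with digit -> 2
]

if __name__ == "__main__":
    for form in derive(STEPS):
        print(" ".join(form))
```

Running the script prints one sentential form per line, starting at `list` and ending at `9 - 5 + 2`.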
4
Grammars are Used to Derive Strings
This derivation could also be represented via a
Parse Tree
list ⇒ list + digit
     ⇒ list - digit + digit
     ⇒ digit - digit + digit
     ⇒ 9 - digit + digit
     ⇒ 9 - 5 + digit
     ⇒ 9 - 5 + 2
5
A More Complex Grammar
block → begin opt_stmts end
opt_stmts → stmt_list | ε
stmt_list → stmt_list ; stmt | stmt
What is this grammar for? What does ε
represent? What kind of production rule is this?
6
Defining a Parse Tree
  • More Formally, a Parse Tree for a CFG Has the
    Following Properties
  • Root Is Labeled With the Start Symbol
  • Leaf Node Is a Token or ε
  • Interior Node (Non-Leaf) Is a Non-Terminal
  • If A → x1 x2 ... xn, Then A Is an Interior Node and
    x1, x2, ..., xn Are Children of A and May Be
    Non-Terminals or Tokens

7
Other Important Concepts: Ambiguity
Two derivations (Parse Trees) for the same token
string.
Grammar: string → string + string | string - string
                | 0 | 1 | ... | 9
Why is this a Problem?
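One way to see the problem is to enumerate every parse tree and evaluate it. The sketch below assumes the ambiguous grammar is string → string + string | string - string | 0 | ... | 9 and that the input is a well-formed alternation of digits and operators. The function name `parse_values` is illustrative.

```python
def parse_values(tokens):
    """Return the value of every parse tree of `tokens` under the grammar
    string -> string + string | string - string | digit."""
    if len(tokens) == 1:
        return [int(tokens[0])]
    values = []
    # Any operator position may serve as the root of a parse tree.
    for i in range(1, len(tokens), 2):
        op = tokens[i]
        for left in parse_values(tokens[:i]):
            for right in parse_values(tokens[i + 1:]):
                values.append(left + right if op == "+" else left - right)
    return values


if __name__ == "__main__":
    print(parse_values("9 - 5 + 2".split()))
```

For `9 - 5 + 2` the two trees correspond to 9 - (5 + 2) and (9 - 5) + 2, so the same token string yields two different values. A compiler needs exactly one meaning per string, which is why ambiguity has to be removed from the grammar or resolved by extra rules.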
8
Other Important Concepts: Associativity of
Operators
Left vs. Right
right → letter = right | letter
letter → a | b | c | ... | z
9
Other Important Concepts: Operator Precedence
What does 9 + 5 * 2 mean?
Typically, the precedence order is
( )   * /   + -
This can be incorporated into a grammar via
rules
expr → expr + term | expr - term | term
term → term * factor | term / factor | factor
factor → digit | ( expr )
digit → 0 | 1 | 2 | 3 | ... | 9
Precedence is achieved by having one non-terminal
(expr, term) for each precedence level. The rules for
each are left recursive, so the operators associate to the left.
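The effect of the layered rules can be checked with a small evaluator. This is a sketch under stated assumptions: the grammar is expr → expr + term | expr - term | term, term → term * factor | term / factor | factor, factor → digit | ( expr ); operands are single digits; each left-recursive rule is realised as a loop, which preserves left associativity. The function name `evaluate` is illustrative.

```python
def evaluate(text):
    """Evaluate single-digit arithmetic using the expr/term/factor grammar."""
    tokens = [ch for ch in text if not ch.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def factor():
        # factor -> digit | ( expr )
        tok = peek()
        if tok is not None and tok.isdecimal():
            advance()
            return int(tok)
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return value
        raise SyntaxError(f"unexpected token {tok!r}")

    def term():
        # term -> term * factor | term / factor | factor
        value = factor()
        while peek() in ("*", "/"):
            op = peek()
            advance()
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():
        # expr -> expr + term | expr - term | term
        value = term()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result


if __name__ == "__main__":
    print(evaluate("9 + 5 * 2"))
```

Because `term` is called from inside `expr`, a multiplication is always grouped before the surrounding addition, so `9 + 5 * 2` is read as 9 + (5 * 2). Division here is Python's true division, which is a choice of this sketch rather than something the slides specify.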
10
Syntax-Directed Translation
  • Associate Attributes With Grammar Rules
    Constructs and Translate As Parsing Occurs
  • Our Example Uses Infix to Postfix Notation
    Translation for Expressions
  • Translation May Be Defined Inductively As
    postfix(E), Where E is an Expression

If E = e1 op e2 then postfix(E) = postfix(e1) postfix(e2) op
If E = (e) then postfix(E) = postfix(e)
If E = x then postfix(E) = x
Examples
( 9 - 5 ) + 2 → 9 5 - 2 +
9 - ( 5 + 2 ) → 9 5 2 + -
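The inductive definition maps directly onto a recursive function. In this minimal sketch an expression is assumed to be either a string x or a tuple `(op, e1, e2)`; that tree encoding is an assumption made for illustration. The parenthesis rule needs no case of its own because the nesting of the tuples already records the grouping.

```python
def postfix(e):
    """Translate an expression tree to postfix notation.

    An expression is either a string (a variable or constant x) or a
    tuple (op, e1, e2) standing for "e1 op e2".
    """
    if isinstance(e, str):
        return e  # postfix(x) = x
    op, e1, e2 = e
    # postfix(e1 op e2) = postfix(e1) postfix(e2) op
    return f"{postfix(e1)} {postfix(e2)} {op}"


if __name__ == "__main__":
    print(postfix(("+", ("-", "9", "5"), "2")))  # ( 9 - 5 ) + 2
    print(postfix(("-", "9", ("+", "5", "2"))))  # 9 - ( 5 + 2 )
```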
11
Syntax-Directed Definition (2 parts)
  • Each Production Has a Set of Semantic Rules
  • Each Grammar Symbol Has a Set of Attributes
  • For the Following Example, String Attribute t
    is Associated With Each Grammar Symbol, i.e.,
  • What is a Derivation for 9 + 5 - 2?

12
Syntax-Directed Definition (2 parts)
  • Each Production Rule of the CFG Has a Semantic
    Rule
  • Note Semantic Rules for expr Use Synthesized
    Attributes Which Obtain Their Values From Other
    Rules.

13
Semantic Rules are Embedded in Parse Tree
  • How Do Semantic Rules Work ?
  • What Type of Tree Traversal is Being Performed?
  • How Can We More Closely Associate Semantic Rules
    With Production Rules ?

14
Examples
rest → + term rest
becomes
rest → + term { print('+') } rest
(Print '+' after term for postfix translation)
15
Parsing: Top-Down Predictive
  • Top-Down Parsing: The parse tree / derivation of
    a token string is built in a top-down fashion.
  • For Example, Consider

type → simple | ↑ id | array [ simple ] of type
simple → integer | char | num dotdot num
Suppose the input is: array [ num dotdot num ] of integer
The parse would begin with: type → array [ simple ] of type
16
Top-Down Parse (type = start symbol)
Input: array [ num dotdot num ] of integer
Tokens
17
Top-Down Parse (type = start symbol)
Input: array [ num dotdot num ] of integer
18
Top-Down Process: Recursive Descent or Predictive
Parsing
  • Parser Operates by Attempting to Match Tokens in
    the Input Stream
  • Utilize both Grammar and Input Below to Motivate
    Code for Algorithm

Input: array [ num dotdot num ] of integer
type → simple | ↑ id | array [ simple ] of type
simple → integer | char | num dotdot num
procedure match ( t : token );
begin
  if lookahead = t then
    lookahead := nexttoken
  else error
end;
19
Top-Down Algorithm (Continued)
procedure type;
begin
  if lookahead is in { integer, char, num } then
    simple
  else if lookahead = '↑' then begin
    match ( '↑' ); match ( id )
  end
  else if lookahead = array then begin
    match ( array ); match ( '[' ); simple; match ( ']' );
    match ( of ); type
  end
  else error
end;

procedure simple;
begin
  if lookahead = integer then
    match ( integer )
  else if lookahead = char then
    match ( char )
  else if lookahead = num then begin
    match ( num ); match ( dotdot ); match ( num )
  end
  else error
end;
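The two procedures translate almost line for line into a runnable parser. The following sketch assumes the grammar type → simple | ^ id | array [ simple ] of type and simple → integer | char | num dotdot num, with `^` standing in for the up-arrow, tokens already separated by spaces, and `$` as a hypothetical end-of-input marker. The names `Parser` and `parses` are illustrative.

```python
class Parser:
    """Predictive parser: one method per non-terminal, one token of lookahead."""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0
        self.lookahead = self.tokens[0]

    def match(self, t):
        # Consume the lookahead if it is the expected token, else report an error.
        if self.lookahead != t or t == "$":
            raise SyntaxError(f"expected {t!r}, found {self.lookahead!r}")
        self.pos += 1
        self.lookahead = self.tokens[self.pos]

    def type_(self):
        # The lookahead alone decides which production of `type` to use.
        if self.lookahead in ("integer", "char", "num"):
            self.simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise SyntaxError(f"unexpected token {self.lookahead!r}")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise SyntaxError(f"unexpected token {self.lookahead!r}")


def parses(text):
    """Return True if `text` is a complete sentence derived from `type`."""
    parser = Parser(text.split())
    try:
        parser.type_()
    except SyntaxError:
        return False
    return parser.lookahead == "$"


if __name__ == "__main__":
    print(parses("array [ num dotdot num ] of integer"))
```

Each production alternative starts with a distinct token, so the parser never has to back up. That property is what makes the grammar suitable for predictive parsing.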
20
Problem with Top Down Parsing
  • Left Recursion in CFG May Cause Parser to Loop
    Forever
  • Solution: Algorithm to Remove Left Recursion

expr → expr + term | expr - term | term
term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
is rewritten as
expr → term rest
rest → + term rest | - term rest | ε
term → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

New Semantic Actions!
rest → + term { print('+') } rest
     | - term { print('-') } rest
     | ε
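With left recursion removed, the translation scheme can be coded as a recursive-descent translator. This is a sketch under assumptions: the grammar is expr → term rest, rest → + term { print('+') } rest | - term { print('-') } rest | ε, operands are single digits, and `term` also emits its digit when it is matched (the slide lists only the actions for `rest`). Output is collected in a list instead of being printed, and `$` is a hypothetical end marker.

```python
def to_postfix(text):
    """Translate a single-digit infix expression with + and - to postfix."""
    tokens = [ch for ch in text if not ch.isspace()] + ["$"]
    pos = 0
    out = []

    def lookahead():
        return tokens[pos]

    def match(t):
        nonlocal pos
        if lookahead() != t:
            raise SyntaxError(f"expected {t!r}, found {lookahead()!r}")
        pos += 1

    def term():
        # term -> digit, emitting the digit (assumed action)
        tok = lookahead()
        if not tok.isdecimal():
            raise SyntaxError(f"expected a digit, found {tok!r}")
        match(tok)
        out.append(tok)

    def rest():
        tok = lookahead()
        if tok in ("+", "-"):
            match(tok)
            term()
            out.append(tok)  # semantic action: emit the operator after term
            rest()
        # otherwise rest -> epsilon: consume nothing

    def expr():
        term()
        rest()

    expr()
    if lookahead() != "$":
        raise SyntaxError(f"unexpected token {lookahead()!r}")
    return " ".join(out)


if __name__ == "__main__":
    print(to_postfix("9 - 5 + 2"))
```

The operator is emitted after its right operand and before the recursive call to `rest`, which is what keeps the output left associative even though the grammar is now right recursive.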
21
Comparing Grammars with Left Recursion
  • Notice Location of Semantic Actions in Tree
  • What is Order of Processing?

22
Comparing Grammars without Left Recursion
  • Now, Notice Location of Semantic Actions in Tree
    for Revised Grammar
  • What is Order of Processing in this Case?

23
The Lexical Analysis Process: A Graphical Depiction
  • lexan ( ), the lexical analyzer, uses getchar ( ) to
    read a character
  • Pushes back c using ungetc (c, stdin)
  • Returns token to caller
  • Sets global variable tokenval to attribute value
24
The Lexical Analysis Process: Functional
Responsibilities
  • Input Token String Is Broken Down
  • White Space and Comments Are Filtered Out
  • Individual Tokens With Associated Values Are
    Identified
  • Symbol Table Is Initialized and Entries Are
    Constructed for Each Appropriate Token
  • Under What Conditions will a Character be Pushed
    Back?
  • Can You Cite Some Examples in Programming
    Language Statements?

25
Algorithm for Lexical Analyzer
function lexan : integer;
var lexbuf : array [ 0 .. 100 ] of char;
    c : char;
begin
  loop begin
    read a character into c;
    if c is a blank or a tab then
      do nothing
    else if c is a newline then
      lineno := lineno + 1
    else if c is a digit then begin
      set tokenval to the value of this and
        following digits;
      return NUM
    end
26
Algorithm for Lexical Analyzer
    else if c is a letter then begin
      place c and successive letters and digits
        into lexbuf;
      p := lookup ( lexbuf );
      if p = 0 then
        p := insert ( lexbuf, ID );
      tokenval := p;
      return the token field of table entry p
    end
    else begin /* token is a single character */
      set tokenval to NONE; /* there is no attribute */
      return integer encoding of character c
    end
  end
end
Note: Insert / Lookup operations occur against
the Symbol Table!
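The lexan algorithm can be sketched in runnable form as follows. Several details are assumptions made for illustration: the token codes `NUM`, `ID` and `DONE` and the marker `NONE` are hypothetical values, the input is a string instead of stdin, and stopping the scan before the first character that does not belong to the token plays the role of ungetc. Entry 0 of the table is left unused so that `lookup` can return 0 for "not found".

```python
NUM, ID, DONE = 256, 257, 258  # hypothetical token codes above the character range
NONE = -1                      # hypothetical "no attribute" marker


class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.lineno = 1
        self.tokenval = NONE
        self.symtable = [None]  # entry 0 unused; real entries are (lexeme, token)

    def lookup(self, lexeme):
        for p in range(1, len(self.symtable)):
            if self.symtable[p][0] == lexeme:
                return p
        return 0

    def insert(self, lexeme, token):
        self.symtable.append((lexeme, token))
        return len(self.symtable) - 1

    def lexan(self):
        """Return the next token and set self.tokenval to its attribute."""
        while self.pos < len(self.text):
            c = self.text[self.pos]
            self.pos += 1
            if c in " \t":
                continue  # skip white space
            if c == "\n":
                self.lineno += 1
            elif c.isdecimal():
                start = self.pos - 1
                while self.pos < len(self.text) and self.text[self.pos].isdecimal():
                    self.pos += 1
                self.tokenval = int(self.text[start:self.pos])
                return NUM
            elif c.isalpha():
                start = self.pos - 1
                while self.pos < len(self.text) and self.text[self.pos].isalnum():
                    self.pos += 1
                lexeme = self.text[start:self.pos]
                p = self.lookup(lexeme)
                if p == 0:
                    p = self.insert(lexeme, ID)
                self.tokenval = p
                return self.symtable[p][1]
            else:
                self.tokenval = NONE  # a single-character token has no attribute
                return ord(c)
        return DONE


if __name__ == "__main__":
    lexer = Lexer("count = count + 12")
    while (token := lexer.lexan()) != DONE:
        print(token, lexer.tokenval)
```

Reserved words could be handled by calling `insert` with their own token codes before scanning starts, so that `lexan` returns the stored token instead of `ID`.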
27
Symbol Table Considerations
OPERATIONS
  • Insert (string, token_ID)
  • Lookup (string)
NOTICE
  • Reserved words are placed into the symbol table
    for easy lookup
  • Attributes may be associated with each entry, i.e.,
    Semantic Actions
    Typing Info: id → integer
    etc.
[Figure: ARRAY symtable, with entries indexed 0 to 4 and fields lexptr, token, and attributes, holding the tokens div, mod, id, id; ARRAY lexemes stores the strings that the lexptr fields refer to]
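The two-array layout can be sketched as below. Assumptions made for illustration: lexemes are stored back to back in one character array and terminated by a hypothetical `EOS` marker, each symtable entry records the start index (lexptr) of its lexeme, and entry 0 is left unused so that Lookup can return 0 for a missing string. The identifiers `count` and `i` in the usage are example names, not taken from the slide.

```python
EOS = "\0"  # hypothetical end-of-string marker inside the lexemes array


class SymbolTable:
    def __init__(self):
        self.lexemes = []       # one flat array of characters
        self.symtable = [None]  # entry 0 unused; lookup returns 0 when absent

    def _lexeme_at(self, lexptr):
        # Read characters from lexptr up to (not including) the next EOS.
        end = self.lexemes.index(EOS, lexptr)
        return "".join(self.lexemes[lexptr:end])

    def lookup(self, s):
        """Return the index of the entry for s, or 0 if there is none."""
        for p in range(1, len(self.symtable)):
            if self._lexeme_at(self.symtable[p]["lexptr"]) == s:
                return p
        return 0

    def insert(self, s, token):
        """Append s to lexemes, add a symtable entry, and return its index."""
        lexptr = len(self.lexemes)
        self.lexemes.extend(s)
        self.lexemes.append(EOS)
        self.symtable.append({"lexptr": lexptr, "token": token, "attributes": {}})
        return len(self.symtable) - 1


if __name__ == "__main__":
    table = SymbolTable()
    for word in ("div", "mod"):  # reserved words are entered up front
        table.insert(word, word)
    table.insert("count", "id")
    table.insert("i", "id")
    print(table.lookup("mod"), table.lookup("count"), table.lookup("x"))
```

Keeping the characters in a separate array lets entries have a fixed size regardless of lexeme length. The linear search in `lookup` is the simplest choice for a sketch; a hash table would be the usual replacement once the table grows.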