Parsing I - PowerPoint PPT Presentation

Provided by: cbo4
1
Parsing I
Please read 4.1 to 4.3
  • Parsing vs. Scanning
  • Context Free Grammars
  • Bottom-up and Top-down parsing

Token Stream
Parser
2
Recall the Structure of a Compiler
character stream
Lexical Analysis
token stream
Parsing
Front End
syntax tree
Semantic Analysis
syntax tree
Intermediate Code Generation
Symbol Table
intermediate code
Optimization
Back End
intermediate code
Code Generation
target machine code
3
Here's where we are now
character stream
Lexical Analysis
token stream
Parsing
Front End
syntax tree
Semantic Analysis
syntax tree
Intermediate Code Generation
Symbol Table
intermediate code
Optimization
Back End
intermediate code
Code Generation
target machine code
4
An Exercise
Can you create a regular expression to determine
if brackets are matched? Good Bad

5
Unbounded counting
Can you create a regular expression to determine
if brackets are matched? Good Bad

We would need to be able to count any number of
parentheses to determine a match. We can't do
unbounded counting with a regular expression. (Why not?)
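To make the counting concrete, here is a minimal sketch in Python of a check that keeps a running depth. The function name `brackets_matched` and its parameters are illustrative assumptions, not something from the slides. The point is that `depth` can grow without bound, which is exactly what a regular expression (a machine with a fixed number of states) cannot track.

```python
def brackets_matched(text, open_ch="(", close_ch=")"):
    """Return True if every opening bracket has a matching closing bracket."""
    depth = 0  # number of brackets currently open; unbounded in general
    for ch in text:
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
            if depth < 0:  # a closing bracket appeared before its opener
                return False
    return depth == 0  # everything that was opened must be closed


if __name__ == "__main__":
    for sample in ["(()())", "(()", ")("]:
        print(sample, brackets_matched(sample))
```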
6
Regular Expressions
Regular expressions lack the expressive power to
specify syntax! So we can't use them to
describe syntax!
7
Context-free Grammars (CFGs)
  • An important class of formal grammars
  • Generally sufficient to express the syntax of
    modern programming languages
  • Efficient implementations exist
  • O(n^3) worst-case time for general CFGs.
  • O(n) time for the grammar classes used in practice (e.g., LL and LR).
  • Grammar will be described using rules which then
    map to the nodes of syntax trees.

8
A C Example
A simple C program fragment:
do i while (i < 1000)
What's a reasonable syntax tree?
9
Example Syntax Tree
A simple C program fragment:
do i while (i < 1000)
do-while
    <id,1>
    <
        <id,1>
        1000
do statement while ( expression )
This is a way we might describe this type of
statement (less the expressions and nested
statement).
10
Definition of a Context-free Grammar
  • A set of terminal symbols (tokens). These are the
    elementary symbols of the language defined by the
    grammar.
  • A set of nonterminal symbols (syntactic
    variables). Each nonterminal represents a set of
    strings of terminals.
  • A set of productions, where each production
    consists of a nonterminal called the head or left
    side of the production, an arrow, and a sequence
    of terminals and/or nonterminals called the body.
  • A designation of one nonterminal as the start
    symbol.
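The four parts of the definition map directly onto a small data structure. The sketch below is one possible encoding in Python. The class name `Grammar`, its field names, and the helper `is_well_formed` are assumptions made for illustration, and the example grammar (balanced parentheses) is chosen only because it is small.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# A production is a (head, body) pair; an empty body stands for epsilon.
Production = Tuple[str, Tuple[str, ...]]


@dataclass(frozen=True)
class Grammar:
    terminals: FrozenSet[str]            # 1. terminal symbols (tokens)
    nonterminals: FrozenSet[str]         # 2. nonterminal symbols
    productions: Tuple[Production, ...]  # 3. productions: head -> body
    start: str                           # 4. the start symbol

    def is_well_formed(self) -> bool:
        """Check that the four parts are consistent with one another."""
        symbols = self.terminals | self.nonterminals
        return (
            self.terminals.isdisjoint(self.nonterminals)
            and self.start in self.nonterminals
            and all(
                head in self.nonterminals and all(s in symbols for s in body)
                for head, body in self.productions
            )
        )


# Hypothetical example: S -> ( S ) S | epsilon
PARENS = Grammar(
    terminals=frozenset({"(", ")"}),
    nonterminals=frozenset({"S"}),
    productions=(("S", ("(", "S", ")", "S")), ("S", ())),
    start="S",
)

if __name__ == "__main__":
    print(PARENS.is_well_formed())
```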

11
Just suppose
What are the possible statements if this is all
our language could do?
do i while (i < 1000)
What are the possible expressions?
Statements are lines of code. We'll consider a
program to be statement statement statement etc.
An expression is something that evaluates to a value.
12
Just suppose
What are the possible statements if this is all
our language could do?
do i while (i < 1000)
What are the possible expressions?
Statements: do statement while ( expression ) | id
Expressions: id < num
13
Definition of a Context-free Grammar
  • A set of terminal symbols (tokens). These are the
    elementary symbols of the language defined by the
    grammar.
  • do while ( ) < id num ε

Epsilon (ε) means nothing: the empty string.
14
Our start nonterminal
4. A designation of one nonterminal as the start
symbol.
program → ????
15
Our start nonterminal
3. A set of productions, where each production
consists of a nonterminal called the head or left
side of the production, an arrow, and a sequence
of terminals and/or nonterminals called the body.
program → statements
Pretty easy, huh? What's the production for
statements?
Production
16
Statements
program → statements
statements → statement statements | ε
There's no * like in regular expressions. So, we
have either a statement followed by more
statements or we have nothing. How does i
j k fit into these productions?
17
Statement
program → statements
statements → statement statements | ε
statement → do statement while ( expression ) | id
Expressions?
18
Statement
program → statements
statements → statement statements | ε
statement → do statement while ( expression ) | id
expression → id < num
This is a complete grammar for this minimal
language.
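As a rough illustration of how such a grammar can drive a recognizer, the sketch below writes one Python function per nonterminal. It assumes the grammar reads program → statements; statements → statement statements | ε; statement → do statement while ( expression ) | id; expression → id < num, and that the input is already a list of token names. All function names are hypothetical.

```python
def expect(toks, i, tok):
    """Consume one specific token or fail."""
    if i < len(toks) and toks[i] == tok:
        return i + 1
    raise SyntaxError(f"expected {tok!r} at position {i}")


def expression(toks, i):
    # expression -> id < num
    i = expect(toks, i, "id")
    i = expect(toks, i, "<")
    return expect(toks, i, "num")


def statement(toks, i):
    # statement -> do statement while ( expression ) | id
    if i < len(toks) and toks[i] == "id":
        return i + 1
    i = expect(toks, i, "do")
    i = statement(toks, i)
    i = expect(toks, i, "while")
    i = expect(toks, i, "(")
    i = expression(toks, i)
    return expect(toks, i, ")")


def statements(toks, i):
    # statements -> statement statements | epsilon
    if i < len(toks) and toks[i] in ("do", "id"):
        return statements(toks, statement(toks, i))
    return i  # the epsilon alternative: consume nothing


def accepts(toks):
    # program -> statements, and all input must be consumed
    try:
        return statements(toks, 0) == len(toks)
    except SyntaxError:
        return False


if __name__ == "__main__":
    print(accepts("do id while ( id < num )".split()))
```

Each function returns the position after the text it matched, so the ε alternative is simply "return the position unchanged".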
19
Statement
program → statements
statements → statement statements | ε
statement → do statement while ( expression ) | expression
expression → id < num | id
20
Balanced Parenthesis CFG
program → S
S → ( S ) S | ε
If a grammar accepts a string, there exists at
least one derivation of that string using the
productions one at a time.
(): program ⇒ S ⇒ ( S ) S ⇒ ( ) S ⇒ ( )
(()): program ⇒ S ⇒ ( S ) S ⇒ ( ( S ) S ) S ⇒ ( ( ) ) S ⇒ ( ( ) )
()()
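The productions S → ( S ) S | ε translate almost line for line into a recursive function. The following Python sketch is an assumption-laden illustration (the names `parse_S` and `balanced` are made up here): it chooses the first alternative when the next character is an opening parenthesis and the ε alternative otherwise.

```python
def parse_S(s, i):
    """Match S starting at index i and return the index after the match."""
    if i < len(s) and s[i] == "(":
        # S -> ( S ) S
        i = parse_S(s, i + 1)
        if i >= len(s) or s[i] != ")":
            raise ValueError(f"expected ')' at position {i}")
        return parse_S(s, i + 1)
    # S -> epsilon
    return i


def balanced(s):
    """program -> S, where S must cover the whole input."""
    try:
        return parse_S(s, 0) == len(s)
    except ValueError:
        return False


if __name__ == "__main__":
    for sample in ["()", "(())", "()()", "(()"]:
        print(sample, balanced(sample))
```

The recursion depth plays the role of the unbounded counter that a regular expression lacks.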
21
Expressions
What about more general expressions including +,
*, and parentheses and any mix of numbers and ids?
a b a b b c d b c d b c d a (12
c) 5 ((a b) (c d) 4)
22
Expressions
expression → expression + t | t
t → t * f | f
f → ( expression ) | id | num
Create a parse tree for each one. Each
production becomes an interior node of the tree.
a b a b b c d b c d b c d a (12
c) 5 ((a b) (c d) 4)
23
Is a regular language also a context-free
language?
For any given regular expression, can we
construct a context-free grammar that accepts the
same strings?
24
Yep, Regular Languages are Context-Free Languages
Regular expression    Grammar
ε                     s → ε
a                     s → a
a b                   s → a b
a | b                 s → a | b
a*                    s → s1    s1 → a s1 | ε
Just apply these rules recursively and you can
convert a regular expression to a context-free
grammar. (What about parentheses?)
a ab (ab)
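Applying the rules recursively can be sketched as a short function over a regular-expression tree. The tuple encoding used below (`("eps",)`, `("sym", "a")`, `("cat", r1, r2)`, `("alt", r1, r2)`, `("star", r)`) and the generated nonterminal names `s0`, `s1`, ... are assumptions for this example only. Parentheses need no rule of their own here because grouping is already expressed by the nesting of the tree.

```python
import itertools


def regex_to_cfg(regex):
    """Return (start, productions); productions maps a nonterminal to bodies."""
    productions = {}
    counter = itertools.count()

    def build(node):
        name = f"s{next(counter)}"  # fresh nonterminal for this sub-expression
        kind = node[0]
        if kind == "eps":
            productions[name] = [[]]                    # s -> epsilon
        elif kind == "sym":
            productions[name] = [[node[1]]]             # s -> a
        elif kind == "cat":
            left, right = build(node[1]), build(node[2])
            productions[name] = [[left, right]]         # s -> s1 s2
        elif kind == "alt":
            left, right = build(node[1]), build(node[2])
            productions[name] = [[left], [right]]       # s -> s1 | s2
        elif kind == "star":
            inner = build(node[1])
            productions[name] = [[inner, name], []]     # s -> s1 s | epsilon
        else:
            raise ValueError(f"unknown node kind: {kind}")
        return name

    start = build(regex)
    return start, productions


if __name__ == "__main__":
    # (a b)* written as a tree
    tree = ("star", ("cat", ("sym", "a"), ("sym", "b")))
    start, prods = regex_to_cfg(tree)
    for head, bodies in sorted(prods.items()):
        alts = " | ".join(" ".join(b) if b else "epsilon" for b in bodies)
        print(head, "->", alts)
```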
25
Grammars to General Parsers
Parser
Context-free Grammar G
Yes, if s in L(G) No, otherwise
Token stream s
Error messages
A general parser (syntax analyzer) indicates if a
token stream is in the language of a given grammar.
It's an acceptor. Syntax trees are a (useful) side
effect that is easy to add.
26
Parsing Methods
Top-down
We start with the start symbol and expand from
there. We build the tree from the root down to
the leaves.
Bottom-up
We start from tokens (leaves of the tree) and
build the tree upward.
Top-down parsers are easy to write, but place
restrictions on the grammar. Bottom-up parsers
are usually machine generated, but can
accommodate a larger set of grammars.
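One common example of such a restriction: a hand-written recursive-descent parser cannot use a left-recursive production like e → e + t directly, because the function for e would call itself without consuming input. The Python sketch below assumes the usual expression grammar (e → e + t | t, t → t * f | f, f → ( e ) | id | num) and replaces the left recursion with loops. The function name `accepts` and the token spelling are assumptions.

```python
def accepts(tokens):
    """Recursive-descent acceptor for e -> t {+ t}, t -> f {* f}."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def e():
        t()
        while peek() == "+":  # loop replaces the left-recursive e -> e + t
            advance()
            t()

    def t():
        f()
        while peek() == "*":  # loop replaces the left-recursive t -> t * f
            advance()
            f()

    def f():
        if peek() in ("id", "num"):
            advance()
        elif peek() == "(":
            advance()
            e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        else:
            raise SyntaxError(f"unexpected token {peek()!r}")

    try:
        e()
    except SyntaxError:
        return False
    return pos == len(tokens)  # all input must be consumed


if __name__ == "__main__":
    print(accepts("( num + num ) * num".split()))
```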
27
Bottom-up Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
Tokenized
28
Bottom-up Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
( f + f ) * f
Apply f → ( e ) | id | num three times
29
Bottom-up Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
( f + f ) * f
( t + f ) * f
Apply t → t * f | f
30
Bottom-up Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
( f + f ) * f
( t + f ) * f
( e + f ) * f
Apply e → e + t | t
31
Bottom-up Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
( f + f ) * f
( t + f ) * f
( e + f ) * f
( e ) * f
f * f
t * f
t
e
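The reductions above can be mimicked by a small shift-reduce loop. The Python sketch below is a hand-written approximation for this one grammar (assumed to be e → e + t | t, t → t * f | f, f → ( e ) | id | num), not a generated LR parser: the decisions about when to shift and when to reduce are hard-coded using the top of the stack and one token of lookahead. The function name `shift_reduce` is hypothetical. It reduces as soon as it can, so the intermediate forms appear in a different order from the slides, which reduce all three num tokens first.

```python
def shift_reduce(tokens):
    """Return the list of sentential forms, or None if the input is rejected."""
    stack, rest = [], list(tokens)
    forms = [" ".join(rest)]

    def reduce(n, head):
        # Replace the top n stack symbols (a production body) with its head.
        del stack[-n:]
        stack.append(head)
        forms.append(" ".join(stack + rest))

    while True:
        top = stack[-1] if stack else None
        look = rest[0] if rest else None
        if top in ("id", "num"):
            reduce(1, "f")                      # f -> id | num
        elif top == ")" and stack[-3:] == ["(", "e", ")"]:
            reduce(3, "f")                      # f -> ( e )
        elif top == "f" and stack[-3:-1] == ["t", "*"]:
            reduce(3, "t")                      # t -> t * f
        elif top == "f":
            reduce(1, "t")                      # t -> f
        elif top == "t" and look == "*":
            stack.append(rest.pop(0))           # shift: * binds tighter than +
        elif top == "t" and stack[-3:-1] == ["e", "+"]:
            reduce(3, "e")                      # e -> e + t
        elif top == "t":
            reduce(1, "e")                      # e -> t
        elif rest:
            stack.append(rest.pop(0))           # shift the next token
        else:
            return forms if stack == ["e"] else None


if __name__ == "__main__":
    for form in shift_reduce("( num + num ) * num".split()):
        print(form)
```

Every reduction is a production applied in reverse, so any input that ends with only e on the stack has a derivation in the grammar.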
32
Top-down parsing
Begin with the start symbol. For each iteration,
replace one nonterminal with a production. Keep
going until we construct the input stream.
33
Top-down Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
Tokenized
e
Begin with start symbol
34
Top-down Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
e
t
Apply e → e + t | t
35
Top-down Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
e
t
t * f
Apply t → t * f | f
36
Top-down Parsing
e → e + t | t
t → t * f | f
f → ( e ) | id | num
(2 + 3) * 7
( num + num ) * num
e
t
t * f
Apply t → t * f | f
37
Top-down Parsing
(2 + 3) * 7
( num + num ) * num
e
t
t * f
f * f
( e ) * f
( e + t ) * f
( t + t ) * f
( f + t ) * f
( num + t ) * f
( num + f ) * f
( num + num ) * f
( num + num ) * num
e → e + t | t
t → t * f | f
f → ( e ) | id | num
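The top-down sequence is a leftmost derivation: at every step the leftmost nonterminal is replaced by one of its production bodies. The Python sketch below replays such a derivation given the list of bodies chosen at each step. It assumes the grammar e → e + t | t, t → t * f | f, f → ( e ) | id | num; the names `GRAMMAR`, `leftmost_derivation`, and `CHOICES` are illustrative. It does not decide which production to pick, which is the hard part a real top-down parser has to solve.

```python
GRAMMAR = {
    "e": [["e", "+", "t"], ["t"]],
    "t": [["t", "*", "f"], ["f"]],
    "f": [["(", "e", ")"], ["id"], ["num"]],
}


def leftmost_derivation(start, choices):
    """Apply each chosen body to the leftmost nonterminal; return all forms."""
    form = [start]
    forms = [" ".join(form)]
    for body in choices:
        # Find the leftmost symbol that is a nonterminal.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        if body not in GRAMMAR[form[i]]:
            raise ValueError(f"{form[i]} -> {' '.join(body)} is not a production")
        form[i:i + 1] = body
        forms.append(" ".join(form))
    return forms


# The production bodies used, in order, to derive ( num + num ) * num.
CHOICES = [
    ["t"], ["t", "*", "f"], ["f"], ["(", "e", ")"], ["e", "+", "t"],
    ["t"], ["f"], ["num"], ["f"], ["num"], ["num"],
]

if __name__ == "__main__":
    for form in leftmost_derivation("e", CHOICES):
        print(form)
```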