Parsing - PowerPoint PPT Presentation

Parsing. Programming Language Concepts, Lecture 6. Prepared by Manuel E. Bermúdez, Ph.D., Associate Professor, University of Florida.

Slides: 34
Provided by: Manuel176
Learn more at: https://www.cise.ufl.edu

Transcript and Presenter's Notes

Title: Parsing


1
Parsing
Programming Language Concepts Lecture 6
  • Prepared by
  • Manuel E. Bermúdez, Ph.D.
  • Associate Professor
  • University of Florida

2
Context-Free Grammars
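  • Definition: A context-free grammar (CFG) is a
    quadruple G = (Φ, Σ, P, S), where Φ is the set of
    nonterminals, Σ the set of terminals, P the set of
    productions, and S ∈ Φ the start symbol. All
    productions are of the form A → ω, for A ∈ Φ and
    ω ∈ (Φ ∪ Σ)*.
  • Re-writing using grammar rules:
  • βAγ => βωγ if A → ω (derivation).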

3
String Derivations
  • Left-most derivation: At each step, the
    left-most nonterminal is re-written.
  • Right-most derivation: At each step, the
    right-most nonterminal is re-written.
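
The two derivation orders can be sketched in a few lines of Python. The toy grammar (E → E + T | T, T → i) and the helper names `leftmost`, `rightmost` and `derive` are illustrative assumptions, not part of the lecture; the sketch only shows that the same productions, applied in a different order, reach the same sentence.

```python
# Hypothetical toy grammar: E -> E + T | T,  T -> i
NONTERMINALS = {"E", "T"}

def leftmost(form):
    """Index of the left-most nonterminal in a sentential form, or None."""
    return next((i for i, s in enumerate(form) if s in NONTERMINALS), None)

def rightmost(form):
    """Index of the right-most nonterminal in a sentential form, or None."""
    positions = [i for i, s in enumerate(form) if s in NONTERMINALS]
    return positions[-1] if positions else None

def derive(start, steps, pick):
    """Apply each (lhs, rhs) production at the position chosen by pick."""
    form = [start]
    trace = [" ".join(form)]
    for lhs, rhs in steps:
        i = pick(form)
        assert i is not None and form[i] == lhs, "production does not apply"
        form = form[:i] + rhs + form[i + 1:]   # re-write one nonterminal
        trace.append(" ".join(form))
    return trace

E_PLUS = ("E", ["E", "+", "T"])
E_T = ("E", ["T"])
T_I = ("T", ["i"])

left_trace = derive("E", [E_PLUS, E_T, T_I, T_I], leftmost)
right_trace = derive("E", [E_PLUS, T_I, E_T, T_I], rightmost)
```

Both traces start at E and end at the sentence i + i; they differ only in the intermediate sentential forms.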

5
Derivation Trees
  • Derivation trees
  • Describe re-writes, independently of the order
    (left-most or right-most).
  • Each tree branch matches a production rule in the
    grammar.

7
Derivation Trees
  • Notes
  • Leaves are terminals.
  • Bottom contour is the sentence.
  • Left recursion causes left branching.
  • Right recursion causes right branching.
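
The note that the bottom contour of a derivation tree is the sentence can be illustrated with a small sketch. The tree encoding (a label paired with a list of children) and the names `leaf` and `frontier` are assumptions made for this example only.

```python
def leaf(symbol):
    """A terminal node: a label with no children."""
    return (symbol, [])

def frontier(tree):
    """Return the leaves of a derivation tree, read left to right."""
    label, children = tree
    if not children:
        return [label]
    leaves = []
    for child in children:
        leaves.extend(frontier(child))
    return leaves

# Derivation tree for "i + i" under the assumed grammar
# E -> E + T | T,  T -> i. The left-recursive rule makes it branch left.
tree = ("E", [
    ("E", [("T", [leaf("i")])]),
    leaf("+"),
    ("T", [leaf("i")]),
])

sentence = " ".join(frontier(tree))
```

Reading the leaves left to right recovers the sentence regardless of the order in which the re-writes were performed.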

8
Goal of Parsing
  • Examine input string, determine whether it's
    legal.
  • Equivalent to building derivation tree.
  • Added benefit: the tree embodies the syntactic
    structure of the input.
  • Therefore, the tree should be unique.

9
Ambiguous Grammars
  • Definition: A CFG is ambiguous if there exist
    two different right-most (or left-most, but not
    both) derivations for some sentence z.
  • (Equivalent) Definition: A CFG is ambiguous if
    there exist two different derivation trees for
    some sentence z.

10
Ambiguous Grammars
  • Classic ambiguities:
  • Simultaneous left/right recursion:
  • E → E + E
  •   → i
  • Dangling else problem:
  • S → if E then S
  •   → if E then S else S
  •   → ...
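
To make the first ambiguity concrete, the sketch below counts the distinct derivation trees of a token string, assuming the rule reads E → E + E | i. The function name `count_trees` is hypothetical; any count above one means the sentence is ambiguous under this grammar.

```python
from functools import lru_cache

def count_trees(tokens):
    """Count derivation trees of tokens under E -> E + E | i."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways E derives tokens[lo:hi].
        if hi - lo == 1:
            return 1 if tokens[lo] == "i" else 0
        total = 0
        # Try every '+' as the operator at the root of the tree.
        for k in range(lo + 1, hi - 1):
            if tokens[k] == "+":
                total += count(lo, k) * count(k + 1, hi)
        return total

    return count(0, len(tokens))
```

For example, `count_trees("i+i+i")` finds two trees, one grouping the first two operands and one grouping the last two, which is exactly the simultaneous left/right recursion problem.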

12
Operator Precedence and Associativity
  • Let's build a CFG for expressions consisting of:
  • elementary identifier i.
  • + and - (binary ops) have lowest precedence, and
    are left associative.
  • * and / (binary ops) have middle precedence, and
    are right associative.
  • + and - (unary ops) have highest precedence, and
    are right associative.

13
Corresponding Grammar for Expressions
  • E → E + T        E consists of T's,
  •   → E - T        separated by +'s and -'s
  •   → T            (lowest precedence).
  • T → F * T        T consists of F's,
  •   → F / T        separated by *'s and /'s
  •   → F            (next precedence).
  • F → - F          F consists of a single P,
  •   → + F          preceded by +'s and -'s
  •   → P            (next precedence).
  • P → '(' E ')'    P consists of a parenthesized E,
  •   → i            or a single i (highest precedence).

14
Operator Precedence and Associativity
  • Operator precedence
  • The lower in the grammar, the higher the
    precedence.
  • Operator Associativity
  • Tie breaker for precedence.
  • Left recursion in the grammar means
  • left associativity of the operator,
  • left branching in the tree.
  • Right recursion in the grammar means
  • right associativity of the operator,
  • right branching in the tree.

15
Building Derivation Trees
  • Sample Input
  • - + i - i * ( i + i ) / i + i
  • (Human) derivation tree construction
  • Bottom-up.
  • On each pass, scan entire expression, process
    operators with highest precedence (parentheses
    are highest).
  • Lowest precedence operators are last, at the top
    of tree.

17
Abstract Syntax Trees
  • AST is a condensed version of the derivation
    tree.
  • No noise (intermediate nodes).
  • String-to-tree transduction grammar
  • rules of the form A → ω => 's'.
  • Build 's' tree node, with one child per tree from
    each nonterminal in ω.

18
Example
  • E → E + T        => +
  •   → E - T        => -
  •   → T
  • T → F * T        => *
  •   → F / T        => /
  •   → F
  • F → - F          => neg
  •   → + F          => +
  •   → P
  • P → '(' E ')'
  •   → i            => i
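
One possible way to realize this transduction grammar in code is a small hand-written recursive-descent parser, sketched below. It assumes the operators are +, -, * and /, that tokens arrive as a list of strings, and that tree nodes are nested tuples; the function name `parse_expression` and this encoding are illustrative choices, not the lecture's method. The left-recursive E rules become a loop (left branching), while the right-recursive T and F rules stay recursive (right branching).

```python
def parse_expression(tokens):
    """Build an AST (nested tuples) from tokens such as ['i', '+', 'i']."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def parse_e():
        # E -> E + T | E - T | T : left recursion, realized as a loop.
        node = parse_t()
        while peek() in ("+", "-"):
            op = peek()
            advance()
            node = (op, node, parse_t())
        return node

    def parse_t():
        # T -> F * T | F / T | F : right recursion.
        left = parse_f()
        if peek() in ("*", "/"):
            op = peek()
            advance()
            return (op, left, parse_t())
        return left

    def parse_f():
        # F -> - F => neg | + F => + | P
        if peek() == "-":
            advance()
            return ("neg", parse_f())
        if peek() == "+":
            advance()
            return ("+", parse_f())
        return parse_p()

    def parse_p():
        # P -> '(' E ')' | i ; parentheses leave no node in the AST.
        if peek() == "(":
            advance()
            node = parse_e()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if peek() == "i":
            advance()
            return "i"
        raise SyntaxError(f"unexpected token {peek()!r}")

    tree = parse_e()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

For instance, `parse_expression("i - i - i".split())` groups to the left, while `parse_expression("i / i / i".split())` groups to the right, matching the associativity encoded in the grammar.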

19

Sample Input: - + i - i * ( i + i ) / i + i
20
String-to-Tree Transduction
  • We transduce from vocabulary of input symbols, to
    vocabulary of tree node names.
  • Could eliminate construction of unary + node,
    anticipating semantics.
  • F → - F          => neg
  •   → + F          // no more unary + node
  •   → P

21
The Game of Syntactic Dominoes
  • The grammar
  • E → E+T    T → P*T    P → (E)
  •   → T        → P        → i
  • The playing pieces: An arbitrary supply of each
    piece (one per grammar rule).
  • The game board
  • Start domino at the top.
  • Bottom dominoes are the "input."

23
The Game of Syntactic Dominoes
  • Game rules
  • Add game pieces to the board.
  • Match the flat parts and the symbols.
  • Lines are infinitely elastic.
  • Object of the game
  • Connect start domino with the input dominoes.
  • Leave no unmatched flat parts.

24
Parsing Strategies
  • Same as for the game of syntactic dominoes.
  • Top-down parsing: start at the start symbol,
    work toward the input string.
  • Bottom-up parsing: start at the input string,
    work toward the goal symbol.
  • In either strategy, can process the input
    left-to-right (→) or right-to-left (←).

25
Top-Down Parsing
  • Attempt a left-most derivation, by predicting the
    re-write that will match the remaining input.
  • Use a string (a stack, really) from which the
    input can be derived.

26
Top-Down Parsing
  • Start with S on the stack.
  • At every step, two alternatives:
  • The stack begins with a terminal t. Match t
    against the first input symbol.
  • The stack begins with a nonterminal A. Consult an
    OPF (Omniscient Parsing Function) to determine
    which production for A would lead to a match with
    the first symbol of the input.
  • The OPF does the "predicting" in such a
    predictive parser.

28
Classical Top-Down Parsing Algorithm
    Push (Stack, S)
    while not Empty (Stack) do
      if Top(Stack) ∈ Σ
        then if Top(Stack) = Head(input)
               then input := tail(input)
                    Pop(Stack)
               else error (Stack, input)
        else P := OPF (Stack, input)
             Push (Pop(Stack), RHS(P))
    od
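
A minimal Python sketch of this stack-driven loop follows. Several details are assumptions made for illustration: the grammar (S → A, A → b A d | empty), the `$` end-of-input marker pushed under the start symbol, and the representation of the OPF as a dictionary keyed on (top of stack, first input symbol). It returns False where the pseudocode calls `error`.

```python
END = "$"  # assumed end-of-input marker

# Hypothetical grammar: S -> A,  A -> b A d | (empty)
TERMINALS = {"b", "d", END}
OPF = {
    ("S", "b"): ["A"],
    ("S", END): ["A"],
    ("A", "b"): ["b", "A", "d"],
    ("A", "d"): [],
    ("A", END): [],
}

def parse(tokens, start="S"):
    """Return True if tokens can be derived from start, else False."""
    remaining = list(tokens) + [END]
    stack = [END, start]                 # top of stack is the end of the list
    while stack:
        top = stack[-1]
        head = remaining[0]
        if top in TERMINALS:
            if top != head:
                return False             # error(Stack, input)
            remaining.pop(0)             # input := tail(input)
            stack.pop()                  # Pop(Stack)
        else:
            rhs = OPF.get((top, head))   # P := OPF(Stack, input)
            if rhs is None:
                return False             # no production predicted
            stack.pop()                  # replace the nonterminal by RHS(P),
            stack.extend(reversed(rhs))  # left-most symbol ends up on top
    return not remaining
```

With this table, `parse("bbdd")` succeeds and `parse("b")` fails, since every b pushed by A → b A d must later be matched by a d.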

30
Top-Down Parsing
  • Most parsing methods impose bounds on the amount
    of stack lookback and input lookahead. For
    programming languages, a common choice is (1,1).
  • We must define OPF (A,t), where A is the top
    element of the stack, and t is the first symbol
    on the input.
  • Storage requirements: O(n²), where n is the size
    of the grammar vocabulary (a few hundred).

31
LL(1) Grammars
  • Definition
  • A CFG G is LL(1) (Left-to-right, Left-most,
    one-symbol lookahead)
  • iff for all A ∈ Φ, and for all A → α, A → β, α ≠ β,
  • Select (A → α) ∩ Select (A → β) = ∅
  • Previous example: Grammar is not LL(1).
  • More later on why, and what to do about it.

32
Example
  • Production      Select set
  • S → A           {b, ⊥}
  • A → bAd         {b}
  •   → ε           {d, ⊥}

Disjoint! Grammar is LL(1)!
        d          b            ⊥
S                  S → A        S → A
A       A → ε      A → bAd      A → ε
(At most) one production per entry.
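
The disjointness test and the resulting table can be sketched as follows. The Select sets are supplied by hand rather than computed, `$` stands in for the end-of-input symbol, and the grammar (S → A, A → b A d | empty) and the names `is_ll1` and `build_table` are assumptions made for this example.

```python
from itertools import combinations

END = "$"  # assumed end-of-input marker

# Select sets given by hand for the hypothetical grammar
# S -> A,  A -> b A d | (empty). Keys are (left side, right side).
SELECT = {
    ("S", ("A",)): {"b", END},
    ("A", ("b", "A", "d")): {"b"},
    ("A", ()): {"d", END},
}

def is_ll1(select):
    """True if Select sets of productions with the same left side are disjoint."""
    for (lhs1, rhs1), (lhs2, rhs2) in combinations(select, 2):
        if lhs1 == lhs2 and select[(lhs1, rhs1)] & select[(lhs2, rhs2)]:
            return False
    return True

def build_table(select):
    """Map (nonterminal, lookahead) to the productions selected there."""
    table = {}
    for (lhs, rhs), lookaheads in select.items():
        for t in lookaheads:
            table.setdefault((lhs, t), []).append(rhs)
    return table

# A left-recursive pair such as E -> E + T and E -> T shares its
# lookahead symbols, so the same check rejects it.
CONFLICT = {
    ("E", ("E", "+", "T")): {"i", "("},
    ("E", ("T",)): {"i", "("},
}
```

When `is_ll1` holds, every entry produced by `build_table` contains exactly one production, which is the "at most one production per entry" property.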