Earley's Algorithm: General Context-Free Parsing

Provided by: paulhil

1
Earley's Algorithm: General Context-Free Parsing
  • Lecture 12
  • P. N. Hilfinger

2
Parsing General Context-Free Grammars
  • Shift-reduce parsing can work for most practical
    applications.
  • However, one must sometimes munge the grammar,
    though not as much as LL(1).
  • Cannot handle ambiguity, nor situations where
    resolving ambiguities requires looking far ahead.
  • Today, we'll look at a method that can: Earley's
    Algorithm.
  • In fact, shift-reduce parsing is a highly
    optimized special case of this algorithm.

3
Earley's Algorithm: Basic Idea
  • Scan tokens left-to-right.
  • At each point, keep track of all possible
    subtrees that could include the current point in
    the input, based on everything seen so far.
  • At the end of the input, if there is a tree that
    is rooted at the start symbol, we've found a
    parse (possibly many).

4
Some Notation
  • If the input is s = s₁s₂…sₙ, then position k in the
    input is just after sₖ and before sₖ₊₁, with
    position 0 at the beginning and position n at the
    end.
  • At each input position, k, compute a set of
    items, where each item has the form
        A → α • β, m
    where A → αβ is a production and 0 ≤ m ≤ k.
  • Together, the items in the set describe all
    subtrees of possible parse trees that begin or
    end at position k or have a child that does.

5
Meaning of an Item
  • An item A → α • β, m at position k means:
  • The input between positions m and k matches α.
  • Depending on what sₖ₊₁…sₙ is, there might be a
    subtree formed from production A → αβ in the (or
    a) parse tree for the entire string.
  • So when β is empty, the item means that there is a
    possible handle for A → α that ends at k.
  • So that leaves the problem of figuring out what
    items to put in each set.
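
One way to hold such an item in a program is sketched below in Python. The class name Item and its field and method names are assumptions made for illustration; the slides define only the notation.

```python
from typing import NamedTuple, Optional, Tuple

class Item(NamedTuple):
    """An item A → α • β, m. All names here are illustrative."""
    lhs: str               # A
    rhs: Tuple[str, ...]   # the symbols of αβ, in order
    dot: int               # number of symbols in α (those left of the dot)
    start: int             # m, the position where the match of α began

    def next_symbol(self) -> Optional[str]:
        """First symbol of β, or None when β is empty."""
        return self.rhs[self.dot] if self.dot < len(self.rhs) else None

    def is_complete(self) -> bool:
        """True when β is empty, i.e. a possible handle ends here."""
        return self.dot == len(self.rhs)

    def show(self) -> str:
        """Render the item in the notation used on the slides."""
        parts = list(self.rhs[:self.dot]) + ["•"] + list(self.rhs[self.dot:])
        return f"{self.lhs} → {' '.join(parts)}, {self.start}"

item = Item("E", ("E", "+", "T"), 1, 0)
print(item.show())
```

Using an immutable tuple means items can be stored directly in a Python set, which gives the "item set" its no-duplicates behaviour for free.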

6
Example
  • Grammar:
        E → E + T    E → T
        T → T * int  T → int
  • Input:
        0 int 1 + 2 int 3 * 4 int 5
  • At position 0, we expect to see an E to our
    right, formed from one of E's productions.
  • Plus, since an E can start with a T, we won't be
    surprised by a T formed from one of its
    productions.

7
Example: Getting Started
Item set 0 (before the first int):
    E → • T, 0      E → • E + T, 0
    T → • int, 0    T → • T * int, 0
Start with items for the start symbol E and (since E can
start with T) also add items for T.
8
Closure Items
  • Whenever we have an item B → α • A β, j in item set
    m, it indicates that a substring producing A
    might start at this position.
  • That's what the item A → • γ, m means, so we also
    add those items (for each production A → γ) to
    item set m.
  • These are called closure items.
  • Other items are kernel items.
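
The closure step can be sketched as a small worklist loop. This is a minimal Python sketch, not the lecture's code: the Item tuple, the GRAMMAR dictionary layout (nonterminal mapped to a list of right-hand sides) and the function name closure are all assumptions for illustration.

```python
from typing import NamedTuple, Tuple

class Item(NamedTuple):
    lhs: str               # A
    rhs: Tuple[str, ...]   # right-hand side symbols
    dot: int               # how many symbols lie left of the dot
    start: int             # starting position of the item

# Hypothetical encoding of the example grammar: nonterminal -> productions.
GRAMMAR = {"E": [("T",), ("E", "+", "T")],
           "T": [("int",), ("T", "*", "int")]}

def closure(kernel, m, grammar):
    """Add closure items to item set m until nothing new appears."""
    result = set(kernel)
    work = list(kernel)
    while work:
        it = work.pop()
        # Is the dot in front of a nonterminal A?
        if it.dot < len(it.rhs) and it.rhs[it.dot] in grammar:
            a = it.rhs[it.dot]
            for rhs in grammar[a]:
                new = Item(a, rhs, 0, m)   # A → • γ, m
                if new not in result:
                    result.add(new)
                    work.append(new)
    return result

# Item set 0: start from the productions of the start symbol E.
set0 = closure({Item("E", rhs, 0, 0) for rhs in GRAMMAR["E"]}, 0, GRAMMAR)
print(len(set0))
```

The worklist matters because a closure item can itself have its dot in front of a nonterminal, which calls for further closure items.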

9
Example: Computing next item set
Item set 0 (before the first int):
    E → • T, 0      E → • E + T, 0
    T → • int, 0    T → • T * int, 0
Item set 1 (after the first int):
    T → int •, 0
    T → T • * int, 0    E → T •, 0
    E → E • + T, 0
10
Computing next item set
  • For each item of the form A → α • c β, k in item
    set m, where c = sₘ₊₁ is the next input symbol,
    insert A → α c • β, k in item set m+1.
  • For each complete item, A → γ •, k in item set m+1,
    and each item B → α • A β, j back in item set k,
    add item B → α A • β, j to item set m+1. (When
    creating a parse tree, the A in this new item
    will have children γ, as denoted by dashed
    red arrows in our examples).
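
The two rules above can be sketched as one Python function that produces the kernel items of the next set. The names Item and next_item_set are hypothetical, item set 0 is written out by hand, and the sketch assumes a grammar without empty productions, so every complete item started in an earlier set. Closure items would still have to be added to the result afterwards.

```python
from typing import NamedTuple, Tuple

class Item(NamedTuple):
    lhs: str
    rhs: Tuple[str, ...]
    dot: int
    start: int

def next_item_set(sets, token):
    """Kernel items of set m+1, given sets[0..m] and the next input token."""
    new = set()
    # Rule 1: move the dot over the terminal that matches the token.
    for it in sets[-1]:
        if it.dot < len(it.rhs) and it.rhs[it.dot] == token:
            new.add(it._replace(dot=it.dot + 1))
    # Rule 2: for each complete item A → γ •, k, advance every item in
    # set k whose dot is in front of A. Repeat for newly completed items.
    work = list(new)
    while work:
        done = work.pop()
        if done.dot < len(done.rhs):
            continue                      # not complete, nothing to do
        for b in sets[done.start]:
            if b.dot < len(b.rhs) and b.rhs[b.dot] == done.lhs:
                adv = b._replace(dot=b.dot + 1)
                if adv not in new:
                    new.add(adv)
                    work.append(adv)
    return new

# Item set 0 of the running example, entered by hand.
set0 = {Item("E", ("T",), 0, 0), Item("E", ("E", "+", "T"), 0, 0),
        Item("T", ("int",), 0, 0), Item("T", ("T", "*", "int"), 0, 0)}
set1 = next_item_set([set0], "int")
for it in sorted(set1):
    print(it)
```

Run on the first token of the example, this yields the four items of item set 1.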

11
Continuing the Example, Set 2

Item set 1 (after the first int, before +):
    T → int •, 0        T → T • * int, 0
    E → T •, 0          E → E • + T, 0
Item set 2 (after +):
    E → E + • T, 0
    closure items:  T → • T * int, 2    T → • int, 2
12
Continuing the Example, Set 3
Item set 2 (after +, before the second int):
    E → E + • T, 0
    T → • T * int, 2    T → • int, 2
Item set 3 (after the second int):
    T → int •, 2        T → T • * int, 2
    E → E + T •, 0
    E → E • + T, 0      (from item set 0)
13
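Continuing the Example, Sets 4 & 5
Item set 3 (after the second int, before *):
    T → int •, 2        T → T • * int, 2
    E → E + T •, 0      E → E • + T, 0
Item set 4 (after *):
    T → T * • int, 2
Item set 5 (after the last int):
    T → T * int •, 2    T → T • * int, 2
    E → E + T •, 0      ACCEPT!
    E → E • + T, 0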
14
Accepting the String
  • In the last item set, have a completed item for
    the start symbol that started in set 0.
  • That means the input between 0 and end matches
    an entire production for the start symbol, so
    the string parses correctly.

15
Retrieving a Parse Tree or Derivation
  • Start with a completed item in the last set that
    produces the whole input (has form S → γ •, 0 for
    start symbol S).
  • Follow the red arrows to find how to expand that
    symbol.
  • Work backwards through the sets to find the
    expansions of the other nonterminals.

16
Getting a Tree from our Example (I)
Item set 5 (after the last int):
    T → T * int •, 2
    T → T • * int, 2
    E → E + T •, 0      (start here)
    E → E • + T, 0
Tree so far: E expands to E + T, and that T expands to T * int.
To find out how to expand this T, go back to
chart 3 (before * int).
17
Getting a Tree from our Example (II)
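Item set 3 (after the second int):
    T → int •, 2
    T → T • * int, 2
    E → E + T •, 0
    E → E • + T, 0
Tree so far: E expands to E + T, the T expands to T * int,
and the inner T expands to int.
To find out how to expand this E, go back to
chart 1 (before +).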
18
Figuring out Where to Look
  • In the last slide, we had to figure out where to
    look for the derivation of the E in E + T.
  • We used the items
        T → T • * int, 2 and T → int •, 2
    to get the T in E + T, both of which tell us
    that the T started after item set 2.
  • And since + is a terminal, we then have to go
    back one more.
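
The backward walk can be sketched in Python as follows. This is a sketch under stated assumptions, not the lecture's implementation: instead of storing the red arrows, it searches the item sets again for a completed item that fits. It also assumes a grammar with no empty productions and returns only the first tree it finds. The names Item, earley_sets and parse_tree are hypothetical.

```python
from typing import NamedTuple, Tuple

class Item(NamedTuple):
    lhs: str
    rhs: Tuple[str, ...]
    dot: int
    start: int

GRAMMAR = {"E": [("E", "+", "T"), ("T",)],
           "T": [("T", "*", "int"), ("int",)]}

def earley_sets(grammar, start, tokens):
    """Build all item sets (assumes no empty productions)."""
    sets = [set() for _ in range(len(tokens) + 1)]
    sets[0] = {Item(start, rhs, 0, 0) for rhs in grammar[start]}
    for m in range(len(tokens) + 1):
        if m > 0:  # scan the token between positions m-1 and m
            for it in sets[m - 1]:
                if it.dot < len(it.rhs) and it.rhs[it.dot] == tokens[m - 1]:
                    sets[m].add(it._replace(dot=it.dot + 1))
        work = list(sets[m])
        while work:
            it = work.pop()
            if it.dot == len(it.rhs):          # complete item
                new = [b._replace(dot=b.dot + 1) for b in sets[it.start]
                       if b.dot < len(b.rhs) and b.rhs[b.dot] == it.lhs]
            elif it.rhs[it.dot] in grammar:    # closure items
                new = [Item(it.rhs[it.dot], rhs, 0, m)
                       for rhs in grammar[it.rhs[it.dot]]]
            else:
                new = []
            for n in new:
                if n not in sets[m]:
                    sets[m].add(n)
                    work.append(n)
    return sets

def parse_tree(grammar, start, tokens):
    """Return one parse tree as nested (symbol, children) tuples, or None."""
    sets = earley_sets(grammar, start, tokens)

    def children(item, end):
        # Subtrees for the symbols left of the dot; item lies in sets[end].
        if item.dot == 0:
            return []
        sym = item.rhs[item.dot - 1]
        prev = item._replace(dot=item.dot - 1)
        if sym not in grammar:                 # terminal: go back one position
            return children(prev, end - 1) + [sym]
        for c in sets[end]:                    # completed item for sym ending here
            if c.lhs == sym and c.dot == len(c.rhs) and prev in sets[c.start]:
                return children(prev, c.start) + [(sym, children(c, end))]
        raise ValueError("inconsistent item sets")

    for it in sets[-1]:
        if it.lhs == start and it.dot == len(it.rhs) and it.start == 0:
            return (start, children(it, len(tokens)))
    return None

tree = parse_tree(GRAMMAR, "E", ["int", "+", "int", "*", "int"])
print(tree)
```

The starting position stored in each completed item is what tells children which earlier set to return to, just as the ", 2" did in the items above.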

19
Getting a Tree from our Example (III)
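Item set 1 (after the first int):
    T → int •, 0
    T → T • * int, 0
    E → T •, 0          (start here)
    E → E • + T, 0
Completed tree: E expands to E + T; the left E expands to T,
which expands to int; the right T expands to T * int, whose
inner T expands to int.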
20
An Ambiguous Grammar (I)
  • Grammar:
        E → E + E    E → E * E    E → int
  • Input:
        0 int 1 + 2 int 3 * 4 int 5
Item set 0:
    E → • int, 0    E → • E + E, 0    E → • E * E, 0
Item set 1 (after the first int):
    E → int •, 0    E → E • + E, 0    E → E • * E, 0
21
An Ambiguous Grammar (II)
Input: 1 + 2 int 3
Item set 1:
    E → int •, 0    E → E • + E, 0    E → E • * E, 0
Item set 2 (after +):
    E → E + • E, 0
    E → • int, 2    E → • E + E, 2    E → • E * E, 2
Item set 3 (after the second int):
    E → int •, 2    E → E • + E, 2    E → E • * E, 2
    E → E + E •, 0  E → E • + E, 0    E → E • * E, 0
22
An Ambiguous Grammar (III)
Input: 3 * 4 int 5
Item set 3:
    E → int •, 2    E → E • + E, 2    E → E • * E, 2
    E → E + E •, 0  E → E • + E, 0    E → E • * E, 0
Item set 4 (after *):
    E → E * • E, 2  E → E * • E, 0
    E → • int, 4    E → • E + E, 4    E → • E * E, 4
Item set 5 (after the last int):
    E → int •, 4    E → E * E •, 2    E → E * E •, 0
    E → E • + E, 4  E → E • * E, 4    E → E + E •, 0
There are two ways to produce the E starting at
0, reflecting ambiguity.
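
As a cross-check on the count of two, the sketch below counts parse trees by brute force over all ways of splitting the input. This is plainly not Earley's algorithm, just a memoized counter. It assumes a grammar with no empty productions and no cycles of single-symbol productions, and the function name count_parses is hypothetical.

```python
from functools import lru_cache

def count_parses(grammar, start, tokens):
    """Count the parse trees of tokens rooted at start (brute force)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def sym(x, i, j):
        # Number of ways symbol x derives tokens[i:j].
        if x not in grammar:  # terminal: must be exactly one matching token
            return 1 if j == i + 1 and tokens[i] == x else 0
        return sum(seq(rhs, i, j) for rhs in grammar[x])

    @lru_cache(maxsize=None)
    def seq(rhs, i, j):
        # Number of ways the symbol sequence rhs derives tokens[i:j].
        if not rhs:
            return 1 if i == j else 0
        rest = rhs[1:]
        # Every symbol needs at least one token (no empty productions),
        # so the first symbol ends somewhere in i+1 .. j-len(rest).
        return sum(sym(rhs[0], i, k) * seq(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return sym(start, 0, len(tokens))

AMBIGUOUS = {"E": [("E", "+", "E"), ("E", "*", "E"), ("int",)]}
UNAMBIGUOUS = {"E": [("E", "+", "T"), ("T",)],
               "T": [("T", "*", "int"), ("int",)]}
tokens = ["int", "+", "int", "*", "int"]
print(count_parses(AMBIGUOUS, "E", tokens))
print(count_parses(UNAMBIGUOUS, "E", tokens))
```

The two trees for the ambiguous grammar correspond to the two completed items E → E + E •, 0 and E → E * E •, 0 in item set 5.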
23
Just for Fun
Grammar is ferociously ambiguous: produces ε in an
infinite number of ways!
Grammar:
    E → ε    E → E E
Item set 0:
    E → •, 0    E → • E E, 0    E → E • E, 0    E → E E •, 0
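
Even though this grammar derives the empty string in infinitely many ways, item set 0 stays finite, because a set never holds the same item twice. The Python sketch below computes that set by repeating the closure and completion rules until nothing changes. The fixed-point loop is an assumption of this sketch (the slides do not say how empty productions are processed), and the names Item and item_set_zero are hypothetical.

```python
from typing import NamedTuple, Tuple

class Item(NamedTuple):
    lhs: str
    rhs: Tuple[str, ...]
    dot: int
    start: int

# E → ε is encoded as an empty right-hand side.
GRAMMAR = {"E": [(), ("E", "E")]}

def item_set_zero(grammar, start):
    """Item set 0, allowing empty productions (every item starts at 0)."""
    items = {Item(start, rhs, 0, 0) for rhs in grammar[start]}
    changed = True
    while changed:
        changed = False
        for it in list(items):
            if it.dot == len(it.rhs):
                # Complete item: advance items in this same set that
                # have their dot in front of its left-hand side.
                new = {b._replace(dot=b.dot + 1) for b in items
                       if b.dot < len(b.rhs) and b.rhs[b.dot] == it.lhs}
            elif it.rhs[it.dot] in grammar:
                # Closure items for the nonterminal after the dot.
                new = {Item(it.rhs[it.dot], rhs, 0, 0)
                       for rhs in grammar[it.rhs[it.dot]]}
            else:
                new = set()
            if not new <= items:
                items |= new
                changed = True
    return items

set0 = item_set_zero(GRAMMAR, "E")
print(len(set0))
```

A single pass is not enough here: E → •, 0 is complete in the very set it was added to, so items that are added later must be checked against it again.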
24
Relationship to LR Shift-Reduce Parsing
  • With an LR(1) grammar, never have item sets where
    two items have the same production, with the dot
    in the same place, but different starting
    positions.
  • So, ignoring the starting positions, there is a
    finite number of possible item sets.
  • These are the states in the shift-reduce parser.