Title: Parsing Context Free Grammars
1Parsing Context Free Grammars
- CIS 530
- Introduction to NLP
- (adapted from slides for CIS 530 by Edward Loper)
2Context Free Parsing
- A Context Free Grammar (CFG) specifies a set of tree structures.
- Each rule in a CFG licenses a piece of tree structure:
  - VP → V NP
  - NP → Det N
- Context Free Parsing
- Given a CFG and a text, find all tree structures
- licensed by the CFG
- that span the text
- and whose root is S.
[Figure: the tree fragments licensed by the two rules, VP over V NP and NP over Det N.]
3Intuitions about Parsing
- How do people parse sentences by hand?
- Bottom-up: Look for small pieces that you know how to assemble, and work your way up to larger pieces.
- Top-down: Start at the top of the tree with an S node, and work your way down to the words.
4Parsing as a Search Problem (I)
- Search space
- A set of trees consistent with a given grammar
- Goal
- Find a single tree whose root is S
- and whose fringe is the sentence
- Size of search space
- Infinite (depends on the grammar)
5Structural Ambiguity Can Give Exponential Parses
- Example: I saw the man on the hill with the telescope
[Figure: the entities in the sentence, labeled Me, See, A man, The telescope, The hill.]
6
- S → NP VP
- VP → V NP
- VP → VP PP
- NP → Det N
- NP → Pronoun
- NP → NP PP
- PP → P NP
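To make the ambiguity concrete, the following is a minimal sketch that counts how many trees the grammar above licenses for the telescope sentence. The `LEXICON` mapping, the `BINARY`/`UNARY` split, and the function name `count_parses` are illustrative assumptions, not part of the slides.

```python
from functools import lru_cache

# Grammar from the slide: binary rules (A -> B C) and unary rules (A -> B).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),
    ("PP", "P", "NP"),
]
UNARY = [("NP", "Pronoun")]

# Hypothetical lexicon (not on the slide): one part-of-speech tag per word.
LEXICON = {
    "I": "Pronoun", "saw": "V", "the": "Det", "man": "N",
    "on": "P", "hill": "N", "with": "P", "telescope": "N",
}


def count_parses(words, root="S"):
    """Count the distinct trees rooted at `root` that span all of `words`."""
    tags = tuple(LEXICON[w] for w in words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # A part-of-speech tag covers exactly one word.
        total = 1 if j == i + 1 and tags[i] == symbol else 0
        # Unary rules cover the same span as their child.
        for lhs, child in UNARY:
            if lhs == symbol:
                total += count(child, i, j)
        # Binary rules split the span at every interior position k.
        for lhs, left, right in BINARY:
            if lhs == symbol:
                for k in range(i + 1, j):
                    total += count(left, i, k) * count(right, k, j)
        return total

    return count(root, 0, len(tags))


sentence = "I saw the man on the hill with the telescope".split()
print(count_parses(sentence))
```

Each extra prepositional phrase adds more attachment sites, so the count grows quickly with sentence length even though the grammar stays fixed.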
9Bottom Up Parsing
- Start with the words, and combine phrases until you find an S that spans the sentence.
- If the grammar contains A → B C, then we can combine a B and a C to form an A.
- A simple breadth-first algorithm:
  - Start with the set of trees that just contain each word.
  - If our set of trees contains adjacent trees rooted at B and C
  - And the grammar contains A → B C
  - Replace them with a single tree rooted at A, with B and C as its children.
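The breadth-first idea can be sketched as a recognizer that tracks only the root labels of the current trees. The toy `RULES` and `LEXICON` below are hypothetical, and a real parser would keep the trees themselves rather than just their labels.

```python
from collections import deque

# Hypothetical toy grammar: (left-hand side, right-hand side) pairs.
RULES = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP")),
    ("NP", ("Det", "N")),
    ("NP", ("Pronoun",)),
]
LEXICON = {"I": "Pronoun", "saw": "V", "the": "Det", "man": "N"}


def bottom_up_recognize(words, root="S"):
    """Breadth-first bottom-up search over sequences of tree roots."""
    start = tuple(LEXICON[w] for w in words)
    queue = deque([start])
    seen = {start}
    while queue:
        roots = queue.popleft()
        if roots == (root,):
            return True
        # Try every rule at every position in the current sequence.
        for lhs, rhs in RULES:
            n = len(rhs)
            for i in range(len(roots) - n + 1):
                if roots[i:i + n] == rhs:
                    reduced = roots[:i] + (lhs,) + roots[i + n:]
                    if reduced not in seen:
                        seen.add(reduced)
                        queue.append(reduced)
    return False


print(bottom_up_recognize("I saw the man".split()))
```

The `seen` set stops the search from revisiting a sequence it has already expanded, but the same sub-phrase can still be rebuilt in many different sequences.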
10Bottom Up Parsing Issue
- Inefficiency
  - Builds structures that are locally valid but not useful globally.
  - E.g. it will build an NP for "time flies" in "I know that time flies."
- Compare: "Time flies like an arrow." / "Fruit flies like a banana."
11Top-Down Parsing
- Start with the single tree S, and build downward until you reach the words.
- If the grammar contains A → B C, then we can expand an A to B and C.
- A simple depth-first algorithm:
  - Start with the tree S
  - If the grammar contains A → B C
  - And one of the leaves of our tree is A
  - Add B and C as children of A.
- Backtracking
  - If we get to a word, and it doesn't match the text,
  - then back up and try something else.
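A depth-first, backtracking version of this can be sketched with Python generators: when a part of speech fails to match the next word, the generator yields nothing and the caller moves on to its next alternative. The toy `GRAMMAR` and `LEXICON` are assumptions for illustration, and the grammar deliberately has no left-recursive rules.

```python
# Hypothetical toy grammar with no left-recursive rules.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Pronoun"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"I": "Pronoun", "saw": "V", "the": "Det", "man": "N"}


def expand(symbol, words, pos):
    """Yield (tree, next_pos) for each way `symbol` can start at `pos`."""
    if symbol not in GRAMMAR:
        # `symbol` is a part of speech: it must match the next word.
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            yield (symbol, words[pos]), pos + 1
        return  # on a mismatch nothing is yielded, so the caller backtracks
    for rhs in GRAMMAR[symbol]:
        yield from expand_sequence(symbol, rhs, words, pos, [])


def expand_sequence(parent, rhs, words, pos, children):
    """Expand the symbols of `rhs` left to right, depth first."""
    if not rhs:
        yield (parent, *children), pos
        return
    for child, next_pos in expand(rhs[0], words, pos):
        yield from expand_sequence(parent, rhs[1:], words, next_pos,
                                   children + [child])


def top_down_parse(words):
    """Return every S tree that covers the whole input."""
    return [tree for tree, end in expand("S", words, 0) if end == len(words)]


print(top_down_parse("I saw the man".split()))
```

Here the parser first tries NP → Det N for "I", fails at the Det, and only then tries NP → Pronoun.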
12Our Earlier Derivation was a Top Down Nondeterministic Parse
Who does Bill think Harry likes?
[Figure: top-down derivation of the sentence, built up step by step, with S, NP, V, and VP nodes over the words who, does, Bill, think, Harry, likes.]
20Top Down Parsing Issues
- "Left-recursive" rules can cause infinite loops
  - NP → DP N
  - DP → NP 's
- Explores trees that are inconsistent with the input
- Potential exponential time cost
- Redundant parsing of phrases.
  - "I saw the dog in the tall building behind the hill." (the dog was in the building)
  - "I saw the dog in the tall building behind the hill." (I was in the building)
21My aunt's cousin's friend's daughter's car
- NP → DP N
- DP → Det
- DP → NP 's
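One way to see why this grammar traps a naive top-down parser is to compute which nonterminals can reach themselves as a leftmost descendant. The sketch below is an illustrative check, not something from the slides; the function name `left_recursive_symbols` is an assumption, and it assumes no rule has an empty right-hand side.

```python
# The grammar from the slide, written as (left-hand side, right-hand side).
RULES = [
    ("NP", ("DP", "N")),
    ("DP", ("Det",)),
    ("DP", ("NP", "'s")),
]


def left_recursive_symbols(rules):
    """Return the nonterminals that can reach themselves as a leftmost child."""
    # left_corners[A] holds every symbol that can begin an expansion of A.
    left_corners = {lhs: set() for lhs, _ in rules}
    for lhs, rhs in rules:
        left_corners[lhs].add(rhs[0])
    changed = True
    while changed:  # transitive closure of the "begins with" relation
        changed = False
        for lhs, corners in left_corners.items():
            for symbol in list(corners):
                new = left_corners.get(symbol, set()) - corners
                if new:
                    corners |= new
                    changed = True
    return {lhs for lhs, corners in left_corners.items() if lhs in corners}


print(sorted(left_recursive_symbols(RULES)))
```

Neither rule is left-recursive on its own; the loop only appears through the chain NP → DP → NP, which is why the closure is needed.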
22Parsing Issues: Solutions
- Re-use the sub-parses we've already computed
- Combine top-down and bottom-up approaches
- Get the "best of both worlds"
- We need some common representation for the information from top-down and bottom-up approaches.
- Use heuristics or a clever algorithm to decide when to use bottom-up or top-down approaches.
23Parsing as a Search Problem (II)
- Search space: The set of phrasal extents
  - PhraseType_at_start:end
  - E.g. NP_at_0:2
- Goal
  - Find a set of paths through the search space
  - That don't overlap
  - And that connect S_at_0:n to each word.
- Size of search space: G·n² (G = size of the grammar, n = number of words)
- Time to search the space?
  - If we look at each phrasal extent once, G·n²
  - otherwise, it might be more (exponential)
24Chart Parsing
- Use a chart to record hypotheses about possible syntactic constituents.
- A chart contains a set of edges.
- Each edge represents a possible phrase.
- Edges provide a common representation for parse information.
25Edges
Edges can represent partial phrases and new hypotheses.
- Example sentence: I saw the man on the hill.
- Edge: PP → P • NP
  - PP starts here (at "on")
  - So far, we've found a P
  - We still need an NP
26Edges (continued)
- An edge consists of
  - S: A start index (1...n)
  - E: An end index (1...n)
  - Type: A phrase type (NP, PP, etc.)
  - Found: What we've found so far (list of phrase types)
  - Need: What we still need (list of phrase types)
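The five fields map naturally onto a small record type. The following is a sketch only: the class name `Edge`, the field names, and the use of 0-based positions between words are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Edge:
    start: int               # S: start index
    end: int                 # E: end index
    type: str                # phrase type (NP, PP, etc.)
    found: Tuple[str, ...]   # phrase types found so far
    need: Tuple[str, ...]    # phrase types still needed

    def is_complete(self):
        """An edge with nothing left to find is a finished phrase."""
        return not self.need

    def __str__(self):
        rhs = " ".join(self.found + ("•",) + self.need)
        return f"[{self.start}:{self.end}] {self.type} → {rhs}"


# The partial PP from "I saw the man on the hill": a P ("on") has been
# found between positions 4 and 5 (0-based, an assumption), and an NP
# is still needed.
edge = Edge(start=4, end=5, type="PP", found=("P",), need=("NP",))
print(edge)
```

Making the record frozen (immutable and hashable) lets edges be stored in a set, which is convenient when a chart must not hold duplicates.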
27Chart Parser Rules
- A chart parser rule adds new edges to the chart.
- Each chart parsing strategy defines a set of rules.
- Top down
  - top-down initialization rule
  - top-down rule
  - fundamental rule
- Bottom-up
  - bottom-up rule
  - fundamental rule
28The Fundamental Rule
- The fundamental rule is used by both top-down and bottom-up strategies.
- If the chart contains
  - A → α • C γ from i to j
  - C → β • from j to k
- Then add
  - A → α C • γ from i to k
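As a sketch, the fundamental rule is a function of two edges that returns a new edge with the dot moved one symbol to the right. The `Edge` tuple and the names `active` and `complete` are hypothetical, chosen here for illustration.

```python
from collections import namedtuple

# Hypothetical edge record: a dotted rule `lhs → found • need` over start:end.
Edge = namedtuple("Edge", "start end lhs found need")


def fundamental_rule(active, complete):
    """Combine A → α • C γ over i:j with C → β • over j:k.

    Returns A → α C • γ over i:k, or None if the edges do not fit.
    """
    if not active.need or complete.need:
        return None            # left edge must be incomplete, right edge complete
    if active.end != complete.start:
        return None            # the two edges must be adjacent
    if active.need[0] != complete.lhs:
        return None            # the complete edge must be the phrase needed next
    return Edge(
        start=active.start,
        end=complete.end,
        lhs=active.lhs,
        found=active.found + (complete.lhs,),
        need=active.need[1:],
    )


pp_edge = Edge(4, 5, "PP", ("P",), ("NP",))       # PP → P • NP over 4:5
np_edge = Edge(5, 7, "NP", ("Det", "N"), ())      # NP → Det N • over 5:7
print(fundamental_rule(pp_edge, np_edge))
```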
29Top-Down Rules
- Top-down initialization
  - For any rule S → α
  - Add S → • α to the left side of the chart (start = end = 0).
- Top-down rule
  - If the chart contains A → α • Y β ending at j
  - For each rule Y → γ
  - Add Y → • γ from j to j
30Bottom-Up Rules
- Bottom-Up Rule
  - If the chart contains A → α •
  - For each rule B → A β
  - Add B → • A β at the position where the A edge starts
31Chart Parsing Terminology
- A dotted rule is a CFG rule with a dot on the right hand side.
- A dotted rule is complete if its dot is at the end.
- Otherwise, a dotted rule is incomplete.
- An edge is a dotted rule at a location (start:end).
- An edge is complete if its dotted rule is complete.
- A chart is a set of edges.
- An agenda is a queue of edges to be extended.
32Chart Parsing Strategies
- Chart parser rules define the basic operations.
- A strategy defines what rules are applied when.
- Strategies modify the agenda
- The chart parser discussed earlier keeps applying every rule until no more edges are added.
- But we can avoid redundant work with better strategies. E.g.
  - Process edges in a fixed order
  - Use a queue, and examine each edge once
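The queue-based strategy can be sketched as an agenda loop that takes one edge at a time, adds it to the chart, and applies the bottom-up and fundamental rules to it exactly once. Everything named here (`Edge`, `RULES`, `LEXICON`, `bottom_up_chart`, `recognizes`) is a hypothetical illustration rather than the lecture's implementation.

```python
from collections import deque, namedtuple

Edge = namedtuple("Edge", "start end lhs found need")

# Hypothetical toy grammar and lexicon, for illustration only.
RULES = [("S", ("NP", "VP")), ("VP", ("V", "NP")), ("VP", ("VP", "PP")),
         ("NP", ("Det", "N")), ("NP", ("NP", "PP")), ("PP", ("P", "NP"))]
LEXICON = {"the": "Det", "man": "N", "dog": "N", "saw": "V", "on": "P", "hill": "N"}


def bottom_up_chart(words):
    """Run the bottom-up and fundamental rules from an agenda until it is empty."""
    chart = set()
    # Lexical edges: one complete edge per word.
    agenda = deque(Edge(i, i + 1, LEXICON[w], (w,), ()) for i, w in enumerate(words))
    while agenda:
        edge = agenda.popleft()
        if edge in chart:
            continue                      # each edge is examined only once
        chart.add(edge)
        new = []
        if not edge.need:
            # Bottom-up rule: propose B → • A β at the start of a complete A.
            for lhs, rhs in RULES:
                if rhs[0] == edge.lhs:
                    new.append(Edge(edge.start, edge.start, lhs, (), rhs))
            # Fundamental rule, with `edge` as the complete right-hand edge.
            for other in chart:
                if other.need and other.end == edge.start and other.need[0] == edge.lhs:
                    new.append(Edge(other.start, edge.end, other.lhs,
                                    other.found + (edge.lhs,), other.need[1:]))
        else:
            # Fundamental rule, with `edge` as the incomplete left-hand edge.
            for other in chart:
                if not other.need and other.start == edge.end and other.lhs == edge.need[0]:
                    new.append(Edge(edge.start, other.end, edge.lhs,
                                    edge.found + (other.lhs,), edge.need[1:]))
        agenda.extend(new)
    return chart


def recognizes(words, root="S"):
    """True if the chart holds a complete `root` edge over the whole input."""
    return any(e.lhs == root and not e.need and (e.start, e.end) == (0, len(words))
               for e in bottom_up_chart(words))


print(recognizes("the man saw the dog on the hill".split()))
```

Because the chart is a set, the left-recursive rules NP → NP PP and VP → VP PP cause no looping: a repeated proposal is simply dropped when it comes off the agenda.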
33Chart Parsing as a Matrix
- We can represent a chart as an upper triangular matrix.
- chart[i,j] is the set of dotted rules that span positions i to j
[Figure: the chart drawn as a matrix with row index i and column index j.]
34CKY (Cocke-Kasami-Younger)
- A bottom-up chart parsing strategy
- Requires a grammar in Chomsky Normal Form
  - Binary branching nonterminal rules
    - A → B C
  - Unary terminal rules
    - A → w
- First, add lexical edges for each word.
- Then, for each width w (2 to N)
  - Scan left to right, combining edges to form new edges with width w
35CKY Overview
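- Example sentence: The man ate a cookie
[Figure: the chart as a matrix with rows i and columns j indexed 0 to 5, filled one diagonal at a time for widths w = 2, 3, 4, 5.]
- First, add the lexical edges
- Then, for each w, add edges of length w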
36CKY Algorithm
- Add the lexical edges
- for w = 2 to N
  - for i = 0 to N-w
    - for k = 1 to w-1
      - if
        - A → B C and
        - B → α ∈ chart[i, i+k] and
        - C → β ∈ chart[i+k, i+w]
      - then
        - add A → B C to chart[i, i+w]
- If S ∈ chart[0, N], return the corresponding parse
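The three nested loops translate almost directly into Python. This is a minimal recognizer sketch under the assumption of a small hand-written CNF grammar; `BINARY`, `LEXICAL`, and `cky_recognize` are illustrative names, and the chart stores only category labels rather than full dotted rules.

```python
from collections import defaultdict

# Hypothetical grammar in Chomsky Normal Form.
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "Det", "N")]
LEXICAL = {"the": {"Det"}, "a": {"Det"}, "man": {"N"}, "cookie": {"N"}, "ate": {"V"}}


def cky_recognize(words, root="S"):
    """Fill chart[i, j] with every category that spans words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):          # add the lexical edges
        chart[i, i + 1] |= LEXICAL.get(word, set())
    for w in range(2, n + 1):                 # width of the span
        for i in range(0, n - w + 1):         # start of the span
            for k in range(1, w):             # size of the left child
                for a, b, c in BINARY:
                    if b in chart[i, i + k] and c in chart[i + k, i + w]:
                        chart[i, i + w].add(a)
    return root in chart[0, n]


print(cky_recognize("the man ate a cookie".split()))
```

Processing spans in order of increasing width guarantees that both smaller cells are already filled when a wider cell is computed.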
37CKY Result
- Use backpointers to remember what we combined
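One way to realize backpointers, sketched here under assumptions, is to store for each category in a cell the split point and child categories that produced it, then follow those records back from the root. The names `back`, `build`, and `cky_parse` are hypothetical, and this version keeps only the first derivation found for each category.

```python
# Hypothetical grammar in Chomsky Normal Form.
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "Det", "N")]
LEXICAL = {"the": "Det", "a": "Det", "man": "N", "cookie": "N", "ate": "V"}


def cky_parse(words, root="S"):
    """Return one parse tree as nested tuples, or None if there is no parse."""
    n = len(words)
    # back[i, j][A] records how A was built over i:j: a word, or (k, B, C).
    back = {(i, j): {} for i in range(n) for j in range(i + 1, n + 1)}
    for i, word in enumerate(words):
        if word in LEXICAL:
            back[i, i + 1][LEXICAL[word]] = word
    for w in range(2, n + 1):
        for i in range(0, n - w + 1):
            j = i + w
            for k in range(i + 1, j):
                for a, b, c in BINARY:
                    if b in back[i, k] and c in back[k, j]:
                        # Keep the first derivation found for each category.
                        back[i, j].setdefault(a, (k, b, c))

    def build(symbol, i, j):
        """Follow the backpointers down from `symbol` over i:j."""
        pointer = back[i, j][symbol]
        if isinstance(pointer, str):
            return (symbol, pointer)
        k, b, c = pointer
        return (symbol, build(b, i, k), build(c, k, j))

    return build(root, 0, n) if n and root in back[0, n] else None


print(cky_parse("the man ate a cookie".split()))
```

To recover every parse of an ambiguous sentence, each cell would need to keep a list of backpointers per category instead of a single one.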
38Left-to-Right BU Overview
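- Example sentence: The man ate a cookie
[Figure: the chart as a matrix with rows i and columns j indexed 0 to 5, filled one column at a time (Col. 2, Col. 3, Col. 4, and so on).]
- First, add the lexical edges
- Then scan left to right, combining edges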
39Left-to-Right BU Algorithm
- Add the lexical edges
- for j = 2 to N
  - for i = j-2 down to 0
    - for k = 1 to j-i-1
      - if
        - A → B C and
        - B → α ∈ chart[i, i+k] and
        - C → β ∈ chart[i+k, j]
      - then
        - add A → B C to chart[i, j]
- If S ∈ chart[0, N], return the corresponding parse
[Figure: the chart for "The man ate a cookie", marking the split point i+k between i and j.]
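A column-by-column sketch differs from the width-ordered version only in its loop order. Within column j the rows are visited from the bottom up, because the cell for i:j reads cells (i+k):j from the same column, and those must already be filled. The grammar and names below are hypothetical.

```python
from collections import defaultdict

# Hypothetical grammar in Chomsky Normal Form.
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "Det", "N")]
LEXICAL = {"the": {"Det"}, "a": {"Det"}, "man": {"N"}, "cookie": {"N"}, "ate": {"V"}}


def left_to_right_recognize(words, root="S"):
    """Fill the chart one column (end position j) at a time."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):            # add the lexical edges
        chart[i, i + 1] |= LEXICAL.get(word, set())
    for j in range(2, n + 1):                   # column: end of the span
        for i in range(j - 2, -1, -1):          # rows, from the bottom up
            for k in range(1, j - i):           # size of the left child
                for a, b, c in BINARY:
                    if b in chart[i, i + k] and c in chart[i + k, j]:
                        chart[i, j].add(a)
    return root in chart[0, n]


print(left_to_right_recognize("the man ate a cookie".split()))
```

This order lets the parser consume the sentence word by word, since column j only depends on words up to position j.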
40Earley's Algorithm
- Top-down chart parsing strategy
  - With bottom-up filtering
- Applicable with any grammar
- First, initialize with the top-down init rule
  - For every grammar rule S → α
  - Add S → • α to chart[0,0]
- Then, scan left to right, applying 3 rules
  - Predictor (top-down rule)
  - Scanner (fundamental rule on terminals)
  - Completer (fundamental rule on nonterminals)
41Earley's Algorithm: Rules
[Figure: diagrams of the four rules: Initialization, Predictor, Scanner, Completer.]
42Earley's Algorithm
- For each column (j), maintain a queue of edges.
- Initialization
  - For every grammar rule S → α, add S → • α to queue[0]
- Process queues from left to right (0 to N).
  - For each edge in the queue, apply one of 3 rules
    - If it's incomplete, and the next symbol after the dot is a preterminal (i.e., a part of speech tag), apply scanner.
    - If it's incomplete, and the next symbol after the dot is not a preterminal, apply predictor.
    - If it's complete, apply completer.
43Earley's Algorithm: Main
- For each rule S → α in the grammar
  - Add S → • α to chart[0,0]
- For i = 0 to N
  - for edge in queue[i]
    - if edge is incomplete and edge.next is a part of speech
      - scanner(edge)
    - if edge is incomplete and edge.next is not a POS
      - predictor(edge)
    - if edge is complete
      - completer(edge)
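The main loop can be sketched as a small recognizer with one queue per column. The grammar, the `POS` table, and the `Edge` tuple are hypothetical; the sketch assumes no empty rules and that part-of-speech tags never appear on the left-hand side of a rule.

```python
from collections import namedtuple

Edge = namedtuple("Edge", "start lhs found need")   # the end index is the column

# Hypothetical toy grammar; POS maps words to part-of-speech tags.
RULES = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("NP", ("NP", "PP")),
         ("VP", ("V", "NP")), ("VP", ("VP", "PP")), ("PP", ("P", "NP"))]
POS = {"the": "Det", "man": "N", "hill": "N", "saw": "V", "on": "P"}
TAGS = set(POS.values())


def earley_recognize(words, root="S"):
    """Return True if `words` can be derived from `root`."""
    n = len(words)
    queues = [[] for _ in range(n + 1)]

    def add(j, edge):
        if edge not in queues[j]:          # never queue the same edge twice
            queues[j].append(edge)

    for lhs, rhs in RULES:                 # top-down initialization
        if lhs == root:
            add(0, Edge(0, lhs, (), rhs))
    for j in range(n + 1):
        for edge in queues[j]:             # the list may grow while we iterate
            if edge.need and edge.need[0] in TAGS:
                # Scanner: consume word j if it has the part of speech we need.
                if j < n and POS.get(words[j]) == edge.need[0]:
                    add(j + 1, Edge(edge.start, edge.lhs,
                                    edge.found + edge.need[:1], edge.need[1:]))
            elif edge.need:
                # Predictor: start every rule for the next needed symbol at j.
                for lhs, rhs in RULES:
                    if lhs == edge.need[0]:
                        add(j, Edge(j, lhs, (), rhs))
            else:
                # Completer: advance every edge that was waiting for edge.lhs.
                for waiting in queues[edge.start]:
                    if waiting.need and waiting.need[0] == edge.lhs:
                        add(j, Edge(waiting.start, waiting.lhs,
                                    waiting.found + (edge.lhs,), waiting.need[1:]))
    return any(e.lhs == root and not e.need and e.start == 0 for e in queues[n])


print(earley_recognize("the man saw the man on the hill".split()))
```

The duplicate check in `add` is what keeps the left-recursive rules NP → NP PP and VP → VP PP from being predicted forever.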
44Earley's Algorithm: Predictor
- Input: an edge A → α • B β from i to j
- For each rule B → γ
- Add B → • γ from j to j
45Earley's Algorithm: Scanner
- Input: an edge A → α • B β from i to j
- For each rule B → w, where w is the word from j to j+1
- Add A → α B • β from i to j+1
46Earley's Algorithm: Completer
- Input: an edge B → γ • from i to j
- For each edge A → α • B β from k to i in queue[i]
- Add A → α B • β from k to j
48Pre-computing Predictor
- Predictor(A → α • B β, i, j)
  - For every grammar rule B → γ
    - add a rule B → • γ at j,j
- Note that this depends only on B and the grammar.
- We can save time by pre-computing predictor.
  - For every grammar rule B → γ
    - PREDICT[B].append(B → • γ)
  - Repeat until nothing new is added
    - For every B
      - For every C → • D δ in PREDICT[B]
        - For every grammar rule D → γ
          - PREDICT[B].append(D → • γ)
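The pre-computation is a fixed-point loop over the grammar. In this sketch each stored rule stands for the dotted rule with the dot at the very start; the toy `RULES` and the name `precompute_predict` are assumptions for illustration.

```python
# Hypothetical toy grammar: (left-hand side, right-hand side) pairs.
RULES = [("S", ("NP", "VP")), ("NP", ("Det", "N")), ("NP", ("NP", "PP")),
         ("VP", ("V", "NP")), ("PP", ("P", "NP"))]


def precompute_predict(rules):
    """Map each nonterminal B to every rule the predictor would add for B."""
    nonterminals = {lhs for lhs, _ in rules}
    # Start with the rules that expand B directly.
    predict = {b: [r for r in rules if r[0] == b] for b in nonterminals}
    changed = True
    while changed:                          # repeat until nothing new is added
        changed = False
        for b in nonterminals:
            for _, rhs in list(predict[b]):
                d = rhs[0]                  # symbol right after the initial dot
                for rule in rules:
                    if rule[0] == d and rule not in predict[b]:
                        predict[b].append(rule)
                        changed = True
    return predict


PREDICT = precompute_predict(RULES)
print(PREDICT["S"])
```

At parse time the predictor then becomes a table lookup: for an edge that needs B at position j, add every rule in `PREDICT[B]` as a fresh edge from j to j.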