Title: Lecture 13 Parsing and Ambiguity
Given a string x and a CFG G = (V, Σ, R, S), determine whether x ∈ L(G) and,
if x ∈ L(G), find a derivation S ⇒* x.
This problem is called parsing.
To solve the parsing problem, we first study the
parse tree.
The parse tree is the graph representation of a
derivation. It can be defined in the following
way:
(1) A single vertex labeled with a nonterminal
symbol is a parse tree.
(2) If A → y1 y2 ... yn is a rule in R, then the tree
         A
      /  |    \
    y1   y2 ... yn
is a parse tree.
(3) If A → ε is a rule in R, then
    A
    |
    ε
is a parse tree.
(4) If a parse tree has a leaf which is the
root of another parse tree, then their
union is a parse tree.
(5) Nothing else is a parse tree.
Each derivation has a parse tree.
Consider the CFG G = ({S}, {a, b, c}, R, S) where
R = {S → SbS | ScS | a}.
The derivation S ⇒ SbS ⇒ SbScS ⇒ abScS ⇒ abSca ⇒ abaca
has the following parse tree.
        S
      / | \
     S  b  S
     |   / | \
     a  S  c  S
        |     |
        a     a
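As a rough illustration of the definition, the parse tree of the derivation S ⇒ SbS ⇒ SbScS ⇒ abScS ⇒ abSca ⇒ abaca can be written as a nested data structure. This is only one possible encoding: the names Node, leaf and tree_yield are hypothetical and do not come from the lecture.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A parse-tree vertex: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def leaf(symbol: str) -> Node:
    """A vertex without children (a terminal, or "" for the empty string)."""
    return Node(symbol)


def tree_yield(node: Node) -> str:
    """Concatenate the leaf labels from left to right."""
    if not node.children:
        return node.label
    return "".join(tree_yield(child) for child in node.children)


# Root S -> S b S; the left S -> a; the right S -> S c S with both S -> a.
tree = Node("S", [
    Node("S", [leaf("a")]),
    leaf("b"),
    Node("S", [
        Node("S", [leaf("a")]),
        leaf("c"),
        Node("S", [leaf("a")]),
    ]),
])

print(tree_yield(tree))  # abaca
```

Reading the leaves from left to right recovers the derived string, which is why the tree records what was derived but not the order in which the rules were applied.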
However, a parse tree may correspond to several
derivations.
For example, the derivation S ⇒ SbS ⇒ SbScS ⇒
SbSca ⇒ abSca ⇒ abaca has the same parse
tree as above.
Leftmost derivation
A derivation S ⇒* y is called a leftmost
derivation, written S ⇒*_left y, if y is obtained
from S by a sequence of steps, each of which applies
a rule to the leftmost nonterminal symbol.
Example (every step is a leftmost step):
S ⇒_left SbS ⇒_left abS ⇒_left abScS ⇒_left abacS ⇒_left abaca
Each parse tree corresponds to exactly
one leftmost derivation.
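To illustrate why a parse tree determines exactly one leftmost derivation, the following sketch reads the derivation off a tree by always expanding the leftmost unexpanded nonterminal. The tuple encoding of trees and the names render and leftmost_derivation are assumptions made for this illustration.

```python
def render(form):
    """Turn a list of items into the sentential form it stands for."""
    return "".join(item[0] if isinstance(item, tuple) else item for item in form)


def leftmost_derivation(tree):
    """Return the sentential forms of the leftmost derivation encoded by `tree`.

    A tree is either a terminal string or a pair (nonterminal, [subtrees]).
    """
    # Unexpanded nonterminals are kept as (label, children) pairs.
    form = [tree]
    steps = [render(form)]
    while True:
        # Position of the leftmost nonterminal that is not yet expanded.
        idx = next((i for i, item in enumerate(form) if isinstance(item, tuple)), None)
        if idx is None:
            return steps
        _, children = form[idx]
        form[idx:idx + 1] = children  # apply the rule used at this vertex
        steps.append(render(form))


# Tree with root S -> S b S, left S -> a, right S -> S c S, both S -> a.
TREE = ("S", [("S", ["a"]), "b", ("S", [("S", ["a"]), "c", ("S", ["a"])])])

print(" => ".join(leftmost_derivation(TREE)))
# S => SbS => abS => abScS => abacS => abaca
```

At each step the rule to apply is fixed by the tree and the position is fixed by the leftmost requirement, so no choice is ever left open.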
The parse tree for S ⇒* x with x in L(G) has at
least |x| leaves, and the concatenation of their labels
from left to right is x.
Ambiguity
A string x in L(G) may have two or more parse
trees witnessing S ⇒* x. The grammar G is said to
be ambiguous if such a string exists.
The CFG G = ({S}, {a, b, c}, R, S) where R = {S →
SbS | ScS | a} is ambiguous because abaca has two
parse trees.
          S                      S
        / | \                  / | \
       S  c  S                S  b  S
     / | \   |                |   / | \
    S  b  S  a                a  S  c  S
    |     |                      |     |
    a     a                      a     a
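The claim that abaca has exactly two parse trees can be checked mechanically. The sketch below counts parse trees by trying every way of splitting the string among the symbols of a rule body. It assumes single-character symbols, no rules of the form A → ε, and no cycles of unit rules; the names RULES and count_parse_trees are hypothetical.

```python
from functools import lru_cache

RULES = {"S": ["SbS", "ScS", "a"]}  # S -> SbS | ScS | a


def count_parse_trees(w: str, start: str = "S") -> int:
    """Count the parse trees with root `start` whose leaves spell w."""

    @lru_cache(maxsize=None)
    def trees(symbol: str, i: int, j: int) -> int:
        """Number of parse trees with root `symbol` and yield w[i:j]."""
        if symbol not in RULES:  # terminal: must match exactly one character
            return int(j == i + 1 and w[i] == symbol)
        return sum(seq(body, i, j) for body in RULES[symbol])

    @lru_cache(maxsize=None)
    def seq(body: str, i: int, j: int) -> int:
        """Number of ways the symbol string `body` derives w[i:j]."""
        if not body:
            return int(i == j)
        # Without epsilon rules every symbol covers at least one character,
        # so the first symbol ends at some k that leaves room for the rest.
        return sum(trees(body[0], i, k) * seq(body[1:], k, j)
                   for k in range(i + 1, j - len(body) + 2))

    return trees(start, 0, len(w))


print(count_parse_trees("abaca"))  # 2
```

A count greater than 1 for any string shows that the grammar is ambiguous. A count of 1 for particular strings proves nothing about the grammar as a whole.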
Removing ambiguity is an important issue in
compiler theory. However, determining whether a
CFG is ambiguous is undecidable.
The CFG G = ({S, A}, {0, 1}, R, S) where R = {S → A00,
A → ε | AA | 0 | 1} is ambiguous because 00 has
two parse trees:
      S              S
    / | \          / | \
   A  0  0        A  0  0
   |             / \
   ε            A   A
                |   |
                ε   ε
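The ambiguity shown above for 00 can be removed by
deleting the rule A → ε and adding the rule S → 00,
which keeps the language unchanged:
CFG G = ({S, A}, {0, 1}, R, S) where R = {S → 00 |
A00, A → AA | 0 | 1}
Note that the rule A → AA still allows more than one parse
tree for longer strings such as 00000.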
Parsing Algorithm
A string w in (V ∪ Σ)* is a left sentential form
if S ⇒*_left w.
The leftmost graph g(G) for a CFG G is defined as
follows:
(a) the vertex set is the set of all left sentential
forms;
(b) there is a directed edge (x, y) if x ⇒_left y.
Usually, g(G) is an infinite digraph.
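If no rule of the form A → ε exists, then no derivation
step shortens the sentential form, so lengths are
nondecreasing along every edge of g(G). Hence only left
sentential forms of length at most |x| need to be explored,
and a depth-first search or breadth-first search over this
finite part of g(G) would solve the parsing problem.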
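The search over the leftmost graph can be sketched as follows. This is a minimal breadth-first version, assuming single-character symbols and a grammar with no rule of the form A → ε. The function name parse_bfs and the dictionary encoding of the rules are hypothetical. The prefix check is an extra pruning step that the slides do not mention.

```python
from collections import deque


def parse_bfs(rules, start, x):
    """Return a leftmost derivation of x as a list of forms, or None."""
    parent = {start: None}  # visited forms, each with its predecessor
    queue = deque([start])
    while queue:
        form = queue.popleft()
        if form == x:
            path = []
            while form is not None:  # walk back to the start symbol
                path.append(form)
                form = parent[form]
            return path[::-1]
        # Position of the leftmost nonterminal, if there is one.
        pos = next((i for i, s in enumerate(form) if s in rules), None)
        if pos is None:
            continue  # a terminal string different from x
        if not x.startswith(form[:pos]):
            continue  # the terminal prefix can never change again
        for body in rules[form[pos]]:
            nxt = form[:pos] + body + form[pos + 1:]
            # Lengths never shrink, so longer forms cannot lead to x.
            if len(nxt) <= len(x) and nxt not in parent:
                parent[nxt] = form
                queue.append(nxt)
    return None


rules = {"S": ["00", "A00"], "A": ["AA", "0", "1"]}
print(parse_bfs(rules, "S", "100"))  # ['S', 'A00', '100']
print(parse_bfs(rules, "S", "01"))   # None
```

Because only forms of length at most |x| are ever stored, the visited set is finite and the search terminates even though g(G) itself is usually infinite.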