Title: Compiler Design Chapter 3
Parsing - Predictive Parsing
Predictive Parsing
- Recursive descent: a simple algorithm that can easily parse some grammars
- Each production turns into a clause of a recursive function
- Recursive Descent Parser for this grammar
- One function per non-terminal symbol
- One clause per production
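A minimal sketch of this structure, assuming a small hypothetical grammar (the slide's own grammar is not reproduced in these notes): S → if E then S else S | begin S L | print E, L → end | ; S L, E → num = num. The names `Parser`, `eat`, `parse`, and `ParseError` are illustrative, and tokens are plain strings.

```python
class ParseError(Exception):
    """Raised when the current token fits no clause of the active function."""


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks end of input
        self.pos = 0

    @property
    def tok(self):
        return self.tokens[self.pos]

    def eat(self, expected):
        # Consume one terminal symbol or report a syntax error.
        if self.tok != expected:
            raise ParseError(f"expected {expected!r}, got {self.tok!r}")
        self.pos += 1

    # One function per nonterminal, one clause per production.
    def S(self):
        if self.tok == "if":        # S -> if E then S else S
            self.eat("if"); self.E(); self.eat("then"); self.S()
            self.eat("else"); self.S()
        elif self.tok == "begin":   # S -> begin S L
            self.eat("begin"); self.S(); self.L()
        elif self.tok == "print":   # S -> print E
            self.eat("print"); self.E()
        else:
            raise ParseError(f"unexpected token {self.tok!r} in S")

    def L(self):
        if self.tok == "end":       # L -> end
            self.eat("end")
        elif self.tok == ";":       # L -> ; S L
            self.eat(";"); self.S(); self.L()
        else:
            raise ParseError(f"unexpected token {self.tok!r} in L")

    def E(self):                    # E -> num = num
        self.eat("num"); self.eat("="); self.eat("num")


def parse(tokens):
    parser = Parser(tokens)
    parser.S()
    parser.eat("$")  # the whole input must be consumed
    return True
```

Each function picks its clause by looking only at the current token, which is possible here because every production of this grammar starts with a distinct terminal.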
Recursive Descent
- Recursive Descent / Predictive Parsing
- Works only when the first terminal symbol of each subexpression provides enough information to choose which production to use
  - This is captured by FIRST sets
- Parser generators tend to use LR(1) parsing instead (later in course)
- The predictive parsing algorithm is simple enough to construct parsers manually (without tools)
FIRST sets
Let γ be a string of terminal and nonterminal symbols.
FIRST(γ) is the set of all terminal symbols that can begin any string derived from γ.
Example: for γ = T * F, FIRST(T * F) = {id, num, (}
Overlapping FIRST sets
X → γ1
X → γ2
If FIRST(γ1) and FIRST(γ2) overlap, recursive-descent parsing cannot be used.
If terminal symbol I is in FIRST(γ1) and also in FIRST(γ2), the function X() does not know what to do (i.e., which production to use) given I.
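The overlap condition can be checked mechanically. The sketch below assumes the FIRST set of each right-hand side is already known; `first_conflicts` is a hypothetical helper name, and the example alternatives (E → E + T and E → T) are chosen only for illustration.

```python
from itertools import combinations


def first_conflicts(alternatives):
    """Map each pair of right-hand sides to the terminals their FIRST sets share.

    `alternatives` maps a label for each right-hand side to its FIRST set.
    An empty result means one token of lookahead is enough to choose.
    """
    conflicts = {}
    for (rhs1, first1), (rhs2, first2) in combinations(alternatives.items(), 2):
        shared = first1 & first2
        if shared:
            conflicts[(rhs1, rhs2)] = shared
    return conflicts


# Illustrative input: both alternatives can start with id, num, or "(".
overlapping = first_conflicts({
    "E + T": {"id", "num", "("},
    "T": {"id", "num", "("},
})
```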
FIRST sets and nullable symbols
FIRST(X Y Z) is not always the same as FIRST(X)
- Y produces the empty string, therefore
- X produces the empty string (via X → Y), therefore
- FIRST(X Y Z) must include FIRST(Z)
- Nullable symbols
  - Symbols that can produce the empty string
- Must know what might follow nullable symbols
nullable, FIRST and FOLLOW sets
- Let γ be a string of terminal and nonterminal symbols
- nullable(X) is true if X can derive the empty string
- FIRST(γ) is the set of all terminal symbols that can begin any string derived from γ
- FOLLOW(X) is the set of all terminal symbols that can immediately follow X
  - t ∈ FOLLOW(X) if there is any derivation containing Xt
  - t ∈ FOLLOW(X) if a derivation contains X Y Z t, where Y and Z both derive ε (the empty string)
Smallest sets having the following properties
Nullable, FIRST and FOLLOW sets Algorithm
The three relations do not need to be computed simultaneously: nullable can be computed first, then FIRST, then FOLLOW
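A minimal sketch of one common way to compute the three relations: iterate over all productions until nothing changes. For brevity it updates all three in a single loop, which reaches the same smallest sets as computing them one after another. The grammar Z → d | X Y Z, Y → ε | c, X → Y | a is assumed as the running example, and the function name `analyze` is illustrative.

```python
# Each production is (left-hand side, right-hand side tuple); () is the empty string.
GRAMMAR = [
    ("Z", ("d",)),
    ("Z", ("X", "Y", "Z")),
    ("Y", ()),
    ("Y", ("c",)),
    ("X", ("Y",)),
    ("X", ("a",)),
]


def analyze(grammar):
    nonterminals = {lhs for lhs, _ in grammar}
    symbols = nonterminals | {s for _, rhs in grammar for s in rhs}
    nullable = set()
    # FIRST of a terminal is the terminal itself.
    first = {s: (set() if s in nonterminals else {s}) for s in symbols}
    follow = {s: set() for s in symbols}

    changed = True
    while changed:  # repeat until a full pass adds nothing
        changed = False
        for lhs, rhs in grammar:
            # lhs is nullable if every right-hand-side symbol is nullable.
            if lhs not in nullable and all(y in nullable for y in rhs):
                nullable.add(lhs)
                changed = True
            for i, sym in enumerate(rhs):
                # Everything before sym is nullable: FIRST[lhs] includes FIRST[sym].
                if all(y in nullable for y in rhs[:i]):
                    if not first[sym] <= first[lhs]:
                        first[lhs] |= first[sym]
                        changed = True
                # Everything after sym is nullable: FOLLOW[sym] includes FOLLOW[lhs].
                if all(y in nullable for y in rhs[i + 1:]):
                    if not follow[lhs] <= follow[sym]:
                        follow[sym] |= follow[lhs]
                        changed = True
                # Symbols between sym and rhs[j] are nullable:
                # FOLLOW[sym] includes FIRST[rhs[j]].
                for j in range(i + 1, len(rhs)):
                    if all(y in nullable for y in rhs[i + 1:j]):
                        if not first[rhs[j]] <= follow[sym]:
                            follow[sym] |= first[rhs[j]]
                            changed = True
    return nullable, first, follow


nullable, first, follow = analyze(GRAMMAR)
```

For this grammar the loop yields nullable = {X, Y}, FIRST(X) = {a, c}, FIRST(Y) = {c}, FIRST(Z) = {a, c, d}, and FOLLOW(X) = FOLLOW(Y) = {a, c, d}.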
Example - First iteration
Example - Second iteration
γ is nullable if each symbol in γ is nullable
Constructing a Predictive Parser
Can construct a 2-dimensional predictive parsing table, indexed by nonterminals X and terminals T, to choose the right production for an input token
Constructing a Predictive Parser
- Consider each production X → γ
- Enter X → γ in row X, column T for each T ∈ FIRST(γ)
  - For Z → d: FIRST(d) = {d}
  - For Z → X Y Z: FIRST(X Y Z) = {a, c, d}
  - For Y → c: FIRST(c) = {c}
  - For X → Y: FIRST(Y) = {c}
  - For X → a: FIRST(a) = {a}
- If γ is nullable, enter X → γ in row X, column T for each T ∈ FOLLOW(X)
  - For Y → ε: FOLLOW(Y) = {a, c, d}
  - For X → Y: FOLLOW(X) = {a, c, d}
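The two rules can be sketched as a short table builder. The FIRST and FOLLOW values are hard-coded from the examples above; FOLLOW(Z) is assumed to be empty (it is never consulted, because no Z-production is nullable), and the names `build_table` and `duplicates` are illustrative.

```python
from collections import defaultdict

# (left-hand side, right-hand side, FIRST(rhs), is rhs nullable?)
PRODUCTIONS = [
    ("Z", ("d",),          {"d"},           False),
    ("Z", ("X", "Y", "Z"), {"a", "c", "d"}, False),
    ("Y", (),              set(),           True),   # Y -> empty string
    ("Y", ("c",),          {"c"},           False),
    ("X", ("Y",),          {"c"},           True),
    ("X", ("a",),          {"a"},           False),
]
FOLLOW = {"X": {"a", "c", "d"}, "Y": {"a", "c", "d"}, "Z": set()}


def build_table(productions, follow):
    """Return {(nonterminal, terminal): [right-hand sides]}."""
    table = defaultdict(list)
    for lhs, rhs, first_rhs, rhs_nullable in productions:
        columns = set(first_rhs)          # rule 1: T in FIRST(rhs)
        if rhs_nullable:
            columns |= follow[lhs]        # rule 2: T in FOLLOW(lhs)
        for terminal in columns:
            table[(lhs, terminal)].append(rhs)
    return table


def duplicates(table):
    """Cells holding more than one production."""
    return {cell: rhss for cell, rhss in table.items() if len(rhss) > 1}


table = build_table(PRODUCTIONS, FOLLOW)
conflicts = duplicates(table)
```

For this grammar the cells (X, a), (Y, c), and (Z, d) each receive two productions, so a predictive parser cannot choose between them.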
Duplicate entries in Predictive Parsing Table
- A predictive parser will not work if there are duplicate entries in the predictive parsing table
- The grammar is ambiguous
  - The sentence d has more than one parse tree
LL(1) Grammar
- A grammar with no duplicate entries in its predictive parsing table is an LL(1) grammar
- LL(1) stands for
  - Left-to-right parse,
  - Left-most derivation,
  - 1 symbol lookahead
- LL(k) parsing table: rows are non-terminals, columns are every sequence of k terminals
  - Rarely done, because the tables are large
- No ambiguous grammar is LL(k) for any k
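To illustrate what "1 symbol lookahead" means in practice, here is a sketch of a table-driven LL(1) parser. This is an alternative to writing recursive functions by hand, swapped in for illustration only. The grammar (E → T E', E' → + T E' | ε, T → id | num | ( E )) and its hand-written table are assumptions made for this example.

```python
NONTERMINALS = {"E", "E'", "T"}

# Hand-built LL(1) table for the assumed grammar: no cell holds two productions.
TABLE = {
    ("E", "id"): ["T", "E'"],
    ("E", "num"): ["T", "E'"],
    ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"],
    ("E'", ")"): [],            # E' -> empty string
    ("E'", "$"): [],            # E' -> empty string
    ("T", "id"): ["id"],
    ("T", "num"): ["num"],
    ("T", "("): ["(", "E", ")"],
}


def ll1_parse(tokens, table=TABLE, start="E"):
    """Return True if tokens form a sentence of the grammar, else False."""
    tokens = list(tokens) + ["$"]      # "$" marks end of input
    stack = ["$", start]
    pos = 0
    while stack:
        top = stack.pop()
        lookahead = tokens[pos]
        if top in NONTERMINALS:
            rhs = table.get((top, lookahead))
            if rhs is None:            # blank table entry: syntax error
                return False
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
        else:
            if top != lookahead:       # terminal mismatch: syntax error
                return False
            pos += 1
    return pos == len(tokens)
```

The parser expands the leftmost nonterminal each time, reads the input left to right, and never looks beyond the current token.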
Left Recursion
- The productions E → E + T and E → T cause duplicate entries, since any token in FIRST(T) will also be in FIRST(E + T)
- E appears as the first right-hand-side symbol in an E-production
- This is called left recursion
- Left-recursive grammars cannot be LL(1)
Eliminating Left Recursion
Introduce a new nonterminal E′ and rewrite using right recursion
α does not start with X
Eliminate left recursion
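A sketch of the usual rewrite for immediate left recursion: productions X → X γ and X → α (where α does not start with X) become X → α X′, X′ → γ X′, and X′ → ε. The function name and the tuple representation are illustrative, and the sketch assumes no production of the form X → X.

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite the alternatives of one nonterminal using right recursion.

    `alternatives` is a list of right-hand sides, each a tuple of symbols.
    Returns {nonterminal: new alternatives}; the new nonterminal is named by
    appending "'" (an ASCII stand-in for the prime).
    """
    new_nt = nonterminal + "'"
    # gamma parts: what follows the leading X in left-recursive alternatives
    tails = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # alpha parts: alternatives that do not start with X
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not tails:
        return {nonterminal: list(alternatives)}  # nothing to do
    return {
        nonterminal: [alpha + (new_nt,) for alpha in others],
        new_nt: [gamma + (new_nt,) for gamma in tails] + [()],  # () is epsilon
    }


# Example: E -> E + T | E - T | T
rewritten = eliminate_left_recursion(
    "E", [("E", "+", "T"), ("E", "-", "T"), ("T",)]
)
```

The result here is E → T E', E' → + T E' | - T E' | ε.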
Nullable, FIRST and FOLLOW
Predictive Parser
- Enter X → γ in row X, column T for each T ∈ FIRST(γ)
- If γ is nullable, enter X → γ in row X, column T for each T ∈ FOLLOW(X)
- For S → E $: FIRST(E $) = {(, id, num}
- For E → T E′: FIRST(T E′) = {(, id, num}
- For E′ → - T E′: FIRST(- T E′) = {-}
Left Factoring
- When two productions for the same non-terminal symbol start with the same symbol
  - Causes problems in predictive parsing
- Left factor: take the allowable endings and make a new nonterminal X to stand for them
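A sketch of left factoring for two alternatives that share a prefix. The example productions (S → if E then S else S and S → if E then S) are an assumption chosen for illustration, and `left_factor` is a hypothetical helper.

```python
def common_prefix(a, b):
    """Longest run of leading symbols shared by two right-hand sides."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]


def left_factor(nonterminal, alt1, alt2, new_nt="X"):
    """Factor the shared prefix of two alternatives into one production.

    The differing endings become the alternatives of the new nonterminal;
    an empty tuple stands for the empty string.
    """
    prefix = common_prefix(alt1, alt2)
    if not prefix:
        return {nonterminal: [alt1, alt2]}  # nothing in common
    return {
        nonterminal: [prefix + (new_nt,)],
        new_nt: [alt1[len(prefix):], alt2[len(prefix):]],
    }


factored = left_factor(
    "S",
    ("if", "E", "then", "S", "else", "S"),
    ("if", "E", "then", "S"),
)
```

Here the result is S → if E then S X with X → else S | ε, so the choice between the two endings is postponed until after the common part has been parsed.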
Error Handling
A blank entry in the table indicates a syntax error
T() does not expect to see other tokens
Error Handling
- If an error is found, the parser can raise an exception and quit parsing, which is not user friendly
- Error recovery by deleting, replacing, or inserting tokens
- Error recovery by insertion: pretend a num token was found, print an error message, and return normally so that more errors can be found
  - Cascading errors might cause infinite looping
Error Handling
- Error recovery by deletion: skip tokens until a token in the FOLLOW set is reached
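A minimal sketch of recovery by deletion, assuming tokens are strings in a list that ends with the end-of-input marker "$". The helper name `recover_by_deletion` and the FOLLOW set used in the example are assumptions for illustration.

```python
def recover_by_deletion(tokens, pos, follow, errors):
    """Record an error at `pos`, then skip tokens until one is in `follow`.

    Returns the position where parsing can resume. Keeping "$" in `follow`
    guarantees the skip stops at end of input; the length check is a guard
    for inputs that lack the marker.
    """
    errors.append(f"unexpected token {tokens[pos]!r} at position {pos}")
    while pos < len(tokens) and tokens[pos] not in follow:
        pos += 1
    return pos


# Example: two stray "*" tokens are deleted, parsing resumes at "+".
errors = []
assumed_follow = {"+", ")", "$"}  # assumed FOLLOW set for the failing nonterminal
resume_at = recover_by_deletion(["*", "*", "+", "id", "$"], 0, assumed_follow, errors)
```

Because deletion always moves forward through the input, it cannot loop forever, unlike repeated insertion.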
Error Handling
- Recursive Descent error recovery must be
adjusted to avoid a long cascade of error-repair