Syntax - PowerPoint PPT Presentation

Provided by: juancarl

1
Syntax
  • Juan Carlos Guzmán
  • CS 3123 Programming Languages Concepts
  • Southern Polytechnic State University

2
What does your DOS computer do when given these commands?
  • > copy a.txt b.txt
  • > copy a.txt a.txt
  • > del .
  • > del 01.
  • > type a.txt > null
  • > type a.txt > nul

3
  • How do we know the meaning of our commands?

4
Semiotic
  • Synthesized from Merriam-Webster (m-w.com)
  • a general philosophical theory of signs and
    symbols that deals especially with their function
    in both artificially constructed and natural
    languages and comprises
  • syntactics: the formal relations between signs or
    expressions in abstraction from their signification
    and their interpreters
  • semantics: the relations between signs and what they
    refer to
  • pragmatics: the relation between signs or linguistic
    expressions and their users

5
Syntax
  • Two levels
  • The language level, properly known as parsing
  • The lexeme level, known as lexing
  • More information about this topic can be found in
  • Aho, Sethi, Ullman. Compilers: Principles,
    Techniques, and Tools. Addison-Wesley, 1988.
    (on reserve; the "Dragon book")

6
Lexing
  • Specification of the lexemes of the language
  • A class of lexemes is known as a token
  • Tokens are specified in regular expressions
  • letter, empty string
  • concatenation
  • choice
  • closure
  • Many convenient extensions
  • Recognized by Finite Automata
  • Limited in power: cannot count, cannot recognize
    a^n b^n
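
As a rough illustration of "recognized by finite automata", here is a minimal sketch of a table-driven DFA for the regular expression a*b*. The state numbering and the names DELTA, START, ACCEPTING and accepts are assumptions made for this example, not part of the slides.

```python
# Transition table of a DFA for the regular expression a*b*.
# State 0: still reading a's. State 1: reading b's.
# A missing (state, character) entry means "reject".
DELTA = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 1}
START = 0
ACCEPTING = {0, 1}


def accepts(text: str) -> bool:
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = START
    for ch in text:
        if (state, ch) not in DELTA:
            return False
        state = DELTA[(state, ch)]
    return state in ACCEPTING


print(accepts("aabb"))  # same number of a's and b's
print(accepts("aab"))   # also accepted: the automaton has no way to count
print(accepts("aba"))   # rejected: an a after a b
```

The automaton can check the order of the letters, but with a fixed number of states it cannot check that the two counts are equal, which is why a^n b^n is out of reach.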

7
Sample Regular Expressions
  • digit = (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9)
  • ldigit = (1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9)
  • natural = ldigit digit*
  • integer = (- | ε) (natural | 0)
  • How about floating points?
  • W/o exponents
  • add the exponents
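
The token definitions on this slide translate almost directly into Python's re syntax. The sketch below does that, and also proposes one possible answer to the floating-point question; the FLOAT and FLOAT_EXP patterns are assumptions for illustration (for example, they require digits on both sides of the decimal point), not the answer given in class.

```python
import re

# Slide definitions, written with Python's re syntax.
DIGIT = r"[0-9]"
LDIGIT = r"[1-9]"                    # leading digit, so no leading zeros
NATURAL = rf"{LDIGIT}{DIGIT}*"       # ldigit digit*
INTEGER = rf"-?(?:{NATURAL}|0)"      # (- | empty) (natural | 0)

# Assumed answers to the exercise (not from the slides).
FLOAT = rf"{INTEGER}\.{DIGIT}+"                   # without exponents
FLOAT_EXP = rf"{FLOAT}(?:[eE][+-]?{DIGIT}+)?"     # exponent part is optional


def matches(pattern: str, text: str) -> bool:
    """Return True when the whole text belongs to the token class."""
    return re.fullmatch(pattern, text) is not None


print(matches(INTEGER, "-17"))       # True
print(matches(INTEGER, "007"))       # False: leading zeros are excluded
print(matches(FLOAT_EXP, "-2.5e-3")) # True
```

Note that the slide's definition of integer also admits "-0", and so does this pattern.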

8
Parsing
  • Specification of the language structure
  • The parser
  • recognizes the phrase, and
  • reconstructs its structure (parse tree)

9
Context-Free Grammars
  • Generate Context-Free Languages
  • Allow recursion
  • Are specified as G = (N, T, P, S), where
  • N is the set of non-terminals, or variables
  • T is the alphabet (the set of terminals)
  • P is the set of productions
  • S is the starting symbol for every phrase

10
CFG (Example)
  • G1 = ({S, A, B}, {a, b}, P, S)
  • where P = { S → ASB, S → BSA, S → ε, A → a, B → b }
  • G2 = ({E}, {a, +, *, (, )}, P, E)
  • where P = { E → E+E, E → E*E,
  • E → a, E → (E) }

11
Grammars (conventions)
  • The empty string: ε
  • First uppercase letters of the alphabet (A, B, C, …):
    non-terminals
  • First lowercase letters of the alphabet (a, b, c, …),
    or numbers (1, 2, …): terminals
  • First lowercase Greek letters (α, β, γ, …):
    strings of terminals and non-terminals
  • Last lowercase letters of the alphabet (t, u, v, …):
    strings of terminals

12
Derivation
  • How do we generate phrases in the language?
  • By using a derivation
  • αAβ => αγβ iff A → γ ∈ P
  • E => E+E => E+E*E => a+E*E => a+E*a => a+a*a

13
The Language Generated
  • The language generated by the grammar is composed
    of all strings of terminals that can be derived
    from S by applying production rules one or more
    times
  • Anything derived from S is called a sentential
    form

14
Derivations
  • Leftmost derivation: the leftmost non-terminal is
    always rewritten
  • E => E*E => E+E*E => a+E*E => a+a*E => a+a*a
  • Rightmost derivation: the rightmost non-terminal
    is always rewritten
  • E => E+E => E+E*E => E+E*a => E+a*a => a+a*a
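
To make the two derivation orders concrete, here is a small sketch that rewrites sentential forms of the ambiguous expression grammar E → E+E | E*E | a | (E). It assumes the phrase a+a*a as the running example; the helper names step and derive are invented for this illustration.

```python
# The ambiguous expression grammar: E → E+E | E*E | a | (E)
PRODUCTIONS = {"E": ["E+E", "E*E", "a", "(E)"]}


def step(form: str, rhs: str, leftmost: bool = True) -> str:
    """Rewrite the leftmost (or rightmost) non-terminal of a sentential form."""
    positions = [i for i, ch in enumerate(form) if ch in PRODUCTIONS]
    if not positions:
        raise ValueError("no non-terminal left to rewrite")
    i = positions[0] if leftmost else positions[-1]
    if rhs not in PRODUCTIONS[form[i]]:
        raise ValueError(f"{form[i]} -> {rhs} is not a production")
    return form[:i] + rhs + form[i + 1:]


def derive(rhs_sequence, leftmost: bool = True, start: str = "E"):
    """Apply the given right-hand sides in order; return all sentential forms."""
    forms = [start]
    for rhs in rhs_sequence:
        forms.append(step(forms[-1], rhs, leftmost))
    return forms


print(" => ".join(derive(["E*E", "E+E", "a", "a", "a"])))
# E => E*E => E+E*E => a+E*E => a+a*E => a+a*a
print(" => ".join(derive(["E+E", "E*E", "a", "a", "a"], leftmost=False)))
# E => E+E => E+E*E => E+E*a => E+a*a => a+a*a
```

Both runs end in the same phrase, but they choose a different production first, so they correspond to different parse trees.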

15
Parse Tree
  • A structured sequence of derivations
  • Visually appealing
  • From previous example

16
Ambiguous Grammar
  • Two different parse trees for a single phrase
  • Just one phrase with two trees is proof of
    ambiguity
  • Not ambiguous? All phrases must have only one
    parse tree!
  • An ambiguous grammar is quite different from an
    inherently ambiguous language
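
One way to see ambiguity at work is simply to count parse trees. The sketch below does this for the expression grammar E → E+E | E*E | a | (E), trying every operator as the root of the tree. The function name count_parse_trees and the one-character-per-token encoding are assumptions made for this example.

```python
from functools import lru_cache


def count_parse_trees(tokens: str) -> int:
    """Count parse trees of tokens under E → E+E | E*E | a | (E)."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of distinct ways to derive tokens[i:j] from E.
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # E → (E)
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)
        # E → E+E and E → E*E: each operator position is a candidate root.
        for k in range(i + 1, j - 1):
            if tokens[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("a+a*a"))    # 2: one phrase, two trees
print(count_parse_trees("(a+a)*a"))  # 1: parentheses force the structure
```

A single phrase with a count above 1 is enough to prove the grammar ambiguous; showing that a grammar is unambiguous needs an argument about all phrases, which no finite amount of counting provides.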

17
Grammars vs. Languages
  • A language is a set
  • A grammar is a medium by which the set can be
    formally specified
  • Many grammars specify the same set

18
An Expression Grammar
  • The grammar for expressions presented before was
    ambiguous
  • Non-ambiguous, with correct precedence (relative
    priority given to + and *)
  • E → E + T | T
  • T → T * F | F
  • F → a | ( E )

[Figure: parse tree of a + a * a under this grammar]
19
Parsing Styles
  • Top-down: to derive w from S, start from S and
    derive until w is obtained
  • Bottom-up: to derive w from S, apply reverse
    derivations (reductions) to w until S is obtained

20
Parsing Styles
  • Top-down: LL(k)
  • Easy to implement and understand
  • hand-coded
  • table-driven
  • Limited use, many problems
  • Bottom-up: LR(k)
  • More difficult to understand
  • table-driven
  • A nice trade-off between complexity and generality

21
An Expression Grammar
  • G = ({E, T, F}, {a, +, *, (, )}, P, E)
  • where P = {
  • E → T | E + T,
  • T → F | T * F,
  • F → a | ( E ) }
  • Is a + a * a in L(G)?

[Figure: parse tree of a + a * a]
23
A Grammar for a Small Language
  • <program> → begin <stmt_list> end
  • <stmt_list> → <stmt>
  •     | <stmt> ; <stmt_list>
  • <stmt> → <var> = <expression>
  • <var> → A | B | C
  • <expression> → <var> + <var>
  •     | <var> - <var>
  •     | <var>

24
Predictive Parsing
  • How many characters of look-ahead are needed to
    predict the next production to take?
  • Is this a finite number?
  • Is it 1?

25
Another Expression Grammar
  • G = ({E, E', T, T', F}, {a, +, *, (, )}, P, E)
  • where P = {
  • E → T E',
  • E' → + T E' | ε,
  • T → F T',
  • T' → * F T' | ε,
  • F → a | ( E ) }
  • Is a + a * a in L(G)?

[Figure: parse tree of a + a * a, with ε leaves under E' and T']
26
LL(1) Parsing Table
27
LL(1) Algorithm
  • Parse(a1 … an, X1 … Xm), where a1 … an is the
    input and X1 … Xm is the stack
  • if (a1 = $) and (X1 = $)
  • accept
  • else if X1 is a terminal and (X1 = a1)
  • Parse(a2 … an, X2 … Xm) // match
  • else if Table[X1, a1] = X1 → Y1 … Yk
  • Parse(a1 … an, Y1 … Yk X2 … Xm) // derive
  • else
  • fail
  • Call initially with Parse(w$, S$), where w is the
    phrase to parse, S is the starting symbol of
    the grammar, and $ marks the end of the input and
    the bottom of the stack

ai is a terminal; Xj and Yk are terminals or
nonterminals
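
The recursive Parse procedure can be sketched as a loop in Python. The parsing table itself did not survive in this transcript, so the TABLE below is an assumption: it is the table obtained for the grammar E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → a | ( E ), with "$" used as the end marker. The function returns the list of operations so that it can be compared with a hand trace.

```python
# Assumed LL(1) table for the grammar with E' and T' (keys: non-terminal, look-ahead).
TABLE = {
    ("E", "a"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "a"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", "+"): [], ("T'", ")"): [], ("T'", "$"): [],
    ("F", "a"): ["a"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}


def parse(tokens, start="E"):
    """Return the operations performed, or raise SyntaxError on failure."""
    inp = list(tokens) + ["$"]
    stack = [start, "$"]              # stack[0] is the top of the stack
    ops = []
    while True:
        a, x = inp[0], stack[0]
        if a == "$" and x == "$":     # input and stack both exhausted
            ops.append("accept")
            return ops
        if x not in NONTERMINALS and x == a:
            inp, stack = inp[1:], stack[1:]
            ops.append("match")
        elif (x, a) in TABLE:
            stack = TABLE[(x, a)] + stack[1:]
            ops.append("derive")
        else:
            raise SyntaxError(f"unexpected {a!r} with {x!r} on top of the stack")


ops = parse("a+a*a")
print(len(ops), ops[-1])  # 17 accept
```

The loop is just the tail-recursive Parse unrolled: each iteration corresponds to one recursive call with the updated input and stack.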
28
Parser Operation on a + a * a
($ marks the end of the input and the bottom of the stack.)

Step  Input        Stack        Operation  Sentential form
 1    a + a * a $  E $          derive     E
 2    a + a * a $  T E' $       derive     T E'
 3    a + a * a $  F T' E' $    derive     F T' E'
 4    a + a * a $  a T' E' $    match      a T' E'
 5    + a * a $    T' E' $      derive     a T' E'
 6    + a * a $    E' $         derive     a E'
 7    + a * a $    + T E' $     match      a + T E'
 8    a * a $      T E' $       derive     a + T E'
 9    a * a $      F T' E' $    derive     a + F T' E'
10    a * a $      a T' E' $    match      a + a T' E'
11    * a $        T' E' $      derive     a + a T' E'
12    * a $        * F T' E' $  match      a + a * F T' E'
13    a $          F T' E' $    derive     a + a * F T' E'
14    a $          a T' E' $    match      a + a * a T' E'
15    $            T' E' $      derive     a + a * a T' E'
16    $            E' $         derive     a + a * a E'
17    $            $            accept     a + a * a
29
Note how the leftmost derivation of a + a * a is done:
E => T E' => F T' E' => a T' E' => a E' => a + T E'
  => a + F T' E' => a + a T' E' => a + a * F T' E'
  => a + a * a T' E' => a + a * a E' => a + a * a
[Figure: parse tree of a + a * a, with ε leaves under E' and T']
30
What's the Table Lookup?
  • Note that the predictive nature of the parser
    guarantees the uniqueness of the entry for
    Table[A, b] (or no entry at all)
  • When attempting to derive nonterminal A, the
    look-ahead b must give the correct rule to apply
  • This b can be
  • the initial character of the derivation of A,
    i.e., A =>* bβ,
  • or, it can be the initial character of the
    derivation of what follows A! (A =>* ε)

31
First Sets
  • first(α) is the set of one-character prefixes of
    strings of terminals that can be derived from α
  • If the empty string can be derived from α, then
    ε will also be in the set
  • if α =>* aw then a ∈ first(α)
  • if α =>* ε then ε ∈ first(α)

32
First Sets (II)
  • first(ε) = {ε}
  • first(a) = {a}
  • first(A) = first(α1) ∪ … ∪ first(αn)
  • if A → α1 ∈ P, …, A → αn ∈ P
  • first(Xβ) = first(X) ⊕ first(β)
  • where X is either terminal or nonterminal
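
Because first(A) is defined in terms of the first sets of other symbols, the equations are usually solved by iterating until nothing changes. Here is a minimal sketch of that iteration for the grammar with E' and T'. The dictionary encoding of the grammar and the function names are assumptions made for this example.

```python
EPS = "ε"

# E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → a | ( E )
# Each right-hand side is a tuple of symbols; () stands for ε.
GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("a",), ("(", "E", ")")],
}


def first_of_string(symbols, first):
    """first(X1 … Xn), using the current first sets of the non-terminals."""
    result = set()
    for x in symbols:
        fx = first[x] if x in first else {x}    # first(a) = {a}
        result |= fx - {EPS}
        if EPS not in fx:
            return result                        # X cannot vanish: stop here
    result.add(EPS)                              # every symbol can vanish
    return result


def first_sets(grammar):
    """Iterate first(A) = first(α1) ∪ … ∪ first(αn) until a fixed point."""
    first = {a: set() for a in grammar}
    changed = True
    while changed:
        changed = False
        for a, alternatives in grammar.items():
            for alpha in alternatives:
                new = first_of_string(alpha, first)
                if not new <= first[a]:
                    first[a] |= new
                    changed = True
    return first


for nonterminal, symbols in first_sets(GRAMMAR).items():
    print(nonterminal, sorted(symbols))
```

For this grammar the iteration settles on {a, (} for E, T and F, {+, ε} for E', and {*, ε} for T'.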

33
Bounded Concatenation
  • In computing first(Xβ), our interest is to obtain
    one-character prefixes (or ε)
  • Consider the operation at the character level
  • ε ⊕ x = x, where x is either ε or a terminal
  • a ⊕ x = a
  • Generalize it to work on sets
  • A ⊕ B = { v ⊕ w | v ∈ A, w ∈ B }, where A and B are sets
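
Bounded concatenation is small enough to write out directly. The sketch below follows the two character-level rules and the set-level generalization; the names bconcat_char and bconcat, and the use of the string "ε" for the empty string, are assumptions made for this example.

```python
EPS = "ε"


def bconcat_char(v: str, w: str) -> str:
    """Character level: ε ⊕ w = w, and a ⊕ w = a for a terminal a."""
    return w if v == EPS else v


def bconcat(A: set, B: set) -> set:
    """Set level: A ⊕ B = { v ⊕ w | v ∈ A, w ∈ B }."""
    return {bconcat_char(v, w) for v in A for w in B}


print(sorted(bconcat({"a", EPS}, {"b"})))   # ['a', 'b']
print(sorted(bconcat({"a"}, {"b", EPS})))   # ['a']
```

If either set is empty the result is empty, which matches the idea that nothing can be said about a prefix until both parts are known.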

34
Computation of First Sets
35
Computation of First Sets
36
Follow Sets
  • follow(A) is the set of one-character prefixes of
    strings of terminals that can follow any derivation
    of A in G
  • $ ∈ follow(S), where $ marks the end of the input
  • if (B → αAβ) ∈ P, then
  • first(β) ⊕ follow(B) ⊆ follow(A)
  • The definition of follow usually results in
    recursive set definitions. To solve them, iterate
    on the equations until the sets stop changing
  • ε never appears in any follow set
  • Note: I had promised a closed definition of
    follow, but it would be unnecessarily complex.
    (JCG)
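
The "several iterations on the equations" can be sketched as a fixed-point loop. The example below uses the grammar with E' and T' and takes its first sets as given data (they are assumed here, not computed); "$" is the assumed end-of-input marker and all names are invented for the illustration.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("a",), ("(", "E", ")")],
}
# First sets of the non-terminals, taken as given.
FIRST = {"E": {"a", "("}, "E'": {"+", EPS}, "T": {"a", "("},
         "T'": {"*", EPS}, "F": {"a", "("}}


def first_of_string(symbols):
    """first(X1 … Xn); a terminal is its own first set."""
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    return result | {EPS}


def follow_sets(grammar, start):
    """Iterate first(β) ⊕ follow(B) ⊆ follow(A) for every B → αAβ."""
    follow = {a: set() for a in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for b, alternatives in grammar.items():
            for rhs in alternatives:
                for i, a in enumerate(rhs):
                    if a not in grammar:
                        continue                  # only non-terminals have follow sets
                    new = first_of_string(rhs[i + 1:])
                    if EPS in new:                # β can vanish, so follow(B) shows through
                        new = (new - {EPS}) | follow[b]
                    if not new <= follow[a]:
                        follow[a] |= new
                        changed = True
    return follow


for nonterminal, symbols in follow_sets(GRAMMAR, "E").items():
    print(nonterminal, sorted(symbols))
```

Replacing ε by follow(B) whenever β can vanish is exactly what the bounded concatenation first(β) ⊕ follow(B) does, which is why ε never ends up in a follow set.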

37
Computation of Follow Sets
38
Computation of Follow Sets
39
How to Fill In the Table (Predict)
  • For each production (A → α) ∈ P
  • let X = first(α) ⊕ follow(A)
  • then for all x ∈ X
  • A → α ∈ Table[A, x]
  • After processing all productions, each cell of
    the table must have, at most, one production
  • if not, your grammar is not LL(1) (nice try!)
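
The table-filling rule can be sketched in a few lines. The example uses the grammar with E' and T' and takes its first and follow sets as given data (assumed values, with "$" as the end marker); build_table and the other names are invented for this illustration. A second production landing in the same cell is reported as an LL(1) conflict.

```python
EPS = "ε"

PRODUCTIONS = [
    ("E", ("T", "E'")),
    ("E'", ("+", "T", "E'")), ("E'", ()),
    ("T", ("F", "T'")),
    ("T'", ("*", "F", "T'")), ("T'", ()),
    ("F", ("a",)), ("F", ("(", "E", ")")),
]
# First and follow sets of the non-terminals, taken as given.
FIRST = {"E": {"a", "("}, "E'": {"+", EPS}, "T": {"a", "("},
         "T'": {"*", EPS}, "F": {"a", "("}}
FOLLOW = {"E": {")", "$"}, "E'": {")", "$"}, "T": {"+", ")", "$"},
          "T'": {"+", ")", "$"}, "F": {"*", "+", ")", "$"}}


def first_of_string(symbols):
    """first(X1 … Xn); a terminal is its own first set."""
    result = set()
    for x in symbols:
        fx = FIRST.get(x, {x})
        result |= fx - {EPS}
        if EPS not in fx:
            return result
    return result | {EPS}


def build_table(productions):
    """Fill Table[A, x] for every x in first(α) ⊕ follow(A)."""
    table = {}
    for a, alpha in productions:
        predict = first_of_string(alpha)
        if EPS in predict:                        # α can vanish: look at follow(A)
            predict = (predict - {EPS}) | FOLLOW[a]
        for x in predict:
            if (a, x) in table:
                raise ValueError(f"not LL(1): conflict at Table[{a}, {x}]")
            table[(a, x)] = alpha
    return table


table = build_table(PRODUCTIONS)
print(len(table))            # 13 filled cells
print(table[("E'", "$")])    # (): the ε production
```

All other cells stay empty, and an empty cell is what makes the parser fail on a bad phrase.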

40
First & Follow Sets
41
Predict
42
Yet Another Expression Grammar (it's in the book!)
  • G = ({E, T, F}, {a, +, *, (, )}, P, E)
  • where P = {
  • (1) E → E + T,
  • (2) E → T,
  • (3) T → T * F,
  • (4) T → F,
  • (5) F → ( E ),
  • (6) F → a }
  • Is a + a * a in L(G)?

[Figure: parse tree of a + a * a]
43
LR(1) Parsing Table
Sn: shift to state n. Rn: reduce according to
production n.
44
LR(1) Algorithm
  • Parse(S0 X1 S1 X2 S2 … Xr Sr … Xm Sm, a1 … an),
    where the first argument is the stack and the
    second is the input
  • if Action[Sm, a1] = Shift S
  • Parse(S0 X1 S1 X2 S2 … Xm Sm a1 S, a2 … an)
  • else if Action[Sm, a1] = Reduce A → Xr+1 … Xm
  • and GOTO[Sr, A] = S
  • Parse(S0 X1 S1 X2 S2 … Xr Sr A S, a1 … an)
  • else if Action[Sm, a1] = Accept
  • accept
  • else if Action[Sm, a1] = Error
  • error
  • Call initially with Parse(S0, w$), where w is the
    phrase to parse, $ marks the end of the input,
    and S0 is the initial state of the table

ai is a terminal; Xj and Yk are terminals or
nonterminals; Si is a state
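
The shift-reduce loop can be sketched in Python as follows. The parsing table did not survive in this transcript, so ACTION and GOTO below are assumptions: they are the table commonly given in the Dragon book for the grammar E → E + T | T, T → T * F | F, F → ( E ) | a (with a in place of id), and its state numbers agree with the ones that appear in the trace. Only states are kept on the stack, since the grammar symbols are implied by them.

```python
# Productions as (left-hand side, length of right-hand side), numbered 1 to 6.
PRODUCTIONS = {
    1: ("E", 3),  # E → E + T
    2: ("E", 1),  # E → T
    3: ("T", 3),  # T → T * F
    4: ("T", 1),  # T → F
    5: ("F", 3),  # F → ( E )
    6: ("F", 1),  # F → a
}


def reduce_row(n, terminals):
    """Table row that reduces by production n on each listed terminal."""
    return {t: ("r", n) for t in terminals}


SHIFT_OPERAND = {"a": ("s", 5), "(": ("s", 4)}
# Assumed table: ("s", n) shift to n, ("r", n) reduce by n, ("acc",) accept.
ACTION = {
    0: SHIFT_OPERAND,
    1: {"+": ("s", 6), "$": ("acc",)},
    2: {**reduce_row(2, "+)$"), "*": ("s", 7)},
    3: reduce_row(4, "+*)$"),
    4: SHIFT_OPERAND,
    5: reduce_row(6, "+*)$"),
    6: SHIFT_OPERAND,
    7: SHIFT_OPERAND,
    8: {"+": ("s", 6), ")": ("s", 11)},
    9: {**reduce_row(1, "+)$"), "*": ("s", 7)},
    10: reduce_row(3, "+*)$"),
    11: reduce_row(5, "+*)$"),
}
GOTO = {(0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
        (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
        (6, "T"): 9, (6, "F"): 3, (7, "F"): 10}


def parse(tokens):
    """Return the operations performed, or raise SyntaxError on an error entry."""
    stack = [0]                                   # states only
    inp = list(tokens) + ["$"]
    ops = []
    while True:
        action = ACTION[stack[-1]].get(inp[0])
        if action is None:                        # empty cell means Error
            raise SyntaxError(f"unexpected {inp[0]!r} in state {stack[-1]}")
        if action[0] == "s":
            stack.append(action[1])
            inp.pop(0)
            ops.append(f"S{action[1]}")
        elif action[0] == "r":
            lhs, length = PRODUCTIONS[action[1]]
            del stack[len(stack) - length:]       # pop one state per symbol
            stack.append(GOTO[(stack[-1], lhs)])
            ops.append(f"R{action[1]}")
        else:
            ops.append("accept")
            return ops


print(" ".join(parse("a+a*a")))
# S5 R6 R4 R2 S6 S5 R6 R4 S7 S5 R6 R3 R1 accept
```

Read backwards, the sequence of reductions (R1, R3, R6, R4, R6, R2, R4, R6) is a rightmost derivation of the phrase.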
45
Parser Operation on a + a * a
(Sn: shift to state n. Rn, G[s,A]: reduce by production n, then go to the state GOTO[s, A]. $ marks the end of the input.)

Step  Stack                   Input        Operation   Sentential form
 1    0                       a + a * a $  S5          a + a * a
 2    0 a 5                   + a * a $    R6, G[0,F]  a + a * a
 3    0 F 3                   + a * a $    R4, G[0,T]  F + a * a
 4    0 T 2                   + a * a $    R2, G[0,E]  T + a * a
 5    0 E 1                   + a * a $    S6          E + a * a
 6    0 E 1 + 6               a * a $      S5          E + a * a
 7    0 E 1 + 6 a 5           * a $        R6, G[6,F]  E + a * a
 8    0 E 1 + 6 F 3           * a $        R4, G[6,T]  E + F * a
 9    0 E 1 + 6 T 9           * a $        S7          E + T * a
10    0 E 1 + 6 T 9 * 7       a $          S5          E + T * a
11    0 E 1 + 6 T 9 * 7 a 5   $            R6, G[7,F]  E + T * a
12    0 E 1 + 6 T 9 * 7 F 10  $            R3, G[6,T]  E + T * F
13    0 E 1 + 6 T 9           $            R1, G[0,E]  E + T
14    0 E 1                   $            accept      E
46
Note how the rightmost derivation of a + a * a is done (read the parser's sentential forms from last to first):
E => E + T => E + T * F => E + T * a => E + F * a
  => E + a * a => T + a * a => F + a * a => a + a * a
[Figure: parse tree of a + a * a]