Module 28 - PowerPoint PPT Presentation

About This Presentation
Title:

Module 28

Description:

Module 28: Context-Free Grammars. Definition of a grammar G; deriving strings and defining L(G); Context-Free Language definition.

Slides: 190
Provided by: EricT98
Learn more at: http://www.cse.msu.edu

Transcript and Presenter's Notes

Title: Module 28


1
Module 28
  • Context Free Grammars
  • Definition of a grammar G
  • Deriving strings and defining L(G)
  • Context-Free Language definition

2
Context-Free Grammars
  • Definition

3
Definition
  • A context-free grammar G = (V, Σ, S, P)
  • V = finite set of variables (nonterminals)
  • Σ = finite set of characters (terminals)
  • S = start variable
  • element of V
  • role is similar to that of q0 for an FSA or NFA
  • P = finite set of grammar rules or production
    rules
  • Syntax of a production
  • variable → string of variables and terminals

4
English Context-Free Grammar
  • ECFG = (V, Σ, S, P)
  • V = {<sentence>, <noun phrase>, <verb phrase>,
    ...}
  • people sometimes use < > to delimit variables
  • In this course, we generally will use capital
    letters to denote variables
  • Σ = {a, b, c, ..., z, ',', '.', ...}
  • S = <sentence>
  • P = {<sentence> → <noun phrase> <verb phrase>
    <pct>, <noun phrase> → <article> <adj> <noun>,
    ...}

5
{a^i b^i | i > 0} CFG
  • ABG = (V, Σ, S, P)
  • V = {S}
  • Σ = {a, b}
  • S = S
  • P = {S → aSb, S → ab} or S → aSb | ab
  • second format saves some space

6
Context-Free Grammars
  • Deriving strings, defining L(G), and defining
    context-free languages

7
Defining →, ⇒ notation
  • First: → notation
  • This is used to define the productions of a
    grammar
  • S → aSb | ab
  • Second: ⇒_G notation
  • This is used to denote the application of a
    production rule from a grammar G
  • S ⇒_ABG aSb ⇒_ABG aaSbb ⇒_ABG aaabbb
  • We say that string S derives string aSb (in one
    step)
  • We say that string aSb derives string aaSbb (in
    one step)
  • We say that string aaSbb derives string aaabbb
    (in one step)
  • We often omit the grammar subscript when the
    intended grammar is unambiguous

8
Defining ⇒ continued
  • Third: ⇒^k_G notation
  • This is used to denote k applications of
    production rules from a grammar G
  • S ⇒^2_ABG aaSbb
  • We say that string S derives string aaSbb in two
    steps
  • aSb ⇒^2_ABG aaabbb
  • We say that string aSb derives string aaabbb in
    two steps
  • We often omit the grammar subscript when the
    intended grammar is unambiguous

9
Defining ⇒ continued
  • Fourth: ⇒*_G notation
  • This is used to denote 0 or more applications of
    production rules from a grammar G
  • S ⇒*_ABG S
  • We say that string S derives string S in 0 or
    more steps
  • S ⇒*_ABG aaSbb
  • We say that string S derives string aaSbb in 0 or
    more steps
  • aSb ⇒*_ABG aaSbb
  • We say that string aSb derives string aaSbb in 0
    or more steps
  • aSb ⇒*_ABG aaabbb
  • We say that string aSb derives string aaabbb in 0
    or more steps
  • We often omit the grammar subscript when the
    intended grammar is unambiguous
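
The three ⇒ notations can be checked mechanically. The sketch below is only an illustration, not part of the lecture: it assumes a grammar is stored as a Python dict from each variable to its list of right-hand sides, and the function names `step` and `derives_in` are made up for this example.

```python
# Productions of ABG as given on the slides: S -> aSb | ab.
# Uppercase letters are variables, lowercase letters are terminals.
ABG = {"S": ["aSb", "ab"]}


def step(grammar, form):
    """Return every string reachable from `form` by ONE production application."""
    results = set()
    for i, symbol in enumerate(form):
        for rhs in grammar.get(symbol, []):
            results.add(form[:i] + rhs + form[i + 1:])
    return results


def derives_in(grammar, start, k):
    """Return all strings w such that start derives w in exactly k steps."""
    current = {start}
    for _ in range(k):
        current = {w for form in current for w in step(grammar, form)}
    return current


print(sorted(step(ABG, "S")))             # one-step derivations from S
print(sorted(derives_in(ABG, "S", 2)))    # two-step derivations from S
print(sorted(derives_in(ABG, "aSb", 2)))  # two-step derivations from aSb
```

The "0 or more steps" relation is the union of `derives_in(grammar, start, k)` over all k ≥ 0; with k = 0 the function returns just the starting string, which matches the slide's example that S derives S.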

10
Defining derivations
  • Derivation of a string x
  • The complete step by step derivation of a string
    x from the start variable S
  • Key fact: each step in a derivation makes only
    one application of a production rule from G
  • Example: Derivation of string aaabbb using ABG
  • S ⇒_ABG aSb ⇒_ABG aaSbb ⇒_ABG aaabbb
  • Example 2: AG = (V, Σ, S, P) where P = {S → SS | a}
  • Deriving string aaa
  • S ⇒ SS ⇒ Sa ⇒ SSa ⇒ aSa ⇒ aaa

11
Defining L(G)
  • Generating strings
  • If S ⇒*_G x, then grammar G generates string x
  • Note: G generates strings which contain terminals
    and nonterminals
  • aSb contains nonterminals and terminals
  • S contains only nonterminals
  • aaabbb contains only terminals
  • L(G)
  • The set of strings over Σ generated by grammar G
  • Note: we only consider terminal strings generated
    by G
  • {a^i b^i | i > 0} = L(ABG)
  • {a^i | i > 0} = L(AG)
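
To see the definition of L(G) in action, one can enumerate the terminal strings a grammar generates up to a length bound. This is a minimal sketch under stated assumptions: grammars are dicts from a variable to its right-hand sides, no production has an empty right-hand side (true for ABG and AG), and the helper name `language_up_to` is hypothetical.

```python
from collections import deque

ABG = {"S": ["aSb", "ab"]}   # S -> aSb | ab
AG = {"S": ["SS", "a"]}      # S -> SS | a


def language_up_to(grammar, start, max_len):
    """Terminal strings of length <= max_len derivable from `start`.

    Assumes no production has an empty right-hand side, so sentential
    forms never shrink and forms longer than max_len can be pruned.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        if not any(sym in grammar for sym in form):
            words.add(form)          # only terminals left: a member of L(G)
            continue
        for i, sym in enumerate(form):
            for rhs in grammar.get(sym, []):
                new = form[:i] + rhs + form[i + 1:]
                if len(new) <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return words


print(sorted(language_up_to(ABG, "S", 6), key=len))
print(sorted(language_up_to(AG, "S", 3), key=len))
```

Sentential forms such as aSb are explored along the way but never reported, which mirrors the slide's note that only terminal strings count as members of L(G).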

12
Context-Free Languages
  • Context-Free Languages
  • A language L is a context-free language (CFL) iff
    there exists a CFG G such that L(G) = L
  • Results so far
  • {a^i | i > 0} is a CFL
  • One CFG G such that L(G) = this language is AG
  • Note this language is also regular
  • {a^i b^i | i > 0} is a CFL
  • One CFG G such that L(G) = this language is ABG
  • Note this language is NOT regular

13
Example
  • Let BAL = the set of strings over {(, )} in which
    the parentheses are balanced
  • Prove that BAL is a CFL
  • To prove this, you need to come up with a CFG
    BALG such that L(BALG) = BAL
  • BALG = (V, Σ, S, P)
  • V = {S}
  • Σ = {(, )}
  • S = S
  • P = ?
  • Give derivations of ((( ))) and ( )(( )) with
    your grammar

14
Module 29
  • Parse/Derivation Trees
  • Leftmost derivations, rightmost derivations
  • Ambiguous Grammars
  • Examples
  • Arithmetic expressions
  • If-then-else Statements
  • Inherently ambiguous CFLs

15
Context-Free Grammars
  • Parse Trees
  • Leftmost/rightmost derivations
  • Ambiguous grammars

16
Parse Tree
  • Parse/derivation trees are structured derivations
  • The structure graphically illustrates semantic
    information about the string
  • Formalization of concept we encountered in
    regular languages unit
  • Note, what we saw before were not exactly parse
    trees as we define them now, but they were close

17
Parse Tree Example
  • Parse tree for string ( )(( )) and grammar BALG
  • BALG = (V, Σ, S, P)
  • V = {S}, Σ = {(, )}, S = S
  • P = {S → SS | (S) | λ}
  • One derivation of ( )(( ))
  • S ⇒ SS ⇒ (S)S ⇒ ( )S ⇒ ( )(S) ⇒
    ( )((S)) ⇒ ( )(( ))
  • Parse tree

18
Comments about Example
  • Syntax
  • draw a unique arrow from each variable to each
    character that is a direct child of that variable
  • A line instead of an arrow is ok
  • The derived string can be read in a left to right
    traversal of the leaves
  • Semantics
  • The tree graphically illustrates the nesting
    structure of the string of parentheses

19
Leftmost/Rightmost Derivations
  • There is more than one derivation of the string (
    )(( )).
  • S ⇒ SS ⇒ (S)S ⇒ ( )S ⇒ ( )(S)
    ⇒ ( )((S)) ⇒ ( )(( ))
  • S ⇒ SS ⇒ (S)S ⇒ (S)(S) ⇒ ( )(S)
    ⇒ ( )((S)) ⇒ ( )(( ))
  • S ⇒ SS ⇒ S(S) ⇒ S((S)) ⇒ S(( ))
    ⇒ (S)(( )) ⇒ ( )(( ))
  • Leftmost derivation
  • Leftmost variable is always expanded
  • Which one of the above is leftmost?
  • Rightmost derivation
  • Rightmost variable is always expanded
  • Which one of the above is rightmost?
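
Whether a derivation is leftmost or rightmost can be tested step by step. The following sketch is an illustration only: it assumes the BALG productions S → SS | (S) | λ, writes ( ) without the space, and uses hypothetical helper names.

```python
BALG = {"S": ["SS", "(S)", ""]}   # S -> SS | (S) | lambda


def expanded_positions(grammar, before, after):
    """Indices i such that rewriting the variable before[i] yields `after`."""
    return {
        i
        for i, sym in enumerate(before)
        for rhs in grammar.get(sym, [])
        if before[:i] + rhs + before[i + 1:] == after
    }


def follows_order(grammar, derivation, pick):
    """True if every step expands the variable chosen by `pick` (first or last)."""
    for before, after in zip(derivation, derivation[1:]):
        variables = [i for i, sym in enumerate(before) if sym in grammar]
        if not variables:
            return False
        if pick(variables) not in expanded_positions(grammar, before, after):
            return False
    return True


def is_leftmost(grammar, derivation):
    return follows_order(grammar, derivation, lambda vs: vs[0])


def is_rightmost(grammar, derivation):
    return follows_order(grammar, derivation, lambda vs: vs[-1])


d1 = ["S", "SS", "(S)S", "()S", "()(S)", "()((S))", "()(())"]
d2 = ["S", "SS", "(S)S", "(S)(S)", "()(S)", "()((S))", "()(())"]
d3 = ["S", "SS", "S(S)", "S((S))", "S(())", "(S)(())", "()(())"]
for d in (d1, d2, d3):
    print(is_leftmost(BALG, d), is_rightmost(BALG, d))
```

A derivation can be neither leftmost nor rightmost, which is why the two checks are kept separate.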

20
Comments
  • Fix a string and a grammar
  • Any derivation corresponds to a unique parse tree
  • Any parse tree can correspond to many different
    derivations
  • Example
  • The one parse tree corresponds to all three
    derivations
  • Unique mappings
  • For any parse tree, there is a unique
    leftmost/rightmost derivation that it corresponds
    to
  • S ⇒ SS ⇒ (S)S ⇒ ( )S ⇒ ( )(S)
    ⇒ ( )((S)) ⇒ ( )(( ))
  • S ⇒ SS ⇒ (S)S ⇒ (S)(S) ⇒ ( )(S)
    ⇒ ( )((S)) ⇒ ( )(( ))
  • S ⇒ SS ⇒ S(S) ⇒ S((S)) ⇒ S(( ))
    ⇒ (S)(( )) ⇒ ( )(( ))

21
Example
  • S ⇒ SS ⇒ SSS ⇒ (S)SS ⇒ ( )SS ⇒ ( )S ⇒
    ( )
  • The above is a leftmost derivation of the string
    ( ) from the grammar BALG
  • Draw the corresponding parse tree
  • Draw the corresponding rightmost derivation
  • S ⇒ (S) ⇒ (SS) ⇒ (S(S)) ⇒ (S( )) ⇒
    (( ))
  • The above is a rightmost derivation of the string
    (( )) from the grammar BALG
  • Draw the corresponding parse tree
  • Draw the corresponding leftmost derivation

22
Ambiguous Grammars
  • Examples
  • Arithmetic Expressions
  • If-then-else statements
  • Inherently ambiguous grammars

23
Ambiguous Grammars
  • A grammar G is ambiguous if there exists a string
    x in L(G) with two or more distinct parse trees
  • (2 or more distinct leftmost/rightmost
    derivations)
  • Example
  • Grammar AG is ambiguous
  • String aaa in L(AG) has 2 rightmost derivations
  • S ⇒ SS ⇒ SSS ⇒ SSa ⇒ Saa ⇒ aaa
  • S ⇒ SS ⇒ Sa ⇒ SSa ⇒ Saa ⇒ aaa
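
The ambiguity of AG can be quantified by counting parse trees. The sketch below assumes the productions S → SS | a from the earlier slide; the function name `parse_trees` is made up for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def parse_trees(n):
    """Number of distinct parse trees of the string a^n under S -> SS | a (n >= 1)."""
    if n == 1:
        return 1                      # the only tree is S -> a
    # Root uses S -> SS: the left child yields a^k, the right child a^(n-k).
    return sum(parse_trees(k) * parse_trees(n - k) for k in range(1, n))


for n in range(1, 6):
    print(n, parse_trees(n))
```

For n = 3 the count is 2, which agrees with the two rightmost derivations of aaa on the slide, since each parse tree corresponds to exactly one rightmost derivation.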

24
2 Simple Examples
  • Grammar BALG is ambiguous
  • String ( ) in L(BALG) has >1 leftmost derivation
  • S ⇒ (S) ⇒ ( )
  • S ⇒ (S) ⇒ (SS) ⇒ (S) ⇒ ( )
  • Give another leftmost derivation of ( ) from BALG
  • Grammar ABG is NOT ambiguous
  • Consider any string x in {a^i b^i | i > 0}
  • There is a unique parse tree for x

25
Legal Arithmetic Expressions
  • Develop a grammar MATHG = (V, Σ, S, P) for the
    language of legal arithmetic expressions
  • Σ = {0, 1, +, *, -, /, (, )}
  • Strings in the language include
  • 0
  • 10
  • 1011111100
  • 10(11111100)
  • Strings not in the language include
  • 10
  • 11101
  • )(

26
Grammar MATHG1
  • V = {E, N}
  • Σ = {0, 1, +, *, -, /, (, )}
  • S = E
  • P
  • E → N | E+E | E*E | E/E | E-E | (E)
  • N → N0 | N1 | 0 | 1

27
MATHG1 is ambiguous
E → N | E+E | E*E | E/E | E-E | (E)
N → N0 | N1 | 0 | 1
  • Come up with two distinct leftmost derivations of
    the string 11+0*11
  • E ⇒ E+E ⇒ N+E ⇒ N1+E ⇒ 11+E ⇒ 11+E*E
    ⇒ 11+N*E ⇒ 11+0*E ⇒ 11+0*N ⇒ 11+0*N1 ⇒
    11+0*11
  • E ⇒ E*E ⇒ E+E*E ⇒ N+E*E ⇒ N1+E*E ⇒
    11+E*E ⇒ 11+N*E ⇒ 11+0*E ⇒ 11+0*N ⇒
    11+0*N1 ⇒ 11+0*11
  • Draw the corresponding parse trees

28
Corresponding Parse Trees
  • E ⇒ E+E ⇒ N+E ⇒ N1+E ⇒ 11+E ⇒ 11+E*E
    ⇒ 11+N*E ⇒ 11+0*E ⇒ 11+0*N ⇒ 11+0*N1 ⇒
    11+0*11
  • E ⇒ E*E ⇒ E+E*E ⇒ N+E*E ⇒ N1+E*E ⇒
    11+E*E ⇒ 11+N*E ⇒ 11+0*E ⇒ 11+0*N ⇒
    11+0*N1 ⇒ 11+0*11

[Figure: the two corresponding parse trees, each rooted at E]
29
Parse Tree Meanings
Note how the parse trees capture the semantic
meaning of string 11+0*11. More specifically,
what number does the first parse tree
represent? What number does the second parse
tree represent?
30
Implications
  • Two interpretations of string 11+0*11
  • 11+(0*11) = 11
  • (11+0)*11 = 1001
  • What if a line in a program is
  • MSU_Tuition = 11+0*11
  • What is MSU_Tuition?
  • Depends on how the expression 11+0*11 is parsed.
  • This is not good.
  • Ambiguity in grammars is undesirable,
    particularly if the grammar is used to develop a
    compiler for a programming language like C.
  • In this case, there is an unambiguous grammar for
    the language of arithmetic expressions
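
The two interpretations can be reproduced by evaluating the two parse trees directly. This sketch assumes the expression is 11+0*11 over binary numerals; the operators + and * are an assumption, chosen because they are consistent with the two values 11 and 1001 given on the slide. The tuple encoding of trees is hypothetical.

```python
# A parse tree is either a binary numeral (str) or a tuple (op, left, right).
tree_1 = ("+", "11", ("*", "0", "11"))   # 11+(0*11)
tree_2 = ("*", ("+", "11", "0"), "11")   # (11+0)*11


def evaluate(tree):
    """Evaluate a parse tree whose leaves are binary numerals."""
    if isinstance(tree, str):
        return int(tree, 2)
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b


for tree in (tree_1, tree_2):
    print(format(evaluate(tree), "b"))   # print the value in binary
```

The same string yields two different numbers, so a compiler built from this grammar would need an extra rule (such as operator precedence) to pick one tree.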

31
If-Then-Else Statements
  • A grammar ITEG = (V, Σ, S, P) for the language of
    legal If-Then-Else statements
  • V = {S, BOOL}
  • Σ = {D<85, D>50, grade=3.5, grade=3.0, if, then,
    else}
  • S = S
  • P
  • S → if BOOL then S else S | if BOOL then S |
    grade=3.5 | grade=3.0
  • BOOL → D<85 | D>50

32
ITEG is ambiguous
S → if BOOL then S | grade=3.5 | grade=3.0 | if BOOL then S else S
BOOL → D<85 | D>50
  • Come up with two distinct leftmost derivations of
    the string
  • if D<85 then if D>50 then grade=3.5 else
    grade=3.0
  • S ⇒ if BOOL then S else S ⇒ if D<85 then S
    else S ⇒ if D<85 then if BOOL then S else S ⇒
    if D<85 then if D>50 then S else S ⇒ if D<85
    then if D>50 then grade=3.5 else S ⇒ if D<85
    then if D>50 then grade=3.5 else grade=3.0
  • S ⇒ if BOOL then S ⇒ if D<85 then S ⇒ if
    D<85 then if BOOL then S else S ⇒ if D<85 then
    if D>50 then S else S ⇒ if D<85 then if D>50
    then grade=3.5 else S ⇒ if D<85 then if D>50
    then grade=3.5 else grade=3.0
  • Draw the corresponding parse trees

33
Corresponding Parse Trees
  • S ⇒ if BOOL then S else S ⇒ if D<85 then S
    else S ⇒ if D<85 then if BOOL then S else S ⇒
    if D<85 then if D>50 then S else S ⇒ if D<85
    then if D>50 then grade=3.5 else S ⇒ if D<85
    then if D>50 then grade=3.5 else grade=3.0
  • S ⇒ if BOOL then S ⇒ if D<85 then S ⇒ if
    D<85 then if BOOL then S else S ⇒ if D<85 then
    if D>50 then S else S ⇒ if D<85 then if D>50
    then grade=3.5 else S ⇒ if D<85 then if D>50
    then grade=3.5 else grade=3.0

[Figure: the two corresponding parse trees, each rooted at S]
34
Parse Tree Meanings
[Figure: the two parse trees for the string "if D<85 then if D>50 then grade=3.5 else grade=3.0". In one tree the else is a child of the outer if; in the other it is a child of the inner if.]
If you receive a 90 on type D points, what is
your grade? By parse tree 1? By parse tree 2?
35
Implications
  • Two interpretations of string
  • if D<85 then if D>50 then grade=3.5 else
    grade=3.0
  • Issue is: which if-then does the last ELSE attach
    to?
  • This phenomenon is known as the dangling else
  • Answer: Typically, else binds to NEAREST if-then
  • In this case, there is an unambiguous grammar for
    handling if-thens as well as if-then-elses
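
The two readings of the dangling else can be written out as ordinary code. This is a sketch under the assumption that the terminals stand for the tests D<85 and D>50 and the assignments grade=3.5 and grade=3.0; the function names are made up, and `None` stands for "no grade assigned".

```python
def grade_else_on_outer(d):
    """Reading 1: if D<85 then (if D>50 then grade=3.5) else grade=3.0"""
    grade = None
    if d < 85:
        if d > 50:
            grade = 3.5
    else:
        grade = 3.0
    return grade


def grade_else_on_inner(d):
    """Reading 2: if D<85 then (if D>50 then grade=3.5 else grade=3.0)"""
    grade = None
    if d < 85:
        if d > 50:
            grade = 3.5
        else:
            grade = 3.0
    return grade


for d in (90, 70, 40):
    print(d, grade_else_on_outer(d), grade_else_on_inner(d))
```

Reading 2 is the "else binds to the nearest if-then" convention; the two readings disagree whenever D is at least 85 or at most 50.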

36
Inherently ambiguous CFLs
  • A CFL L is inherently ambiguous iff for all CFGs
    G such that L(G) = L, G is ambiguous
  • Examples so far
  • None of the CFLs we've seen so far are
    inherently ambiguous
  • While some of the CFGs we've seen are ambiguous,
    there do exist unambiguous CFGs for those CFLs.
  • Later result
  • There exist inherently ambiguous CFLs
  • Example: {a^i b^j c^k | i=j or j=k or i=j=k}
  • Note: i=j=k is unnecessary, but I added it here
    for clarity

37
Summary
  • Parse trees illustrate semantic information
    about strings
  • Ambiguous grammars are undesirable
  • This means there are multiple parse trees for
    some string
  • These strings can be interpreted in multiple ways
  • There are some heuristics people use for taking
    an ambiguous grammar and making it unambiguous,
    but this is not the focus of this course
  • There are some inherently ambiguous CFLs
  • Thus, the above heuristics do not always work

38
Module 30
  • EQUAL language
  • Designing a CFG
  • Proving the CFG is correct

39
EQUAL language
  • Designing a CFG

40
EQUAL
  • EQUAL is the set of strings over {a,b} with an
    equal number of a's and b's
  • Strings in EQUAL include
  • aabbab
  • bbbaaa
  • abba
  • Strings in {a,b}* not in EQUAL include
  • aaa
  • bbb
  • aab
  • ababa

41
Designing a CFG for EQUAL
  • Think recursively
  • Base Case
  • What is the shortest possible string in EQUAL?
  • Production Rule

42
Recursive Case
  • Recursive Case
  • Now consider a longer string x in EQUAL
  • Since x has length > 0, x must have a first
    character
  • This must be a or b
  • Two possibilities for what x looks like
  • x = ay
  • What must be true about relative number of a's
    and b's in y?
  • x = bz
  • What must be true about relative number of a's
    and b's in z?

43
Case 1: x = ay
  • x = ay where y has one extra b
  • What must y look like?
  • Some examples
  • b
  • babba
  • aabbbab
  • aaabbbb
  • Is there a general pattern that applies to all of
    the above examples?
  • More specifically, show how we can decompose all
    of the above strings y into 3 pieces, two of
    which belong to EQUAL.
  • Some of these pieces might be the empty string λ

44
Decomposing y
  • y has one extra b
  • Possible examples
  • b, babba, aabbbab, aaabbbb
  • Decomposition
  • y = ubv where
  • u and v both have an equal number of a's and b's
  • Decompose the 4 strings above into u, b, v
  • λ b λ, aabb b ab, λ b abba, aaabbb b λ

45
Implication
  • Case 1: x = ay
  • y has one extra b
  • Case 1 refined: x = aubv
  • u, v belong to EQUAL
  • Production rule for this case?

46
Case 2: x = bz
  • Case 2: x = bz
  • z has one extra a
  • Case 2 refined: x = buav
  • u, v belong to EQUAL
  • Production rule for this case?

47
Final Grammar
  • EG = (V, Σ, S, P)
  • V = {S}
  • Σ = {a,b}
  • S = S
  • P

48
EQUAL language
  • Proving CFG is correct

49
Is our grammar correct?
  • How do we prove our grammar is correct?
  • Informal
  • Test some strings
  • Review logic behind program (CFG) design
  • Formal
  • First, show every string derived by EG belongs to
    EQUAL
  • That is, show L(EG) is a subset of EQUAL
  • Second, show every string in EQUAL can be derived
    by EG
  • That is, show EQUAL is a subset of L(EG)
  • Both proofs will be inductive proofs
  • Inductive proofs and recursive algorithms go well
    together
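
The informal "test some strings" step can be automated for short strings. The sketch below assumes EG has the productions S → aSbS | bSaS | λ; this is inferred from the derivation steps S ⇒ aSbS and S ⇒ bSaS and the empty-string base case used elsewhere in this module, not stated on the Final Grammar slide. It is a sanity check, not a substitute for the inductive proof.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def generated(n):
    """Strings of length n derivable from S using S -> aSbS | bSaS | lambda."""
    if n == 0:
        return frozenset({""})        # S -> lambda
    words = set()
    for first, second in (("a", "b"), ("b", "a")):
        for len_u in range(n - 1):    # |u| + |v| = n - 2
            len_v = n - 2 - len_u
            for u in generated(len_u):
                for v in generated(len_v):
                    words.add(first + u + second + v)
    return frozenset(words)


def equal_strings(n):
    """All strings of length n over {a,b} with as many a's as b's."""
    return {
        "".join(p)
        for p in product("ab", repeat=n)
        if p.count("a") == p.count("b")
    }


for n in range(7):
    print(n, generated(n) == equal_strings(n))
```

Agreement on every length tried supports both containments, L(EG) ⊆ EQUAL and EQUAL ⊆ L(EG), but only for the finitely many strings checked.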

50
L(EG) subset of EQUAL
  • Let x be an arbitrary string in L(EG)
  • What does this mean?
  • S ⇒*_EG x
  • Follows from definition of x in L(EG)
  • We will prove the following
  • If S ⇒^1_EG x, then x is in EQUAL
  • If S ⇒^2_EG x, then x is in EQUAL
  • If S ⇒^3_EG x, then x is in EQUAL
  • If S ⇒^4_EG x, then x is in EQUAL
  • ...

51
Base Case
  • Statement to be proven
  • For all n ≥ 1, if S ⇒^n_EG x, then x is in EQUAL
  • Prove this by induction on n
  • Base Case
  • n = 1
  • What is the set of strings {x | S ⇒^1_EG x}?
  • What do we need to prove about this set of
    strings?

52
Inductive Case
  • Inductive Hypothesis
  • For 1 ≤ j ≤ n, if S ⇒^j_EG x, then x is in EQUAL
  • Note, this is a strong induction hypothesis
  • Traditional inductive hypothesis would take form
  • For some n ≥ 1, if S ⇒^n_EG x, then x is in EQUAL
  • The difference is we assume the basic hypothesis
    for all integers between 1 and n, not just n
  • Statement to be Proven in Inductive Case
  • If S ⇒^(n+1)_EG x, then x is in EQUAL

53
Regular induction vs Strong induction
  • Infinite Set of Facts
  • Fact 1
  • Fact 2
  • Fact 3
  • Fact 4
  • Fact 5
  • Fact 6
  • Base Case
  • Prove fact 1
  • Regular inductive case
  • For n ≥ 1,
  • Fact n --> Fact n+1
  • Strong inductive case
  • For n ≥ 1,
  • Fact 1 to Fact n --> Fact n+1

54
Visualization of Induction
[Figure: two side-by-side columns, "Regular Induction" and "Strong Induction", each listing Fact 1 through Fact 9]


55
Proving Inductive Case
  • If S ⇒^(n+1)_EG x, then x is in EQUAL
  • Let x be an arbitrary string such that S ⇒^(n+1)_EG
    x
  • Examining EG, what are the three possible first
    derivation steps?
  • Case 1: S ⇒ ___ ⇒^n_EG x
  • Case 2: S ⇒ ___ ⇒^n_EG x
  • Case 3: S ⇒ ___ ⇒^n_EG x
  • One of the cases is impossible. Which one and
    why?

56
Case 2: S ⇒ ___ ⇒^n_EG x
  • This means x has the form aubv where
  • What can we conclude about u (don't apply IH)?
  • What can we conclude about v (don't apply IH)?
  • Apply the inductive hypothesis
  • u and v belong to EQUAL
  • Why do we need the strong inductive hypothesis?
  • Conclude x belongs to EQUAL
  • x = aubv where u and v belong to EQUAL
  • Clearly the number of a's in x equals the number
    of b's in x

57
Case 3: S ⇒ ___ ⇒^n_EG x
  • This means x has the form buav where
  • What can we conclude about u (no IH)?
  • What can we conclude about v (no IH)?
  • Apply the inductive hypothesis
  • u and v belong to EQUAL
  • Why do we need the strong inductive hypothesis?
  • Conclude x belongs to EQUAL
  • x = buav where u and v belong to EQUAL
  • Clearly the number of a's in x equals the number
    of b's in x

58
L(EG) subset of EQUAL
  • Wrapping up inductive case
  • In all possible derivations of x, we have shown
    that x belongs to EQUAL
  • Thus, we have proven the inductive case
  • Conclusion
  • By the principle of mathematical induction, we
    have shown that L(EG) is a subset of EQUAL

59
EQUAL subset of L(EG)
  • Let x be an arbitrary string in EQUAL
  • What does this mean?
  • We will prove the following
  • If |x| = 0 and x is in EQUAL, then x is in L(EG)
  • If |x| = 1 and x is in EQUAL, then x is in L(EG)
  • If |x| = 2 and x is in EQUAL, then x is in L(EG)
  • If |x| = 3 and x is in EQUAL, then x is in L(EG)
  • ...

60
EQUAL subset of L(EG)
  • Statement to be proven
  • For all n ≥ 0, if |x| = n and x is in EQUAL, then
    x is in L(EG)
  • Prove this by induction on n
  • Base Case
  • n = 0
  • What is the only string x such that |x| = 0 and x
    is in EQUAL?
  • Prove this string belongs to L(EG)

61
Inductive Case
  • Inductive Hypothesis
  • For 0 ≤ j ≤ n, if |x| = j and x is in EQUAL, then
    x is in L(EG)
  • Again, this is a strong induction hypothesis
  • Statement to be Proven in Inductive Case
  • For n ≥ 0,
  • if |x| = n+1 and x is in EQUAL, then x is in L(EG)

62
Proving Inductive Case
  • If |x| = n+1 and x is in EQUAL, then x is in L(EG)
  • Let x be an arbitrary string such that |x| = n+1
    and x is in EQUAL
  • Examining Σ, what are the two possibilities for
    the first character in x?
  • Case 1: first character in x is
  • Case 2: first character in x is
  • In each case, what can we say about the remainder
    of x?
  • Case 1 the remainder of x
  • Case 2 the remainder of x

63
Case 1: x = ay
  • What can we say about y in this case?
  • This means x has the form aubv where
  • u is in EQUAL and has length ≤ n
  • v is in EQUAL and has length ≤ n
  • Proving this statement true
  • Consider all the prefixes of string y
  • length 0: λ
  • length 1: y1
  • length 2: y1y2
  • length n: y1y2...yn = y

64
Case 1 x ay
  • Consider all the prefixes of string y
  • length 0: λ
  • length 1: y1
  • length 2: y1y2
  • length n: y1y2...yn = y
  • The first prefix λ has the same number of a's as
    b's
  • The last prefix y has one extra b
  • The relative number of a's and b's in the length
    i prefix differs by only one from that of the
    length i-1 prefix
  • Thus, there must be a first prefix t of y where t
    has one extra b
  • Furthermore, the last character of t must be b
  • Otherwise, t would not be the FIRST prefix of y
    with one extra b
  • Break t into u and b and let the remainder of y
    be v
  • The statement follows
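
The prefix argument above is constructive, so it can be turned into a short procedure. This sketch assumes its input y has exactly one more b than a; the function name `decompose` is hypothetical.

```python
def decompose(y):
    """Split y (one more b than a) as u + 'b' + v with u and v in EQUAL.

    Scans prefixes of y and stops at the first prefix t that has one
    extra b; t must end in b, so t = u + 'b' and the rest of y is v.
    """
    extra_b = 0                       # (#b - #a) in the prefix read so far
    for i, ch in enumerate(y):
        extra_b += 1 if ch == "b" else -1
        if extra_b == 1:
            return y[:i], y[i + 1:]
    raise ValueError("y does not have one extra b")


for y in ("b", "babba", "aabbbab", "aaabbbb"):
    u, v = decompose(y)
    print(repr(u), "b", repr(v))
```

Because the count changes by exactly one per character and starts at 0, it cannot reach the final value 1 without first hitting 1 on a b, which is why the loop always returns for valid input.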

65
Case 1: x = aubv
  • x = aubv
  • u is in EQUAL and has length ≤ n
  • v is in EQUAL and has length ≤ n
  • Apply the induction hypothesis
  • What can we conclude from applying the IH?
  • Why did we need a strong inductive hypothesis?
  • Conclude x is in L(EG) by constructing a
    derivation
  • S ⇒ aSbS ⇒*_EG aubS ⇒*_EG aubv

66
Case 2: x = buav
  • x = buav
  • u is in EQUAL and has length ≤ n
  • v is in EQUAL and has length ≤ n
  • Apply the induction hypothesis
  • What can we conclude about u and v?
  • Conclude x is in L(EG) by constructing a
    derivation
  • S ⇒ bSaS ⇒*_EG buaS ⇒*_EG buav
  • Justify each of the steps in this derivation

67
EQUAL subset of L(EG)
  • Wrapping up inductive case
  • For all possible first characters of x, we have
    shown that x belongs to L(EG)
  • Thus, we have proven the inductive case
  • Conclusion
  • By the principle of mathematical induction, we
    have shown that EQUAL is a subset of L(EG)

68
Module 31
  • Closure Properties for CFLs
  • Kleene Closure
  • construction
  • examples
  • proof of correctness
  • Others covered less thoroughly in lecture
  • union, concatenation
  • CFLs versus regular languages
  • regular languages subset of CFL

69
Closure Properties for CFLs
  • Kleene Closure

70
CFL closed under Kleene Closure
  • Let L be an arbitrary CFL
  • Let G1 be a CFG s.t. L(G1) = L
  • G1 exists by definition of L being a CFL
  • Construct CFG G2 from CFG G1
  • Argue L(G2) = L*
  • There exists CFG G2 s.t. L(G2) = L*
  • L* is a CFL

71
Visualization
  • Let L be an arbitrary CFL
  • Let G1 be a CFG s.t. L(G1) = L
  • G1 exists by definition of L being a CFL
  • Construct CFG G2 from CFG G1
  • Argue L(G2) = L*
  • There exists CFG G2 s.t. L(G2) = L*
  • L* is a CFL

[Figure: diagram of the language class CFL]
72
Algorithm Specification
  • Input
  • CFG G1
  • Output
  • CFG G2 such that L(G2) = (L(G1))*

[Figure: the algorithm takes CFG G1 as input and produces CFG G2]
73
Construction
  • Input
  • CFG G1 = (V1, Σ, S1, P1)
  • Output
  • CFG G2 = (V2, Σ, S2, P2)
  • V2 = V1 ∪ {T}
  • T is a new symbol not in V1 or Σ
  • S2 = T
  • P2 = P1 ∪ ??

74
Closure Properties for CFLs
  • Kleene Closure Examples

75
Example 1
V2 = V1 ∪ {T}; T is a new symbol not in V1 or Σ
S2 = T; P2 = P1 ∪ {T → ST | λ}
  • Input grammar
  • V = {S}
  • Σ = {a,b}
  • S = S
  • P
  • S → aa | ab | ba | bb
  • Output grammar
  • V =
  • Σ = {a,b}
  • Start symbol is
  • P

76
Example 2
V2 = V1 ∪ {T}; T is a new symbol not in V1 or Σ
S2 = T; P2 = P1 ∪ {T → ST | λ}
  • Input grammar
  • V = {S, T}
  • Σ = {a,b}
  • Start symbol is T
  • P
  • T → ST | λ
  • S → aa | ab | ba | bb
  • Output grammar
  • V =
  • Σ = {a,b}
  • Start symbol is
  • P
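
The construction can be sketched as a function from one grammar to another. Assumptions in this illustration: a grammar is passed as four values (variables, terminals, start variable, productions), each right-hand side is a tuple of symbols with `()` for λ, and a fresh start name is obtained by appending primes to "T" until it is unused. The function name `kleene_star` is hypothetical.

```python
def kleene_star(variables, terminals, start, productions):
    """Build G2 with L(G2) = (L(G1))* by adding T -> S T | lambda,
    where S is the start variable of G1 and T is a fresh variable."""
    new_start = "T"
    while new_start in variables or new_start in terminals:
        new_start += "'"              # keep priming until the name is unused
    new_productions = {var: list(rules) for var, rules in productions.items()}
    new_productions[new_start] = [(start, new_start), ()]
    return variables | {new_start}, terminals, new_start, new_productions


# Example 1: V = {S}, start S, S -> aa | ab | ba | bb
g1 = ({"S"}, {"a", "b"}, "S",
      {"S": [("a", "a"), ("a", "b"), ("b", "a"), ("b", "b")]})
print(kleene_star(*g1))

# Example 2: T is already a variable, so a fresh name is needed.
g2 = ({"S", "T"}, {"a", "b"}, "T",
      {"T": [("S", "T"), ()],
       "S": [("a", "a"), ("a", "b"), ("b", "a"), ("b", "b")]})
print(kleene_star(*g2))
```

Example 2 shows why the slide insists that T be a new symbol: reusing the existing T would merge the new rules with the input grammar's own rules for T.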

77
Closure Properties for CFLs
  • Kleene Closure Proof of Correctness

78
Is our construction correct?
  • How do we prove our construction is correct?
  • Informal
  • Test some strings
  • Review logic behind construction
  • Formal
  • First, show every string derived by G2 belongs to
    (L(G1))*
  • That is, show L(G2) is a subset of (L(G1))*
  • Second, show every string in (L(G1))* can be
    derived by G2
  • That is, show (L(G1))* is a subset of L(G2)
  • Both proofs will be inductive proofs
  • Inductive proofs and recursive algorithms go well
    together

79
L(G2) is a subset of (L(G1))*
  • We want to prove the following
  • If x in L(G2), then x is in (L(G1))*
  • This is equivalent to the following
  • If T ⇒*_G2 x, then x is in (L(G1))*
  • The two statements are equivalent because
  • x in L(G2) means that T ⇒*_G2 x
  • We break the second statement down as follows
  • If T ⇒^1_G2 x, then x is in (L(G1))*
  • If T ⇒^2_G2 x, then x is in (L(G1))*
  • If T ⇒^3_G2 x, then x is in (L(G1))*
  • ...

80
L(G2) is a subset of (L(G1))*
  • Statement to be proven
  • For all n ≥ 1, if T ⇒^n_G2 x, then x is in
    (L(G1))*
  • Prove this by induction on n
  • Base Case
  • n = 1
  • Examining grammar G2, what is the only string x
    such that T ⇒^1_G2 x ?
  • Prove this string is in (L(G1))*

81
Inductive Case
  • Inductive Hypothesis
  • For 1 ≤ j ≤ n, if T ⇒^j_G2 x, then x is in
    (L(G1))*
  • Note, this is a strong induction hypothesis
  • Statement to be Proven in Inductive Case
  • For n above, if T ⇒^(n+1)_G2 x, then x is in
    (L(G1))*
  • Proving this statement
  • Let x be an arbitrary string such that T ⇒^(n+1)_G2
    x
  • Examining G2, what are the two possible first
    derivation steps?
  • Case 1: T ⇒_G2 ___ ⇒^n_G2 x
  • Case 2: T ⇒_G2 ___ ⇒^n_G2 x

82
Case Analysis
  • Case 1: T ⇒_G2 ___ ⇒^n_G2 x is not possible
  • Why not?
  • Case 2: T ⇒_G2 ___ ⇒^n_G2 x
  • This means x has the form uv where
  • What can we say about u (no IH)?
  • What can we say about v (no IH)?
  • Applying the inductive hypothesis, what can we
    conclude?

83
Concluding Case 2: T ⇒_G2 ___ ⇒^n_G2 x
  • Concluding string u belongs to L(G1)
  • Follows from S ⇒*_G2 u and
  • Our construction ensures that all strings derived
    from S in G2 are also in L(G1)
  • How do we conclude that x belongs to (L(G1))*?
  • Wrapping up inductive case
  • In all possible derivations of x, we have shown
    that x belongs to (L(G1))*
  • Thus, we have proven the inductive case
  • Conclusion
  • By the principle of mathematical induction, we
    have shown that L(G2) is a subset of (L(G1))*

84
(L(G1))* is a subset of L(G2)
  • We want to prove the following
  • If x is in (L(G1))*, then x is in L(G2)
  • This is equivalent to the following
  • If x is in (L(G1))*, then T ⇒*_G2 x
  • The two statements are equivalent because
  • x in L(G2) means that T ⇒*_G2 x
  • We break the second statement down as follows
  • If x is in (L(G1))^0, then T ⇒*_G2 x
  • If x is in (L(G1))^1, then T ⇒*_G2 x
  • If x is in (L(G1))^2, then T ⇒*_G2 x
  • ...

85
(L(G1))* is a subset of L(G2)
  • Statement to be proven
  • For all n ≥ 0, if x is in (L(G1))^n, then x is in
    L(G2)
  • Prove this by induction on n
  • Base Case
  • n = 0
  • What is the only string x in (L(G1))^0?
  • Show this string belongs to L(G2)

86
Inductive Case
  • Inductive Hypothesis
  • For some n ≥ 0, if x is in (L(G1))^n, then T ⇒*_G2 x
  • Note, this is a normal induction hypothesis
  • Statement to be Proven in Inductive Case
  • For n ≥ 0, if x is in (L(G1))^(n+1), then T ⇒*_G2
    x
  • Proving this statement
  • Let x be an arbitrary string in (L(G1))^(n+1)
  • This means x = uv where
  • u in L(G1)
  • What can we say about v?

87
Deriving x
  • x = uv where
  • u is a string in L(G1)
  • v is a string in
  • Justify all the steps in the following derivation
  • T ⇒_G2 ST ⇒*_G2 Sv ⇒*_G2 uv = x
  • First step
  • Second step
  • Third step
  • Thus T ⇒*_G2 x
  • The inductive case follows
  • The result is proven by the principle of
    mathematical induction

88
Construction for Set Union
  • Input
  • CFG G1 = (V1, Σ, S1, P1)
  • CFG G2 = (V2, Σ, S2, P2)
  • Output
  • CFG G3 = (V3, Σ, S3, P3)
  • V3 = V1 ∪ V2 ∪ {T}
  • Variable renaming to ensure no names shared
    between V1 and V2
  • T is a new symbol not in V1 or V2 or Σ
  • S3 T
  • P3

89
Construction for Set Concatenation
  • Input
  • CFG G1 = (V1, Σ, S1, P1)
  • CFG G2 = (V2, Σ, S2, P2)
  • Output
  • CFG G3 = (V3, Σ, S3, P3)
  • V3 = V1 ∪ V2 ∪ {T}
  • Variable renaming to ensure no names shared
    between V1 and V2
  • T is a new symbol not in V1 or V2 or Σ
  • S3 T
  • P3

90
CFLs and regular languages
91
CFL Closure Properties
  • What have we just proven
  • CFLs are closed under Kleene closure
  • CFLs are closed under set union
  • CFLs are closed under set concatenation
  • What can we conclude from these 3 results?
  • It follows that regular languages are a subset of
    CFLs

92
Regular languages subset of CFL
  • Recursive definition of regular languages
  • Base Case
  • {}, {λ}, {a}, {b} are regular languages over
    {a,b}
  • P = {}, P = {S → λ}, P = {S → a}, P = {S → b}
  • Inductive Case
  • If L1 and L2 are regular languages, then L1*,
    L1L2, L1 ∪ L2 are regular languages
  • Use previous constructions to see that these
    resulting languages are also context-free

93
Other CFL Closure Properties
  • We will show that CFLs are NOT closed under many
    other set operations
  • Examples include
  • set complement
  • set intersection
  • set difference

94
Language class hierarchy
REG
95
Module 32
  • Pushdown Automata (PDAs)
  • definition
  • Example
  • We define configurations and computations of
    PDAs
  • We define L(M) for PDAs

96
Pushdown Automata
  • Definition and Motivating Example

97
Pushdown Automata (PDA)
  • In this presentation we introduce the PDA model
    of computation (programming language).
  • The key addition to a PDA (from an NFA-/\) is the
    addition of external memory in the form of an
    infinite capacity stack
  • The word "pushdown" comes from the stacks of
    trays in cafeterias where you have to push down
    on the stack to add a tray to it.

98
NFA for {a^m b^n | m,n ≥ 0}
  • Consider the language {a^n b^n | n ≥ 0}.
  • This NFA can recognize strings which have the
    correct form,
  • a's followed by b's.
  • However, the NFA cannot remember the relative
    number of a's and b's seen at any point in time.
  • What strings end up in each state of the above
    NFA?
  • I
  • B
  • C

99
PDA for {a^n b^n | n ≥ 0}
  • Imagine we now have memory in the form of a stack
    which we can use to help remember how many a's we
    have seen by pushing onto and popping from the
    stack
  • When we see an a in state I, we do the following
    two actions
  • 1) We push an a on the stack.
  • 2) We stay in state I.
  • When we see a b in state B, we do the following
    two actions
  • 1) We pop an a from the stack.
  • 2) We stay in state B.
  • From state B, we allow a /\-transition to state C
    only if
  • 1) The stack is empty.
  • Finally, when we begin, the stack should be empty.

100
Formal PDA definition
  • PDA M = (Q, Σ, Γ, q0, Z, A, δ)
  • Modified elements
  • Γ is the stack alphabet
  • Z is a special character that is initially on the
    stack
  • Often used to represent an empty stack
  • δ is modified as follows
  • Pop to read the top character on the stack
  • Stack update action
  • What to push back on the stack
  • If we push /\, then the net result of the action
    is a pop

101
Example PDA
  • Q = {I, B, C}
  • Σ = {a, b}
  • Γ = {Z, a}
  • q0 = I
  • Z is the initial stack character
  • A = {C}
  • δ
  • State  Input  TopSt  NextState  Stack update
  • I      a      a      I          push aa
  • I      a      Z      I          push aZ
  • I      /\     a      B          push a
  • I      /\     Z      B          push Z
  • B      b      a      B          push /\
  • B      /\     Z      C          push Z
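
The example PDA can be run on small inputs with a short simulator. This is a minimal sketch, not the course's own code: it assumes the six transitions listed above, represents the stack as a string with the top at the left, writes a /\-move as the empty string, and explores all reachable configurations breadth-first. The names `DELTA`, `ACCEPTING`, and `accepts` are made up for the illustration.

```python
from collections import deque

# (state, input symbol or "" for a /\-move, top of stack)
#   -> list of (next state, string pushed in place of the popped top)
DELTA = {
    ("I", "a", "a"): [("I", "aa")],
    ("I", "a", "Z"): [("I", "aZ")],
    ("I", "", "a"): [("B", "a")],
    ("I", "", "Z"): [("B", "Z")],
    ("B", "b", "a"): [("B", "")],    # pushing nothing is a net pop
    ("B", "", "Z"): [("C", "Z")],
}
ACCEPTING = {"C"}


def accepts(word):
    """True if some computation reaches (f, empty input, any stack) with f accepting."""
    start = ("I", word, "Z")
    seen = {start}
    queue = deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state in ACCEPTING and rest == "":
            return True
        if not stack:
            continue                  # no top character to pop: no move applies
        top, below = stack[0], stack[1:]
        moves = [("", rest)]          # a /\-move consumes no input
        if rest:
            moves.append((rest[0], rest[1:]))
        for symbol, remaining in moves:
            for next_state, pushed in DELTA.get((state, symbol, top), []):
                config = (next_state, remaining, pushed + below)
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False


for w in ("", "ab", "aabb", "aab", "ba", "abab"):
    print(repr(w), accepts(w))
```

Each queue entry is a configuration in the sense used in this module (current state, remaining input, stack contents), so the set `seen` is exactly the set of nodes of the computation graph for the input.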

102
Computing with PDAs
  • Configurations change compared with NFA-/\s
  • Configuration components
  • current state
  • remaining input to be processed
  • stack contents
  • Computations are essentially the same as with
    NFA-/\s given the modified configurations
  • Determining which transitions of a PDA can be
    applied to a given configuration is more
    complicated though

103
Computation Graph of PDA
Computation graph for this PDA on the input
string aabb
  • Q = {I, B, C}, Σ = {a, b}, Γ = {Z, a}, q0 = I, Z is
    the initial stack character, A = {C}
  • δ
  • State  Input  TopSt  NextState  Stack update
  • I      a      a      I          push aa
  • I      a      Z      I          push aZ
  • I      /\     a      B          push a
  • I      /\     Z      B          push Z
  • B      b      a      B          push /\
  • B      /\     Z      C          push Z
[Figure: computation graph rooted at the configuration (I,aabb,Z)]
104
Definition of ⊢
Input string aabb
  • (I, aabb, Z) ⊢ (I, abb, aZ)
  • (I, aabb, Z) ⊢ (B, aabb, Z)
  • (I, aabb, Z) ⊢^2 (C, aabb, Z)
  • (I, aabb, Z) ⊢^3 (B, bb, aaZ)
  • (I, aabb, Z) ⊢* (B, abb, aZ)
  • (I, aabb, Z) ⊢* (B, /\, Z)
  • (I, aabb, Z) ⊢* (C, /\, Z)
105
Acceptance and Rejection
Input string aabb
  • M accepts string x if one of the configurations
    reached is an accepting configuration
  • (q0, x, Z) ⊢* (f, /\, α), f in A, α in Γ*
  • Stack contents can be anything
  • M rejects string x if all configurations reached
    are either not halting configurations or are
    rejecting configurations
106
Defining L(M) and LPDA
  • L(M) (or Y(M))
  • The set of strings ?
  • N(M)
  • The set of strings ?
  • LPDA
  • Language L is in language class LPDA iff ?

107
Deterministic PDAs
  • A PDA is deterministic if its transition function
    satisfies both of the following properties
  • For all q in Q, a in Σ ∪ {/\}, and X in Γ,
  • the set δ(q, a, X) has at most one element
  • For all q in Q and X in Γ,
  • if δ(q, /\, X) ≠ ∅, then δ(q, a, X) = ∅ for all
    a in Σ
  • A computation graph is now just a path again
  • Our default assumption is that PDAs are
    nondeterministic

108
Two forms of nondeterminism
Trans  Current  Input  Top of  Next   Stack
       State    Char.  Stack   State  Update
---------------------------------------------
1      q0       a      Z       q0     aZ
2      q0       a      Z       q0     aa
3      q0       /\     Z       q0     aZ
4      q0       a      Z       q0     aa
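The two properties of a deterministic PDA correspond to the two ways determinism can fail, and both can be tested on a transition list. This is a sketch under assumptions: transitions are 5-tuples with "" for /\, and is_deterministic is a hypothetical helper name. Transitions 1 and 2 from the table give two moves for the same (state, input, top); transitions 3 and 4 pair a /\-move with an input move on the same state and stack top.

```python
from collections import defaultdict


def is_deterministic(transitions, sigma):
    """transitions: iterable of (state, input, top, next state, push); "" is /\\."""
    delta = defaultdict(set)
    for state, char, top, nxt, push in transitions:
        delta[(state, char, top)].add((nxt, push))
    for (state, char, top), moves in delta.items():
        # Property 1: at most one move for each (state, input or /\, top).
        if len(moves) > 1:
            return False
        # Property 2: a /\-move on (state, top) rules out any input move there.
        if char == "" and any((state, a, top) in delta for a in sigma):
            return False
    return True


T1 = ("q0", "a", "Z", "q0", "aZ")
T2 = ("q0", "a", "Z", "q0", "aa")
T3 = ("q0", "",  "Z", "q0", "aZ")
T4 = ("q0", "a", "Z", "q0", "aa")
```

Under this check, the pair T1, T2 fails the first property, the pair T3, T4 fails the second, and T1 on its own passes both.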
109
LPDA and DCFL
  • A language L is in language class LPDA if and
    only if there exists a PDA M such that L(M) = L
  • A language L is in language class DCFL
    (Deterministic Context-Free Languages) if and
    only if there exists a deterministic PDA M such
    that L(M) = L
  • To be proven
  • LPDA = CFL
  • CFL is a proper superset of DCFL

110
PDA Comments
  • Note, we can use the stack for much more than
    just a counter
  • See examples in chapter 7 for some details

111
Module 33
  • Pushdown Automata (PDAs)
  • Another example

112
Palindromes
  • Let PAL be the set of palindromes over {a, b}
  • Let PAL1 be the following related language
  • {wcw^r | w consists only of a's and b's}
  • we add c to the input alphabet as a special
    marker character
  • Strings in PAL1
  • aca, bcb, abcba, aabcbaa, c
  • Strings not in PAL1
  • aaca, aaccaa, abccba, abcb, abba
  • Let PAL2 be the set of even length palindromes
  • {ww^r | w consists only of a's and b's}
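The three languages can also be stated as direct membership tests, which is handy for checking a PDA against sample strings. The function names in_pal, in_pal1, and in_pal2 are hypothetical, and the sketch treats the empty string as a palindrome.

```python
def in_pal(s):
    """Palindromes over {a, b}."""
    return set(s) <= {"a", "b"} and s == s[::-1]


def in_pal1(s):
    """Strings w c w^r where w contains only a's and b's."""
    if s.count("c") != 1:
        return False
    w, _, rest = s.partition("c")
    return set(w) <= {"a", "b"} and rest == w[::-1]


def in_pal2(s):
    """Even length palindromes over {a, b}, i.e. strings of the form w w^r."""
    return len(s) % 2 == 0 and in_pal(s)
```

The sample strings listed on the slide behave as stated: for instance "aabcbaa" and "c" pass in_pal1, while "aaccaa" and "abba" do not.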

113
PAL1
  • Let's first construct a PDA for PAL1
  • Basic ideas
  • Have one state remember first half of string
  • Have one state match second half of string to
    first half
  • Transition between these two states when the
    first c is encountered

114
PDA for PAL1
  • M = (Q, Σ, Γ, q0, Z, A, δ)
  • Q = {q0, qm, qf}
  • Σ = {a, b, c}
  • Γ = {Z, a, b}
  • q0 = q0
  • Z = Z
  • A = {qf}

115
Transition Function
Trans  Current  Input  Top of  Next   Stack
       State    Char.  Stack   State  Update
---------------------------------------------
1      q0       a      Z       q0     aZ
2      q0       a      a       q0     aa
3      q0       a      b       q0     ab
4      q0       b      Z       q0     bZ
5      q0       b      a       q0     ba
6      q0       b      b       q0     bb
7      q0       c      Z       qm     Z
8      q0       c      a       qm     a
9      q0       c      b       qm     b
10     qm       a      a       qm     /\
11     qm       b      b       qm     /\
12     qm       /\     Z       qf     Z
First three transitions push a on top of the stack.
Second three transitions push b on the stack.
Third three transitions switch state q0 to qm; no change to stack.
Transitions 10 and 11 match characters from the first and last half of
the input string.
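Because at most one transition applies to any configuration of this PDA, a run can be followed as a single path. The sketch below is an illustration, not the slides' own code: it assumes "" stands for /\, keeps the stack as a string with the top leftmost, and uses the invented names run and accepts.

```python
# PAL1 transitions: (state, input, top) maps to (next state, pushed string).
DELTA = {}
for top in "Zab":
    DELTA[("q0", "a", top)] = ("q0", "a" + top)   # transitions 1-3: push a
    DELTA[("q0", "b", top)] = ("q0", "b" + top)   # transitions 4-6: push b
    DELTA[("q0", "c", top)] = ("qm", top)         # transitions 7-9: switch to qm
DELTA[("qm", "a", "a")] = ("qm", "")              # transition 10: match a
DELTA[("qm", "b", "b")] = ("qm", "")              # transition 11: match b
DELTA[("qm", "", "Z")] = ("qf", "Z")              # transition 12: finish


def run(word):
    """Follow the single computation path; return the configurations visited."""
    state, rest, stack = "q0", word, "Z"
    trace = [(state, rest, stack)]
    while stack:
        top = stack[0]
        if rest and (state, rest[0], top) in DELTA:
            nxt, push = DELTA[(state, rest[0], top)]
            rest = rest[1:]
        elif (state, "", top) in DELTA:
            nxt, push = DELTA[(state, "", top)]
        else:
            break                                  # no transition applies
        state, stack = nxt, push + stack[1:]
        trace.append((state, rest, stack))
    return trace


def accepts(word):
    """Accept if some configuration on the path is in qf with no input left."""
    return any(s == "qf" and r == "" for s, r, _ in run(word))
```

Running it on abcba ends in ("qf", "", "Z"); on abcab the path gets stuck in qm with b on top of the stack and a as the next input; on acab the path reaches qf with input still unread, so the string is not accepted.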
116
Notation comment
  • We might represent transition 1 in two other ways
  • δ(q0, a, Z) = {(q0, aZ)}
  • (q0, a, Z, q0, aZ)
  • Question
  • Is this PDA deterministic?

117
Computation Graph 1
(q0, abcba, Z)
118
Computation Graph 2
(q0, abcab, Z)
119
Computation Graph 3
(q0, acab, Z)
120
PAL2
  • Let's now construct a PDA for PAL2
  • What is harder this time?
  • When do we switch from putting strings on the
    stack to matching?
  • Example
  • After seeing aab, should we switch to match mode
    or stay in stack mode?
  • Solution
  • Do both using nondeterminism

121
PDA for PAL2
  • M = (Q, Σ, Γ, q0, Z, A, δ)
  • Q = {q0, qm, qf}
  • Σ = {a, b}
  • Γ = {Z, a, b}
  • q0 = q0
  • Z = Z
  • A = {qf}

122
Transition Relation
Trans  Current  Input  Top of  Next   Stack
       State    Char.  Stack   State  Update
---------------------------------------------
1      q0       a      Z       q0     aZ
2      q0       a      a       q0     aa
3      q0       a      b       q0     ab
4      q0       b      Z       q0     bZ
5      q0       b      a       q0     ba
6      q0       b      b       q0     bb
7      q0       /\     Z       qm     Z
8      q0       /\     a       qm     a
9      q0       /\     b       qm     b
10     qm       a      a       qm     /\
11     qm       b      b       qm     /\
12     qm       /\     Z       qf     Z
First three transitions push a on top of the stack.
Second three transitions push b on the stack.
Third three transitions switch state q0 to qm.
Is the PDA deterministic or nondeterministic?
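Since the switch from q0 to qm now happens on /\, the machine has to try every possible midpoint. A sketch of that idea, assuming "" for /\ and a stack string with the top leftmost (the name accepts is hypothetical), explores all branches breadth-first:

```python
from collections import deque

# PAL2 transitions: (state, input, top, next state, pushed string).
TRANSITIONS = []
for top in "Zab":
    TRANSITIONS += [
        ("q0", "a", top, "q0", "a" + top),   # push a
        ("q0", "b", top, "q0", "b" + top),   # push b
        ("q0", "",  top, "qm", top),         # guess that the midpoint is here
    ]
TRANSITIONS += [
    ("qm", "a", "a", "qm", ""),              # match a
    ("qm", "b", "b", "qm", ""),              # match b
    ("qm", "",  "Z", "qf", "Z"),             # everything matched
]


def accepts(word):
    """Explore every branch of the computation graph for word."""
    start = ("q0", word, "Z")
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state == "qf" and rest == "":
            return True
        for src, char, top, dst, push in TRANSITIONS:
            if src == state and stack[:1] == top and rest.startswith(char):
                config = (dst, rest[len(char):], push + stack[1:])
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False
```

On abba the branch that switches after reading ab succeeds; on aba every guess for the midpoint fails, so the string is rejected.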
123
Computation Graph 1
(q0, abba, Z)
124
Computation Graph 2
(q0, aba, Z)
125
PAL
  • Challenge
  • Construct a PDA for PAL
  • First step
  • Construct a PDA for odd length palindromes
  • Then
  • Combine PDAs for odd length and even length
    palindromes

126
Module 34
  • CFG → PDA construction
  • Shows that for any CFL L, there exists a PDA M
    such that L(M) = L
  • The reverse is true as well, but we do not prove
    that here

127
CFL subset LPDA
  • Let L be an arbitrary CFL
  • Let G be a CFG such that L(G) = L
  • G exists by definition of L being context-free
  • Construct a PDA M such that L(M) = L(G)
  • Argue L(M) = L
  • There exists a PDA M such that L(M) = L
  • L is in LPDA
  • By definition of L in LPDA

128
Visualization
  • Let L be an arbitrary CFL
  • Let G be a CFG such that L(G) = L
  • G exists by definition of L being context-free
  • Construct a PDA M such that L(M) = L
  • M is constructed from CFG G
  • Argue L(M) = L
  • There exists a PDA M such that L(M) = L
  • L is in LPDA
  • By definition of L in LPDA

(Diagram labels: CFL, LPDA)
129
Algorithm Specification
  • Input
  • CFG G
  • Output
  • PDA M such that L(M) = L(G)

(Diagram labels: CFG G, PDA M)
130
Construction Idea
  • The basic idea is to have a 2-phase PDA
  • Phase 1
  • Derive all strings in L(G) on the stack
    nondeterministically
  • Do not process any input while we are deriving
    the string on the stack
  • Phase 2
  • Match the input string against the derived string
    on the stack
  • This is a deterministic process
  • Move to an accepting state only when the stack is
    empty
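One common textbook way to realize the two phases interleaves them: a variable on top of the stack is expanded by a production without reading input, and a terminal on top of the stack is matched against the next input character. The sketch below follows that scheme, which may differ in detail from the construction used in this course. It assumes single-character grammar symbols, a stack marker Z that is not a grammar symbol, "" for /\, and the invented state names q0, q1, q2 and function names cfg_to_pda and accepts. The pruning step in the search is enough for the sample grammar but does not guarantee termination for every grammar.

```python
from collections import deque


def cfg_to_pda(productions, start, terminals):
    """Build transitions (state, input, top, next state, push) from a CFG."""
    transitions = [("q0", "", "Z", "q1", start + "Z")]             # put S on the stack
    for variable, bodies in productions.items():
        for body in bodies:
            transitions.append(("q1", "", variable, "q1", body))   # phase 1: derive
    for a in terminals:
        transitions.append(("q1", a, a, "q1", ""))                 # phase 2: match
    transitions.append(("q1", "", "Z", "q2", "Z"))                 # only Z is left
    return transitions


def accepts(transitions, word, terminals):
    """Breadth-first search; q2 is the only accepting state."""
    start = ("q0", word, "Z")
    seen, queue = {start}, deque([start])
    while queue:
        state, rest, stack = queue.popleft()
        if state == "q2" and rest == "":
            return True
        for src, char, top, dst, push in transitions:
            if src == state and stack[:1] == top and rest.startswith(char):
                new_rest, new_stack = rest[len(char):], push + stack[1:]
                # Prune: every terminal on the stack must match one input character.
                if sum(ch in terminals for ch in new_stack) > len(new_rest):
                    continue
                config = (dst, new_rest, new_stack)
                if config not in seen:
                    seen.add(config)
                    queue.append(config)
    return False


# Grammar S -> aSb | /\ with terminals a and b.
SIGMA = {"a", "b"}
PDA = cfg_to_pda({"S": ["aSb", ""]}, "S", SIGMA)
```

For this grammar the resulting PDA has six transitions, and accepts(PDA, "aabb", SIGMA) is expected to hold while "aab" is rejected.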

131
Illustration
1. Derive all strings in L(G) on the stack
2. Match the derived string against input
  • Input Grammar G
  • V = {S}
  • Σ = {a, b}
  • S = S (start variable)
  • P
  • S → aSb | /\
  • What is L(G)?

Illustration of how the PDA might work, though
not completely accurate.
(q0, aabb, Z) / put