Closure Properties of Regular Languages - PowerPoint PPT Presentation

About This Presentation
Title:

Closure Properties of Regular Languages

Description:

(4) Construct a CFG from G1 by changing each rule A to A R, i. e., reverse right ... We construct a CFG G' = (VT ,VN ,P', S) from G by the following steps. ... – PowerPoint PPT presentation

Number of Views:795
Avg rating:3.0/5.0
Slides: 24
Provided by: par2
Learn more at: http://www.cs.rpi.edu
Category:

less

Transcript and Presenter's Notes

Title: Closure Properties of Regular Languages


1
Closure Properties of Regular Languages
  • Regular languages are closed under many set
    operations. Let L1 and L2 be regular
  • languages.
  • (1) L1 ? L2 (the union) is regular.
  • (2) L1L2 (the concatenation) is
    regular.
  • (3) L1? (the Kleene star) and L1
    (the Kleene plus) are regular.
  • (4) L1R (the reversed language)
    is regular.
  • (5) L1 (the complement) is
    regular.
  • (6) L1? L2 (the intersection) is regular.
  • (7) L1 - L2 (the set subtraction) is
    regular.

2
Proof of the Closure Properties
  • We can either use regular grammars, FA, or
    regular expressions for the simplicity of the
    proof.
  • Let r1 and r2 be regular expressions that,
    respectively, express the languages L1 and L2 .
  • Clearly, r1 r2 is a regular expression which
    denotes the union of two languages L1 and L2,
    respectively, denoted by r1 and r2. Since every
    regular expression denotes a regular language,
    L1 ? L2 is regular.
  • We can also constructively prove this
    property as follows let G1 ( VN1, VT1, P1, S1
    ) and G2 ( VN2, VT2 , P2, S2 ) be regular
    grammars that generate L1 and L2 , respectively.
    Without loss of generality, assume that VN1 and
    VN2 are disjoint, i.e., VN1 ? VN2 ?.
    Otherwise, we can always convert the given
    grammars to the ones that satisfy such property.
    Construct a regular grammar G with production
    rules S ? S1 S2 and all the rules in P1
    and P2. Clearly, L(G) L1 ? L2 .
  • (2) Clearly r1r2 is a regular expression which
    denotes the language L1L2, which means L1L2 is
    regular.
  • (3) Let r1 be regular expression for L1. Clearly,
    (r1) is regular expression for L1? Since L1
    L1? - ?, by property (7) that will be
    proved, L1 is regular.

3
Proof of the closure Properties (conted)
(4) Suppose that the following FSA M1 accepts L1.
We modify M1 as shown below. Clearly,
the resulting automaton recognizes the reversed
language of L1.
4
Proof of the Closure Properties (conted)
  • You can also prove part (4) using a regular
    grammar G1 using another form of regular
    grammars, where the production rules are
    restricted to the form either A ? Bx, or A ?
    x, where A and B are arbitrary nonterminal
    symbols, and x is a string of terminal symbol, or
    ?. (Recall that we chose to restrict to A ? xB,
    or A ? x.) If we reverse the right side of
    each production rule, then the resulting grammar
    G generates L1R .
  • (5) As for part (4), we modify the finite
    transition graph M1 of an automaton that
    recognizes L1 as follows.
  • Add the dead state, if it is not shown in the
    transition graph. (Recall that we usually do not
    show the dead state for convenience.)
  • Change accepting states to non-accepting states
    and non-accepting states to accepting states.
  • (6) Since L1 ? L2 L1 ? L2 L1 ? L2 , and
    regular language are closed under union and
    complementation (properties (1) and (5) above),
    L1 ? L2 is regular.
  • (7) Since L1 - L2 L1 ? L2 , it is regular
    by properties (5) and (6)above.

5
Properties of Context-free Languages
  • Let L1 and L2 be CFLs.
  • (1) L1 ? L2 (the union) is CFL.
  • (2) L1L2 (the concatenation) is CFL.
  • (3) L1 (the Kleene star) and L1 (the Kleene
    plus) are CFL.
  • (4) L1R (the reversed language) is CFL.
  • (5) L1 ? L2 (the intersection) is not
    necessarily CFL.
  • (6) L1 (the complement) is not necessarily CFL.

6
Proof of the Context-free Language Properties
  • Let G1 ( VN1, VT1, P1, S1 ) and G2 (
    VN2, VT2 , P2, S2 ) be CF grammars that
  • generate L1 and L2 , respectively. Without
    loss of generality, assume that VN1 and
  • VN2 are disjoint, i.e., VN1 ? VN2 ?.
    (Otherwise, we can modify them.)
  • (1) Construct a CFG G by merging the rules of
    grammars G1 and G2 and adding new rules S ? S1
    S2. (This is the same technique for regular
    languages.)
  • (2) Construct a CFG G by merging the rules of G1
    and G2 and adding a new rule
  • S ? S1S2.
  • (3) For L1 add rules S ? S1S ? in grammar
    G1. For L1 add rules S ? S1S S1
  • ,where S is new start symbol.
  • (4) Construct a CFG from G1 by changing each rule
    A ? ? to A ? ?R, i. e., reverse right side of
    each production rule.

7
Proof of the Context-free Language Properties
(conted)
  • (5) We know that L1 a ib ic j ? i, j ? 0
    and L2 a k b nc n ? k, n ? 0
  • are CFLs. But L1 ? L2 a i b i c i ?
    i ? 0 is not CFL.
  • (6) Suppose that CFLs are closed under
    complementation. Since CFLs are
  • closed under union (property (1)), and L1 ?
    L2 L1 ? L2 , which implies
  • CFLs are closed under intersection. This
    contradicts to the proven fact of
  • property (5).

8
Minimizing the Number of ?-Production Rules
  • Theorem. Given an arbitrary CFG G, we can
    construct a CFG G such that
  • L(G) L(G) and if ? is not in L(G), then G
    dose not have ?- production rule.
  • If ? ? L(G), then S ? ? is the only ?-production
    rule of G.
  • Proof (an algorithm). Let G (VT, VN, P, S), and
    let A, B ? VN. We construct a CFG G (VT ,VN
    ,P, S) from G by the following steps.
  • (1) Find the set W of all nonterminals of G which
    derive ? as follows
  • W0 A A ? VN and A ? ? is in P
  • Do Wi1 Wi ?A A ? VN and A ? ? is in
    P, for some ? ? Wi
  • until (Wi1 Wi)
  • W Wi //W contains all nonterminal
    symbols from which ? can be derived.
  • (2) Delete all ?-production from P. Call this new
    set of productions P1.
  • (3) Modify P1 to P as follows If a production A
    ? ? is in P1, then put the rules A ? ? and A ?
    ? into P, for all ? (? ? ) which are obtained
    from ? by deleting one or more nonterminals in
    the set W constructed by step (1).
  • (4) If S is in W, then add S ? ? in P.

9
Minimizing the Number of ?-Production Rules
(example)
Convert the following CFG G to another CFG G
such that L(G) L(G) and G has the smallest
possible number of ?-production rules. G S
?ADC EFg A ?aA ? D ?
FGH b C ?c ? E
? a F ? f ?
G ? Gg H H ? h ?

Computing W W0 A, C, F, H W1 W0 ?
G A, C, F, G, H W2 W1 ? D A, C,
D, F, G, H W3 W2 ? S A, C, D, F, G,
H, S W4 W3 ? A, C, D, F, G, H,
S P1 S ?ADC EFg A ?aA
D ? FGH b C ?c
E ? a F ? f
G ? Gg H H ? h P S
?ADC AD AC DC A D C ? EFg Eg
A ?aA a D ? FGH FG
FH GH F G H b C ?c
E ? a F ? f
G ? Gg g H H ? h
10
Eliminating Useless Symbols from a CFG
  • Lemma 1. Given a CFG G (VT , VN , P, S), we can
    construct an equivalent CFG G (VT , VN , P,
    S), such that every nonterminal symbol A in VN
    derives a string x ? (VT)
  • Proof. Let OLDV and NEWV be sets of
    nonterminals, and A be an arbitrary nonterminal.
    We construct VN and P as follows.
  • OLDV ? NEWV A A ? w is in P for
    some w ? (VT)
  • while (OLDV ? NEWV) do
  • OLDV NEWV
  • NEWV OLDV ?A A ? ? for some ? in (VT ?
    OLDV)
  • VN NEWV P A ? ? A ? ? is in P and
    ? ?(VN ? VT)

11
Eliminating Useless Symbols from a CFG(conted)
Lemma 2. Given a CFG G (VT,VN, P, S), we can
construct an equivalent CFG G (VT
,VN , P, S), such that, for each symbol X ? VT
? VN , the start symbol derives ?X?, for some ?,
? ? (VT ? VN), i.e., S can derive a
sentential form (a string of terminals and
nonterminals) which contains symbol X. Proof.
The following algorithm computes VT, VN and
P. (1) Let VT and VN be the empty sets.
(2) Put S into VN. (3) If A ? VN is put into
VN and A ? ?1 ?2 .... ?n , then all
nonterminals in ?i, 1 ? i ? n, are put into
VN and all terminals in are put into VT.
(4) Repeat (3) until there is no symbol to be
added to VN . (5) Let P contain all the
productions in P except for the ones which have a
symbol not in VT ? VN.

12
Eliminating Useless Symbols from a CFG (conted)
  • Theorem. Given arbitrary CFG G (VT , VN , P,
    S), we can construct an
  • equivalent CFG G (VT , VN , P, S), such
    that,
  • (1) for each A ? VN , A ? (V)T (i.e., A
    derives a terminal string or ? ), and
  • (2) for each X ? VT ? VN , S ? ?X?, for some
    ?, ? ? VN ? (V)T ,
  • (i.e., the start symbol can drive a
    sentential form which contains X).
  • Proof. Use Lemmas 1 and 2.

13
Eliminating Useless Symbols from a CFG (example)
Example. Eliminate useless symbols from the
following CFG G. G S ?AD EFg A
?aGD D ? FGd C
?cCEc E ? Ee F
? Ff ? G ? Gg g H
? hH h
Step 1 Apply Lemma 1 to find the set of
nonterminals VN such that every nonterminal
symbol in VN derives a string x ? (VT). OLDV
NEWV F, G, H OLDV NEWV NEWV OLDV
? D D, F, G, H OLDV NEWV NEWV OLDV
? A A, D, F, G, H OLDV NEWV NEWV
OLDV ? S A, D, F, G, H, S OLDV NEWV
NEWV OLDV ? A, D, F, G, H, S VN
NEWV A, D, F, G, H, S Find the set of rules
P . P S ?AD A ?aGD
D ? FGd F ? Ff ? G ? Gg
g H ? hH h
14
  • P S ?AD A ?aGD D ?
    FGd F ? Ff ?
  • G ? Gg g H ? hH h
  • Step 2 Find the set of symbols V VT ? VN
    such that each symbol in V can be derived
    starting from S.
  • VT VN // initialize with empty set
  • VN VN ?S VT VT ?
  • VN VN ?A, D S, A, D VT VT ?
  • VN VN ?G, F S, A, D, G, F VT
    VT ? a, d a, d
  • VN VN ? S, A, D, F, G VT VT ?
    a, d a, d, g, f
  • VN VN ? S, A, D, F, G VT VT ?
    a, d, g, f
  • Cleaned set of rules
  • P S ?AD A ?aGD D ?
    FGd F ? Ff ?
  • G ? Gg g

15
Eliminating Useless Symbols from a CFG (conted)
Remark Notice that applying Lemma 2 first and
then Lemma 1 may fail to eliminate all useless
productions. Example. Consider grammar with rules
P S ? AB a A ? a By applying Lemma 1
first, we have P S ? a A ? a , then
applying Lemma 2, we have P S ? a .
However, if we apply Lemma 2 first, we have P
S ? AB a A ? a . Then applying Lemma 1,
we have P S ? a A ? a , which still has
a useless production.
16
Ambiguous Context-free Grammars
  • There are two kinds of ambiguities in a
    language.
  • Lexical ambiguity (or semantic ambiguity) A
    symbol or an expression has more than one meaning
    (e.g., story, saw).
  • Syntactic ambiguity (or structural ambiguity) An
    expression can be parsed in two different ways. A
    CFG is ambiguous if the language has a string for
    which there are more than one parse tree.

For a given context-free grammar G and a string
x, the parse tree shows how x is derived with the
rules of G (see an example on the next slide). In
programming language different parse trees give
different object codes. In this course we will
only study syntactic ambiguity of context-free
grammars.
Example 1 (in natural language). A man entered
a room with a picture can be interpreted in two
different ways.
17
Ambiguous Context-free Grammars (conted)
  • Example 2 (in formal language). The following
    context-free grammar is ambiguous, because it
  • has two parse trees shown in Figures (a) and (b)
    below for string p ? q ? r.
  • G S ? S?S?S?S??S?A A ?
    p?q?r

18
Some Techniques for Designing Unambiguous CFG
(1) Use parenthesis such that each derivation
tree generates unique string. Notice that
this technique changes the language by
introducing new terminal symbols, the
parentheses. Example Ambiguous G1 S ?
S?S?S?S??S?A A ? p?q?r
Unambiguous G2 S ? (S?S)?(S?S)??(S) ?A
A ? p?q?r
19
Some Techniques for Designing Unambiguous CFG
  • (2) Modify the production rules that cause the
    ambiguity.
  • Examples (a) Grammar G3 below is clearly
    ambiguous grammar because it can
  • either generate left side b first and then right
    side b or vice versa for string bcb.
  • Grammar G4 doesnt have this possibility because
    it generates left side bs first,
  • if any. Ambiguous G3 S ?
    bS?Sb?c
  • Unambiguous G4 S ?
    bS?A A ? Ab?c

Figure (a). Ambiguity of G3
Figure (b). Unambiguous G4.
20
Some Techniques for Designing Unambiquous
CFG(conted)
  • (b) The following grammar G5 is ambiguous, since
    it can generate ? in two ways.
  • We eliminate this possibility by applying the
    technique of reducing ?-production
  • rules. Grammar G6 is the result .
  • G5 S ? B?D B ? bBc??
    D ? dDe??
  • G6 S ? B?D?? B ? bBc?bc
    D ? dDe?de
  • (c) Grammar G1 can be modified in two different
    ways to make it unambiguous.
  • Notice that for G7 we used the same technique
    for Example (a) above.
  • G7 S ? A?S?A?S??S?A A ? p?q?r
  • G8 S ? D?S?D D ? C ? D?C C
    ? ?C?A A ? p?q?r
  • For G8 we set up a precedence rule such that ?,
    if any, is derived (by S) first, then
  • ? (by D) and ? in that order from the top of
    the parse tree. The later an operator is
  • derived the higher precedence it has over the
    others.

21
Known facts about ambiguous context-free grammars.
  • There is no algorithm that can tell whether an
    arbitrary CFG is ambiguous or not.
  • There is so called inherently ambiguous
    context-free languages for which every CFG is
    ambiguous. Here is an example.
  • anbncmdm ? n, m ? 1 ? anbmcmdn ? n, m ? 1.
  • There is no algorithm that can convert an
    arbitrary ambiguous CFG, which is not inherently
    ambiguous, to an unambiguous one.

22
Normal Forms of Context-free Grammars
When we investigate context-free grammars and
their languages, sometimes it is convenient to
make the right side of each production rule meet
certain form. Such form is called normal form.
There are two normal forms for context-free
grammars Chomsky Normal Form(CNF) and Greibach
Normal Form(GNF). Let G (VN, VT, P, S) be a
context-free grammar. Grammar G is in CNF, if all
the production rules of the grammar are of the
form A ? BC or A ? a, where A, B, C?VN, a ?VT. A
context-free grammar is in GNF, if every
production rule of the grammar is of the form A ?
a?, where A ? VN , a ?VT , and ? ? (VN). Notice
that ? is a string of nonterminal symbols or a
null string. We can show that every context-free
grammar whose language does not contain ? can be
converted to CNF and GNF. (Recall that we can
eliminate all ?-production rules from a given
context-free grammar, if its language does not
contain ?.) The following example shows how to
convert a context-free grammar to CNF. We can
easily generalize the idea. Converting a
context-free grammar to GNF is quite involved
(see the text Chapter 6). We shall not study the
proof.
23
Converting a Context-free Grammar to CNF(example)
  • Suppose that a context-free grammar has a
    production rule A ? aBCDbE, which is not in CNF.
    We introduce new nonterminal symbols and
    production rules in CNF such that A can derive
    the right side string aBCDbE as follows
  • A ? A1B1 A1 ? a // and we let B1
    derive BCDbE as follows
  • B1 ? BC1 //
    and we let C1 derive CDbE as follows
  • C1 ? CD1 // and we let
    D1 derive DbE as follows
  • D1 ? DE1 // and we let
    E1 derive bE as follows
  • E1 ? F1E F1 ? b // and we let E1
    derive bE as follows

Example. Convert the following context-free
grammar to CNF. S ? AaBCb A ? abb
B ? aC C ? aCb ac
Answer S ? AA1 A1 ? A2A3 A2 ? a A3
? BA4 A4 ? CA5 A5 ? b A ? B1B2
B1? a B2 ? B3B4 B3 ? b B4 ? b B ?
C1C C1 ? a C ? D1D2 E1E2 D1 ?
a D2 ? CD3 D3 ? b E1 ? a E2 ?
c
Write a Comment
User Comments (0)
About PowerShow.com