Title: CSE 3813 Introduction to Formal Languages and Automata
1CSE 3813 Introduction to Formal Languages and
Automata
- Chapter 4
- Properties of Regular Languages
- These class notes are based on material from our
textbook, An Introduction to Formal Languages and
Automata, 4th ed., by Peter Linz, published by
Jones and Bartlett Publishers, Inc., Sudbury, MA,
2006. They are intended for classroom use only
and are not a substitute for reading the textbook.
2Properties of regular languages
- What happens when we perform operations on
regular languages?
- E.g., if we concatenate two regular languages, is
the resulting language also regular?
- Can we decide whether a given language has a
certain property or not?
- E.g., can we tell if a certain language is finite
or not?
- Can we tell whether a given language is regular
or not?
3Closure properties of regular languages
- Definition: A regular language is any language
that is accepted by a finite automaton.
- Theorem 4.1: The class of regular languages is
closed under the following operations (that is,
performing these operations on regular languages
creates other regular languages):
- Union
- Concatenation
- Kleene star
- Complementation
- Intersection
- Difference
4Closure for union, concatenation, and Kleene star
- If L1 and L2 are regular languages, then there
exist regular expressions r1 and r2 such that
L1 = L(r1) and L2 = L(r2).
- By definition 3.1.2 in our text, r1 + r2, r1r2,
and r1* are regular expressions, and
- L1 ∪ L2 = L(r1 + r2)
- L1L2 = L(r1r2)
- L1* = L(r1*)
5Closure for union, concatenation, and Kleene star
- Since languages represented by regular
expressions are by definition regular, performing
the operations of union, concatenation, and
star-closure on regular languages produces
regular languages.
- We say that the class of regular languages is
closed under union, concatenation, and Kleene
star (star-closure).
6So
- The null language ∅ is regular
- The language consisting of the empty string, {λ},
is regular
- For each a in Σ, {a} is regular
- If L1 and L2 are regular
- L1 ∪ L2 is regular
- L1L2 is regular
- L1* is regular
7Unions, Intersections, and Complements Theorem 4.1
- Suppose that
- M1 = (Q1, Σ, δ1, q1, F1) accepts language L1,
and
- M2 = (Q2, Σ, δ2, q2, F2) accepts language L2
- Let M be an FA defined by M = (Q, Σ, δ, q0, F)
where
- Q = Q1 × Q2
- q0 = (q1, q2)
- and the transition function δ is defined by
- δ((p, q), a) = (δ1(p, a), δ2(q, a)),
- for any p ∈ Q1, q ∈ Q2, and a ∈ Σ
8Unions, Intersections, and Difference Theorem 4.1
- Then
- If F = {(p, q) | p ∈ F1 or q ∈ F2}, M accepts the
language L1 ∪ L2
- If F = {(p, q) | p ∈ F1 and q ∈ F2}, M accepts
the language L1 ∩ L2
- If F = {(p, q) | p ∈ F1 and q ∉ F2}, M accepts
the language L1 − L2
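The product construction above can be sketched in a few lines of Python. This is a minimal illustration, not the textbook's presentation: the tuple layout `(states, delta, start, finals)`, the function names `product_dfa` and `accepts`, and the two example machines `EVEN_A` and `ENDS_B` are all assumptions made for this sketch. Both input DFAs are assumed to be complete (a transition for every state and symbol).

```python
from itertools import product

def product_dfa(d1, d2, alphabet, mode):
    """Build M with Q = Q1 x Q2 from two complete DFAs.

    Each DFA is a (states, delta, start, finals) tuple, where delta maps
    (state, symbol) -> state. mode selects the definition of F.
    """
    (Q1, delta1, q1, F1), (Q2, delta2, q2, F2) = d1, d2
    Q = set(product(Q1, Q2))
    # delta((p, q), a) = (delta1(p, a), delta2(q, a))
    delta = {((p, q), a): (delta1[p, a], delta2[q, a])
             for (p, q) in Q for a in alphabet}
    accept = {
        "union": lambda p, q: p in F1 or q in F2,
        "intersection": lambda p, q: p in F1 and q in F2,
        "difference": lambda p, q: p in F1 and q not in F2,
    }[mode]
    F = {(p, q) for (p, q) in Q if accept(p, q)}
    return Q, delta, (q1, q2), F

def accepts(dfa, w):
    """Run the DFA on w and report whether it ends in a final state."""
    _, delta, state, finals = dfa
    for a in w:
        state = delta[state, a]
    return state in finals

# Hypothetical example machines over {a, b}:
# EVEN_A accepts strings with an even number of a's,
# ENDS_B accepts strings that end in b.
EVEN_A = ({0, 1}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}, 0, {0})
ENDS_B = ({0, 1}, {(0, "a"): 0, (1, "a"): 0, (0, "b"): 1, (1, "b"): 1}, 0, {1})
```

Only the choice of F differs between the three cases; the states and transitions of M are the same.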
9Theorem 4.1
- Proof
- For any x ∈ Σ* and any (p, q) ∈ Q
- δ*((p, q), x) = (δ1*(p, x), δ2*(q, x))
- A string x is accepted by M iff
- δ*((q1, q2), x) ∈ F
- By our formula, this is true if and only if
- (δ1*(q1, x), δ2*(q2, x)) ∈ F
10Theorem 4.1
- Proof (continued)
- For Case 1 (union), this is equivalent to saying
that
- δ1*(q1, x) ∈ F1 or δ2*(q2, x) ∈ F2
- Which is equivalent to
- x ∈ L1 ∪ L2
- Cases 2 and 3 are similar
11Complement
- Consider the special case in which L1 is all of
Σ*. Here, L1 − L2 = Σ* − L2 is actually the
complement of L2.
12Reversal
- Theorem 4.2: The family of regular languages is
closed under reversal.
- Proof: If L is a regular language, construct an
NFA with a single final state that accepts it.
Now change the initial vertex into a final
vertex, the final vertex into the initial vertex,
and reverse the direction on all the edges. For
every string w accepted by the original NFA, the
modified version of the NFA accepts w^R.
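The construction in the proof can be sketched as follows. This is a minimal illustration under stated assumptions: the NFA is given as a set of `(source, symbol, target)` edges with no λ-transitions and a single final state, and the names `reverse_nfa` and `nfa_accepts` are made up for this sketch.

```python
def reverse_nfa(edges, start, final):
    """Swap the initial and final vertices and reverse every edge."""
    reversed_edges = {(dst, a, src) for (src, a, dst) in edges}
    return reversed_edges, final, start

def nfa_accepts(edges, start, final, w):
    """Simulate the NFA by tracking the set of currently reachable states."""
    current = {start}
    for a in w:
        current = {dst for (src, sym, dst) in edges
                   if src in current and sym == a}
    return final in current

# Hypothetical example: an NFA for the language of the expression ab*.
EDGES = {(0, "a", 1), (1, "b", 1)}
```

For the example NFA, which accepts abb, the reversed NFA accepts bba and no longer accepts abb.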
13Homomorphism
- Definition 4.1
- A homomorphism is a substitution in which a
single letter is replaced with a string.
Formally, if Σ and Γ are alphabets, then a
function
- h: Σ → Γ*
- is a homomorphism.
- If L is a language on Σ, then its homomorphic
image is
- h(L) = {h(w) | w ∈ L}
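A small sketch may make the definition concrete. It assumes h is extended from letters to strings letter by letter, so that h(a1 a2 ... an) = h(a1)h(a2)...h(an), and it only handles finite languages. The mapping `H` and the function names are hypothetical, chosen for illustration.

```python
def apply_h(h, w):
    """Apply a homomorphism, given as a dict letter -> string, to a string."""
    return "".join(h[a] for a in w)

def image(h, L):
    """Homomorphic image h(L) of a finite language L."""
    return {apply_h(h, w) for w in L}

# Hypothetical homomorphism from {a, b} to strings over {a, b, c}.
H = {"a": "ab", "b": "bbc"}
```

With this `H`, the string aba maps to ab, then bbc, then ab, concatenated.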
14Homomorphism
- Theorem 4.3
- If L is a regular language, then its homomorphic
image h(L) is also regular.
- Thus the family of regular languages is closed
under homomorphism.
15Right quotient
- To form the right quotient of L1 with L2, L1/L2,
take all strings in L1 that have a suffix
belonging to L2 and remove the suffix.
- Example
- L1 = {ab, aab, aaab, aaaab}
- L2 = {b}
- L1/L2 = {a, aa, aaa, aaaa}
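For finite languages the right quotient can be computed directly from the description above. The function name `right_quotient` is an assumption for this sketch, and it works on Python sets of strings only; it is not the automaton construction used to prove closure.

```python
def right_quotient(L1, L2):
    """Strip every suffix in L2 from every string in L1 that ends with it."""
    return {w[:len(w) - len(s)] for w in L1 for s in L2 if w.endswith(s)}

L1 = {"ab", "aab", "aaab", "aaaab"}
L2 = {"b"}
```

Applied to the example, `right_quotient(L1, L2)` yields the four strings a, aa, aaa, and aaaa.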
16Right quotient
- Theorem 4.4
- If L1 and L2 are regular languages, then L1/L2 is
also regular. Thus the family of regular
languages is closed under right quotient with
another regular language.
- Proof: By construction; see textbook, pp.
106-107.
17The membership question
- Given a language L and a string w, is w ∈ L?
- A method for answering the membership question is
called a membership algorithm. Is there a
membership algorithm for regular languages?
18The membership question
- Theorem 4.5: Given a standard representation
(i.e., a finite automaton, a regular expression,
or a regular grammar) of any regular language L
on Σ and any w ∈ Σ*, there exists an algorithm for
determining whether w is in L.
- Proof: Here is the algorithm
- If the standard representation of L is in the
form of a regular expression, or a regular
grammar, construct an equivalent FA.
- Test w to see if it is accepted by the FA.
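The second step of the algorithm, testing w against the FA, can be sketched as a simple simulation. The sketch assumes the FA is already a complete DFA given as a dict from `(state, symbol)` to state; the conversion from a regular expression or grammar is not shown. The example machine `ENDS_AB` is hypothetical.

```python
def accepts(delta, start, finals, w):
    """Follow one transition per symbol of w, then check the final state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

# Hypothetical DFA over {a, b} accepting strings that end in "ab".
ENDS_AB = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
```

The loop takes one step per input symbol, so the test always terminates after |w| moves.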
19The finiteness question
- Theorem 4.6
- Given a standard representation (i.e., a finite
automaton, a regular expression, or a regular
grammar) of any regular language L on Σ, there
exists an algorithm for determining whether L is
empty, finite, or infinite.
20The finiteness question
- Proof: Here is the algorithm
- If the standard representation of L is in the
form of a regular expression, or a regular
grammar, construct an equivalent FA.
- If there is a simple path from the initial vertex
to any final vertex, then the language is not
empty.
- Find all the vertices that are the base of a
cycle. If any of these vertices is on a path
from the initial to a final vertex, the language
is infinite; otherwise, it is finite.
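The graph test described above can be sketched as follows. This is one possible way to organize it, not the textbook's wording: it first collects the "useful" states (reachable from the initial vertex and able to reach a final vertex) and then looks for a cycle among them. The names `classify`, `ONLY_AB`, and `ENDS_AB` are assumptions for this sketch.

```python
def classify(delta, start, finals):
    """Return 'empty', 'finite', or 'infinite' for the language of an FA.

    delta maps (state, symbol) -> state; only the graph structure is used.
    """
    succ, pred = {}, {}
    for (p, _symbol), q in delta.items():
        succ.setdefault(p, set()).add(q)
        pred.setdefault(q, set()).add(p)

    def reach(sources, graph):
        # Depth-first search from a set of source vertices.
        seen, stack = set(sources), list(sources)
        while stack:
            for q in graph.get(stack.pop(), ()):
                if q not in seen:
                    seen.add(q)
                    stack.append(q)
        return seen

    # States that lie on some path from the initial vertex to a final vertex.
    useful = reach({start}, succ) & reach(finals, pred)
    if not useful:
        return "empty"
    inner = {p: succ.get(p, set()) & useful for p in useful}
    # A useful state that can reach itself is the base of a cycle.
    if any(p in reach(inner[p], inner) for p in useful):
        return "infinite"
    return "finite"

# Hypothetical DFAs over {a, b}: state 3 of ONLY_AB is a trap state.
ONLY_AB = {(0, "a"): 1, (0, "b"): 3, (1, "a"): 3, (1, "b"): 2,
           (2, "a"): 3, (2, "b"): 3, (3, "a"): 3, (3, "b"): 3}
ENDS_AB = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2,
           (2, "a"): 1, (2, "b"): 0}
```

Note that the trap state of `ONLY_AB` has a cycle, but it is not on any path to a final vertex, so it does not make the language infinite.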
21The "does L1 = L2?" question
- Theorem 4.7
- Given standard representations of two regular
languages L1 and L2, there exists an algorithm
for determining whether or not L1 = L2.
22The "does L1 = L2?" question
- Proof: Here is the algorithm
- Define a new language
- L3 = (L1 ∩ L2') ∪ (L1' ∩ L2), where L' denotes
the complement of L
- L3 is regular (see previous closure proofs)
- Therefore, we can find a DFA that accepts L3.
- Use theorem 4.6 to decide if L3 is empty.
- L3 = ∅ iff L1 = L2 (exercise 8 in section 1.1 in
the Linz textbook).
- So L1 = L2 if L3 = ∅; otherwise, L1 ≠ L2
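The algorithm can be sketched by combining the product construction with the emptiness test. In this sketch a product state (p, q) belongs to the DFA for L3 exactly when one of p, q is final and the other is not, so L1 = L2 iff no such state is reachable. The function name `equivalent`, the `(delta, start, finals)` tuple layout, and the three example machines are assumptions; both DFAs must be complete over the same alphabet.

```python
def equivalent(d1, d2, alphabet):
    """Decide L(d1) == L(d2) by searching the product automaton."""
    (delta1, s1, F1), (delta2, s2, F2) = d1, d2
    seen, stack = {(s1, s2)}, [(s1, s2)]
    while stack:
        p, q = stack.pop()
        # (p, q) is a final state of the DFA for L3: some string is in
        # exactly one of the two languages, so L3 is not empty.
        if (p in F1) != (q in F2):
            return False
        for a in alphabet:
            nxt = (delta1[p, a], delta2[q, a])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

# Hypothetical DFAs over the one-letter alphabet {a}.
EVEN = ({(0, "a"): 1, (1, "a"): 0}, 0, {0})
MOD4 = {(i, "a"): (i + 1) % 4 for i in range(4)}
EVEN_BY_MOD4 = (MOD4, 0, {0, 2})   # length is 0 or 2 mod 4, i.e. even
MULTIPLE_OF_4 = (MOD4, 0, {0})     # length is a multiple of 4
```

`EVEN` and `EVEN_BY_MOD4` have different numbers of states but accept the same language, while `MULTIPLE_OF_4` rejects aa, which `EVEN` accepts.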
23The pigeonhole principle
- The pigeonhole principle states that if n + 1
items are placed into n pigeonholes, then at
least 1 pigeonhole must end up with more than 1
item in it.
- In set notation
- if f: A → B
- |A| = n + 1
- |B| = n
- then f cannot be one-to-one
24Not all formal languages are regular
An automaton that accepts the language L =
{a^k b^k | k ≥ 0} must count the number of a's in
each string to make sure there is an identical
number of b's. There is no limit on how high the
automaton might need to count to accept a string
in this language. But an automaton with finite
memory can only count as high as the size of its
memory.
25Not all formal languages are regular
This is an intuitive argument why this language
is not regular. It is not a proof, however. To
prove that a language is not regular, we use a
mathematical result called the pumping lemma for
regular languages.
26Theorem 4.8 The Pumping Lemma
- The Pumping Lemma is used to prove that a
language is not regular
- How do we prove that a language L is regular?
- Write a regular expression for it
- Draw a Finite Automaton for it
- Construct a regular grammar for it
27Pumping Lemma
Theorem 4.8: Let L be a regular language. There
exists a positive integer m such that for any
string w ∈ L with |w| ≥ m, w may be written as
w = xyz, for some x, y, and z satisfying the
following: |xy| ≤ m, |y| ≥ 1, and x y^i z ∈ L for
every i ≥ 0
28Pumping Lemma
In other words, every sufficiently long string in
L can be broken down into three parts in such a
way that an arbitrary number of repetitions of
the middle part yields another string in L. We
say that the middle string is pumped, hence the
term pumping lemma.
29Based on the idea of loops
- Given
- M = (Q, Σ, δ, q0, A), where |Q| = n, and
- any string x where |x| ≥ n, then x must pass
through a sequence of n + 1 states.
- Suppose x = a1 a2 a3 ... an y. Then the sequence
of n + 1 states
- q0 = δ*(q0, λ)
- q1 = δ*(q0, a1)
- q2 = δ*(q0, a1 a2)
- ...
- qn = δ*(q0, a1 ... an)
- must contain some state at least twice, by the
pigeonhole principle.
30Example
[Figure: two-state transition diagram with states
q0 and q1 and edge labels b and a]
x = a, so |x| = 1. Sequence of states: q0, q1.
n = 2. Number of different states passed through: 2
31Example
[Figure: two-state transition diagram with states
q0 and q1 and edge labels b and a]
x = bba, so |x| = 3. Sequence of states: q0, q0,
q0, q1. n = 2. Any string where |x| ≥ n must have
repeated a state!
32Pumping
- If a state is repeated one or more times, it
means that there must be a loop in the transition
diagram.
- If there is a loop, then it can be pumped to
produce additional strings that belong to the
language
33Example
- If ba is in the language, and there are only 2
states in the automaton, then a, bba, bbba,
bbbba, etc. are also in the language.
[Figure: two-state transition diagram with states
q0 and q1 and edge labels b and a]
34Example of a nonregular language
- L = {0^i 1^i | i ≥ 0}
- Is this regular?
- No.
- Why not?
- Intuitively: We can't build a finite automaton
to recognize it.
- Why not?
35Example of a nonregular language
- L = {0^i 1^i | i ≥ 0}
- Because the FA has no memory for past events
except its states. Each state can tell you how
you got to that state from the immediately
previous state (i.e., the last character you
processed), but, if there is a loop, it can't
remember the number of characters you processed
up to that point.
36Limits of a FA
- Being in state q1 and having just read a 1
doesn't tell you anything about how many 1s have
already been processed. The FA simply doesn't
have the memory needed to retain this information.
[Figure: transition diagram with states q0 and q1
and edge labels 0, 1, and λ]
37Limits of a FA
- Moreover, if you have a loop like this in an FA,
the FA must accept any number of 1s in the loop.
There is no way to specify "exactly as many 1s
as 0s"; this FA can accept 000111, but must
also accept 0111, 00001, etc.
[Figure: transition diagram with states q0 and q1
and edge labels 0, 1, and λ]
38Limits of an FA
- Consequently, we can't build an FA that can tell
whether the number of 0s that it saw at the
beginning of the string exactly matches the
number of 1s at the end of the string.
- But this is not a formal proof.
39Proof idea
If a DFA has n states, then any path of length n
must visit n + 1 states, and so contains a cycle.
(This is an application of the pigeonhole
principle.)
[Figure: a path through the DFA labeled x, y, z,
with y labeling the cycle]
This part of the string (y) can be pumped to
produce other strings in the language.
40Proof idea again
- If an infinite language is regular, it is
accepted by a DFA.
- The DFA has some finite number of states, m.
- Because the language is infinite, some strings
must have length > m.
- For a string of length > m accepted by the DFA, a
walk through the DFA must contain a cycle.
- Repeating the cycle an arbitrary number of times
must yield another string accepted by the DFA.
41Proof
- Suppose that q_i = q_{i+p}, where
- 0 ≤ i < i + p ≤ n
- x = uvw
- u = a_1 a_2 ... a_i
- v = a_{i+1} a_{i+2} ... a_{i+p}
- w = a_{i+p+1} a_{i+p+2} ... a_n y
- y = the part of the string after the first n
symbols
- Remember that q_i = q_{i+p}
42Proof (cont.)
- Assume a DFA with states labeled q0, q1, ..., qn
- Now take a string w in L such that |w| ≥ m = n + 1
- To process w the machine could go through a
sequence of states, say,
- q0, qi, qj, ..., qf.
- Since this sequence has exactly |w| + 1 entries,
at least one state must be repeated, and this
repetition starts no later than the nth move.
43Proof (cont.)
- So the sequence of states must look like
- q0, qi, qj, ..., qr, ..., qr, ..., qf
- indicating there must be substrings x, y, z of w
such that
- δ*(q0, x) = qr
- δ*(qr, y) = qr
- δ*(qr, z) = qf
- with |xy| ≤ n + 1 = m and |y| ≥ 1
44Proof (cont.)
- From this it immediately follows that
- δ*(q0, xz) = qf
- as well as
- δ*(q0, xy^2z) = qf,
- δ*(q0, xy^3z) = qf,
- and so on, completing the proof of the theorem
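The proof is constructive, and its steps can be traced on a concrete machine. The sketch below finds the first repeated state while reading w, splits w = xyz at that point, and lets us check that pumping y keeps the string accepted. The function names and the example DFA `ENDS_AB` (strings over {a, b} ending in ab) are assumptions for illustration.

```python
def pump_split(delta, start, w):
    """Split w = xyz at the first repeated state q_r, so that reading y
    takes the DFA from q_r back to q_r. Returns None if no state repeats."""
    seen = {start: 0}          # state -> number of symbols read on arrival
    state = start
    for j, a in enumerate(w, start=1):
        state = delta[state, a]
        if state in seen:
            i = seen[state]
            return w[:i], w[i:j], w[j:]
        seen[state] = j
    return None

def accepts(delta, start, finals, w):
    state = start
    for a in w:
        state = delta[state, a]
    return state in finals

# Hypothetical 3-state DFA accepting strings that end in "ab".
ENDS_AB = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2,
           (2, "a"): 1, (2, "b"): 0}
```

For w = aab the split is x = a, y = a, z = b, and the strings ab, aab, aaab, and so on are all accepted. A string shorter than the number of states may produce no repeat, in which case the function returns None.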
45How to use the pumping lemma
The Pumping Lemma describes a property that is
possessed by every regular language. If we show
that a language does not possess this property,
we know that it is not regular. The strategy is
proof by contradiction. We assume a language has
the property described by the pumping lemma, and
then we show that this leads to a contradiction.
It follows that the language is not regular.
46Example
- Example 4.7: The language L = {a^n b^n | n ≥ 0} is
not regular.
- The proof is by contradiction
- If L is regular, it must be accepted by some
DFA.
- Let m be the number of states of the DFA and
consider some w ∈ L such that |w| ≥ m.
- By the pumping lemma, we can split w into three
pieces, w = xyz, with |xy| ≤ m and |y| ≥ 1, such
that for any i ≥ 0, the string xy^iz is in L.
- So let w = a^m b^m.
- Because |xy| ≤ m, y must consist of all a's.
- But then xy^2z will contain more a's than b's.
- This is a contradiction.
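The argument can be checked mechanically for small values of m. The sketch below enumerates every split w = xyz with |xy| ≤ m and |y| ≥ 1 and confirms that each one has some i for which x y^i z falls outside L. This is only a sanity check for particular m, not a replacement for the proof; the names `refutes_pumping` and `in_anbn` and the bound `max_i` are assumptions for this sketch.

```python
def refutes_pumping(w, m, in_L, max_i=3):
    """True if every split w = xyz with |xy| <= m and |y| >= 1 has some
    i in 0..max_i for which x y^i z is not in L."""
    for xy_len in range(1, m + 1):
        for x_len in range(xy_len):
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if all(in_L(x + y * i + z) for i in range(max_i + 1)):
                return False   # this split survives pumping
    return True

def in_anbn(s):
    """Membership test for L = {a^n b^n | n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n
```

The same helper works for other candidate strings and languages: pass a different `w` and membership test. For a regular language such as a*, some split always survives, so the function returns False.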
47Example
- Use the pumping lemma for regular languages to
argue that the language L = {ww | w ∈ {a, b}*} is
not regular
- Assume that L is a regular language.
- If L is regular, then there exists a DFA that
accepts the strings in L.
- Let m be the number of states in
this DFA. If we have a string w ∈ L where |w| ≥
m, then the pumping lemma for regular languages
tells us that we can divide w into three
substrings w = xyz, where |xy| ≤ m and |y| ≥ 1,
such that xy^iz ∈ L for all i ≥ 0.
48Example
- Let us choose w = a^m b a^m b, which is a string
in L.
- Since |xy| ≤ m, y must consist entirely of
a's.
- However, pumping this part to form string t =
xy^2z would produce more a's in the first half
of the string (before the first b) than in the
second half. This would mean that t ∉ L. Yet, the
pumping lemma for regular languages tells us that
if L is regular then t must be in L.
- This is a contradiction and implies that our
initial assumption, that L is a regular
language, must be false. Therefore, L is not a
regular language.
49Palindromes
- We know that the language of palindromes, PAL,
is not regular. Why?
- For any two distinct strings x and y, a string z
can be found which distinguishes them.
- For x, y which are different strings and |x| = |y|,
if z = x^R is appended to each, then xz is
accepted and yz is not accepted.
- Therefore there must be an infinite number of
states in any FA accepting PAL, so PAL is not
regular.
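The distinguishing claim can be checked by brute force for short strings. The sketch below assumes the reading |x| = |y| with x ≠ y and tests every such pair over {a, b} up to a small length; the function names are made up for illustration.

```python
from itertools import product

def is_palindrome(s):
    return s == s[::-1]

def distinguishes(x, y):
    """Check that z = x^R separates x from y: xz is a palindrome, yz is not."""
    z = x[::-1]
    return is_palindrome(x + z) and not is_palindrome(y + z)

def all_pairs_distinguished(n, alphabet="ab"):
    """Test every pair of distinct strings of length n over the alphabet."""
    words = ["".join(p) for p in product(alphabet, repeat=n)]
    return all(distinguishes(x, y) for x in words for y in words if x != y)
```

Since there are infinitely many strings and any two of the same length must lead to different states, no finite set of states is enough.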
50Homework
Use the pumping lemma to show that the language
of palindromes L = {w | w = w^R, w ∈ {a, b}*}
is not regular.
51Homework
Use the pumping lemma (plus some closure
properties of regular languages) to show that
the language L = {w ∈ {a, b}* | w contains an
equal number of a's and b's} is not regular.
52Homework
Use the pumping lemma to show that the language
L = {ww | w ∈ {a, b}*} is not regular.