Title: Automata, Grammars and Languages
1Automata, Grammars and Languages
- Discourse 03
- Finite Automata
2Finite Automata / Switching Theory(CS) / (CE)
- Boolean operators / Gates (Elem. Switching Ops)
3Boolean Functions / Combinatorial Circuits
[Figure: half adder (H) and full adder (F) circuits]
4Boolean Functions / Comb. Circuits (contd)
5Boolean Functions / Comb. Circuits (contd)
- Equations representing F
- General scheme (n inputs, m outputs)
6Finite Automata / Sequential Circuits
- Add memory elements (delay elements)
- Finite number of delay elements possible
[Figure: combinatorial circuit f with delay element d]
7Finite Automata / Sequential Circuits
- Ex: sequential adder: add 2 binary numbers, low-order bits received first
- (a) sequential net (circuit)
[Figure: sequential adder with a carry line; inputs 1001 and 0101, output 1110]
8Finite Automata / Sequential Circuits
- (b) Next-State Output Equations
- (c) Transition Table state space
next state/output table
9Finite Automata / Sequential Circuits
10Finite Automata / Sequential Circuits
- (e) Finite-State Transducer (Mealy Machine)
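- A 5-tuple $M = (Q, s, \Sigma, \Delta, \delta)$ where
  - $Q$: finite set of states
  - $s \in Q$: start state
  - $\Sigma$: input alphabet
  - $\Delta$: output alphabet
  - $\delta : Q \times \Sigma \to Q \times \Delta$: transition/output function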
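The sequential adder can be read as such a transducer: the state is the carry, and each step consumes one pair of input bits and emits one sum bit. The following is a minimal sketch under that reading; the function name and the list encoding of the bit streams are assumptions made for illustration.

```python
def sequential_adder(x_bits, y_bits):
    """Add two equal-length binary numbers given low-order bit first.

    The state is the carry (0 or 1). Returns (sum bits, final carry).
    """
    carry = 0                       # start state: no carry
    out = []
    for a, b in zip(x_bits, y_bits):
        total = a + b + carry
        out.append(total % 2)       # output part of the transition/output function
        carry = total // 2          # next-state part
    return out, carry

# 1001 + 0101, fed low-order bit first (hypothetical encoding as lists)
x = [1, 0, 0, 1]                    # 1001 reversed
y = [1, 0, 1, 0]                    # 0101 reversed
bits, carry = sequential_adder(x, y)
print("".join(str(b) for b in reversed(bits)), carry)   # 1110 0
```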
11General Sequential Network
12Three Types of Automata
[Figure: three uses of a machine M over time: transducer; recognizer (acceptor), answering Yes(1) or No(0); enumerator (generator)]
13Machines that Recognize
- Detection of an event, i.e., a pattern in input
- Recognition of just those words in some language L
- Definition of a language
- Ex: detect abab (all non-overlapping occurrences)
[Figure: state diagram for detecting abab, transitions labeled a and b]
14Ex: C Comments /* ... */
- Filter in the lexical scanner
- Recognizer
[Figure: state diagram for C comments, drawn as a transducer; edge notation in/out, where the output may be empty]
15Finite Automaton (Finite State Machine, FSA)
- Defn 1.5: A (deterministic) finite automaton is a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$
  - $Q$ is a finite set, the states
  - $\Sigma$ is a finite set, the alphabet
  - $\delta : Q \times \Sigma \to Q$ is the transition function
  - $q_0 \in Q$ is the start state
  - $F \subseteq Q$ is the set of accepting (final) states
- Ex
16How FA Compute
- FA $M = (Q, \Sigma, \delta, q_0, F)$ is a finite structure, like a program: fixed and static
- Need to define the behavior of M on input w
  - Sequence of configurations
  - Like trace of a program on given data
  - Dynamic and input-dependent
- Ex: start M on an input w and look at the sequence of moves determined by the transition function
- Since M is in an accepting state when the input is exhausted, w is recognized by M
17How FA Compute (contd)
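- Given a FA $M = (Q, \Sigma, \delta, q_0, F)$
- Defn: a configuration of M is an element of $Q \times \Sigma^*$
- Defn: the yields in one step (or moves) relation $\vdash_M$ between configurations is defined by $(q, aw) \vdash_M (q', w)$ where $q' = \delta(q, a)$
- Notes: $\vdash_M$ is a function, since $\delta$ is. $(q, \varepsilon) \vdash_M$ is undefined.
- Defn: yields is the relation $\vdash_M^*$, the reflexive transitive closure of $\vdash_M$
- $(q, w) \vdash_M^* (q', w')$ means $(q, w)$ moves in zero or more steps to $(q', w')$
- Defn: A string w is recognized (accepted) by M ⇔ $(q_0, w) \vdash_M^* (f, \varepsilon)$ for some $f \in F$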
18How FA Compute (contd)
- Defn: The language recognized (accepted) by M is $L(M) = \{ w \in \Sigma^* : M \text{ accepts } w \}$
- Defn 1.16: A language S is regular iff there is some FA M that recognizes it, i.e., $S = L(M)$
- Ex: In FA
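The definitions of configuration, move and acceptance translate almost directly into a simulator. Here is a minimal sketch, assuming a DFA whose transition function is stored as a dictionary; the example automaton (even number of a's) and all names are hypothetical, chosen only for illustration.

```python
def run_dfa(delta, start, accepting, w):
    """Return (accepted, trace), where trace lists the configurations (state, remaining input)."""
    q = start
    trace = [(q, w)]
    for i, a in enumerate(w):
        q = delta[(q, a)]                # one "yields in one step" move
        trace.append((q, w[i + 1:]))
    return q in accepting, trace         # accept iff input exhausted in a final state

# Hypothetical DFA over {a, b}: accepts strings with an even number of a's
delta = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}

ok, trace = run_dfa(delta, "even", {"even"}, "abab")
for config in trace:
    print(config)
print("accepted:", ok)
```

The printed trace is the sequence of configurations, which changes with the input, while the dictionary `delta` stays fixed like a program.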
19Example
- Coin checker for 30¢ coffee. $\Sigma = \{n, d, q\}$
[Figure: state diagram with states 0, 5, 10, ..., 50 and transitions labeled n, d, q; make change for totals over 30, vend coffee]
20Regular Operations Regular Expressions
- The regular operations on languages are union (∪), concatenation (∘) and Kleene star (*).
- So called because the class of regular languages is closed under them, i.e., applying these operators to regular languages results in a regular language. (We will prove these closure results later.)
- In fact, these three operations (∪, ∘, *) actually characterize what it means to be a regular language: any regular language can be built up from alphabet symbols and a finite number of these regular operations.
- This motivates the notion of regular expression: a sequence of symbols, like an arithmetic expression, that defines a regular language using regular ops.
21Regular Expressions
- A syntax for describing sets of strings (languages)
  - Terse
  - Eliminates fussy
  - Reminiscent of arithmetic expressions
  - Obeys some useful algebra, e.g., (E+F)* = (E*F*)*
- Syntax for regular expressions over Σ, +, ∘, *, (, )
  - E → (E+E)  (text uses ∪, not +; some authors use |)
  - E → (E∘E)  (usually suppress the ∘ in E∘E)
  - E → (E*)
  - E → ε  (some authors use λ)
  - E → ∅
  - E → a, for each a in Σ
- Suppress ( , ) where possible: (a+b)*a, not (((a+b)*)∘a)
22Regular Expressions (contd)
- Meaning rules for the syntax
- The meaning (denotation) of an expression, L(E), is a set of strings (a language)
- Rules
    expression E      language L(E)
    ∅                 ∅
    a                 {a}
    ε                 {ε}
    (E+F)             L(E) ∪ L(F)
    (E∘F)             L(E) ∘ L(F)
    (E*)              L(E)*
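The meaning rules can be read as a recursive evaluator. Since L(E) may be infinite, the sketch below computes only the strings of length at most n. The tuple encoding of expressions and the function name `lang` are assumptions made for illustration, not notation from the slides.

```python
def lang(e, n):
    """Strings of length <= n in L(e); e is a nested tuple (hypothetical encoding)."""
    kind = e[0]
    if kind == "empty":                  # ∅ denotes the empty language
        return set()
    if kind == "eps":                    # ε denotes {ε}
        return {""}
    if kind == "sym":                    # a denotes {a}
        return {e[1]} if n >= 1 else set()
    if kind == "union":                  # (E+F) denotes L(E) ∪ L(F)
        return lang(e[1], n) | lang(e[2], n)
    if kind == "cat":                    # (E∘F) denotes L(E) ∘ L(F)
        return {u + v for u in lang(e[1], n) for v in lang(e[2], n)
                if len(u + v) <= n}
    if kind == "star":                   # (E*) denotes L(E)*
        base, result, frontier = lang(e[1], n), {""}, {""}
        while frontier:                  # keep appending until nothing new fits
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= n} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node kind: {kind}")

# (a+b)*a
E = ("cat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
print(sorted(lang(E, 2)))   # ['a', 'aa', 'ba']
```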
23Reg. Expr. Examples, Equivalence()
- (a+b)*a
- (ab)
- (ab)
- (a+b)*a(a+b)*a(a+b)*
- (bababab)
- PASCAL unsigned numbers. d = {0, 1, ..., 9}
- dd(? .dd)(?E(??)dd)
- a?ab ?bab
- Defn: E ≡ F ⇔ L(E) = L(F)
- ∅* = ε    (E+F)* = (E*F*)*    E∅ = ∅E = ∅
- E(FG) = (EF)G    E(F+G) = EF+EG    Eε = εE = E
24Nondeterminism
- Real computing devices are deterministic: the current configuration and instruction determine the next configuration. The relation $\vdash$ is a function.
- Why the concept of nondeterminism?
  - Provides powerful, economical descriptive ability
  - Provides a way to specify languages without over-specifying and complex handling of cases
  - Can be algorithmically converted to a deterministic description (at the sacrifice of some economy and with added complexity)
  - Generalization of determinism
- Ex: abab occurs somewhere in w
[Figure: NFA with self-loops labeled a,b on the first and last states, joined by a path labeled a, b, a, b]
25Nondeterminism (contd)
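- Ex: $\{ w \in \Sigma^* : w \text{ has penultimate symbol } b \}$
- Ex: $\{ w \in \Sigma^* : w \text{ has} \ge 2 \text{ a's} \}$
[Figure: NFA state diagrams for the two examples, transitions labeled a, b]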
26ε-Moves Can Be Useful
- SNOBOL arithmetic constants (no floating E)
- Use to specify optional characters, like Unix command line opt
[Figure: NFA with ε-moves for arithmetic constants; edges labeled d and ε, where d = digit]
27Nondeterministic Finite Automaton
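- Defn 1.5: A nondeterministic finite automaton is a 5-tuple $M = (Q, \Sigma, \delta, q_0, F)$
  - $Q$ is a finite set, the states
  - $\Sigma$ is a finite set, the alphabet
  - $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to \mathcal{P}(Q)$ is the transition function
  - $q_0 \in Q$ is the start state
  - $F \subseteq Q$ is the set of accepting (final) states
- Ex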
28DFA vs NFA
- DFA:
  - For each state q and input symbol a, there is exactly one choice of new state (or no transition is defined at all). Each transition consumes an input symbol
  - Special case of NFA!
- NFA:
  - There may be multiple choices for the same input symbol
  - There may be ε-moves that do not consume an input character
  - There can be chains of ε-moves
  - ε-moves can create even more choice for the next input character
29How NFA Compute
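- Given a NFA $M = (Q, \Sigma, \delta, q_0, F)$
- Defn: configuration: an element of $Q \times \Sigma^*$
- Defn: yields in one step (or moves) relation between configurations: $(q, aw) \vdash_M (q', w)$ if $q' \in \delta(q, a)$, where $a \in \Sigma \cup \{\varepsilon\}$ (the case $a = \varepsilon$ is an ε-move)
- Defn: yields, $\vdash_M^*$: the reflexive transitive closure of $\vdash_M$
- $(q, w) \vdash_M^* (q', w')$ means $(q, w)$ COULD move in zero or more steps to $(q', w')$
- Defn: w is recognized (accepted) by M ⇔ $(q_0, w) \vdash_M^* (f, \varepsilon)$ for some $f \in F$
- Same as before, but $\vdash_M^*$ has the meaning: there exists some sequence of moves from the start config to some accepting config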
30How NFA Compute (contd)
- Defn: The language recognized (accepted) by M is $L(M) = \{ w \in \Sigma^* : M \text{ accepts } w \}$
- Ex: In NFA
- This provides no evidence that aabbba is accepted (or not)
- However, also via a separate computation sequence
- And so aabbba is recognized!
31Tree of Computations
[Figure: tree of computations. A branch ending in an accepting configuration ⇒ w ∈ L(M1); a dead branch (X) is null evidence]
32Computation Tree Example
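- Ex: L = {w : w begins and ends with the same symbol}
[Figure: NFA with states 1 through 4 and its computation trees on abab and ababa; X marks a dead branch; 3 ∈ F]
- reject abab: no path to F
- accept ababa: some path to F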
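33Example with ε-Moves
- String length a multiple of 2 or 3
[Figure: NFA with ε-moves and its computation tree on aaa; X marks a dead branch]
- 4 ∈ F ⇒ accept aaa
34Example with ε-Moves
[Figure: NFA with states 0 and 1, an ε-move and transitions labeled a, b, with its computation tree on aab; X marks a dead branch]
- 1 ∈ F ⇒ accept aab
- ε-moves consume no input symbols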
35Equivalence of NFA to DFA
- There is an algorithm to convert any NFA into a DFA
- We show basic idea assuming NFA has no ε-moves
- Then modify the construction for NFAs with ε-moves
- Ex: L = {x : last symbol of x appeared previously}, Σ = {a, b}
- Idea: given input string, keep track of all possible reached states after reading each letter. At end of input, see if a final state is among those reached
[Figure: NFA N0]
36Equivalence of NFA and DFA (contd)
- Computation paths through NFA N0 on w = abba
[Figure: tree of computation paths through states p, q, r, s, one level per input symbol a, b, b, a]
37Equivalence of NFA and DFA (contd)
- Idea: keep a list of all possible states reachable by each prefix of w ("parallel worlds"). For NFA N0:
38Equivalence of NFA and DFA (contd)
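- Equivalent DFA M will have
  - State set $\mathcal{P}(Q)$
  - Alphabet $\Sigma$
  - Start state: the set $\{q_0\}$
  - Accepting states $\{ X \subseteq Q : X \cap F \ne \emptyset \}$
  - Deterministic transition function $\delta' : \mathcal{P}(Q) \times \Sigma \to \mathcal{P}(Q)$
- Ex: For NFA N0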
39Equivalence of NFA and DFA (contd)
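- Thm (Rabin-Scott Construction): Let L = L(N) for some NFA N with no ε-moves. There is an algorithm to construct a DFA M equivalent to N, i.e., with L(M) = L(N).
- Pf: Given N we construct a DFA M and then verify that it recognizes the same set as N.
- Construction: Given NFA $N = (Q, \Sigma, \delta, q_0, F)$ construct $M = (Q', \Sigma, \delta', \{q_0\}, F')$ where $Q' = \mathcal{P}(Q)$, $F' = \{ X \subseteq Q : X \cap F \ne \emptyset \}$ and $\delta'(X, a) = \bigcup_{q \in X} \delta(q, a)$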
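The construction can be sketched in a few lines. This is an illustrative sketch, not the slides' own code: it builds only the subsets reachable from the start set rather than all of P(Q), and the dictionary encoding, function names and example NFA (penultimate symbol b, with hypothetical state names 0, 1, 2) are assumptions.

```python
from itertools import chain, product

def subset_construction(alphabet, delta, start, accepting):
    """Subset construction for an ε-free NFA.

    delta maps (state, symbol) -> set of states; a missing key means no move.
    Returns (states, dfa_delta, start_set, dfa_accepting), reachable part only.
    """
    start_set = frozenset([start])
    dfa_delta, todo, seen = {}, [start_set], {start_set}
    while todo:
        X = todo.pop()
        for a in alphabet:
            # δ'(X, a) = union of δ(q, a) over q in X
            Y = frozenset(chain.from_iterable(delta.get((q, a), ()) for q in X))
            dfa_delta[(X, a)] = Y
            if Y not in seen:
                seen.add(Y)
                todo.append(Y)
    dfa_accepting = {X for X in seen if X & accepting}   # X ∩ F ≠ ∅
    return seen, dfa_delta, start_set, dfa_accepting

def dfa_accepts(dfa_delta, start_set, dfa_accepting, w):
    X = start_set
    for a in w:
        X = dfa_delta[(X, a)]
    return X in dfa_accepting

# Hypothetical NFA: penultimate symbol is b
nfa_delta = {(0, "a"): {0}, (0, "b"): {0, 1}, (1, "a"): {2}, (1, "b"): {2}}
states, d, s, f = subset_construction("ab", nfa_delta, 0, {2})
print(len(states))                       # number of reachable subsets
print(dfa_accepts(d, s, f, "abba"))      # True: penultimate symbol is b
```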
40Equivalence of NFA and DFA (contd)
[Figure: state diagram with transitions labeled a and b]
41Equivalence of NFA and DFA (contd)
- Verification: Show (1) M is a DFA and (2) L(M) = L(N)
- (1) $\delta'$ is a function by the construction, and $Q'$ is finite: $|Q'| = 2^{|Q|}$. So M is a DFA.
- (2) To show equivalence we prove the
- Lemma
- Pf: By induction on the length of the input string w.
  - Base: |w| = 0.
  - Step: Suppose (IH) the lemma is true
  - Let
    To show
42Equivalence of NFA and DFA (contd)
- ?. Assume Then ?
state r with -
and - Then By (IH)
-
() - By construction of M
Let - Then
- Using this with () results in
- So
43Equivalence of NFA and DFA (contd)
- ?. Assume
- Then ? state R with
So -
(1) - By construction
and
-
(2) - Since we have
from (IH) -
(3)
- Combining (2) (3)
- So
44Equivalence of NFA and DFA (contd)
- We now finish the verification proof. Let
- From the Lemma
-
- That is, for some
- for
some
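45Example ε-Free NFA → DFA
- Consider the previous NFA
46Conversion NFA → ε-free NFA
- ε-closure of a state or set of states: E(R)
- Before picture
- For R ⊆ Q the ε-closure of R is $E(R) = \{ q \in Q : q \text{ can be reached from some state in } R \text{ along zero or more ε-moves} \}$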
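Computing E(R) is a plain graph search along ε-moves. The sketch below assumes the ε-moves are stored as a dictionary from a state to the set of states one ε-move away; that encoding and the small example are hypothetical.

```python
def eps_closure(R, eps):
    """E(R): all states reachable from R along zero or more ε-moves."""
    closure, stack = set(R), list(R)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in closure:         # visit each state once, so cycles are harmless
                closure.add(r)
                stack.append(r)
    return closure

# Hypothetical ε-moves, including a cycle 1 -> 2 -> 4 -> 1
eps = {1: {2, 3}, 2: {4}, 4: {1}}
print(sorted(eps_closure({1}, eps)))     # [1, 2, 3, 4]
print(sorted(eps_closure({3}, eps)))     # [3]
```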
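47Conversion NFA → ε-free NFA
- Coalesce all nodes reachable from 1 by ε-moves
- After picture
[Figure: states 1 through 8 merged into the single node {1,2,3,4,5,6,7,8} = E(1), with remaining nodes 9 through 15 and edges labeled a, b, c, d between closures such as E(9), E(1), E(10), etc.]
- Note: still an NFA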
48Conversion NFA → ε-free NFA
- Thm: There is an algorithm to convert any NFA into an equivalent NFA with no ε-moves.
- Pf: Construction: Given NFA N, construct new NFA N′ with
- Lemma
- Pf: Induction on the length of w. □
- Verification. From the Lemma
49Conversion NFA → DFA
- Thm 1.39 (Rabin-Scott Theorem): There is an algorithm to convert any NFA into an equivalent DFA.
- Pf: Given NFA N, convert it to an ε-free NFA N′. Using the Rabin-Scott construction, convert N′ to an equivalent DFA M. □
- Corollary 1.40: A language is regular ⇔ some NFA recognizes it.
- Ex: Start with an NFA N1 as follows
50Conversion NFA → DFA (contd)
- Remove ε-moves
- Apply Rabin-Scott construction to get DFA
[Figure: NFA N1 with states 1 through 4 and the resulting DFA with states A, B, C, D; transitions labeled b, d, ε]
51Regular Expression → NFA
- Thm 1.55: There is an algorithm that, given a regular expression E, constructs an NFA M such that L(E) = L(M).
- Pf: Induction on the number of operator symbols in E.
- Base: E = ∅, E = ε, or E = a ∈ Σ
- Step: Assume (IH) the result is true of all expressions with ≤ k operator symbols (+, ∘, *). Let E have k+1 ops.
- Three cases
- Case E = (E1+E2). By IH, ∃ FA M1, M2 with L(E1) = L(M1) and L(E2) = L(M2). Construct the following NFA M.
52Case +
[Figure: NFA M with ε-moves into M1 and M2]
53Regular Expression → NFA (contd)
- Case E = (E1∘E2). By IH, ∃ FA M1, M2 with L(E1) = L(M1) and L(E2) = L(M2). Construct the following NFA M.
54Case ∘
- Unmark final states in M1
[Figure: M1 followed by M2, joined by ε-moves]
55Regular Expression → NFA (contd)
- Case E = (E1*). By IH, ∃ FA M1 with L(E1) = L(M1). Construct the following NFA M.
56Case *
[Figure: M1 with a new start state s and ε-moves]
QED
57Example Reg. Exp. → NFA
[Figure: NFA produced by the construction, with transitions labeled a, b and many ε-moves]
- Not very economical
58Regular Expressions: Applications
- Regexp used in various development tools
- qed: interactive text editor. 1st version: Lampson & Deutsch, 1967
- Regexp added by Ken Thompson, Bell Labs, ca. 1968
- Regexp compiled into NFA in machine code
- Rabin-Scott idea used to scan on the fly
- One of the first software patents
- Offspring: ed, by Ken for Unix
- Many others followed: em, vi / ex, sam, qedx, ...
- grep, egrep: pattern search in a file
- shell: command line interpreter
- lex: lexical analyzer generator
- sed: non-interactive stream editor
- awk: pattern scanning and processing language
- perl: pattern-driven programming language
59Applications (contd)
- Regular expressions: patterns
    meaning                  awk regexp
    matches >=1 r            r+
    matches >=0 r            r*
    matches 0 or 1 r         r?
    matches r then s         rs
    matches r or s           r|s
    match literal c          \c
    match begin/end line     ^  $
    match any char           .
    group exprs              (s)
    character list           [abc]
    negated char list        [^abc]
60Applications--Examples
- -?[0-9]+   nonempty digit strings, optional sign
- [^0-9]   any char except digit
- \[.*\]   reference citations in a paper
- g/^ *$/d   delete blank lines
- g/ /d   delete lines with a blank
- Ex: match is always (1) leftmost and (2) longest
- file: abcddddef   vi: s/d*/x/ ⇒ xabcddddef
- s/d+/x/ ⇒ abcxef
- Ex csh sort roll1-5 egrep C SCMATH pr
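The leftmost-and-longest example can be tried out with Python's `re` module. This is only an approximation of the vi session: it assumes the two substitutions are `s/d*/x/` and `s/d+/x/`, and Python's matcher is leftmost and greedy, which happens to agree with leftmost-longest for these two patterns.

```python
import re

line = "abcddddef"

# d* matches the empty string at position 0, which is the leftmost match
print(re.sub(r"d*", "x", line, count=1))   # xabcddddef

# d+ needs at least one d; the leftmost match starts at the first d and is as long as possible
print(re.sub(r"d+", "x", line, count=1))   # abcxef
```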
61Applications--Examples
- Ex traditional spelling mnemonic
- i before e, except after c,
- or when pronounced a,
- as in neighbor and weigh
- --except for weird examples.
- grep '[^c]ei' /usr/share/dict/words > foo
- cat foo
- abseil Aeneid ageing Alamein albeit atheist Boeing Budweiser caffein canoeist deice deictic dilettanteism dreidl ...
- If you think this spelling rule is sufficient, you will be deficient, inefficient, unscientific and far from omniscient
62Applications--Examples
- Ex: lex generates a lexical analyzer yylex(). Example: wordcount (wc)

    %{
    #include <stdio.h>
    int nchar, nword, nline;
    %}
    %%
    \n          { nline++; nchar++; }
    [^ \t\n]+   { nword++; nchar += yyleng; /* yyleng = length of matched string */ }
    .           { nchar++; }
    %%
    int main(void) {
        yylex();    /* invoke generated lexer */
        printf("%d\t%d\t%d\n", nchar, nword, nline);
        return 0;
    }
63L Regular ⇔ L Denoted by a Reg. Expr.
- We've defined "regular" as meaning recognized by a DFA (equiv. to rec. by an NFA)
- This equivalence result is known as Kleene's Theorem
- We've already shown the ⇐ direction: we constructed an NFA from a regular expression (using Rabin-Scott we could convert this NFA to a DFA)
- Now we show the ⇒ direction: given a DFA M, construct a regular expr. E with L(M) = L(E)
- Thm (Kleene): There is an algorithm that, given a DFA M, computes a regular expression E such that L(M) = L(E).
- Pf: Given the graph of the DFA, use the node elimination algorithm to gradually eliminate all nodes in favor of expressions on the edges of the graph.
64L Regular ⇔ L Denoted by a Reg. Expr.
- Thm: There is an algorithm that, given a DFA M, computes a regular expression E such that L(M) = L(E).
- Pf: Let $M = (Q, \Sigma, \delta, q_0, F)$ be a DFA. If $|Q| = n$ we can choose $Q = \{1, 2, \ldots, n\}$ and $q_0 = 1$.
65Kleene's Theorem (contd)
- Define for $k = 1, \ldots, n+1$:
- $R^k_{ij}$ is the set of strings x labeling the (deterministic) computation paths from node i to node j that use only those intermediate nodes in $\{1, 2, \ldots, k-1\}$
- So $R^1_{ij}$ uses no intermediate nodes and $R^{n+1}_{ij}$ represents all possible path labels from i to j.
[Picture: path labeled x from node i to node j]
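66Kleene's Theorem (contd)
- Lemma: Each $R^k_{ij}$ has a regular expression denoting it.
- Pf of Lemma: By induction on k.
- Base: k = 1. $R^1_{ij} = \{ a \in \Sigma : \delta(i, a) = j \}$ if $i \ne j$, and $R^1_{ii} = \{\varepsilon\} \cup \{ a \in \Sigma : \delta(i, a) = i \}$, and these have reg. exprs. of the form $a_1 + \cdots + a_m$ or $\varepsilon + a_1 + \cdots + a_m$ (or ∅ when there are no such symbols).
- Step: Assume the Lemma is true for k (IH). Consider $R^{k+1}_{ij}$. The new node in use is node k. All paths in $R^{k+1}_{ij}$ either do not enter k or else go thru k one or more times.
- Picture
67Kleene's Theorem (contd)
- All paths in $R^{k+1}_{ij}$ either do not enter k or else go thru k one or more times. So $R^{k+1}_{ij} = R^k_{ij} \cup R^k_{ik} (R^k_{kk})^* R^k_{kj}$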
68Kleene's Theorem (contd)
- By IH (and closure properties) this is regular.
- To finish the proof of the theorem, observe that, taking node 1 as the start state, $L(M) = \bigcup_{j \in F} R^{n+1}_{1j}$ and so by closure properties, L(M) is regular.
- Note: Sipser text uses a different algorithm, employing the concept of generalized NFA: NFA that have regular expressions on their edges. See Examples 1.66 and 1.67.
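The inductive proof is effectively a dynamic-programming algorithm. The sketch below is a small variation, not the slides' exact definition: it tracks labels of non-empty paths only and adds ε at the end if the start state is accepting, which keeps the generated expressions free of empty alternatives. States are assumed to be numbered 1..n with start state 1, the output uses Python `re` syntax (`|` for union), and all function names are hypothetical. No simplification is attempted, so the result grows quickly.

```python
import re
from itertools import product

def union(r, s):
    """Regex for the union of two languages; None stands for the empty language."""
    if r is None:
        return s
    if s is None:
        return r
    return r if r == s else f"(?:{r}|{s})"

def concat(r, s):
    return None if r is None or s is None else r + s

def star(r):
    return "" if r is None else f"(?:{r})*"     # ∅* = ε

def dfa_to_regex(n, delta, accepting):
    """delta maps (state, symbol) -> state. Returns None if L(M) is empty."""
    # S[i, j]: labels of non-empty paths i -> j using no intermediate nodes yet
    S = {(i, j): None for i in range(1, n + 1) for j in range(1, n + 1)}
    for (i, a), j in delta.items():
        S[i, j] = union(S[i, j], re.escape(a))
    for k in range(1, n + 1):                   # now allow node k as an intermediate node
        S = {(i, j): union(S[i, j],
                           concat(concat(S[i, k], star(S[k, k])), S[k, j]))
             for (i, j) in S}
    result = None
    for j in accepting:
        result = union(result, S[1, j])
    if 1 in accepting:                          # the empty path contributes ε
        result = "" if result is None else f"(?:{result})?"
    return result

# Hypothetical DFA: state 1 = even number of a's (start, accepting), state 2 = odd
delta = {(1, "a"): 2, (1, "b"): 1, (2, "a"): 1, (2, "b"): 2}
regex = dfa_to_regex(2, delta, {1})
words = ["".join(t) for m in range(7) for t in product("ab", repeat=m)]
mismatches = [w for w in words
              if (re.fullmatch(regex, w) is not None) != (w.count("a") % 2 == 0)]
print(mismatches)   # [] means the regex agrees with the DFA on all strings up to length 6
```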
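69Ex: Kleene's Theorem
[Figure: example automaton with transitions labeled a and b]
70Kleene's Thm: use Generalized NFA
[Figure: node elimination on an automaton with states A, B, C over {0, 1}. Add a new start state S with ε-moves, then 2. eliminate C, 3. eliminate B, 4. eliminate A (order CBA), leaving a single edge labeled with the expression E. Only the symbols (1(101)000) of E are recoverable.]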
71Ex: Node Elimination Algorithm
[Figure: automaton with states A, B, C and transitions labeled a, b; add a new start state S and ε-moves]
72Ex: Node Elimination Algorithm (ACB)
[Figure: eliminate A, then C, then B. Result: (bb + bba*a)*b]
73Ex: Node Elimination: other orders
[Figure: the same automaton with S and ε-moves added]
74Ex: Other elimination orders (CAB)
[Figure: eliminate C, then A, then B. After C and A the graph is the same as for AC above (CA = AC). Result: (bb + bba*a)*b, so CAB = ACB]
75Ex: Other elimination orders (CBA)
[Figure: eliminate C, then B, then A. Edge labels after eliminating B include (bb)*b. Only the symbols (bb)b(bb)bbaa(bb)b of the final expression are recoverable.]
76Ex: Other elimination orders (BAC)
[Figure: eliminate B, then A, then C (BA = AB). Result: b(bb + ba*ab)*, so BAC = ABC]
77Ex: All elimination orders equiv (BAC ≡ ACB)
- BAC: b(bb + ba*ab)*
- ACB: (bb + bba*a)*b
- Easy to prove by induction: for any expression E, b(Eb)* = (bE)*b. Using this identity:
  b(bb + ba*ab)* = b((b + ba*a)b)* = (b(b + ba*a))*b = (bb + bba*a)*b
- Further regular expression simplification is possible: b(bb + ba*ab)* = b(b(b + a*ab))* = b(b(ε + a*a)b)* = b(ba*b)*
- Good exercise: show results of all other elimination orders are equivalent to these, using regular expression algebra
78Algebra of Regular Expressions
- ∃ an algebra for simplifying regular expressions
- Can use this algebra to construct RegExps from FSA
    r+s = s+r          (r+s)+t = r+(s+t)     r+r = r        r+∅ = r
    (rs)t = r(st)      rε = εr = r           r∅ = ∅r = ∅
    r(s+t) = rs+rt     (r+s)t = rt+st
    ∅* = ε             (r*)* = r*            r*r* = r*
    r* = ε+rr*         (r+s)* = (r*s*)*
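79Solving Regular Expr Equations
- Can solve linear equations with regexp variables
    X = aX + b
      = a(aX + b) + b = a²X + ab + b
      = a²(aX + b) + ab + b = a³X + a²b + ab + b = ...
    so X = a*b
- Check: a(a*b) + b = aa*b + b = (aa* + ε)b = a*b
- Ex: [Figure: state X with a self-loop labeled a and an edge labeled b to state Y]
    X = aX + bY, Y = ε ⇒ X = a*b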
80Solving RegExp Equations (contd)
[Figure: automaton with states B and C, transitions labeled 0 and 1, and its system of equations]
- Gauss-Jordan elimination; back-substitution
81Ex Node Elimination Example via Algebra
- Want B. C is accept state.
- Elim. A
- Elim B
- Elim C
- ?
82Closure Properties
- A class of languages is said to be closed under an operation if applying that operation to members of the class results in a language that is again a member of the class. Example: the regular languages are closed under the operations of union, concatenation and Kleene star.
- Thm: The regular languages are closed under intersection and complementation.
- Pf: Complementation. Let L = L(M) where $M = (Q, \Sigma, \delta, q_0, F)$ is a DFA whose transition function is total. Then the FA $\overline{M} = (Q, \Sigma, \delta, q_0, Q - F)$ is also deterministic, and for every w: w leads to a non-accepting state in M ⇔ w leads to an accepting state in $\overline{M}$.
- So $L(\overline{M}) = \Sigma^* - L(M) = \overline{L}$
83Closure Properties (contd)
- Intersection. Let $L_1, L_2$ be regular. By DeMorgan's law $L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$
- Since the regular languages are closed under complementation and union, the result follows.
84Closure Properties (contd)
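- Another proof of closure under ∩ illustrates the technique called cross-product construction. See Sipser text, Theorem 1.25.
- Thm: The class of regular languages is closed under the intersection operation.
- Pf: Assume $L_1 = L(M_1)$ and $L_2 = L(M_2)$, where the automata $M_1 = (Q_1, \Sigma, \delta_1, q_1, F_1)$ and $M_2 = (Q_2, \Sigma, \delta_2, q_2, F_2)$ are deterministic.
- Construction. Construct a cross-product machine M as follows: $M = (Q_1 \times Q_2, \Sigma, \delta, (q_1, q_2), F_1 \times F_2)$ where the transition function is defined by $\delta((p, q), a) = (\delta_1(p, a), \delta_2(q, a))$
- Machine M simulates the two given machines in parallel, keeping each machine state in one component of the pair (p, q)
85Closure Properties (contd)
- Verification: By an easy induction on |x|, can show that $((q_1, q_2), x) \vdash_M^* ((p, q), \varepsilon)$ iff $(q_1, x) \vdash_{M_1}^* (p, \varepsilon)$ and $(q_2, x) \vdash_{M_2}^* (q, \varepsilon)$
- Therefore, for a pair of final states $(p, q) \in F_1 \times F_2$: x leads M to (p, q) iff x leads $M_1$ to p and $M_2$ to q
- This says that $x \in L(M) \Leftrightarrow x \in L(M_1)$ and $x \in L(M_2)$
- i.e., that $L(M) = L(M_1) \cap L(M_2)$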
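The cross-product construction is short enough to sketch directly. The version below builds only the state pairs reachable from the start pair and assumes both transition functions are total dictionaries; the function names and the two example machines are hypothetical.

```python
from itertools import product

def product_dfa(alphabet, d1, s1, f1, d2, s2, f2):
    """Cross-product DFA recognizing L(M1) ∩ L(M2); d1 and d2 must be total."""
    start = (s1, s2)
    delta, todo, seen = {}, [start], {start}
    while todo:
        p, q = todo.pop()
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])      # run both machines in parallel
            delta[((p, q), a)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    accepting = {(p, q) for (p, q) in seen if p in f1 and q in f2}
    return delta, start, accepting

def accepts(delta, start, accepting, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in accepting

# Hypothetical M1: even number of a's.  Hypothetical M2: last symbol is b.
d1 = {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}
d2 = {("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"}
d, s, f = product_dfa("ab", d1, "e", {"e"}, d2, "n", {"y"})
print(accepts(d, s, f, "aab"))   # True: two a's and ends in b
print(accepts(d, s, f, "ab"))    # False: odd number of a's
```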
86Closure Properties (contd)
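- Defn: A homomorphism h is a function that maps each symbol of Σ to a string over some alphabet Γ, i.e., $h : \Sigma \to \Gamma^*$
- The homomorphism is extended to operate on strings character-by-character, i.e., $h(a_1 a_2 \cdots a_n) = h(a_1) h(a_2) \cdots h(a_n)$
- It is further extended to languages element-wise, i.e., $h(L) = \{ h(w) : w \in L \}$
- Thm: If L is regular and h is a homomorphism, then h(L) is regular.
- Pf: Assume L is recognized by a DFA $M = (Q, \Sigma, \delta, q_0, F)$
87Closure Properties (contd)
- Construction: Construct the machine $M_h$ with the same states, start state and final states as M, where for each transition $p \xrightarrow{a} q$ in M, put into $M_h$ the transition $p \xrightarrow{h(a)} q$
- An easy induction establishes that M has a path from p to q labeled w iff $M_h$ has the corresponding path, labeled h(w)
- from which it follows that $L(M_h) = h(L)$
- Note: $M_h$ is a GNFA
[Figure: an edge labeled a from p to q in M becomes an edge labeled h(a) from p to q in $M_h$]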
88What is Not Regular?
- FA have a very limited computing ability. They cannot, for example, recognize strings of well-nested parentheses, or well-formed arithmetic expressions, or even the language of strings of the form ww, having two copies of the same substring.
- How can we show some languages are not regular? We will give a property that all regular languages must have (called the pumping property). Then, to show that a language L is not regular, we argue that it lacks this pumping property.
89Pumping Lemma
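- Thm (Pumping Lemma for Regular Languages): Suppose that L is an infinite¹ regular language. Then
  $\exists p \ge 1 \;\; \forall w \in L : |w| \ge p \Rightarrow \exists x, y, z \; [\, w = xyz \,\wedge\, y \ne \varepsilon \,\wedge\, |xy| \le p \,\wedge\, \forall i \ge 0 : x y^i z \in L \,]$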
¹All finite languages are regular, so only
infinite languages are of interest.
90Pumping Lemma (English)
- Thm (Pumping Lemma for Regular Languages): Suppose that L is an infinite¹ regular language. Then there is some number p (called the pumping length) such that
- if w is any string in L with |w| ≥ p, then w can be factored into 3 substrings, w = xyz, that satisfy the following 3 conditions
  - (i) y ≠ ε: y is not empty
  - (ii) |xy| ≤ p: the prefix and pumped part are short
  - (iii) for every i ≥ 0, $x y^i z \in L$: "pumped up" and "pumped down" (i = 0) versions of the string must also be in L
¹All finite languages are regular, so only
infinite languages are of interest.
91Pumping Lemma (contd)
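- Pf: Let $M = (Q, \Sigma, \delta, q_1, F)$ be a DFA recognizing L and let p be the number of its states.
- Let $w = a_1 a_2 \cdots a_n$ be an input string in L of length n where n ≥ p.
- Let $r_1, r_2, \ldots, r_{n+1}$ be the sequence of states that M enters while processing w, so that $r_1 = q_1$ and $r_{i+1} = \delta(r_i, a_i)$ for $1 \le i \le n$. This state sequence has length n+1 ≥ p+1.
- Among the first p+1 states of this sequence, at least 2 must be the same state (pigeonhole principle). Call the first of these $r_j$ and the second $r_k$. Because $r_k$ occurs among the first p+1 places in the sequence we have that k ≤ p+1. Define the following substrings of w: $x = a_1 \cdots a_{j-1}$, $y = a_j \cdots a_{k-1}$, $z = a_k \cdots a_n$
92Pumping Lemma (contd)
- Picture
[Figure: path from $r_1$ to $r_{n+1}$ with a loop at $r_j = r_k$]
- From the picture, we see that there is an accepting path from $r_1$ to a final state $r_{n+1}$ for all the strings of the form $x y^i z$, $i \ge 0$. Also, since $j \ne k$ it must be that $|y| > 0$. Furthermore, $k \le p+1$ so $|xy| \le p$.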
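The pigeonhole argument is constructive: running the DFA and stopping at the first repeated state yields the factorization w = xyz. A minimal sketch follows, using 0-based positions instead of the proof's 1-based indices; the helper names and the example DFA (even number of a's) are assumptions for illustration.

```python
def accepts(delta, start, accepting, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in accepting

def pump_split(delta, start, w):
    """Return (x, y, z) with w = xyz, where y labels the first loop in the run on w.

    Returns None if no state repeats, which can only happen when
    len(w) is smaller than the number of states.
    """
    first_seen = {start: 0}              # state -> position in the state sequence
    q = start
    for i, a in enumerate(w, start=1):
        q = delta[(q, a)]
        if q in first_seen:              # state repeats: positions j < i
            j = first_seen[q]
            return w[:j], w[j:i], w[i:]
        first_seen[q] = i
    return None

# Hypothetical DFA with p = 2 states: even number of a's
delta = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}
x, y, z = pump_split(delta, "even", "abab")
print((x, y, z))                          # ('a', 'b', 'ab')
print(all(accepts(delta, "even", {"even"}, x + y * i + z) for i in range(5)))
```

Because the search stops at the first repeat, the loop closes within the first p moves, which is why |xy| ≤ p holds for the split it returns.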
93Non-regular Examples
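- Ex: $L = \{ a^{n^2} : n \ge 0 \}$ is not regular.
- Pf: By contradiction. Suppose L is regular. Then by the Pumping Lemma, there is a pumping length p such that every $w \in L$ with $|w| \ge p$ factors as $w = xyz$ with $1 \le |y| \le p$ and $x y^i z \in L$ for all $i \ge 0$.
- Then it follows that $|x y^2 z| = |w| + |y|$ is a perfect square. This is impossible. For suppose $w = a^{n^2}$ for an n so large that $n \ge p$.
- Then $n^2 < n^2 + |y| \le n^2 + p < n^2 + 2n + 1 = (n+1)^2$
- Hence $|x y^2 z|$ falls in between consecutive perfect squares: a contradiction.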
94Non-regular Examples (contd)
- Ex
is not regular. - Pf By contradiction. Suppose L is
regular. Then by closure properties of the
regular languages - is
regular. Now -
We show cannot be regular,
which provides a contradiction. - If is regular, then there are
substrings - with such that
- Case 1. y is entirely in the a's. Assume it
is in the a's before the 2 b's (the other
subcase is symmetric). Then -
and
But then - where
This is a contradiction. -
95Non-regular Examples (contd)
- Case 2. y contains a b. Then
has more than 2 b's, and so
This is a contradiction. - Contradictions in all cases ? contradiction
to the assumption that is regular. So
is not regular. - Ex
is not regular. See Text, Example 1.73. - Ex
- is not regular.
- Pf Suppose B is regular. Then so is
- as is
its homomorphic image -
Contradiction.
96Decision Problems
- For a property/predicate P the decision problem for P is
  - Given: x
  - Question: Is P(x) true?
- Ex: Given DFA M, is it true that L(M) = ∅?
- Thm: For DFA M all the following decision problems are solvable, i.e., there exists an algorithm to decide the question for any input
        Given      Question
    (1) M, w       w ∈ L(M)?
    (2) M          L(M) = ∅?
    (3) M          L(M) = Σ*?
    (4) M, M′      L(M) ⊆ L(M′)?
    (5) M, M′      L(M) = L(M′)?
97Decision Problems (contd)
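- Pf: Assume given DFAs for inputs.
- (1) Trace w through M. Yes if w leads from s to some q ∈ F
- (2) Yes if there is some q ∈ F reachable from s
- (3) Convert M to the complement machine $\overline{M}$ and apply (2): $L(M) = \Sigma^* \Leftrightarrow L(\overline{M}) = \emptyset$
- (4) $L(M) \subseteq L(M') \Leftrightarrow L(M) \cap \overline{L(M')} = \emptyset$; build a DFA for the intersection and apply (2)
- (5) Use (4) twice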
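These decision procedures reduce to reachability searches. The sketch below covers (2) through (5), assuming total transition functions stored as dictionaries; rather than building the complement and product machines explicitly, the subset test searches the reachable state pairs directly, which is one way to realize the idea in (4). All function names and the example machines are hypothetical.

```python
def reachable(alphabet, delta, start):
    """All states reachable from the start state."""
    seen, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def is_empty(alphabet, delta, start, accepting):
    # (2): L(M) = ∅ iff no accepting state is reachable
    return not (reachable(alphabet, delta, start) & set(accepting))

def is_universal(alphabet, delta, start, accepting):
    # (3): L(M) = Σ* iff the complement machine accepts nothing
    non_accepting = reachable(alphabet, delta, start) - set(accepting)
    return is_empty(alphabet, delta, start, non_accepting)

def is_subset(alphabet, d1, s1, f1, d2, s2, f2):
    # (4): L(M1) ⊆ L(M2) iff no reachable pair (p, q) has p accepting and q not
    seen, todo = {(s1, s2)}, [(s1, s2)]
    while todo:
        p, q = todo.pop()
        if p in f1 and q not in f2:
            return False
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True

def is_equal(alphabet, d1, s1, f1, d2, s2, f2):
    # (5): use (4) twice
    return (is_subset(alphabet, d1, s1, f1, d2, s2, f2)
            and is_subset(alphabet, d2, s2, f2, d1, s1, f1))

# Hypothetical machines: M1 = even number of a's, M2 = all strings
d1 = {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}
d2 = {("all", "a"): "all", ("all", "b"): "all"}
print(is_empty("ab", d1, "e", {"e"}))                      # False
print(is_universal("ab", d2, "all", {"all"}))              # True
print(is_subset("ab", d1, "e", {"e"}, d2, "all", {"all"})) # True
print(is_equal("ab", d1, "e", {"e"}, d2, "all", {"all"}))  # False
```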