Title: ICS611 set 7
- Converting NFAs to DFAs
- How Lex is constructed
Converting an NFA to a DFA
Defn: The ε-closure of a state S is the set of all
states, including S itself, that you can get to
from S via ε-transitions alone. The ε-closure of state S is
denoted
Converting an NFA to a DFA (Cont.)
Example
The ε-closure of state 1 = {1, 2, 4}
The ε-closure of state 3 = {3, 2, 4}
Defn: The ε-closure of a set of states {S1, ..., Sn} is
the union of the ε-closures of S1, S2, ..., Sn.
Example: The ε-closure of the set of states {1, 3} above is
{1, 2, 4} ∪ {3, 2, 4} = {1, 2, 3, 4}
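The two definitions above can be sketched in a few lines of Python. The NFA diagram for the example is not available here, so the ε-transitions in `EPS` are hypothetical; they were chosen only because they reproduce the closures stated in the example. The function name `epsilon_closure` is likewise just illustrative.

```python
def epsilon_closure(states, eps):
    """Return every state reachable from `states` using ε-transitions alone.

    `eps` maps a state to the set of states it reaches by one ε-transition.
    """
    closure = set(states)          # each state is in its own ε-closure
    stack = list(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


# Hypothetical ε-transitions (an assumption, not taken from the slide):
# 1 -ε-> 2, 2 -ε-> 4, 3 -ε-> 2
EPS = {1: {2}, 2: {4}, 3: {2}}

print(sorted(epsilon_closure({1}, EPS)))
print(sorted(epsilon_closure({3}, EPS)))
print(sorted(epsilon_closure({1, 3}, EPS)))
```

Passing a set with several states gives the closure of a set directly, which is the same as taking the union of the individual closures.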
To construct a DFA from an NFA
Step 1: Let the start state of the DFA be formed
from the ε-closure of the start state of the
NFA.
Subsequent steps: If S is any state that
you have previously constructed for the DFA, and
it is formed from, say, states t1, ..., tr of the
NFA, then for any symbol x for which at least one
of the states t1, ..., tr has an x-successor,
the x-successor of S is the ε-closure of the
set of x-successors of t1, ..., tr.
Any state of the DFA that is formed from a set of
NFA states containing at least one accepting state
becomes an accepting state.
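The steps above can be sketched as follows. This is a minimal illustration, not how any particular tool does it: the names `nfa_to_dfa`, `moves` and `eps`, and the small NFA used to exercise it, are all assumptions made for the example.

```python
from collections import deque


def epsilon_closure(states, eps):
    """States reachable from `states` by ε-transitions alone."""
    closure, stack = set(states), list(states)
    while stack:
        for t in eps.get(stack.pop(), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def nfa_to_dfa(start, accepting, eps, moves):
    """Build a DFA whose states are frozensets of NFA states.

    `moves` maps (nfa_state, symbol) -> set of successor NFA states.
    """
    # Step 1: the DFA start state is the ε-closure of the NFA start state.
    start_set = frozenset(epsilon_closure({start}, eps))
    dfa_moves, seen, queue = {}, {start_set}, deque([start_set])
    while queue:
        S = queue.popleft()
        # Only symbols for which some member of S has a successor.
        symbols = {x for (t, x) in moves if t in S}
        for x in sorted(symbols):
            targets = set()
            for t in S:
                targets |= moves.get((t, x), set())
            T = frozenset(epsilon_closure(targets, eps))
            dfa_moves[(S, x)] = T
            if T not in seen:
                seen.add(T)
                queue.append(T)
    # A DFA state is accepting if it contains an accepting NFA state.
    dfa_accepting = {S for S in seen if S & accepting}
    return start_set, dfa_accepting, dfa_moves


# Hypothetical NFA for "a followed by any number of b's":
# 1 -a-> 2, 2 -ε-> 3, 3 -b-> 2, with 3 accepting.
start, acc, table = nfa_to_dfa(
    start=1, accepting={3}, eps={2: {3}}, moves={(1, "a"): {2}, (3, "b"): {2}}
)
for (S, x), T in table.items():
    print(sorted(S), x, sorted(T))
```

Representing each DFA state as a frozenset keeps the correspondence with the NFA states visible; a real generator would normally renumber the states afterwards.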
To construct a DFA from an NFA (Cont.1)
Example 1: To convert the following NFA
[Figure: NFA diagram]
we get
[Figure: resulting DFA diagram]
This constructs a DFA that has no
ε-transitions.
To construct a DFA from an NFA (Cont.2)
Example 2: To convert the NFA for an identifier
to a DFA
[Figure: NFA for an identifier]
To construct a DFA from an NFA (Cont.3)
we get
[Figure: resulting DFA for an identifier]
Minimizing the Number of States in a DFA
Step 1: Start with two sets of states
(a) all the accepting states, and
(b) all the non-accepting states
Subsequent steps: Given the sets of states S1, ..., Sr,
consider each set S and each symbol x in turn.
If any member of S has an x-successor and this
x-successor is in, say, S', then unless all the
members of S have x-successors that are in S',
split up S into those members whose x-successors
are in S' and the others (which don't have
x-successors in S').
Repeat until no set can be split any further; each
remaining set becomes a single state of the minimized DFA.
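A rough Python sketch of this splitting procedure is given below. It differs slightly from the wording above: instead of splitting a set into two pieces at a time, it groups the members of each set by which current set their x-successor falls in (or "none"), which reaches the same final partition. The function name `minimize` and both test DFAs are assumptions made for illustration; the DFA diagrams from the slides are not available here.

```python
def minimize(states, accepting, moves, alphabet):
    """Return the final partition of `states` as a list of sets.

    `moves` maps (state, symbol) -> state; a missing entry means the
    state has no successor on that symbol.
    """
    # Step 1: accepting states versus non-accepting states.
    partition = [set(accepting), set(states) - set(accepting)]
    partition = [group for group in partition if group]
    changed = True
    while changed:
        changed = False
        for x in alphabet:
            new_partition = []
            for group in partition:
                # Bucket members by the set containing their x-successor.
                buckets = {}
                for s in group:
                    target = moves.get((s, x))
                    key = next(
                        (i for i, g in enumerate(partition) if target in g),
                        None,  # no x-successor at all
                    )
                    buckets.setdefault(key, set()).add(s)
                new_partition.extend(buckets.values())
                if len(buckets) > 1:
                    changed = True
            partition = new_partition
    return partition


# Hypothetical DFA: all of 1, 2, 3 accept, but only state 1 has an a-successor.
moves = {(1, "a"): 2, (1, "b"): 3, (2, "b"): 3, (3, "b"): 3}
result = minimize({1, 2, 3}, {1, 2, 3}, moves, ["a", "b"])
print(sorted(sorted(group) for group in result))
```

Each set in the returned partition corresponds to one state of the minimized DFA.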
Minimizing the Number of States in a DFA (Cont.1)
Example 1. Consider the DFA we constructed for an
identifier (with renumbered states)
[Figure: DFA for an identifier, states renumbered]
Minimizing the Number of States in a DFA (Cont.2)
The sets of states for this DFA are
S1 (nonaccepting states): 1
S2 (accepting states): 2, 3, 4
All states in S2 have both a letter-successor and a
digit-successor, and the successor states are all
in the set of states S2. So S2 is not split, and we
combine all the states of S2 to get
[Figure: minimized DFA for an identifier]
Minimizing the Number of States in a DFA (Cont.3)
Example 2. Consider the DFA
[Figure: DFA with states 1, 2 and 3]
All of the states (1, 2, and 3) are accepting
states and all their successors are also
accepting states, but state 1 has an a-successor
whereas states 2 and 3 do not.
Minimizing the Number of States in a DFA (Cont.4)
So, we split the set of accepting states into two
sets S1 and S2, where S1 consists of state 1
and S2 consists of states 2 and 3, to get
[Figure: resulting DFA]
HOW LEX WORKS
- Using the methods described above, Lex constructs
a minimized finite automaton for each regular
expression in the definition file.
- Lex generates a C program, which we will refer to
as lex.yy.c.
- The finite automata are represented in lex.yy.c
by a set of arrays.
- For instance, a portion of a finite automaton such
as a transition from state 4 to state 7 on "."
can be represented by entering, in the associated
array, a 7 in the column for "." at row 4.
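As a small illustration of the array idea, the sketch below stores one row per state and one column per input symbol. The column layout, the table size, the use of 0 for "no transition" and the choice of "." as the symbol on the edge from state 4 to state 7 are all assumptions for the example; the actual arrays in lex.yy.c are laid out differently and are usually compressed.

```python
# Hypothetical column assignment for the input symbols.
COLUMN = {"digit": 0, ".": 1, "e": 2}
NUM_STATES = 8   # rows 0..7; states are numbered from 1
NO_MOVE = 0      # 0 marks "no transition" since no state is numbered 0

# One row per state, one column per symbol, initially with no transitions.
table = [[NO_MOVE] * len(COLUMN) for _ in range(NUM_STATES)]

# The edge from state 4 to state 7: a 7 in the "." column at row 4.
table[4][COLUMN["."]] = 7


def successor(state, symbol):
    """Look up the next state, or NO_MOVE if there is none."""
    return table[state][COLUMN[symbol]]


print(successor(4, "."))
print(successor(4, "digit"))
```

With this layout a transition is a single array lookup, which is why table-driven scanners are fast.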
- lex.yy.c keeps track of the latest accepting
state it has reached in any of the finite
automata, plus the number of source characters
it has read at that point.
- When it reaches a stage where no transition exists
for the next source symbol from any of the states
it has reached in any of the finite automata, it
picks the regular expression corresponding to the
finite automaton in which this last accepting
state occurs, and it pushes back onto the
remaining input any source characters read after
reaching that state.
Consider, for example, a Lex defn. file containing
digit+(. digit+)?            return Number
digit+(. digit+)?e digit+    return Float
Finite automata corresponding to the above regular
expressions are
[Figure: DFA for Number, states 1 to 4, transitions on digit and "."]
[Figure: DFA for Float, states 1 to 6, transitions on digit, "." and e]
- Example: let the remaining input be 36e8X1
- On reading the 3, lex.yy.c records that the
latest accepting state encountered is state 2 in
the DFA for Number, and the no. of source
characters read is 1. (It has also reached state
2 in the DFA for Float.)
- On reading the 6, lex.yy.c records the above
again, except that the no. of characters read is
2.
- On reading the e, no accepting state is reached,
so the record is unchanged.
- On reading the 8, lex.yy.c records that the
latest accepting state is state 6 in the DFA for
Float, and the no. of characters read is 4.
- On reading the X, lex.yy.c finds that state 6
has no successor. This is the 5th character
read. The last accepting state (state 6) was reached
in the DFA for Float after 4 characters had been
read. Hence Float is taken as matching the
remaining input, and the 5th character read, i.e.
the X, is pushed back onto the remaining
input.
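The trace above can be reproduced with a short simulation. This is only a sketch of the bookkeeping described (latest accepting state, characters read, push-back), not the code Lex actually generates. The transition tables are assumptions reconstructed from the two regular expressions and from the state numbers mentioned in the trace, and `next_token` and `symbol_class` are hypothetical names.

```python
def symbol_class(ch):
    """Map a character to the symbol used in the transition tables."""
    return "digit" if ch.isdigit() else ch


# Assumed DFAs (reconstructed, not copied from the slide diagrams).
NUMBER = {"start": 1, "accepting": {2, 4},
          "moves": {(1, "digit"): 2, (2, "digit"): 2, (2, "."): 3,
                    (3, "digit"): 4, (4, "digit"): 4}}
FLOAT = {"start": 1, "accepting": {6},
         "moves": {(1, "digit"): 2, (2, "digit"): 2, (2, "."): 3,
                   (3, "digit"): 4, (4, "digit"): 4, (2, "e"): 5,
                   (4, "e"): 5, (5, "digit"): 6, (6, "digit"): 6}}
DFAS = [("Number", NUMBER), ("Float", FLOAT)]


def next_token(text, dfas):
    """Return (token name, matched text, remaining input) or None."""
    current = {name: dfa["start"] for name, dfa in dfas}
    last_accept = None                 # (token name, characters read)
    for count, ch in enumerate(text, start=1):
        sym = symbol_class(ch)
        following = {}
        for name, dfa in dfas:
            if name in current:
                target = dfa["moves"].get((current[name], sym))
                if target is not None:
                    following[name] = target
        if not following:              # no automaton can move: stop
            break
        current = following
        for name, dfa in dfas:
            if name in current and current[name] in dfa["accepting"]:
                last_accept = (name, count)
                break                  # the earlier rule wins a tie
    if last_accept is None:
        return None
    name, length = last_accept
    # Characters read after the last accepting state are pushed back.
    return name, text[:length], text[length:]


print(next_token("36e8X1", DFAS))
```

Running all the automata in step and remembering only the most recent accepting point is what lets the scanner prefer the longest match while still being able to back up when a longer attempt fails.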