Title: Minimizing the number of states in DFAs
1 Minimizing the number of states in DFAs
- Paul Richardson
- Student Lecture 6 November
- Course 08.73.11 Algorithms, Logic and Complexity
- University of Iceland Department of Computer
Science
2 DFA use
- DFAs are widely used as pattern recognisers, e.g.
  - in compilers
  - in text processing.
- Text processing tasks for DFAs use increasingly large data sets, which usually require more complex DFAs, e.g. reference corpora (the Bank of English corpus has 400 million words).
- Smaller DFAs (fewer states) use less space and time.
- DFAs are minimised to save resources.
- Problems in 1a) and 1b) are essentially the same processing problem.
3 Compiler conversion of regular expression
input → regular expression
regular expression → NFA (by Thompson's construction)
NFA → DFA (by subset construction)
DFA → DFAMIN (by minimisation algorithm)
DFAMIN → parser
4 Thompson's Construction
NFA generated automatically for the regex (a ∪ b)abb
5 Subset Construction: NFA N → DFA D
- Let N = (Q, Σ, δ, q0, F) and D = (Q', Σ, δ', q0', F')
- Let Q' = P(Q) = {∅, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}
- Create the start state of D as the set of all states accessible from the start state of N by ε arrows: {1,3}
- Accept states F' = {{1}, {1,2}, {1,3}, {1,2,3}}
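The construction above can be sketched in Python. This is a minimal sketch, not the lecture's own code: the slide does not list the transitions of N, so the table `nfa_delta` below is a hypothetical three-state NFA chosen only so that the start state comes out as {1,3} and state 1 is the accept state. The names `EPS`, `eps_closure` and `subset_construction` are likewise assumptions of this sketch. It builds only the subsets reachable from the start state rather than all of P(Q).

```python
from collections import deque

EPS = ""  # label used for epsilon arrows (an assumption of this sketch)


def eps_closure(states, delta):
    """Return every NFA state reachable from `states` by epsilon arrows only."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def subset_construction(alphabet, delta, start, accepting):
    """Build the reachable part of DFA D from NFA N; DFA states are frozensets."""
    d_start = eps_closure({start}, delta)
    d_delta, seen, queue = {}, {d_start}, deque([d_start])
    while queue:
        subset = queue.popleft()
        for a in alphabet:
            moved = set()
            for q in subset:
                moved |= set(delta.get((q, a), ()))
            target = eps_closure(moved, delta)
            d_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    # A subset accepts if it contains at least one accept state of N.
    d_accepting = {s for s in seen if s & set(accepting)}
    return seen, d_delta, d_start, d_accepting


# Hypothetical NFA: states 1, 2, 3; start state 1; accept state 1.
nfa_delta = {
    (1, "b"): {2}, (1, EPS): {3},
    (2, "a"): {2, 3}, (2, "b"): {3},
    (3, "a"): {1},
}
d_states, d_delta, d_start, d_accepting = subset_construction(
    "ab", nfa_delta, 1, {1}
)
print(sorted(d_start))
```

For this hypothetical NFA only some of the eight subsets are reachable, so the sketch reports fewer accept states than the full F' listed on the slide.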
6 DFA → DFAMIN
- All minimising algorithms involve finding equivalent states and calling them the same state, thus collapsing to fewer states. Many variations on the theme have been published.
- The most commonly shown algorithm is a partitioning algorithm that runs in O(n^2).
- Hopcroft published an O(n log n) algorithm in 1971.
- Much literature deals with the study of this algorithm and its variations.
7 Partitioning Algorithm O(n^2)
1. Partition into blocks by final/not final: (1,2,3,4,5) (6)
2. On input 0, partition by successor state: (1,2,3,4) (5) (6)
3. Repeat until no new blocks are generated, i.e. (1,2,3) (4) (5) (6) ... (1) (2) (3) (4) (5) (6)
4. Combine the members of each block to form a single state per block.
In this case all states are inequivalent, so the DFA was already minimal and nothing is merged. Normally, though, some blocks contain more than one state, and each such block is treated as one state.
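The four steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name `minimise_by_partitioning` and the five-state DFA in `dfa_delta` are hypothetical (the slide's own six-state example is not reproduced because its transition table is not given), and the DFA is assumed to be complete, i.e. every state has a transition on every symbol.

```python
def minimise_by_partitioning(states, alphabet, delta, accepting):
    """Refine blocks until none splits; return the final blocks as sorted lists."""
    # Step 1: partition by final / not final.
    block_of = {q: int(q in accepting) for q in states}
    while True:
        # Step 2: two states stay together only if they share a block and
        # every input symbol sends them to the same successor block.
        ids = {}
        refined = {}
        for q in states:
            signature = (block_of[q],) + tuple(
                block_of[delta[(q, a)]] for a in alphabet
            )
            refined[q] = ids.setdefault(signature, len(ids))
        # Step 3: stop when no new blocks were generated.
        if len(ids) == len(set(block_of.values())):
            break
        block_of = refined
    # Step 4: each block becomes one state of the minimised DFA.
    blocks = {}
    for q in states:
        blocks.setdefault(block_of[q], []).append(q)
    return sorted(sorted(b) for b in blocks.values())


# Hypothetical complete DFA over {a, b}; state 5 is the only final state.
dfa_delta = {
    (1, "a"): 2, (1, "b"): 3,
    (2, "a"): 2, (2, "b"): 4,
    (3, "a"): 2, (3, "b"): 3,
    (4, "a"): 2, (4, "b"): 5,
    (5, "a"): 2, (5, "b"): 3,
}
blocks = minimise_by_partitioning([1, 2, 3, 4, 5], "ab", dfa_delta, {5})
print(blocks)
```

In this hypothetical DFA, states 1 and 3 behave identically on every input and end up in the same block, so the minimised automaton has four states instead of five.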
8 Partitioning Algorithm O(n log n), Hopcroft 1971
This algorithm may still need n iterations, but each iteration does less work, which gives O(n log n) overall. Once we have partitioned on a block with an input symbol, we do not need to repeat that until the block is split. Then we need only partition on one of the two sub-blocks, and we can always choose the smaller of the two, which needs less processing.
Hopcroft:
1. Invert the state table.
2. Partition by final/not final: (1,2,3,4,5) (6)
3. Select a partition and an input symbol, e.g. ((6),0).
4. Partition by the condition "on input 0 the state moves into (6)" to get (1,2,3,4) (5) (6). (N.b. using ((1,2,3,4,5),0) in step 3 is equivalent.)
5. Iterate the partitioning, always choosing the smaller block.
6. Combine the members of each partition to form a single state from each partition.
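The six steps can be sketched in Python as below. This is a minimal sketch and not Hopcroft's original formulation: the function name `hopcroft_minimise` and the example DFA are hypothetical, the DFA is assumed to be complete and to have at least one state, and Python sets are used for clarity, so the sketch does not itself achieve the O(n log n) bound, which depends on more careful data structures.

```python
def hopcroft_minimise(states, alphabet, delta, accepting):
    """Return the blocks of equivalent states as sorted lists."""
    # Step 1: invert the state table: inverse[(q, a)] = states that reach q on a.
    inverse = {}
    for (p, a), q in delta.items():
        inverse.setdefault((q, a), set()).add(p)

    # Step 2: partition by final / not final.
    final = frozenset(accepting)
    nonfinal = frozenset(states) - final
    partition = {b for b in (final, nonfinal) if b}

    # Step 3: for a complete DFA either initial block works as the first
    # splitter, so start with the smaller one.
    worklist = {min(partition, key=len)}
    while worklist:
        splitter = worklist.pop()
        for a in alphabet:
            # States whose a-successor lies inside the splitter block.
            preds = set()
            for q in splitter:
                preds |= inverse.get((q, a), set())
            # Step 4: split every block that is only partly inside preds.
            for block in list(partition):
                inside, outside = block & preds, block - preds
                if not inside or not outside:
                    continue
                partition.remove(block)
                partition |= {inside, outside}
                # Step 5: keep both halves if the block was still pending,
                # otherwise only the smaller half needs processing.
                if block in worklist:
                    worklist.remove(block)
                    worklist |= {inside, outside}
                else:
                    worklist.add(min(inside, outside, key=len))
    # Step 6: each remaining block becomes one state.
    return sorted(sorted(b) for b in partition)


# Hypothetical complete DFA over {a, b}; state 5 is the only final state.
example_delta = {
    (1, "a"): 2, (1, "b"): 3,
    (2, "a"): 2, (2, "b"): 4,
    (3, "a"): 2, (3, "b"): 3,
    (4, "a"): 2, (4, "b"): 5,
    (5, "a"): 2, (5, "b"): 3,
}
hopcroft_blocks = hopcroft_minimise([1, 2, 3, 4, 5], "ab", example_delta, {5})
print(hopcroft_blocks)
```

On this hypothetical DFA the worklist only ever holds single-state blocks, which illustrates why choosing the smaller sub-block keeps the work per iteration low.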
9 References
- John Hopcroft, An n log n Algorithm for Minimizing States in a Finite Automaton, STAN-CS-71-190, January 1971
- Sipser, Introduction to the Theory of Computation, ISBN 0-619-21764-2
- Google Scholar
- Various web sites on algorithms, research papers and teaching material.
Questions?
10 Binary dinosaur MZ80k