Title: Hidden Markov Models (HMM) Rabiner
1 Hidden Markov Models (HMM): Rabiner's Paper
- Markoviana Reading Group
- Computer Science and Engineering Dept.
- Arizona State University
2 Stationary and Non-stationary
- Stationary process: its statistical properties do not vary with time
- Non-stationary process: the signal properties vary over time
3 HMM Example: Casino Coin
[Diagram: two states, Fair and Unfair, each with a table of symbol emission probabilities (two CDF tables).
State transition probabilities: 0.9, 0.1, 0.2, 0.8.
Symbol emission probabilities: 0.5, 0.5 and 0.3, 0.7.
Observation symbols: H, T.]
Observation Sequence: HTHHTTHHHTHTHTHHTHHHHHHTHTHH
State Sequence: FFFFFFUUUFFFFFFUUUUUUUFFFFFF
Motivation: given a sequence of Hs and Ts, can you tell at what times the casino cheated?
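To make the example concrete, here is a minimal sketch of the casino model as numpy arrays, assuming one plausible reading of the diagram's numbers (Fair stays Fair with 0.9, Unfair stays Unfair with 0.8, the fair coin is 50/50, the unfair coin favors tails) and an assumed uniform start distribution, which the slide does not show:

```python
import numpy as np

states = ["Fair", "Unfair"]
symbols = ["H", "T"]

A = np.array([[0.9, 0.1],     # Fair   -> Fair, Unfair  (assumed reading of the diagram)
              [0.2, 0.8]])    # Unfair -> Fair, Unfair
B = np.array([[0.5, 0.5],     # Fair:   P(H), P(T)
              [0.3, 0.7]])    # Unfair: P(H), P(T)       (assumed reading of the diagram)
pi = np.array([0.5, 0.5])     # start distribution (assumption; not shown on the slide)

# Integer-code the observation sequence from the slide.
obs = [symbols.index(c) for c in "HTHHTTHHHTHTHTHHTHHHHHHTHTHH"]
```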
4 Properties of an HMM
- First-order Markov process
- qt only depends on qt-1
- Time is discrete
5 Elements of an HMM
- N, the number of states
- M, the number of symbols
- States S1, S2, …, SN
- Observation symbols O1, O2, …, OM
- λ, the probability distributions A, B, and π
6 HMM Basic Problems
- Given an observation sequence O = O1O2O3…OT and λ, find P(O|λ): Forward Algorithm / Backward Algorithm
- Given O = O1O2O3…OT and λ, find the most likely state sequence Q = q1q2…qT: Viterbi Algorithm
- Given O = O1O2O3…OT and λ, re-estimate λ so that P(O|λ) is higher than it is now: Baum-Welch Re-estimation
7 Forward Algorithm Illustration
αt(i) is the probability of observing the partial sequence O1O2O3…Ot and being in state Si at time t.
8 Forward Algorithm Illustration (cont'd)
αt(i) is the probability of observing the partial sequence O1O2O3…Ot and being in state Si at time t.
[Trellis figure: one row per state S1…SN, one column per observation O1…OT. The first column holds α1(j) = πj bj(O1); each later column holds αt(j) = [Σi αt-1(i) aij] bj(Ot). The total of the last column gives the solution, P(O|λ).]
9 Forward Algorithm
- Definition: αt(i) = P(O1O2…Ot, qt = Si | λ), the probability of observing the partial sequence O1O2…Ot and being in state Si at time t
- Initialization: α1(i) = πi bi(O1)
- Induction: αt+1(j) = [Σi αt(i) aij] bj(Ot+1)
- Problem 1 answer: P(O|λ) = Σi αT(i)
- Complexity: O(N²T)
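A minimal sketch of the forward recursion in Python/numpy, using the A, B, pi, obs arrays from the casino sketch above (those names are an assumption of this sketch, not the slides'):

```python
import numpy as np

def forward(A, B, pi, obs):
    """alpha[t, i] = P(O_1 ... O_t, q_t = S_i | lambda)."""
    N, T = len(pi), len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                      # initialization
    for t in range(1, T):                             # induction: O(N^2) work per step
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

# Problem 1 answer: P(O | lambda) is the sum of the trellis's last column.
# p_obs = forward(A, B, pi, obs)[-1].sum()
```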
10 Backward Algorithm Illustration
βt(i) is the probability of observing the partial sequence Ot+1Ot+2…OT given that the state at time t is Si.
11 Backward Algorithm
- Definition: βt(i) = P(Ot+1Ot+2…OT | qt = Si, λ), the probability of observing the partial sequence Ot+1Ot+2…OT given that the state at time t is Si
- Initialization: βT(i) = 1
- Induction: βt(i) = Σj aij bj(Ot+1) βt+1(j)
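A matching sketch of the backward recursion, under the same assumed array conventions as the forward sketch:

```python
import numpy as np

def backward(A, B, pi, obs):
    """beta[t, i] = P(O_{t+1} ... O_T | q_t = S_i, lambda)."""
    N, T = len(pi), len(obs)
    beta = np.zeros((T, N))
    beta[-1] = 1.0                                    # initialization: beta_T(i) = 1
    for t in range(T - 2, -1, -1):                    # induction, moving backwards in time
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta
```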
12 Q2 Optimality Criterion 1
- Maximize the expected number of correct individual states
- Definition: γt(i) is the probability of being in state Si at time t, given the observation sequence O and the model λ: γt(i) = αt(i) βt(i) / Σj αt(j) βt(j)
- Problem 2 answer: choose qt* = argmaxi γt(i) for every t
- Problem: if some aij = 0, the optimal state sequence may not even be a valid state sequence
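A sketch of criterion-1 decoding, assuming the forward and backward functions sketched on the previous slides are in scope:

```python
import numpy as np

def gamma_decode(A, B, pi, obs):
    """Choose the individually most likely state at every time step."""
    alpha = forward(A, B, pi, obs)                    # forward sketch from slide 9
    beta = backward(A, B, pi, obs)                    # backward sketch from slide 11
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)         # gamma_t(i); each row sums to 1
    return gamma.argmax(axis=1)                       # may chain states with a_ij = 0
```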
13 Q2 Optimality Criterion 2
- Find the single best state sequence (path), i.e. maximize P(Q|O, λ)
- Definition: δt(i) is the highest probability of a state path that accounts for the partial observation sequence O1O2O3…Ot and ends in state Si
14 Viterbi Algorithm
The major difference from the forward algorithm: maximization instead of summation, i.e. δt+1(j) = [maxi δt(i) aij] bj(Ot+1), with backpointers ψt+1(j) = argmaxi δt(i) aij kept for the traceback.
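A sketch of the Viterbi recursion with backpointers, again under the assumed array conventions of the earlier sketches:

```python
import numpy as np

def viterbi(A, B, pi, obs):
    """Single best state path: max replaces the forward algorithm's sum."""
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)                 # backpointers for the traceback
    delta[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1, :, None] * A            # scores[i, j] = delta_{t-1}(i) * a_ij
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta[-1].argmax())]                  # traceback starts at the max of the last column
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1]
```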
15 Viterbi Algorithm Illustration
δt(i) is the highest probability of a state path for the partial observation sequence O1O2O3…Ot that ends in state Si.
[Trellis figure: one row per state S1…SN, one column per observation O1…OT. The first column holds δ1(j) = πj bj(O1); each later column holds δt(j) = [maxi δt-1(i) aij] bj(Ot). The maximum of the last column indicates where the traceback starts.]
16 Relations with DBN
- Forward function: αt+1(j) is computed from αt(i), aij, and bj(Ot+1)
- Backward function: βt(i) is computed from βt+1(j), aij, and bj(Ot+1), with βT(i) = 1
- Viterbi algorithm: δt+1(j) is computed from δt(i), aij, and bj(Ot+1)
[Figure: each recursion drawn as a fragment over two adjacent time slices of the network.]
17 Some more definitions
γt(i) is the probability of being in state Si at time t, given O and λ: γt(i) = αt(i) βt(i) / P(O|λ)
ξt(i,j) is the probability of being in state Si at time t and in state Sj at time t+1, given O and λ: ξt(i,j) = αt(i) aij bj(Ot+1) βt+1(j) / P(O|λ)
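A sketch that computes both quantities, assuming the forward and backward sketches from the earlier slides are in scope:

```python
import numpy as np

def gamma_xi(A, B, pi, obs):
    """gamma[t, i] = P(q_t = S_i | O, lambda); xi[t, i, j] = P(q_t = S_i, q_{t+1} = S_j | O, lambda)."""
    alpha = forward(A, B, pi, obs)                    # forward sketch from slide 9
    beta = backward(A, B, pi, obs)                    # backward sketch from slide 11
    p_obs = alpha[-1].sum()                           # P(O | lambda)
    gamma = alpha * beta / p_obs
    N, T = len(pi), len(obs)
    xi = np.zeros((T - 1, N, N))
    for t in range(T - 1):
        # xi_t(i, j) = alpha_t(i) a_ij b_j(O_{t+1}) beta_{t+1}(j) / P(O | lambda)
        xi[t] = alpha[t, :, None] * A * B[:, obs[t + 1]] * beta[t + 1] / p_obs
    return gamma, xi
```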
18 Baum-Welch Re-estimation
- Expectation-Maximization (EM) algorithm
- Expectation step: compute γt(i) and ξt(i,j) under the current model λ
19 Baum-Welch Re-estimation (cont'd)
- π̄i = γ1(i), the expected frequency in state Si at time t = 1
- āij = Σt ξt(i,j) / Σt γt(i), expected transitions from Si to Sj over expected transitions out of Si
- b̄j(k) = Σt: Ot=vk γt(j) / Σt γt(j), expected emissions of symbol vk from Sj over expected visits to Sj
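A sketch of one re-estimation (M) step built on the gamma_xi sketch from slide 17; the function name and layout are illustrative assumptions, while the count ratios are the standard Baum-Welch updates:

```python
import numpy as np

def baum_welch_step(A, B, pi, obs):
    """One EM iteration: expected counts from gamma/xi, then the re-estimates."""
    gamma, xi = gamma_xi(A, B, pi, obs)                        # E-step
    new_pi = gamma[0]                                          # expected frequency in S_i at t = 1
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]   # expected transitions / expected visits
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):                                # expected emissions of symbol v_k
        new_B[:, k] = gamma[np.asarray(obs) == k].sum(axis=0) / gamma.sum(axis=0)
    return new_A, new_B, new_pi
```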
20 Notes on the Re-estimation
- If the model does not change, it has reached a local maximum.
- Depending on the model, many local maxima can exist.
- The re-estimated probabilities will sum to 1.
21 Implementation issues
- Scaling
- Multiple observation sequences
- Initial parameter estimation
- Missing data
- Choice of model size and type
22 Scaling
- Calculation of the scaled forward variable α̂t(i)
- Recursion to calculate α̂t(i): at every time step the column of forward variables is renormalized by the scaling coefficient ct, the reciprocal of the column total
23 Scaling (cont'd)
- Calculation of the scaled backward variable β̂t(i), using the same scaling coefficients ct as the forward pass
- Desired condition
- Note that the analogous normalization is not true for the backward variables!
24 Scaling (cont'd)
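A sketch of the scaled forward pass; the variable names (alpha_hat, c) are assumptions, and the final line uses the standard identity log P(O|λ) = -Σt log ct:

```python
import numpy as np

def forward_scaled(A, B, pi, obs):
    """Forward pass with per-column scaling so the variables never underflow."""
    N, T = len(pi), len(obs)
    alpha_hat = np.zeros((T, N))
    c = np.zeros(T)                                   # scaling coefficients c_t
    alpha_hat[0] = pi * B[:, obs[0]]
    c[0] = 1.0 / alpha_hat[0].sum()
    alpha_hat[0] *= c[0]                              # every column is normalized to sum to 1
    for t in range(1, T):
        alpha_hat[t] = (alpha_hat[t - 1] @ A) * B[:, obs[t]]
        c[t] = 1.0 / alpha_hat[t].sum()
        alpha_hat[t] *= c[t]
    log_prob = -np.log(c).sum()                       # log P(O | lambda) = -sum_t log c_t
    return alpha_hat, c, log_prob
```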
25 Maximum log-likelihood
- Initialization
- Recursion
- Termination
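The slide's formulas are not preserved; one common reading, consistent with Rabiner's discussion, is that Viterbi can be run entirely in the log domain, where products become sums and no scaling is needed. A hedged sketch under that assumption:

```python
import numpy as np

def viterbi_log(A, B, pi, obs):
    """Viterbi with log probabilities: nothing underflows, so no scaling is required."""
    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)      # zero probabilities become -inf
    N, T = len(pi), len(obs)
    phi = np.zeros((T, N))                                     # phi = log delta
    psi = np.zeros((T, N), dtype=int)
    phi[0] = logpi + logB[:, obs[0]]                           # initialization
    for t in range(1, T):                                      # recursion
        scores = phi[t - 1, :, None] + logA
        psi[t] = scores.argmax(axis=0)
        phi[t] = scores.max(axis=0) + logB[:, obs[t]]
    best_log_prob = float(phi[-1].max())                       # termination
    path = [int(phi[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t, path[-1]]))
    return path[::-1], best_log_prob
```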
26 Multiple Observation Sequences
- Problem with re-estimation: the single-sequence re-estimation formulas must be modified to accumulate expected counts over all sequences (see the sketch below)
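A short sketch of how that accumulation could look, reusing the gamma_xi sketch from slide 17 (the function and its name are illustrative assumptions):

```python
import numpy as np

def baum_welch_step_multi(A, B, pi, sequences):
    """Accumulate expected counts over all sequences, then normalize once."""
    num_A, den_A = np.zeros_like(A), np.zeros(len(pi))
    num_B, den_B = np.zeros_like(B), np.zeros(len(pi))
    new_pi = np.zeros_like(pi)
    for obs in sequences:
        gamma, xi = gamma_xi(A, B, pi, obs)            # E-step for one sequence
        new_pi += gamma[0]
        num_A += xi.sum(axis=0)
        den_A += gamma[:-1].sum(axis=0)
        for k in range(B.shape[1]):
            num_B[:, k] += gamma[np.asarray(obs) == k].sum(axis=0)
        den_B += gamma.sum(axis=0)
    return num_A / den_A[:, None], num_B / den_B[:, None], new_pi / len(sequences)
```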
27 Initial estimates of parameters
- For π and A, random or uniform initial values are sufficient
- For B (discrete symbol probabilities), a good initial estimate is needed
28 Insufficient training data
- Solutions
- Increase the size of training data
- Reduce the size of the model
- Interpolate parameters using another model
29 References
- L. Rabiner. A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition. Proceedings of the IEEE, 1989.
- S. Russell, P. Norvig. Probabilistic Reasoning Over Time. AI: A Modern Approach, Ch. 15, 2002 (draft).
- V. Borkar, K. Deshmukh, S. Sarawagi. Automatic segmentation of text into structured records. ACM SIGMOD, 2001.
- T. Scheffer, C. Decomain, S. Wrobel. Active Hidden Markov Models for Information Extraction. Proceedings of the International Symposium on Intelligent Data Analysis, 2001.
- S. Ray, M. Craven. Representing Sentence Structure in Hidden Markov Models for Information Extraction. Proceedings of the 17th International Joint Conference on Artificial Intelligence, 2001.