1
Hidden Markov Models: Basics
  • Raj Bandyopadhyay
  • 11/07/2001

2
Our running example
  • Hypothetical Nucleic Acid (HNA)
  • Two bases or residues: H and T
  • Assume existence of HNA databases
  • Example sequence: HTTTHT

3
Motivating Example
  • HNA sequence database: HTT, TTT, HHH, TTH
      Position:    1      2      3
                   H      T      T
                   T      T      T
                   H      H      H
                   T      T      H
      P(H):       0.5    0.25   0.5
      P(T):       0.5    0.75   0.5
  • Positions 1 and 3 are determined by an unbiased
    coin, whereas position 2 is determined by a
    biased coin, as the sketch below estimates.
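A quick way to check these per-position frequencies (a sketch of mine, not part of the original deck):

    # Estimate P(H) and P(T) at each position of the HNA database.
    db = ["HTT", "TTT", "HHH", "TTH"]
    for pos in range(3):
        column = [seq[pos] for seq in db]
        p_h = column.count("H") / len(column)
        print(f"position {pos + 1}: P(H) = {p_h}, P(T) = {1 - p_h}")
    # -> position 1: P(H) = 0.5, P(T) = 0.5
    # -> position 2: P(H) = 0.25, P(T) = 0.75
    # -> position 3: P(H) = 0.5, P(T) = 0.5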

4
Motivating Example (contd.)
  • Imagine 2 people, A and B
  • A holds an unbiased coin: P(H) = P(T) = 0.5
  • B holds a biased coin: P(T) = 0.75, P(H) = 0.25
  • Our HNA database can then be explained:
  • A flips first, followed by B, then A again
  • Representation as a state diagram (Markov chain)

5
Markov Representation
(Figure: state diagram of the A/B coin-flipping process as a Markov chain)
6
Hidden Markov Model
  • What if
  • the people flipping the coins are hidden, and
  • only the results of the coin flips are known?
  • I.e., only the emissions are known; the states are unknown
  • Hidden Markov Model
  • What can we infer
  • From a given HMM about the data?
  • From given data about the generating HMM?

7
Terms to understand
  • State
  • Transition / transition probability
  • Emission / emission probability
  • Path

(State diagram of the two-coin HMM:
   Start → state 1, t_S1 = 1
   State 1 (fair coin): e_1(H) = 0.5, e_1(T) = 0.5; t_11 = 0.5, t_12 = 0.5
   State 2 (biased coin): e_2(H) = 0.25, e_2(T) = 0.75; t_22 = 0.5, t_2E = 0.5 to End)
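These pieces map directly onto a small data structure. A minimal sketch of the two-coin HMM above (my representation, not the presenter's):

    # States 1 (fair coin) and 2 (biased coin), plus Start 'S' and End 'E'.
    states = [1, 2]
    trans = {('S', 1): 1.0,                 # t_S1
             (1, 1): 0.5, (1, 2): 0.5,      # t_11, t_12
             (2, 2): 0.5, (2, 'E'): 0.5}    # t_22, t_2E
    emit = {1: {'H': 0.5,  'T': 0.5},       # e_1
            2: {'H': 0.25, 'T': 0.75}}      # e_2
    # A path is a state sequence such as (1, 1, 2); its probability is the
    # product of the transition and emission probabilities along it.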
8
Questions of interest
  • For a given HMM
  • Probability of generating a particular output
    sequence? (Likelihood)
  • Most probable path? (Decoding)
  • Adjusting probabilities in the light of observed
    sequences? (Learning/training)

9
Likelihood
  • Baum-Welch score: the likelihood of a sequence s is
    the sum of the likelihoods of all paths generating s
  • -log L_O(M), the negative log-likelihood, is used in practice
  • Two kinds of deductions are required
  • Forward: given the observations up to time t,
    predict the state at time t+1
  • Backward: given the observations from time t+1 onward,
    deduce the state at time t

10
Calculating Likelihood
  • The forward recursive relation (see below)
  • Dynamic programming: store previously calculated
    values of a_t(i) (the Forward method)
  • Similarly, the backward recursive relation leads
    to the Backward method
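The relation itself did not survive the transcript. Taking a_t(i) to be the total probability of reaching state i at time t having emitted the first t-1 symbols o_1 … o_{t-1} (the convention that matches the worked example on the next slide), the standard recursion would read:

    a_{t+1}(j) = Σ_i a_t(i) · e_i(o_t) · t_ij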

11
Example
(State diagram: Start → state 1 with probability 1; state 1 emits
 P(H) = 0.2, P(T) = 0.8 and has a self-loop 0.4 and a transition 0.6
 to state 2; state 2 emits P(H) = 0.3, P(T) = 0.7 and goes to End
 with probability 1)
Consider the sequence s = HT. Paths generating
s: 1→1 and 1→2.
L_HT(M) = 1·0.2·0.4·0.8 + (1·0.2·0.6)·0.7 = 0.064 + 0.084 = 0.148
a_1(1) = 1
a_2(1) = 1·0.2·0.4 = 0.08
a_2(2) = 1·0.2·0.6 = 0.12
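As a check, here is a runnable sketch of the Forward method on this example (my code, not the presenter's):

    # Slide-11 HMM: Start -> 1; state 1 emits H/T with 0.2/0.8 and loops
    # with 0.4 or moves to state 2 with 0.6; state 2 emits H/T with 0.3/0.7.
    states = [1, 2]
    trans = {(1, 1): 0.4, (1, 2): 0.6}
    emit = {1: {'H': 0.2, 'T': 0.8}, 2: {'H': 0.3, 'T': 0.7}}

    def forward_likelihood(seq):
        a = {1: 1.0, 2: 0.0}           # a_1(i): Start puts us in state 1
        for symbol in seq[:-1]:        # a_{t+1}(j) = sum_i a_t(i) e_i(o_t) t_ij
            a = {j: sum(a[i] * emit[i][symbol] * trans.get((i, j), 0.0)
                        for i in states)
                 for j in states}
        # emit the final symbol from whichever state the path ends in
        return sum(a[i] * emit[i][seq[-1]] for i in states)

    print(forward_likelihood("HT"))    # 0.148 = 0.064 + 0.084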
12
Most Probable Path
  • Viterbi score of HMM M w.r.t. sequence O:
  • probability of the most probable path generating O
  • Recursive relation and dynamic programming:
    the Viterbi algorithm (sketched below)
  • Let p_i(t) = probability of the most probable path
    ending in state i at time t
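A minimal Viterbi sketch for the same slide-11 HMM (my code, under the same assumptions as the forward sketch):

    states = [1, 2]
    trans = {(1, 1): 0.4, (1, 2): 0.6}
    emit = {1: {'H': 0.2, 'T': 0.8}, 2: {'H': 0.3, 'T': 0.7}}

    def viterbi(seq):
        p = {1: 1.0, 2: 0.0}            # p_i(1): every path starts in state 1
        back = []                       # back-pointers, one dict per step
        for symbol in seq[:-1]:
            nxt, ptr = {}, {}
            for j in states:
                # best predecessor i maximizing p_i(t) * e_i(o_t) * t_ij
                best = max(states,
                           key=lambda i: p[i] * emit[i][symbol] * trans.get((i, j), 0.0))
                nxt[j] = p[best] * emit[best][symbol] * trans.get((best, j), 0.0)
                ptr[j] = best
            back.append(ptr)
            p = nxt
        last = max(states, key=lambda i: p[i] * emit[i][seq[-1]])
        path = [last]
        for ptr in reversed(back):      # walk the back-pointers to recover the path
            path.append(ptr[path[-1]])
        return path[::-1], p[last] * emit[last][seq[-1]]

    print(viterbi("HT"))                # ([1, 2], 0.084): path 1->2 beats 1->1 (0.064)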

13
Learning: Training HMMs
  • Assume we have seen part X of a complete
    sequence Y
  • So far, we have a model M that maximizes L_X(M),
    the likelihood of X given M
  • We want a new model M' that maximizes L_Y(M'),
    given X and M
  • This is the Baum-Welch (EM) algorithm

14
Expectation-Maximization
  • Expectation step: obtain a score for the
    goodness of a new model
  • Maximization step: set the new model to the one
    with maximum goodness

15
New Baum-Welch HMM parameters
  • New transition probabilities (reconstructed below)
  • New emission probabilities
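The update formulas were rendered as images and did not survive the transcript. The standard Baum-Welch re-estimates, written in this deck's notation, are ratios of expected counts under the current model (computed from the forward and backward variables):

    t'_ij   = E[number of i → j transitions] / E[number of transitions out of state i]
    e'_i(σ) = E[number of times state i emits symbol σ] / E[number of visits to state i]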

16
Gradient Descent
  • Gradient descent (Baldi-Chauvin)
  • Define the negative log-likelihood as an energy measure
  • Derive new model parameters so as to minimize this
    energy at every step
  • Iterative greedy algorithm
  • Advantages over Baum-Welch
  • Online updates
  • No absorbing zero probabilities, unlike Baum-Welch
    (see the note below)
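A gloss on the zero-probability point (mine; the deck does not show the parameterization): Baldi and Chauvin reparameterize the probabilities through normalized exponentials, e.g.

    t_ij = exp(w_ij) / Σ_k exp(w_ik),   updated by   w_ij ← w_ij - η · ∂E/∂w_ij   with E = -log L,

so every t_ij remains strictly positive, whereas a Baum-Welch count ratio that reaches 0 stays at 0 forever.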

17
A Real Sequence HMM
(Figure: profile HMM architecture with matching (main), insert and delete states)
18
Implementation issues
  • Parameter initialization: average, uniform,
    random, etc.
  • Priors initialized to favor transitions toward
    matching (main) states
  • Initialization from existing multiple alignments
  • Model length: average length of the input sequences
  • Adaptable architecture?

19
HMMs: Advantages
  • Solid statistical foundation
  • Efficient learning algorithms
  • Flexible and general model for sequence
    properties
  • Unsupervised learning from variable-length, raw
    sequences

20
HMMs: Disadvantages
  • Large number of unstructured parameters
    (emission and transition parameters)
  • Need large amounts of data
  • Subtle long-range correlations in real sequences
    go unaccounted for, due to the Markov property