1
Class 7: Hidden Markov Models
2
Sequence Models
  • So far we examined several probabilistic
    sequence models
  • These models, however, assumed that positions are
    independent
  • This means that the order of elements in the
    sequence did not play a role
  • In this class we learn about probabilistic models
    of sequences in which order does matter

3
Probability of Sequences
  • Fix an alphabet Σ
  • Let X1,...,Xn be a sequence of random variables
    over Σ
  • We want to model P(X1,...,Xn)

4
Markov Chains
  • Assumption
  • Xi+1 is independent of the past once we know Xi
  • This allows us to write
    P(X1,...,Xn) = P(X1) · P(X2 | X1) · ... · P(Xn | Xn-1)

5
Markov Chains (cont)
  • Assumption
  • P(Xi+1 | Xi) is the same for all i
  • Notation: P(Xi+1 = b | Xi = a) = Aab
  • By specifying the matrix A and initial
    probabilities, we define P(X1,...,Xn)
  • To avoid the special case of P(X1), we can use a
    special start state s, and denote P(X1 = a) = Asa
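
A minimal sketch of this computation, assuming the notation above; the function name, the 0-based symbol indexing, and the encoding of the start probabilities as a separate vector are illustrative choices, not part of the slides:

    # Probability of a sequence under a first-order Markov chain.
    # A[a][b] = P(X_{i+1} = b | X_i = a); start[a] plays the role of A_sa.
    def markov_chain_prob(x, A, start):
        p = start[x[0]]
        for i in range(len(x) - 1):
            p *= A[x[i]][x[i + 1]]
        return p

    # Example with a two-symbol alphabet {0, 1}:
    A = [[0.7, 0.3],
         [0.4, 0.6]]
    start = [0.5, 0.5]
    print(markov_chain_prob([0, 0, 1, 1], A, start))  # 0.5 * 0.7 * 0.3 * 0.6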

6
Example: CpG islands
  • In the human genome, CpG dinucleotides are
    relatively rare
  • CpG pairs undergo a process called methylation
    that modifies the C nucleotide
  • A methylated C can (with relatively high chance)
    mutate to a T
  • Promoter regions are CpG-rich
  • These regions are not methylated, and thus mutate
    less often
  • These are called CpG islands

7
CpG Islands
  • We construct Markov chains for CpG-rich and
    CpG-poor regions
  • Using maximum likelihood estimates from 60K
    nucleotides, we get two models

8
Ratio Test for CpG islands
  • Given a sequence X1,...,Xn we compute the
    likelihood ratio
    P(X1,...,Xn | + model) / P(X1,...,Xn | - model)
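
A sketch of the test in code, assuming the two transition matrices A_plus and A_minus were estimated as on the previous slide and that the sequence is encoded as integer indices; the log form of the score and all names are illustrative:

    from math import log

    # Log-likelihood ratio of a sequence x under the CpG-rich (+) and
    # CpG-poor (-) Markov chain models.
    def log_ratio(x, A_plus, A_minus):
        score = 0.0
        for i in range(len(x) - 1):
            a, b = x[i], x[i + 1]
            score += log(A_plus[a][b]) - log(A_minus[a][b])
        return score  # > 0 suggests a CpG island, < 0 suggests background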

9
Empirical Evaluation
10
Finding CpG islands
  • Simple-minded approach
  • Pick a window of size N (N = 100, for example)
  • Compute the log-ratio for the sequence in the
    window, and classify based on that
  • Problems
  • How do we select N?
  • What do we do when the window intersects the
    boundary of a CpG island?

11
Alternative Approach
  • Build a model that includes "+" states and "-"
    states
  • A state remembers the last nucleotide and the type
    of region
  • A transition from a "-" state to a "+" state
    describes the start of a CpG island

12
Hidden Markov Models
  • Two components
  • A Markov chain of hidden states H1,...,Hn with L
    values
  • P(Hi+1 = k | Hi = l) = Alk
  • Observations X1,...,Xn
  • Assumption: Xi depends only on the hidden state Hi
  • P(Xi = a | Hi = k) = Bka
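
To make the two components concrete, here is a small sketch of the joint probability P(x1,...,xn, h1,...,hn); the explicit initial distribution start is an assumption on my part (the slides instead use a special start state):

    # Joint probability of observations x and hidden states h under an HMM.
    # A[l][k] = P(H_{i+1} = k | H_i = l), B[k][a] = P(X_i = a | H_i = k),
    # start[k] = P(H_1 = k).
    def hmm_joint_prob(x, h, A, B, start):
        p = start[h[0]] * B[h[0]][x[0]]
        for i in range(1, len(x)):
            p *= A[h[i - 1]][h[i]] * B[h[i]][x[i]]
        return p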

13
Semantics
14
Example: Dishonest Casino
15
Computing Most Probable Sequence
  • Given x1,...,xn
  • Output: h1,...,hn maximizing P(h1,...,hn | x1,...,xn)

16
  • Idea
  • If we know the value of hi, then the most
    probable sequence on i+1,...,n does not depend on
    observations before time i
  • Let Vi(l) be the probability of the best sequence
    h1,...,hi such that hi = l

17
Dynamic Programming Rule
  • so
    Vi+1(l) = Bl,xi+1 · maxk [ Vi(k) · Akl ]

18
Viterbi Algorithm
  • Set V0(0) = 1, V0(l) = 0 for l > 0
  • for i = 1,...,n
  • for l = 1,...,L
  • set Vi(l) = Bl,xi · maxk [ Vi-1(k) · Akl ]
    and record the maximizing k as Ptri(l)
  • Let hn = argmaxl Vn(l)
  • for i = n-1,...,1
  • set hi = Ptri+1(hi+1)
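
A sketch of the algorithm in Python; working in log-space and passing an explicit initial distribution start are my additions (the slide uses a start state with V0(0) = 1), and all probabilities are assumed to be nonzero:

    from math import log

    # Viterbi: most probable hidden sequence h_1..h_n for observations x_1..x_n.
    # A[l][k] = P(H_{i+1} = k | H_i = l), B[k][a] = P(X_i = a | H_i = k).
    def viterbi(x, A, B, start):
        L = len(start)
        V = [[log(start[l]) + log(B[l][x[0]]) for l in range(L)]]
        ptr = []
        for i in range(1, len(x)):
            row, back = [], []
            for l in range(L):
                k_best = max(range(L), key=lambda k: V[-1][k] + log(A[k][l]))
                row.append(V[-1][k_best] + log(A[k_best][l]) + log(B[l][x[i]]))
                back.append(k_best)
            V.append(row)
            ptr.append(back)
        # Traceback: h_n = argmax_l V_n(l), then h_i = Ptr_{i+1}(h_{i+1})
        h = [max(range(L), key=lambda l: V[-1][l])]
        for back in reversed(ptr):
            h.append(back[h[-1]])
        return list(reversed(h))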

19
Viterbi Algorithm Example
20
Computing Probabilities
  • Given x1,...,xn
  • Output: P(x1,...,xn)
  • How do we sum over an exponential number of hidden
    sequences?

21
Forward Algorithm
  • Perform dynamic programming on prefixes of the
    sequence
  • Let fi(l) = P(x1,...,xi, Hi = l)
  • Recursion rule
    fi+1(l) = Bl,xi+1 · Σk fi(k) · Akl
  • Conclusion
    P(x1,...,xn) = Σl fn(l)
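
A sketch of the forward pass, again with an assumed explicit initial distribution; a practical implementation would rescale or work in log-space to avoid underflow on long sequences:

    # Forward algorithm: f[i][l] = P(x_1..x_{i+1}, H_{i+1} = l) (0-based rows).
    def forward(x, A, B, start):
        L = len(start)
        f = [[start[l] * B[l][x[0]] for l in range(L)]]
        for i in range(1, len(x)):
            f.append([B[l][x[i]] * sum(f[-1][k] * A[k][l] for k in range(L))
                      for l in range(L)])
        return f

    # Conclusion: P(x_1,...,x_n) = sum_l f_n(l)
    # prob_x = sum(forward(x, A, B, start)[-1])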

22
Computing Posteriors
  • How do we compute P(Hi | x1,...,xn) ?

23
Backward Algorithm
  • Perform dynamic programming on suffixes of the
    sequence
  • Let bi(l) = P(xi+1,...,xn | Hi = l)
  • Recursion rule
    bi(l) = Σk Alk · Bk,xi+1 · bi+1(k)
  • Conclusion
    P(x1,...,xn) = Σl A0l · Bl,x1 · b1(l)
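
A matching sketch of the backward pass under the same assumptions:

    # Backward algorithm: b[i][l] = P(x[i+1:] | hidden state at position i is l),
    # using 0-based positions i = 0..n-1.
    def backward(x, A, B):
        n, L = len(x), len(A)
        b = [[1.0] * L]                                   # last row: b_n(l) = 1
        for i in range(n - 2, -1, -1):
            nxt = b[0]                                    # row for position i + 1
            b.insert(0, [sum(A[l][k] * B[k][x[i + 1]] * nxt[k] for k in range(L))
                         for l in range(L)])
        return b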

24
Computing Posteriors
  • How do we compute P(Hi | x1,...,xn) ?
  • Combine the forward and backward messages
    P(Hi = l | x1,...,xn) = fi(l) · bi(l) / P(x1,...,xn)
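
Combining the two tables gives the posteriors directly; this sketch reuses the forward and backward functions sketched above:

    # Posterior state probabilities: gamma[i][l] = P(H_i = l | x_1,...,x_n).
    def posteriors(x, A, B, start):
        f = forward(x, A, B, start)
        b = backward(x, A, B)
        px = sum(f[-1])                       # P(x_1,...,x_n)
        L = len(start)
        return [[f[i][l] * b[i][l] / px for l in range(L)]
                for i in range(len(x))]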

25
Dishonest Casino (again)
  • Computing the posterior probability of the "fair"
    state at each point in a long sequence

26
Learning
  • Given a sequence x1,...,xn and hidden states
    h1,...,hn
  • How do we learn Akl and Bka ?
  • We want to find parameters that maximize the
    likelihood P(x1,...,xn, h1,...,hn)
  • We simply count
  • Nkl - number of times hi = k and hi+1 = l
  • Nka - number of times hi = k and xi = a
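
A sketch of this fully observed case; the normalization assumes every state actually occurs in h (otherwise its row stays zero), and all names are illustrative:

    from collections import Counter

    # Maximum likelihood estimates from an observed pair (x, h):
    # A_kl = N_kl / sum_l' N_kl',  B_ka = N_ka / sum_a' N_ka'.
    def ml_estimate(x, h, L, S):              # S = alphabet size
        N_trans = Counter(zip(h, h[1:]))      # N_kl: times h_i = k and h_{i+1} = l
        N_emit = Counter(zip(h, x))           # N_ka: times h_i = k and x_i = a
        A = [[N_trans[(k, l)] / max(1, sum(N_trans[(k, m)] for m in range(L)))
              for l in range(L)] for k in range(L)]
        B = [[N_emit[(k, a)] / max(1, sum(N_emit[(k, c)] for c in range(S)))
              for a in range(S)] for k in range(L)]
        return A, B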

27
Learning
  • Given only the sequence x1,...,xn
  • How do we learn Akl and Bka ?
  • We want to find parameters that maximize the
    likelihood P(x1,...,xn)
  • Problem
  • Counts are inaccessible since we do not observe hi

28
  • If we have Akl and Bka we can compute the
    posteriors P(Hi = l | x1,...,xn) using the forward
    and backward messages

29
Expected Counts
  • We can compute the expected number of times hi = k
    and hi+1 = l
    E[Nkl] = Σi P(Hi = k, Hi+1 = l | x1,...,xn)
  • Similarly
    E[Nka] = Σi: xi = a P(Hi = k | x1,...,xn)
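
A sketch of the E-step computation, reusing the forward and backward sketches above; it relies on the standard pairwise posterior P(Hi = k, Hi+1 = l | x) = fi(k) · Akl · Bl,xi+1 · bi+1(l) / P(x):

    # Expected counts E[N_kl] and E[N_ka] given current parameters (E-step).
    def expected_counts(x, A, B, start, L, S):
        f = forward(x, A, B, start)
        b = backward(x, A, B)
        px = sum(f[-1])                                   # P(x_1,...,x_n)
        EN_trans = [[0.0] * L for _ in range(L)]
        EN_emit = [[0.0] * S for _ in range(L)]
        for i in range(len(x)):
            for k in range(L):
                # P(H_i = k | x) adds to the count of emitting symbol x_i from k
                EN_emit[k][x[i]] += f[i][k] * b[i][k] / px
                if i + 1 < len(x):
                    for l in range(L):
                        # P(H_i = k, H_{i+1} = l | x)
                        EN_trans[k][l] += (f[i][k] * A[k][l]
                                           * B[l][x[i + 1]] * b[i + 1][l] / px)
        return EN_trans, EN_emit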

30
Expectation Maximization (EM)
  • Choose initial values for Akl and Bka
  • E-step
  • Compute expected counts E[Nkl], E[Nka]
  • M-step
  • Re-estimate Akl and Bka from the expected counts
  • Reiterate
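
Putting the pieces together, one EM iteration looks roughly like the following sketch; it reuses expected_counts above, keeps the initial distribution fixed, and assumes every state gets a nonzero expected count:

    # One EM iteration: E-step (expected counts), then M-step (re-normalize).
    def em_step(x, A, B, start, L, S):
        EN_trans, EN_emit = expected_counts(x, A, B, start, L, S)
        A_new = [[EN_trans[k][l] / sum(EN_trans[k]) for l in range(L)]
                 for k in range(L)]
        B_new = [[EN_emit[k][a] / sum(EN_emit[k]) for a in range(S)]
                 for k in range(L)]
        return A_new, B_new

    # Reiterate until P(x_1,...,x_n) = sum(forward(x, A, B, start)[-1])
    # stops improving.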

31
EM - basic properties
  • P(x1,...,xn | A'kl, B'ka) ≥ P(x1,...,xn | Akl, Bka),
    where A'kl, B'ka are the re-estimated parameters
  • Likelihood grows in each iteration
  • If P(x1,...,xn | A'kl, B'ka) = P(x1,...,xn | Akl,
    Bka), then Akl, Bka is a stationary point of the
    likelihood
  • either a local maximum, minimum, or saddle point

32
Complexity of E-step
  • Compute forward and backward messages
  • Time complexity O(nL²), space complexity O(nL)
  • Accumulate expected counts
  • Time complexity O(nL²)
  • Space complexity O(L²)

33
EM - problems
  • Local Maxima
  • Learning can get stuck in local maxima
  • Sensitive to initialization
  • Requires some method for escaping such maxima
  • Choosing L
  • We often do not know how many hidden values we
    should have or can learn