Transcript and Presenter's Notes

Title: MARKOV CHAINS


1
MARKOV CHAINS
  • For the sequence X = (x1, x2, …, xL), the
    probability of the sequence is
  • Using the memoryless property of Markov chains,
    we get
  • where p(x1) is the probability of starting in a
    particular state.
  • Add begin and end states with the corresponding
    symbols x0 and xL+1. Define p(s) as the initial
    probability of symbol s.
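
The equation images from this slide are not reproduced in the
transcript; a standard reconstruction in the slide's notation is:

    P(X) = P(x_L \mid x_{L-1}, \ldots, x_1)\, P(x_{L-1} \mid x_{L-2}, \ldots, x_1) \cdots P(x_1)

and, using the memoryless property,

    P(X) = p(x_1) \prod_{i=2}^{L} p(x_i \mid x_{i-1})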

2
MARKOV CHAINS
  • The probability of the sequence becomes,
  • Arrows represent transition probabilities.
  • Each state emits the corresponding symbol, i.e.,
    there is a one-to-one correspondence between
    symbols and states.
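
The equation on this slide is also missing from the transcript;
with the begin and end states added, a standard form is:

    P(X) = \prod_{i=1}^{L+1} p(x_i \mid x_{i-1})

where x_0 and x_{L+1} are the begin and end symbols.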

A Markov Chain for modeling a DNA sequence
3
HIDDEN MARKOV MODEL
  • An HMM is a triplet M = (Σ, Q, Θ) where,
  • Σ is an alphabet of symbols.
  • Q is a set of states capable of emitting symbols
    from the alphabet Σ.
  • Θ is a set of probabilities comprising,
  • State transition probabilities, akl, for each k, l
    ∈ Q.
  • Emission probabilities, ek(b), for each k ∈ Q and
    b ∈ Σ.
  • A path π = (π1, …, πL) is a sequence of states
    with the corresponding symbol sequence X = (x1,
    …, xL).
  • The path itself follows a Markov chain (i.e., it
    is memoryless).

4
HIDDEN MARKOV MODEL
  • State transition probabilities
  • Emission probabilities
  • The probability that the sequence X was
    generated by the model M given the path π is
  • where π0 is the begin state and πL+1 is the end
    state.
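
The formulas on this slide are not reproduced here; a standard
reconstruction in this notation (with akl the transition and
ek(b) the emission probabilities) is:

    P(X \mid \pi) = a_{\pi_0 \pi_1} \prod_{i=1}^{L} e_{\pi_i}(x_i)\, a_{\pi_i \pi_{i+1}}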

5
EXAMPLE: HMM FOR MODELING A DISHONEST CASINO
  • A casino dealer uses a fair die most of the time,
    but occasionally switches to a loaded die.
    Assume,
  • With the loaded die, the probability of a six =
    0.5; all other numbers have probability 0.1.
  • Probability of switching from fair to loaded die
    = 0.05 at each roll.
  • Probability of switching from loaded to fair die
    = 0.1 at each roll.
  • Switching between dice is a Markov process.
  • In each state of the Markov process, the outcomes
    have different probabilities.
  • The whole process is an HMM.

6
EXAMPLE: DISHONEST CASINO
  • There are two possible states, Fair and Loaded:
    Q = {F, L}.
  • There are six possible outcomes: Σ = {1, 2, 3, 4,
    5, 6}.
  • The transition probabilities are shown by arrows.
  • The emission probabilities are shown inside each
    state box.
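
As an illustration (not part of the original slides), the casino
HMM can be written down directly in Python using the
probabilities stated above; the uniform fair-die emissions and
the 50/50 initial distribution are assumptions, since the slides
do not state them explicitly.

    # States, alphabet, and parameters of the dishonest-casino HMM
    states = ['F', 'L']                  # F = fair die, L = loaded die
    symbols = [1, 2, 3, 4, 5, 6]

    # Transition probabilities a_kl (switch F->L with 0.05, L->F with 0.1)
    trans = {
        'F': {'F': 0.95, 'L': 0.05},
        'L': {'F': 0.10, 'L': 0.90},
    }

    # Emission probabilities e_k(b)
    emit = {
        'F': {b: 1 / 6 for b in symbols},                       # fair die (assumed uniform)
        'L': {1: 0.1, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.1, 6: 0.5},  # loaded die favours six
    }

    # Initial state distribution (an assumption; not specified on the slides)
    start = {'F': 0.5, 'L': 0.5}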

7
DECODING PROBLEM: MOST PROBABLE STATE PATH
  • Given the HMM M = (Σ, Q, Θ) and a sequence of
    symbols X from the alphabet Σ, for which the
    generating path π = (π1, …, πL) is unknown,
  • In general, there could be many state sequences π
    that could give rise to the particular sequence
    of symbols X.
  • Find the most probable generating path π* for X,
    i.e., a path such that P(X, π) is maximized.

8
MOST PROBABLE STATE PATH
  • The solution π* will reveal the hidden states
    that generated the sequence X.
  • Dishonest casino case:
  • All parts of π* that pass through state L are
    suspected rolls of the loaded die.
  • A solution for the most probable path is given by
    the Viterbi algorithm.

9
VITERBI ALGORITHM
  • Let X be a sequence of length L. For k ∈ Q and 0
    ≤ i ≤ L, consider a path π ending at state k. Let
    vk(i) be the probability of the most probable
    path for the prefix (x1, …, xi) that ends in
    state k.
  • Initialize
  • Recursive relation: For each i = 0, …, L-1 and
    for each l ∈ Q
  • The value of P(X, π*) is given by
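
A minimal Viterbi sketch in Python, following the recursion
described above; it reuses the `states`, `trans`, `emit`, and
`start` dictionaries sketched after the casino example, and the
explicit end state is omitted for brevity.

    def viterbi(obs, states, start, trans, emit):
        """Most probable state path for the observation sequence obs."""
        # v_k(1) = start_k * e_k(x_1)
        v = [{k: start[k] * emit[k][obs[0]] for k in states}]
        ptr = [{}]                               # back-pointers
        for i in range(1, len(obs)):
            v.append({})
            ptr.append({})
            for l in states:
                # v_l(i+1) = e_l(x_{i+1}) * max_k ( v_k(i) * a_kl )
                best_k = max(states, key=lambda k: v[i - 1][k] * trans[k][l])
                v[i][l] = emit[l][obs[i]] * v[i - 1][best_k] * trans[best_k][l]
                ptr[i][l] = best_k
        # Backtrack from the state with the largest v_k(L)
        last = max(states, key=lambda k: v[-1][k])
        path = [last]
        for i in range(len(obs) - 1, 0, -1):
            path.append(ptr[i][path[-1]])
        return list(reversed(path)), v[-1][last]

For long sequences the products underflow, so in practice the
same recursion is run with log probabilities.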

10
VITERBI ALGORITHM
  • By keeping pointers backwards, the optimal
    state sequence can be found by backtracking.
  • Start backtracking from the state k ∈ Q for which
    vk(L) is maximal.
  • Predicted states by the Viterbi algorithm on the
    casino example for 300 rolls of a die:

Rolls 315116246446644245311321631164152133625144
54363165662656666665116645313265124563666463163666
31623264 Die FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFLLLLLLLLLLLLLLLLLLLLLFFFFFFFFFFFFLLLLL
LLLLLLLLLLLFFFLLL Viterbi FFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFLLLLLLLLLLLLLLLLLLFFFFFFFF
FFFFLLLLLLLLLLLLLLLLLLLLLL Rolls
55236266666625151631222555441666566563564324364131
51346514635341112641462625335636616366646623253441
Die LLLLLLLLLLLFFFFFFFFFFFFFFFFFLLLLLLLLLLLLL
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFLLLLLLLLLLFFF
FFFFFFFFF Viterbi LLLLLLLLLLLLFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFPFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFLLL
LLLLLLLLLLFFFFFFFF Rolls 3661661163252562462255
26525226643535333623312162536441443233516324363366
5562466662632666612355245242 Die
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFLLLLLLLLLLLLLLLLLLLLLLFFFFFFFFFFF
Viterbi FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFLLLLLLLLLLLLLLLLLLLFF
FFFFFFFFF
11
THE FORWARD ALGORITHM
  • We want to be able to calculate the probability
    of a given sequence (as we did for a Markov
    chain), but under HMM conditions.
  • However, many different state paths can give rise
    to the same sequence X in the case of an HMM.
  • So we add the probabilities of all possible paths
    to get the final probability.

12
THE FORWARD ALGORITHM
  • Given the sequence X = (x1, …, xL), denote by
    fk(i) the probability of emitting the prefix (x1,
    …, xi) and eventually reaching the state πi = k.
  • fk(i) is the probability of the given sequence
    up to xi, requiring that πi = k.
  • Initial values
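
The definition and initial values are not reproduced in the
transcript; a standard reconstruction (0 denoting the begin
state) is:

    f_k(i) = P(x_1, \ldots, x_i,\ \pi_i = k)

    f_0(0) = 1, \qquad f_k(0) = 0 \ \text{for } k \neq 0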

13
THE FORWARD ALGORITHM
  • Recursive relation
  • Terminal value
  • Unlike the Viterbi algorithm, we have sums of
    probabilities.
  • So, logarithms cannot easily be used to avoid
    underflow errors.
  • Use exponential functions (see notes) or a
    scaling method.
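
A minimal sketch of the forward algorithm with per-position
scaling (one common way to avoid underflow); it reuses the
hypothetical `states`, `trans`, `emit`, and `start` structures
from the casino example and omits an explicit end state.

    import math

    def forward(obs, states, start, trans, emit):
        """Log probability of obs under the HMM, via the scaled forward algorithm."""
        # f_k(1) = start_k * e_k(x_1), then normalize and remember the scale
        f = {k: start[k] * emit[k][obs[0]] for k in states}
        scale = sum(f.values())
        f = {k: p / scale for k, p in f.items()}
        log_prob = math.log(scale)
        for x in obs[1:]:
            # f_l(i+1) = e_l(x_{i+1}) * sum_k f_k(i) * a_kl
            f_new = {l: emit[l][x] * sum(f[k] * trans[k][l] for k in states)
                     for l in states}
            scale = sum(f_new.values())
            f = {k: p / scale for k, p in f_new.items()}
            log_prob += math.log(scale)
        return log_prob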

14
THE BACKWARD ALGORITHM
  • Complementary to the forward algorithm.
  • Denote by bk(i) the probability of emitting the
    suffix (xi+1, …, xL) given πi = k.
  • Initial Values
  • Recursive relation
  • Terminal value
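
These formulas are standard; reconstructed in the slide's
notation, assuming ak0 denotes the transition into the end state:

    b_k(i) = P(x_{i+1}, \ldots, x_L \mid \pi_i = k)

    \text{Initialization:}\quad b_k(L) = a_{k0} \ \text{for all } k

    \text{Recursion:}\quad b_k(i) = \sum_{l} a_{kl}\, e_l(x_{i+1})\, b_l(i+1)

    \text{Termination:}\quad P(X) = \sum_{l} a_{0l}\, e_l(x_1)\, b_l(1)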

15
THE POSTERIOR DECODING PROBLEM
  • The Viterbi algorithm finds the most probable
    path through the model given a sequence of
    symbols.
  • However, in general we want to find the
    probability that the observation xi came from
    state k, given the observed sequence.
  • This is called the posterior probability of state
    k at step i when the emitted sequence is known.
  • Posterior probability is particularly useful when
    many different paths compete for the most
    probable path with almost the same probability.
  • With posterior probability, we can ask questions
    like: does the Nth measurement in the sequence
    come from an enemy aircraft or not?

16
POSTERIOR PROBABILITY
  • Posterior probability is obtained by using
    forward and backward probabilities.
  • By the definition of conditional probability,
    (P(A|B) = P(A,B) / P(B)),
  • where P(X) is the result of either the forward or
    the backward calculation.
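
Written out, the posterior probability referred to above is (a
standard reconstruction):

    P(\pi_i = k \mid X) = \frac{f_k(i)\, b_k(i)}{P(X)}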

17
POSTERIOR PROBABILITY
  • The posterior probability of the die being fair
    in the casino example can be calculated for each
    roll of a die.

X-axis: roll number. Y-axis: P(die is fair).
The shaded areas show the rolls generated by the
loaded die.
18
PARAMETER ESTIMATION FOR HMM
  • All examples considered so far assume that the
    transition and emission probabilities (Θ in the
    HMM model) are known beforehand.
  • In practice, we do not know these HMM model
    parameters to begin with.
  • If we have a set of sample sequences X1, …, Xn of
    lengths L1, …, Ln (called training sequences),
    then we can construct the HMM that will best
    characterize the training sequences.
  • Our goal is to find Θ such that the logarithmic
    scores of the training sequences are maximized.

19
ESTIMATION WHEN STATE SEQUENCE KNOWN
  • Assume that the state sequences π1, …, πn are
    known.
  • First scan the sequences and compute
  • Akl = no. of transitions from state k to l, and
  • Ek(b) = no. of times symbol b was emitted in
    state k.
  • Then the maximum likelihood estimates are
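
The maximum-likelihood estimates referred to above are, in
standard form (reconstruction):

    a_{kl} = \frac{A_{kl}}{\sum_{l'} A_{kl'}}, \qquad
    e_k(b) = \frac{E_k(b)}{\sum_{b'} E_k(b')}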

20
ESTIMATION WHEN STATE SEQUENCE UNKNOWN
  • Called the Baum-Welch training algorithm, an
    iterative technique.
  • Initialize by assigning arbitrary values to Θ.
  • Compute the expected no. of state transitions
    from k to l using,
  • then the expectations are,
  • where fkj(i) and bkj(i) are the forward and
    backward probabilities of the sequence Xj.
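
The expected transition counts referred to above can be written
as (a standard reconstruction, with f_k^j and b_l^j the forward
and backward values for sequence X^j):

    A_{kl} = \sum_{j} \frac{1}{P(X^j)} \sum_{i} f_k^j(i)\, a_{kl}\, e_l(x_{i+1}^j)\, b_l^j(i+1)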

21
ESTIMATION WHEN STATE SEQUENCE UNKNOWN
  • Compute the expected no. of emissions of symbol
    b in state k using,
  • Maximization: re-compute the new values for Θ
    from Akl and Ek(b), as in the case of a known
    state sequence.
  • Repeat steps 2 and 3 until the improvement of the
    log-likelihood score is less than a given
    parameter ε.
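
Similarly, the expected emission counts can be written as
(reconstruction):

    E_k(b) = \sum_{j} \frac{1}{P(X^j)} \sum_{\{i \,:\, x_i^j = b\}} f_k^j(i)\, b_k^j(i)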