LING124 Statistical approach to ASR: an overview

Provided by: hahn7

1
LING124 Statistical approach to ASR: an overview
  • October 21, 2008

2
Class outline
  • Noisy channel model
  • Bayesian inference
  • Likelihood: acoustic model
  • Prior: language model
  • Schematic architecture

3
Noisy channel model
4
Bayesian inference (1)
  • Take every possible sentence in the language
  • Run each sentence through the channel and see if
    it matches the original noisy output
  • Pick the sentence whose channel output best
    matches the original output
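The three steps above can be sketched as a brute-force search. This is only a toy illustration: the vowel-dropping `toy_channel`, the three candidate sentences, and the string-similarity score are all hypothetical stand-ins, and a real language has far too many sentences to enumerate.

```python
from difflib import SequenceMatcher


def toy_channel(sentence: str) -> str:
    """Hypothetical noisy channel: drops every vowel from the sentence."""
    return "".join(ch for ch in sentence if ch not in "aeiou")


def best_match(candidates, observed):
    """Return the candidate whose channel output is most similar to `observed`."""
    def similarity(sentence):
        # Ratio in [0, 1]; it is 1.0 only when the two strings are identical.
        return SequenceMatcher(None, toy_channel(sentence), observed).ratio()

    return max(candidates, key=similarity)


# Hypothetical candidate sentences and noisy output (not from the slides).
candidates = ["wreck a nice beach", "recognize speech", "recognise peach"]
observed = "rcgnz spch"

print(best_match(candidates, observed))
```

The similarity score plays the role that the likelihood plays in the probabilistic version of the same idea.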

5
Bayesian inference (2)
  • O = o1, o2, o3, ..., oT (the observation sequence)
  • W = w1, w2, w3, ..., wn (the word sequence)
  • L = set of all possible word sequences

6
Bayes rule
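The equation for this slide is not in the transcript. In terms of the observation sequence O and the word sequence W, the standard form of Bayes' rule would read:

```latex
P(W \mid O) = \frac{P(O \mid W)\, P(W)}{P(O)}
```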
7
Bayesian inference (3)
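The equation for this slide is not in the transcript. A sketch of the usual derivation, assuming the decoder picks the most probable word sequence from L given the observations:

```latex
\begin{aligned}
\hat{W} &= \operatorname*{argmax}_{W \in L} P(W \mid O) \\
        &= \operatorname*{argmax}_{W \in L} \frac{P(O \mid W)\, P(W)}{P(O)} \\
        &= \operatorname*{argmax}_{W \in L} P(O \mid W)\, P(W)
\end{aligned}
```

The denominator P(O) is the same for every candidate W, so it can be dropped without changing which W wins. What remains is the likelihood P(O|W) times the prior P(W).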
8
Likelihood: acoustic model (1)
  • P(O|W)
  • Probability of the observation sequence given the
    word sequence
  • Observation is typically continuous, rather than
    discrete, in statistical models of ASR
  • e.g. MFCC, delta, double-delta
  • The likelihood of a continuous random variable can
    be evaluated if we have a probability density
    function (pdf)
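A minimal sketch of what a pdf gives us for a continuous value. It assumes a univariate Gaussian as the pdf, and the function names are illustrative. The pdf returns a density at a point, while an actual probability comes from integrating the density over an interval.

```python
import math


def gaussian_pdf(x: float, mean: float, variance: float) -> float:
    """Density of a univariate Gaussian at the point x."""
    coeff = 1.0 / math.sqrt(2.0 * math.pi * variance)
    return coeff * math.exp(-((x - mean) ** 2) / (2.0 * variance))


def gaussian_cdf(x: float, mean: float, variance: float) -> float:
    """Probability that the variable is <= x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * variance)))


def interval_probability(lo: float, hi: float, mean: float, variance: float) -> float:
    """Probability that the variable falls in [lo, hi]."""
    return gaussian_cdf(hi, mean, variance) - gaussian_cdf(lo, mean, variance)


# Density at the mean of a standard Gaussian, and the mass within one
# standard deviation of the mean.
print(gaussian_pdf(0.0, mean=0.0, variance=1.0))
print(interval_probability(-1.0, 1.0, mean=0.0, variance=1.0))
```

In the acoustic model it is the density value that serves as the likelihood of an observed feature value.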

9
Likelihood: acoustic model (2)
  • Gaussian distribution is the most often used pdf
    (p307)
  • Most simply, one Gaussian distribution for each
    word (or some other linguistic unit)
  • Recall that a typical feature vector has 39
    components (MFCC, delta, double-delta)
  • We use a multivariate Gaussian distribution for
    each word (or some other linguistic unit)
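A minimal sketch of scoring one 39-component feature vector against one Gaussian per word. It assumes a diagonal covariance matrix (a simplification the slides do not state), and the two word models and the feature vector are made-up values.

```python
import math


def diag_gaussian_log_pdf(x, mean, var):
    """Log density of a multivariate Gaussian with diagonal covariance.

    With a diagonal covariance the density factors into one univariate
    Gaussian per component, so the log density is a sum over components.
    """
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def most_likely_word(x, models):
    """Return the word whose Gaussian gives x the highest log density."""
    return max(models, key=lambda word: diag_gaussian_log_pdf(x, *models[word]))


DIM = 39  # MFCC + delta + double-delta components

# Hypothetical per-word models: (mean vector, variance vector).
models = {
    "yes": ([0.5] * DIM, [1.0] * DIM),
    "no": ([-0.5] * DIM, [1.0] * DIM),
}
frame = [0.4] * DIM  # hypothetical feature vector

print(most_likely_word(frame, models))
```

Working in log space avoids numerical underflow when 39 small densities are multiplied together.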

10
Prior: language model
  • P(W)
  • Probability of observing a word sequence W in the
    given language
  • P(w1, w2, ..., wn)
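  • By chain rule, P(w1, w2, ..., wn) =
    P(w1) P(w2|w1) P(w3|w1, w2) ... P(wn|w1, ..., wn-1)
  • Most often approximated using N-grams, which
    condition each word on only the previous N-1 words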
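A minimal sketch of an N-gram prior for N = 2 (a bigram model), estimated by relative frequency. The three-sentence corpus and the `<s>` / `</s>` boundary markers are assumptions made for the example. No smoothing is applied, so any unseen bigram gives the whole sentence probability zero.

```python
from collections import Counter


def train_bigram(corpus):
    """Count bigrams and their left contexts over a list of sentences."""
    context_counts = Counter()
    bigram_counts = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts


def sentence_probability(sentence, context_counts, bigram_counts):
    """P(W) as a product of P(w_i | w_{i-1}) relative-frequency estimates."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, cur in zip(tokens, tokens[1:]):
        if context_counts[prev] == 0:
            return 0.0  # the context word never occurred in training
        prob *= bigram_counts[(prev, cur)] / context_counts[prev]
    return prob


# Hypothetical toy corpus (not from the slides).
corpus = ["i want food", "i want tea", "you want food"]
context_counts, bigram_counts = train_bigram(corpus)

print(sentence_probability("i want tea", context_counts, bigram_counts))
```

Here P(i want tea) = P(i|<s>) P(want|i) P(tea|want) P(</s>|tea) = 2/3 × 1 × 1/3 × 1.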

11
Schematic architecture