Learning Structured Models for Phone Recognition

1
Learning Structured Models for Phone Recognition
  • Slav Petrov, Adam Pauls, Dan Klein

2
Acoustic Modeling
3
Motivation
  • Standard acoustic models impose many structural constraints
  • We propose an automatic approach
  • Use TIMIT Dataset
  • MFCC features
  • Full covariance Gaussians

(Young and Woodland, 1994)
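The full-covariance Gaussians mentioned above score a feature vector (for example one MFCC frame) with a log-density. A minimal sketch, assuming plain Python lists for vectors and matrices; the function names `cholesky` and `gaussian_logpdf` are illustrative and not taken from the talk:

```python
import math

def cholesky(a):
    """Lower-triangular L with L * L^T == a, for a symmetric positive-definite matrix."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(a[i][i] - s)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def gaussian_logpdf(x, mean, cov):
    """Log-density of a full-covariance Gaussian at feature vector x."""
    n = len(x)
    L = cholesky(cov)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Solve L z = diff by forward substitution; z . z is the Mahalanobis term.
    z = []
    for i in range(n):
        s = sum(L[i][k] * z[k] for k in range(i))
        z.append((diff[i] - s) / L[i][i])
    maha = sum(zi * zi for zi in z)
    # log det(cov) = 2 * sum(log of the Cholesky diagonal)
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + maha)
```

Unlike a diagonal covariance, the off-diagonal entries of `cov` let a single Gaussian capture correlations between feature dimensions.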
4
Phone Classification
5
Phone Classification
æ
6
HMMs for Phone Classification
7
HMMs for Phone Classification
Temporal Structure
8
Standard subphone/mixture HMM
Temporal Structure
Gaussian Mixtures
Model         Error rate (%)
HMM Baseline  25.1
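One common way to use per-phone HMMs for classification is to score a segment under every phone's HMM with the forward algorithm and pick the highest-scoring phone. The sketch below assumes this setup (the slides do not spell it out) and takes precomputed emission log-probabilities as input; `forward_loglik` and `classify` are hypothetical names:

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))

def forward_loglik(log_init, log_trans, log_emit):
    """Log-likelihood of a segment under one HMM.

    log_init[s]      : log P(first state = s)
    log_trans[p][s]  : log P(next state = s | current state = p)
    log_emit[t][s]   : log p(frame t | state s)
    """
    n = len(log_init)
    alpha = [log_init[s] + log_emit[0][s] for s in range(n)]
    for t in range(1, len(log_emit)):
        alpha = [
            logsumexp([alpha[p] + log_trans[p][s] for p in range(n)]) + log_emit[t][s]
            for s in range(n)
        ]
    return logsumexp(alpha)

def classify(models):
    """models: {phone: (log_init, log_trans, log_emit)}; return the best-scoring phone."""
    return max(models, key=lambda phone: forward_loglik(*models[phone]))
```

The emission terms would come from the Gaussians (or Gaussian mixtures) attached to each state; a forbidden transition can be encoded as `float("-inf")`.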
9
Our Model
Standard Model
Fully Connected
Single Gaussians
10
Hierarchical Baum-Welch Training
32.1
28.7
HMM Baseline 25.1
5 Split rounds 21.4
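A split round can be pictured as doubling the substates of a phone and then retraining with Baum-Welch. The sketch below shows only the doubling step under simplifying assumptions that are not stated on the slides: each child copies its parent's mean plus small random noise, and each parent transition is shared evenly between the two children of the target. Covariances and the retraining loop are omitted; `split_states` is a hypothetical name.

```python
import random

def split_states(means, trans, noise=0.01, rng=None):
    """Double the number of substates.

    means : list of mean vectors, one per substate
    trans : trans[i][j] = P(next = j | current = i), rows sum to 1
    Children 2*i and 2*i + 1 both descend from parent i.
    """
    rng = rng or random.Random(0)
    n = len(means)
    new_means = []
    for mean in means:
        for _ in range(2):
            # Perturb each copy slightly so EM can pull the two children apart.
            new_means.append([v + rng.gauss(0.0, noise) for v in mean])
    # Each child inherits its parent's row; the mass going to a parent target
    # is divided equally between that target's two children.
    new_trans = [
        [trans[i // 2][j // 2] / 2.0 for j in range(2 * n)]
        for i in range(2 * n)
    ]
    return new_means, new_trans
```

Because every row of `new_trans` still sums to one, the split model starts out nearly equivalent to its parent; the small perturbation of the means is what lets the following training pass specialize the children.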
11
Phone Classification Results
Method                                   Error rate (%)
GMM Baseline (Sha and Saul, 2006)        26.0
HMM Baseline (Gunawardana et al., 2005)  25.1
SVM (Clarkson and Moreno, 1999)          22.4
Hidden CRF (Gunawardana et al., 2005)    21.7
Our Work                                 21.4
Large Margin GMM (Sha and Saul, 2006)    21.1
12
Phone Recognition
13
Standard State-Tied Acoustic Models
14
No more State-Tying
15
No more Gaussian Mixtures
16
Fully connected internal structure
17
Fully connected external structure
18
Refinement of the /ih/-phone
19
Refinement of the /ih/-phone
20
Refinement of the /ih/-phone
21
Refinement of the /ih/-phone
22
Refinement of the /l/-phone
23
Hierarchical Refinement Results
HMM Baseline 41.7
5 Split Rounds 28.4
24
Merging
  • Not all phones are equally complex
  • Compute log likelihood loss from merging

Split model
Merged at one node
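The log-likelihood loss from merging can be illustrated on a deliberately simplified case. The sketch assumes one-dimensional Gaussians, ignores the HMM structure, and merges two substates by moment matching; it is not the exact criterion used in the talk, and `merge_loss` is a hypothetical name.

```python
import math

def log_normal(x, mean, var):
    """Log-density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def merge_loss(frames, w1, m1, v1, w2, m2, v2):
    """Log-likelihood lost on `frames` when two substates are merged into one.

    w1, w2 : relative weights of the two substates (w1 + w2 == 1)
    m*, v* : their means and variances
    """
    # Moment-matched Gaussian: same overall mean and variance as the pair.
    mean = w1 * m1 + w2 * m2
    var = w1 * (v1 + m1 ** 2) + w2 * (v2 + m2 ** 2) - mean ** 2
    split_ll = sum(
        math.log(w1 * math.exp(log_normal(x, m1, v1)) + w2 * math.exp(log_normal(x, m2, v2)))
        for x in frames
    )
    merged_ll = sum(log_normal(x, mean, var) for x in frames)
    return split_ll - merged_ll
```

A loss near zero suggests the split bought little and can be undone, which matches the observation that not all phones are equally complex; a large loss suggests the two substates model genuinely different acoustics.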
25
Merging Criterion
26
Split and Merge Results
Split Only 28.4
Split and Merge 27.3
27
HMM states per phone
28
HMM states per phone
29
HMM states per phone
30
Alignment
Results
Hand Aligned 27.3
Auto Aligned 26.3
31
Alignment State Distribution
32
Inference
  • State sequence: d1-d6-d6-d4-ae5-ae2-ae3-ae0-d2-d2-d3-d7-d5
  • Phone sequence: d - d - d - d - ae - ae - ae - ae - d - d - d - d - d
  • Transcription: d - ae - d

Viterbi
Variational
???
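The mapping from a decoded state sequence to a transcription shown on this slide can be sketched in a few lines. The sketch assumes substate names are a phone label followed by a numeric index, as in the example; the function names are illustrative.

```python
from itertools import groupby

def states_to_phones(states):
    """Drop the trailing substate index: 'ae5' -> 'ae'."""
    return [s.rstrip("0123456789") for s in states]

def phones_to_transcription(phones):
    """Collapse runs of the same phone into a single symbol."""
    return [phone for phone, _ in groupby(phones)]

states = "d1-d6-d6-d4-ae5-ae2-ae3-ae0-d2-d2-d3-d7-d5".split("-")
phones = states_to_phones(states)
print(phones_to_transcription(phones))  # ['d', 'ae', 'd']
```

Collapsing runs this way cannot tell two adjacent instances of the same phone apart, so a real decoder would also need to track phone boundaries.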
33
Variational Inference
Variational Approximation
Viterbi 26.3
Variational 25.1
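A likely reason to look beyond Viterbi is that many substate paths collapse to the same transcription, so the single best path need not belong to the transcription with the most total probability. The toy example below illustrates that gap by brute-force enumeration, which is only feasible at this size; it is not the variational approximation from the talk, and the model numbers are made up for illustration.

```python
from collections import defaultdict
from itertools import groupby, product

def collapse(path):
    """Map a substate path such as ('b0', 'b1') to its transcription ('b',)."""
    phones = (s.rstrip("0123456789") for s in path)
    return tuple(p for p, _ in groupby(phones))

def path_probability(path, init, trans, emit):
    """Joint probability of one substate path and the observed frames."""
    prob = init[path[0]] * emit[0][path[0]]
    for t in range(1, len(path)):
        prob *= trans[path[t - 1]][path[t]] * emit[t][path[t]]
    return prob

def viterbi_vs_summed(init, trans, emit):
    """Return (transcription of the best single path, transcription with the most total mass)."""
    best_path, best_prob = None, -1.0
    mass = defaultdict(float)
    for path in product(list(init), repeat=len(emit)):
        prob = path_probability(path, init, trans, emit)
        if prob > best_prob:  # same argmax Viterbi would find, by enumeration
            best_path, best_prob = path, prob
        mass[collapse(path)] += prob
    return collapse(best_path), max(mass, key=mass.get)

# Hypothetical toy model: phone 'a' has one substate, phone 'b' has two.
init = {"a0": 0.4, "b0": 0.3, "b1": 0.3}
trans = {
    "a0": {"a0": 1.0, "b0": 0.0, "b1": 0.0},
    "b0": {"a0": 0.0, "b0": 0.5, "b1": 0.5},
    "b1": {"a0": 0.0, "b0": 0.5, "b1": 0.5},
}
emit = [{"a0": 1.0, "b0": 1.0, "b1": 1.0}] * 2  # two uninformative frames
print(viterbi_vs_summed(init, trans, emit))  # (('a',), ('b',))
```

Here the best single path (a0, a0) has probability 0.4, while the four paths through the substates of 'b' each have 0.15 and together sum to 0.6.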
34
Phone Recognition Results
Method                                                    Error rate (%)
State-Tied Triphone HMM (HTK) (Young and Woodland, 1994)  27.7
Gender Dependent Triphone HMM (Lamel and Gauvain, 1993)   27.1
Our Work                                                  26.1
Bayesian Triphone HMM (Ming and Smith, 1998)              25.6
Heterogeneous classifiers (Halberstadt and Glass, 1998)   24.4
35
Conclusions
  • Minimalist, Automatic Approach
  • Unconstrained
  • Accurate
  • Phone Classification: competitive with state-of-the-art discriminative methods despite being generative
  • Phone Recognition: better than standard state-tied triphone models

36
Thank you!
  • http://nlp.cs.berkeley.edu