Language Technology: Present and future of automatic speech recognition

1
Language Technology: Present and future of
automatic speech recognition
  • Jean-Paul Haton
  • LORIA-INRIA
  • Université Henri Poincaré, Nancy 1
  • Institut Universitaire de France
  • jph_at_loria.fr

ILSP Oct. 2001 Athens
2
Outline
  • Evolution of ASR
  • Basic principles
  • A major problem: robustness
  • State of the art
  • Future trends and perspectives

3
Applications of ASR
  • Data entry: voice dictation, etc.
  • Access to information: travel, banking, portals
  • Commands: avionics, automobile, etc.
  • Speech transcription
  • Aids for people with disabilities
  • Speech-to-speech translation

4
Speech recognition technology
5
Principle of Automatic Speech Recognition
6
Bayes Decision Rule
  • Speech → Parametrization → observation sequence $X = x_1^T$
  • Acoustic models: $P(X \mid W)$
  • Language models: $P(W)$
  • Search: $\hat{W} = \arg\max_{W} P(X \mid W)\, P(W)$, with $W = w_1^N$
  • Output: word sequence
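The decision rule on this slide can be illustrated with a toy search over a handful of candidate word sequences. This is only a sketch: the candidate sentences and all probability values are invented for illustration, and a real recognizer searches a far larger space. Scores are added in the log domain, which is equivalent to multiplying P(X|W) by P(W).

```python
import math

def decode(candidates, acoustic_logprob, lm_logprob):
    """Return the candidate W that maximizes log P(X|W) + log P(W)."""
    return max(candidates, key=lambda w: acoustic_logprob[w] + lm_logprob[w])

# Hypothetical acoustic scores P(X|W) for one utterance X.
acoustic_logprob = {
    "recognize speech": math.log(0.020),
    "wreck a nice beach": math.log(0.025),
    "recognize peach": math.log(0.010),
}
# Hypothetical language model scores P(W).
lm_logprob = {
    "recognize speech": math.log(0.0100),
    "wreck a nice beach": math.log(0.0001),
    "recognize peach": math.log(0.0005),
}

best = decode(list(acoustic_logprob), acoustic_logprob, lm_logprob)
print(best)
```

In this made-up example the acoustically best candidate loses to a slightly worse acoustic match that the language model finds much more plausible.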
7
Speech modelled by HMMs
  • The speech production process is assumed to be
    Markovian
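
Under the Markov assumption, the likelihood P(X|W) of an observation sequence can be computed with the forward algorithm. The sketch below uses a discrete-output HMM for simplicity; the two-state left-to-right topology and all probabilities are hypothetical values chosen for illustration, not taken from the presentation.

```python
def forward_likelihood(init, trans, emit, observations):
    """P(observations | model) for a discrete HMM, via the forward algorithm."""
    n_states = len(init)
    # alpha[j] = P(o_1 .. o_t, state at time t = j)
    alpha = [init[j] * emit[j][observations[0]] for j in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)

# Hypothetical 2-state left-to-right model over 2 discrete symbols.
init = [1.0, 0.0]
trans = [[0.6, 0.4],
         [0.0, 1.0]]
emit = [[0.9, 0.1],
        [0.2, 0.8]]

likelihood = forward_likelihood(init, trans, emit, [0, 0, 1])
print(likelihood)
```

Practical systems work with continuous densities and log probabilities to avoid underflow on long utterances; the recursion itself is unchanged.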

8
Robustness issues in ASR
  • Mismatches between training and operating
    conditions can lead to a significant loss in
    performance

9
Main causes of acoustic variation in speech
10
Influence of noise on recognition
11
Approaches to robustness
  • Three non-exclusive approaches:
  • Signal acquisition and denoising
  • Robust parameterization (less sensitive to noise
    variations)
  • Model modification (adaptation)

12
Speech Parametrization
  • Use of knowledge from perception,
    psychoacoustics, neurology, etc.
  • ear models
  • noise masking
  • dynamic parameters
  • Analyses: PLP, RASTA, RASTA-PLP, etc.
  • Cepstral Normalization
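
Two items from this list, cepstral normalization and dynamic parameters, are simple enough to sketch. The version below assumes per-utterance cepstral mean subtraction and a two-frame difference for the dynamic (delta) parameters; the function names and the toy frames are illustrative, and real front ends typically use a longer regression window.

```python
def cepstral_mean_normalize(frames):
    """Subtract the per-coefficient mean computed over the whole utterance."""
    n = len(frames)
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]

def delta(frames):
    """Dynamic parameters as (next - previous) / 2, repeating the edge frames."""
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        prev_f = frames[max(t - 1, 0)]
        next_f = frames[min(t + 1, last)]
        out.append([(b - a) / 2.0 for a, b in zip(prev_f, next_f)])
    return out

# Hypothetical 2-coefficient cepstra, and the same frames with a constant
# offset added (a stationary channel is additive in the cepstral domain).
clean = [[1.0, 0.0], [2.0, 1.0], [3.0, -1.0]]
offset = [5.0, -2.0]
shifted = [[c + o for c, o in zip(f, offset)] for f in clean]

print(cepstral_mean_normalize(clean))
print(cepstral_mean_normalize(shifted))
print(delta(clean))
```

Both calls to `cepstral_mean_normalize` give the same frames, which is the point of the normalization: a constant channel offset is removed.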

13
Multiband recognition
14
Acoustic Model Adaptation
  • Objective: to derive a new set of models that better match
    the operating environment
  • Principle: to use a small amount of adaptation data
    reflecting the operating environment to modify
    the original models

15
Examples of adaptation methods
  • Combination of models: PMC (Parallel Model
    Combination)
  • Linear transformations of models. Example: MLLR
    (Maximum Likelihood Linear Regression)
  • MAP (Maximum a posteriori): the a priori
    probability of models is not uniform
  • Hybrid methods: MLLR/MAP
  • Model grouping into classes (eigenvoices, etc.)
  • Non-linear transforms (neural networks, etc.)
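
As an illustration of the MAP idea, the commonly used update for a single Gaussian mean interpolates between the prior (original) mean and the adaptation data. The sketch below assumes every adaptation frame is assigned to this one Gaussian; the prior weight `tau` and the data are hypothetical.

```python
def map_adapt_mean(prior_mean, adaptation_frames, tau=10.0):
    """MAP estimate of a Gaussian mean: (tau * prior + sum of frames) / (tau + n)."""
    n = len(adaptation_frames)
    dim = len(prior_mean)
    sums = [sum(f[d] for f in adaptation_frames) for d in range(dim)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]

# Hypothetical prior mean and ten identical adaptation frames.
prior_mean = [0.0, 0.0]
frames = [[1.0, 2.0]] * 10

adapted = map_adapt_mean(prior_mean, frames, tau=10.0)
print(adapted)
```

With no adaptation data the estimate stays at the prior mean, and as the amount of data grows it moves toward the sample mean, which is why MAP needs more data than transformation-based methods to affect all models.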

16
Parallel Model Combination (PMC) (Gales, 1995)
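The core idea of PMC, combining a clean-speech model with a noise model in a domain where the two signals add, can be sketched with the log-add approximation. This simplified version handles means only and works directly on log-spectral values; the mapping between cepstral and log-spectral domains and the treatment of variances are left out, and the function name, the `gain` parameter, and the numbers are illustrative assumptions.

```python
import math

def pmc_log_add(speech_log_mean, noise_log_mean, gain=1.0):
    """Combine log-spectral means: log(gain * exp(speech) + exp(noise)) per channel."""
    return [
        math.log(gain * math.exp(s) + math.exp(n))
        for s, n in zip(speech_log_mean, noise_log_mean)
    ]

# Hypothetical log-spectral means for one clean-speech state and a noise model.
speech_mean = [5.0, 1.0]
noise_mean = [-20.0, 1.0]

noisy_mean = pmc_log_add(speech_mean, noise_mean)
print(noisy_mean)
```

Where speech dominates the noise, the combined mean stays close to the clean mean; where the two have equal energy, it rises by log 2.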
17
MLLR example
(Figure: MLLR example, with vowel labels i, u, a)
18
MLLR speaker adaptation
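In MLLR, the Gaussian means of a group of models share one affine transform, so a small amount of speaker data can move many models at once. The sketch below only applies a given transform (mean' = A · mean + b); estimating A and b by maximum likelihood is omitted, and the matrix, bias, and means are hypothetical values.

```python
def apply_mllr(means, A, b):
    """Apply the shared affine transform A * mu + b to every Gaussian mean."""
    adapted = []
    for mu in means:
        adapted.append([
            sum(A[r][c] * mu[c] for c in range(len(mu))) + b[r]
            for r in range(len(b))
        ])
    return adapted

# Hypothetical speaker-independent means and a transform shared by all of them.
means = [[1.0, 2.0], [0.0, 1.0]]
A = [[1.0, 0.1],
     [0.0, 0.9]]
b = [0.5, -0.2]

speaker_means = apply_mllr(means, A, b)
print(speaker_means)
```

Because the transform is shared, models with no adaptation data of their own are still moved, which is the main practical difference from MAP adaptation.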
19
State-of-the-art algorithms in speech recognition
  • Speech input
  • Acoustic analysis
  • Phoneme inventory
  • Pronunciation lexicon
  • Global search
  • Recognized word sequence
20
Future research needs
  • Better modelling of speech variability
    (structural and non-structural)
  • Better knowledge of speech perception
    and adaptation by humans
  • Beyond recognition: speech understanding