HIWIRE MEETING Athens, November 3-4, 2005

1
HIWIRE MEETING Athens, November 3-4, 2005
  • José C. Segura, Ángel de la Torre

2
Schedule
  • HIWIRE database evaluations
  • Non-linear feature normalization
  • ECDF segmental implementation
  • Parametric equalization
  • Robust VAD
  • Bispectrum-based VAD
  • Model-based feature compensation
  • VTS results on AURORA4
  • Including uncertainty caused by noise

3
HIWIRE database evaluations
  • PARAMETERS: MFCC_0_D_A_Z (39 components)
  • MODELS
  • TIMIT: 46 phone models / 3 states / 128 Gaussians
    (17,664 Gaussians)
  • WSJ16k: 16,825 triphones / 3,608 tied states / 6
    Gaussians (21,648 Gaussians)
  • WSJ16kFon: 40 phone models / 3 states / 128
    Gaussians (15,360 Gaussians)
  • ADAPTATION
  • MLLR: 32 regression classes / 50 adaptation
    utterances
  • GRAMMAR
  • LORIA Word-Loop
  • MODIFICATIONS: Some transcriptions have been
    modified to match the grammar definition
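
The Gaussian totals quoted on this slide can be checked with a few multiplications. The sketch below assumes that "128 Gaussians" and "6 Gaussians" are per-state mixture sizes and that the WSJ16k total is counted over tied states; the dictionary and function names are illustrative only.

```python
# Model sizes as listed on the slide: (units, states per unit, Gaussians per state).
# For WSJ16k the tied states are counted directly, so "states per unit" is 1.
MODELS = {
    "TIMIT": (46, 3, 128),
    "WSJ16k": (3608, 1, 6),
    "WSJ16kFon": (40, 3, 128),
}


def total_gaussians(units, states_per_unit, gaussians_per_state):
    """Total number of Gaussian components in the acoustic model."""
    return units * states_per_unit * gaussians_per_state


TOTALS = {name: total_gaussians(*spec) for name, spec in MODELS.items()}

for name, total in TOTALS.items():
    print(f"{name}: {total} Gaussians")
```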

4
Transcription modifications
BEGIN {
  lista = LISTA   # LISTA is presumably set externally, e.g. with -v
  nfrase = 0      # sentence counter
}
{
  linea = $0
  gsub("-", "_", linea)   # hyphens become underscores
  gsub("Due_to_", "Due_to ", linea)
  gsub("Mayday_Mayday", "Mayday Mayday", linea)
  gsub("Pan_Pan", "Pan Pan", linea)
  gsub("three hundred twenty", "three_hundred_twenty", linea)
  gsub("one hundred sixty", "one_hundred_sixty", linea)
  printf("%s\n", tolower(linea))
  nfrase = nfrase + 1
}
5
HIWIRE database results
6
Schedule
  • HIWIRE database evaluations
  • Non-linear feature normalization
  • ECDF segmental implementation
  • Parametric equalization
  • Robust VAD
  • Bispectrum-based VAD
  • Model-based feature compensation
  • VTS results on AURORA4
  • Including uncertainty caused by noise

7
ECDF segmental implementation
  • ECDF segmental implementation
  • Provided LOQUENDO with a reference C
    implementation of segmental Gaussian
    transformation to be tested within LOQUENDO
    recognizer
  • Current work
  • Nonlinear feature transformation with a clean
    reference to avoid the problem of system
    retraining

8
Parametric Equalization (1)
PARAMETRIC NONLINEAR FEATURE EQUALIZATION FOR
ROBUST SPEECH RECOGNITION (submitted to ICASSP 2006)
  • HEQ limitations
  • Influence of relative amount of silence in
    utterances
  • With a parametric model, a more robust
    equalization can be obtained

9
Parametric Equalization (2)
10
Parametric Equalization (3)
11
Parametric Equalization (4)
  • In comparison with HEQ, PEQ transformations are
    smoother
  • For C0 a monotonic transformation is obtained
  • For other coefficients, the interpolated
    transformation is not monotonic

12
Parametric Equalization (5)
  • BASE
  • MFCC_0_D_A_Z (39 components)
  • HEQ
  • Quantile-based CDF transformation
  • Clean reference
  • Implemented over MFCC_0 / CMS and regressions
    computed after HEQ
  • AFE
  • Standard implementation
  • PEQ
  • Clean reference
  • Implemented over MFCC_0 / CMS and regressions
    computed after PEQ
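
As a rough illustration of the quantile-based CDF transformation used for HEQ, the sketch below equalizes one feature component over an utterance so that it follows a reference distribution. It assumes a Gaussian reference with zero mean and unit variance standing in for the clean reference, and a simple rank-based CDF estimate; the function name and parameters are hypothetical and not taken from the evaluated system.

```python
from statistics import NormalDist


def heq_to_gaussian(values, ref_mean=0.0, ref_std=1.0):
    """Equalize one feature component over an utterance (or segment).

    Each value is replaced by the reference quantile that matches its
    empirical CDF value, so the output follows the reference distribution.
    """
    n = len(values)
    reference = NormalDist(ref_mean, ref_std)  # assumed stand-in for the clean reference
    order = sorted(range(n), key=lambda i: values[i])
    equalized = [0.0] * n
    for rank, i in enumerate(order):
        cdf = (rank + 0.5) / n  # rank-based CDF estimate, never exactly 0 or 1
        equalized[i] = reference.inv_cdf(cdf)
    return equalized


c0 = [3.0, -1.0, 10.0, 0.5, 2.0]
print(heq_to_gaussian(c0))
```

The mapping is monotonic by construction, so the ordering of the frames is preserved; only the shape of the distribution changes.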

13
Parametric Equalization (6)
  • Current work
  • Development of an on-line version
  • Relax the diagonal covariance assumption
  • Investigate the normalization of dynamic features
  • Using a more detailed model of speech frames
    (i.e., more than one Gaussian)

14
Schedule
  • HIWIRE database evaluations
  • Non-linear feature normalization
  • ECDF segmental implementation (LOQ)
  • Parametric equalization
  • Robust VAD
  • Bispectrum-based VAD
  • Model-based feature compensation
  • VTS results on AURORA4
  • Including uncertainty caused by noise

15
Bispectrum-based VAD (1)
  • Motivations
  • Ability of higher order statistics to detect
    signals in noise
  • Polyspectra methods rely on a priori knowledge
    of the input processes
  • Issues to be addressed
  • Computationally expensive
  • Variance of the bispectrum estimators is much
    higher than that of power spectral estimators for
    identical data record size
  • Solution: integrated bispectrum
  • J. K. Tugnait, "Detection of non-Gaussian signals
    using integrated polyspectrum," IEEE Trans. on
    Signal Processing, vol. 42, no. 11, pp.
    3137-3149, 1994.
  • Computationally efficient and reduced variance
    statistical test based on the integrated
    polyspectra
  • Detection of an unknown random, stationary,
    non-Gaussian signal in Gaussian noise

16
Bispectrum-based VAD (2)
  • Integrated bispectrum
  • Defined as a cross spectrum between the signal
    and its square, and therefore, it is a function
    of a single frequency variable
  • Benefits
  • Its computation as a cross spectrum leads to
    significant computational savings
  • The variance of the estimator is of the same
    order as that of the power spectrum estimator
  • Properties
  • The bispectrum of a Gaussian process is identically
    zero, and so is its integrated bispectrum
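
The cross-spectrum definition above can be sketched as a block-averaged estimator. This is a minimal, dependency-free illustration, not the implementation used for the reported results: the squared signal is centred before the cross spectrum is taken, and the conjugation and normalization conventions are assumptions. The names `dft` and `integrated_bispectrum` are illustrative.

```python
import cmath


def dft(block):
    """Plain O(N^2) discrete Fourier transform (kept dependency-free)."""
    n = len(block)
    return [
        sum(block[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def integrated_bispectrum(signal, block_len):
    """Block-averaged cross spectrum between the signal and its centred square."""
    n_blocks = len(signal) // block_len
    if n_blocks < 1:
        raise ValueError("signal is shorter than one block")
    mean_x = sum(signal) / len(signal)
    x = [s - mean_x for s in signal]
    mean_sq = sum(v * v for v in x) / len(x)
    y = [v * v - mean_sq for v in x]  # centred squared signal
    acc = [0j] * block_len
    for b in range(n_blocks):
        lo = b * block_len
        spec_x = dft(x[lo:lo + block_len])
        spec_y = dft(y[lo:lo + block_len])
        for k in range(block_len):
            # Conjugation convention is an assumption of this sketch.
            acc[k] += spec_y[k] * spec_x[k].conjugate()
    return [a / (n_blocks * block_len) for a in acc]


# A +1/-1 sequence has a constant square, so the estimate is zero at every bin.
print(max(abs(v) for v in integrated_bispectrum([1.0, -1.0] * 8, 4)))
```

Because the result depends on a single frequency index, its cost is that of two transforms per block, which is the computational saving the slide refers to.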

17
Bispectrum-based VAD (3)
  • Two alternatives explored for formulating the
    decision rule
  • Estimation by block averaging
  • MO-LRT
  • Given a set of N = 2m + 1 consecutive observations

18
Bispectrum-based VAD (4)
  • Likelihoods
  • Variances
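
The likelihood and variance formulas of these two slides did not survive in the transcript. The sketch below only illustrates the multiple-observation idea, assuming independent observations so that the test statistic for a frame is the sum of the per-frame log-likelihood ratios over the N = 2m + 1 frames centred on it. The per-frame ratios are taken as given inputs, and the function name, the edge handling, and the threshold are hypothetical.

```python
def mo_lrt_decisions(frame_llr, m, threshold=0.0):
    """Label each frame speech (True) or non-speech (False).

    frame_llr[l] is the log-likelihood ratio of frame l on its own; the
    decision for frame l sums the ratios of the N = 2m + 1 frames centred on it.
    """
    n_frames = len(frame_llr)
    decisions = []
    for l in range(n_frames):
        lo = max(0, l - m)              # window is truncated at the edges
        hi = min(n_frames, l + m + 1)
        decisions.append(sum(frame_llr[lo:hi]) > threshold)
    return decisions


llr = [-1.0, -1.0, 5.0, -1.0, -1.0]
print(mo_lrt_decisions(llr, m=0))
print(mo_lrt_decisions(llr, m=1))
```

With m = 1 the single strong frame also pulls its two neighbours over the threshold, which is the smoothing effect a longer window is meant to provide.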

19
Bispectrum-based VAD results (1)
20
Bispectrum-based VAD results (2)
21
Bispectrum-based VAD results (3)
22
Schedule
  • HIWIRE database evaluations
  • Non-linear feature normalization
  • ECDF segmental implementation (LOQ)
  • Parametric equalization
  • Robust VAD
  • Bispectrum-based VAD
  • Model-based feature compensation
  • VTS results on AURORA4
  • Including uncertainty caused by noise

23
Schedule
  • Model-based feature compensation
  • VTS results on AURORA4
  • VTS formulation
  • VTS vs non-linear feature normalization
    procedures
  • VTS results on AURORA 4
  • Including uncertainty caused by noise
  • Including uncertainty in noise compensation
  • Wiener filtering uncertainty results on Aurora
    2
  • Wiener filtering uncertainty results on Aurora
    4
  • VTS uncertainty formulation
  • Numerical integration of probabilities
    formulation

24
VTS formulation
  • VTS: Vector Taylor Series approach to remove
    additive (and channel) noise
  • References
  • P.J. Moreno. "Speech recognition in noisy
    environments." Ph.D. Thesis, Carnegie Mellon
    University, Pittsburgh, Pennsylvania, Apr. 1996.
  • A. de la Torre. "Técnicas de mejora de la
    representación en los sistemas de reconocimiento
    automático del habla" [Techniques for improving
    the representation in automatic speech recognition
    systems]. Ph.D. Thesis, University of Granada,
    Spain, Apr. 1999.

25
VTS formulation
  • VTS provides an estimation of the clean speech in
    a statistical framework
  • Log-FBO domain, assumed additive noise
  • Effect of noise described using the correction
    function g()
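
The slide's expression for g() is missing from the transcript. Under the stated assumption that speech and noise add in the linear filter-bank domain, and ignoring the channel term, the log-domain relation is y = log(exp(x) + exp(n)) = x + log(1 + exp(n - x)). The sketch below takes that second term as the correction function; the two-argument signature is an assumption of this example.

```python
import math


def g(x, n):
    """Correction term such that y = x + g(x, n) when energies add linearly."""
    return math.log1p(math.exp(n - x))


def noisy_log_energy(x, n):
    """Log of the sum of the linear-domain speech and noise energies."""
    return math.log(math.exp(x) + math.exp(n))


for x, n in [(0.0, 0.0), (5.0, 2.0), (2.0, 5.0)]:
    print(x, n, x + g(x, n), noisy_log_energy(x, n))
```

The correction is close to zero when speech dominates (x much larger than n) and approaches n - x when noise dominates, so y then follows the noise level.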

26
VTS formulation
  • Auxiliary functions f() and h(): 1st and 2nd
    derivatives of g()
  • VTS provides estimation of noisy-speech Gaussian
    given the clean-speech and the noise Gaussians
  • Noisy-speech Gaussian obtained with the expected
    values

27
VTS formulation
  • Noisy-speech Gaussian formulas
  • Models for noise and clean speech
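
The noisy-speech Gaussian formulas themselves are not preserved here. As a hedged illustration, the sketch below applies the usual first-order Taylor expansion of y = log(exp(x) + exp(n)) around the means of one clean-speech Gaussian and one noise Gaussian, for a single log-FBO channel. The function f() is defined here as the derivative of y with respect to n, which may differ in sign or argument from the slide's f(), and the second-order term involving h() is omitted.

```python
import math


def g(x, n):
    """Assumed correction function: y = x + g(x, n)."""
    return math.log1p(math.exp(n - x))


def f(x, n):
    """Derivative of y = log(exp(x) + exp(n)) with respect to n."""
    return 1.0 / (1.0 + math.exp(x - n))


def vts_noisy_gaussian(mu_x, var_x, mu_n, var_n):
    """First-order VTS approximation of the noisy-speech mean and variance."""
    f0 = f(mu_x, mu_n)
    mu_y = mu_x + g(mu_x, mu_n)
    # dy/dx = 1 - f0 and dy/dn = f0, with x and n assumed independent.
    var_y = (1.0 - f0) ** 2 * var_x + f0 ** 2 * var_n
    return mu_y, var_y


print(vts_noisy_gaussian(0.0, 1.0, 0.0, 1.0))
print(vts_noisy_gaussian(0.0, 1.0, -40.0, 1.0))
```

When the noise mean is far below the speech mean, the noisy Gaussian collapses onto the clean one; when the two means are equal, the mean shifts by log 2 and each variance is weighted by one quarter.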

28
VTS formulation
  • Model for clean speech provides the model for
    noisy speech, and also P(k|y) (posterior
    probability of each Gaussian)
  • Estimation of clean speech
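
The estimation formula is not preserved in the transcript. One common form, shown here only as an assumption, subtracts from y the posterior-weighted correction of each clean-speech Gaussian: x_hat = y - sum_k P(k|y) g(mu_x_k, mu_n). The sketch works on a single log-FBO channel with a scalar Gaussian mixture; all names and the first-order noisy-speech model are illustrative.

```python
import math


def gaussian_pdf(v, mean, var):
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def vts_clean_estimate(y, weights, means, variances, mu_n, var_n):
    """Return (x_hat, posteriors) for one log-FBO channel."""
    likelihoods, corrections = [], []
    for w, mu_x, var_x in zip(weights, means, variances):
        g0 = math.log1p(math.exp(mu_n - mu_x))    # assumed correction g()
        f0 = 1.0 / (1.0 + math.exp(mu_x - mu_n))  # dy/dn at the means
        mu_y = mu_x + g0
        var_y = (1.0 - f0) ** 2 * var_x + f0 ** 2 * var_n
        likelihoods.append(w * gaussian_pdf(y, mu_y, var_y))
        corrections.append(g0)
    total = sum(likelihoods)
    posteriors = [p / total for p in likelihoods]  # P(k|y)
    x_hat = y - sum(p * c for p, c in zip(posteriors, corrections))
    return x_hat, posteriors


x_hat, post = vts_clean_estimate(10.0, [0.5, 0.5], [0.0, 10.0], [1.0, 1.0], 5.0, 1.0)
print(x_hat, post)
```

In this example the observation lies well above the noise mean, so the high-energy Gaussian takes almost all of the posterior mass and the estimate stays just below y.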

29
VTS vs non-linear feature normalization
  • VTS
  • Statistical framework
  • Model for noise in log-FBO domain: 1 Gaussian PDF
  • Model for clean speech in log-FBO domain:
    Gaussian mixture
  • Noise assumed to be additive in FBO domain
  • Accurate description of noise process
  • ACCURATE COMPENSATION
  • Non-linear feature normalization
  • No a-priori assumption
  • Component-by-component
  • MORE FLEXIBLE, LESS ACCURATE

30
VTS results on AURORA 4
31
Including uncertainty in noise compensation
  • Noise is a random process: we do not know n,
    only p(n)
  • Then, from an observation y we cannot find x,
    only p(x | y, λx, λn)
  • Usually, compensation procedures provide
    E[x | y, λx, λn]
  • What about the uncertainty of x?
  • Mean and variance of x

32
Including uncertainty in noise compensation
33
Including uncertainty in noise compensation
  • An approach for the estimation of the variance
  • Evaluation of HMM Gaussians
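
The formulas of this slide are missing from the transcript. A common way to use the variance of the estimate when evaluating an HMM Gaussian, assumed here rather than taken from the slide, is to treat the clean feature as Gaussian with mean x_hat and variance var_hat and integrate it out. For a scalar (diagonal-covariance) component this gives the same Gaussian with the two variances added. Function names are illustrative.

```python
import math


def gaussian_pdf(v, mean, var):
    return math.exp(-0.5 * (v - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gaussian_with_uncertainty(x_hat, var_hat, mean, var):
    """Evaluate an HMM Gaussian when x is only known as N(x_hat, var_hat).

    Integrating N(x; x_hat, var_hat) * N(x; mean, var) over x gives
    N(x_hat; mean, var + var_hat).
    """
    return gaussian_pdf(x_hat, mean, var + var_hat)


print(gaussian_with_uncertainty(1.0, 0.0, 0.0, 1.0))  # no uncertainty: plain Gaussian
print(gaussian_with_uncertainty(1.0, 0.5, 0.0, 1.0))  # uncertain estimate: wider Gaussian
```

Frames with a large estimation variance therefore contribute flatter likelihoods, which reduces their influence on the decoding.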

34
Wiener filt. uncertainty results on AURORA 2
  • Preliminary results with Wiener filtering
  • Results on Aurora 2 with Wiener filtering
    uncertainty

35
Wiener filter uncertainty results on AURORA 4
36
VTS uncertainty formulation
  • VTS based estimation of clean speech
  • VTS based estimation of variance

37
Numerical integration of probabilities formulation
  • Computation of expected values
  • Numerical integration of expected values

38
HIWIRE MEETING Athens, November 3-4, 2005
  • José C. Segura, Ángel de la Torre