Some activities on Non-linear Speech Processing at ENST/CNRS-LTCI

1
Some activities on Non-linear Speech Processing
at ENST/CNRS-LTCI
Gérard CHOLLET
chollet_at_tsi.enst.fr
ENST/CNRS-LTCI
46 rue Barrault
75634 PARIS cedex 13
http://www.tsi.enst.fr/chollet
2
Outline
  • What is ENST/CNRS-LTCI?
  • Research and application topics related to
    COST-277
  • Speech production and perception,
  • Speech analysis and synthesis,
  • Speech coding
  • The SYMPATEX project
  • Automatic speech recognition
  • The SIROCCO project
  • Speaker characterisation and verification
  • Perspectives within COST-277

3
Our affiliations
ENST: Ecole Nationale Supérieure des Télécommunications, http://www.enst.fr
CNRS: Centre National de la Recherche Scientifique, http://www.cnrs.fr
LTCI: Laboratoire de Traitement et Communication de l'Information, http://www.enst.fr/externe/ura.html
4
What is ENST? Ecole Nationale Supérieure des Télécommunications
  • classed among the Grandes Ecoles d'Ingénieurs
  • 250 state-certified engineers each year
  • part of the Groupement des Ecoles de Télécommunications

5
GET: Groupement des Ecoles de Télécommunications
  • ENST-Paris
  • ENST-Bretagne in Brest
  • Institut National des Télécommunications in Evry
  • EURECOM in Sophia-Antipolis
  • ENIC (Ecole Nouvelle d'Ingénieurs en Télécoms) in Lille
  • Internet school in Marseille

6
Speech Production and Perception
  • Parametric Vocal Tract model (Shinji Maeda)
  • Non-linear Production model using Distinctive
    Regions and Modes (René Carré)
  • Quantal nature of speech (R. Carré and S. Maeda)
  • Perceptual filter (Nicolas Moreau)
  • Auditory prosthesis (Alain Goyé and Jacques Prado)

7
Speech analysis and synthesis
  • Time-Frequency representations, Wavelets
  • Time-dependent spectral models (Yves Grenier)
  • HNM (Harmonics Noise Model) (Olivier Cappé,
    Eric Moulines, Maurice Charbit)
  • Glottal Excited LPC

8
Time-dependent Spectral Models
  • Temporal Decomposition (B. Atal, 1983)
  • Vectorial Autoregressive models with detection of
    model ruptures (A. DeLima, Y. Grenier)
  • Segmental parameterisation using a time-dependent
    polynomial expansion (Y. Grenier)
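
As a rough illustration of the last item, a parameter track over one segment can be summarised by the coefficients of a low-order polynomial in time. The sketch below is only one plausible reading of "time-dependent polynomial expansion", not the method from the slides: the monomial basis, the normalised time axis, and the function names fit_polynomial and evaluate are all assumptions made for this example.

```python
def fit_polynomial(track, order):
    """Least-squares fit of a polynomial in normalised time t in [0, 1].

    `track` holds one parameter value per frame of a segment (hypothetical data).
    Needs at least two frames, and more frames than `order`.
    Returns coefficients c[0..order] of c[0] + c[1]*t + c[2]*t**2 + ...
    """
    n = len(track)
    times = [i / (n - 1) for i in range(n)]
    size = order + 1
    # Normal equations A c = r for the monomial basis 1, t, t^2, ...
    a = [[sum(t ** (i + j) for t in times) for j in range(size)] for i in range(size)]
    r = [sum(y * t ** i for t, y in zip(times, track)) for i in range(size)]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        r[col], r[pivot] = r[pivot], r[col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
            r[row] -= factor * r[col]
    # Back substitution.
    coeffs = [0.0] * size
    for i in range(size - 1, -1, -1):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, size))
        coeffs[i] = (r[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, t):
    """Value of the fitted polynomial at normalised time t."""
    return sum(c * t ** k for k, c in enumerate(coeffs))


# Toy segment of six frames following 2 + 3t - t^2 (made-up values).
track = [2.0 + 3.0 * (i / 5) - (i / 5) ** 2 for i in range(6)]
coeffs = fit_polynomial(track, order=2)
rebuilt = [evaluate(coeffs, i / 5) for i in range(6)]
```

In this toy case the six frame values are replaced by three coefficients, which is the kind of compact segment description the bullet alludes to.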

9
Temporal Decomposition
10
HNM: Harmonics Noise Model
11
ALISP: Automatic Language Independent Speech Processing

Automatic discovery of segmental units for
speech coding, synthesis, recognition,
language identification and speaker verification.
12
Speech Coding by indexing
SYMPATEX: SYstème de Messagerie unifiée avec présentation vocale des messages (PArole et TEXte), a unified messaging system with spoken presentation of messages (speech and text)
Partners: Thomson-CSF, ELAN TTS, Irius, GET, ESIEE
13
Coding principle
14
Decoding
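
The coding and decoding slides carry no text, so the following is only a minimal sketch of what "coding by indexing" could look like: the coder sends the index of the closest unit in a shared dictionary, and the decoder looks that unit up again. It assumes fixed-length feature vectors and a squared Euclidean distance, whereas the automatically discovered ALISP units would presumably be variable-length segments; the names nearest_unit, encode and decode and all numeric values are hypothetical.

```python
def nearest_unit(segment, units):
    """Index of the dictionary unit closest to `segment` (squared Euclidean distance)."""
    def distance(index):
        return sum((a - b) ** 2 for a, b in zip(segment, units[index]))
    return min(range(len(units)), key=distance)


def encode(segments, units):
    """Coder: replace every segment by the index of its nearest unit."""
    return [nearest_unit(segment, units) for segment in segments]


def decode(indices, units):
    """Decoder: look each transmitted index up in the same dictionary."""
    return [units[index] for index in indices]


# Toy dictionary of three units and three input segments (made-up values).
units = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
segments = [[0.1, -0.1], [3.8, 4.2], [0.9, 1.2]]

indices = encode(segments, units)
reconstruction = decode(indices, units)
```

Only the index sequence needs to be transmitted, so the bit rate depends on the dictionary size and the number of units per second, not on the dimension of the feature vectors.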
15
Automatic Speech Recognition
  • Recognition of proper names and spellings
  • Keyword spotting, noise robustness, adaptation
  • Large Vocabulary Speech Recognition (SIROCCO)
  • http://perso.enst.fr/sirocco/index-en.html
  • Markov Random Fields, Bayesian Networks and
    Graphical Models

16
Markov Random Fields, Bayesian Networks and
Graphical Models
  • Speech modelling with state constrained Markov Random Field
    over Frequency bands (Guillaume Gravier and Marc Sigelle)
  • http://perso.enst.fr/ggravier/recherche.htmlthese
  • Comparative framework to study MRF, Bayesian Networks and
    Graphical Models.
  • http://www.cs.berkeley.edu/murphyk/Bayes/bayes.html

17
Speaker Verification
  • Typology of approaches (EAGLES Handbook)
  • Text dependent
  • Public password
  • Private password
  • Customized password
  • Text prompted
  • Text independent
  • Incremental enrolment
  • Evaluation

18
Speaker Verification (text independent)
  • The ELISA consortium
  • ENST, LIA, IRISA, ...
  • http://www.lia.univ-avignon.fr/equipes/RAL/elisa/index_en.html
  • NIST evaluations
  • http://www.nist.gov/speech/tests/spk/index.htm

19
Support Vector Machines and Speaker
Verification
  • A hybrid GMM-SVM system is proposed
  • The SVM scoring model is trained on development data to
    separate true-target speaker accesses from impostor
    accesses, using a new feature representation based on GMMs
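
The slides do not spell out the GMM-based feature representation, so the sketch below substitutes a simple stand-in: each access is mapped to two numbers, its average log-likelihood under a claimed-speaker GMM and under a background ("world") GMM, and a linear SVM is trained on those pairs with plain subgradient descent on the hinge loss. The one-dimensional frames, the mixture parameters, the world model and every function name are assumptions made for illustration only.

```python
import math


def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood of 1-D frames under a GMM.

    `gmm` is a list of (weight, mean, variance) triples (toy values below).
    """
    total = 0.0
    for x in frames:
        p = sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in gmm)
        total += math.log(p)
    return total / len(frames)


def features(frames, speaker_gmm, world_gmm):
    """Assumed feature vector: [log-lik under speaker GMM, log-lik under world GMM]."""
    return [gmm_loglik(frames, speaker_gmm), gmm_loglik(frames, world_gmm)]


def train_linear_svm(xs, ys, lr=0.01, lam=0.01, epochs=200):
    """Linear SVM fitted by subgradient descent on the L2-regularised hinge loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1.0 - lr * lam) for wi in w]  # regularisation shrink
            if margin < 1.0:  # example inside the margin: move towards it
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def svm_score(x, w, b):
    """Signed score: positive means accept as true target, negative means impostor."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


# Toy models and development data (all values made up).
speaker_gmm = [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)]
world_gmm = [(0.5, 3.0, 1.0), (0.5, 5.0, 1.0)]
target_accesses = [[-1.0, 0.0, 1.0], [-0.8, 0.3, 1.2]]
impostor_accesses = [[3.0, 4.0, 5.0], [2.8, 4.2, 5.1]]

xs = [features(f, speaker_gmm, world_gmm) for f in target_accesses + impostor_accesses]
ys = [1, 1, -1, -1]
w, b = train_linear_svm(xs, ys)

# Held-out accesses scored with the trained model.
target_score = svm_score(features([-0.5, 0.0, 0.5], speaker_gmm, world_gmm), w, b)
impostor_score = svm_score(features([3.5, 4.0, 4.5], speaker_gmm, world_gmm), w, b)
```

A real system would use multi-dimensional cepstral frames, far larger mixtures and a proper SVM solver; the point here is only the division of labour, with GMMs producing the features and the SVM making the accept or reject decision.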

20
SVM principles
21
Results
22
Voice technology in Majordome
  • Server side background tasks
  • continuous speech recognition applied to voice
    messages upon reception
  • Detection of sender's name and subject
  • User interaction
  • Speaker identification and verification
  • Speech recognition (receiving user commands
    through voice interaction)
  • Text-to-speech synthesis (reading text summaries,
    E-mails or faxes)

23
Perspectives within COST-277
  • Text-book on Speech Processing
  • Evaluation of parametric representations of
    speech for diverse applications
  • Fundamental work on voice transformations with
    applications in coding, synthesis, recognition
    and speaker characterisation
  • Fundamental work on noise robustness with
    applications in coding, recognition and speaker
    verification