Some activities on Non-linear Speech Processing at ENST/CNRS-LTCI


1
Some activities on Non-linear Speech Processing
at ENST/CNRS-LTCI
Gérard CHOLLET
chollet_at_tsi.enst.fr
ENST/CNRS-LTCI
46 rue Barrault
75634 PARIS cedex 13
http://www.tsi.enst.fr/chollet
2
Outline

What is ENST/CNRS-LTCI?
Research and application topics related to COST-277
Speech production and perception,
Speech analysis and synthesis,
Speech coding
The SYMPATEX project
Automatic speech recognition
The SIROCCO project
Speaker characterisation and verification
Perspectives within COST-277

3
Our affiliations
ENST: Ecole Nationale Supérieure des Télécommunications, http://www.enst.fr
CNRS: Centre National de la Recherche Scientifique, http://www.cnrs.fr
LTCI: Laboratoire de Traitement et Communication de l'Information, http://www.enst.fr/externe/ura.html
4
What is ENST?
Ecole Nationale Supérieure des Télécommunications

Classed among the Grandes Ecoles d'Ingénieurs.
Graduates 250 state-certified engineers each year.
Part of the Groupement des Ecoles de Télécommunications (GET).

5
GET: Groupement des Ecoles de Télécommunications

ENST-Paris
ENST-Bretagne in Brest
Institut National des Télécommunications in Evry
EURECOM in Sophia-Antipolis
ENIC (Ecole Nouvelle d'Ingénieurs en Télécoms) in Lille
Internet school in Marseille

6
Speech Production and Perception

Parametric Vocal Tract model (Shinji Maeda)
Non-linear Production model using Distinctive Regions and Modes (René Carré)
Quantal nature of speech (R. Carré and S. Maeda)
Perceptual filter (Nicolas Moreau)
Auditory prosthesis (Alain Goyé and Jacques Prado)

7
Speech analysis and synthesis

Time-Frequency representations, Wavelets
Time-dependent spectral models (Yves Grenier)
HNM (Harmonic plus Noise Model) (Olivier Cappé, Eric Moulines, Maurice Charbit)
Glottal Excited LPC

8
Time-dependent Spectral Models

Temporal Decomposition (B. Atal, 1983)
Vector autoregressive models with detection of model ruptures, i.e. abrupt changes (A. DeLima, Y. Grenier)
Segmental parameterisation using a time-dependent polynomial expansion (Y. Grenier)
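
As a reading aid, the two ideas named on this slide can be written in their usual textbook form. The notation below is an assumption, not taken from the slides: y_i(n) is the i-th spectral parameter at frame n, phi_k(n) are time-localised interpolation functions, a_{ik} are target values, and f_j(n) are basis functions such as the monomials n^j.

```latex
% Temporal decomposition: each parameter track is a sum of K overlapping,
% time-localised interpolation functions weighted by target values.
y_i(n) \approx \sum_{k=1}^{K} a_{ik}\, \phi_k(n)

% Time-dependent autoregressive model: the AR coefficients vary with time
% and are expanded on a small basis, e.g. polynomials f_j(n) = n^j.
x(n) = -\sum_{i=1}^{p} a_i(n)\, x(n-i) + e(n),
\qquad
a_i(n) = \sum_{j=0}^{m} a_{ij}\, f_j(n)
```

In both cases a segment of speech is summarised by a small set of constants (the a_{ik} or a_{ij}) rather than by one parameter vector per frame.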

9
Temporal Decomposition
10
HNM: Harmonic plus Noise Model
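
The figure for this slide did not survive. As a placeholder, the generic form of a harmonic plus noise decomposition is sketched here. The symbols are assumptions for illustration: f_0 is the local fundamental frequency, A_k and varphi_k are harmonic amplitudes and phases, and r(t) is a noise component.

```latex
% The signal is split into a harmonic (voiced) part and a noise part.
s(t) = h(t) + r(t)

% Within a short frame, the harmonic part is a sum of sinusoids at
% multiples of the fundamental frequency f_0.
h(t) = \sum_{k=1}^{K} A_k \cos\!\left(2\pi k f_0 t + \varphi_k\right)
```

The noise part r(t) is typically modelled as filtered white noise. The exact variant used by the authors is not stated in the transcript.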
11
ALISP

Automatic Language Independent Speech Processing

Automatic discovery of segmental units for speech coding, synthesis, recognition, language identification and speaker verification.
12
Speech Coding by indexing
SYMPATEX
SYstème de Messagerie unifiée avec présentation vocale des messages (PArole et TEXte)
(Unified messaging system with voice presentation of messages: speech and text)
Thomson-CSF, ELAN TTS, Irius
GET, ESIEE
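
The slides give no algorithmic detail for coding by indexing, so the following is only a minimal sketch of the general idea under stated assumptions: each speech segment is replaced by the index of its closest representative unit, and the decoder looks the units up again. The names `codebook`, `encode` and `decode`, the 2-D feature vectors and the nearest-neighbour rule (standing in for a real recognition step) are all hypothetical.

```python
# Hypothetical sketch of coding by indexing; names and data are illustrative only.
from math import dist


def encode(segments, codebook):
    """Replace each segment (a feature vector) by the index of its nearest unit."""
    return [
        min(range(len(codebook)), key=lambda i: dist(seg, codebook[i]))
        for seg in segments
    ]


def decode(indices, codebook):
    """Rebuild an approximation by looking up the representative of each index."""
    return [codebook[i] for i in indices]


# Toy codebook of three "units" in a 2-D feature space (made-up values).
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 0.0)]
segments = [(0.1, -0.2), (0.9, 1.2), (3.5, 0.3), (1.1, 0.8)]

indices = encode(segments, codebook)          # only these indices are transmitted
reconstruction = decode(indices, codebook)    # the decoder shares the codebook
print(indices)
```

Only the index sequence needs to be transmitted, which is why such a scheme can reach very low bit rates when both sides share the unit inventory.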
13
Coding principle
14
Decoding
15
Automatic Speech Recognition

Recognition of proper names and spellings
Keyword spotting, noise robustness, adaptation
Large Vocabulary Speech Recognition (SIROCCO)
http://perso.enst.fr/sirocco/index-en.html
Markov Random Fields, Bayesian Networks and Graphical Models

16
Markov Random Fields, Bayesian Networks and Graphical Models

Speech modelling with state-constrained Markov Random Fields over frequency bands (Guillaume Gravier and Marc Sigelle)
http://perso.enst.fr/ggravier/recherche.html#these
Comparative framework to study MRF, Bayesian Networks and Graphical Models.
http://www.cs.berkeley.edu/~murphyk/Bayes/bayes.html

17
Speaker Verification

Typology of approaches (EAGLES Handbook)
Text dependent
Public password
Private password
Customized password
Text prompted
Text independent
Incremental enrolment
Evaluation

18
Speaker Verification (text independent)

The ELISA consortium
ENST, LIA, IRISA, ...
http://www.lia.univ-avignon.fr/equipes/RAL/elisa/index_en.html
NIST evaluations
http://www.nist.gov/speech/tests/spk/index.htm

19
Support Vector Machines and Speaker Verification

A hybrid GMM-SVM system is proposed.
An SVM scoring model is trained on development data to classify true-target speaker accesses versus impostor accesses, using a new feature representation based on GMMs.
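
The transcript does not say which GMM-based features the system uses, so the sketch below is an assumption throughout. Each access is summarised by two numbers, its mean log-likelihood under a speaker GMM and under a background ("world") GMM, and a linear soft-margin SVM is trained on these features by batch subgradient descent. The 1-D models, the frame values and the names `gmm_loglik`, `gmm_features`, `train_linear_svm` and `svm_score` are hypothetical.

```python
# Hypothetical GMM-SVM sketch; models, data and feature choice are assumptions.
import math


def gmm_loglik(frames, gmm):
    """Mean per-frame log-likelihood of 1-D frames under a GMM
    given as a list of (weight, mean, variance) triples."""
    total = 0.0
    for x in frames:
        p = sum(w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in gmm)
        total += math.log(p)
    return total / len(frames)


def gmm_features(frames, speaker_gmm, world_gmm):
    """Two-dimensional GMM-based representation of one access."""
    return [gmm_loglik(frames, speaker_gmm), gmm_loglik(frames, world_gmm)]


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b) < 1:
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Made-up 1-D models and development accesses (lists of frame values).
speaker_gmm = [(0.5, 0.0, 1.0), (0.5, 2.0, 1.0)]
world_gmm = [(0.5, -3.0, 4.0), (0.5, 5.0, 4.0)]
target_accesses = [[0.2, 1.8, 1.0], [-0.3, 2.2, 0.5]]
impostor_accesses = [[-3.1, 4.8, -2.5], [5.2, -3.4, 4.5]]

raw = [gmm_features(f, speaker_gmm, world_gmm)
       for f in target_accesses + impostor_accesses]
labels = [1, 1, -1, -1]

# Centre the features on the development set before training.
mean = [sum(col) / len(col) for col in zip(*raw)]
X = [[v - m for v, m in zip(row, mean)] for row in raw]
w, b = train_linear_svm(X, labels)


def svm_score(frames):
    """Positive score: accept as true target; negative: reject as impostor."""
    f = gmm_features(frames, speaker_gmm, world_gmm)
    return sum(wj * (fj - mj) for wj, fj, mj in zip(w, f, mean)) + b


print(svm_score([0.8, 1.5, 0.1]) > 0, svm_score([4.9, -2.8, 5.5]) > 0)
```

A real system would use multivariate GMMs over cepstral features and a tuned SVM solver. The point here is only the division of labour: GMMs turn a variable-length access into a fixed-length vector, and the SVM makes the accept/reject decision.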

20
SVM principles
21
Results
22
Voice technology in Majordome

Server-side background tasks
Continuous speech recognition applied to voice messages upon reception
Detection of sender's name and subject
User interaction
Speaker identification and verification
Speech recognition (receiving user commands through voice interaction)
Text-to-speech synthesis (reading text summaries, e-mails or faxes)

23
Perspectives within COST-277

Textbook on Speech Processing
Evaluation of parametric representations of speech for diverse applications
Fundamental work on voice transformations with applications in coding, synthesis, recognition and speaker characterisation
Fundamental work on noise robustness with applications in coding, recognition and speaker verification
