Speaker Verification Using Series of LVQ Networks

1
Speaker Verification Using Series of LVQ Networks
ECE874 Introduction to Neural Networks Course
Presentation
  • Pálinkó Oszkár

2
Overview
  • Automated biometric authentication
  • Speaker recognition in general
  • Speech modeling
  • Speaker verifier structure
  • LVQ training
  • Decision making
  • Results and conclusions

3
Automated biometric authentication
  • Non-biometric authentication
  • Biometrics: measurement of physiological and
    behavioral characteristics for authentication
  • Authentication based on pattern recognition
  • Recognition methods: fingerprint, hand geometry,
    retina, iris, face, signature, speaker

4
Speaker recognition
  • Based on the speech and way of speaking
  • A natural, non-intrusive method
  • Both a physiological and a behavioral characteristic
  • Verification vs. identification
  • Main steps: speech modeling (feature extraction),
    training (classification), recognition

5
Speech modeling
  • Feature extraction
  • Appropriate features for authentication:
    mel-frequency cepstral coefficients (MFCC) and
    the pitch
  • Generating the MFCCs

[Block diagram: speech signal, segmentation and windowing, DFT S(ω), critical filter H(ω), log S(ω), IDFT c_m(n)]

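The MFCC steps listed above can be sketched in plain Python. This is a minimal sketch, not the presenter's implementation: the Hamming window, the 20 triangular mel filters, the standard mel-scale formula, and the use of a DCT-II in place of the IDFT step are all assumptions, and every function name is illustrative. The 8 kHz sample rate and the 16 coefficients are taken from the later slides.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window one frame, then return |DFT|^2 for bins 0..N/2 (naive DFT)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spec.append(abs(s) ** 2)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters equally spaced on the mel scale (assumed filter shape)."""
    top = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies in Hz
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = sample_rate / n_fft
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            f = k * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate=8000, n_filters=20, n_coeffs=16):
    """Cepstral coefficients c_1..c_n for a single frame of samples."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log of each filter's energy; the small constant avoids log(0).
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    # DCT-II of the log energies (assumed stand-in for the IDFT block).
    return [sum(e * math.cos(math.pi * m * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for m in range(1, n_coeffs + 1)]
```

Calling `mfcc_frame` on a 256-sample frame returns a list of 16 coefficients; appending a pitch estimate would give the 17-component input vector used for training.
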
6
Speaker Verifier Structure
  • The verification system is based on a sequence of
    n LVQ networks

[Block diagram: speech signal, speech modeling (MFCC, pitch), LVQ1, LVQ2, ..., LVQn, decision logic]

7
LVQ Training
  • Speaker vs. one of the impostors
  • The input vector: 16th-order MFCC and the pitch
    (17 components)
  • 18 codewords, each assigned to one of the two
    final classes (speaker or impostor)
  • 45 seconds of training data

[Diagram: inputs x1, x2, ..., x17 feed an LVQ layer with weights W; outputs: speaker, impostor]

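The training of one such network can be sketched with the basic LVQ1 update rule. This is a minimal sketch under stated assumptions: the slides do not say which LVQ variant was used, so LVQ1 is assumed, as are the learning rate, the number of epochs, the initialisation from training vectors, and all function names. Only the 18 codewords and the two classes come from the slide.

```python
import random

def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((w - v) ** 2 for w, v in zip(codebook[i][0], x)))

def train_lvq1(samples, labels, n_codewords=18, lr=0.05, epochs=20, seed=0):
    """LVQ1: pull the winning codeword toward same-class inputs, push it away otherwise."""
    rng = random.Random(seed)
    classes = sorted(set(labels))
    # Initialise each codeword from a random training vector, alternating classes.
    codebook = []
    for i in range(n_codewords):
        c = classes[i % len(classes)]
        pool = [s for s, l in zip(samples, labels) if l == c]
        codebook.append((list(rng.choice(pool)), c))
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying learning rate (assumed)
        rng.shuffle(order)
        for idx in order:
            x, y = samples[idx], labels[idx]
            w, c = codebook[nearest(codebook, x)]
            sign = 1.0 if c == y else -1.0
            for d in range(len(w)):
                w[d] += sign * rate * (x[d] - w[d])
    return codebook

def classify(codebook, x):
    """Class label of the codeword nearest to x."""
    return codebook[nearest(codebook, x)][1]
```

In the structure described earlier, one codebook of this kind would be trained per impostor, each time with the same speaker's frames as the other class.
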
8
Decision making
  • LVQ - highly discriminative
  • Decision making based on 3-second segments;
    every network gives a probability value for
    that interval
  • Every ANN has to score over 50%, and half of
    them over 60%
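
The decision rule above can be sketched as follows. This is a minimal sketch with two assumptions the slides do not confirm: a network's probability value is taken to be the fraction of frames in the segment that it labels as the speaker, and "half of those" is read as "at least half of the networks". The function names and the representation of networks as callables are illustrative.

```python
def segment_scores(networks, frames):
    """For each network, the fraction of frames in the segment labelled 'speaker'."""
    scores = []
    for classify in networks:
        votes = sum(1 for x in frames if classify(x) == "speaker")
        scores.append(votes / len(frames))
    return scores

def accept(scores, floor=0.5, strong=0.6):
    """Accept if every score exceeds `floor` and at least half exceed `strong`."""
    if not scores:
        return False
    all_pass = all(s > floor for s in scores)
    n_strong = sum(1 for s in scores if s > strong)
    return all_pass and 2 * n_strong >= len(scores)
```

Under these assumptions, scores of 0.55 and 0.65 from two networks are accepted, while 0.45 and 0.90 are rejected because one network falls below the 50% floor.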

9
Results and Conclusions
  • Eight subjects, 8 kHz data, 45 s training, 3 s
    recognition
  • 96% final recognition accuracy
  • Further development: higher-quality input data
    (16 kHz), use of delta mel coefficients,
    Markov modeling

10
Thank you for your attention.