Speaker Verification Using Series of LVQ Networks

Slides: 11

Provided by: blah1

1
Speaker Verification Using Series of LVQ Networks
ECE874 Introduction to Neural Networks Course
Presentation

Pálinkó Oszkár

2
Overview

Automated biometric authentication
Speaker recognition in general
Speech modeling
Speaker verifier structure
LVQ training
Decision making
Results and conclusions

3
Automated biometric authentication

Non-biometric authentication
Biometrics: measurement of physiological and
behavioral characteristics for authentication
Authentication based on pattern recognition
Recognition methods: fingerprint, hand geometry,
retina, iris, face, signature, speaker

4
Speaker recognition

Based on the speech and way of speaking
A natural, non-intrusive method
Both a physiological and a behavioral characteristic
Verification vs. identification
Main steps: speech modeling (feature extraction),
training (classification), recognition

5
Speech modeling

Feature extraction
Appropriate features for authentication:
mel-frequency cepstral coefficients (MFCCs) and
the pitch
Generating the MFCCs

Diagram blocks: speech signal; segmentation and windowing; DFT S(ω); log S(ω); critical filter H(ω); IDFT c_m(n)
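
The blocks listed on this slide can be sketched for a single frame as follows. This is a minimal illustration, not the presenter's implementation: the Hamming window, the 24 triangular mel filters, the 256-sample frame, and the cosine transform used for the inverse step are all assumptions, and the filter-then-log ordering follows common practice because the slide's exact block order is not recoverable. Only the 16 coefficients and the 8 kHz sampling rate come from the slides.

```python
import cmath
import math

def hamming(n):
    """Hamming window of length n (assumed window type)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(frame):
    """Naive O(N^2) DFT; returns |S(k)|^2 for k = 0 .. N/2."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append(abs(s) ** 2)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [f * n_fft / sample_rate for f in edges_hz]  # fractional bin positions
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = []
        for k in range(n_fft // 2 + 1):
            if lo < k <= mid:
                row.append((k - lo) / (mid - lo))   # rising edge
            elif mid < k < hi:
                row.append((hi - k) / (hi - mid))   # falling edge
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate=8000, n_filters=24, n_coeffs=16):
    """Window -> DFT -> mel filterbank -> log -> cosine transform."""
    windowed = [x * w for x, w in zip(frame, hamming(len(frame)))]
    spec = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small constant avoids log(0) on silent bands.
    log_e = [math.log(sum(h * p for h, p in zip(row, spec)) + 1e-10) for row in bank]
    return [
        sum(e * math.cos(math.pi * m * (j + 0.5) / n_filters) for j, e in enumerate(log_e))
        for m in range(1, n_coeffs + 1)
    ]

# One 32 ms frame of a 440 Hz tone sampled at 8 kHz (toy input).
frame = [math.sin(2 * math.pi * 440 * i / 8000) for i in range(256)]
coeffs = mfcc(frame)
print(len(coeffs))  # 16
```

Appending a pitch estimate to these 16 coefficients would give the 17-element input vector used on the later slides; pitch estimation is not shown here.
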
6
Speaker Verifier Structure

The verification system is based on a sequence of
n LVQ networks

Diagram: speech signal → speech modeling → MFCC and pitch → LVQ1, LVQ2, ..., LVQn → decision logic
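
A rough sketch of how a series of LVQ networks could score one segment is given below. It assumes each network is a codebook of labelled codewords and that a network's score is the fraction of frames whose nearest codeword is labelled "speaker"; the slides do not say how the per-network value is computed, so that part is a guess. The function names and the 2-D toy vectors are hypothetical (the real input would be 17-dimensional).

```python
import math

def nearest_label(codebook, x):
    """codebook: list of (vector, label); return the label of the closest codeword."""
    best = min(codebook, key=lambda cw: math.dist(cw[0], x))
    return best[1]

def speaker_fractions(networks, frames):
    """For each network, the fraction of frames classified as 'speaker'."""
    scores = []
    for codebook in networks:
        hits = sum(1 for x in frames if nearest_label(codebook, x) == "speaker")
        scores.append(hits / len(frames))
    return scores

# Two toy networks, each trained against a different impostor (hypothetical data).
networks = [
    [((0, 0), "speaker"), ((10, 10), "impostor")],
    [((0, 0), "speaker"), ((2, 2), "impostor")],
]
frames = [(0.5, 0.5), (1.8, 1.8), (0, 1), (9, 9)]
print(speaker_fractions(networks, frames))  # [0.75, 0.5]
```

The list of per-network scores is what the decision logic would consume.
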
7
LVQ Training

Each network is trained on the speaker vs. one of the impostors
Input vector: 16th-order MFCC and the pitch
18 codewords, each assigned to one of the final classes
45 seconds of training data

Diagram: inputs x1, x2, ..., x17 → LVQ layer (weights W) → outputs: speaker, impostor
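
The training step for one speaker-vs-impostor network might look like the sketch below. The slides do not state which LVQ variant was used, so the basic LVQ1 rule is assumed here: the winning codeword moves toward the sample when labels match and away from it otherwise. The learning rate, epoch count, linear decay, and 2-D toy data are illustrative assumptions; the real network would use 17-dimensional inputs and 18 codewords.

```python
import math
import random

def train_lvq1(samples, codebook, lr=0.05, epochs=20, seed=0):
    """samples: list of (vector, label); codebook: list of [vector, label], updated in place."""
    rng = random.Random(seed)
    samples = list(samples)
    for epoch in range(epochs):
        rng.shuffle(samples)
        rate = lr * (1 - epoch / epochs)  # linearly decaying learning rate
        for x, label in samples:
            winner = min(codebook, key=lambda cw: math.dist(cw[0], x))
            # Attract the winner if its class matches the sample, repel otherwise.
            sign = 1.0 if winner[1] == label else -1.0
            winner[0] = [w + sign * rate * (xi - w) for w, xi in zip(winner[0], x)]
    return codebook

# Toy data: speaker frames near (0, 0), impostor frames near (5, 5).
samples = [
    ((0.0, 0.0), "speaker"), ((0.2, -0.1), "speaker"), ((-0.1, 0.3), "speaker"),
    ((5.0, 5.0), "impostor"), ((5.2, 4.9), "impostor"), ((4.8, 5.1), "impostor"),
]
codebook = [[[1.0, 1.0], "speaker"], [[4.0, 4.0], "impostor"]]
train_lvq1(samples, codebook)
```

After training, each codeword has drifted toward the cluster of its own class, so a nearest-codeword lookup separates the two toy clusters.
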
8
Decision making

LVQ: highly discriminative
Decision making is based on 3-second segments;
every network gives a probability value for each
segment
Every ANN has to score over 50%, half of those
over 60%
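
The acceptance rule on this slide can be written out as a short function. This is one reading of the bullet: every network must score above the lower threshold and at least half of the networks must also exceed the higher one. The "at least half" interpretation, the function name, and the parameter names are assumptions.

```python
def accept(scores, floor=0.50, strong=0.60):
    """scores: one value in [0, 1] per LVQ network for a 3-second segment."""
    if not scores:
        return False
    all_pass = all(s > floor for s in scores)
    n_strong = sum(1 for s in scores if s > strong)
    return all_pass and 2 * n_strong >= len(scores)

print(accept([0.55, 0.65, 0.70, 0.52]))  # True: all above 0.50, two of four above 0.60
print(accept([0.55, 0.65, 0.45]))        # False: one network is below 0.50
```

Requiring every network to pass means a single confident impostor match is enough to reject the claim.
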

9
Results and Conclusions

Eight subjects, 8 kHz data, 45 s training, 3 s
recognition
96% final recognition accuracy
Further development: higher quality input data
(16 kHz), use of delta mel coefficients,
Markov modeling

10
Thank you for your attention.
