Chap 6. Speech Signal Representations

Description:

Impulse train with the same period in the cepstral domain. Cepstrum of a windowed signal. Example ... candidates in each frame. Define a cost function. Viterbi search ...

Slides: 27

Provided by: yihru

1
Chap 6. Speech signal Representations

Short-time Fourier analysis
(1) The speech signal is time-varying
(2) It is short-time stationary
(3) Time domain → frequency domain using the FFT
(4) Windowing function (FFT size)
The Hamming window is the most frequently used one.
For 8 kHz, FFT size 256 (240 samples = 30 ms, zero-padded)
For 16 kHz, FFT size 512 (480 samples = 30 ms, zero-padded)
(5) Spectrograms
Narrow-band and wide-band spectrograms (FFT resolution, output filter)
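
The windowing and zero-padding steps above can be sketched for a single frame. This is a minimal sketch, assuming the usual Hamming definition w[n] = 0.54 - 0.46 cos(2πn/(N-1)); the function names and the 1 kHz test tone are illustrative, and a direct DFT sum stands in for the FFT so that only the standard library is needed.

```python
import cmath
import math

def hamming(n_samples):
    """Hamming window: 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1))
            for n in range(n_samples)]

def dft_magnitude(x):
    """Magnitudes of DFT bins 0..N/2 (direct O(N^2) sum, not an FFT)."""
    n_fft = len(x)
    mags = []
    for k in range(n_fft // 2 + 1):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n in range(n_fft))
        mags.append(abs(acc))
    return mags

def short_time_spectrum(frame, n_fft):
    """Window one frame, zero-pad it to n_fft samples, return |X[k]|."""
    w = hamming(len(frame))
    padded = [s * wn for s, wn in zip(frame, w)] + [0.0] * (n_fft - len(frame))
    return dft_magnitude(padded)

fs = 8000          # sampling rate in Hz
frame_len = 240    # 30 ms at 8 kHz
n_fft = 256        # transform size reached by zero-padding
tone_hz = 1000.0   # illustrative test tone, not from the slides
frame = [math.sin(2 * math.pi * tone_hz * n / fs) for n in range(frame_len)]

spectrum = short_time_spectrum(frame, n_fft)
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
peak_hz = peak_bin * fs / n_fft   # bin spacing is fs / n_fft = 31.25 Hz
```

Stacking such spectra over successive frames gives a spectrogram; a longer window narrows the analysis bandwidth (narrow band), a shorter one widens it (wide band).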

2
Source-filter model
Nostrils (nasal tract): an all-pole model is not good enough
Lips model: usually $1 - \alpha z^{-1}$ with $\alpha$ = 0.9, 0.95, or 0.97
All-pole model (LPC)
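
The first-order lips filter above amounts to a one-line difference equation. This is a minimal sketch; the coefficient name `alpha` and the function name are assumptions made here for illustration, since the original symbol did not survive extraction.

```python
def pre_emphasis(x, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1], i.e. the filter 1 - alpha*z^-1.

    The first sample is passed through unchanged (zero initial state).
    """
    y = [x[0]] if x else []
    for n in range(1, len(x)):
        y.append(x[n] - alpha * x[n - 1])
    return y

# A constant (0 Hz) input is strongly attenuated after the first sample.
constant = [1.0] * 5
emphasized = pre_emphasis(constant, alpha=0.95)
```

Values of alpha closer to 1 suppress low frequencies more strongly.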
3
Linear Prediction Coding (LPC)

Linear prediction (AR model)
According to the lossless tube model (lattice formulation, introduced later):
8 kHz sampling, c = 340 m/s, L = 17 cm → N = 8 (2 poles per 1 kHz)
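
The order estimate above can be checked by hand. This is a worked sketch, assuming an idealized uniform lossless tube (closed at the glottis, open at the lips), so that resonances are spaced c/(2L) apart and each resonance takes one complex pole pair:

```latex
\Delta F = \frac{c}{2L} = \frac{340}{2 \times 0.17} = 1000\ \text{Hz}
\qquad
N = 2 \cdot \frac{F_s / 2}{\Delta F} = \frac{2 L F_s}{c}
  = \frac{2 \times 0.17 \times 8000}{340} = 8
```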

Solve for the linear prediction coefficients using the
MMSE criterion
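
The MMSE criterion leads to a set of linear (normal) equations. This is a sketch of the standard derivation, assuming the sign convention in which the predictor is a weighted sum of the p previous samples; the symbols e, E and φ are notation introduced here:

```latex
e[n] = x[n] - \sum_{k=1}^{p} a_k\, x[n-k], \qquad E = \sum_{n} e^2[n]

\frac{\partial E}{\partial a_i} = 0
\;\Longrightarrow\;
\sum_{k=1}^{p} a_k\, \phi(i,k) = \phi(i,0), \quad i = 1, \dots, p,
\qquad \phi(i,k) = \sum_{n} x[n-i]\, x[n-k]
```

The covariance and autocorrelation methods differ only in the range of n used in the sums.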

Covariance method

Solution of the covariance method using Cholesky
decomposition
(1) Solve for V and D

(2) Solve for A
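
The two steps above can be sketched in code. This is a minimal sketch, assuming the factorization Φ = V D Vᵀ with V unit lower triangular and D diagonal, and the predictor convention x̂[n] = Σ a_k x[n-k]; the function name and the synthetic test signal are illustrative only.

```python
def covariance_lpc(x, order):
    """Covariance-method LPC solved through a V D V^T factorization."""
    p = order
    # phi[i][k] = sum over n = p..N-1 of x[n-i] * x[n-k]
    phi = [[sum(x[n - i] * x[n - k] for n in range(p, len(x)))
            for k in range(p + 1)] for i in range(p + 1)]

    # Step 1: factor the p x p matrix phi[1..p][1..p] into V (unit lower
    # triangular) and D (diagonal).
    V = [[0.0] * p for _ in range(p)]
    D = [0.0] * p
    for i in range(p):
        for j in range(i):
            s = phi[i + 1][j + 1] - sum(V[i][m] * D[m] * V[j][m]
                                        for m in range(j))
            V[i][j] = s / D[j]
        D[i] = phi[i + 1][i + 1] - sum(V[i][m] ** 2 * D[m] for m in range(i))
        V[i][i] = 1.0

    # Step 2: solve V y = psi (forward), then V^T a = D^-1 y (backward).
    psi = [phi[i + 1][0] for i in range(p)]
    y = [0.0] * p
    for i in range(p):
        y[i] = psi[i] - sum(V[i][m] * y[m] for m in range(i))
    a = [0.0] * p
    for i in reversed(range(p)):
        a[i] = y[i] / D[i] - sum(V[m][i] * a[m] for m in range(i + 1, p))
    return a

# Synthetic signal that obeys x[n] = 1.5 x[n-1] - 0.8 x[n-2] exactly.
x = [1.0, 0.75]
for n in range(2, 50):
    x.append(1.5 * x[n - 1] - 0.8 * x[n - 2])
coeffs = covariance_lpc(x, order=2)
```

Because the test signal follows the recursion exactly, the recovered coefficients should match 1.5 and -0.8 up to rounding.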

Autocorrelation method

R is Toeplitz
Use the Levinson-Durbin algorithm

The algorithm converts the ladder filter → lattice filter; lattice filter → cascade form
Ei: the squared prediction error
Ki: the coefficients of the lattice filter (reflection coefficients)
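
The recursion can be sketched as follows. This is a minimal sketch, assuming the predictor convention x̂[n] = Σ a_k x[n-k] (other texts flip the sign of the reflection coefficients); the function names and the example autocorrelation values are illustrative.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum_n x[n] * x[n + lag] for lag = 0..max_lag."""
    return [sum(x[n] * x[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations from autocorrelations r[0..order].

    Returns (a, ks, errors):
      a      -- predictor coefficients a_1..a_p (a[0] holds a_1)
      ks     -- reflection coefficients K_1..K_p
      errors -- prediction error energies E_0..E_p
    """
    a = []
    ks = []
    errors = [r[0]]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k_i = acc / errors[-1]
        # Order update: a_j(i) = a_j(i-1) - K_i * a_{i-j}(i-1), a_i(i) = K_i
        a = [a[j] - k_i * a[i - 2 - j] for j in range(i - 1)] + [k_i]
        ks.append(k_i)
        errors.append(errors[-1] * (1.0 - k_i * k_i))
    return a, ks, errors

# Autocorrelation of a first-order process, r[k] = 0.5**k (illustrative).
a, ks, errors = levinson_durbin([1.0, 0.5, 0.25], order=2)
```

For this input the second reflection coefficient comes out as zero, so raising the order from 1 to 2 does not reduce the error.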
10

Lattice filter
Define forward/backward prediction errors

LPC using Lattice filter
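
The forward and backward prediction errors can be computed stage by stage. This is a minimal sketch, assuming the recursions f_i[n] = f_{i-1}[n] - K_i b_{i-1}[n-1] and b_i[n] = b_{i-1}[n-1] - K_i f_{i-1}[n] with zero initial state; sign conventions for K_i differ between texts, and the names and sample values are illustrative.

```python
def lattice_analysis(x, ks):
    """Run the signal through an analysis lattice.

    Returns (f, b): the forward and backward prediction errors
    at the output of the last stage.
    """
    f = list(x)   # f_0[n] = x[n]
    b = list(x)   # b_0[n] = x[n]
    for k in ks:
        b_delayed = [0.0] + b[:-1]   # b_{i-1}[n-1], zero initial state
        f_next = [fn - k * bd for fn, bd in zip(f, b_delayed)]
        b_next = [bd - k * fn for fn, bd in zip(f, b_delayed)]
        f, b = f_next, b_next
    return f, b

x = [1.0, 2.0, 3.0, 4.0]
ks = [0.5, -0.3]
f_err, b_err = lattice_analysis(x, ks)
```

With these two reflection coefficients the forward error equals the direct-form residual x[n] - 0.65 x[n-1] + 0.3 x[n-2].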

12
Spectral analysis vs. LPC

LPC spectrum
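
The LPC spectrum is the magnitude response of the all-pole filter G / A(z). This is a minimal sketch, assuming A(z) = 1 - Σ a_k z^-k; the function name and the two-pole example (pole radius 0.9, pole angle π/4, i.e. 1 kHz at an 8 kHz sampling rate) are illustrative.

```python
import cmath
import math

def lpc_spectrum(a, gain, n_points):
    """|G / A(e^jw)| at n_points frequencies from 0 to pi (inclusive)."""
    mags = []
    for m in range(n_points):
        w = math.pi * m / (n_points - 1)
        a_of_w = 1.0 - sum(a_k * cmath.exp(-1j * w * (k + 1))
                           for k, a_k in enumerate(a))
        mags.append(gain / abs(a_of_w))
    return mags

# Two-pole resonator: a_1 = 2 r cos(theta), a_2 = -r^2.
r, theta = 0.9, math.pi / 4
a = [2 * r * math.cos(theta), -r * r]
envelope = lpc_spectrum(a, gain=1.0, n_points=129)
peak_bin = max(range(len(envelope)), key=lambda m: envelope[m])
peak_hz = peak_bin * 4000.0 / 128   # bins span 0..4 kHz for fs = 8 kHz
```

The envelope peaks near the pole angle, which is how LPC spectra expose formant locations without the harmonic fine structure of the FFT spectrum.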

Prediction error vs. LPC order

14
Conversion between parameters

Reflection coefficients vs. LPC
Log-area ratios
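
The conversions can be sketched with the step-up and step-down recursions. This is a minimal sketch, assuming the predictor convention x̂[n] = Σ a_k x[n-k] and the log-area ratio definition g_i = ln((1 - k_i)/(1 + k_i)); both conventions vary between texts, and the function names are illustrative.

```python
import math

def reflection_to_lpc(ks):
    """Step-up recursion: reflection coefficients -> predictor coefficients."""
    a = []
    for i, k in enumerate(ks, start=1):
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
    return a

def lpc_to_reflection(a):
    """Step-down recursion: predictor coefficients -> reflection coefficients."""
    a = list(a)
    ks = []
    for i in range(len(a), 0, -1):
        k = a[i - 1]                 # K_i is the last coefficient at order i
        ks.append(k)
        a = [(a[j] + k * a[i - 2 - j]) / (1.0 - k * k) for j in range(i - 1)]
    ks.reverse()
    return ks

def log_area_ratios(ks):
    """g_i = ln((1 - k_i) / (1 + k_i)); requires |k_i| < 1."""
    return [math.log((1.0 - k) / (1.0 + k)) for k in ks]

ks = [0.5, -0.3]
a = reflection_to_lpc(ks)
ks_back = lpc_to_reflection(a)
lars = log_area_ratios(ks)
```

The round trip recovers the original reflection coefficients as long as every |k_i| is below 1, which is also the stability condition for the all-pole filter.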

15
Cepstral processing

Spectral vs. cepstral
The cepstrum is a homomorphic transformation
(deconvolution)
The block diagram

Cepstrum of a real signal

Cepstrum of a pole-zero function
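
The processing chain DFT → log magnitude → inverse DFT can be sketched directly. This is a minimal sketch of the real cepstrum using direct DFT sums from the standard library; the function names and the two-tap test sequence are illustrative.

```python
import cmath
import math

def dft(x):
    """Direct O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def real_cepstrum(x):
    """c[n] = IDFT( log |DFT(x)| ); the input must have no spectral zeros."""
    n = len(x)
    log_mag = [math.log(abs(v)) for v in dft(x)]
    return [sum(log_mag[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]

# x[n] = delta[n] + 0.5 delta[n-1], i.e. X(z) = 1 + 0.5 z^-1.
n_points = 64
x = [1.0, 0.5] + [0.0] * (n_points - 2)
cep = real_cepstrum(x)
```

For this sequence the complex cepstrum is -(-0.5)^n / n for n > 0, and the real cepstrum is its even part, so the first few values are approximately 0, 0.25 and -0.0625.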

18
LPC derived cepstrum
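
The cepstrum of the all-pole model can be obtained from the LPC coefficients by a recursion, without any transform. This is a minimal sketch, assuming H(z) = G / (1 - Σ a_k z^-k) and the standard recursion c[n] = a_n + Σ (k/n) c[k] a_{n-k}; the function name is illustrative.

```python
import math

def lpc_to_cepstrum(a, gain, n_ceps):
    """Cepstral coefficients c[0..n_ceps] of G / (1 - sum a_k z^-k)."""
    p = len(a)
    c = [math.log(gain)]                      # c[0] = ln G
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0     # the a_n term exists only for n <= p
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c.append(acc)
    return c

# One-pole example: ln(1 / (1 - 0.5 z^-1)) = sum_n (0.5**n / n) z^-n
cepstrum = lpc_to_cepstrum([0.5], gain=1.0, n_ceps=3)
```

The recursion yields an infinite cepstral sequence from a finite number of LPC coefficients; in practice it is truncated after a chosen number of terms.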

Cepstrum of the speech signal
Periodic excitation train
→ impulse train with the same period in the
cepstral domain
Cepstrum of a windowed signal
Example
(from Fundamentals of Speech Recognition, by L. Rabiner and B. H. Juang)

20
Mel-frequency Cepstrum (MFCC)

Change the frequency scale into Mel-scale

Frequency quantization?
21

M = 20 for Fs = 8 kHz, 24 for Fs = 16 kHz
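
The mel-scale warping can be sketched as a pair of conversion functions plus equally spaced points on the warped axis. This is a minimal sketch, assuming the common definition mel = 1125 ln(1 + f/700) and reading M as the number of triangular filters; both are assumptions, since the slide equations did not survive, and the function names are illustrative.

```python
import math

def hz_to_mel(f_hz):
    """One common mel-scale definition: 1125 * ln(1 + f / 700)."""
    return 1125.0 * math.log(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (math.exp(mel / 1125.0) - 1.0)

def mel_filter_edges(n_filters, fs):
    """n_filters + 2 frequencies in Hz, equally spaced in mel from 0 to fs/2.

    Filter m would span edges[m] .. edges[m + 2] and peak at edges[m + 1].
    """
    top = hz_to_mel(fs / 2.0)
    return [mel_to_hz(top * m / (n_filters + 1)) for m in range(n_filters + 2)]

edges_8k = mel_filter_edges(20, 8000)
edges_16k = mel_filter_edges(24, 16000)
```

The edges are close together at low frequencies and spread out at high frequencies, which is the frequency quantization the mel filter bank applies before the log and cosine transform steps.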

22
Pitch detector of speech signal

The speech signal is quasi-periodic because
speech is time-varying.
Find the pitch frequency (fundamental frequency, F0)
→ find the period of a discrete signal.
Assume the pitch contour is continuous
→ find a smooth pitch contour
→ a smoothing/contour-tracking algorithm is
needed.
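
Finding the period of one frame can be sketched with the autocorrelation method, one of several possible pitch detectors. This is a minimal sketch; the function name, the 60 to 400 Hz search range, and the synthetic 200 Hz test tone are assumptions for illustration, and the contour smoothing step is not covered here.

```python
import math

def autocorr_pitch(frame, fs, f0_min=60.0, f0_max=400.0):
    """Pick the lag with the largest autocorrelation inside the F0 search range.

    Returns (lag_in_samples, f0_in_hz).
    """
    lag_min = int(fs / f0_max)
    lag_max = min(int(fs / f0_min), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return best_lag, fs / best_lag

fs = 8000
frame = [math.sin(2 * math.pi * 200.0 * n / fs) for n in range(240)]  # 30 ms
period, f0 = autocorr_pitch(frame, fs)
```

On real speech the per-frame estimate often jumps to half or double the true period, which is why a contour-tracking stage over several candidates per frame is needed.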