Covariation and weighting of harmonically decomposed streams for ASR

Provided by: philipj5
Transcript and Presenter's Notes

1
Covariation and weighting of harmonically
decomposed streams for ASR
  • Introduction
  • Pitch-scaled harmonic filter
  • Recognition experiments
  • Results
  • Conclusion

Figure: production of /z/, with aperiodic and periodic components
2
Motivation and aims
  • Most speech sounds are either voiced or unvoiced,
    which have very different properties:
    • voiced: quasi-periodic signal from phonation
    • unvoiced: aperiodic signal from turbulence noise
  • Do these properties allow humans to recognize
    speech in noise?
  • Maybe we can use this information to help ASR
    by computing separate features for the two parts.
  • Are the two contributions complementary?

INTRODUCTION
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/
3
Voiced and unvoiced parts of a speech signal
  • Production of /z/: figure showing the aperiodic
    contribution and the periodic contribution
4
Pitch-scaled harmonic filter
Block diagram: the input s(n) passes through time shifting
into a bank of PSHF blocks, which output an aperiodic
waveform and a periodic waveform.
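
The idea of splitting a frame into harmonic and non-harmonic parts can be illustrated with a simplified sketch. It assumes that the pitch period is already known and that the analysis frame spans exactly four pitch periods, so that harmonics of the fundamental fall on every fourth DFT bin. The function name `harmonic_split`, the rectangular window and the four-period frame are illustrative assumptions; this is not the PSHF as implemented in the project.

```python
import numpy as np

def harmonic_split(frame, periods=4):
    """Split a frame into periodic and aperiodic estimates.

    Assumption: `frame` spans exactly `periods` pitch periods, so the
    harmonics of the fundamental sit on DFT bins 0, periods, 2*periods, ...
    """
    frame = np.asarray(frame, dtype=float)
    spectrum = np.fft.rfft(frame)

    # Keep only the bins that coincide with harmonics of the pitch.
    harmonic = np.zeros_like(spectrum)
    harmonic[::periods] = spectrum[::periods]

    # The periodic estimate repeats every len(frame) / periods samples.
    periodic = np.fft.irfft(harmonic, n=len(frame))
    # Whatever is left over is treated as the aperiodic estimate.
    aperiodic = frame - periodic
    return periodic, aperiodic

# Hypothetical test signal: a 50-sample pitch period plus a little noise.
rng = np.random.default_rng(0)
n = np.arange(200)
voiced = np.sin(2 * np.pi * n / 50)
noisy = voiced + 0.1 * rng.standard_normal(n.size)
periodic_part, aperiodic_part = harmonic_split(noisy)
```

By construction the two outputs sum back to the input frame, which is what allows separate features to be computed from each part without discarding any of the signal.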
METHOD
5
Decomposition example (waveforms)
Panels: original, periodic, aperiodic
6
Decomposition example (spectrograms)
Panels: original, periodic, aperiodic
7
Decomposition example (MFCC specs.)
Panels: original, periodic, aperiodic
8
Speech database: Aurora 2.0
  • From the TIdigits database of connected English digit
    strings (male and female speakers), filtered with
    G.712 at 8 kHz.

TRAIN
TEST
9
Description of the experiments
  • Baseline experiment: base
    • standard parameterisation of the original
      waveforms (i.e., MFCC, Δ, ΔΔ)
  • PCA experiments: pca26, pca78, pca13 and pca39
    • decorrelation of the feature vectors, and
      reduction of the number of coefficients
  • Split experiments: split, split1
    • adjustment of stream weights (periodic vs.
      aperiodic)
  • Caveat: pitch values were derived from clean
    speech files, for the entire database!
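
The decorrelation and reduction step in the PCA experiments can be sketched as follows. This is a minimal illustration using an eigendecomposition of the feature covariance; the function name `pca_project`, the random data and the dimensions (13 coefficients per stream, 26 stacked, 13 kept) are assumptions chosen only to echo the experiment names, not the configurations actually used.

```python
import numpy as np

def pca_project(features, n_keep):
    """Decorrelate feature vectors and keep the leading components.

    features: array of shape (frames, dims).
    Returns the projected features (frames, n_keep) and all component
    variances, sorted from largest to smallest.
    """
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)

    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    projected = centered @ eigvecs[:, :n_keep]
    return projected, eigvals

# Hypothetical, partly correlated features for the two streams.
rng = np.random.default_rng(0)
periodic_feats = rng.standard_normal((500, 13))
aperiodic_feats = 0.5 * periodic_feats + rng.standard_normal((500, 13))

stacked = np.hstack([periodic_feats, aperiodic_feats])  # 26 dimensions
reduced, variances = pca_project(stacked, n_keep=13)    # 13 dimensions
```

The returned variances show how much of the total variance each principal component carries, which is the quantity one would inspect when deciding how many coefficients to keep.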

10
Parameterisations
11
Full-sized PCA results
RESULTS
12
Variance of Principal Components
Plots: PCA26 and PCA39; conditions: clean and multi
13
PCA26 experiment results
Panels: CLEAN and MULTI
14
Summary of best PCA results
15
Split experiment results
16
Sample Split results
Note: the same stream-weight values were used in
training as in testing, for Split.
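
How stream weights trade off the periodic and aperiodic streams can be sketched with the usual multi-stream formulation, in which per-stream log-likelihoods are combined as a weighted sum. The function name `combined_log_likelihood`, the constraint that the two weights sum to one, and the numeric values are illustrative assumptions, not settings taken from the experiments.

```python
import numpy as np

def combined_log_likelihood(log_like_periodic, log_like_aperiodic,
                            weight_periodic):
    """Weighted sum of per-stream log-likelihoods.

    Assumption: the two stream weights sum to 1, so a single number
    sets the balance between the periodic and aperiodic streams.
    """
    weight_aperiodic = 1.0 - weight_periodic
    return (weight_periodic * np.asarray(log_like_periodic)
            + weight_aperiodic * np.asarray(log_like_aperiodic))

# Hypothetical per-state log-likelihoods for one frame.
log_like_p = np.array([-12.0, -9.5, -15.0])
log_like_a = np.array([-10.0, -14.0, -11.0])

# Sweep the periodic weight to see how the preferred state changes.
for w in (0.0, 0.5, 1.0):
    scores = combined_log_likelihood(log_like_p, log_like_a, w)
    print(w, scores, int(np.argmax(scores)))
```

A weight of 1.0 relies on the periodic stream alone and 0.0 on the aperiodic stream alone; intermediate values blend the two.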
17
Split1 experiment results
18
Summary of PCA and Split results
19
Conclusions
  • PSHF module split Aurora's speech waveforms into
    two synchronous streams (periodic and aperiodic)
    • large improvements over the single-stream
      Baseline
    • Split was better than all PCA combinations
    • PCA26/13 better than PCA78/39, and PCA13 best
    • Split1 marginally better than Split
  • Periodic speech segments give robustness to noise.
  • Further work
    • Modeling: how best to combine the streams?
    • LVCSR: evaluate front end on TIMIT (phone
      recognition).
    • Robust pitch tracking

CONCLUSION
20
COLUMBO PROJECT: Harmonic decomposition applied
to ASR
Philip J.B. Jackson (1) <p.jackson_at_surrey.ac.uk>
David M. Moreno (2) <davidm_at_talp.upc.es>
Javier Hernando (2) <javier_at_talp.upc.es>
Martin J. Russell (3) <m.j.russell_at_bham.ac.uk>
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/