The Exam (PowerPoint presentation transcript)

1
The Exam
  • The course is defined by the lectures, exercises,
    assignments, and the course book (the chapters
    are specified on the home page)
  • The following aids are allowed
  • An approved mathematical handbook (so far only
    Beta is approved. Show your favorite before the
    exam and it may be approved!)
  • A calculator, with the memory erased.
  • You don't need to sign up for the exam. (You must
    be registered for the course, though; check the list!)

2
Structure of the Exam
  • 4 parts corresponding to the 4 parts of the
    course (assignments 1-4; roughly chapters 6, 7,
    8-9, and 10).
  • Each part will have one question which requires a
    more verbal answer, and one question which
    requires some maths.
  • One question will consist of true/false
    statements. No motivations are allowed. (Correct
    answer +1 p, wrong answer -1 p.)

3
SSP Part 1 Speech production and analysis
  • Physiology of speech production.
  • Lossless tube model and its implications.
  • Basic phonetics.
  • Short-time Fourier analysis, the spectrogram.
  • Windowing.
  • LP analysis.
  • Deterministic/stochastic modeling.
  • Autocorrelation vs covariance method.
  • Cepstral analysis.

4
SSP Part 2 Speech Compression
  • Memoryless coding
  • Basic terminology, Uniform scalar quantizers,
    companding, gain adaptive quantization
  • Memory coding
  • DPCM, Transform coding
  • Masking in speech coding
  • Vector quantization
  • State of the art speech coders
  • Vocoders.
  • CELP.

5
SSP Part 3 Statistical Modeling of Speech
  • Gaussian mixture models
  • Log-likelihood as a modeling criterion. The EM
    algorithm.
  • Speaker identification.
  • Hidden Markov models
  • The forward algorithm.
  • Viterbi algorithm.
  • Isolated word recognition applications,
    small/large vocabulary.
  • (The notion of) continuous word recognition.

6
SSP Part 4 Environmental Robustness
  • Microphone arrays
  • Delay-and-sum beamforming.
  • Blind source separation
  • Maximum likelihood approach.
  • Decorrelation (whitening) is not enough.
  • Infomax algorithm makes components independent.
  • Single Channel Noise suppression
  • Wiener filter, Spectral subtraction.
  • Adaptive Echo Cancellation
  • LMS algorithm.

7
Speech production.
  • Physiology of speech production.
  • Vocal tract, vocal cords, nasal cavity, larynx,
    velum.
  • Lossless tube model and its implications.
  • AR modeling, formants.
  • Basic phonetics.
  • Voiced/unvoiced.
  • Vowels, consonants.
  • Phonemes, allophones.

8
Speech Analysis
  • Short-time Fourier analysis, the spectrogram.
  • Windowing. Resolution vs frame length.
  • How do sampling, windowing, and DFT calculation
    affect our picture of the frequency content of
    real-world signals?
  • LP analysis.
  • Terminology: whitening filter, production filter.
  • Deterministic and stochastic approach.
  • Autocorr. vs covariance method (deterministic
    approaches)
  • Levinson-Durbin (autocorr)
  • Spectral whiteness, spectrum modeling.
  • LSFs, reflection coeffs, LARs, LPC cepstrum.
  • Cepstral analysis.
  • Source-filter separation.
  • Generalize this method to other applications.
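As an illustration of the autocorrelation method listed above, here is a minimal Python sketch of the Levinson-Durbin recursion (not part of the course material; the sign convention for the predictor coefficients is an assumption spelled out in the docstring):

```python
def levinson_durbin(r, order):
    """Levinson-Durbin recursion for the autocorrelation method of LP.

    r: autocorrelation sequence r[0..order].
    Convention: predictor s_hat[n] = sum_i a[i] * s[n-i], so the
    whitening filter is A(z) = 1 - sum_i a[i] z^(-i).
    Returns (a, k, E): LP coefficients a[1..order], reflection
    coefficients, and the final prediction-error power."""
    a = [0.0] * (order + 1)
    k = [0.0] * (order + 1)
    E = r[0]
    for i in range(1, order + 1):
        # partial correlation of forward/backward prediction errors
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        ki = acc / E
        k[i] = ki
        a_new = a[:]
        a_new[i] = ki
        for j in range(1, i):
            a_new[j] = a[j] - ki * a[i - j]
        a = a_new
        E *= (1.0 - ki * ki)   # error power shrinks at each order
    return a[1:], k[1:], E

# Autocorrelation of an AR(1) process with coefficient 0.9: r[m] = 0.9**m.
# The order-2 model should recover a = [0.9, 0], with E = 1 - 0.81.
a, k, E = levinson_durbin([1.0, 0.9, 0.81, 0.729], order=2)
```

Note how the second reflection coefficient comes out zero: the AR(1) autocorrelation is fully explained by a first-order predictor.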

9
Memoryless Quantization
  • Basics
  • Decision regions, reconstruction levels, index,
    rate (bits/sample, bits/second),
    encoding/decoding, instantaneous distortion -
    average distortion
  • Uniform scalar quantizers,
  • overload/granular distortion, midrise/midtread,
    step size Δ, xmax, noise variance Δ²/12. Error
    spectrum.
  • Companding,
  • robustness aspect, pdf matching aspect
    (companding gain)
  • Gain-adaptive quantization,
  • backward/forward adaptation.
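The granular-noise variance Δ²/12 on the previous slide can be checked empirically; a short Python sketch (illustrative only, using a mid-rise quantizer and a uniform input that stays in the granular region):

```python
import math
import random

def midrise_quantize(x, step):
    """Uniform mid-rise quantizer: reconstruction levels at odd
    multiples of step/2 (no level at zero)."""
    return (math.floor(x / step) + 0.5) * step

# Empirical check of the granular-noise variance formula step**2 / 12:
# with a small step, the error is roughly uniform on [-step/2, step/2].
random.seed(0)
step = 0.1
xs = [random.uniform(-1.0, 1.0) for _ in range(200000)]
mse = sum((x - midrise_quantize(x, step)) ** 2 for x in xs) / len(xs)
theory = step ** 2 / 12.0
```

With 200 000 samples the measured error power lands within a fraction of a percent of Δ²/12.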

10
Memory-Based Quantization
  • DPCM,
  • prediction gain, closed loop/open loop,
    quantization error spectrum.
  • Transform coding
  • rudimentary knowledge.
  • Masking in speech coding,
  • Physiology of the ear. Noise masked by tone.
  • Whiteness of the quantization error versus a
    shaped error spectrum.
  • Vector quantization.
  • Benefits of VQ, optimization of VQs (K-means,
    GLA, LBG), split-VQ/multi-stage VQ, Spectrum VQ -
    spectral distortion.
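The closed-loop/open-loop distinction for DPCM above can be made concrete with a small Python sketch (illustrative; first-order predictor a = 0.9 and step size are arbitrary choices):

```python
import math

def quantize(x, step):
    """Mid-tread uniform quantizer of the prediction residual."""
    return round(x / step) * step

def dpcm_encode(signal, a=0.9, step=0.05):
    """Closed-loop DPCM: the predictor runs on the *reconstructed*
    signal, so it sees exactly what the decoder will see and
    quantization errors do not accumulate."""
    recon, codes = 0.0, []
    for s in signal:
        pred = a * recon
        e_q = quantize(s - pred, step)   # quantized residual is sent
        codes.append(e_q)
        recon = pred + e_q               # local decoder
    return codes

def dpcm_decode(codes, a=0.9, step=0.05):
    recon, out = 0.0, []
    for e_q in codes:
        recon = a * recon + e_q
        out.append(recon)
    return out

signal = [math.sin(0.1 * n) for n in range(200)]
decoded = dpcm_decode(dpcm_encode(signal))
```

Because the loop is closed, the reconstruction error equals the residual's quantization error and stays bounded by step/2 for every sample; an open-loop predictor would let the errors drift.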

11
State of the art speech coders
  • Vocoder.
  • Which parameters are used.
  • Drawbacks of our assignment vocoder.
  • CELP.
  • Analysis by synthesis
  • How is speech modeled in a CELP coder?
  • CELP encoding. Adaptive/fixed CB search.

12
Gaussian Mixture Models
  • Terminology
  • Components, weights, means, covariance matrices,
    diagonal model.
  • Log-likelihood as a modeling criterion.
  • What is the maximum log-likelihood.
  • When is it achieved.
  • The EM algorithm.
  • Intuitive feeling for why it works.
  • Comparison with VQ training.
  • Speaker identification.
  • Why is log-likelihood a good criterion to
    discriminate speakers?
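To build intuition for the EM algorithm and the log-likelihood criterion on this slide, here is a 1-D, two-component Python sketch (illustrative only; data, initialization, and the small variance floor are assumptions):

```python
import math

def gauss(x, m, v):
    return math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)

def loglik(data, w, mu, var):
    """Total log-likelihood of the data under the mixture."""
    return sum(math.log(sum(wk * gauss(x, mk, vk)
                            for wk, mk, vk in zip(w, mu, var)))
               for x in data)

def em_step(data, w, mu, var):
    """One EM iteration: responsibilities (E-step), then weighted
    re-estimates of the weights, means, and variances (M-step)."""
    K, N = len(w), len(data)
    gam = []
    for x in data:
        p = [wk * gauss(x, mk, vk) for wk, mk, vk in zip(w, mu, var)]
        s = sum(p)
        gam.append([pk / s for pk in p])
    Nk = [sum(g[k] for g in gam) for k in range(K)]
    w = [n / N for n in Nk]
    mu = [sum(g[k] * x for g, x in zip(gam, data)) / Nk[k] for k in range(K)]
    var = [max(sum(g[k] * (x - mu[k]) ** 2 for g, x in zip(gam, data)) / Nk[k],
               1e-6) for k in range(K)]    # floor avoids collapse
    return w, mu, var

data = [-2.2, -2.0, -1.8, -2.1, 1.9, 2.0, 2.2, 2.1]
w, mu, var = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
lls = [loglik(data, w, mu, var)]
for _ in range(10):
    w, mu, var = em_step(data, w, mu, var)
    lls.append(loglik(data, w, mu, var))
```

The key EM property is visible directly: the log-likelihood never decreases from one iteration to the next, and the means migrate to the two clusters.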

13
Hidden Markov Models
  • The forward algorithm.
  • When is it used? Complexity?
  • Viterbi algorithm.
  • When is it used? Complexity?
  • Isolated word recognition applications,
    small/large vocabulary.
  • What kind of model structure?
  • The problem of insufficient training data.
  • Tying.
  • What criterion is used during runtime?
  • (The notion of) continuous word recognition.
  • Rudimentary knowledge.
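The forward and Viterbi algorithms on this slide share the same O(T·N²) structure; a minimal Python sketch on a toy 2-state, 2-symbol HMM (illustrative; the model parameters are made up and strictly positive so the log domain is safe):

```python
import math

def forward(pi, A, B, obs):
    """Forward algorithm: P(obs | model) in O(T * N^2),
    instead of the O(N^T) sum over all state paths."""
    N = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][o]
                 for j in range(N)]
    return sum(alpha)

def viterbi(pi, A, B, obs):
    """Viterbi algorithm in the log domain: the single most likely
    state path, also O(T * N^2)."""
    N = len(pi)
    delta = [math.log(pi[i] * B[i][obs[0]]) for i in range(N)]
    backptr = []
    for o in obs[1:]:
        new, back = [], []
        for j in range(N):
            i = max(range(N), key=lambda i: delta[i] + math.log(A[i][j]))
            new.append(delta[i] + math.log(A[i][j]) + math.log(B[j][o]))
            back.append(i)
        delta = new
        backptr.append(back)
    path = [max(range(N), key=lambda i: delta[i])]
    for back in reversed(backptr):
        path.append(back[path[-1]])
    path.reverse()
    return path

pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]      # 2 states, 2 observation symbols
p = forward(pi, A, B, [0, 0, 1])
path = viterbi(pi, A, B, [0, 0, 1])
```

Note the difference in use: the forward probability scores a whole model (recognition), while Viterbi additionally recovers the best state alignment.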

14
Microphone Arrays
  • Delay-and-sum beamforming.
  • How do you calculate a directivity diagram?
  • Calculate the power of the delay-and-sum
    beamformer for a sinusoidal input.
  • Direction of arrival - steering angle.
  • DoA gives the actual delays in the channels;
    steering is achieved by introducing artificial
    delays.
  • How is the beam affected by frequency and
    steering angle?
  • Spatial aliasing.
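A directivity diagram for a delay-and-sum beamformer on a uniform linear array can be sketched in a few lines of Python (illustrative; array geometry, frequency, and angles are arbitrary choices):

```python
import cmath
import math

def das_response(M, d, f, theta, steer, c=340.0):
    """Magnitude response of a delay-and-sum beamformer on a uniform
    linear array of M mics with spacing d [m], for a sinusoid at
    f [Hz] arriving from angle theta (radians from broadside) when
    the beam is steered to angle steer."""
    k = 2 * math.pi * f / c
    # residual phase per mic: actual propagation delay minus the
    # artificial steering delay
    s = sum(cmath.exp(1j * k * m * d * (math.sin(theta) - math.sin(steer)))
            for m in range(M))
    return abs(s) / M

on_beam = das_response(M=8, d=0.05, f=1000.0, theta=0.3, steer=0.3)
off_beam = das_response(M=8, d=0.05, f=1000.0, theta=-0.5, steer=0.3)
```

Sweeping theta over [-π/2, π/2] traces the directivity diagram; the response is exactly 1 on the steering direction and falls off elsewhere. With d = 5 cm, frequencies above c/(2d) = 3.4 kHz violate the half-wavelength condition and grating lobes appear: spatial aliasing.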

15
Blind Source Separation
  • Maximum likelihood approach.
  • Does it make sense?
  • Independent components are all that is needed!
    Almost...
  • Decorrelation (whitening) is not enough.
  • Modeling the components as Gaussian leads to
    decorrelation.
  • Decorrelation corresponds to a factorization of
    the data covariance matrix, which always leaves
    an unknown rotation.
  • Infomax algorithm makes components independent.
  • Use a model density which has unique higher order
    moments.
  • If we modify the higher order moments of the
    output density to fit the higher order moments of
    our model density (where components are
    independent), we get independent output
    components.
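The "decorrelation leaves an unknown rotation" point above can be demonstrated numerically in Python (illustrative; a four-point whitened data set stands in for whitened mixtures):

```python
import math

def cov2(data):
    """2x2 sample covariance of zero-mean 2-D points,
    returned as (cxx, cxy, cyy)."""
    n = len(data)
    cxx = sum(x * x for x, _ in data) / n
    cxy = sum(x * y for x, y in data) / n
    cyy = sum(y * y for _, y in data) / n
    return cxx, cxy, cyy

def rotate(data, ang):
    c, s = math.cos(ang), math.sin(ang)
    return [(c * x - s * y, s * x + c * y) for x, y in data]

# Whitened data: zero mean, unit variance, uncorrelated components.
white = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
# Rotating whitened data by ANY angle leaves it whitened:
# cov(R x) = R I R^T = I, so second-order statistics cannot
# distinguish the rotations.
covs = [cov2(rotate(white, ang)) for ang in (0.0, 0.4, 1.1)]
```

Every rotated copy has the identity covariance, so a decorrelation criterion cannot pick out the true sources; that is why ICA must go to higher-order statistics (Infomax).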

16
Single Channel Noise Suppression
  • The Wiener filter and Spectral subtraction.
  • Frequency domain implementations.
  • Noise spectrum estimation, VAD, minimum
    statistics.
  • Clean speech spectrum estimation.
  • Relation between the Wiener filter and spectral
    subtraction.
  • Musical noise. Why?
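The relation between the Wiener filter and spectral subtraction asked for above can be shown per frequency bin; a Python sketch (illustrative; S, N, Y denote speech, noise, and noisy power spectra at one bin):

```python
import math

def wiener_gain(S, N):
    """Wiener filter gain per frequency bin: G = S / (S + N)."""
    return S / (S + N)

def specsub_gain(Y, N, floor=0.0):
    """Power spectral subtraction amplitude gain:
    G = sqrt(max(1 - N/Y, floor)). Bins rectified to the floor,
    varying randomly from frame to frame, produce musical noise."""
    return math.sqrt(max(1.0 - N / Y, floor))

# When the noisy power spectrum is exactly Y = S + N, the squared
# spectral-subtraction gain equals the Wiener gain.
S, N = 3.0, 1.0
g_w = wiener_gain(S, N)
g_ss = specsub_gain(S + N, N)
```

In practice Y and the noise estimate fluctuate, so isolated bins go negative and get clipped; those randomly surviving spectral peaks are heard as musical noise.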

17
Adaptive Echo Cancellation
  • Double-talk detection
  • Is needed so that we know when to switch on the
    adaptation.
  • LMS algorithm.
  • Adaptive Wiener filter.
  • Convergence properties.
  • Time constant.
  • Normalized LMS.
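The NLMS update on this slide fits in a few lines of Python; a sketch that identifies a known 3-tap echo path from white far-end noise (illustrative; filter order, step size, and the echo path h are made-up values):

```python
import random

def nlms(x, d, order=8, mu=0.5, eps=1e-8):
    """Normalized LMS: adapt FIR weights w so that w * x cancels the
    echo in the microphone signal d. Normalizing mu by the input
    power makes the step size (and hence the time constant)
    independent of the signal level. Returns (error signal, weights)."""
    w = [0.0] * order
    buf = [0.0] * order
    err = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))   # echo estimate
        e = dn - y                                   # residual echo
        p = sum(b * b for b in buf) + eps            # input power
        w = [wi + mu * e * bi / p for wi, bi in zip(w, buf)]
        err.append(e)
    return err, w

random.seed(1)
h = [0.5, -0.3, 0.1]                       # assumed echo path
x = [random.uniform(-1.0, 1.0) for _ in range(2000)]
d = [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
     for n in range(len(x))]
err, w = nlms(x, d)
```

The leading adapted taps converge to h and the residual echo decays by orders of magnitude; in a real canceller the adaptation would be frozen during double talk.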

18
Thank you, and good luck with the exam.