Title: Short Time Fourier Analysis
1. Short Time Fourier Analysis
2. Introduction
- Frequently used in speech processing.
- The properties of speech signals change as a function of time.
- The basic idea is to analyze the signal over short periods of time using a window function.
3. Time-Varying Fourier Transform (1)
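For reference, the usual definition of the time-varying (short-time) Fourier transform is given below. The notation follows Rabiner and Schafer, which the later slides cite; that this is the exact form intended here is an assumption. x(m) is the input signal and w(n-m) is the window, time-reversed and shifted to time n.

```latex
X_n(e^{j\omega}) = \sum_{m=-\infty}^{\infty} x(m)\, w(n-m)\, e^{-j\omega m}
```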
4. Time-Varying Fourier Transform (2)
IMPORTANT: The time-varying Fourier transform is a function of two variables: n, the discrete time index, and ω, the frequency variable.
In this example, high-frequency components increase with time.
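A minimal sketch of this two-variable behavior, assuming the usual definition X_n(e^{jω}) = Σ_m x(m) w(n-m) e^{-jωm}. The test signal (a linear chirp), the window length of 64, and the helper names `stft_at` and `dominant_bin` are illustrative choices, not taken from the slides. The transform is evaluated at an early and a late time index, and the strongest frequency bin is compared.

```python
import cmath
import math

def hamming(L):
    """Symmetric Hamming window of length L (same formula as MATLAB's hamming(L))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]

def stft_at(x, w, n, omega):
    """Evaluate X_n(e^{j omega}) = sum_m x(m) w(n - m) e^{-j omega m}.

    w is causal with length L, so only m = n-L+1 .. n contribute.
    Samples of x outside the list are treated as zero.
    """
    total = 0j
    for r, w_r in enumerate(w):          # r = n - m, so w_r = w(n - m)
        m = n - r
        if 0 <= m < len(x):
            total += x[m] * w_r * cmath.exp(-1j * omega * m)
    return total

def dominant_bin(x, w, n, num_bins):
    """Index k (0 .. num_bins/2) of the largest |X_n| at omega_k = 2*pi*k/num_bins."""
    mags = [abs(stft_at(x, w, n, 2 * math.pi * k / num_bins))
            for k in range(num_bins // 2 + 1)]
    return max(range(len(mags)), key=mags.__getitem__)

# Hypothetical test signal: a linear chirp whose frequency rises with time.
N = 1024
x = [math.cos(math.pi * 0.25 * m * m / N) for m in range(N)]
w = hamming(64)

early = dominant_bin(x, w, 200, 64)   # analysis window ends at n = 200
late = dominant_bin(x, w, 900, 64)    # analysis window ends at n = 900
print("dominant bin early:", early, "late:", late)
```

For the chirp, the dominant bin at the later time index is higher than at the earlier one, which is the behavior the slide describes.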
5. Window Shape
The shape of the window sequence has an important effect on the time-dependent Fourier transform.
We study the rectangular window and the Hamming window.
6. Rectangular Window (N = 64)
w = rectwin(64);
wvtool(w)
Narrow main lobe -> greater frequency resolution -> increased sharpness.
Large side lobes -> adjacent harmonics interact -> ragged, noisy spectrum.
7. Hamming Window (N = 64)
w = hamming(64);
wvtool(w)
Slightly wider main lobe. Much smaller side lobes and leakage factor. Better than the rectangular window.
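The main-lobe and side-lobe comparison can be checked numerically without wvtool. The following is a rough sketch in Python (standard library only) that evaluates each window's magnitude spectrum on a dense frequency grid and reports the main-lobe half-width (distance from ω = 0 to the first minimum) and the peak side-lobe level. The function names and the grid size are assumptions made for illustration; the Hamming formula is the symmetric one used by MATLAB's hamming.

```python
import cmath
import math

def rectwin(L):
    """Rectangular window of length L."""
    return [1.0] * L

def hamming(L):
    """Symmetric Hamming window of length L."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]

def magnitude_db(w, num_points=2048):
    """|W(e^{j omega})| in dB relative to its value at omega = 0, sampled on [0, pi)."""
    mags = []
    for i in range(num_points):
        omega = math.pi * i / num_points
        mags.append(abs(sum(w_k * cmath.exp(-1j * omega * k)
                            for k, w_k in enumerate(w))))
    peak = mags[0]
    # Clip at -240 dB so exact nulls do not produce log10(0).
    return [20 * math.log10(max(m / peak, 1e-12)) for m in mags]

def lobe_summary(w):
    """Return (main-lobe half-width in rad/sample, peak side-lobe level in dB)."""
    db = magnitude_db(w)
    i = 1
    while i < len(db) - 1 and db[i + 1] < db[i]:   # walk down the main lobe
        i += 1
    half_width = math.pi * i / len(db)
    return half_width, max(db[i:])                 # highest lobe beyond the first minimum

rect_width, rect_psl = lobe_summary(rectwin(64))
ham_width, ham_psl = lobe_summary(hamming(64))
print("rectangular: half-width %.3f rad, peak side lobe %.1f dB" % (rect_width, rect_psl))
print("hamming:     half-width %.3f rad, peak side lobe %.1f dB" % (ham_width, ham_psl))
```

The Hamming window's main lobe comes out roughly twice as wide as the rectangular window's, while its highest side lobe is far lower.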
8. Rectangular Window (Larger N)
N = 500. Obviously periodic. First formant peak at 300-400 Hz. Other peaks at 2200 Hz and 3800 Hz. (From Rabiner and Schafer)
9. Hamming Window (Larger N)
N = 500. Smoother spectrum. First formant peak at 300-400 Hz. Other peaks at 2200 Hz and 3800 Hz. (From Rabiner and Schafer)
10. Rectangular Window (N = 16)
w = rectwin(16);
wvtool(w)
The width of the main lobe is inversely proportional to the length of the window.
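For the rectangular window this inverse relationship can be read off the closed-form transform. The worked equation below assumes a causal rectangular window w(n) = 1 for 0 <= n <= N-1, and measures the main-lobe width between the first zeros on either side of ω = 0.

```latex
W(e^{j\omega}) = \sum_{n=0}^{N-1} e^{-j\omega n}
             = \frac{\sin(\omega N/2)}{\sin(\omega/2)}\, e^{-j\omega (N-1)/2},
\qquad
\text{first zero at } \omega = \frac{2\pi}{N}
\;\Rightarrow\;
\text{main-lobe width} = \frac{4\pi}{N}.
```

With N = 64 the width is 4π/64 ≈ 0.196 rad/sample; with N = 16 it is 4π/16 ≈ 0.785 rad/sample, four times wider.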
11. Hamming Window (N = 16)
w = hamming(16);
wvtool(w)
12. Rectangular Window (Smaller N)
N = 50. No periodicity. Higher frequency resolution than the Hamming window of the same length, but lower than with N = 500. Broad peaks at 400, 1400, and 2200 Hz. (From Rabiner and Schafer)
13. Hamming Window (Smaller N)
N = 50. No periodicity. Broad peaks at 400, 1400, and 2200 Hz. (From Rabiner and Schafer)
14. Summary
The basic idea is to analyze the signal over short periods of time with a window function.
The purpose of the window is to limit the time interval to be analyzed so that the properties of the waveform do not change appreciably.
A good window function should have a narrow main lobe and small side lobes. Ideally, the Fourier transform of the window would be an impulse.
Good temporal resolution requires a short window, while good frequency resolution calls for a long window.
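15. Fourier Transform Interpretation (1)
For a fixed n, $X_n(e^{j\omega})$ is the normal Fourier transform of the sequence $w(n-m)\,x(m)$, $-\infty < m < \infty$.
The input signal can be recovered exactly from the time-varying Fourier transform, provided that w(0) is nonzero.
Proof: the inverse Fourier transform gives $w(n-m)\,x(m) = \frac{1}{2\pi}\int_{-\pi}^{\pi} X_n(e^{j\omega})\,e^{j\omega m}\,d\omega$. Setting m = n yields $x(n) = \frac{1}{2\pi\,w(0)}\int_{-\pi}^{\pi} X_n(e^{j\omega})\,e^{j\omega n}\,d\omega$.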
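The recovery claim can be checked numerically. The sketch below uses a discrete-frequency analogue of the inverse transform (an assumption made so the check is a finite sum): X_n is sampled at K uniformly spaced frequencies with K at least the window length, and then x(n) = (1 / (K w(0))) Σ_k X_n(e^{jω_k}) e^{jω_k n}. The test signal and the names `stft_frame` and `recover_sample` are illustrative.

```python
import cmath
import math

def hamming(L):
    """Symmetric Hamming window of length L; note w(0) = 0.08, which is nonzero."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]

def stft_frame(x, w, n, num_freqs):
    """X_n(e^{j omega_k}) at omega_k = 2*pi*k/num_freqs, k = 0 .. num_freqs-1."""
    frame = []
    for k in range(num_freqs):
        omega = 2 * math.pi * k / num_freqs
        total = 0j
        for r, w_r in enumerate(w):      # r = n - m, so w_r = w(n - m)
            m = n - r
            if 0 <= m < len(x):
                total += x[m] * w_r * cmath.exp(-1j * omega * m)
        frame.append(total)
    return frame

def recover_sample(frame, w, n):
    """Recover x(n) from one frame; requires w(0) != 0 and len(frame) >= len(w)."""
    K = len(frame)
    acc = sum(X * cmath.exp(1j * 2 * math.pi * k * n / K)
              for k, X in enumerate(frame))
    return (acc / (K * w[0])).real

# Hypothetical test signal.
x = [math.sin(0.3 * m) + 0.5 * math.cos(1.1 * m) for m in range(200)]
w = hamming(32)
n = 120
x_hat = recover_sample(stft_frame(x, w, n, 32), w, n)
print("original:", x[n], "recovered:", x_hat)
```

Summing over the K frequencies cancels every term except the one with m = n, which leaves K x(n) w(0); dividing by K w(0) is only possible when w(0) is nonzero.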
16. Fourier Transform Interpretation (2)
Windowing: the Fourier transform of the sequence x(m) is convolved with the Fourier transform of the shifted window sequence w(n-m).
The time-dependent Fourier transform can therefore be interpreted as a smoothed version of the Fourier transform of the part of the signal within the window.
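Written out, the convolution takes the following form. This is a reconstruction under stated assumptions rather than the slide's own equation: $X(e^{j\omega})$ and $W(e^{j\omega})$ denote the ordinary Fourier transforms of x(m) and w(m), both assumed to exist, and the transform of the shifted, time-reversed window w(n-m) with respect to m is $W(e^{-j\omega})\,e^{-j\omega n}$.

```latex
X_n(e^{j\omega}) = \frac{1}{2\pi} \int_{-\pi}^{\pi}
    W(e^{-j\theta})\, e^{-j\theta n}\, X(e^{j(\omega-\theta)})\, d\theta
```

A window with a narrow main lobe and low side lobes smears $X(e^{j\omega})$ less, which is why the window shape matters.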
17. Fourier Transform Interpretation (3)
[Figure: Time Varying Spectral Display (TVSD); axes: frequency and time.]
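18. Linear Filtering Interpretation (1)
For each value of ω, $X_n(e^{j\omega})$ is the convolution of the sequence $x(n)\,e^{-j\omega n}$ with the sequence w(n).
The window w(n) acts as a lowpass filter on the modulated signal, so the output shows how the content at a particular frequency changes as time goes on.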
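The filtering view can be sketched as follows, assuming the definition X_n(e^{jω}) = Σ_m x(m) w(n-m) e^{-jωm}: modulate x(n) by e^{-jωn}, then convolve with w(n), and compare against direct evaluation. The test signal (a tone that switches on halfway through) and the function names are illustrative assumptions.

```python
import cmath
import math

def hamming(L):
    """Symmetric Hamming window of length L."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]

def stft_direct(x, w, n, omega):
    """Direct evaluation of X_n(e^{j omega}); x is zero outside the list."""
    return sum(x[n - r] * w_r * cmath.exp(-1j * omega * (n - r))
               for r, w_r in enumerate(w) if 0 <= n - r < len(x))

def stft_by_filtering(x, w, omega):
    """Modulate x(n) by e^{-j omega n}, then convolve with w(n); one output per n."""
    modulated = [x_m * cmath.exp(-1j * omega * m) for m, x_m in enumerate(x)]
    return [sum(w_r * modulated[n - r] for r, w_r in enumerate(w) if n - r >= 0)
            for n in range(len(x))]

# Hypothetical test signal: silence, then a tone at omega0 starting at n = 100.
omega0 = 0.25 * math.pi
x = [0.0] * 100 + [math.cos(omega0 * m) for m in range(100, 200)]
w = hamming(32)

track = stft_by_filtering(x, w, omega0)   # X_n(e^{j omega0}) as a function of n
envelope = [abs(v) for v in track]
print("before onset:", envelope[50], "after onset:", envelope[180])
```

The magnitude of the filter output stays at zero before the tone starts and rises once the window covers the tone, which is the change over time at one frequency that the slide refers to.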
19. Linear Filtering Interpretation (2)
[Figure: Time Varying Spectral Display (TVSD); axes: frequency and time.]
20. Speech Terminology
- Voiced speech: caused by excitation from a periodic sound source.
- Unvoiced speech: caused by aperiodic noise.
- Formant: the major resonances are called formants, and they appear as dark bands. Typical spacing of formants is 1 kHz. The range of possible bandwidths is limited to 30-500 Hz.
- Start: the start of voiced speech after a gap.
- Stop: a short period of silence following voiced speech.
21. Speech Spectrum (Voice Print)
[Figure: speech spectrogram with labeled regions: Unvoiced, Start, End, Voiced, Formant.]
24. Overlapping Windows
[Figure: Time Varying Spectral Display (TVSD) computed with 50% overlapping windows.]
25. Required Sampling Rate in Time Dimension
Suppose the effective bandwidth of the window is B Hz. The sequence $X_n(e^{j\omega})$, viewed as a function of n for fixed ω, has the same bandwidth as the window.
According to the sampling theorem, the sampling rate should be at least 2B samples/second to avoid aliasing.
Example: the approximate bandwidth of the Hamming window is B = 2Fs/L, where Fs is the sampling rate of the original signal x(n) and L is the window width. Suppose Fs = 10000 Hz and L = 100. Then B = 200 Hz and 2B = 400 times/second, i.e. one analysis every 25 samples.
Since 25 is less than L = 100, successive windows overlap.
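The arithmetic in this example can be sketched as a small helper. The function name `time_sampling` and the parameter `c` are hypothetical; `c = 2` encodes the slide's Hamming approximation B = 2Fs/L.

```python
def time_sampling(fs_hz, window_len, c=2.0):
    """Minimum frame rate and hop size for a window with bandwidth B = c * Fs / L.

    c = 2.0 is the Hamming approximation used in the example above.
    """
    bandwidth_hz = c * fs_hz / window_len      # B
    frame_rate_hz = 2.0 * bandwidth_hz         # sampling theorem: at least 2B
    hop_samples = fs_hz / frame_rate_hz        # input samples between analyses
    overlap = 1.0 - hop_samples / window_len   # fraction of the window shared
    return bandwidth_hz, frame_rate_hz, hop_samples, overlap

b, rate, hop, overlap = time_sampling(10000.0, 100)
print(b, rate, hop, overlap)  # 200.0 400.0 25.0 0.75
```

With a hop of 25 samples and a window of 100 samples, each window shares 75% of its samples with the next one.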
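26. Required Sampling Rate in Frequency Dimension
The inverse Fourier transform of $X_n(e^{j\omega})$ is the signal $w(n-m)\,x(m)$, and this signal has a duration of L samples (the width of the window).
$X_n(e^{j\omega})$ should therefore be sampled at a set of at least L uniformly spaced frequencies, for example $\omega_k = 2\pi k/L$, $k = 0, 1, \ldots, L-1$.
Example: for a Hamming window of duration L = 100, $X_n(e^{j\omega})$ must be evaluated at 100 or more uniformly spaced frequencies.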
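A rough numerical check of the frequency-sampling requirement: the spectrum of an L-sample windowed segment is sampled at K uniformly spaced frequencies and inverted. With K = L the segment comes back exactly; with K = L/2 the two halves of the segment fold onto each other (time aliasing). The test signal and helper names are illustrative, and the segment is indexed from 0, so its spectrum differs from $X_n(e^{j\omega})$ only by a linear phase factor.

```python
import cmath
import math

def hamming(L):
    """Symmetric Hamming window of length L."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (L - 1)) for k in range(L)]

def sample_spectrum(segment, num_freqs):
    """Fourier transform of a finite segment at omega_k = 2*pi*k/num_freqs."""
    return [sum(s * cmath.exp(-2j * math.pi * k * m / num_freqs)
                for m, s in enumerate(segment))
            for k in range(num_freqs)]

def inverse_dft(samples):
    """Inverse DFT of K frequency samples; returns K real time samples."""
    K = len(samples)
    return [(sum(S * cmath.exp(2j * math.pi * k * m / K)
                 for k, S in enumerate(samples)) / K).real
            for m in range(K)]

L = 100
w = hamming(L)
x = [math.sin(0.2 * m) + 0.3 * math.sin(0.9 * m) for m in range(300)]  # hypothetical signal
n = 199
segment = [x[m] * w[n - m] for m in range(n - L + 1, n + 1)]           # w(n - m) x(m)

enough = inverse_dft(sample_spectrum(segment, L))        # K = L frequencies
too_few = inverse_dft(sample_spectrum(segment, L // 2))  # K = L/2 frequencies

err_enough = max(abs(a - b) for a, b in zip(enough, segment))
err_too_few = max(abs(a - b) for a, b in zip(too_few, segment))
print("max error with L samples:", err_enough, "with L/2 samples:", err_too_few)
```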
27. Total Sampling Rate
SR = 2BL samples/sec (2B analyses per second, each with L frequency samples)
For most practical windows, B can be represented as B = C Fs / L.
SR = 2C Fs samples/sec
2C is the over-sampling ratio of the short-time analysis. 2C is usually between 2 (rectangular window) and 4 (Hamming window).
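A minimal sketch of this calculation, using the values Fs = 10000 Hz and L = 100 from the earlier time-dimension example. The function name `total_sampling_rate` is hypothetical; C = 1 and C = 2 are the values implied by the stated over-sampling ratios of 2 and 4.

```python
def total_sampling_rate(fs_hz, window_len, c):
    """SR = 2 * B * L with B = C * Fs / L, which simplifies to 2 * C * Fs."""
    bandwidth_hz = c * fs_hz / window_len       # B = C * Fs / L
    return 2.0 * bandwidth_hz * window_len      # 2B frames/sec times L frequencies

fs = 10000.0
rect_rate = total_sampling_rate(fs, 100, 1.0)   # rectangular window, C = 1
ham_rate = total_sampling_rate(fs, 100, 2.0)    # Hamming window, C = 2
print(rect_rate / fs, ham_rate / fs)  # 2.0 4.0
```

The window length cancels out, so the over-sampling ratio depends only on the window shape through C.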
28. Summary
The basic idea is to analyze the signal over short periods of time with a window function.
A good window function should have a narrow main lobe and small side lobes. Ideally, the Fourier transform of the window would be an impulse.
Good temporal resolution requires a short window, while good frequency resolution calls for a long window.
Use overlapping windows to obtain both good temporal and good frequency resolution.