Title: Blind Source Separation: Finding Needles in Haystacks
1. Blind Source Separation: Finding Needles in Haystacks
Scott C. Douglas
Department of Electrical Engineering
Southern Methodist University
douglas_at_lyle.smu.edu
2. Signal Mixtures are Everywhere
- Cell Phones
- Radio Astronomy
- Brain Activity
- Speech/Music
How do we make sense of it all?
3. Example: Speech Enhancement
4-7. Example: Wireless Signal Separation
8. Outline of Talk
- Blind Source Separation
  - General concepts and approaches
- Convolutive Blind Source Separation
  - Application to multi-microphone speech recordings
- Complex Blind Source Separation
  - What differentiates the complex-valued case
- Conclusions
9. Blind Source Separation (BSS): A Simple Math Example
[Block diagram: s(k) -> A -> x(k) -> B -> y(k)]
- Let $s_1(k), s_2(k), \ldots, s_m(k)$ be signals of interest.
- Measurements: for $1 \le i \le m$,
  $x_i(k) = a_{i1} s_1(k) + a_{i2} s_2(k) + \cdots + a_{im} s_m(k)$
- Sensor noise is neglected.
- Dispersion (echo/reverberation) is absent.
10. Blind Source Separation Example (continued)
[Block diagram: s(k) -> A -> x(k) -> B -> y(k)]
- Can show: the $s_i(k)$'s can be recovered as
  $y_i(k) = b_{i1} x_1(k) + b_{i2} x_2(k) + \cdots + b_{im} x_m(k)$
  up to permutation and scaling factors (the matrix $B$ is like the inverse of matrix $A$).
- Problem: how do you find the demixing $b_{ij}$'s when you don't know the mixing $a_{ij}$'s or $s_j(k)$'s?
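The mixing and demixing model can be sketched numerically. This is a minimal NumPy illustration, not code from the talk: the matrix values, the Laplacian test sources, and the helper names `mix` and `demix` are assumptions chosen for the example. It shows that the inverse of A recovers the sources when A is known, and that a permuted and rescaled version of that inverse is an equally valid answer, which is the ambiguity a blind method has to live with.

```python
import numpy as np

def mix(A, s):
    """Instantaneous mixing x(k) = A s(k); s has shape (m, K)."""
    return A @ s

def demix(B, x):
    """Linear demixing y(k) = B x(k)."""
    return B @ x

rng = np.random.default_rng(0)
m, K = 3, 1000
s = rng.laplace(size=(m, K))             # hypothetical sources
A = np.array([[1.0, 0.5, 0.2],
              [0.3, 1.0, 0.4],
              [0.6, 0.1, 1.0]])           # hypothetical mixing matrix
x = mix(A, s)

# If A were known, B = inv(A) would recover the sources directly.
B = np.linalg.inv(A)
y = demix(B, x)

# D @ P @ inv(A) is just as good: P reorders the outputs, D rescales them.
P = np.eye(m)[[2, 0, 1]]                  # row permutation
D = np.diag([2.0, -1.0, 0.5])             # nonzero scale factors
y_alt = demix(D @ P @ B, x)               # rows: 2*s[2], -s[0], 0.5*s[1]
```

In the blind setting neither `A` nor `s` is available, so `B` has to be estimated from `x` alone.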
11. Why Blind Source Separation? (Why not Traditional Beamforming?)
- BSS requires no knowledge of sensor geometry. The system can be uncalibrated, with unmatched sensors.
- BSS does not need knowledge of source positions relative to the sensor array.
- BSS requires little to no knowledge of signal types, so decisions/detections can be pushed to the end of the processing chain.
12. What Properties Are Necessary for BSS to Work?
- Separation can be achieved when
  - (# of sensors) ≥ (# of sources)
  - The talker signals $s_j(t)$ are statistically independent of each other and
    - are non-Gaussian in amplitude, OR
    - have spectra that differ from each other, OR
    - are non-stationary.
- Statistical independence is the critical assumption.
13. Entropy is the Key to Source Separation
- Entropy: a measure of regularity.
- In BSS, separated signals are demixed and, as a group, have more order.
- First used in 1996 for speech separation.
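One common way to make the entropy idea precise is through mutual information. This is offered as background and is not necessarily the exact criterion shown in the talk. With $H(\cdot)$ denoting the (differential) entropy of the outputs $y_i(k)$:

```latex
I(y_1, \dots, y_m) \;=\; \sum_{i=1}^{m} H(y_i) \;-\; H(y_1, \dots, y_m) \;\ge\; 0
```

Equality holds if and only if the outputs are statistically independent, so minimizing this quantity over the demixing matrix $B$ is one standard way to state the separation goal.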
14. Convolutive Blind Source Separation
- Mixing system is dispersive
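A dispersive mixing system replaces each scalar mixing gain with a filter, so every sensor hears delayed and echoed copies of every source. The sketch below is an assumed FIR model with made-up tap values and a hypothetical helper `convolutive_mix`. It is meant only to show the structure $x_i(k) = \sum_j \sum_l a_{ij}(l)\, s_j(k-l)$, not a measured room response.

```python
import numpy as np

def convolutive_mix(a, s):
    """x_i(k) = sum_j sum_l a[i, j, l] * s_j(k - l).

    a: (n_sensors, n_sources, L) FIR mixing filters
    s: (n_sources, K) source signals
    returns x: (n_sensors, K)
    """
    n_sensors, n_sources, _ = a.shape
    K = s.shape[1]
    x = np.zeros((n_sensors, K))
    for i in range(n_sensors):
        for j in range(n_sources):
            # Causal filtering: full convolution truncated to K samples
            x[i] += np.convolve(s[j], a[i, j])[:K]
    return x

rng = np.random.default_rng(1)
s = rng.laplace(size=(2, 500))           # hypothetical sources
a = np.zeros((2, 2, 3))                  # hypothetical 3-tap paths
a[0, 0] = [1.0, 0.0, 0.0]                # source 1 -> sensor 1, direct
a[0, 1] = [0.0, 0.5, 0.2]                # source 2 -> sensor 1, delayed with echo
a[1, 0] = [0.0, 0.4, 0.1]                # source 1 -> sensor 2, delayed with echo
a[1, 1] = [1.0, 0.0, 0.0]                # source 2 -> sensor 2, direct
x = convolutive_mix(a, s)
```

With a single tap per path (L = 1) this reduces to the instantaneous model x(k) = A s(k).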
15. Goal of Convolutive BSS
- Key idea: for convolutive BSS, the separated outputs are arbitrarily filtered and arbitrarily shuffled (permuted) versions of the sources.
16. Non-Gaussian-Based Blind Source Separation
- Basic Goal: make the output signals look non-Gaussian, because mixtures look more Gaussian (from the Central Limit Theorem).
- Criteria Based On This Goal:
  - Density Modeling
  - Contrast Functions
  - Property Restoral, e.g. (Non-)Constant Modulus Algorithm
- Implications:
  - Separating capability of the criteria will be similar.
  - Implementation details (e.g. optimization strategy) will yield performance differences.
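The Central Limit Theorem argument can be checked with a quick experiment. The sketch below uses excess kurtosis as one simple non-Gaussianity measure (zero for a Gaussian). The Laplacian sources, the equal mixing weights, and the helper name `excess_kurtosis` are assumptions made for illustration.

```python
import numpy as np

def excess_kurtosis(v):
    """Sample excess kurtosis: 0 for a Gaussian, > 0 for heavier tails."""
    v = v - v.mean()
    return np.mean(v**4) / np.mean(v**2) ** 2 - 3.0

rng = np.random.default_rng(2)
K = 200_000
s = rng.laplace(size=(4, K))             # hypothetical independent non-Gaussian sources
mixture = s.sum(axis=0) / np.sqrt(4)     # equal-weight mixture of all four

k_source = excess_kurtosis(s[0])
k_mixture = excess_kurtosis(mixture)
print(f"source: {k_source:.2f}, mixture: {k_mixture:.2f}")
```

In theory a Laplacian source has excess kurtosis 3 and an equal-weight sum of four independent ones has 3/4, so the mixture sits closer to Gaussian than any single source. A separation criterion can exploit this by pushing the outputs away from Gaussianity.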
17. BSS for Convolutive Mixtures
- Idea: translate the separation task into the frequency domain and apply multiple independent instantaneous BSS procedures.
  - Does not work due to permutation problems.
- A Better Idea: reformulate the separation task in the context of multichannel filtering.
  - Separation criterion stays in the time domain, so there is no implied permutation problem.
  - Can still employ fast convolution methods for efficient implementation.
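Keeping the criterion in the time domain does not rule out FFTs for the filtering itself. As a minimal sketch of fast convolution (the function name `fft_convolve` and the test signals are hypothetical), zero-padded FFTs give the same result as direct linear convolution:

```python
import numpy as np

def fft_convolve(h, x):
    """Linear convolution of h and x computed via zero-padded FFTs."""
    n = len(h) + len(x) - 1
    nfft = 1 << (n - 1).bit_length()      # next power of two >= n
    H = np.fft.rfft(h, nfft)
    X = np.fft.rfft(x, nfft)
    return np.fft.irfft(H * X, nfft)[:n]

rng = np.random.default_rng(3)
h = rng.standard_normal(64)               # hypothetical separation-filter taps
x = rng.standard_normal(1000)             # hypothetical sensor signal

y_fast = fft_convolve(h, x)
y_direct = np.convolve(h, x)              # reference: direct time-domain result
```

The zero-padding is what makes the product of FFTs a linear rather than a circular convolution. A practical long-signal implementation would typically process blocks (overlap-add or overlap-save) instead of one large transform.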
18. Natural Gradient Convolutive BSS Alg. [Amari/Douglas/Cichocki/Yang 1997]
- The update uses f(y), a simple vector-valued nonlinearity.
- Criterion: density-based (Maximum Likelihood)
- Complexity: about four multiply/adds per tap
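The update equation on this slide did not survive in the text, so the following is not the convolutive algorithm itself. It is a sketch of the widely used instantaneous natural-gradient rule, W <- W + mu (I - E[f(y) y^T]) W, in batch form. The choice f(y) = tanh(y), the step size, the iteration count, and the test mixture are all assumptions for illustration.

```python
import numpy as np

def natural_gradient_bss(x, mu=0.1, n_iter=1000):
    """Batch natural-gradient update for an instantaneous mixture:
    W <- W + mu * (I - mean(f(y) y^T)) W, with f(y) = tanh(y)."""
    m, K = x.shape
    W = np.eye(m)
    for _ in range(n_iter):
        y = W @ x
        C = np.tanh(y) @ y.T / K          # sample estimate of E[f(y) y^T]
        W += mu * (np.eye(m) - C) @ W
    return W

def interference_ratio(G):
    """Per-output ratio of residual to dominant energy in G = W A."""
    G2 = G**2
    peak = G2.max(axis=1)
    return (G2.sum(axis=1) - peak) / peak

rng = np.random.default_rng(4)
s = rng.laplace(scale=1 / np.sqrt(2), size=(2, 20000))   # unit-variance sources
A = np.array([[1.0, 0.6],
              [0.5, 1.0]])                               # hypothetical mixing matrix
W = natural_gradient_bss(A @ s)
isr = interference_ratio(W @ A)           # small values mean good separation
```

The combined matrix `W @ A` should end up close to a scaled permutation matrix. The tanh nonlinearity suits heavy-tailed (super-Gaussian) sources such as speech-like signals; other source types generally call for a different f.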
19. Blind Source Separation Toolbox
- A MATLAB toolbox of robust source separation algorithms for noisy convolutive mixtures (developed under govt. contract).
- Allows us to evaluate relationships and tradeoffs between different approaches easily and rapidly.
- Used to determine when a particular algorithm or approach is appropriate for a particular (acoustic) measurement scenario.
20. Speech Enhancement Methods
- Classic (frequency selective) linear filtering
  - Only useful for the simplest of situations
- Single-microphone spectral subtraction
  - Only useful if the signal is reasonably well-separated to begin with (> 5 dB SINR)
  - Tends to introduce musical artifacts
- Research Focus: how to leverage multiple microphones to achieve robust signal enhancement with minimal knowledge.
21. Novel Techniques for Speech Enhancement
- Blind Source Separation: find all the talker signals in the room (loud and soft, high- and low-pitched, near and far away) without knowledge of any of these characteristics.
- Multi-Microphone Signal Enhancement: using only the knowledge of "target present" or "target absent" labels on the data, pull out the target signal from the noisy background.
22. SMU Multimedia Systems Lab Acoustic Facility
- Room (Nominal Configuration)
  - Acoustically-treated
  - RT 300 ms
  - Non-parallel walls to prevent flutter echo
- Sources
  - Loudspeakers playing recordings, as well as live talkers
  - Distance to mics: 50 cm
  - Angles: -30°, 0°, 27.5°
- Sensors
  - Omnidirectional microphones (AT803b)
  - Linear array (4 cm spacing)
- Data collection and processing entirely within MATLAB.
- Allows for careful characterization, fast evaluation, and experimentation with artificial and human talkers.
23. Blind Source Separation Example
[Diagram: Talker 1 (MG) and Talker 2 (SCD) -> Convolutive Mixing (Room) -> Separation System (Code)]
Performance improvement: between 10 dB and 15 dB for equal-level mixtures, and even higher for unequal-level ones.
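Improvements like these are usually reported as a change in signal-to-interference ratio (SIR). The sketch below shows how such a figure is computed when the target and interference components are available separately. The helper name `sir_db`, the test signals, and the attenuation factor are hypothetical and are not the talk's measurements.

```python
import numpy as np

def sir_db(target, interference):
    """Signal-to-interference ratio in dB from two component signals."""
    return 10.0 * np.log10(np.sum(target**2) / np.sum(interference**2))

rng = np.random.default_rng(5)
target = rng.standard_normal(8000)
interf = rng.standard_normal(8000)

# Hypothetical case: interference amplitude cut to one fifth by the separator
sir_in = sir_db(target, 1.0 * interf)
sir_out = sir_db(target, 0.2 * interf)
improvement = sir_out - sir_in            # equals 20*log10(5), about 14 dB
```

Cutting the interference amplitude by a factor of five corresponds to roughly 14 dB of SIR improvement, which gives a feel for what a 10 to 15 dB gain means in amplitude terms.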
24. Unequal Power Scenario Results
- Time-domain CBSS methods provide the greatest SIR improvements for weak sources; there is no significant improvement in SIR if the initial SIR is already large.
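25. Multi-Microphone Speech Enhancement
[Diagram: a speech source and noise sources are picked up as microphone signals y1, y2, y3, ..., yn; linear processing, driven by an adaptive algorithm, produces outputs z1, z2, z3, ..., zn, where z1 contains most speech and zn contains most noise.]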
26. Speech Enhancement via Iterative Multichannel Filtering
- System output at time k: a linear adaptive filter.
- The filter coefficients are a sequence of (n x n) matrices at iteration k.
- Goal: adapt the filter coefficients over time such that the multichannel output contains signals with maximum speech energy in the first output.
27. Multichannel Speech Enhancement Algorithm
- A novel technique for enhancing target speech in noise using two or more microphones via joint decorrelation.
- Requires a rough target identifier (i.e. when talker speech is present).
- Is adaptive to changing noise characteristics.
- Knowledge of source locations, microphone positions, and other characteristics is not needed.
- Details in Gupta and Douglas, IEEE Trans. Audio, Speech, Lang. Proc., May 2009.
- Patent pending.
28. Performance Evaluations
- Room
  - Acoustically-treated, RT 300 ms
  - Non-parallel walls to prevent flutter echo
- Sources
  - Loudspeakers playing BBC recordings (Fs 8 kHz), 1 male/1-2 noise sources
  - Distance to mics: 1.3 m
  - Angles: -30°, 0°, 27.5°
- Sensors
  - Linear array, adjustable (4 cm spacing)

- Room
  - Ordinary conference room (RT 600 ms)
- Sources
  - Loudspeakers playing BBC recordings (Fs 8 kHz), 1 male/1-2 noise sources
  - Angles: -15°, 15°, 30°
- Sensors
  - Omnidirectional microphones (AT803b)
  - Linear array, adjustable (4 cm nominal spacing)
29. Audio Examples
- Acoustic Lab: Initial SIR -10 dB, 3-Mic System (Before / After)
- Acoustic Lab: Initial SIR 0 dB, 2-Mic System (Before / After)
- Conference Room: Initial SIR -10 dB, 3-Mic System (Before / After)
- Conference Room: Initial SIR 5 dB, 2-Mic System (Before / After)
30. Effect of Noise Segment Length on Overall Performance
31. Diffuse Noise Source Example
- Noise Source: SMU Campus-Wide Air Handling System
- Data was recorded using a simple two-channel portable M-Audio recorder (16-bit, 48 kHz) with its associated T-shaped omnidirectional stereo array at arm's length, then downsampled to 8 kHz.
32. Air Handler Data Processing
- Step 1: Spatio-temporal GEVD processing on a frame-by-frame basis with L = 256, where $R_v(k) = R_y(k-1)$; that is, data was whitened to the previous frame.
- Step 2: Least-squares multichannel linear prediction was used to remove tones.
- Step 3: Log-STSA spectral subtraction was applied to the first output channel.
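Step 1 relies on a generalized eigenvalue decomposition (GEVD) of two covariance matrices. The sketch below is a spatial-only simplification with no time lags, unlike the spatio-temporal processing with L = 256 described above. The noise covariance comes from a separate noise-only segment rather than the previous frame, and the steering vector, noise mixing, and helper name `gevd_filters` are hypothetical.

```python
import numpy as np

def gevd_filters(Ry, Rv):
    """Solve Ry w = lambda Rv w via Cholesky whitening of Rv.
    Returns eigenvalues (descending) and filters as columns of W."""
    L = np.linalg.cholesky(Rv)                # Rv = L L^T
    Linv = np.linalg.inv(L)
    M = Linv @ Ry @ Linv.T                    # Ry expressed in noise-whitened coordinates
    lam, U = np.linalg.eigh(M)                # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]
    return lam[order], (Linv.T @ U)[:, order]

rng = np.random.default_rng(6)
K = 20000
a = np.array([1.0, 0.8, 0.6])                 # hypothetical speech steering vector
Mv = np.array([[1.0, 0.3, 0.1],
               [0.2, 1.0, 0.3],
               [0.1, 0.2, 1.0]])              # hypothetical noise mixing
noise_only = Mv @ rng.standard_normal((3, K))             # "target absent" data
y = np.outer(a, rng.laplace(size=K)) + Mv @ rng.standard_normal((3, K))

Rv = noise_only @ noise_only.T / K
Ry = y @ y.T / K
lam, W = gevd_filters(Ry, Rv)
z = W.T @ y      # z[0]: output with the largest ratio of total to noise power
```

The filters jointly diagonalize both covariance matrices (`W.T @ Rv @ W` is the identity and `W.T @ Ry @ W` is diagonal), so the first output concentrates the component that stands out most above the noise.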
33. Complex Blind Source Separation
[Block diagram: s(k) -> A -> x(k) -> B -> y(k)]
- Signal Model: $x(k) = A s(k)$
- Both the $s_i(k)$'s in $s(k)$ and the elements of $A$ are complex-valued.
- Separating matrix $B$ is complex-valued as well.
- It appears that there is little difference from the real-valued case
34. Complex Circular vs. Complex Non-Circular Sources
- (Second-Order) Circular Source: the energies of the real and imaginary parts of $s_i(k)$ are the same.
- (Second-Order) Non-Circular Source: the energies of the real and imaginary parts of $s_i(k)$ are not the same.
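A common numerical check for second-order circularity uses the pseudo-variance $E[s^2]$, which is nonzero when the real and imaginary parts have unequal energy (and also when they are correlated). The sketch below computes a normalized version of it for two assumed test signals, QPSK and BPSK symbols. The helper name `circularity_coefficient` is hypothetical.

```python
import numpy as np

def circularity_coefficient(s):
    """|E[s^2]| / E[|s|^2]: near 0 for second-order circular, up to 1 otherwise."""
    s = s - s.mean()
    return np.abs(np.mean(s**2)) / np.mean(np.abs(s) ** 2)

rng = np.random.default_rng(7)
K = 10000
bits = rng.integers(0, 2, size=(2, K)) * 2 - 1     # random +/-1 symbols

qpsk = (bits[0] + 1j * bits[1]) / np.sqrt(2)       # equal real/imag energy
bpsk = bits[0].astype(complex)                     # all energy in the real part

c_qpsk = circularity_coefficient(qpsk)             # close to 0: circular
c_bpsk = circularity_coefficient(bpsk)             # equal to 1: strongly non-circular
```

Wireless constellations such as these are a natural place for the distinction to show up.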
35. Why Complex Circularity Matters in Blind Source Separation
- Fact 1: It is possible to separate non-circular sources by decorrelation alone if their non-circularities differ [Eriksson and Koivunen, IEEE Trans. IT, 2006].
- Fact 2: The strong-uncorrelating transform is a unique linear transformation for identifying non-circular source subspaces using only covariance matrices.
- Fact 3: Knowledge of source non-circularity is required to obtain the best performance of a complex BSS procedure.
36. Complex Fixed Point Algorithm [Douglas 2007]
- NOTE: The MATLAB code involves both transposes and Hermitian transposes, and no, those aren't mistakes!
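The fixed-point algorithm itself is not reproduced here, but the reason both kinds of transpose appear in complex BSS code can be illustrated. The covariance $E[x x^H]$ uses the Hermitian transpose, while the pseudo-covariance $E[x x^T]$, which carries the non-circularity information, uses the plain transpose. This is a NumPy sketch with assumed test data, not the MATLAB code from the slide (in MATLAB, `'` is the Hermitian transpose and `.'` the plain one).

```python
import numpy as np

def covariance(v):
    """Sample E[v v^H]: uses the Hermitian (conjugate) transpose."""
    return v @ v.conj().T / v.shape[1]

def pseudo_covariance(v):
    """Sample E[v v^T]: uses the plain transpose, no conjugation."""
    return v @ v.T / v.shape[1]

rng = np.random.default_rng(8)
m, K = 2, 5000
# Hypothetical non-circular sources: imaginary part much weaker than real part
s = rng.standard_normal((m, K)) + 0.3j * rng.standard_normal((m, K))
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
x = A @ s

Cx, Px = covariance(x), pseudo_covariance(x)
Cx_model = A @ covariance(s) @ A.conj().T          # C_x = A C_s A^H
Px_model = A @ pseudo_covariance(s) @ A.T          # P_x = A P_s A^T (no conjugate)
```

Swapping one transpose for the other in either line gives a different matrix, which is why the two cannot be used interchangeably.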
37. Performance Comparisons
38. Complex BSS Example
39. Conclusions
- Blind Source Separation provides unique capabilities for extracting useful signals from multiple sensor measurements corrupted by noise.
- Little to no knowledge of the sensor array geometry, the source positions, or the source statistics or characteristics is required.
- Algorithm design can be tricky.
- Opportunities for applications in speech enhancement, wireless communications, and other areas.
40. For Further Reading
- My publications page at SMU
  - http://lyle.smu.edu/douglas/puball.html
- It has available for download
  - 82 of my published journal papers
  - 75 of my published conference papers