Blind Source Separation: Finding Needles in Haystacks

Slides: 41

Provided by: Sco145

Learn more at: https://ewh.ieee.org

Transcript and Presenter's Notes

1
Blind Source Separation: Finding Needles in Haystacks
Scott C. Douglas
Department of Electrical Engineering
Southern Methodist University
douglas_at_lyle.smu.edu
2
Signal Mixtures are Everywhere

Cell Phones
Radio Astronomy
Brain Activity
Speech/Music

How do we make sense of it all?
3
Example: Speech Enhancement
4
Example: Wireless Signal Separation
5
Example: Wireless Signal Separation
6
Example: Wireless Signal Separation
7
Example: Wireless Signal Separation
8
Outline of Talk

Blind Source Separation: General concepts and approaches
Convolutive Blind Source Separation: Application to multi-microphone speech recordings
Complex Blind Source Separation: What differentiates the complex-valued case
Conclusions

9
Blind Source Separation (BSS): A Simple Math Example
[Block diagram: s(k) -> A -> x(k) -> B -> y(k)]

Let $s_1(k), s_2(k), \ldots, s_m(k)$ be signals of interest.
Measurements: For $1 \le i \le m$,
$x_i(k) = a_{i1} s_1(k) + a_{i2} s_2(k) + \cdots + a_{im} s_m(k)$
Sensor noise is neglected.
Dispersion (echo/reverberation) is absent.
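
The measurement model above can be sketched for two sources and two sensors. This is only an illustration: the mixing coefficients and signal samples are made-up values, and the demixing step uses the exact inverse of A, which a blind method would not have access to.

```python
# Illustrative 2-source, 2-sensor instantaneous mixture (all values are made up).
A = [[1.0, 0.6],
     [0.4, 1.0]]  # hypothetical mixing coefficients a_ij


def mix(M, s):
    """Return x with x[i][k] = sum_j M[i][j] * s[j][k]."""
    m, n = len(M), len(s[0])
    return [[sum(M[i][j] * s[j][k] for j in range(m)) for k in range(n)]
            for i in range(m)]


def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


s = [[1.0, -1.0, 0.5, 0.0],   # s1(k)
     [0.2, 0.3, -0.7, 1.0]]   # s2(k)
x = mix(A, s)                 # sensor measurements x(k) = A s(k)
y = mix(inverse_2x2(A), x)    # non-blind demixing with B = A^{-1}
```

Here y reproduces s because A is known; the blind problem is to find a suitable B from x alone.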

10
Blind Source Separation Example (continued)
[Block diagram: s(k) -> A -> x(k) -> B -> y(k)]

Can show: The $s_i(k)$'s can be recovered as
$y_i(k) = b_{i1} x_1(k) + b_{i2} x_2(k) + \cdots + b_{im} x_m(k)$
up to permutation and scaling factors (the matrix B is like the inverse of matrix A).
Problem: How do you find the demixing $b_{ij}$'s when you don't know the mixing $a_{ij}$'s or $s_j(k)$'s?
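
In matrix form, the "up to permutation and scaling" statement can be written as follows. The symbols P (a permutation matrix) and D (a nonsingular diagonal matrix) are notation introduced here for illustration and do not appear on the slide.

```latex
y(k) = B\,x(k) = B A\,s(k) = P D\,s(k)
```

Any B for which BA has this form separates the sources, which is why the output order and the output gains cannot be determined blindly.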

11
Why Blind Source Separation? (Why Not Traditional Beamforming?)

BSS requires no knowledge of sensor geometry.
The system can be uncalibrated, with unmatched
sensors.
BSS does not need knowledge of source positions
relative to the sensor array.
BSS requires little to no knowledge of signal
types; decisions/detections can be pushed to the end
of the processing chain.

12
What Properties Are Necessary for BSS to Work?

Separation can be achieved when
(# of sensors) ≥ (# of sources)
The talker signals $s_j(t)$ are statistically independent of each other and
are non-Gaussian in amplitude,
OR
have spectra that differ from each other,
OR
are non-stationary.
Statistical independence is the critical assumption.

13
Entropy is the Key to Source Separation

Entropy: A measure of regularity

In BSS, separated signals are demixed and have
more order as a group.
First used in 1996 for speech separation.
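
One common way to make the entropy idea precise is through mutual information. The notation below (H for differential entropy, I for mutual information, p_y for the joint density of the outputs) is assumed here and is not taken from the slide.

```latex
H(y) = -\int p_y(u)\,\log p_y(u)\,du, \qquad
I(y_1,\ldots,y_m) = \sum_{i=1}^{m} H(y_i) - H(y) \;\ge\; 0
```

The mutual information is zero exactly when the outputs are statistically independent, so driving it toward zero is one way to express the separation goal.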

14
Convolutive Blind Source Separation

Mixing system is dispersive

15
Goal of Convolutive BSS

Key idea: For convolutive BSS, sources are recovered
only up to arbitrary filtering and arbitrary shuffling (permutation).
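
A dispersive (convolutive) mixture replaces each mixing coefficient with a short filter. The sketch below assumes made-up three-tap impulse responses (a direct path plus one echo) and is only meant to show the structure of the model, not a measured room response.

```python
def fir(h, s):
    """Causal FIR filtering: out[k] = sum_l h[l] * s[k - l]."""
    return [sum(h[l] * s[k - l] for l in range(len(h)) if k - l >= 0)
            for k in range(len(s))]


def convolutive_mix(H, s):
    """x_i(k) = sum_j sum_l H[i][j][l] * s_j(k - l)."""
    n = len(s[0])
    x = []
    for row in H:
        acc = [0.0] * n
        for h_ij, s_j in zip(row, s):
            acc = [a + b for a, b in zip(acc, fir(h_ij, s_j))]
        x.append(acc)
    return x


# Hypothetical impulse responses H[i][j]: path from source j to sensor i.
H = [[[1.0, 0.0, 0.5], [0.6, 0.3, 0.0]],
     [[0.4, 0.0, 0.2], [1.0, 0.5, 0.0]]]
s = [[1.0, 0.0, 0.0, 0.0],   # unit impulse from source 1
     [0.0, 0.0, 0.0, 0.0]]   # source 2 silent
x = convolutive_mix(H, s)    # each sensor sees the impulse plus its echo
```

With an impulse on source 1 only, each sensor output is simply the corresponding impulse response, which makes the echo visible in the output.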

16
Non-Gaussian-Based Blind Source Separation

Basic Goal: Make the output signals look
non-Gaussian, because mixtures look more
Gaussian (from the Central Limit Theorem).
Criteria Based On This Goal:
Density Modeling
Contrast Functions
Property Restoral, e.g. (Non-)Constant Modulus
Algorithm
Implications:
Separating capability of the criteria will be
similar.
Implementation details (e.g. optimization
strategy) will yield performance differences.
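
The Central Limit Theorem argument can be checked numerically with excess kurtosis, which is zero for a Gaussian. The sketch below assumes uniformly distributed sources and an equal-weight mixture, both arbitrary choices made for illustration.

```python
import random


def excess_kurtosis(v):
    """Sample excess kurtosis: m4 / m2^2 - 3 (zero for a Gaussian)."""
    n = len(v)
    mean = sum(v) / n
    m2 = sum((t - mean) ** 2 for t in v) / n
    m4 = sum((t - mean) ** 4 for t in v) / n
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)
n = 20000
s1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
mixture = [0.5 * a + 0.5 * b for a, b in zip(s1, s2)]  # equal-weight mixture

k_source = excess_kurtosis(s1)        # theoretical value for uniform: -1.2
k_mixture = excess_kurtosis(mixture)  # theoretical value for this mixture: -0.6
```

The mixture's kurtosis is closer to zero than the source's, which is the sense in which mixtures "look more Gaussian".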

17
BSS for Convolutive Mixtures

Idea: Translate separation task into frequency
domain and apply multiple independent
instantaneous BSS procedures.
Does not work due to permutation problems.
A Better Idea: Reformulate separation tasks in
the context of multichannel filtering.
Separation criterion stays in the time domain;
no implied permutation problem.
Can still employ fast convolution methods for
efficient implementation.

18
Natural Gradient Convolutive BSS Algorithm
[Amari/Douglas/Cichocki/Yang 1997]

[Update equations shown on the slide; not recovered in this transcript]
where f(y) is a simple vector-valued
nonlinearity.
Criterion: Density-based (Maximum Likelihood)
Complexity: about four multiply/adds per tap
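
The slide's update equations are not available in this transcript. As a rough illustration of the natural-gradient idea only, the sketch below implements the simpler instantaneous (non-convolutive) batch update B <- B + mu (I - E[f(y) y^T]) B with f = tanh. It is not the 1997 convolutive algorithm; the step size, iteration count, Laplacian test sources, and mixing matrix are all assumptions made for the example.

```python
import math
import random


def natural_gradient_bss(x, mu=0.1, iters=200):
    """Batch update B <- B + mu * (I - E[f(y) y^T]) B with f = tanh (2x2 case)."""
    n = len(x[0])
    B = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(iters):
        # Current outputs y(k) = B x(k)
        y = [[B[i][0] * x[0][k] + B[i][1] * x[1][k] for k in range(n)]
             for i in range(2)]
        f = [[math.tanh(v) for v in row] for row in y]
        # C approximates E[f(y) y^T] by a sample average
        C = [[sum(f[i][k] * y[j][k] for k in range(n)) / n for j in range(2)]
             for i in range(2)]
        G = [[(1.0 if i == j else 0.0) - C[i][j] for j in range(2)]
             for i in range(2)]
        B = [[B[i][j] + mu * (G[i][0] * B[0][j] + G[i][1] * B[1][j])
              for j in range(2)] for i in range(2)]
    return B


rng = random.Random(0)
n = 2000
# Laplacian (super-Gaussian) samples as a crude stand-in for speech-like sources
s = [[rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]
     for _ in range(2)]
A = [[1.0, 0.6], [0.4, 1.0]]  # hypothetical mixing matrix
x = [[A[i][0] * s[0][k] + A[i][1] * s[1][k] for k in range(n)]
     for i in range(2)]
B = natural_gradient_bss(x)
# Combined system B A; ideally close to a scaled permutation matrix
global_system = [[B[i][0] * A[0][j] + B[i][1] * A[1][j] for j in range(2)]
                 for i in range(2)]
```

Inspecting global_system shows how much of each source leaks into each output; the algorithm itself never sees A or s.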

19
Blind Source Separation Toolbox

A MATLAB toolbox of robust source separation
algorithms for noisy convolutive mixtures
(developed under govt. contract)
Allows us to evaluate relationships and tradeoffs
between different approaches easily and rapidly
Used to determine when a particular algorithm or
approach is appropriate for a particular
(acoustic) measurement scenario

20
Speech Enhancement Methods

Classic (frequency selective) linear filtering
Only useful for the simplest of situations
Single-microphone spectral subtraction
Only useful if the signal is reasonably
well-separated to begin with (> 5 dB SINR)
Tends to introduce musical artifacts
Research Focus: How to leverage multiple
microphones to achieve robust signal enhancement
with minimal knowledge.
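
For reference, a basic power-domain form of single-microphone spectral subtraction is shown below. The symbols are assumed notation: X is the noisy short-time spectrum, N-hat a noise spectrum estimate, and S-hat the enhanced spectrum. This is a textbook form, not necessarily the variant evaluated in the talk.

```latex
|\hat{S}(\omega)|^2 = \max\!\left( |X(\omega)|^2 - |\hat{N}(\omega)|^2,\; 0 \right)
```

The residual left after the subtraction and flooring fluctuates from frame to frame, which is the usual explanation for the musical artifacts mentioned above.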

21
Novel Techniques for Speech Enhancement

Blind Source Separation: Find all the talker
signals in the room (loud and soft, high and
low-pitched, near and far away) without
knowledge of any of these characteristics.
Multi-Microphone Signal Enhancement: Using only
the knowledge of "target present" or "target
absent" labels on the data, pull out the target
signal from the noisy background.

22
SMU Multimedia Systems Lab Acoustic Facility

Room (Nominal Configuration)
Acoustically treated
RT of 300 ms
Non-parallel walls to prevent flutter echo
Sources
Loudspeakers playing recordings as well as live
talkers
Distance to mics: 50 cm
Angles: -30°, 0°, 27.5°
Sensors
Omnidirectional microphones (AT803b)
Linear array (4 cm spacing)

Data collection and processing entirely within
MATLAB.
Allows for careful characterization, fast
evaluation, and experimentation with artificial
and human talkers.

23
Blind Source Separation Example
[Diagram: Talker 1 (MG) and Talker 2 (SCD) -> Convolutive Mixing (Room) -> Separation System (Code)]
Performance improvement: between 10 dB and 15 dB
for equal-level mixtures, and even higher for
unequal-level ones.
24
Unequal Power Scenario Results

Time-domain CBSS methods provide the greatest SIR
improvements for weak sources; there is no significant
improvement in SIR if the initial SIR is already
large.
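
Signal-to-interference ratio (SIR) figures like these are power ratios in decibels. The sketch below assumes that the target and interference components at the input and output are available separately, as they would be in a controlled experiment; the sample values are made up.

```python
import math


def power(v):
    """Average power of a sampled signal."""
    return sum(t * t for t in v) / len(v)


def sir_db(target, interference):
    """Signal-to-interference ratio in dB."""
    return 10.0 * math.log10(power(target) / power(interference))


# Made-up components of one channel before and after separation
target = [1.0, -1.0, 1.0, -1.0]
interference_in = [1.0, 1.0, -1.0, -1.0]    # equal level: initial SIR of 0 dB
interference_out = [0.1, 0.1, -0.1, -0.1]   # interference amplitude reduced 10x

sir_improvement = (sir_db(target, interference_out)
                   - sir_db(target, interference_in))
```

A tenfold reduction in interference amplitude with the target unchanged corresponds to a 20 dB SIR improvement.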

25
Multi-Microphone Speech Enhancement
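[Block diagram: one speech source and two noise sources are picked up as microphone signals y1, y2, y3, ..., yn. Linear processing, controlled by an adaptive algorithm, produces outputs z1, z2, z3, ..., zn. The first output is labeled "Contains most speech" and the last output is labeled "Contains most noise".]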
26
Speech Enhancement via Iterative Multichannel
Filtering

System output at time k: [equation not recovered in this transcript]
The linear adaptive filter is a sequence of (n x n) matrices
at iteration k.
Goal: Adapt the filter matrices over
time such that the multichannel output
contains signals with maximum speech energy in
the first output.
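
A generic multichannel FIR filter of this kind can be written as follows. The names are assumptions: y(k) and z(k) follow the input and output labels in the preceding block diagram, while the matrix symbol W_l(k) and the filter length L are introduced here only for illustration and may differ from the slide's own notation.

```latex
z(k) = \sum_{l=0}^{L-1} W_l(k)\, y(k-l), \qquad W_l(k) \in \mathbb{R}^{n \times n}
```

Each output channel is therefore a sum of filtered versions of all n microphone signals.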

27
Multichannel Speech Enhancement Algorithm

A novel technique for enhancing target speech in
noise using two or more microphones via joint
decorrelation
Requires rough target identifier (i.e. when
talker speech is present)
Is adaptive to changing noise characteristics
Knowledge of source locations, microphone
positions, other characteristics not needed.
Details in Gupta and Douglas, IEEE Trans. Audio,
Speech, Lang. Proc., May 2009
Patent pending

28
Performance Evaluations

Room
Acoustically treated, RT of 300 ms
Non-parallel walls to prevent flutter echo
Sources
Loudspeakers playing BBC recordings (Fs = 8 kHz),
1 male/1-2 noise sources
Distance to mics: 1.3 m
Angles: -30°, 0°, 27.5°
Sensors
Linear array, adjustable (4 cm spacing)

Room
Ordinary conference room (RT of 600 ms)
Sources
Loudspeakers playing BBC recordings (Fs = 8 kHz),
1 male/1-2 noise sources
Angles: -15°, 15°, 30°
Sensors
Omnidirectional microphones (AT803b)
Linear array, adjustable (4 cm nominal spacing)

29
Audio Examples

Acoustic Lab: Initial SIR = -10 dB, 3-Mic System (Before / After)
Acoustic Lab: Initial SIR = 0 dB, 2-Mic System (Before / After)
Conference Room: Initial SIR = -10 dB, 3-Mic System (Before / After)
Conference Room: Initial SIR = 5 dB, 2-Mic System (Before / After)

30
Effect of Noise Segment Length on Overall
Performance
31
Diffuse Noise Source Example

Noise Source: SMU Campus-Wide Air Handling
System
Data was recorded using a simple two-channel
portable M-Audio recorder (16-bit, 48 kHz) with its
associated T-shaped omnidirectional stereo
array at arm's length, then downsampled to 8 kHz.

32
Air Handler Data Processing

Step 1: Spatio-Temporal GEVD Processing on a
frame-by-frame basis with L = 256, where $R_v(k) = R_y(k-1)$;
that is, data was whitened to the
previous frame.
Step 2: Least-squares multichannel linear
prediction was used to remove tones.
Step 3: Log-STSA spectral subtraction was applied
to the first output channel.
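
The generalized eigenvalue decomposition (GEVD) in Step 1 solves R_y v = lambda R_v v. The sketch below handles only a 2x2 spatial case with made-up covariance values; the spatio-temporal version with L = 256 would involve much larger matrices and a proper numerical solver, so this is an illustration of the idea rather than the processing used in the talk.

```python
import math


def gevd_2x2(Ry, Rv):
    """Solve Ry v = lam * Rv v for symmetric 2x2 Ry and positive-definite Rv.

    Sketch only: assumes the two generalized eigenvalues are real and that
    Rv^{-1} Ry has a nonzero upper-right entry (non-degenerate case).
    """
    (a, b), (c, d) = Rv
    det = a * d - b * c
    Rv_inv = [[d / det, -b / det], [-c / det, a / det]]
    # M = Rv^{-1} Ry has the same eigenpairs as the pencil (Ry, Rv)
    M = [[sum(Rv_inv[i][k] * Ry[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    if abs(M[0][1]) < 1e-12:
        raise ValueError("degenerate case not handled in this sketch")
    tr = M[0][0] + M[1][1]
    dt = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    root = math.sqrt(max(tr * tr - 4.0 * dt, 0.0))
    pairs = []
    for lam in ((tr + root) / 2.0, (tr - root) / 2.0):
        # Eigenvector from the first row of (M - lam I) v = 0
        pairs.append((lam, [M[0][1], lam - M[0][0]]))
    return pairs


Ry = [[4.0, 1.0], [1.0, 3.0]]   # made-up covariance of the current frame
Rv = [[2.0, 0.5], [0.5, 1.0]]   # made-up covariance of the previous frame
pairs = gevd_2x2(Ry, Rv)        # list of (eigenvalue, eigenvector), largest first
```

The eigenvector paired with the largest eigenvalue gives the channel combination whose power has grown the most relative to the previous frame.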

33
Complex Blind Source Separation
[Block diagram: s(k) -> A -> x(k) -> B -> y(k)]

Signal Model:
$x(k) = A\,s(k)$
Both the $s_i(k)$'s in s(k) and the elements of
A are complex-valued.
Separating matrix B is complex-valued as well.
It appears that there is little difference from
the real-valued case

34
Complex Circular vs. Complex Non-Circular Sources

(Second-Order) Circular Source: The energies of
the real and imaginary parts of $s_i(k)$ are the
same.
(Second-Order) Non-Circular Source: The energies
of the real and imaginary parts of $s_i(k)$ are not
the same.
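
One way to quantify second-order circularity is the ratio |E[s^2]| / E[|s|^2], often called a circularity coefficient. The sketch below uses ideal QPSK and BPSK constellations as assumed examples. Note that this measure is zero only when the real and imaginary parts have equal energy and are also uncorrelated, which is slightly stricter than the energy comparison stated above.

```python
import math


def circularity_coefficient(s):
    """|E[s^2]| / E[|s|^2]: 0 for a second-order circular source, 1 for a real one."""
    n = len(s)
    pseudo_variance = sum(z * z for z in s) / n   # plain square, no conjugate
    variance = sum(abs(z) ** 2 for z in s) / n    # conjugated product
    return abs(pseudo_variance) / variance


a = 1.0 / math.sqrt(2.0)
qpsk = [complex(a, a), complex(a, -a), complex(-a, a), complex(-a, -a)]
bpsk = [complex(1.0, 0.0), complex(-1.0, 0.0)]

rho_qpsk = circularity_coefficient(qpsk)  # equal energy in both parts
rho_bpsk = circularity_coefficient(bpsk)  # all energy in the real part
```

With equally likely symbols, QPSK comes out circular and BPSK maximally non-circular, matching the wireless examples that motivate the complex-valued case.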

35
Why Complex Circularity Matters in Blind Source
Separation

Fact 1: It is possible to separate non-circular
sources by decorrelation alone if their
non-circularities differ [Eriksson and Koivunen,
IEEE Trans. IT, 2006].
Fact 2: The strong-uncorrelating transform is a
unique linear transformation for identifying
non-circular source subspaces using only
covariance matrices.
Fact 3: Knowledge of source non-circularity is
required to obtain the best performance of a
complex BSS procedure.
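
The two covariance matrices involved in the strong-uncorrelating transform can be written out explicitly. The symbols R, P, Q, and Lambda are assumed notation for zero-mean x(k) and are not taken from the slide.

```latex
R = E\{x(k)\,x^H(k)\}, \qquad P = E\{x(k)\,x^T(k)\}

Q R Q^H = I, \qquad Q P Q^T = \Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_m), \quad 0 \le \lambda_i \le 1
```

The ordinary covariance R uses the Hermitian transpose while the pseudo-covariance P uses the plain transpose, and the diagonal entries of Lambda are the circularity coefficients that must differ for decorrelation alone to separate the sources.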

36
Complex Fixed Point Algorithm [Douglas 2007]

NOTE: The MATLAB code involves both transposes
and Hermitian transposes, and no, those aren't
mistakes!

37
Performance Comparisons
38
Complex BSS Example
39
Conclusions

Blind Source Separation provides unique
capabilities for extracting useful signals from
multiple sensor measurements corrupted by noise.
Little to no knowledge of the sensor array
geometry, the source positions, or the source
statistics or characteristics is required.
Algorithm design can be tricky.
Opportunities for applications in speech
enhancement, wireless communications, and other
areas.

40
For Further Reading