Title: Digital Audio Signal Processing Lecture-2: Microphone Array Processing
1Digital Audio Signal Processing Lecture-2
Microphone Array Processing
- Marc Moonen Simon Doclo
- Dept. E.E./ESAT-STADIUS, KU Leuven
- marc.moonen_at_esat.kuleuven.be
- homes.esat.kuleuven.be/moonen/
2Overview
- Introduction & beamforming basics
  - Data model & definitions
- Fixed beamforming
  - Filter-and-sum beamformer design
  - Matched filtering
    - White noise gain maximization
    - Ex: Delay-and-sum beamforming
  - Superdirective beamforming
    - Directivity maximization
  - Directional microphones (delay-and-subtract)
- Adaptive beamforming
  - LCMV beamforming
  - Frost beamforming
  - Generalized sidelobe canceler
3Introduction
- A microphone is characterized by a directivity pattern, which specifies the gain (& phase shift) that the microphone gives to a signal coming from a certain direction (angle-of-arrival)
- Directivity pattern is a function of frequency (ω)
- In a 3D scenario, angle-of-arrival is azimuth + elevation angle
- Will consider only 2D scenarios for simplicity, with one angle-of-arrival (θ), hence directivity pattern is H(ω,θ)
- Directivity pattern is fixed and defined by physical microphone design
Figure: H(ω,θ) for 1 frequency
4Introduction
- By weighting or filtering (= freq.-dependent weighting) and then summing signals from different microphones, a (software-controlled) virtual directivity pattern (= weighted sum of individual patterns) can be produced
- This assumes all microphones receive the same signals (so are all in the same position).
- However...
Figure: N-tap FIR filters
5Introduction
- However, an additional aspect is that in a microphone array different microphones are in different positions/locations, hence also receive different signals
- Example: uniform linear array
  - microphones placed on a line
  - uniform inter-microphone distances (d)
  - ideal microphone characteristics (see p.8)
- For a far-field source signal (plane wavefronts), each microphone receives the same signal, up to an angle-dependent delay (fs = sampling rate, c = propagation speed)
- Beamforming = spatial filtering based on microphone characteristics (directivity patterns) AND microphone array configuration (spatial sampling).
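The angle-dependent delay of the uniform linear array can be sketched as follows. This is a minimal illustration, assuming the usual far-field expression tau_m = (m-1) d cos(theta) / c with the first microphone as reference, theta measured from the array axis (0 = end-fire, 90 degrees = broadside, as on p.9), and c = 340 m/s. The function name `ula_delays` and the sign convention are assumptions, not taken from the slides.

```python
import numpy as np

def ula_delays(M, d, theta, c=340.0, fs=None):
    """Far-field delays of an M-microphone uniform linear array.

    theta is measured from the array axis (0 = end-fire, pi/2 = broadside).
    Returns delays in seconds, or in samples if fs is given.
    """
    m = np.arange(M)                    # microphone index 0..M-1 (microphone 0 is the reference)
    tau = m * d * np.cos(theta) / c     # delay in seconds
    return tau if fs is None else tau * fs

# Example values (assumed): 5 microphones, 3 cm spacing, 16 kHz sampling rate
delays_broadside = ula_delays(M=5, d=0.03, theta=np.pi / 2, fs=16000)
delays_endfire = ula_delays(M=5, d=0.03, theta=0.0, fs=16000)
```

For a broadside source all delays vanish, while for an end-fire source neighbouring microphones differ by d·fs/c samples, which is in general a fractional number of samples.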
6Introduction
- Background/history: ideas borrowed from antenna array design/processing for RADAR & (later) wireless communications.
- Microphone array processing is considerably more difficult than antenna array processing:
  - narrowband radio signals versus broadband audio signals
  - far-field (plane wavefronts) versus near-field (spherical wavefronts)
  - pure-delay environment versus multi-path environment
- Classification:
  - fixed beamforming: data-independent, fixed filters Fm, e.g. delay-and-sum, filter-and-sum
  - adaptive beamforming: data-dependent adaptive filters Fm, e.g. LCMV-beamformer, Generalized Sidelobe Canceler
- Applications: voice controlled systems (e.g. Xbox Kinect), speech communication systems, hearing aids, ...
7Beamforming basics
- Data model: source signal in far-field (see p.12 for near-field)
- Microphone signals are filtered versions of source signal S(ω) at angle θ
- All microphone signals are stacked in a vector
- d is the steering vector
- Output signal is obtained by filtering the microphone signals (filters Fm) and summing
8Beamforming basics
- Data model: source signal in far-field
- If all microphones have the same directivity pattern Ho(ω,θ), this common factor Ho(ω,θ) can be factored out of the steering vector
- Will often consider arrays with ideal omni-directional microphones: Ho(ω,θ) = 1
- Example: uniform linear array, see p.5
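The equations of the far-field data model did not survive on these slides. A commonly used form, written here as a sketch, is given below. The symbols Y_m (m-th microphone signal), tau_m(theta) (delay at microphone m), F_m (m-th beamformer filter) and Z (beamformer output) are assumed notation, and the conjugation convention on the filters is likewise an assumption.

```latex
% Assumed notation: Y_m = m-th microphone signal, \tau_m(\theta) = delay at microphone m,
% F_m = m-th beamformer filter, Z = beamformer output.
\begin{align}
Y_m(\omega,\theta) &= H_m(\omega,\theta)\, e^{-j\omega\tau_m(\theta)}\, S(\omega),
  \qquad m = 1,\dots,M \\
\mathbf{Y}(\omega,\theta) &= \mathbf{d}(\omega,\theta)\, S(\omega), \qquad
\mathbf{d}(\omega,\theta) =
  \begin{bmatrix} H_1(\omega,\theta)\, e^{-j\omega\tau_1(\theta)} & \cdots &
                  H_M(\omega,\theta)\, e^{-j\omega\tau_M(\theta)} \end{bmatrix}^T \\
Z(\omega,\theta) &= \sum_{m=1}^{M} F_m^*(\omega)\, Y_m(\omega,\theta)
  = \mathbf{F}^H(\omega)\, \mathbf{d}(\omega,\theta)\, S(\omega) \\
\mathbf{d}(\omega,\theta) &= H_o(\omega,\theta)
  \begin{bmatrix} e^{-j\omega\tau_1(\theta)} & \cdots & e^{-j\omega\tau_M(\theta)} \end{bmatrix}^T
  \quad \text{(identical microphones)}
\end{align}
```

With ideal omni-directional microphones the last line reduces to a vector of pure phase shifts.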
9Beamforming basics
- Definitions (1)
- In a linear array (p.5): θ = 90° = broadside direction, θ = 0° = end-fire direction
- Array directivity pattern (compare to p.3) = transfer function for source at angle θ (-π < θ < π)
- Steering direction = angle θ with maximum amplification (for 1 freq.)
- Beamwidth (BW) = region around θmax with (e.g.) amplification > -3 dB (for 1 freq.)
10Beamforming basics
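- Data model: source signal + noise
- Microphone signals are corrupted by additive noise
- Define noise correlation matrix as Φ_NN(ω) = E{ N(ω) N^H(ω) }, with N(ω) the vector of noise components in the microphone signals
- Will assume noise field is homogeneous, i.e. all diagonal elements of noise correlation matrix are equal
- Then noise coherence matrix is the noise correlation matrix divided by this common diagonal element: Γ_NN(ω) = Φ_NN(ω) / φ_NN(ω)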
11Beamforming basics
- Definitions (2)
- Array Gain = improvement in SNR for source at angle θ (-π < θ < π), i.e. the ratio |signal transfer function|² / |noise transfer function|²
- White Noise Gain = array gain for spatially uncorrelated noise (white), e.g. sensor noise
  - PS: often used as a measure for robustness
- Directivity = array gain for diffuse noise (coming from all directions)
- DI and WNG evaluated at θmax are often used as a performance criterion
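These definitions can be evaluated numerically. The sketch below assumes the array gain is written as |F^H d|² / (F^H Γ F) for a weight vector F, steering vector d and noise coherence matrix Γ, that white noise corresponds to Γ = I, and that diffuse (spherically isotropic) noise between omni-directional microphones has coherence sin(ω·dist/c)/(ω·dist/c). The function names and the example array are assumptions, not taken from the slides.

```python
import numpy as np

def array_gain(F, d, Gamma):
    """SNR improvement |F^H d|^2 / (F^H Gamma F) for noise coherence matrix Gamma."""
    num = np.abs(np.vdot(F, d)) ** 2            # np.vdot conjugates its first argument
    den = np.real(np.vdot(F, Gamma @ F))
    return num / den

def diffuse_coherence(positions, f, c=340.0):
    """Coherence matrix of a spherically isotropic noise field, omni microphones on a line."""
    dist = np.abs(positions[:, None] - positions[None, :])
    return np.sinc(2.0 * f * dist / c)          # np.sinc(x) = sin(pi x) / (pi x)

M, spacing, f, c = 4, 0.05, 1000.0, 340.0
pos = spacing * np.arange(M)                    # uniform linear array
theta = np.pi / 2                               # broadside source
d = np.exp(-2j * np.pi * f * pos * np.cos(theta) / c)   # steering vector, omni microphones
F = d / M                                       # delay-and-sum weights

wng = array_gain(F, d, np.eye(M))               # white noise gain
di_db = 10 * np.log10(array_gain(F, d, diffuse_coherence(pos, f, c)))   # directivity index in dB
```

For these delay-and-sum weights the white noise gain equals the number of microphones M, while the directivity index depends on frequency and spacing.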
12PS Near-field beamforming
- Far-field assumptions not valid for sources close to microphone array
  - spherical wavefronts instead of planar wavefronts
  - include attenuation of signals
  - 2 coordinates θ, r (= position q) instead of 1 coordinate θ (in 2D case)
- Different steering vector (e.g. with Hm(ω,θ) = 1, m = 1..M), with q = position of source, pref = position of reference microphone, pm = position of m-th microphone, and e = 1 (3D), 2 (2D)
13PS Multipath propagation
- In a multipath scenario, acoustic waves are reflected against walls, objects, etc.
- Every reflection may be treated as a separate source (near-field or far-field)
- A more practical data model uses Hm(ω,q), with q = position of source and Hm(ω,q) = complete transfer function from source position to the m-th microphone (incl. microphone characteristic, position, and multipath propagation)
- Beamforming aspect vanishes here, see also Lecture-3 (multi-channel noise reduction)
14Overview
- Introduction & beamforming basics
  - Data model & definitions
- Fixed beamforming
  - Filter-and-sum beamformer design
  - Matched filtering
    - White noise gain maximization
    - Ex: Delay-and-sum beamforming
  - Superdirective beamforming
    - Directivity maximization
  - Directional microphones (delay-and-subtract)
- Adaptive beamforming
  - LCMV beamforming
  - Frost beamforming
  - Generalized sidelobe canceler
15Filter-and-sum beamformer design
- Basic: procedure based on page 9
- Array directivity pattern to be matched to given (desired) pattern over frequency/angle range of interest
- Non-linear optimization for FIR filter design (= ignore phase response)
- Quadratic optimization for FIR filter design (= co-design phase response)
16Filter-and-sum beamformer design
- Quadratic optimization for FIR filter design (continued)
- The optimal solution is obtained in closed form (the expression involves a Kronecker product)
17Filter-and-sum beamformer design
Example: M = 8, logarithmic array, N = 50, fs = 8 kHz
18 Matched filtering WNG maximization
- Basic: procedure based on page 11
- Maximize White Noise Gain (WNG) for given steering angle ψ
- A priori knowledge/assumptions:
  - angle-of-arrival ψ of desired signal + corresponding steering vector
  - noise scenario = white
19 Matched filtering WNG maximization
- Maximization of the WNG is equivalent to minimization of noise output power (under white input noise), subject to unit response for steering angle (*)
- Optimal solution is the matched filter
- FIR approximation
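The matched filter can be sketched numerically. The block below assumes the standard closed form F = d / (d^H d) for the minimizer of F^H F subject to F^H d = 1, and checks on a randomly drawn, purely hypothetical steering vector that another weight vector with unit response does not reach a higher WNG. The names `matched_filter`, `F_mf` and `F_other` are illustrative.

```python
import numpy as np

def matched_filter(d):
    """Weights maximizing WNG subject to unit response F^H d = 1."""
    return d / np.vdot(d, d).real

rng = np.random.default_rng(0)
M = 6
# Hypothetical steering vector: arbitrary microphone gains and phases
d = rng.uniform(0.5, 1.5, M) * np.exp(1j * rng.uniform(-np.pi, np.pi, M))

F_mf = matched_filter(d)
response = np.vdot(F_mf, d)                          # F^H d, expected to be 1
wng_mf = np.abs(response) ** 2 / np.vdot(F_mf, F_mf).real

# A perturbed weight vector, rescaled so that it also has unit response
F_other = F_mf + 0.1 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
F_other = F_other / np.conj(np.vdot(F_other, d))     # enforce F^H d = 1
wng_other = 1.0 / np.vdot(F_other, F_other).real
```

The resulting WNG equals d^H d, which reduces to M when all microphones are ideal omni-directional (unit-modulus steering vector entries).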
20Matched filtering example Delay-and-sum
- Basic: microphone signals are delayed and then summed together
- Fractional delays implemented with truncated interpolation filters (= FIR)
- Consider array with ideal omni-directional microphones; then array can be steered to angle ψ
- Hence (for ideal omni-directional microphones) this is the matched filter solution
21Matched filtering example Delay-and-sum
ideal omni-dir. micr.s
- Array directivity pattern H(ω,θ): constructive interference in the steering direction, destructive interference in other directions
- White noise gain is independent of ω
- For ideal omni-directional microphone array, delay-and-sum beamformer provides WNG equal to M for all freqs. (in the direction of steering angle ψ).
22Matched filtering example Delay-and-sum
ideal omni-dir. micr.s
- Array directivity pattern H(ω,θ) for uniform linear array
- H(ω,θ) has sinc-like shape and is frequency-dependent
Figure: M = 5 microphones, d = 3 cm inter-microphone distance, ψ = 60° steering angle, fs = 16 kHz sampling frequency (plot annotations: endfire, ψ = 60°, wavelength = 4 cm)
23Matched filtering example Delay-and-sum
ideal omni-dir. micr.s
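- If the inter-microphone distance d is too large relative to the wavelength, an ambiguity, called spatial aliasing, occurs. This is analogous to time-domain aliasing, where now the spatial sampling (= d) is too coarse.
- Aliasing does not occur (for any ψ) if d is sufficiently small
Figure: M = 5, ψ = 60°, fs = 16 kHz, d = 8 cm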
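Spatial aliasing can be made visible by evaluating the delay-and-sum pattern on a grid of angles. The sketch below assumes ideal omni-directional microphones, c = 340 m/s, and the usual half-wavelength rule d <= c/fs (half the wavelength at fs/2) as the condition for no aliasing at any steering angle. The function names and the chosen test frequencies are assumptions.

```python
import numpy as np

def das_pattern(f, theta, psi, M, d, c=340.0):
    """|H(f, theta)| of a delay-and-sum beamformer steered to psi (ULA, omni microphones)."""
    m = np.arange(M)[:, None]                                  # microphone index as a column
    phase = 2 * np.pi * f * m * d * (np.cos(psi) - np.cos(theta)) / c
    return np.abs(np.exp(1j * phase).mean(axis=0))             # (1/M) * sum over microphones

def max_spacing_no_aliasing(fs, c=340.0):
    """Half-wavelength rule at the highest frequency fs/2: d <= c / fs."""
    return c / fs

theta = np.deg2rad(np.arange(0.0, 180.1, 0.1))
psi = np.deg2rad(60.0)
far_from_psi = np.abs(np.rad2deg(theta) - 60.0) > 20.0

small_d = das_pattern(4000.0, theta, psi, M=5, d=0.03)   # d = 3 cm, evaluated at 4 kHz
large_d = das_pattern(8000.0, theta, psi, M=5, d=0.08)   # d = 8 cm, evaluated at 8 kHz

grating_small = bool(np.any(small_d[far_from_psi] > 0.99))   # full-gain lobe away from psi?
grating_large = bool(np.any(large_d[far_from_psi] > 0.99))
d_max = max_spacing_no_aliasing(fs=16000.0)                  # about 2.1 cm for c = 340 m/s
```

With d = 8 cm at 8 kHz the pattern reaches full gain again at angles far from ψ (grating lobes), whereas with d = 3 cm at 4 kHz it does not.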
24Matched filtering example Delay-and-sum
ideal omni-dir. micr.s
- Beamwidth for a uniform linear array, with e.g. a threshold of 1/sqrt(2) (-3 dB): large dependence on # microphones, distance (compare p.22 & 23) and frequency (e.g. BW infinitely large at DC)
- Array topologies:
  - Uniformly spaced arrays
  - Nested (logarithmic) arrays (small d for high ω, large d for small ω)
  - 2D- (planar) / 3D-arrays
25 Super-directive beamforming DI maximization
- Basic: procedure based on page 11
- Maximize Directivity (DI) for given steering angle ψ
- A priori knowledge/assumptions:
  - angle-of-arrival ψ of desired signal + corresponding steering vector
  - noise scenario = diffuse
26 Super-directive beamforming DI maximization
- Maximization of the DI is equivalent to minimization of noise output power (under diffuse input noise), subject to unit response for steering angle (*)
- Optimal solution: superdirective beamformer
- FIR approximation
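The trade-off between directivity and white noise gain can be illustrated with a small numerical sketch. It assumes the standard closed form F = Γ⁻¹ d / (d^H Γ⁻¹ d) for minimum noise output power with unit response, a sinc-type diffuse coherence matrix for omni-directional microphones, end-fire steering, and c = 340 m/s. The optional diagonal loading parameter `mu` is an added assumption (a common robustness measure), not something stated on the slides.

```python
import numpy as np

def superdirective(d, Gamma, mu=0.0):
    """Weights Gamma^{-1} d / (d^H Gamma^{-1} d); mu > 0 adds diagonal loading (assumption)."""
    G = Gamma + mu * np.eye(len(d))
    x = np.linalg.solve(G, d)
    return x / np.vdot(d, x)

def gain(F, d, Gamma):
    """Array gain |F^H d|^2 / (F^H Gamma F)."""
    return np.abs(np.vdot(F, d)) ** 2 / np.vdot(F, Gamma @ F).real

M, spacing, f, c = 3, 0.05, 500.0, 340.0
pos = spacing * np.arange(M)
dist = np.abs(pos[:, None] - pos[None, :])
Gamma_diffuse = np.sinc(2.0 * f * dist / c)      # np.sinc(x) = sin(pi x) / (pi x)
d = np.exp(-2j * np.pi * f * pos / c)            # end-fire steering vector, omni microphones

F_sd = superdirective(d, Gamma_diffuse)          # superdirective weights
F_ds = d / M                                     # delay-and-sum weights for comparison

di_sd, di_ds = gain(F_sd, d, Gamma_diffuse), gain(F_ds, d, Gamma_diffuse)
wng_sd, wng_ds = gain(F_sd, d, np.eye(M)), gain(F_ds, d, np.eye(M))
```

In this example the superdirective weights give the higher directivity and the lower white noise gain; increasing `mu` moves the design towards delay-and-sum.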
27Super-directive beamforming DI maximization
ideal omni-dir. micr.s
- Directivity patterns for end-fire steering (ψ = 0°)
- Superdirective beamformer has highest DI, but very poor WNG (at low frequencies, where diffuse noise coherence matrix becomes ill-conditioned), hence problems with robustness (e.g. sensor noise)!
- Maximum directivity = M·M (M²), obtained for end-fire steering and for frequency → 0 (no proof)
28Differential microphones Delay-and-subtract
- First-order differential microphone = directional microphone
- 2 closely spaced microphones, where one microphone is delayed (= hardware) and whose outputs are then subtracted from each other
- Array directivity pattern (for ωd/c << π, ωτ << π):
  - First-order high-pass frequency dependence
  - P(θ) = freq.-independent (!) directional response
  - 0 ≤ α1 ≤ 1: P(θ) is scaled cosine, shifted up with α1, such that θmax = 0° (end-fire) and P(θmax) = 1
29Differential microphones Delay-and-subtract
- Types: dipole, cardioid, hypercardioid, supercardioid (HJ84)
  - Cardioid: α1 = 0.5, zero at 180°, DI = 4.8 dB
  - Dipole: α1 = 0 (τ = 0), zero at 90°, DI = 4.8 dB
Figure: polar patterns, axes labelled broadside and endfire
30Differential microphones Delay-and-subtract
  - Hypercardioid: α1 = 0.25, zero at 109°, highest DI = 6.0 dB
  - Supercardioid: α1 such that zero is at 125°, DI = 5.7 dB, highest front-to-back ratio
Figure: polar patterns, axes labelled endfire
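The null angles and DI values listed for the differential microphone types can be reproduced from the directional response. The sketch below assumes P(θ) = α1 + (1 − α1) cos θ, which matches the description "scaled cosine, shifted up with α1" with P(0°) = 1, and assumes the DI is taken with respect to a spherically isotropic noise field, which gives a directivity factor of 1 / (α1² + (1 − α1)²/3). The function names are illustrative.

```python
import numpy as np

def first_order_pattern(theta, alpha1):
    """P(theta) = alpha1 + (1 - alpha1) * cos(theta), so that P(0) = 1."""
    return alpha1 + (1.0 - alpha1) * np.cos(theta)

def null_angle_deg(alpha1):
    """Angle (degrees) where P(theta) = 0; requires alpha1 <= 0.5."""
    return float(np.degrees(np.arccos(-alpha1 / (1.0 - alpha1))))

def directivity_index_db(alpha1):
    """DI in a spherically isotropic noise field: 1 / (alpha1^2 + (1 - alpha1)^2 / 3)."""
    return float(10 * np.log10(1.0 / (alpha1 ** 2 + (1.0 - alpha1) ** 2 / 3.0)))

# Dipole, hypercardioid and cardioid as listed on the slides
for name, alpha1 in [("dipole", 0.0), ("hypercardioid", 0.25), ("cardioid", 0.5)]:
    print(name, round(null_angle_deg(alpha1), 1), "deg,",
          round(directivity_index_db(alpha1), 1), "dB")
```

Under these assumptions the dipole and cardioid both come out at about 4.8 dB, and the hypercardioid at about 6.0 dB with a null near 109°, in line with the listed values.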
31Overview
- Introduction & beamforming basics
  - Data model & definitions
- Fixed beamforming
  - Filter-and-sum beamformer design
  - Matched filtering
    - White noise gain maximization
    - Ex: Delay-and-sum beamforming
  - Superdirective beamforming
    - Directivity maximization
  - Directional microphones (delay-and-subtract)
- Adaptive beamforming
  - LCMV beamforming
  - Frost beamforming
  - Generalized sidelobe canceler
32LCMV-beamforming
- Adaptive filter-and-sum structure
  - Aim is to minimize noise output power, while maintaining a chosen response in a given look direction (and/or other linear constraints, see below). Compare to (*) on p.19 & 26.
  - I.e. similar to operation of a superdirective array (in diffuse noise), or delay-and-sum (in white noise), but now noise field is unknown!
  - Implemented as adaptive FIR filter (cfr DSP-II)
33LCMV-beamforming
- LCMV = Linearly Constrained Minimum Variance
  - f designed to minimize power (variance) of output z[k]
  - To avoid desired signal cancellation, add (J) linear constraints
  - Ex: fix array response in look-direction ψ for sample freqs ωi, i = 1..J (**)
  - With (**) (for sufficiently large J) constrained output power minimization approximately corresponds to constrained noise power minimization (why?)
  - Solution is obtained using Lagrange multipliers, etc.
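The Lagrange-multiplier solution can be sketched in a few lines. The block assumes real-valued stacked filter coefficients f, constraints written as C f = b with a J x n constraint matrix C, and the standard closed form f = Ryy⁻¹ Cᵀ (C Ryy⁻¹ Cᵀ)⁻¹ b. The correlation matrix and constraint matrix below are randomly generated placeholders, not data from the lecture.

```python
import numpy as np

def lcmv(Ryy, C, b):
    """Minimize f^T Ryy f subject to C f = b (real-valued case, C of shape J x n)."""
    RinvCt = np.linalg.solve(Ryy, C.T)
    return RinvCt @ np.linalg.solve(C @ RinvCt, b)

rng = np.random.default_rng(1)
n, J = 8, 2                                   # n stacked filter coefficients, J constraints
A = rng.standard_normal((200, n))
Ryy = A.T @ A / 200                           # hypothetical sample correlation matrix
C = rng.standard_normal((J, n))               # hypothetical constraint matrix
b = np.array([1.0, 0.0])

f_lcmv = lcmv(Ryy, C, b)
power_lcmv = f_lcmv @ Ryy @ f_lcmv

# Another feasible vector (minimum-norm solution of the constraints) for comparison
f_q = C.T @ np.linalg.solve(C @ C.T, b)
power_q = f_q @ Ryy @ f_q
```

Both vectors satisfy the constraints, but the LCMV solution has the lower output power.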
34Frost Beamforming
- Frost-beamformer = adaptive version of LCMV-beamformer
- If Ryy is known, a gradient-descent procedure for LCMV can be used: in each iteration filters f are updated in the direction of the constrained gradient. The P and B are such that f[k+1] satisfies the constraints (verify!). The μ is a step size parameter (to be tuned)
- If Ryy is unknown, an instantaneous (stochastic) approximation may be substituted, leading to a constrained LMS-algorithm
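A constrained LMS update of this kind can be sketched as follows. The block assumes constraints written as C f = b (C of shape J x n), the projection P = I − Cᵀ(C Cᵀ)⁻¹C, the offset B = Cᵀ(C Cᵀ)⁻¹b, and the update f[k+1] = P (f[k] − μ z[k] y[k]) + B with z[k] = f[k]ᵀ y[k], which is the commonly cited form of Frost's algorithm. The input data, step size and single sum-to-one constraint are illustrative assumptions.

```python
import numpy as np

def frost_lms(Y, C, b, mu):
    """Constrained LMS: f <- P (f - mu * z * y) + B, with z = f^T y."""
    CCt_inv = np.linalg.inv(C @ C.T)
    P = np.eye(C.shape[1]) - C.T @ CCt_inv @ C   # projector onto the null-space of C
    B = C.T @ CCt_inv @ b                        # minimum-norm vector satisfying C B = b
    f = B.copy()                                 # start from a feasible point
    for y in Y:                                  # one stacked input vector per time step
        z = f @ y                                # current beamformer output
        f = P @ (f - mu * z * y) + B
    return f

rng = np.random.default_rng(3)
variances = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])           # hypothetical input powers
Y = rng.standard_normal((5000, 6)) * np.sqrt(variances)        # uncorrelated inputs
C = np.ones((1, 6))                                            # constraint: coefficients sum to 1
b = np.array([1.0])

f_frost = frost_lms(Y, C, b, mu=0.01)
f_init = C.T @ np.linalg.solve(C @ C.T, b)
R = np.diag(variances)                                         # true correlation matrix
power_frost, power_init = f_frost @ R @ f_frost, f_init @ R @ f_init
```

Because every update is projected back, the constraint holds after each iteration, while the output power decreases relative to the starting point.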
35Generalized Sidelobe Canceler (GSC)
- GSC = alternative adaptive filter formulation of the LCMV-problem: constrained optimisation is reformulated as a constraint pre-processing, followed by an unconstrained optimisation, leading to a simpler adaptation scheme
- LCMV-problem is: minimize output power subject to the J linear constraints (p.33)
- Define blocking matrix Ca, with columns spanning the null-space of C
- Parametrize all f's that satisfy constraints (verify!)
- I.e. filter f can be decomposed in a fixed part fq and a variable part Ca·fa
- Unconstrained optimization of fa (MN-J coefficients)
36Generalized Sidelobe Canceler
- GSC (continued)
- Hence unconstrained optimization of fa can be implemented as an adaptive filter (adaptive linear combiner), with filter inputs (= left-hand sides) equal to the blocking matrix outputs and desired filter output (= right-hand side) equal to the fixed beamformer output
- LMS algorithm
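The equivalence between the GSC decomposition and the constrained LCMV solution can be checked numerically. The sketch assumes constraints written as C f = b (C of shape J x n), the parametrization f = fq − Ca fa with fq the minimum-norm feasible vector, and a blocking matrix Ca obtained from the SVD of C. The sign in front of Ca fa and all numerical data are assumptions made for illustration.

```python
import numpy as np

def gsc_decompose(C, b):
    """Return fixed part f_q and blocking matrix C_a (columns span the null-space of C)."""
    f_q = C.T @ np.linalg.solve(C @ C.T, b)
    _, _, Vt = np.linalg.svd(C, full_matrices=True)
    C_a = Vt[C.shape[0]:].T                     # assumes C has full row rank J
    return f_q, C_a

rng = np.random.default_rng(2)
n, J = 8, 2
A = rng.standard_normal((300, n))
Ryy = A.T @ A / 300                             # hypothetical sample correlation matrix
C = rng.standard_normal((J, n))                 # hypothetical constraint matrix
b = np.array([1.0, 0.0])

f_q, C_a = gsc_decompose(C, b)
# Unconstrained problem in f_a: minimize (f_q - C_a f_a)^T Ryy (f_q - C_a f_a)
f_a = np.linalg.solve(C_a.T @ Ryy @ C_a, C_a.T @ Ryy @ f_q)
f_gsc = f_q - C_a @ f_a

# Closed-form LCMV solution for comparison
RinvCt = np.linalg.solve(Ryy, C.T)
f_lcmv = RinvCt @ np.linalg.solve(C @ RinvCt, b)
```

Here f_a has n − J free coefficients, and the resulting f_gsc coincides with the closed-form constrained solution; in an adaptive implementation f_a would be updated with LMS instead of being solved for directly.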
37Generalized Sidelobe Canceler
- GSC then consists of three parts:
  - Fixed beamformer (cfr. fq), satisfying constraints but not yet minimum variance, creating speech reference
  - Blocking matrix (cfr. Ca), placing spatial nulls in the direction of the speech source (at sampling frequencies) (cfr. C·Ca = 0), creating noise references
  - Multi-channel adaptive filter (linear combiner): your favourite one, e.g. LMS
38Generalized Sidelobe Canceler
- A popular GSC realization is as follows
- Note that some reorganization has been done:
the blocking matrix now generates (typically)
M-1 (instead of MN-J) noise references, the
multichannel adaptive filter performs
FIR-filtering on each noise reference (instead of
merely scaling in the linear combiner).
Philosophy is the same, mathematics are different
(details on next slide).
39Generalized Sidelobe Canceler
- Math details (for Δ's = 0)
  - select sparse blocking matrix
  - its outputs are the input to the multi-channel adaptive filter
  - use this as blocking matrix now
40Generalized Sidelobe Canceler
- Blocking matrix Ca
  - Creating (M-1) independent noise references by placing spatial nulls in look-direction
  - different possibilities (a la p.38), for broadside steering: Griffiths-Jim, Walsh
- Problems of GSC:
  - impossible to reduce noise from look-direction
  - reverberation effects cause signal leakage in noise reference
  - adaptive filter should only be updated when no speech is present!
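The two blocking matrix choices can be written out for a small array. The sketch assumes broadside steering, so that the desired signal is identical on all microphones, and uses the commonly quoted forms: adjacent-microphone differences for Griffiths-Jim and rows of ±1 summing to zero for the Walsh variant. The matrices are written with one noise reference per row, which is an assumed layout, and M = 4 is an example value.

```python
import numpy as np

def griffiths_jim(M):
    """(M-1) x M matrix of adjacent-microphone differences."""
    B = np.zeros((M - 1, M))
    idx = np.arange(M - 1)
    B[idx, idx] = 1.0
    B[idx, idx + 1] = -1.0
    return B

# Walsh-type alternative for M = 4: rows of +/-1 that each sum to zero
walsh_4 = np.array([[1, 1, -1, -1],
                    [1, -1, -1, 1],
                    [1, -1, 1, -1]], dtype=float)

look_direction = np.ones(4)        # broadside: desired signal identical on all microphones
gj_out = griffiths_jim(4) @ look_direction      # noise references for a pure look-direction signal
walsh_out = walsh_4 @ look_direction
```

Both matrices map a pure look-direction signal to zero and provide M − 1 linearly independent noise references.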