Title: Acoustic Localization by Interaural Level Difference
1 Acoustic Localization by Interaural Level Difference
2 Acoustic Localization
Acoustic localization: determining the location of a sound source by comparing the signals received by an array of microphones.
Issues:
- reverberation
- noise
3 Overview
- What is Interaural Level Difference (ILD)?
- ILD Formulation
- ILD Localization
- Simulation Results
- Conclusion and Future Work
4 Techniques
- Interaural time difference (ITD): relative time shift
- Interaural level difference (ILD): relative energy level
All previous methods (TDE, beamforming, etc.) use ITD alone.
5 Previous Work
- Time Delay Estimation: M. S. Brandstein, H. F. Silverman, ICASSP 1997; P. Svaizer, M. Matassoni, M. Omologo, ICASSP 1997
- Beamforming: J. L. Flanagan, J. D. Johnston, R. Zahn, JASA 1985; R. Duraiswami, D. Zotkin, L. Davis, ICASSP 2001
- Accumulated Correlation: Stanley T. Birchfield, EUSIPCO 2004
- Microphone arrays: Michael S. Brandstein, Harvey F. Silverman, ICASSP 1995; P. Svaizer, M. Matassoni, M. Omologo, ICASSP 1997
- Hilbert Envelope Approach: David R. Fischell, Cecil H. Coker, ICASSP 1984
6 A sneak peek at the results
Likelihood plots, estimation error, comparison of different approaches
7 ILD Formulation
8 ILD Formulation
9 ILD Formulation
10 Isocontours for $10 \log_{10}(\Delta E)$
11 ILD Localization
Why multiple microphone pairs?
- With only two microphones, the source is constrained to lie on a curve
- The microphones cannot pinpoint the sound source location
- We use multiple microphone pairs
- The intersection of the curves yields the sound source location
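The "curve" for a single microphone pair can be made concrete with a short derivation. This is a sketch under an assumption that is not stated in the surviving slide text: a free-field inverse-square model with no noise or reverberation, so the energy at microphone $i$ is proportional to $1/d_i^2$. The symbols $\mathbf{x}$ (source location), $\mathbf{m}_i$ (microphone location), $d_i$, and $k$ are illustrative names, not necessarily the presenters' notation.

```latex
% ASSUMPTION: free-field inverse-square propagation, no noise, no reverberation.
\[
  E_i \propto \frac{1}{d_i^2}, \qquad d_i = \lVert \mathbf{x} - \mathbf{m}_i \rVert
  \quad\Longrightarrow\quad
  \Delta E = \frac{E_1}{E_2} = \frac{d_2^2}{d_1^2}.
\]
% A measured ratio \Delta E = k constrains the source to the set
\[
  \lVert \mathbf{x} - \mathbf{m}_2 \rVert^2 = k \, \lVert \mathbf{x} - \mathbf{m}_1 \rVert^2 ,
\]
% which is a circle (a circle of Apollonius) for k \neq 1 and the
% perpendicular bisector of the two microphones for k = 1.
```

Under this model one pair fixes only the ratio of distances, not the position, which is why a second pair is needed to intersect two such curves.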
12 Combined Likelihood Approach
Localize the sound source by computing the likelihood at a number of candidate locations
- Define the energy ratio $\Delta E$ for a microphone pair
- Then the estimate for the energy ratio at a candidate location is computed from that location and the location of the ith microphone
- $\Delta E$ is treated as a Gaussian random variable
- Joint probability from multiple microphone pairs is computed by combining the individual log likelihoods
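The steps above can be sketched as a grid search. This is a minimal illustration, not the presenters' implementation: it assumes a free-field inverse-square model for the predicted energy ratio, compares ratios in dB, and uses a hypothetical standard deviation `sigma_db`. The microphone layout, the grid spacing, and all function names are made up for the example.

```python
import math

def predicted_ratio_db(x, mic_a, mic_b):
    """Predicted 10*log10(E_a / E_b) at candidate x under the ASSUMED
    inverse-square model (E proportional to 1/d^2, no noise, no reverberation)."""
    d_a2 = (x[0] - mic_a[0]) ** 2 + (x[1] - mic_a[1]) ** 2
    d_b2 = (x[0] - mic_b[0]) ** 2 + (x[1] - mic_b[1]) ** 2
    return 10.0 * math.log10(d_b2 / d_a2)

def log_likelihood(x, pairs, measured_db, sigma_db=1.0):
    """Sum of Gaussian log likelihoods (up to a constant) over microphone pairs.
    sigma_db is a hypothetical spread, not a value from the slides."""
    total = 0.0
    for (mic_a, mic_b), meas in zip(pairs, measured_db):
        err = meas - predicted_ratio_db(x, mic_a, mic_b)
        total += -0.5 * (err / sigma_db) ** 2
    return total

def localize(pairs, measured_db, candidates, sigma_db=1.0):
    """Return the candidate with the highest combined log likelihood."""
    return max(candidates,
               key=lambda x: log_likelihood(x, pairs, measured_db, sigma_db))

# Hypothetical layout: one microphone at the midpoint of each wall of a
# 5x5 m room, grouped into two pairs.
mics = [(0.0, 2.5), (5.0, 2.5), (2.5, 0.0), (2.5, 5.0)]
pairs = [(mics[0], mics[1]), (mics[2], mics[3])]

# Noise-free "measurements" generated from a known source position.
true_source = (1.5, 3.5)
measured_db = [predicted_ratio_db(true_source, a, b) for a, b in pairs]

# Candidate grid with 0.25 m spacing, strictly inside the room.
candidates = [(0.25 * i, 0.25 * j) for i in range(1, 20) for j in range(1, 20)]
estimate = localize(pairs, measured_db, candidates)
```

With noise-free input the search recovers the true source because the two iso-ratio curves meet only once inside the room. With noisy or reverberant input the likelihood surface flattens, so the result depends strongly on the assumed model and on `sigma_db`.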
13 Hilbert Transform
- The Hilbert transform returns a complex sequence from a real data sequence.
- The complex signal $x = x_r + i x_i$ has a real part, $x_r$, which is the original data, and an imaginary part, $x_i$, which contains the Hilbert transform.
- The imaginary part is a version of the original real sequence with a 90° phase shift.
- Sines are therefore transformed to cosines and vice versa.
14 Hilbert Transformer
In the frequency domain, $X_i(e^{j\omega}) = H(e^{j\omega}) X_r(e^{j\omega})$.
The Hilbert-transformed series has the same amplitude and frequency content as the original real data and includes phase information that depends on the phase of the original data.
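For reference, the frequency response of the ideal discrete-time Hilbert transformer is the standard textbook form below. The slide's own definition of $H$ did not survive extraction, so this is supplied as the usual convention rather than as the presenters' exact notation.

```latex
\[
  H(e^{j\omega}) =
  \begin{cases}
    -j, & 0 < \omega < \pi, \\
    +j, & -\pi < \omega < 0,
  \end{cases}
  \qquad
  \lvert H(e^{j\omega}) \rvert = 1 .
\]
% Unit magnitude with a -90 degree phase shift at positive frequencies, so
\[
  \cos(\omega_0 n) \;\mapsto\; \sin(\omega_0 n),
  \qquad
  \sin(\omega_0 n) \;\mapsto\; -\cos(\omega_0 n).
\]
```

The unit magnitude is why the amplitude and frequency content are preserved, and the constant phase shift is why only the phase changes.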
15 Hilbert Envelope Approach
- An all-pass filter circuit produces two signals with equal amplitude but 90 degrees out of phase.
- The square root of the sum of squares is taken.
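The envelope computation can be sketched in a few lines. Note the swap: instead of an all-pass filter circuit, this example builds the 90-degree-shifted signal with a DFT-based analytic signal, the same idea used by FFT-based routines such as `scipy.signal.hilbert`. It uses a naive DFT from the standard library so that it runs without dependencies. The function names and the test tone are illustrative.

```python
import cmath
import math

def analytic_signal(x):
    """Analytic signal x_r + j*x_i of a real sequence via a naive O(N^2) DFT.
    Recipe: keep DC (and Nyquist for even N), double the positive
    frequencies, zero the negative frequencies, then inverse-transform."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    weights = [0.0] * n
    weights[0] = 1.0
    for k in range(1, (n + 1) // 2):
        weights[k] = 2.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
    return [sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def hilbert_envelope(x):
    """Envelope = sqrt(x_r^2 + x_i^2), the magnitude of the analytic signal."""
    return [abs(z) for z in analytic_signal(x)]

# A unit-amplitude cosine with exactly 4 cycles in 64 samples.
n = 64
tone = [math.cos(2 * math.pi * 4 * t / n) for t in range(n)]
z = analytic_signal(tone)
envelope = hilbert_envelope(tone)
```

For this tone the imaginary part of `z` is the matching sine and the envelope is constant at the tone's amplitude, which is the behavior the two bullets describe.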
16 Simulated Room
17 Simulation Results
The algorithm
- accurately estimates the angle to the sound source in some scenarios
- exhibits bias toward far locations (unable to reliably estimate the distance to the sound source)
- is sensitive to noise and reverberation
18 Results of $\Delta E$ Estimation
- The estimation is highly dependent upon the
- sound source location
- amount of reverberation
- amount of noise
- size of the room
- relative positions of source and microphones
19 Likelihood plots
20 Likelihood plots
21 Likelihood plots
22 Likelihood plots
23 Likelihood plots
24 Angle Errors in a 5x5 m room
25 Angle error in degrees for the 5x5 m room when the source is at a distance of 1 m
Angle error in degrees for the 10x10 m room when the source is at a distance of 1 m
26 Angle error in degrees for the 5x5 m room when the source is at a distance of 2 m
Angle error in degrees for the 10x10 m room when the source is at a distance of 2 m
27 Comparison of errors with the Hilbert Envelope Approach in a 5x5 m room
28 Comparison of errors with the Hilbert Envelope Approach in a 10x10 m room
Plot labels: 20 dB, 10 dB, 0 dB
Reflection coefficient: 0.7, 0.8, 0.9
Legend:
- Without Hilbert: solid line, blue
- Matlab Hilbert: dotted line, red
- Kaiser Hilbert: dashed line, green
29 Likelihood plots without Hilbert Envelope
30 Frames approach
5x5 m room, theta = 18 deg, SNR = 0 dB, reflection coefficient = 0.9, d = 2 m (left), d = 1 m (right)
Mean error 15 deg, Std Dev 11 deg
Mean error 7 deg, Std Dev 6 deg
- Signal divided into 50 frames
- Frame size 92.8 ms
- 50% overlap between frames
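The framing described above can be sketched as follows. With 50% overlap the hop is 46.4 ms, so 50 frames of 92.8 ms span 92.8 + 49 × 46.4 = 2366.4 ms of signal. The sample rate of 44.1 kHz is an assumption (the slides give only the frame duration), and the function names and test tone are illustrative. Per-frame energy stands in here for whatever per-frame estimate is actually aggregated.

```python
import math

def split_into_frames(signal, frame_len, overlap=0.5):
    """Split a sequence into fixed-length frames. Consecutive frames share
    `overlap` (a fraction in [0, 1)) of their samples. A trailing partial
    frame is dropped."""
    hop = int(round(frame_len * (1.0 - overlap)))
    if hop < 1:
        raise ValueError("overlap too large for this frame length")
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def mean_and_std(values):
    """Mean and population standard deviation of per-frame values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)

# ASSUMED sample rate; only the 92.8 ms frame size comes from the slides.
sample_rate = 44100
frame_len = round(0.0928 * sample_rate)   # 4092 samples at 44.1 kHz
hop = frame_len // 2                      # 50% overlap
n_frames = 50
n_samples = frame_len + (n_frames - 1) * hop

# Hypothetical test signal: a 440 Hz tone just long enough for 50 frames.
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(n_samples)]
frames = split_into_frames(signal, frame_len, overlap=0.5)

# One value per frame, then summary statistics across frames.
energies = [sum(s * s for s in frame) for frame in frames]
mean_energy, std_energy = mean_and_std(energies)
```

Running the localizer once per frame and summarizing the per-frame angle errors in the same way would yield mean and standard deviation figures of the kind reported on this slide.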
31 Conclusion and Future Work
- ILD is an important cue for acoustic localization
- Preliminary results indicate potential for ILD (the algorithm yields accurate results for several configurations, even with noise and reverberation)
- Future work
- Investigate issues (e.g., bias toward distant locations, sensitivity to reverberation)
- Experiment in real environments
- Investigate ILDs in the case of occlusion
- Combine with ITD to yield more robust results
32 Thank You