Title: Frequency response adaptation in binaural hearing
1Frequency response adaptation in binaural hearing
- David Griesinger
- Cambridge MA USA
- www.DavidGriesinger.com
2Introduction
- This paper poses fundamental questions about the properties of human hearing (the topic of this conference):
- How do we localize sounds in the up/down and front/back planes?
- Are the methods used different for different individuals?
- Can binaural recordings made for one individual be made to work for another individual without head-tracking?
- Given the extremely non-uniform transfer of sound pressure from the soundfield to a human eardrum, how can we accurately perceive a frequency balance as flat?
- Does frequency-balanced pink noise from a frontal loudspeaker sound balanced in frequency?
- If not, are commercial recordings, which are equalized using loudspeakers, actually frequency balanced?
- If not, in what ways are they biased?
3Better binaural technology
To answer these questions the author constructed
an accurate physical model of his own hearing,
all the way to the eardrum. The pinna compliance
is modeled by cutting away the inside of the
casting. The eardrum impedance is modeled with a
resistance tube. The ear canal is an accurate
silicone cast all the way to the eardrum.
Tiny probe microphones with a very soft tip were
also built. This allows binaural recording of
performances at the author's eardrums, correct
headphone calibration, and verification of the
accuracy of the dummy head model.
4A perplexing Discrepancy
- Recordings made with this technology provide excellent localization accuracy.
- But at least initially the timbre of the playback through carefully calibrated headphones seems incorrect.
- The frequencies around 3kHz seem too strong, and the bass is usually weaker than my memory of the performance.
- Checking and re-checking the calibrations has convinced me the recordings and the playback are correct.
- It is my memory of the performance that is flawed.
- The most reasonable explanation is that we continuously adapt to the frequency balance of sounds around us. We remember the timbre after such adaptation has taken place.
5A simple model of human hearing
Over a long period of time the brain builds
spectral maps for the features that define
up/down and back/front information in HRTFs.
When a sound is heard these features are compared
to the maps, and a localization is found.
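
The comparison step described above can be sketched as a nearest-template search. This is only an illustration: the least-squares distance, the four-band spectra, and the direction labels are all assumptions made here, not something the model specifies.

```python
import math

def match_direction(observed_db, spectral_maps):
    """Return the direction whose stored map is closest to the observed spectrum.

    Spectra are lists of band levels in dB. The mean level is removed first
    so that overall loudness does not affect the match.
    """
    def centered(levels):
        mean = sum(levels) / len(levels)
        return [v - mean for v in levels]

    obs = centered(observed_db)
    best_direction, best_error = None, math.inf
    for direction, template in spectral_maps.items():
        ref = centered(template)
        error = sum((o - r) ** 2 for o, r in zip(obs, ref))
        if error < best_error:
            best_direction, best_error = direction, error
    return best_direction

# Hypothetical four-band maps (dB); the values are illustrative only.
maps = {
    "front": [0.0, 3.0, -2.0, 1.0],
    "above": [0.0, 1.0, 4.0, -6.0],
    "behind": [0.0, -2.0, -5.0, -3.0],
}
heard = [10.5, 11.2, 13.8, 4.1]  # close to "above", about 10 dB louder
print(match_direction(heard, maps))
```

Removing the mean level before comparing is a design choice of this sketch, so that the match depends on spectral shape rather than loudness.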
6A simple model of human hearing-2
When a match has been found, the perceptible
features of the particular HRTF are removed,
again from a fixed spectral map. But this
spectrum is altered by a relatively short time
constant adaptive equalizer, which acts to make
all frequency bands equally perceived. The time
constant of this mechanism for the author is
about 5 minutes. It may be shorter for some
individuals.
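
A minimal sketch of such an adaptive equalizer, assuming a first-order (exponential) adaptation of per-band gains toward equal band levels. The 300 s time constant is the author's estimate of about 5 minutes; the band layout, the update rule, and the function names are assumptions for illustration.

```python
import math

def adapt_gains(band_levels_db, gains_db, dt_s, time_constant_s=300.0):
    """One update of a slow per-band equalizer.

    Each band gain drifts toward the value that would bring that band to
    the mean level across bands, with a first-order time constant.
    """
    alpha = 1.0 - math.exp(-dt_s / time_constant_s)
    mean_level = sum(band_levels_db) / len(band_levels_db)
    return [g + alpha * ((mean_level - level) - g)
            for level, g in zip(band_levels_db, gains_db)]

def perceived(band_levels_db, gains_db):
    """Band levels after the adaptive gains are applied."""
    return [level + g for level, g in zip(band_levels_db, gains_db)]

levels = [0.0, 0.0, 6.0, 0.0]   # steady sound with a 6 dB peak in band 3
gains = [0.0] * 4
for _ in range(300):            # 300 one-second steps = one time constant
    gains = adapt_gains(levels, gains, dt_s=1.0)
print([round(v, 2) for v in perceived(levels, gains)])
```

In this sketch the perceived peak shrinks by a factor of 1/e after one time constant. Feeding a flat spectrum through the adapted gains then yields a dip in band 3, which corresponds to the complementary coloration that remains after adaptation.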
7An example
- The author once noticed a gliding whistle while walking under an overhead ventilator slot that emitted broadband noise.
- Walking rapidly (3.5mph) under that noise source produced a gliding whistle, somewhat like a Doppler shift.
- This is the uncorrected sound of the vertical HRTFs.
- In spite of the lack of timbre correction, the sound was correctly localized even at much higher speeds.
- No timbre shift was perceived when walking slowly under the slot.
- When there is sufficient time our brains correct the timbre, but this correction takes time: in this case, a fraction of a second.
8Headphone listening
When we listen to binaural recordings with
headphones the whole process is broken.
Headphones match individuals very poorly (as we
will see). None of the spectral features match
the fixed HRTF maps. The brain is confused, and
the subject perceives the sound inside the
head. But the adaptive equalizer is still active
and after a time period the sound is perceived
as frequency balanced.
9Consequences of adaptation for sound engineers.
- Tonmeisters talk about being familiar with a particular loudspeaker or studio.
- They claim they can make an accurately balanced recording with these tools.
- A logical conclusion is that the timbre of loudspeakers or playback equipment is irrelevant.
- As long as you are familiar with it everything is fine.
- But the conclusion is clearly false.
- A recent book by Floyd Toole details the changes in the frequency content of popular records as fashion in monitor loudspeakers changed.
- All sound reinforcement engineers are aware of how much intelligibility can increase when a sound system is equalized. This typically involves a treble boost above 1000Hz.
- Absolute frequency balance matters.
10Upward Masking
Sound enters the basilar membrane at the oval window.
High frequencies excite the membrane near the
entrance, passing through it and exiting through
the second window below. Low frequencies travel
further down the spiral, until they excite the
membrane and pass through. Strong low frequencies
disturb the high frequency portion of the
membrane, causing the well-known phenomenon of
upward masking.
Upward masking is a purely mechanical effect, and
it cannot be compensated by adaptive
equalization. The high frequencies are simply
not detected. Intelligibility is frequently low
in acoustic spaces because there is little low
frequency absorption, and the LF acoustic power
is boosted. We adapt to the frequency imbalance,
and say the sound is OK but unintelligible.
11Upward masking and mixing
- A consequence of upward masking is that elements in a mix that are audible in one studio or set of loudspeakers may be masked in another.
- Recordings mixed over headphones can be seriously in error.
- Most headphones boost the treble, raising the apparent clarity.
- As an engineer I learned early to mistrust the balance between direct sound and reverberation over headphones.
- The best I could do was make the recording much dryer than I, or my clients, preferred and hope for the best.
- One can always make the recording more reverberant.
- Making it dryer is much more difficult!
- Can we find a way to correct headphone errors?
12Accurate binaural recordings
If safe, comfortable probe microphones are
available, it is possible to make accurate
binaural recordings. First we measure the
headphone response at the eardrum, H.
We can then record with the same probe
microphones. If we equalize the recording with
the inverse of H, H^-1, the recording will play
back with perfect fidelity.
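
A short worked form of this scheme, as a sketch. The symbols P (eardrum pressure spectrum at the performance), M (probe microphone response) and H_e (true headphone-to-eardrum response) are assumptions introduced here for illustration; only H appears in the text above.

```latex
\begin{align}
H &= M\,H_e && \text{(headphone response as measured by the probe)}\\
R &= M\,P && \text{(recording made with the same probe)}\\
R\,H^{-1} &= \frac{P}{H_e} && \text{(equalized recording)}\\
\left(R\,H^{-1}\right) H_e &= P && \text{(pressure at the eardrum on playback)}
\end{align}
```

Under these assumptions the probe response M drops out of the result, so the original eardrum pressure is restored without equalizing the probe itself.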
13Playback of binaural over speakers
If we want to play back the binaural recording
over speakers, or if we want to play loudspeaker
music over headphones, we need to measure the
spectrum of a carefully equalized loudspeaker at
the eardrums of the listener. This is the
spectrum S. We then equalize the binaural
recording with the inverse of S, S^-1, and we can
play it over speakers. Equalizing the phones with
H^-1 S (the inverse of H, times S) allows
playback of both binaural and loudspeaker-mixed
music. H^-1 S is the inverse of the free-field
earphone response.
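
The bookkeeping can be checked numerically, bin by bin. This sketch reads the two equalizations as the inverse of S for loudspeaker playback and the inverse of H times S for the phones, which is an interpretation of the slide. The three-bin responses and all names are hypothetical.

```python
def equalize(spectrum, eq):
    """Apply an equalizer to a spectrum, bin by bin."""
    return [x * e for x, e in zip(spectrum, eq)]

def inverse(response):
    """Exact per-bin inverse of a response with no zeros."""
    return [1.0 / r for r in response]

# Hypothetical three-bin complex responses, illustrative only.
H = [2.0 + 0.0j, 0.5 + 0.5j, 1.0 - 1.0j]       # headphone -> eardrum
S = [1.0 + 0.0j, 3.0 + 0.0j, 0.0 + 2.0j]       # frontal loudspeaker -> eardrum
source = [1.0 + 0.0j, 1.0 + 0.0j, 1.0 + 0.0j]  # loudspeaker-mixed music

binaural = equalize(source, S)                 # what the eardrum picked up live
for_speakers = equalize(binaural, inverse(S))  # binaural material with S removed
phone_eq = equalize(inverse(H), S)             # inverse of H, times S

# Loudspeaker-mixed music over the equalized phones arrives at the eardrum
# as source * S, the spectrum a frontal loudspeaker would have produced.
at_eardrum = equalize(equalize(source, phone_eq), H)
print(all(abs(a - b) < 1e-12 for a, b in zip(at_eardrum, binaural)))
```

A binaural recording that has already been equalized for loudspeakers passes through the same phone equalization and arrives at the eardrum as the original eardrum spectrum. This is consistent with one headphone equalization serving both kinds of material.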
14Binaural equalization in practice
- Note the two previous slides made no attempt to equalize the probe microphone(s).
- With those schemes, the response of the probe cancels in the final result.
- In practice, the probe response is complicated and difficult to invert.
- The author carefully measures the impulse response of the probes with a B&K 4133 as a reference.
- The responses are inverted in the frequency domain with Matlab. With care, minimal pre-echo is produced.
- All measurements with the probes are first convolved with this inverse function.
- Second-order parametric filters are combined to produce the other equalization filters.
- Parametric filters can be easily inverted, and sound better than mathematical inverse filters to the author.
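
To illustrate why parametric sections are easy to invert, here is a sketch using the widely used "Audio EQ Cookbook" peaking biquad. The author's actual filter design is not given, so this form is an assumption. With this design, a section with the same centre frequency and Q and the gain negated is the exact inverse.

```python
import cmath
import math

def peaking_biquad(f0_hz, gain_db, q, fs_hz):
    """Second-order peaking filter coefficients (b, a), Audio EQ Cookbook form."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * f0_hz / fs_hz
    alpha = math.sin(w0) / (2.0 * q)
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    return b, a

def response(b, a, f_hz, fs_hz):
    """Complex frequency response of the biquad at f_hz."""
    z1 = cmath.exp(-2j * math.pi * f_hz / fs_hz)  # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return num / den

fs = 48000.0
boost = peaking_biquad(3000.0, +6.0, 1.4, fs)   # hypothetical 6 dB peak at 3 kHz
cut = peaking_biquad(3000.0, -6.0, 1.4, fs)     # same f0 and Q, gain negated

for f in (100.0, 1000.0, 3000.0, 8000.0):
    product = response(*boost, f, fs) * response(*cut, f, fs)
    print(f, round(abs(product), 6))
```

Negating the gain swaps the numerator and denominator coefficients, so the cascade of the two sections is flat in both magnitude and phase.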
15Probe Equalization
This graph shows the frequency response and time
response of the digital inverse of the two probes
as measured against a B&K 4133 microphone. Matlab
is used to construct the precise digital inverse
of the probe response, both in frequency and in
time. The resulting probe response is flat from
25Hz to 17kHz. In general, I prefer NOT to use
a mathematical inverse response, as these
frequently contain audible artifacts. I
minimized these artifacts here by carefully
truncating the measured response as a function of
frequency.
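
One common way to keep a mathematical inverse from producing extreme gains is regularization. This is not the truncation procedure described above. It is a different, swapped-in technique shown only to illustrate the general problem, and the constant `beta` and the sample response are assumptions.

```python
def regularized_inverse(response, beta=1e-2):
    """Per-bin inverse conj(H) / (|H|^2 + beta).

    Where |H|^2 is much larger than beta this approaches 1/H. Where the
    measured response is weak, the correction gain stays bounded by
    1 / (2 * sqrt(beta)) instead of growing without limit.
    """
    return [h.conjugate() / (abs(h) ** 2 + beta) for h in response]

# Hypothetical probe response: two healthy bins and one deep notch.
probe = [1.0 + 0.0j, 0.5 - 0.5j, 0.001 + 0.0j]
inv = regularized_inverse(probe)
for h, g in zip(probe, inv):
    print(round(abs(h * g), 4), round(abs(g), 4))
```

The healthy bins are corrected to nearly unity, while the notch is left largely uncorrected rather than boosted by a factor of 1000.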
16Adaptive Timbre: how do we perceive pink noise
as flat?
- Pink noise sounds plausibly pink even on this sound system.
- Let's add a single reflection and listen for a few minutes without other sounds.
- The result at first sounds colored, with an identifiable pitch component.
- The pitch component gradually reduces its loudness.
- But now play the unaltered noise again.
- The unaltered noise now has a pitch, complementary to the pitch from the reflection.
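
The coloration from a single reflection can be sketched as a comb filter. The 2 ms delay and the reflection amplitude of 0.7 are hypothetical values; the demonstration does not state them.

```python
import cmath
import math

def comb_gain_db(f_hz, delay_s, reflection_gain):
    """Level of direct sound plus one delayed copy, relative to direct alone."""
    h = 1.0 + reflection_gain * cmath.exp(-2j * math.pi * f_hz * delay_s)
    return 20.0 * math.log10(abs(h))

delay = 0.002          # hypothetical 2 ms reflection delay
g = 0.7                # hypothetical reflection amplitude relative to direct
spacing = 1.0 / delay  # spectral peaks repeat at multiples of this frequency

for f in (250.0, 500.0, 750.0, 1000.0):
    print(f, round(comb_gain_db(f, delay, g), 2))
```

Peaks fall at multiples of 1/delay and notches halfway between them. An equalizer that has adapted to flatten this spectrum would apply the opposite curve, so unaltered noise heard through it would have peaks where the notches were. That is one way to read the complementary pitch described above.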
17Some demos of eardrum recordings
- These recordings have been equalized for loudspeaker reproduction. You may be able to judge clarity and intelligibility over near-field loudspeakers.
- Accurate headphone reproduction requires headphone equalization.
- If probes are available, the method described here will work.
- A method which uses equal loudness curves will be described later in this paper.
- Opera, balcony 2, seat 11: moderate intelligibility, reverberant sound.
- Opera, balcony 3, seat 12: poor intelligibility, very reverberant.
- Opera, standing room, deep under balcony 2: good intelligibility.
- A concert hall, row 8 (quite close): very good sound. Not so good further back.
18The need for eardrum measurements
- Almost all current binaural research uses HRTF and headphone measurements made with a blocked or partially blocked ear canal.
- There is an assumption (without proof) that such measurements accurately reproduce the sound pressure at the eardrum.
- The assumption is blatantly false. To quote Hammershøi and Møller:
- "The most immediate observation is that the variation in sound transmission from the entrance of the ear canal to the eardrum from subject to subject is rather high... The presence of individual differences has the consequence that for a certain frequency the transmission differs as much as 20dB between subjects."
- 20dB is a significant difference in response!
- In spite of the data, Hammershøi and Møller recommend using measurements at the entrance to the ear canal!
- The recommendation can be disproved by a single subject.
19HRTFs from blocked ear canals
Here are pictures of a partially blocked canal
and a fully blocked canal. The following data
applies to the fully blocked measurements, but
the partially blocked measurements are similar.
20Blocked measurements vs eardrum
- To compare the two measurement methods, I equalize the blocked measurement of a single HRTF to the same HRTF measured at the eardrum. I chose the HRTF at azimuth 15 degrees left, and 0 degrees elevation.
- The needed equalization requires at least 3 parametric sections.
- Red is the right ear, blue is the left ear.
21HRTF differences blocked to eardrum
Twenty different HRTFs were measured with a
blocked canal and equalized by the above EQ, and
the differences between them and the open ear
canal measurements are plotted. This data
supports Hammershøi and Møller's contention that
the directional properties of the measured HRTFs
are preserved by the blocked measurement, at
least to a frequency of 7kHz. Note the vertical
scale is ±30dB. The errors at 7-10kHz are
significant.
22Headphone response differences
Using the same method, I measured three
headphones. Blue is the AKG 701, red is the AKG
240, and cyan is the Sennheiser 250. The curves
plot the difference between the blocked and
unblocked measurement, with the measured HRTF at
azimuth 15, elevation 0 as a reference. The
vertical scale is ±30dB. Errors of at least
10dB exist at midband.
23More headphones
Blue: an old but excellent noise-protection
earphone by Sharp. Red: iPod earbuds. The
errors in the blocked measurements are large
enough to prevent accurate localization of
binaural recordings.
24Analysis
- The previous curves are NOT the frequency response of the headphones under test. They show the ERRORS that occur when a blocked ear canal measurement is used instead of the eardrum pressure.
- Because the scale of the plots is ±30dB, the difference curves look better than they really are. Errors of 10dB in frequency ranges vital for timbre are present for almost all the examples shown.
- We can conclude that it is possible to use recordings from dummy heads that lack accurate ear canals IF AND ONLY IF it is possible to equalize them, either by comparison to a reference with ear canals, or by equalizing them to sort-of flat for a frontal sound source. If this is done, we must also equalize the headphones at the eardrum for the same source.
- We can with more assurance conclude that it is NOT possible to equalize headphones with a measurement system that does NOT include an accurate ear canal model.
- Neither KEMAR nor HATS qualifies.
- Measurement systems with true ear canals are a very good thing.
- In addition, I have found that for many earphones it is vital to have a pinna model with identical compliance to a human ear.
- On-ear headphones in particular alter the concha volume, and drastic changes in the frequency response can result if the compliance is not accurate.
- Pinnae are complex structures with variable compliance, so this is tricky!
25Headphone calibration through equal loudness
contours
- There is a non-invasive method of headphone calibration to an individual.
- IEC publication 268-7 and German Standard DIN 45-619 recommend loudness comparison using 1/3 octave noise instead of physical measurement for headphones.
- These recommendations were superseded by diffuse field measurements as suggested by Theile.
- Should these methods be revived? I believe the answer is yes.
26Equal Loudness
Top: ISO equal loudness curves for 80dB and 60dB
SPL. These are the average from many individuals,
so features in them are broadened. Bottom
(blue/red): averaged frontal response over a ±5
degree cone in front of the author, measured at
the eardrums. The loudspeaker was equalized to
200Hz. Bottom (black/cyan): the same
measurement for the author's dummy head with no
equalization. The difference in eardrum impedance
above 8kHz boosts the response of the dummy, but
this can be removed by equalization.
27Equal Loudness 2
- We can measure equal loudness curves because the ear does not adapt when the stimulus is narrow band, either noise or tone.
- The differences between the top and bottom curves in the previous slide can be attributed to the properties of the middle ear and the inner ear.
- Thus equal loudness curves are a method of measuring the effective frequency response of an individual's hearing system in the absence of short-term adaptation to the environment.
- They represent our sensitivity to timbre in a quiet environment, or before adaptation takes place.
- Their extreme lack of flatness is proof of the existence, and effectiveness, of adaptation.
28Loudness matching experiments
- The author wrote a Windows program that presents a subject with alternating bands of 1/3 octave noise, one at 500Hz, and the other at a test frequency.
- The subject matches the loudness of the two bands by adjusting the test band up and down.
- In use, the equal loudness curves from 500Hz to 12kHz for a carefully equalized frontal loudspeaker are obtained for this subject.
- The subject then repeats the experiment with a pair of headphones over a frequency range of 30Hz to 12kHz.
- In this case the balance between the two ears is also tested and corrected.
- The difference of the loudspeaker and headphone measurements becomes the ideal headphone correction for this individual.
- This program can be used to test the variation in response of a particular headphone over a wide range of individuals.
- Subjects report that the resulting equalization is very pleasant, and binaural recordings made with the author's ears reproduce well without head tracking.
- Music recorded for loudspeakers is judged identical in timbre in both the headphones and the loudspeaker.
- The equalization is also identical in timbre to a large high-quality stereo sound system.
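
The difference step can be sketched as follows. The sign convention is an assumption: each table holds the drive level in dB that the subject set for a band to match the 500Hz reference, so a band that needed more level over headphones than over the loudspeaker receives a positive correction. The band values are made up for illustration.

```python
def headphone_correction(speaker_settings_db, phone_settings_db):
    """Per-band headphone EQ in dB for the bands measured in both sessions.

    Each argument maps a band centre frequency (Hz) to the level the subject
    set for that band to sound as loud as the 500 Hz reference band.
    """
    common = sorted(set(speaker_settings_db) & set(phone_settings_db))
    return {f: phone_settings_db[f] - speaker_settings_db[f] for f in common}

# Hypothetical matching results, illustrative only.
speaker = {500: 0.0, 1000: 1.0, 3150: -6.0, 8000: 4.0}
phones = {63: 12.0, 500: 0.0, 1000: 2.0, 3150: -1.0, 8000: 1.0}
print(headphone_correction(speaker, phones))
```

Bands measured only over headphones (here 63Hz) are left out, because the loudspeaker session supplies no reference for them in this sketch.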
29Results for 10 individuals
About 10 students from Helsinki University
participated in the test. The top left graph
shows the equal loudness contours from the
loudspeaker for each subject. The other curves
show the difference between this curve and the
equal loudness curves for four different
headphones. It was hoped that the Stax 303 phones
would show less individual variation. This was
not the case. (Blue: left ear; red: right ear;
cyan: author's left ear.)
The Philips phones were an insert type. These
also showed large variation among individuals.
30The dip at 3kHz for all subjects
- All subjects show a dip in the loudspeaker equal loudness curve at 3kHz.
- This corresponds to a universal peak in the response of the concha and ear canal at this frequency.
- It is this ear sensitivity peak that causes the most trouble with our memory of timbre.
- When we first play an accurately calibrated binaural recording, particularly of a speaking voice or a chorus, this peak in the loudness is highly noticeable and unpleasant.
- Once we adapt, everything is OK again.
31Comments on these results.
- The experiment is equivalent to equalizing headphones for a frontal, free-field response.
- This is at variance with the current standard for diffuse field equalization.
- In the author's experience the free-field equalization is far more useful than the diffuse field equalization, and gives better results on loudspeaker-recorded music.
- These recordings are intended to be heard in a room where the direct sound is frontal, and dominant.
- After doing the experiment the subjects were given the opportunity to listen to music both with the frontal equalization and with their own equal loudness equalization (the speaker curves were not subtracted).
- The author's binaural recordings were perceived with better localization with the free-field equalization. (These recordings were equalized for free-field reproduction.)
- Many subjects preferred their own equal loudness equalization for other material.
- This equalization requires no adaptation to a recording that has an accurately flat frequency response.
- The sound can be quite seductive.
32Some Speculation
- Equal loudness curves have two prominent features: the increase in sensitivity around 3kHz, and the decrease in sensitivity at low frequencies.
- Music that has been recorded with frequency-linear microphones and not post-processed often seems lacking in bass and harsh in the midrange, both on loudspeakers and on eardrum-equalized headphones.
- The author speculates that an unconscious collusion between loudspeaker designers and recording engineers routinely boosts the bass, and tweaks the 3kHz region, on commonly available recordings.
- It is common to boost the bass 10dB at 60Hz in automobiles.
- Floyd Toole's finding that the loudspeakers that are closest to frequency linear are preferred in blind listening tests may be biased by the choice of recordings used in the tests.
- The spectrum of choral music in the author's unprocessed recordings shows a 3dB peak around 3kHz.
- This peak is generally absent in vocalists on pop music. Perhaps they use a different singing technique, and perhaps the equalization has been adjusted closer to an equal-loudness curve.
33Conclusions
- Experiments and observation suggest that human hearing uses a combination of fixed spectral maps to perceive the localization of a sound, and then corrects the HRTF timbre with a similar map.
- These fixed maps are combined with a relatively rapid AGC system that tends to equalize loudness across frequency bands.
- The existence of equal loudness curves shows that for narrow band signals adaptation does not take place. When a new, unknown broadband signal is first heard, the ear hears the timbre that reflects the equal loudness calibration. But this timbre is replaced in a short time with a more balanced timbre, and this balanced timbre is remembered.
- It is likely that, given the opportunity to equalize a recording to their own taste using loudspeakers with a flat frequency response, recording engineers will be sorely tempted to move toward their own equal loudness curve.
- The temptation is dangerous but probably harmless. We can see that individual loudness curves can be rather different, particularly at low frequencies.
- But adaptation will continue to work when the recording is played back, and if the response does not match that of the listener, they will soon not notice the difference.