Title: Introduction to Psychoacoustics
1Introduction to Psychoacoustics
- By Carolina Figueroa
- University of Idaho
- Advanced Human Factors
2- Spatial sound is 3D (like vision)
- Azimuth cues: left to right
- difference in the times at which sound waves arrive at the 2 ears
- Elevation cues: up or down
- spectral changes produced by the outer ear (or pinna)
- Distance cues: near or far
- Applications
- computer games,
- aids for the vision impaired,
- virtual reality systems,
- eyes-free displays for pilots and air-traffic controllers,
- spatial audio for teleconferencing and shared electronic workspaces
3- Generating 3D sounds
- 3D sound: loudspeakers at many different positions (expensive).
- 2 ears, 2 channels: it is possible to generate 3D sound with 2 channels (binaural approach using HRTFs)
- HRTF: Head-Related Transfer Function
- Function of the location of the source relative to the listener
- Captures spectral changes caused by the torso, head, and outer ears (physical size and shape of the listener) when a sound wave propagates from the sound source to the listener's ears. These changes depend on the azimuth, elevation and range (distance) from listener to source.
- A sound signal filtered by an accurate HRTF and sent to the 2 ears of the listener (headphones) is experienced as 3D sound
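The filtering step can be sketched in a few lines. This is a minimal illustration, assuming the HRTF pair is available in the time domain as two impulse responses. The names `convolve`, `render_binaural`, `hrir_left` and `hrir_right` are hypothetical, and the impulse responses are toy values, not measured data.

```python
def convolve(signal, impulse_response):
    """Direct-form convolution: y[n] = sum over k of h[k] * x[n - k]."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with a left/right impulse-response pair."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy impulse responses (assumed, NOT measured): the source is on the right,
# so the left ear receives a delayed and attenuated copy of the signal.
hrir_right = [1.0, 0.0, 0.0, 0.0]
hrir_left = [0.0, 0.0, 0.0, 0.5]  # 3 samples later, half the amplitude

mono = [1.0, -0.5, 0.25]
left, right = render_binaural(mono, hrir_left, hrir_right)
```

The two output lists would be sent to the left and right headphone channels. A real system would use measured impulse responses and an optimized (FFT-based) convolution instead of the double loop.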
4- Major factors that influence spatial hearing
- Coordinate system
- Azimuth cues
- Elevation cues
- Range cues
- Reverberation and Echoes
5- Planes: XY horizontal, XZ frontal, YZ median
- Spherical coordinates
- Azimuth: angle over from the median plane (right, left)
- Elevation: angle from the horizontal plane to the plane through the source and the X axis (up, down)
- Range: distance to the source (near, far)
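As a rough sketch, these coordinates can be converted to Cartesian form as follows. It assumes azimuth is measured over from the median plane and elevation is measured from the horizontal plane about the X axis. The axis directions (+X toward the right ear, +Y straight ahead, +Z up) and the function name are assumptions, not taken from the slide.

```python
import math


def interaural_polar_to_cartesian(azimuth_deg, elevation_deg, range_m):
    """Convert (azimuth, elevation, range) to (x, y, z).

    Assumed axes: +X toward the right ear, +Y straight ahead, +Z up.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = range_m * math.sin(az)                 # offset from the median plane
    y = range_m * math.cos(az) * math.cos(el)  # forward component
    z = range_m * math.cos(az) * math.sin(el)  # upward component
    return x, y, z


ahead = interaural_polar_to_cartesian(0.0, 0.0, 1.0)      # on the +Y axis
to_right = interaural_polar_to_cartesian(90.0, 0.0, 1.0)  # on the +X axis
overhead = interaural_polar_to_cartesian(0.0, 90.0, 2.0)  # on the +Z axis
```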
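6- Azimuth cues (binaural cues)
- Duplex Theory: there are two primary cues for azimuth, Interaural Time Difference (ITD) and Interaural Level Difference (ILD).
- ITD: $\mathrm{ITD} = \frac{a}{c}(\theta + \sin\theta)$, for $-90^\circ \le \theta \le +90^\circ$
- ITD is maximal when the source is off to one side: $\mathrm{ITD}_{\max} = \frac{a}{c}\left(\frac{\pi}{2} + 1\right)$
- ITD = 0 when the sound is directly ahead
- $\theta$: azimuth angle; $a$: head radius (the formula assumes a distant source); $c$: speed of sound, 343 m/s
- ILD (frequency dependent)
- Low frequencies (< 1.5 kHz): almost no difference in sound pressure at the 2 ears
- High frequencies (> 1.5 kHz): difference of 20 dB or greater (head-shadow effect)
- Up to about 0.7 ms difference between the arrival times at the right ear and the left ear for a sound in the horizontal plane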
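A quick numerical check of the ITD formula, as a sketch. The head radius of 0.0875 m is an assumed typical value that the slide does not give. The speed of sound is the slide's 343 m/s. The function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, value given in the slide
HEAD_RADIUS = 0.0875    # m, assumed typical head radius (not from the slide)


def itd_seconds(azimuth_deg, a=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """ITD = (a / c) * (theta + sin(theta)) for -90 <= theta <= +90 degrees."""
    if not -90.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must lie between -90 and +90 degrees")
    theta = math.radians(azimuth_deg)
    return (a / c) * (theta + math.sin(theta))


itd_ahead = itd_seconds(0.0)  # source directly ahead
itd_side = itd_seconds(90.0)  # source off to one side (maximum ITD)
```

With the assumed head radius, the maximum comes out a little under 0.7 ms, which is in line with the figure quoted on the slide.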
7- Figure: azimuth and altitude (elevation)
8- Elevation cues (monaural)
- The outer ear, or pinna, acts as an acoustic antenna
- Its resonant cavities amplify some frequencies and its geometry attenuates others
- Its response is directionally dependent
- 2 paths from the source to the ear canal: a direct path and a longer path
- Low frequencies: the pinna collects additional sound energy and the 2 paths arrive in phase
- High frequencies: the long path is out of phase with the direct signal and destructive interference occurs
- Greatest interference when the path-length difference is half a wavelength, producing a pinna notch
- Pronounced pinna notch for sources ABOVE
- The path difference changes with elevation, so the pinna notch moves with elevation
- Pinna shape varies from person to person, causing shifts in the notch frequencies, so elevation is harder to control (individual HRTFs)
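The half-wavelength condition gives a simple estimate of the notch frequency: if the two paths differ in length by d, the first notch is at f = c / (2d). The sketch below uses hypothetical path differences chosen only for illustration, and an assumed function name.

```python
SPEED_OF_SOUND = 343.0  # m/s


def notch_frequency_hz(path_difference_m, c=SPEED_OF_SOUND):
    """Frequency at which the path difference equals half a wavelength."""
    if path_difference_m <= 0:
        raise ValueError("path difference must be positive")
    return c / (2.0 * path_difference_m)


# Hypothetical path differences (not measured pinna data).
notch_long = notch_frequency_hz(0.017)   # roughly 10 kHz
notch_short = notch_frequency_hz(0.012)  # shorter difference, higher notch
```

A shorter path difference moves the notch up in frequency, which is how a change in elevation can shift the notch.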
9- Range cues
- Humans are best at estimating azimuth, next best at elevation, and worst at estimating range.
- Loudness
- Sound energy coming from the source falls inversely with the square of the range.
- No 1-to-1 relationship between emitted and received sound energy
- A soft sound does not necessarily mean the source is far away
- Need to know the characteristics of the source
- Motion parallax
- When the head translates, the change in azimuth depends on range.
- Close sources: a small shift causes a large change in azimuth. Distant sources: the same shift causes almost no change in azimuth
- Excess interaural level difference
- ILD increases for sources very close to the head
- Extreme case: an insect buzzing in one ear
- Ratio of direct to reverberant sound (major cue)
- In ordinary rooms, sound is reflected and scattered from environmental surfaces
- At close ranges the ratio is large; at long ranges the ratio is small
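Two of these cues are easy to put numbers on. The sketch below assumes a point source in free field for the inverse-square law, and a source that starts straight ahead with a purely sideways head shift for motion parallax. The function names and example distances are illustrative assumptions.

```python
import math


def level_change_db(range_from_m, range_to_m):
    """Change in received level when the range changes (energy ~ 1 / r^2)."""
    return 10.0 * math.log10((range_from_m / range_to_m) ** 2)


def parallax_azimuth_change_deg(range_m, head_shift_m):
    """Azimuth change of a source straight ahead after a sideways head shift."""
    return math.degrees(math.atan2(head_shift_m, range_m))


doubling = level_change_db(1.0, 2.0)                    # about -6 dB
near_change = parallax_azimuth_change_deg(0.5, 0.1)     # about 11 degrees
distant_change = parallax_azimuth_change_deg(10.0, 0.1)  # under 1 degree
```

The same 10 cm head shift changes the azimuth of the near source by many degrees and barely moves the distant one.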
10- Reverberation and Echoes
- Reflections delayed by more than about 30-50 ms are heard as echoes
- Anechoic chambers absorb reflected sound energy, so only the directly radiated energy reaches the ears
- Reflections do not interfere with the ability to localize sources because we adapt quickly
11- Spatial audio systems
- 2-channel (stereo)
- to place a sound on the left, send its signal to the left loudspeaker; to place it on the right, send its signal to the right loudspeaker
- Multi-channel (surround)
- separate channels for every desired direction, including above and below
- Expensive and unlikely to play a role in HCI
- Binaural recordings
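For the 2-channel case, positions between the two loudspeakers are usually obtained by splitting the signal between them. The sketch below uses constant-power panning, a common technique that the slide itself does not mention. The `pan` parameter and function name are assumptions.

```python
import math


def pan_gains(pan):
    """Constant-power stereo gains.

    pan = -1.0 is fully left, 0.0 is centered, +1.0 is fully right.
    Returns (left_gain, right_gain) with left^2 + right^2 == 1.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must lie between -1 and +1")
    angle = (pan + 1.0) * math.pi / 4.0  # maps [-1, 1] onto [0, pi/2]
    return math.cos(angle), math.sin(angle)


left_only = pan_gains(-1.0)
centered = pan_gains(0.0)
right_only = pan_gains(1.0)
```

Each output sample would be the input sample multiplied by the corresponding gain. This only moves a sound along the line between the loudspeakers and provides no elevation or range control.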
12- Binaural recordings
- recreate the sound pressures at the right and left ear drums that would exist if the listener were actually present
- Several disadvantages
- require the use of headphones
- not interactive; must be prerecorded
- If the listener moves, so do the sounds
- Sources that are directly in front usually seem to be much too close
- pinna shapes differ from person to person, so elevation effects are not reliable
- Improvements using HRTFs
13- Head-Related Transfer Functions (HRTF)
- Fourier transform of the head-related impulse response (HRIR), which describes the sound pressure from the source to the ear drum
- The HRTF captures all physical cues to source localization
- One HRTF for the right ear and one for the left ear
- The HRTF is a function of 4 variables: 3 space coordinates and frequency
- Most HRTF measurements are made in the far field, where the HRTF falls inversely with range
- The far field applies to sources at distances greater than about 1 meter; it reduces the HRTF to a function of azimuth, elevation and frequency
- HRTFs are measured in anechoic settings (they don't include the effects of environmental sound reflections)
- A binaural room simulator introduces the important reflections, so that when using headphones you won't hear sounds very close to or inside the head
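The relationship between the impulse response and the transfer function can be illustrated with a naive discrete Fourier transform. This is a sketch only: the impulse response is a toy value (a pure delay with gain 0.5), not measured data, the function names are hypothetical, and a real system would use an FFT routine.

```python
import cmath


def dft(samples):
    """Naive O(n^2) discrete Fourier transform, for illustration only."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def hrtf_from_hrir(hrir):
    """Transfer function as the Fourier transform of an impulse response."""
    return dft(hrir)


# Toy impulse response (assumed): a 2-sample delay with gain 0.5.
hrir = [0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0]
hrtf = hrtf_from_hrir(hrir)
magnitudes = [abs(h) for h in hrtf]      # flat: a pure delay has no spectral shaping
phases = [cmath.phase(h) for h in hrtf]  # the delay shows up in the phase
```

A measured response would show peaks and notches in the magnitude, such as the pinna notch, instead of this flat spectrum.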
14- HRTF-Based systems
- Able to produce elevation, range and azimuth effects
- Because of person-to-person differences and computational limitations, it is much easier to control azimuth than elevation or range
- Headphones: uncomfortable; have to be compensated because their frequency responses resemble pinna responses; the compensation is sensitive to headphone position.
- Loudspeakers: problems with low frequencies and with the distance between speakers
- Head tracking: recalculating the relative position of each source (from the location and orientation of the head) and modifying the HRIRs accordingly is expensive and not very reliable, with latency and unwanted transients
- Measured vs. modeled HRTFs: use a standard HRTF, a set of standard HRTFs (group of people), an individualized HRTF, or a model HRTF (rational functions, series expansions, structural models)
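The head-tracking recalculation can be sketched for the simplest case. The example assumes the tracker reports only head yaw, that both angles are in degrees and positive toward the right, and that head translation is ignored. The function name is hypothetical.

```python
def relative_azimuth_deg(source_azimuth_deg, head_yaw_deg):
    """Source azimuth relative to the head, wrapped to [-180, 180)."""
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0


# The listener turns 90 degrees to the right: a source that was straight
# ahead is now at the left ear.
after_turn = relative_azimuth_deg(0.0, 90.0)
wrapped = relative_azimuth_deg(170.0, -20.0)  # 190 degrees wraps to -170
```

After each tracker update, the HRIR pair for the new relative direction would be selected or interpolated. This repeated update is where the cost, latency and transients come from.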
15- Facts
- Best source: at 90 deg in the horizontal plane, directed toward the right ear
- Weakest source: at 270 deg (or -90 deg), on the opposite side of the head
- Front/back (0 and 180 deg) are similar