1
Interactive Audio
  • Sound, Waves, the Ear
  • 3D audio

2
Overview
  • Fundamentals of Sound
  • Psychoacoustics
  • Interactive Audio
  • Applications

3
What is sound?
  • Sound is the sensation perceived by the sense of
    hearing
  • Audio is acoustic, mechanical, or electrical
    frequencies corresponding to normally audible
    sound waves

4
Dual Nature of Sound
  • Transfer of sound and physical stimulation of ear
  • Physiological and psychological processing in ear
    and brain (psychoacoustics)

5
Transmission of Sound
  • Requires a medium with elasticity and inertia
    (air, water, steel, etc.)
  • Movements of air molecules result in the
    propagation of a sound wave

6
Particle Motion
7
Longitudinal Motion of Air
8
Wavefronts and Rays
9
Reflection of Sound
10
Absorption of Sound
  • Some materials readily absorb the energy of a
    sound wave
  • Examples: carpet, curtains at a movie theater

11
Refraction of Sound
12
Refraction of Sound
13
Diffusion of Sound
  • Not analogous to diffusion of light
  • Naturally occurring diffusions of sounds
    typically affect only a small subset of audible
    frequencies
  • Nearly full diffusion of sound requires a
    reflection phase grating (Schroeder Diffuser)

14
The Inverse-Square Law (Attenuation)
I = W / (4πr²)

where I is the sound intensity in W/cm², W is the sound power of the source in W, and r is the distance from the source in cm.
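As a rough illustration, the inverse-square law can be evaluated directly. A minimal sketch in Python (the function name and sample values are illustrative, not from the slides):

```python
import math

def sound_intensity(power_w, distance_cm):
    """Inverse-square law: intensity (W/cm^2) of a point source with
    acoustic power `power_w` (W) at `distance_cm` (cm) from the source."""
    return power_w / (4.0 * math.pi * distance_cm ** 2)

# Doubling the distance quarters the intensity:
print(sound_intensity(0.01, 100))   # source at 1 m
print(sound_intensity(0.01, 200))   # source at 2 m -> one quarter of the above
```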
15
Psychoacoustics
  • Physiological Interactions with audio
  • Psychological processing

16
Ear Anatomy
17
Idealized Ear
18
Mechanical Model of Middle Ear
19
The Skull
  • Occludes wavelengths small relative to the
    skull
  • Causes diffraction around the head (helps amplify
    sounds)
  • Wavelengths much larger than the skull are not affected (explains why low frequencies are not directional)

20
The Pinna
21
The Pinna
  • Directs sound into the ear
  • Provides cues that indicate sound direction

22
Importance of the Pinna
23
Ear Canal
  • About 0.7 cm in diameter and 3 cm long
  • Amplifies sound at its quarter-wavelength resonant frequency (about 3 kHz)

24
Ear Canal and Skull
  • (A) Dark line: ear canal only
  • (B) Dashed line: ear canal and skull diffraction

25
Middle Ear
  • Eardrum vibrates from sound pressure changes
  • Ossicles transfer vibration to the oval window
  • The impedance difference between air and the inner-ear fluid is matched by the ratio of the eardrum's surface area to that of the oval window

26
Inner Ear
27
The Cochlea
  • Mechanical-to-electrical transducer
  • Frequency-selective analyzer
  • Tectorial and Basilar membranes rub together to
    stimulate Hair Cells

28
Place Theory
29
Place Theory
  • The position of maximum vibration of the basilar
    membrane corresponds to the perceived pitch of
    pure tones
  • Each hair cell and each nerve fiber has very
    sharp bandpass characteristics

30
Auditory Area (20Hz-20kHz)
31
Spatial Hearing
  • The ability to determine the direction and distance of a sound source
  • The process is not fully understood
  • However, some cues have been identified as useful

32
The Duplex Theory of Localization
  • Interaural Intensity Differences (IIDs)
  • Interaural Arrival-Time Differences (ITDs)

33
Interaural Intensity Difference
  • The skull produces a sound shadow
  • Intensity difference results from one ear being
    shadowed and the other not
  • The IID does not apply to frequencies below 1000 Hz (wavelengths similar to or larger than the size of the head)
  • Sound shadowing can result in drops of up to 20 dB for frequencies above 6000 Hz
  • The Inverse-Square Law can also affect intensity

34
Interaural Intensity Difference
35
Interaural Arrival-Time Difference
  • Perception of phase difference between ears
    caused by arrival-time delay (ITD)
  • Ear closest to sound source hears the sound
    before the other ear

36
Interaural Arrival-Time Difference
37
Cones of Confusion
  • Binaural difference cues (IIDs and ITDs) result
    in a locus of points for which measurements will
    be the same
  • Results in ambiguity in the determination of
    sound source position

38
Cones of Confusion
39
How do humans resolve the Cones of Confusion
problem?
  • Cues used for localization are embodied in the transformation of sound from the free field to the eardrum
  • This transformation is affected by sound shadowing from the head and torso as well as diffraction from the pinna

40
Pinna's Effect on the Free Field
41
Head-related Transfer Function (HRTF)
  • The acoustic transfer function between a point in
    space and the eardrum of the listener
  • Encompasses all free-field effects

42
HRTF effect on IID
43
Monaural and Dynamic Cues
  • Spectral cues
  • Distance cues
  • Direct-to-reverberant energy ratio
  • High-to-low frequency energy ratio
  • Head rotation or tilt

44
Spectral Cues
  • Comparison of a known source spectrum with
    received spectrum
  • If the spectrum is not known, cues can still be obtained by assuming the spectrum is locally flat (or of constant slope)

45
Pinna's Effect on Spectrum
46
Distance Cues
  • Variation of signal level with distance
    (attenuation)
  • Useful only with regard to changes in distance, or if the sound source has a known signal level

47
Direct-to-Reverberant Energy Ratio
  • Based on the observation that the reverberation level is constant across positions in an enclosed space
  • But direct sound energy level decreases with
    increasing source-to-listener distance

48
High-to-Low Frequency Energy Ratio
  • Observation that air attenuates high frequencies
    more rapidly than low frequencies over distance

49
Head Rotation or Tilt
  • Rotation or tilt can alter the interaural spectrum in a predictable manner
  • Can resolve positional ambiguities on a cone of
    confusion

50
The Haas (or Precedence) Effect
  • The perceptual weighting of binaural cues of the
    first arriving sound over reflections of the same
    sound
  • Generally reveals true location of sound source
    while filtering out contradictory reflections
  • Hypothesized to be important from evolutionary
    standpoint

51
Interactive Audio
  • Virtual Sound Space
  • Facilitate the perception of monaural, binaural,
    and dynamic cues within the virtual environment
  • Model the virtual sound space in real-time

52
Digital Recording
  • Audio can be digitized and processed on a
    computer
  • Digital formats have frequency and dynamic range
    limitations

53
Digital Problems
  • Current formats do not completely cover the full
    frequency range of human hearing (especially low
    frequencies)
  • Representing 0-120dB would require too many bits!

54
Review
  • Distance cues (attenuation)
  • Direct-to-reverberant energy ratio
  • High-to-low frequency energy ratio
  • Doppler Effect
  • IID, ITD
  • Spectral cues (effects from head, pinna)

55
Attenuation
  • Inverse-Square Law
  • Overkill: sounds fall off too fast
  • Solution: Add an ambient term (just like in graphics); see the sketch below
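A minimal sketch of such an attenuation curve with an ambient floor; the exact falloff shape, the default ambient level, and the reference distance are assumptions for illustration:

```python
def distance_gain(distance, ambient=0.2, ref_distance=1.0):
    """Volume gain vs. distance with an ambient floor, analogous to
    ambient light in graphics: the gain never drops below `ambient`,
    so distant sounds stay faintly audible instead of vanishing."""
    falloff = 1.0 / (1.0 + (distance / ref_distance) ** 2)   # inverse-square-like rolloff
    return ambient + (1.0 - ambient) * falloff
```

With ambient set to 0 this reduces to a plain inverse-square-style rolloff; raising the ambient term trades physical accuracy for audibility.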

56
Static Attenuation
  • Set sample volume based on distance
  • Volume level is only calculated at beginning of
    sample
  • Low CPU usage
  • Best for short duration samples
  • Bad for long duration samples

57
Dynamic Attenuation
  • Sample volume based on distance
  • Volume level recalculated every frame
  • Good for long duration samples
  • Higher CPU usage (3 multiplies every frame per
    sample)
  • Temporal Aliasing

58
Temporal Aliasing
  • You will hear "stair stepping" or discrete volume levels as attenuation is recalculated
  • Solution: Increase the update rate
  • Rule of thumb: At least 20 Hz (twice as much as needed for VR graphics)
  • Some cues are more susceptible than others

59
Stereo Attenuation
  • Set sample volume based on distance per channel (left and right)
  • Even higher CPU usage (3 multiplies plus trigonometry per frame, per sample)
  • Gross approximation of IID (see the pan-law sketch below)
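One way to realize this gross approximation of the IID is a constant-power pan law driven by the source azimuth. This sketch assumes a particular angle convention and pan law, neither of which is specified in the slides:

```python
import math

def stereo_gains(azimuth_rad, base_gain=1.0):
    """Split an overall gain into left/right channel gains with a
    constant-power pan law -- a gross approximation of the IID.
    azimuth_rad: 0 = straight ahead, +pi/2 = fully to the listener's right."""
    pan = 0.5 * (1.0 + math.sin(azimuth_rad))        # 0 = hard left, 1 = hard right
    left = base_gain * math.cos(pan * math.pi / 2.0)
    right = base_gain * math.sin(pan * math.pi / 2.0)
    return left, right

print(stereo_gains(0.0))           # centered: equal gains in both channels
print(stereo_gains(math.pi / 2))   # source to the far right: right channel only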

60
Stereo Attenuation
61
Multiple Channel Audio
  • More than 2 speakers
  • Typically oriented in a horizontal plane around
    the user
  • Usually 4 or 5 directional speakers (Surround
    Sound or Dolby Digital)
  • Good for directional cues
  • Expensive to calculate (probably need hardware support, especially for Surround Sound or Dolby Digital)

62
Stereo Extenders
  • Processing techniques for increasing stereo
    spread
  • Processed after stereo attenuation is calculated
    (DSP inside speakers usually)
  • Example: QSound

63
Stereo Extenders
64
Solution to the Dynamic Range Problem
  • Assume that an individual sound will not have
    much dynamic range
  • Scale the attenuation function to fit a min and
    max distance
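A sketch of scaling the attenuation function to a min/max distance; the linear ramp is an assumption (APIs such as DirectSound3D expose similar minimum/maximum distance parameters but may use different curves):

```python
def scaled_gain(distance, min_dist, max_dist, min_gain=0.0):
    """Map [min_dist, max_dist] onto [1.0, min_gain]: full volume inside
    min_dist, the floor gain beyond max_dist. This keeps the dynamic
    range of any individual sound within a limited window."""
    if distance <= min_dist:
        return 1.0
    if distance >= max_dist:
        return min_gain
    t = (distance - min_dist) / (max_dist - min_dist)
    return 1.0 + t * (min_gain - 1.0)   # linear ramp between the two limits
```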

65
Solution to the Dynamic Range Problem
66
Sound Source With Limited Dynamic Range
67
Modeling Interaural Arrival-Time Difference
  • Want to introduce a phase difference between the left and right ears
  • PROBLEM: the left ear must only hear what was meant for the left ear; same for the right ear!

68
How to control what ears hear?
  • Easy solution: Headphones
  • Hard solution: Cross-talk cancellation

69
Headphone Solution
  • Precise control of what each ear hears
  • Good for VR (immersive)
  • Not good for multi-user VR (CAVE)
  • Cumbersome
  • Need to track the user's head for proper HRTF calculations
  • If using HRTFs, earbuds are ideal (they remove the effect of the pinna)

70
Cross-talk Cancellation
  • Left speaker plays left channel and the
    cancellation of the right channel (same for
    right)
  • Results in a sweet spot where left ear will only
    hear left channel and right ear will only hear
    right channel

71
Cross-talk Cancellation
72
Problems with Cross-Talk Cancellation
  • The sweet spot makes it a single-user experience
  • Implementation requires intimate knowledge of
    advanced calculus and Fourier Analysis
  • Speakers must be accurately placed and oriented
  • Needs dedicated DSP hardware

73
Calculating ITD Effects
  • Determine distance from sound source to each ear
  • Simple physics to determine arrival time of sound
    to each ear
  • Heavy-duty math is required to smoothly interpolate phase changes (a simple arrival-time sketch follows)
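The "simple physics" step can be sketched as follows; the ear positions, units, and speed-of-sound constant are illustrative, and the hard part (smoothly interpolating the resulting phase changes) is not shown:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees C

def arrival_times(source, left_ear, right_ear):
    """Arrival time of a sound at each ear plus the interaural
    arrival-time difference (ITD). Positions are (x, y, z) in metres."""
    t_left = math.dist(source, left_ear) / SPEED_OF_SOUND
    t_right = math.dist(source, right_ear) / SPEED_OF_SOUND
    return t_left, t_right, t_left - t_right

# Source one metre to the listener's right, ears ~17 cm apart:
print(arrival_times((1.0, 0.0, 0.0), (-0.085, 0.0, 0.0), (0.085, 0.0, 0.0)))
```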

74
Pinna, Head, and Shoulders
  • Determine HRTF from spectral analysis of
    head-related impulse response (HRIR)
  • Filter sounds by scaling intensities at each
    frequency
  • Definitely need dedicated hardware
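Once an HRIR pair has been measured, filtering amounts to convolving the source signal with the impulse response for each ear. A minimal sketch with placeholder impulse responses (real HRIRs are measured per direction, and a real-time system would filter in the frequency domain, typically in dedicated hardware as the slide notes):

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter a mono signal with a pair of head-related impulse
    responses (HRIRs) to produce left/right ear signals."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return left, right

# Placeholder HRIRs: the nearer ear gets an earlier, stronger response.
mono = np.random.randn(1024)
hrir_l = np.array([0.0, 0.9, 0.3, 0.1])
hrir_r = np.array([0.0, 0.0, 0.5, 0.2])
left, right = render_binaural(mono, hrir_l, hrir_r)
```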

75
Determining the HRTF from head-related impulse
response (HRIR)
Microphone for recording HRIRs
76
HRIRs
77
HRIRs
78
Generic HRTF
  • Use of an average HRIR to determine HRTF
  • Works fairly well for 80% of people
  • Custom HRTFs are quite often impractical

79
Environmental Effects
  • Obstruction/Occlusion
  • Reverberation
  • Doppler Shift
  • Atmospheric Effects

80
Obstruction
  • Same as sound shadowing
  • Generally approximated by a ray test and a low-pass filter (see the sketch below)
  • High frequencies should get shadowed while low frequencies diffract
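A crude low-pass filter of the kind hinted at above can be as simple as a one-pole smoother; the coefficient value here is an assumption and would normally be driven by the ray-test result:

```python
def one_pole_lowpass(samples, coeff=0.2):
    """Very simple one-pole low-pass filter used as a crude obstruction
    approximation: highs are attenuated while lows pass, mimicking
    diffraction around an obstacle. coeff in (0, 1]; smaller = more muffled."""
    out, state = [], 0.0
    for x in samples:
        state += coeff * (x - state)
        out.append(state)
    return out
```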

81
Obstruction
82
Occlusion
  • A completely blocked sound
  • Example: A sound that penetrates a closed door or a wall
  • The sound will be muffled (low-pass filter)

83
Reverberation
  • Effects from sound reflection
  • Similar to echo
  • Static reverberation
  • Dynamic reverberation

84
Static Reverberation
  • Relies on the closed container assumption
  • Parameters used to specify approximate
    environment conditions (decay, room size, etc.)
  • Example: Microsoft DirectSound3D with EAX (see the comb-filter sketch below)
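Parameter sets such as decay and room size ultimately drive a filter network. A minimal building block is a single feedback comb filter; real static-reverb engines (and the way EAX maps its parameters) combine several combs and all-pass filters, so this is only an illustrative sketch:

```python
def comb_reverb(samples, delay_samples, feedback):
    """Single feedback comb filter: y[n] = x[n] + feedback * y[n - delay].
    `delay_samples` loosely stands in for room size and `feedback`
    for decay time."""
    buf = [0.0] * delay_samples      # circular buffer of past outputs
    out = []
    for i, x in enumerate(samples):
        y = x + feedback * buf[i % delay_samples]
        buf[i % delay_samples] = y
        out.append(y)
    return out
```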

85
Static Reverberation
86
Dynamic Reverberation
  • Calculation of reflections off of surfaces taking
    into account surface properties
  • Typically, diffusion and diffraction are ignored
  • Wave Tracing
  • Examples: Aureal A3D 2.0 or the beam tracing paper

87
Dynamic Reverberation
88
Comparison
  • Static Reverberation: less expensive computationally, simple to implement
  • Dynamic Reverberation: very expensive computationally, difficult to implement, but potentially superior results

89
Doppler Shift
  • Change in frequency due to velocity
  • Very susceptible to temporal aliasing
  • The faster the update rate the better
  • Requires dedicated hardware
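The per-frame frequency (or playback-rate) scaling can be computed from radial velocities; this sketch assumes velocities measured along the source-to-listener line, positive when closing the distance:

```python
SPEED_OF_SOUND = 343.0   # m/s

def doppler_factor(source_velocity, listener_velocity=0.0):
    """Doppler pitch-shift factor. Velocities are along the
    source-to-listener line, positive when closing the distance.
    Multiply the source frequency (or playback rate) by this factor."""
    return (SPEED_OF_SOUND + listener_velocity) / (SPEED_OF_SOUND - source_velocity)

print(doppler_factor(20.0))   # source approaching at 20 m/s -> pitch rises about 6%
```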

90
Atmospheric Effects
  • Attenuate high frequencies faster than low
    frequencies
  • Moisture in air increases this effect

91
Applications and Current Research
  • Beam Tracing
  • NAVE
  • Effect of Audio on visual quality
  • Audio Spotlight

92
Beam Tracing
  • Video! (from the SIGGRAPH 98 Conference Proceedings video tape)
  • From the paper "A Beam Tracing Approach to Acoustic Modeling for Interactive Virtual Environments" by Thomas Funkhouser et al.

93
NAVE
  • HRTF (ITD, IID) via cross-talk cancellation of the two front speakers (SBLive!, DS3D)
  • Two rear speakers provide directional and
    intensity cues
  • Discrete bass channel (2nd sound card)
  • Static reverberation (EAX)

94
Effect of Audio on Visual Quality
  • A GT study shows that ambient sounds enhance the sense of presence, as well as the subjective quality of 3D graphics
  • Enhanced recall and recognition of visual objects
  • Dr. Russell Storms' study showed enhanced subjective quality of 2D graphics

95
Audio Spotlight
  • Produces an audio beam (like a flashlight)
  • Makes use of interference from ultrasonic waves
  • Potentially great dynamic range (better than
    speaker cones)

96
Audio Spotlight
97
Audio Spotlight
98
Audio Spotlight Compared to Speaker
99
Audio Spotlight Beam Dimensions
100
Audio Spotlight Distortion
101
Audio Spotlight
  • Holy Grail of interactive audio?
  • Avoid cross-talk cancellation
  • Track the user's ears and aim the spotlight at the head
  • AR: aim it at objects

102
Open Research
  • Diffusion (some work with radiosity)
  • Diffraction
  • HRTFs with audio spotlights
  • Integration of graphics and audio hardware for
    wave tracing (Nvidia?)