Title: Auralization
1Auralization
- Lauri Savioja
- (Tapio Lokki)
- Helsinki University of Technology, TKK
2AGENDA, 8:45-9:20
- Auralization, i.e., sound rendering
- Impulse response
- Basic principle: Marienkirche demo
- Source signals and modeling of directivity of sources
- Modeling from a perceptual point of view
- Dynamic auralization
- Evaluation of auralization quality
- Spatial sound reproduction
- Headphones
- Loudspeakers
3Impulse response of a room
4Impulse response of a room
5Impulse response
- A linear time-invariant (LTI) system can be modeled with an impulse response
- The output y(t) is the convolution of the input x(t) and the impulse response h(t)
- Discrete form (convolution is a sum)
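As a minimal sketch of the discrete form, the output can be written as y[n] = sum over k of h[k] x[n - k]. The Python example below spells out that sum directly; the function name `convolve_direct` and the toy signals are illustrative assumptions, not material from the slides.

```python
import numpy as np

def convolve_direct(x, h):
    """Discrete convolution y[n] = sum_k h[k] * x[n - k], written as an explicit sum."""
    y = np.zeros(len(x) + len(h) - 1)
    for n in range(len(y)):
        for k in range(len(h)):
            if 0 <= n - k < len(x):
                y[n] += h[k] * x[n - k]
    return y

x = np.array([1.0, 2.0, 3.0])   # toy input signal (assumed)
h = np.array([1.0, 0.0, 0.5])   # toy impulse response: direct sound plus one echo (assumed)
y = convolve_direct(x, h)       # same result as np.convolve(x, h)
```

In practice a library routine such as `np.convolve`, or FFT-based convolution for long room responses, would replace the double loop.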
6Measured (binaural) impulse response of Tapiola
concert hall
7Two goals of room acoustics modeling
- Goal 1: room acoustics prediction
- Static source and receiver positions
- No real-time requirement
- Goal 2: auralization, sound rendering
- Possibly moving source(s) and listener, even geometry
- Both off-line and interactive (real-time) applications
- Need for anechoic stimulus signals
(Binaural rendering, Lokki, 2002)
8Goal 2: Auralization / sound rendering
- "Auralization is the process of rendering audible, by physical or mathematical modeling, the sound field of a source in a space, in such a way as to simulate the binaural listening experience at a given position in the modeled space." (Kleiner et al. 1993, JAES)
- Sound rendering: plausible 3-D sound, e.g., in games
- 3-D model → spatial IR; spatial IR convolved with dry signal → auralization
9Auralization
- Goal: plausible 3-D sound, authentic auralization
- The most intuitive way to study room acoustic prediction results
- Not only for experts
- Anechoic stimulus signal
- Reproduction with binaural or multichannel techniques
- The impulse response must also contain spatial information
10Auralization, input
- Input data
- Anechoic stimulus signal(s)!
- Geometry + material data
- Source(s) and receiver(s) locations and orientations
11Auralization, modeling
- Source(s): omnidirectional, sometimes directional
- Medium:
- physically-based sound propagation in a room
- perceptual models, i.e., artificial reverb
- Receiver: spatial sound reproduction (binaural or multichannel)
12Marienkirche, concert hall in Neubrandenburg
(Germany)
13Source - medium - receiver
(Savioja et al. 1999, Väänänen 2003)
14Source Modeling - stimulus signal
- Stimulus
- Sound signal synthesis
- Anechoic recordings
15Source Modeling - Radiation
- Directivity is a measure of the directional characteristic of a sound source.
- Point sources
- omnidirectional
- frequency-dependent directivity characteristics
- Line and volume sources
- Database of loudspeakers: http://www.clfgroup.org/
16Anechoic stimulus signals
- In a concert hall the typical sound source is an orchestra
- Anechoic recordings needed
- Directivity of instruments also needed
- We have just completed such recordings
- Demo
- All recordings with 22 microphones
- Recordings are publicly available for academic purposes
- Contact: Tapio.Lokki_at_tkk.fi
- http://auralization.tkk.fi
17Sound field decomposition (Svensson, AES 22nd, 2002)
Diffuse reflections handled by surface sources
18Computation vs. human perception
Computation vs. Frequency resolution
Computation vs. Time resolution
(Svensson & Kristiansen 2002)
19Two approaches
Perceptually-based
Physically-based
(Väänänen, 2003)
20Auralization: Two approaches (1)
- Perceptually-based modeling
- Impulse response is not computed from a geometry
- A statistical response is applied
- Psychoacoustical (subjective) parameters are applied in tuning the response
- e.g. reverberation time, clarity, warmth, spaciousness
- Applications: music production, teleconferencing, computer games...
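One simple way to picture a "statistical response" is exponentially decaying noise whose decay rate is set by the reverberation time. The sketch below is only an illustration under that assumption; the function name, the 48 kHz sample rate and the white-noise model are choices made here, not details from the slides.

```python
import numpy as np

def statistical_response(rt60, fs=48000, seed=0):
    """Noise burst whose amplitude envelope falls by 60 dB over rt60 seconds."""
    rng = np.random.default_rng(seed)
    n = int(rt60 * fs)
    t = np.arange(n) / fs
    envelope = 10.0 ** (-3.0 * t / rt60)   # -60 dB in amplitude at t = rt60
    return rng.standard_normal(n) * envelope

h = statistical_response(rt60=1.0)   # 1 s reverberation time (assumed value)
```

Tuning parameters such as clarity or warmth would need extra control, for example frequency-dependent decay, which this sketch leaves out.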
21Auralization: Two approaches (2)
- Physically-based modeling
- Sound propagation and reflections from boundaries are modeled based on physics.
- Impulse response is predicted based on the geometry, and its properties depend on surface materials, directivity and position of sound source(s), as well as position and orientation of the listener
- Applications: prediction of acoustics, concert hall design, virtual auditory environments for games and virtual reality applications, education, ...
22Dynamic auralization (sound rendering)
- Method 1: A grid of impulse responses is computed and convolution is performed with interpolated responses
- Applied in the CATT software (http://www.catt.se)
- Method 2: Parametric rendering
23Typical Auralization System
1. Scene definition
2. Parametric presentation of sound paths
3. Auralization with parametric DSP structure
24Auralization parameters
- For the direct sound and each image source the following set of auralization parameters is provided:
- Distance from the listener
- Azimuth and elevation angles with respect to the listener
- Source orientation with respect to the listener
- Reflection data, e.g. as a set of filter coefficients which describe the material properties in reflections
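The geometric parameters listed above can be sketched for a single first-order image source as follows. This is a hedged illustration: the axis-aligned wall, the listener looking along +x with no head rotation, the 343 m/s speed of sound and all function names are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed)

def mirror_across_wall(source, axis, wall_coord):
    """First-order image source: mirror the source across an axis-aligned wall."""
    image = list(source)
    image[axis] = 2.0 * wall_coord - source[axis]
    return tuple(image)

def path_parameters(image, listener):
    """Distance (m), azimuth and elevation (degrees), and delay (s) seen from the listener."""
    dx, dy, dz = (image[i] - listener[i] for i in range(3))
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    azimuth = math.degrees(math.atan2(dy, dx))          # listener assumed to face +x
    elevation = math.degrees(math.asin(dz / distance))
    delay = distance / SPEED_OF_SOUND
    return distance, azimuth, elevation, delay

source = (2.0, 3.0, 1.5)
listener = (4.0, 3.0, 1.5)
image = mirror_across_wall(source, axis=0, wall_coord=0.0)   # wall in the plane x = 0
distance, azimuth, elevation, delay = path_parameters(image, listener)
```

Source orientation and reflection filter data would be carried alongside these values; they are omitted here to keep the sketch short.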
25Treatment of one image source: a DSP view
- Directivity
- Air absorption
- Distance attenuation
- Reflection filters
- Listener modeling
- Linear system
- Commutation
- Cascading
(Adapted from Strauss, 1998)
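Because every stage is linear and time-invariant, the filters can be applied in any order (commutation) or merged into a single filter (cascading). The sketch below illustrates this with a delay, 1/r distance attenuation and two short FIR filters; the filter coefficients, sample rate and function name are hypothetical and chosen only for the demonstration.

```python
import numpy as np

FS = 48000   # sample rate in Hz (assumed)
C = 343.0    # speed of sound in m/s (assumed)

def render_image_source(x, distance, filters):
    """Delay, 1/r attenuation, then a cascade of FIR filters for one image source."""
    delay = int(round(distance / C * FS))                  # integer-sample delay only
    y = np.concatenate([np.zeros(delay), x]) / distance    # 1/r distance attenuation
    for h in filters:                                      # apply filters one after another
        y = np.convolve(y, h)
    return y

x = np.array([1.0, 0.5, 0.25])          # toy dry signal
material = np.array([0.7, 0.2])         # hypothetical reflection filter
air = np.array([0.9, 0.05, 0.02])       # hypothetical air-absorption filter

y_ab = render_image_source(x, 3.43, [material, air])
y_ba = render_image_source(x, 3.43, [air, material])                  # swapped order
y_one = render_image_source(x, 3.43, [np.convolve(material, air)])    # merged filter
```

All three outputs agree up to rounding, which is what allows a renderer to reorder or combine the blocks for efficiency. Directivity and listener modeling (HRTFs) would be further stages of the same kind.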
26Auralization block diagram
27Treatment of each image source
28Late reverberation algorithm
- A special version of feedback delay network
(Väänänen et al. 1997)
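For orientation, a generic four-line feedback delay network can be sketched as below. This is not the special version cited on the slide; the delay lengths, the feedback gain and the Hadamard feedback matrix are assumptions picked to give a stable, textbook-style example.

```python
import numpy as np

def fdn_reverb(x, n_out, delays=(1031, 1327, 1523, 1871), feedback_gain=0.85):
    """Generic 4-line feedback delay network (illustrative, not the cited algorithm)."""
    # Orthonormal 4x4 Hadamard matrix; with feedback_gain < 1 the loop is stable.
    H = 0.5 * np.array([[1, 1, 1, 1],
                        [1, -1, 1, -1],
                        [1, 1, -1, -1],
                        [1, -1, -1, 1]], dtype=float)
    lines = [np.zeros(d) for d in delays]   # circular buffers, one per delay line
    idx = [0] * len(delays)
    y = np.zeros(n_out)
    for n in range(n_out):
        # Oldest sample in each buffer, i.e. the delay-line outputs
        outs = np.array([lines[i][idx[i]] for i in range(len(delays))])
        y[n] = outs.sum()
        fb = feedback_gain * (H @ outs)     # mix the outputs back into the inputs
        xin = x[n] if n < len(x) else 0.0
        for i in range(len(delays)):
            lines[i][idx[i]] = xin + fb[i]
            idx[i] = (idx[i] + 1) % delays[i]
    return y

y = fdn_reverb(np.array([1.0]), n_out=4000)   # response to a unit impulse
```

A practical late-reverberation unit would add lowpass filters in the feedback paths so that high frequencies decay faster, as they do in real rooms.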
29A Case Study: a Lecture Room
30Image sources 1st order
31Image sources up to 2nd order
32Image sources up to 3rd order
33Distance attenuation
34Distance attenuation (zoomed)
35Gain: air absorption
36Gain: air and material absorption
37All monaural filtering
38All monaural filtering (zoomed)
39Treatment of each image source
40Only ITD for pure impulse
41Only ITD for pure impulse (zoom)
42ITD + minimum-phase HRTF
43Monaural filtering + ITD
44Monaural filtering + ITD + HRTF
45Auralization block diagram
46Reverb
47Image sources + reverberation
48Image sources + reverberation
49Image sources + reverberation
50Dynamic Sound Rendering
- Dynamic rendering
- Properties of image sources are time-variant
- The coefficients of filters are changing all the time
- Every single parameter has to be interpolated
- In delay line pick-ups, fractional delay filters have to be used to avoid clicks and artifacts
- Late reverberation is static
- Update rate vs. latency
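The simplest fractional delay filter is linear interpolation between the two neighbouring samples of the delay line. The sketch below uses that first-order choice purely as an illustration (higher-order fractional delay filters are common in practice); the function name and the ramp test signal are assumptions.

```python
import numpy as np

def read_fractional(buffer, delay):
    """Read `delay` samples behind the newest sample, with linear interpolation."""
    pos = len(buffer) - 1 - delay
    i = int(np.floor(pos))
    frac = pos - i
    nxt = min(i + 1, len(buffer) - 1)      # guard the upper edge of the buffer
    return (1.0 - frac) * buffer[i] + frac * buffer[nxt]

buffer = np.arange(10.0)                    # ramp signal, newest sample is 9.0
# When a source moves, the delay is interpolated sample by sample between updates
delays = np.linspace(2.0, 3.0, 5)           # old delay 2.0 -> new delay 3.0
picked = np.array([read_fractional(buffer, d) for d in delays])
```

Stepping the delay smoothly like this, instead of jumping from 2 to 3 samples at once, is what avoids the clicks mentioned above.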
51Auralization quality
- What is the wanted quality?
- Assessment of quality is possible only by case studies
- Objectively
- Acoustical attributes
- With auditory modeling
- Subjectively
- Listening tests
52A case study, lecture hall T3
53Quality of auralization (Lokki, 2002)
Stimuli: clarinet, drum
Results, clarinet: recording vs. auralization
Results, drum: recording vs. auralization
54Spatial auditory display
- Nicolas Tsingos, Lauri Savioja
55Spatial Sound Reproduction Techniques
- Reproduce the correct perceived location/direction of a virtual sound source to the ears of the listener
- Headphone or speaker based.
Binaural stereo
Multiple speakers
56Binaural and Transaural Stereophony
- Natural filtering of the ears and torso
- Apply a directional filtering to the signal
- Head Related Transfer Functions (HRTFs)
- Headphones (binaural)
- Speaker pair (transaural)
57Head Related Transfer Functions
- Modeling
- Finite element techniques
- Measuring
- Dummy-heads
- Human listener
- HRTFs strongly depend on the listener
- Morphological differences
- Adaptation by scaling in frequency domain
58HRTF filter design
- Filters separated into two parts
- 1. Inter-aural time difference (ITD)
- 2. Minimum-phase FIR-filter
- In movements
- Linear interpolation of ITD
- Bilinear interpolation for FIR
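The two interpolation steps can be sketched as below, assuming the HRTF filters are stored on a regular azimuth/elevation grid and that `a` and `e` are the fractional positions (0 to 1) of the wanted direction between the four surrounding grid points. Function names and the toy values are illustrative only.

```python
import numpy as np

def lerp_itd(itd0, itd1, t):
    """Linear interpolation of the interaural time difference."""
    return (1.0 - t) * itd0 + t * itd1

def bilinear_fir(h00, h10, h01, h11, a, e):
    """Bilinear interpolation of minimum-phase FIR coefficients.
    h00/h10 are neighbours in azimuth at the lower elevation, h01/h11 at the upper one."""
    return ((1 - a) * (1 - e) * h00 + a * (1 - e) * h10
            + (1 - a) * e * h01 + a * e * h11)

# Toy 3-tap filters at the four grid corners (hypothetical values)
h00 = np.array([1.0, 0.0, 0.0])
h10 = np.array([0.0, 1.0, 0.0])
h01 = np.array([0.0, 0.0, 1.0])
h11 = np.array([1.0, 1.0, 1.0])
h_mid = bilinear_fir(h00, h10, h01, h11, a=0.5, e=0.5)
itd_mid = lerp_itd(0.0002, 0.0004, 0.5)    # seconds
```

Interpolating the coefficients directly is reasonable because the filters are minimum-phase, with the delay difference handled separately by the ITD.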
59Implementing HRTFs
- Principal component analysis
- HRTF is a linear combination of eigenfilters
- Allows for smooth interpolation
- Allows for reducing the number of operations
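A rough sketch of the principal component idea follows, using synthetic data in place of measured responses. The data set (20 directions, 64 taps, built from 3 basis filters) and the function name are assumptions made so that the example is self-contained.

```python
import numpy as np

def pca_basis(responses, k):
    """Return mean filter, k eigenfilters, and per-direction weights."""
    mean = responses.mean(axis=0)
    centered = responses - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:k]                     # eigenfilters (rows)
    weights = centered @ basis.T       # one weight vector per direction
    return mean, basis, weights

rng = np.random.default_rng(0)
true_basis = rng.standard_normal((3, 64))
responses = rng.standard_normal((20, 3)) @ true_basis   # synthetic stand-in for HRIRs
mean, basis, weights = pca_basis(responses, k=3)
approx = mean + weights @ basis        # each response as a combination of eigenfilters
```

With this representation only the few weights need to be interpolated between directions, and the input signal can be filtered once per eigenfilter instead of once per direction.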
60Transaural Stereophony
- Cross talk cancellation
- Hll and Hrr are HRTFs
- Hrl and Hlr ?
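Cross-talk cancellation can be sketched as inverting, per frequency bin, the 2x2 matrix of transfer functions from the two speakers to the two ears. In the example below the first index is taken to be the ear and the second the speaker, which is an assumed convention, and the transfer function values are made up. No regularization is applied, although a practical design would need it where the matrix is nearly singular.

```python
import numpy as np

def acoustic_matrix(Hll, Hlr, Hrl, Hrr):
    """Stack per-bin transfer functions into an array of shape (bins, 2, 2)."""
    row_l = np.stack([Hll, Hlr], axis=-1)   # left ear from left and right speaker
    row_r = np.stack([Hrl, Hrr], axis=-1)   # right ear from left and right speaker
    return np.stack([row_l, row_r], axis=-2)

# Two frequency bins with hypothetical values
Hll = np.array([1.0 + 0.0j, 1.0 + 0.0j])
Hrr = np.array([1.0 + 0.0j, 1.0 + 0.0j])
Hlr = np.array([0.5 + 0.0j, 0.0 + 0.3j])
Hrl = np.array([0.5 + 0.0j, 0.0 + 0.3j])

H = acoustic_matrix(Hll, Hlr, Hrl, Hrr)
C = np.linalg.inv(H)    # cancellation filters, one 2x2 matrix per bin
```

Feeding the binaural signal through `C` before the speakers makes the combined path `H @ C` the identity, so each ear receives only its intended channel.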
61Amplitude/Intensity Panning
- The common surround sound
- Apply the proper gain to every speaker to reproduce the proper perceived direction
- in 2D: pair of loudspeakers
- in 3D: loudspeaker triangle
- Vector-Base Amplitude Panning (image from Ville Pulkki, TKK)
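For the 2D case, the gains can be sketched by expressing the source direction as a weighted sum of the two loudspeaker direction vectors and then normalizing for constant power. The angle convention (degrees, counterclockwise from straight ahead) and the function name are assumptions for this example.

```python
import numpy as np

def vbap_2d(source_deg, spk1_deg, spk2_deg):
    """Gains for a loudspeaker pair so that g1*l1 + g2*l2 points at the source."""
    def unit(deg):
        r = np.radians(deg)
        return np.array([np.cos(r), np.sin(r)])
    L = np.vstack([unit(spk1_deg), unit(spk2_deg)])   # rows are speaker directions
    g = unit(source_deg) @ np.linalg.inv(L)           # solve p = g @ L
    return g / np.linalg.norm(g)                      # constant-power normalization

g_center = vbap_2d(0.0, 30.0, -30.0)   # source midway between the speakers
g_left = vbap_2d(30.0, 30.0, -30.0)    # source exactly at the first speaker
```

The 3D case works the same way with three loudspeakers and 3x3 matrices.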
62Ambisonics
- Spherical harmonics decomposition of the pressure field at a given point
- 1st order spherical harmonics
- Sound field can be reproduced from 4 components
- 1 omnidirectional and 3 orthogonal figure-of-8
- Allows for manipulating the sound field
- Rotations, etc.
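The four first-order components and a rotation can be sketched as below. The traditional B-format convention (W scaled by 1/sqrt(2), angles in radians, azimuth counterclockwise) is an assumption here; other normalizations exist.

```python
import numpy as np

def encode_b_format(s, az, el):
    """Encode signal value s arriving from (az, el) into W, X, Y, Z."""
    w = s / np.sqrt(2.0)               # omnidirectional component
    x = s * np.cos(az) * np.cos(el)    # front-back figure-of-8
    y = s * np.sin(az) * np.cos(el)    # left-right figure-of-8
    z = s * np.sin(el)                 # up-down figure-of-8
    return np.array([w, x, y, z])

def rotate_z(b, angle):
    """Rotate the encoded sound field about the vertical axis."""
    w, x, y, z = b
    c, s = np.cos(angle), np.sin(angle)
    return np.array([w, c * x - s * y, s * x + c * y, z])

b = encode_b_format(1.0, az=0.3, el=0.2)
b_rot = rotate_z(b, 0.5)               # equals encoding the source at az = 0.8
```

Rotating the encoded field needs only a small matrix operation on the components, independent of how many sources were mixed in.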
63Wave Field Synthesis
- Reproduce the exact wave field in the reproduction regions
- Use speakers on the boundary
- Kirchhoff integral theorem
- Sound field valid everywhere in the room
- Heavy resources
- In practice limited to a planar configuration
64Comparison
Technique          Setup (# chans)  DSP       Elevation       Imaging  Sweet spot  Recording
HRTF               light (2)        moderate  yes             v.good   n/a         yes
Transaural         light (2)        moderate  yes             good     small       yes
Amplitude Panning  average (5)      low       yes (3D array)  average  medium      no
Ambisonics         average (4)      moderate  yes (3D array)  good     small       yes
WFS                heavy (100)      high      ?               v.good   n/a         ?
65Which Setup for which Environment?
- Binaural systems for desktop use
- Includes stereo transaural
- Multi-speaker systems for multi-user
- Well suited to immersive projection-based VR systems
- Projection screens act as low-pass filters
- Video projection constraints
66Other Issues for Immersive Environments
- Overall system latency
- Less than 100 ms is OK
- Tracking the user's head
- Update binaural/transaural filters
- Correction of loudspeaker gains
- Room problems
- Reflective surfaces
67Summary
- Auralization
- Direct convolution with full directional impulse responses
- Computationally too heavy in practice
- Parametric impulse response rendering
- Early reflections treated separately
- Statistical late reverberation
- Spatial sound reproduction
- Headphones: HRTFs
- Loudspeakers: VBAP, Ambisonics, Wave Field Synthesis
68Thank you for your attention!
Contact: Lauri.Savioja_at_tkk.fi
http://auralization.tkk.fi