1
The Design of Multidimensional Sound Interfaces
Michael Cohen and Elizabeth M. Wenzel (Chapter 8)
  • Presented by
  • Andrew Snyder and Thor Castillo

February 3, 2000
HFE760 - Dr. Gallimore
2
Table of Contents
  • Introduction: How we localize sound
  • Chapter 8
  • Research
  • Conclusion

3
Introduction
  • Ear Structure
  • Binaural Beats - Demo
  • Why are they important?
  • Localization Cues

4
Introduction
  • Ear Structure

5
Introduction
  • Binaural Beats - Demo (a generator is sketched
    below)
  • Why are they important?
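
As a hedged, minimal stand-in for the linked applet demo (all
parameter values are illustrative assumptions, not from the demo), a
numpy sketch that generates a binaural-beat stimulus - slightly
detuned sine tones sent to each ear, perceived over headphones as a
slow beat:

    import numpy as np

    def binaural_beat(f_carrier=220.0, f_beat=4.0, dur=5.0, fs=44100):
        """Two sine tones, detuned by f_beat Hz between the ears.

        Over headphones the listener perceives a beat at f_beat Hz,
        even though neither ear receives a modulated signal.
        """
        t = np.arange(int(fs * dur)) / fs
        left = np.sin(2 * np.pi * f_carrier * t)
        right = np.sin(2 * np.pi * (f_carrier + f_beat) * t)
        return np.stack([left, right], axis=-1)  # columns: left, right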

6
Introduction
  • Localization Cues
  • Humans use auditory localization cues to help
    locate the position in space of a sound source.
    There are eight sources of localization cues:
  • interaural time difference
  • head shadow
  • pinna response
  • shoulder echo
  • head motion
  • early echo response
  • reverberation
  • vision

7
Introduction
  • Localization Cues
  • Interaural time difference (ITD) describes the
    time delay between sounds arriving at the left
    and right ears.
  • This is a primary localization cue for
    interpreting the lateral position of a sound
    source (a common approximation is sketched
    below).
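
A common closed-form estimate of this delay (not from the chapter;
added as a hedged illustration) is Woodworth's spherical-head
approximation, with head radius $a \approx 0.0875$ m, speed of sound
$c \approx 343$ m/s, and source azimuth $\theta$ in radians:

    $\mathrm{ITD}(\theta) \approx \frac{a}{c}\,(\theta + \sin\theta)$

At $\theta = 90^\circ$ this gives roughly 0.65 ms, the familiar
maximum interaural delay.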

8
Introduction
  • Localization Cues
  • Head shadow is a term describing the attenuation
    and filtering a sound undergoes in passing
    through or around the head to reach an ear.
  • The filtering effects of head shadowing
    complicate the perception of the linear distance
    and direction of a sound source.

9
Introduction
  • Localization Cues
  • Pinna response describes the effect that the
    external ear, or pinna, has on sound.
  • Higher frequencies are filtered by the pinna in
    such a way as to affect the perceived lateral
    position, or azimuth, and elevation of a sound
    source.

10
Introduction
  • Localization Cues
  • Shoulder echo - frequencies in the range of
    1-3 kHz are reflected from the upper torso of
    the human body.

11
Introduction
  • Localization Cues
  • Head motion - moving the head to help determine
    the location of a sound source is a key, and
    quite natural, part of human hearing.

12
Introduction
  • Localization Cues
  • Early echo response and reverberation - sounds
    in the real world are the combination of the
    original sound source plus its reflections from
    surfaces in the world (floors, walls, tables,
    etc.).
  • Early echo response occurs in the first
    50-100 ms of a sound's life.

13
Introduction
  • Localization Cues
  • Vision helps us quickly locate the physical
    location of a sound and confirm the direction
    that we perceive.

14
Chapter 8 Contents
  • Introduction
  • Characterization and Control of Acoustic Objects
  • Research Applications
  • Interface Control via Audio Windows
  • Interface Design Issues: Case Studies

15
Introduction
  • I/O generations and dimensions
  • Exploring the audio design space

16
Introduction
  • I/O generations and dimensions
  • First Generation - early computer terminals
    allowed only textual I/O: the character-based
    user interface (CUI)
  • Second Generation - as terminal technology
    improved, users could manipulate graphical
    objects: the graphical user interface (GUI)
  • Third Generation - 3D graphical devices.
  • 3D audio - the sound has a spatial attribute,
    originating, virtually or exactly, from an
    arbitrary point with respect to the listener.
    This chapter focuses on the third generation of
    the aural sector.

17
Introduction
  • Exploring the audio design space
  • Most people think that it would be easier to be
    hearing- than sight-impaired, even though the
    incidence of disability-related cultural
    isolation is higher among the deaf than the
    blind.
  • The development of user interfaces has
    historically focused more on visual modes than
    aural ones.
  • Sound is frequently included and utilized to the
    limits of its availability and affordability in
    PCs. However, computer-aided exploitation of the
    audio bandwidth is only now beginning to rival
    that of graphics.
  • Because of the cognitive overload that results
    from overburdening other systems (perhaps
    especially the visual), there are strong
    motivations for exploiting sound to its full
    potential.

18
Introduction
  • Exploring the audio design space
  • This chapter reviews the evolving state of the
    art of non-speech audio interfaces, driving both
    spatial and non-spatial attributes.
  • This chapter will focus primarily on the
    integration of these new technologies, crafting
    effective matches between projected user desires
    and emerging technological capabilities.

19
Characterization and Control of Acoustic Objects
Part of listening to a mixture of conversations
or music is being able to hear the individual
voices or musical instruments. This
synthesis/decomposition duality is the opposite
effect of masking: instead of sounds hiding each
other, they are complementary and individually
perceivable. Audio imaging is the creation of
sonic illusions by manipulation of stereo
channels. In a stereo system, sound comes from
only left and right transducers, whether
headphones or loudspeakers. Spatial sound
involves technology that allows sound to emanate
from any direction (left-right, up-down,
back-forth, and everything in between).
20
Characterization and Control of Acoustic Objects
  • The cocktail party effect - we can filter sound
    according to:
  • position
  • speaker voice
  • subject matter
  • tone/timbre
  • melodic line and rhythm

21
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • Implementing spatial sound
  • Non-spatial dimensions and auditory symbology

22
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • The goal of spatial sound synthesis is to project
    audio media into space by manipulating sound
    sources so that they assume virtual positions,
    mapping the source channel into three-dimensional
    space. These virtual positions enable auditory
    localization.
  • Duplex Theory (Lord Rayleigh, 1907) - human
    sound localization is based on two primary cues
    to location: interaural differences in time of
    arrival (ITD) and interaural differences in
    intensity (IID). (A small estimator for both
    cues is sketched below.)
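
As a hedged illustration of the two duplex cues (this helper is
hypothetical, not from the chapter), a numpy sketch that estimates
ITD by interaural cross-correlation and IID as an RMS level ratio:

    import numpy as np

    def estimate_duplex_cues(left, right, fs):
        """Estimate the duplex cues from a binaural signal pair.

        left, right: equal-length 1-D arrays; fs: sample rate in Hz.
        Returns (itd_seconds, iid_db).
        """
        # ITD: lag (in samples) of the peak of the interaural
        # cross-correlation.
        corr = np.correlate(left, right, mode="full")
        lag = np.argmax(corr) - (len(right) - 1)  # >0: left lags right
        itd = lag / fs
        # IID: broadband level difference between the ears, in dB.
        rms = lambda x: np.sqrt(np.mean(np.square(x)) + 1e-12)
        iid = 20.0 * np.log10(rms(left) / rms(right))
        return itd, iid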

23
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • There are several problems with the duplex
    theory:
  • It cannot account for the ability of subjects to
    localize many types of sounds coming from many
    different regions (e.g., sounds along the median
    plane)
  • When duplex cues alone are used to generate
    sound in headphones, the sound is perceived as
    inside the head
  • Most of the deficiencies of the duplex theory
    are linked to the interaction of sound waves
    with the pinnae (outer ears)

24
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • Peaks and valleys in the auditory spectrum can
    be used as localization cues for the elevation
    of a sound source. Other cues are also necessary
    to locate the vertical position of a sound
    source. Elevation perception remains of great
    interest to researchers because it is still not
    fully understood.

25
Characterization and Control of Acoustic Objects
  • Spatial dimensions of sound
  • Localization errors in current sound-generating
    technologies are very common; some of the
    problems that persist are:
  • Locating sound on the vertical plane
  • Some systems can cause front-back reversals
  • Some systems can cause up-down reversals
  • Judging distance from the sound source - we're
    generally terrible at doing this anyway!
  • Sound localization can be dramatically improved
    with a dynamic stimulus (which can reduce the
    number of reversals):
  • Allowing head motion
  • Moving the location of the sound
  • Researchers suggest that this can help
    externalize sound!

26
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Physically locating loudspeakers in the place
    where each source is located, relative to the
    listener (the most direct approach).
  • Not portable; cumbersome.
  • Other approaches use analytic mathematical
    models of the pinnae and other body structures
    in order to directly calculate acoustic
    responses.
  • A third approach to accurate real-time
    spatialization concentrates on digital signal
    processing (DSP) techniques for synthesizing
    cues from direct measurements of head-related
    transfer functions (HRTFs). (The author focuses
    on this type of approach.)

27
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • DSP - the goal is to make sound spatializers
    that give the impression that the sound is
    coming from different sources at different
    locations.
  • Why? A display built on this technology can
    exploit the human ability to quickly and
    subconsciously locate sound sources.
  • Convolution - in some DSPs, hardware- and/or
    software-based engines perform the convolution
    that filters the sound (a minimal sketch follows
    below).
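
A minimal sketch of that convolution step, assuming a pre-measured
head-related impulse response (HRIR) pair for the desired direction
(the function and array names are hypothetical):

    import numpy as np
    from scipy.signal import fftconvolve

    def spatialize(mono, hrir_left, hrir_right):
        """Place a mono source at the position encoded by an HRIR pair.

        mono: dry source signal; hrir_left, hrir_right: head-related
        impulse responses measured for the desired direction.
        Returns a (samples, 2) binaural array for headphone playback.
        """
        out_left = fftconvolve(mono, hrir_left)    # left-ear filtering
        out_right = fftconvolve(mono, hrir_right)  # right-ear filtering
        return np.stack([out_left, out_right], axis=-1)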

28
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Crystal River Engineering Convolvotron 
  • Gehring Research Focal Point
  • AKG CAP (Creative Audio Processor)
  • Head Acoustics
  • Roland Sound Space (RSS) Processor
  • Mixels

29
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Crystal River Engineering Convolvotron 
  • What is it? It is a convolution engine that
    spatializes sound by filtering audio channels
    with transfer functions that simulate positional
    effects.
  • Alphatron, Acoustetron II
  • The technology is good except for computational
    time delays of 30-40 ms (which can be picked up
    by the ear when combined with visual inputs)

30
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Gehring Research Focal Point
  • What is it? Focal Point comprises two binaural
    localization technologies, Focal Point Type 1 and
    2.
  • Focal Point 1 - the original Focal Point
    technology, utilizing time-domain convolution
    with impulse responses based on head-related
    transfer functions, for anechoic simulation.
  • Focal Point 2 - a Focal Point implementation in
    which sounds are preprocessed offline, creating
    interleaved sound files which can then be
    positioned in 3D in real time upon playback.

31
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • AKG CAP (Creative Audio Processor)
  • What is it? A kind of binaural mixing console.
    The system is used to create audio recordings
    with integrated head-related transfer functions
    (HRTFs) and other 3D audio filters.

32
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Head Acoustics
  • What is it? A research company in Germany that
    has developed a spatial audio system with an
    eight-channel binaural mixing console, using
    anechoic simulations as well as a new version of
    an artificial head.

33
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Roland Sound Space (RSS) Processor
  • What is it? Roland has developed a system which
    attempts to provide real-time spatialization
    capabilities for both headphone and stereo
    loudspeaker presentation. The basic RSS system
    allows independent placement of up to four
    sources using convolution.
  • What makes this system special is that it
    incorporates a technique known as transaural
    processing, or crosstalk cancellation between
    the stereo speakers (sketched below). This
    technique seems to allow an adequate spatial
    impression to be achieved.
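
A hedged sketch of the crosstalk-cancellation idea - a generic
frequency-domain inversion of the 2x2 speaker-to-ear transfer matrix,
not Roland's proprietary method; the measured impulse responses are
assumed inputs:

    import numpy as np

    def crosstalk_cancel(ear_l, ear_r, h_ipsi, h_contra, eps=1e-3):
        """Loudspeaker feeds that deliver a binaural signal to the ears.

        ear_l, ear_r: desired ear signals (equal length, time domain).
        h_ipsi, h_contra: impulse responses from a speaker to the
        same-side and opposite-side ear (symmetric geometry assumed).
        Inverts the 2x2 acoustic transfer matrix per frequency bin;
        eps regularizes nearly singular bins.
        """
        n = len(ear_l) + len(h_ipsi) - 1          # convolution length
        E_l, E_r = np.fft.rfft(ear_l, n), np.fft.rfft(ear_r, n)
        H_i, H_c = np.fft.rfft(h_ipsi, n), np.fft.rfft(h_contra, n)
        det = H_i * H_i - H_c * H_c               # det [[H_i,H_c],[H_c,H_i]]
        det = np.where(np.abs(det) < eps, det + eps, det)
        spk_l = (H_i * E_l - H_c * E_r) / det     # inverse-matrix solve
        spk_r = (H_i * E_r - H_c * E_l) / det
        return np.fft.irfft(spk_l, n), np.fft.irfft(spk_r, n)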

34
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Mixels
  • The number of channels in a system corresponds
    to its degree of spatial polyphony - the number
    of simultaneous spatialized sound sources the
    system can generate. On the assumption that
    systems will increase their capabilities
    enormously via their number of channels, these
    channels deserve a name of their own.
  • By way of analogy to pixels and voxels, the
    atomic level of sound mixing is sometimes called
    a mixel, an acronym for "sound mixing element".

35
Characterization and Control of Acoustic Objects
  • Implementing spatial sound
  • Mixels
  • Rather than diving more deeply into spatial
    audio systems, the rest of the chapter
    concentrates on the nature of the control
    interfaces that will need to be developed to
    take full advantage of these new capabilities.

36
Characterization and Control of Acoustic Objects
  • Non-spatial dimensions and auditory symbology
  • Auditory icons - acoustic representations of
    naturally occurring events that caricature the
    action being represented
  • Earcons - elaborated auditory symbols which
    compose motifs into an artificial non-speech
    language, with phrases distinguished by rhythmic
    and tonal patterns (a toy synthesis sketch
    follows below)
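
As a toy illustration (the motif encoding below is a hypothetical
choice, not from the chapter), a numpy sketch that renders a short
rhythmic/tonal motif as an earcon:

    import numpy as np

    def earcon(motif, fs=44100, base_freq=440.0):
        """Render a motif as a sequence of enveloped sine notes.

        motif: list of (semitones_above_base, duration_s) pairs.
        """
        notes = []
        for semitones, dur in motif:
            t = np.arange(int(fs * dur)) / fs
            freq = base_freq * 2.0 ** (semitones / 12.0)  # equal temper
            notes.append(np.sin(2 * np.pi * freq * t) * np.hanning(len(t)))
        return np.concatenate(notes)

    # e.g., a rising three-note "confirmation" motif:
    # signal = earcon([(0, 0.12), (4, 0.12), (7, 0.24)])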

37
Characterization and Control of Acoustic Objects
  • Non-spatial dimensions and auditory symbology
  • Filtears - a class of cues independent of
    distance and direction, used to attempt to
    expand the spectrum of how we use sound by
    creating sounds with attributes attached to
    them. Think of it as sonic typography: placing
    sound in space can be likened to putting written
    information on a page. Filtears are dependent on
    source and sink.
  • Example: imagine you're telenegotiating with
    many people. You can select attributes of each
    person's voice (distance from you, direction,
    indoors/outdoors, whispers behind your ear,
    etc.).

38
Research Applications
  • Virtual acoustic displays featuring spatial
    sound can be thought of as enabling two
    performance advantages:
  • Situation awareness - omnidirectional monitoring
    via direct representation of spatial information
    reinforces or replaces information in other
    modalities, enhancing one's sense of presence or
    realism.
  • Multiple-channel segregation - can improve
    intelligibility, discrimination, and selective
    attention among audio sources.

39
Research Applications
  • Sonification
  • Teleconferencing
  • Music
  • Virtual Reality and Architectural Acoustics
  • Telerobotics and Augmented Audio Reality

40
Research Applications
  • Sonification
  • Sonification can be thought of as auditory
    visualization and can be used as a tool for
    analysis - for example, presenting multivariate
    data as auditory patterns. Because the visual
    and auditory channels can be independent of each
    other, data can be mapped differently to each
    mode of perception, and auditory mappings can be
    used to discover relationships that are hidden
    in the visual display (a minimal sketch follows
    below).
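
A minimal parameter-mapping sketch of this idea, assuming a simple
data-to-pitch mapping (the function and its defaults are
hypothetical):

    import numpy as np

    def sonify(values, fs=44100, note_dur=0.1, f_lo=220.0, f_hi=880.0):
        """Map a data series to a tone sequence (auditory display).

        Each value becomes a short sine tone whose pitch interpolates
        logarithmically between f_lo and f_hi.
        """
        v = np.asarray(values, dtype=float)
        span = v.max() - v.min()
        norm = (v - v.min()) / (span if span else 1.0)
        freqs = f_lo * (f_hi / f_lo) ** norm      # log pitch mapping
        t = np.arange(int(fs * note_dur)) / fs
        env = np.hanning(len(t))                  # click-free envelope
        return np.concatenate([env * np.sin(2 * np.pi * f * t)
                               for f in freqs])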

41
Interface Control via Audio Windows
  • An audio window system is an auditory-object
    manager.
  • The general idea is to permit multiple
    simultaneous audio sources, such as the voices
    in a teleconference, to coexist in a modifiable
    display without clutter or user stress (a
    structural sketch follows below).
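
A hedged structural sketch of such an auditory-object manager (class
and field names are hypothetical, for illustration only):

    from dataclasses import dataclass, field

    @dataclass
    class Source:
        """One acoustic object, e.g. a conferee's voice channel."""
        azimuth: float = 0.0    # degrees, 0 = straight ahead
        distance: float = 1.0   # metres from the listener (sink)
        gain: float = 1.0       # per-source emphasis

    @dataclass
    class AudioWindowManager:
        """Keeps many simultaneous sources individually arrangeable."""
        sources: dict = field(default_factory=dict)

        def add(self, name: str) -> None:
            self.sources[name] = Source()

        def move(self, name: str, azimuth: float, distance: float) -> None:
            self.sources[name].azimuth = azimuth
            self.sources[name].distance = distance

        def emphasize(self, name: str, gain: float) -> None:
            self.sources[name].gain = gain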

42
Interface Design Issues: Case Studies
  • Veos and Mercury (written with Brian Karr)
  • Handy Sound
  • Maw

43
Interface Design Issues: Case Studies
  • Veos and Mercury (written with Brian Karr)
  • Veos - Virtual Environment Operating System
  • Sound Render Implementation - A software package
    that interfaces with a VR system (like Veos).
  • The Audio Browser - A hierarchical sound file
    navigation and audition tool.

44
Interface Design Issues: Case Studies
  • Handy Sound
  • Handy Sound explores gestural control of an audio
    window system.
  • Manipulating source position in Handy Sound
  • Manipulating source quality in Handy Sound
  • Manipulating sound volume in Handy Sound
  • Summary - Handy Sound demonstrates the general
    possibilities of gesture recognition and spatial
    sound in a multichannel conferencing environment.

45
Interface Design Issues: Case Studies
  • Maw
  • Developed as an interactive front end for
    teleconferencing, Maw allows the user to arrange
    sources and sinks in a horizontal plane.
  • Manipulating source and sink positions in Maw
  • Organizing acoustic objects in Maw
  • Manipulating sound volume in Maw
  • Summary

46
Conclusion
  • Real world examples

47
Sound authoring tools for future multimedia
systems
  • Bezzi, Marco; De Poli, Giovanni; Rocchesso,
    Davide
  • Univ. di Padova, Padova, Italy
  • Summary
  • A framework for authoring non-speech sound
    objects in the context of multimedia systems is
    proposed. The goal is to design specific sounds
    and their dynamic behavior in such a way that
    they convey dynamic and multidimensional
    information. Sounds are designed using a
    three-layer abstraction model: a physically
    based description of sound identity, a
    signal-based description of sound quality, and a
    perception- and geometry-based description of
    sound projection in space. The model is
    validated with the aid of an experimental tool
    where manipulation of sound objects can be
    performed in three ways: handling a set of
    parameter-control sliders, editing the evolution
    in time of compound parameter settings, or via
    client applications sending their requests to
    the sounding engine. (Author abstract, 26 refs,
    in English. Proceedings of the 1999 6th
    International Conference on Multimedia Computing
    and Systems, IEEE ICMCS'99, June 7-11, 1999,
    Florence, Italy. Sponsored by IEEE CS and the
    IEEE Circuits and Systems Society.)

48
Interactive 3D sound hyperstories for blind
children
  • Lumbreras, Mauricio; Sanchez, Jaime
  • Univ. of Chile, Santiago, Chile
  • Summary
  • Interactive software is currently used for
    learning and entertainment purposes. This type
    of software is not very common among blind
    children because most computer games and
    electronic toys do not have appropriate
    interfaces to be accessible without visual cues.
    This study introduces the idea of interactive
    hyperstories carried out in a 3D acoustic
    virtual world for blind children. We have
    conceptualized a model to design hyperstories.
    Through AudioDoom we have an application that
    enables testing cognitive tasks with blind
    children. The main research question underlying
    this work explores how audio-based entertainment
    and navigable spatial-sound experiences can
    create cognitive spatial structures in the minds
    of blind children. AudioDoom presents
    first-person experiences through exploration of
    interactive virtual worlds by using only 3D
    aural representations of the space. (Author
    abstract, 21 refs, in English. Proceedings of
    the CHI 99 Conference: CHI is the Limit - Human
    Factors in Computing Systems, May 15-20, 1999,
    Pittsburgh, PA, USA. Sponsored by ACM SIGCHI.)

49
Any questions???
50
References
  • Modeling Realistic 3-D Sound Turbulence:
    http://www-engr.sjsu.edu/duda/Duda.Reports.html#R1
  • 3D Sound Aids for Fighter Pilots:
    http://www.dsto.defence.gov.au/corporate/history/jubilee/sixtyyears18.html
  • 3D Sound Synthesis:
    http://www.ee.ualberta.ca/khalili/3Dnew.html
  • Binaural Beat Demo:
    http://www.monroeinstitute.org/programs/bbapplet.html
  • Begault, Durand R. "Challenges to the Successful
    Implementation of 3-D Sound". NASA-Ames Research
    Center, Moffett Field, CA, 1990.
  • Begault, Durand R. "An Introduction to 3-D Sound
    for Virtual Reality". NASA-Ames Research Center,
    Moffett Field, CA, 1992.

51
References
  • Burgess, David A. "Techniques for Low Cost
    Spatial Audio". UIST 1992.
  • Foster, Wenzel, and Taylor. "Real-Time Synthesis
    of Complex Acoustic Environments". Crystal River
    Engineering, Groveland, CA.
  • Smith, Stuart. "Auditory Representation of
    Scientific Data". In Focus on Scientific
    Visualization, H. Hagen, H. Muller, G.M.
    Nielson, eds. Springer-Verlag, 1993.
  • Stuart, Rory. "Virtual Auditory Worlds: An
    Overview". In VR Becomes a Business, Proceedings
    of Virtual Reality '92, San Jose, CA, 1992.
  • Takala, Tapio and James Hahn. "Sound Rendering".
    Computer Graphics, 26(2), July 1992.

52
One last thing
  • For those who want to have a little fun, try
    these:
  • http://www.cs.indiana.edu/picons/javoice/index.html
  • http://ourworld.compuserve.com/homepages/Peter_Meijer/javoice.htm

53
The Design of Multidimensional Sound Interfaces
Michael Cohen and Elizabeth M. Wenzel (Chapter 8)
  • Presented by
  • Andrew Snyder and Thor Castillo

February 3, 2000
HFE760 - Dr. Gallimore