Perceptual Organization - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Perceptual Organization


1
Perceptual Organization
  • Perfecto Herrera
  • Music Perception and Cognition

2
The problem(s) of perceptual organization
3
Some terms
  • Source: the physical entity that gives rise to
    the sound pressure waves, e.g. a violin being
    played
  • Stream: the percept of a group of successive
    and/or simultaneous sounds as a coherent whole
    appearing to come from a single source
  • The sounds we hear at any one time usually come
    from a number of different sources.
  • In most cases we can hear and identify each of
    the different sound sources as having its own
    pitch, timbre, loudness and location.

4
Auditory Scene Analysis
  • A computational theory of hearing is required
    plus a functional explanation of the information
    processing problems that the auditory system must
    solve in order to make sense of the acoustic
    environment
  • Work in computer vision has benefited from a
    computational theory since the late 1970s, due
    to David Marr
  • A similar foundation for hearing was developed by
    Albert Bregman at McGill University in Montreal
    and is known as auditory scene analysis

5
Auditory Scene Analysis
  • ASA can be conceptualized as a two-stage process:
  • The mixture of sounds is decomposed into a
    collection of sensory elements (onsets, pitch
    trajectories, modulations, spectral tracks, etc.)
  • Elements that are likely to have arisen from the
    same event are grouped to form a perceptual
    structure (stream) which can be interpreted by
    higher centers
  • For example, when listening to a violin
    performance, it is the task of auditory scene
    analysis to group the acoustic events emitted
    from the physical source (the violin) into a
    perceptual stream (the mental experience of a
    violin being played).

Is this the only way of listening? What about
reduced listening? Read Pierre Schaeffer
6
Auditory Scene Analysis
  • In most listening situations, a mixture of sounds
    reaches the ears. However, we can:
  • Attend to one conversation amid many competing
    voices and other background sounds (e.g. music)
    at a cocktail party
  • Follow the melodic line played by the violins in
    an orchestral recording.
  • This problem is of great scientific interest, and
    a solution also has engineering applications
  • -> The Holy Grail!
  • Auditory image of Bach's Mass in B minor,
    consisting of voice, violin, cello, etc.
  • How does the auditory system process this image
    to recover a description of each source?

7
  • Active perception: expectation-based processing
    (bottom-up and top-down)

8
Auditory Scene Analysis
  • The inner ear separates sound into its frequency
    components
  • At some point in the auditory system these
    components need to be assigned to the appropriate
    sound source
  • Often called perceptual grouping, or auditory
    scene analysis
  • Two aspects: simultaneous grouping and sequential
    grouping
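
As a rough illustration of the first step above (separating a mixture into frequency components), the sketch below uses a naive discrete Fourier transform as a stand-in. This is an assumption made for illustration only: the inner ear behaves more like a bank of overlapping filters than a DFT, and the sample rate, tone frequencies, and detection threshold are all hypothetical values.

```python
import math

SAMPLE_RATE = 800   # samples per second (illustrative value)
N_SAMPLES = 400     # 0.5 s of signal, so DFT bins are 2 Hz apart


def dft_magnitudes(signal):
    """Naive DFT: magnitude of each bin from 0 Hz up to the Nyquist frequency."""
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * t / n) for t, x in enumerate(signal))
        magnitudes.append(math.hypot(re, im) / n)
    return magnitudes


# Hypothetical mixture: a 100 Hz tone plus a quieter 150 Hz tone.
mixture = [
    math.sin(2 * math.pi * 100 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 150 * t / SAMPLE_RATE)
    for t in range(N_SAMPLES)
]

magnitudes = dft_magnitudes(mixture)
# Keep only bins that clearly carry energy (the 0.1 threshold is arbitrary).
components_hz = [k * SAMPLE_RATE / N_SAMPLES for k, m in enumerate(magnitudes) if m > 0.1]
print(components_hz)
```

The decomposition only yields a list of components; deciding which components belong to which source is the grouping problem that the rest of the slides address.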

9
Auditory Scene Analysis
  • Simultaneous grouping: the grouping together of
    the simultaneous frequency components that come
    from a single source
  • Sequential grouping: the connecting over time of
    the changing frequencies that a single source
    produces from one moment to the next

10
Example: simultaneous grouping and sequential
grouping
11
Antecedents: Gestalt Psychology
  • Gestalt means "pattern" in German
  • Gestalt psychology originated in the early 20th
    century with Max Wertheimer (1880-1943), Wolfgang
    Köhler (1887-1967) and Kurt Koffka (1886-1941)
  • The basic principles underlying Gestalt
    psychology are:
  • The whole is greater than the sum of the parts
  • The parts are defined by the whole as much as
    vice versa
  • Gestalt psychologists are best known for their
    work in vision but their principles are also
    applicable to auditory perception.
  • They systematically developed a set of principles
    of perceptual organisation (believed to be
    innate) that they thought determine how we
    assemble or associate components in a perceptual
    field
  • These principles are:

12
Gestalt Psychology Principles
  • Proximity
  • Similarity
  • Common Fate (Common Direction)
  • Good Continuation
  • Disjoint Allocation (Belongingness)
  • Closure

Bottom-up: hard-wired, pre-attentive, not
learned (primitive)
Top-down: plastic, learned (schema-driven)
13
Proximity
  • In vision, when elements in an image are close
    together they are perceived as belonging together
    and as separate from others that are further
    away, even though they are similar
  • In hearing, sounds occurring close together in
    time are clustered

14
Similarity
  • Two or more auditory events are grouped if they
    are similar in timbre, pitch, loudness or close
    in apparent location or time
  • Fundamentals in the same region but harmonics
    not: leads to fission, i.e. different timbres
    but same pitch are heard as unfused
  • Harmonics in the same region but fundamentals
    not: leads to fusion, i.e. different pitches but
    same timbre are heard as fused
  • This is not clear-cut: it depends on individual
    differences.
  • If the difference in loudness is large enough,
    they form different streams; either can be
    attended to
  • Same dB → single stream at twice the tempo

15
Common Fate
  • Components in sound act together
  • They tend to start and finish together
  • They tend to change in pitch or intensity
    together
  • Therefore, if we have a complex sound and its
    components are co-ordinated, they are fused
    (cues include onset disparities, and AM and FM,
    i.e. tremolo and vibrato)
  • For example, if the frequencies of harmonics 2, 4
    and 8 are modulated (FM), they separate from
    harmonics 3, 5, 6 and 7
  • Or if the frequency of the 1st harmonic is
    modulated (FM) at a different rate, it separates
    from harmonics 3, 4 and 5

16
Good Continuation
  • Natural sound sources tend to change gradually
    rather than abruptly in frequency, intensity,
    location or timbre
  • Abrupt change → new stream → new source
  • Low and high tones tend to split into streams;
    this can be suppressed by putting glides in
    between
  • In speech, if there are oscillations in
    frequency, it gives the impression that there are
    two speakers saying the one word
  • In music in general if a note is near in pitch to
    the one just before it then it will be heard as
    the next note in the melody rather than a note
    that is separate - higher or lower

17
Disjoint Allocation (Belongingness)
  • One component can only come from one source
    i.e. hearing tries to use each component only
    once
  • Say we have two tones (A and B) at slightly
    different pitches, and these can be heard either
    in isolation or embedded in another series of
    pitches. In isolation, the order AB or BA is
    easily judged.
  • The addition of tones (Xs) that are close in
    pitch to A and B acts as a distracter, making it
    difficult to order AB (this is thought to be
    because we attend more to the start and end of
    sequences).
  • But if more Xs are added, they form a stream
    that is separate from AB and again the order of
    AB is easily judged.
  • This is not hard and fast: ambiguity is possible,
    which shows that this level of organisation is on
    the boundary between pre-attentive and attentive
    processing
  • It also shows how the addition of new elements
    changes the perceptual organisation of the
    stimulus.

18
Closure
  • A source may be obscured or absent but its
    percept continues
  • e.g. FM radio disturbance from the ignition of
    passing cars: we hear a click over the sound,
    whereas in fact the radio is producing only a
    click
  • A pitched sound that is broken but the gap is
    filled by noise seems unbroken
  • Similarly a glide that is broken but the gap is
    filled with noise seems unbroken

19
Auditory Scene Analysis
  • Bregman re-examines the Gestalt principles and
    proposes the simultaneous and sequential grouping
    cues as the basic elements of information that
    help to organize our perception: what, when,
    where, how
  • Bregman, A. S. (1990). Auditory Scene Analysis:
    The Perceptual Organization of Sound. Cambridge,
    Mass.: The MIT Press.
  • But see also:
  • Wang, D. & Brown, G. (Editors) (2006).
    Computational Auditory Scene Analysis:
    Principles, Algorithms and Applications. New
    York: Wiley.

20
Simultaneous grouping
  • Some cues
  • Fundamental Frequency and Spectral Regularity
  • Onset Timing
  • Correlated changes in Amplitude or Frequency
  • Sound Location
  • Important: a single cue may not be effective all
    the time; these cues work together for the
    perceptual organisation of the input sound

21
Fundamental Frequency
  • Consider two musical instruments each playing a
    note simultaneously
  • It is easier to hear each note and each
    instrument if they are playing different notes
    (have different fundamental frequencies)
  • Simultaneous sounds are more likely to fuse if
    they have the same fundamental frequency
  • It has been shown that a pair of simultaneously
    presented vowels are easier to identify if their
    fundamental frequencies differ

22
Spectral Regularity
  • Perceptual fusion of the frequency components
    of a harmonic sound (harmonicity): they are heard
    as a single sound
  • If a frequency component does not form part of
    the harmonic series it tends to be heard out
    separately as if part of a different source
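
The harmonicity cue above can be sketched as a simple "harmonic sieve": components close to an integer multiple of the fundamental are treated as fused, and the rest are treated as heard out. This is only an illustrative sketch. The function name, the 3% tolerance, and the example frequencies are assumptions, not values taken from the slides.

```python
def split_by_harmonicity(components_hz, f0_hz, tolerance=0.03):
    """Split components into those near a harmonic of f0 and those that are not.

    tolerance is the allowed relative mistuning (hypothetical value).
    """
    harmonic, mistuned = [], []
    for freq in components_hz:
        # Nearest harmonic number, never below the fundamental itself.
        n = max(1, round(freq / f0_hz))
        deviation = abs(freq - n * f0_hz) / (n * f0_hz)
        if deviation <= tolerance:
            harmonic.append(freq)
        else:
            mistuned.append(freq)
    return harmonic, mistuned


# Hypothetical complex on a 200 Hz fundamental with one mistuned partial (848 Hz).
fused, heard_out = split_by_harmonicity([200, 400, 600, 848, 1000], f0_hz=200)
print("fused:", fused)
print("heard out:", heard_out)
```

A hard threshold is a simplification chosen for clarity; in listening experiments the effect of mistuning is graded rather than all-or-nothing.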

23
Onset disparities
  • Perceptual separation of tones is enhanced by
    onset asynchrony.
  • A frequency component that stops or starts at a
    different time from the complex sound is less
    likely to be heard as part of it than if it is
    simultaneous with it
  • Important for making a soloist stand out

24
Onset disparities
  • We can hear each of two simultaneously played
    notes more easily if there is a small onset
    difference between them
  • These onset asynchronies are up to 30 ms, so the
    percept is still of the notes sounding together
  • The auditory system can exploit these onset
    differences even though we are not consciously
    aware of them
  • Ensemble playing: completely synchronised?

25
Onset disparities
  • Shorter rise times: easier to hear the order of
    the tones
  • Generally, sounds with abrupt onsets (shorter
    rise time) stand out better from a background of
    other sounds than do slow-rising sounds
  • Shorter rise times aid the perceptual
    segregation of sounds (telling them apart)
  • Rapid-onset sounds: e.g. notes from plucked or
    struck instruments
  • Why do many musical systems combine sounds with
    abrupt and slow attacks?

26
Correlated Changes in Amplitude or Frequency
  • A sound may be perceptually segregated from an
    unchanging background if its components are
    modulated in amplitude or frequency
  • Hear harmonic complex tone
  • Harmonics 1, 3, 5, 6, 7 remain steady
  • Harmonics 2, 4, and 8 rise and fall in frequency
    four times
  • Hear the two sets as separate sounds
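
The demonstration described above can be sketched numerically: build a frequency trajectory for each harmonic, then group harmonics whose trajectories have the same shape relative to their starting frequency (common fate). The fundamental, modulation depth, sample rate, and the function names are all hypothetical choices for illustration; only the harmonic numbers and the four rise-and-fall cycles come from the slide.

```python
import math

SAMPLE_RATE = 8000          # samples per second (illustrative)
DURATION_S = 1.0
F0_HZ = 200.0               # hypothetical fundamental
STEADY = {1, 3, 5, 6, 7}    # harmonics that stay fixed
MODULATED = {2, 4, 8}       # harmonics that rise and fall together
MOD_CYCLES = 4              # four rise-and-fall cycles over the tone
MOD_DEPTH = 0.05            # +/- 5 % frequency excursion (assumed)


def frequency_track(harmonic):
    """Instantaneous frequency (Hz) of one harmonic at every sample."""
    n_samples = int(SAMPLE_RATE * DURATION_S)
    track = []
    for i in range(n_samples):
        t = i / SAMPLE_RATE
        if harmonic in MODULATED:
            wobble = math.sin(2 * math.pi * MOD_CYCLES * t / DURATION_S)
        else:
            wobble = 0.0
        track.append(harmonic * F0_HZ * (1.0 + MOD_DEPTH * wobble))
    return track


def group_by_common_fate(tracks, step=500):
    """Group harmonics whose relative frequency contours match."""
    groups = {}
    for harmonic, track in tracks.items():
        # Contour relative to the starting frequency, coarsely sampled.
        shape = tuple(round(f / track[0], 3) for f in track[::step])
        groups.setdefault(shape, []).append(harmonic)
    return sorted(sorted(group) for group in groups.values())


tracks = {h: frequency_track(h) for h in sorted(STEADY | MODULATED)}
groups = group_by_common_fate(tracks)
print(groups)
```

Comparing rounded contours is a deliberately crude similarity test; a correlation measure would be a more robust choice for real signals.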

27
Sound Location
  • Sounds coming from different locations in space
    are generally assumed to be from different
    sources
  • But this is a weak cue for simultaneous
    grouping; it becomes stronger for sequential
    grouping

28
Sequential organisation
  • Events in the world occur over time. We organise
    sounds into sequences over time using various
    criteria
  • Events that are similar in some way (e.g. in
    loudness or pitch) or going in the same direction
    (e.g. rising or falling) are perceived to have
    the same origin.
  • Music uses this principle
  • Streams are created by differences in pitch,
    loudness, timbre, repetition rate, etc., and by
    combining these in different ways.
  • Characteristics of streams:
  • Streams are separate: we only attend to one
    fully at a time.
  • Foreground and background: possibly 3 maximum
  • Stream organisation is relative rather than
    absolute
  • Stream organisation may change as the complexity
    of the stimulus changes
  • Some aspects of streaming are pre-attentive,
    others are attentive, i.e. attentive means that
    by attending to different aspects of a stimulus
    we hear different things

29
  • The Trill Phenomenon
  • Miller and Heise experimented with two
    alternating tones to see how close in pitch they
    had to be for people to hear a trill.
  • They observed two states:
  • When the frequency difference is small, the pitch
    moves continuously up and down (i.e. a trill)
  • When the frequency difference is large, two
    separate tones are heard.
  • Miller called the breaking point the "trill
    threshold" - it is at approximately 3 semitones.
    Eventually, as the pitch variation drops to below
    the JND for pitch the trill will become vibrato
    (FM, see lecture 5)
  • The trill phenomenon is an example of auditory
    grouping by proximity or similarity of pitch. In
    the first pair of sequences below the x's and z's
    are seen as separate "objects", but in the second
    we see a single zigzag of x's and z's
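
The trill threshold can be expressed as a distance in semitones between the two alternating tones. The sketch below converts a frequency ratio to semitones and applies the approximate 3-semitone boundary quoted above. The function names and example frequencies are hypothetical, and a single fixed threshold is a simplification.

```python
import math

TRILL_THRESHOLD_SEMITONES = 3.0  # approximate value quoted on the slide


def semitone_distance(f1_hz, f2_hz):
    """Absolute pitch distance between two frequencies in equal-tempered semitones."""
    return abs(12 * math.log2(f2_hz / f1_hz))


def predicted_percept(f1_hz, f2_hz):
    """Very rough prediction: a trill below the threshold, two tones above it."""
    if semitone_distance(f1_hz, f2_hz) < TRILL_THRESHOLD_SEMITONES:
        return "trill"
    return "two separate tones"


print(predicted_percept(440.0, 466.16))  # about 1 semitone apart
print(predicted_percept(440.0, 587.33))  # about 5 semitones apart
```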

30
Sequential grouping
  • Periodicity cues: periodic oscillations help to
    group objects according to their rates
  • Spectral cues: we tend to group in time elements
    that appear in the same spectral regions (e.g.,
    high partials vs. low partials)
  • Level (intensity) cues: we tend to group in time
    elements of similar level
  • Spatial cues: we tend to group in time elements
    coming from the same place

31
Features Important for Sequential Grouping
  • Spectral distribution (old-plus-new heuristic)

32
Streaming
  • What happens when pitch separation and/or
    repetition rate are varied?
  • If we compress the time dimension do we hear
    notes that are further apart in frequency
    belonging together?
  • This was tested by Van Noorden (1976, 1977), who
    found:
  • Segregation depends on repetition rate and pitch
    separation
  • When stream segregation occurs, we are unable to
    attend fully to the events in both streams at the
    same time
  • We find it difficult to distinguish the order of
    events across streams
  • We have trouble hearing the overall rhythm of the
    sequence
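
To explore the two parameters above, a streaming stimulus can be described as a schedule of alternating low and high tones. The sketch below only builds that schedule (onset time and frequency of each tone). It does not predict where segregation occurs, since the slides give no numeric boundary. The function name and parameter values are hypothetical.

```python
def alternating_sequence(base_hz, separation_semitones, tones_per_second, n_tones):
    """Return (onset_seconds, frequency_hz) pairs for an alternating low/high sequence."""
    high_hz = base_hz * 2 ** (separation_semitones / 12)
    period_s = 1.0 / tones_per_second
    return [
        (i * period_s, base_hz if i % 2 == 0 else high_hz)
        for i in range(n_tones)
    ]


# Same pitch separation at two repetition rates: only the timing changes.
slow = alternating_sequence(440.0, separation_semitones=12, tones_per_second=5, n_tones=4)
fast = alternating_sequence(440.0, separation_semitones=12, tones_per_second=10, n_tones=4)
for onset, freq in fast:
    print(f"{onset:.2f} s  {freq:.1f} Hz")
```

Rendering each entry as a short tone and varying the two arguments independently gives a direct way to listen for the point at which the sequence splits into two streams.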

33
Features Important for Sequential Grouping
  • Frequency and temporal contiguity: auditory
    streaming

[Figure omitted; axis label: frequency separation]
34
Attention and Streaming
  • Bregman proposed that auditory streaming was
    obligatory and did not depend on attention
  • Recent studies have shown that this is wrong:
    primitive streaming does require attention.

[Figure omitted: build-up of streaming, and the case of attending to another task; time labels 20 s, 20 s, 10 s]
35
The Figure-Ground Phenomenon and Attention
  • Generally we do not attend to every aspect of the
    auditory input: certain parts are selected for
    conscious analysis
  • Complex sound is analysed into streams: we
    attend to one stream at a time, the attended
    stream stands out perceptually, and the rest of
    the sound becomes less salient
  • Separation into attended and unattended streams
    is equivalent to the figure-ground phenomenon
  • Examples: attending to one conversation at a time
    at a party (other conversations form a
    background); music with soloists; TV in a noisy
    home
  • Importance of changes: the listener's attention
    is usually drawn to aspects of the sound that are
    changing; these become figure while the
    relatively unchanging part(s) become background

36
  • Guess who wrote this text:
  • "It is not enough to be able to describe the
    response of single cells, nor predict the results
    of psychophysical experiments. Nor is it enough
    even to write computer programs that perform
    approximately in the desired way. One has to do
    all these things at once, and also be very aware
    of the computational theory..."

37
  • This presentation reused materials from
    educational and research slides and documents by:
  • Dan Ellis
  • Guy Brown
  • Niall Griffith
  • Rianna Walsh
  • Chris Darwin
  • Sue Denham