Dynamics of Gestures: Temporal Patterning

1
Dynamics of Gestures: Temporal Patterning
Work supported by NIH grant DC-03663
  • Elliot Saltzman
  • Boston University
  • Haskins Laboratories

2
Colleagues
  • Dani Byrd
  • University of Southern California, USA
  • Louis Goldstein
  • Yale University and Haskins Laboratories, USA
  • Hosung Nam
  • Yale University and Haskins Laboratories, USA

3
Question: What is being learned when we learn a
skilled behavior?
  • Answer: The dynamical system, or coordinative
    structure, that shapes functional, coordinated
    activity defined across animal and environment
  • But what is a dynamical system?
  • Roughly, it is a system of interacting variables
    whose changes over time are shaped by laws or
    rules of motion
  • What types of variables?
  • What types of rules of motion?

4
System states, parameters, and graphs, and their
dynamics
  • Any dynamical system can be completely
    characterized according to three types of
    variables (state, parameter, and graph) and their
    dynamics (Farmer, 1986)
  • State variables: a system's active degrees of
    freedom
  • defined by the number of autonomous 1st-order
    equations used to describe the system
  • Ex) position and velocity of the mass in a damped
    mass-spring system
  • Ex) activations of nodes in a connectionist
    network
  • State dynamics: the forces (velocity vector
    field) defined in the space of state variables
    (state space) that shape motion patterns of the
    state variables
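As a minimal sketch of the state/state-dynamics distinction, the damped mass-spring example can be written as two autonomous first-order equations whose right-hand side is the velocity vector field. The function names, parameter values (critical damping), and the forward-Euler integrator are illustrative assumptions, not taken from the slides.

```python
def state_dynamics(state, m=1.0, b=2.0, k=1.0, target=0.0):
    """Velocity vector field of a damped mass-spring system.

    state = (x, v): the two state variables (position, velocity).
    Returns (dx/dt, dv/dt), i.e. two first-order equations.
    m, b, k and target are system parameters, not state variables.
    """
    x, v = state
    return (v, (-b * v - k * (x - target)) / m)


def simulate(state, dt=0.001, steps=20000, **params):
    """Integrate the state dynamics with forward Euler (illustrative only)."""
    x, v = state
    for _ in range(steps):
        dx, dv = state_dynamics((x, v), **params)
        x, v = x + dt * dx, v + dt * dv
    return x, v


if __name__ == "__main__":
    # Different initial states flow to the same point in state space.
    for x0 in (-1.0, 0.5, 2.0):
        x, v = simulate((x0, 0.0))
        print(f"start {x0:+.1f} -> final position {x:.4f}, velocity {v:.4f}")
```

Changing `m`, `b`, `k`, or `target` while the simulation runs would be an instance of parameter dynamics rather than state dynamics.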

5
System states, parameters, and graphs, and their
dynamics (cont.)
  • System parameters
  • Ex) m, b, k, and escapement strength in a limit
    cycle equation
  • Ex) target position in a point attractor equation
  • Ex) pendulum length
  • Ex) inter-node synaptic connection strength in a
    connectionist network
  • Parameter dynamics: the forces/processes that
    shape motion patterns of the system parameters
  • Ex) intentional changes in oscillation frequency
    in finger-wiggling experiment
  • Ex) actor-environment field equation for
    specifying target position in reaching
  • Ex) changing system eigenfrequency due to
    alteration of pendulum lengths in
    pendulum-swinging experiment
  • Ex) connectionist learning algorithms for
    changing system weights to solve a given
    computational task

6
System states, parameters, and graphs, and their
dynamics (cont.)
  • System graph: architecture of the system's
    equation of motion
  • the parameterized set of relationships defined
    among a system's state variables
  • Ex) circuit diagram (e.g., Simulink)
    representation of mbk equation of motion
  • Ex) node/connection diagram in a connectionist
    network

7
System states, parameters, and graphs, and their
dynamics (cont.)
  • Graph dynamics: the forces/processes that
    change the system graph
  • state variables (i.e., system dimensionality)
  • Ex) recruitment/selection/assembly of degrees of
    freedom appropriate for task in a particular
    actor-environment context
  • e.g., recruitment of trunk leaning or body
    twisting for reaching, depending on distance to
    target
  • interconnection/linkage structure defined across
    state variables
  • Ex) learning/discovering appropriate interlimb
    oscillator coupling functions to perform bimanual
    m:n rhythms
  • Ex) constructivist connectionist learning
    algorithms that add/delete nodes and/or
    connections to implement grammar appropriate
    for learning given class of functions

8
Outline of Remaining Presentation
  • Part 1: Overview and review of task-dynamic model
    of speech production
  • Four types of timing phenomena: intragestural,
    transgestural, intergestural, and global
  • Hybrid dynamical model: task dynamics plus a
    recurrent connectionist network
  • Part 2: Focus on system graphs and intergestural
    timing/phasing in speech production
  • Influence of system graph on patterns of relative
    timing between vowels and consonants in syllables
  • Competitive, coupled oscillator model of syllable
    structure
  • Task-dynamic model of intergestural phasing
    (Saltzman & Byrd, 2000)

9
Outline of Remaining Presentation (cont.)
  • Part 3: State and/or parameter dynamics and
    transgestural timing
  • Phrasal boundary effects on local speaking rate
  • Prosodic gestures (p-gestures) induce local
    slowings of central clock
  • Part 4: Intragestural timing: gestural
    anticipation intervals
  • Self-organization of gestural onsets given
    required times of target attainment
  • Constrained temporal elasticity of anticipation
    intervals

10
Part 1: Overview and Review
  • General Theoretical Question
  • How can we characterize the dynamics that
    underlie the temporal coordination among the
    units (gestures) of speech?

11
Dynamics Defined
  • Dynamics
  • Laws or rules that specify the forces that
    change a system's variables (system state) from
    one moment to the next

12
Speech Gestures
  • Equivalence classes of goal-directed actions by
    different sets of articulators in the vocal tract
  • Examples:
  • /p/, /b/, /m/: Upper lip, lower lip, and jaw work
    together to close the lips.
  • /a/, /o/: Tongue body and jaw work together to
    position and shape the tongue dorsum (surface)
    for the vowel.

13
Articulatory Phonology: Catherine Browman and
Louis Goldstein
  • Speech can be described with a unitary structure
    that captures both phonological and physical
    properties.
  • Act of speaking can be decomposed into atomic
    units, or gestures.
  • Units of information: linguistic primitives of
    speech production
  • Units of action: dynamically controlled
    constriction actions of distinct vocal tract
    organs (e.g., lips, tongue tip, tongue body,
    velum, glottis)
  • Coordinated into larger molecular structures

14
Four Aspects of Speech Timing
  • Intragestural: variations of temporal patterns of
    individual gestures
  • Ex. Temporal asymmetry of velocity profiles
  • Intergestural: relative phasing among gestures
  • Ex. Sequencing and partial temporal overlap
    (coproduction) of vowel and consonant gestures in
    the word (and syllable) /pub/
  • Transgestural: modulations of temporal patterns
    of all active gestures during a relatively
    localized portion of an utterance
  • Ex. Temporally localized slowing of all gestures
    in neighborhood of phrasal boundaries
  • Global: temporal pattern of entire utterance
  • Ex. Overall speaking rate or style

15
Overview: Hybrid Dynamical Model
  • Modeling dynamics of speech production: a hybrid
    dynamical model
  • 2 components
  • Task-dynamic component: shapes articulatory
    trajectories given gestural timing information as
    input. Uses tract-variable and model articulator
    coordinates.
  • Recurrent neural network: provides a dynamics of
    gestural timing. Uses activation coordinates.

16
Tract Variable and Model Articulator Coordinates
17
Gestural Activation
  • A gesture's dynamics influence vocal tract
    activity for a discrete interval of time.
  • Activations wax and wane gradually at edges.
  • A gesture's strength is defined by its activation
    level (range 0-1)

[Figure: gestural activations over time for the word "bad"]
18
Gestures as Dynamical Systems
  • Gestural activations are used to define
    gesture-specific control dynamics in goal/task
    space coordinates
  • point attractor dynamics of damped mass-spring
    systems in the task-space
  • constriction space (tract variables): closing the
    lips, raising the tongue tip, etc.
  • constriction target is approached regardless of
    initial conditions or perturbations along the way

19
Gestural Equation of Motion
Total gestural acceleration is the sum of the
constriction gesture and neutral gesture
acceleration components.
Constriction gesture
Neutral gesture (governs return to neutral
posture)
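The slide's equations did not survive in this transcript, so the following is only a sketch of the idea stated in words: a point-attractor (damped mass-spring) constriction gesture and a neutral gesture each contribute an acceleration, and the total is their sum. Weighting the two components by the activation `a` and `1 - a`, the trapezoidal activation shape, and every parameter value are assumptions made for illustration.

```python
import math


def activation(t, on=0.05, off=0.30, ramp=0.05):
    """Hypothetical activation in [0, 1] that waxes and wanes at its edges."""
    if t <= on or t >= off + ramp:
        return 0.0
    if t < on + ramp:
        return (t - on) / ramp
    if t <= off:
        return 1.0
    return 1.0 - (t - off) / ramp


def simulate(z0, target=1.0, neutral=0.0, k=1600.0, t_end=1.0, dt=0.0005):
    """Tract-variable trajectory as a list of (time, activation, position)."""
    b = 2.0 * math.sqrt(k)  # critical damping for unit mass (assumed)
    z, v, t = z0, 0.0, 0.0
    trajectory = []
    while t < t_end:
        a = activation(t)
        acc_constriction = -b * v - k * (z - target)   # point attractor at target
        acc_neutral = -b * v - k * (z - neutral)       # return to neutral posture
        acc = a * acc_constriction + (1.0 - a) * acc_neutral  # assumed blend
        v += dt * acc
        z += dt * v
        trajectory.append((t, a, z))
        t += dt
    return trajectory


def position_at(trajectory, time):
    """Position at the stored sample closest to `time`."""
    return min(trajectory, key=lambda row: abs(row[0] - time))[2]


if __name__ == "__main__":
    for z0 in (-0.5, 0.0, 0.5):
        traj = simulate(z0)
        print(f"z0={z0:+.1f}: z(0.30)={position_at(traj, 0.30):.3f}, "
              f"z(end)={traj[-1][2]:.3f}")
```

While the gesture is fully active the tract variable approaches the constriction target from any of the starting positions; once activation wanes, the neutral component takes over and returns it to the neutral posture.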
20
Hybrid Model: Three Coordinate Systems
21
Hybrid Dynamical Model: Overall Structure
22
Part 2: Intergestural Timing, System Graphs, and
Syllable Structure
  • Phenomenon: Vowel and consonant gestures within
    syllables show characteristic signatures of
    relative timing/phasing
  • We hypothesized that these different patterns
    were due to corresponding differences in
    intergestural coupling graphs
  • coupling graphs were implemented in simulations
  • simulations were compared with actual data

23
Syllable Structure: Some Definitions
  • The vowel and consonant gestures in a syllable
    can be partitioned into three components: Onset,
    Nucleus, Coda

24
Relative Timing in Syllables
  • There is an asymmetry in patterns of relative
    timing displayed within syllable-initial (onset)
    and syllable-final (coda) consonant clusters
  • C-center effect on mean values of intergestural
    relative phase
  • c-center pattern occurs syllable-initially in
    onsets but not syllable-finally in codas
  • Browman & Goldstein (1988), Byrd (1995)
  • Stability of relative phasing
  • Greater stability (lower standard deviation) of
    relative phasing occurs syllable-initially in
    onsets than syllable-finally in codas
  • Byrd (1996), Cho (2001)
  • Both effects are hypothesized to emerge from
    appropriate dynamic coordination of gestures
    viewed in an oscillatory framework

25
C-center Effect in Onsets, not Codas:
Hypothetical Model
  • C-V phasing: each onset consonant is phased to
    the vowel
  • What if an additional coordination (C-C phasing)
    is added?
  • C-C phasing separates the two consonants in
    timing
  • C-V and C-C phasings are in competition
  • But C-V phasing is preserved as a global
    c-center-to-V coordination
26
Why C-center Effect in Onsets and not Codas?
  • Browman & Goldstein's (2000) hypothesis
  • there are different coupling structures (system
    graphs) for onsets (C1,o C2,o V) and codas
    (V C1,c C2,c)
  • both onset consonants are coupled to the vowel
    (C1,o-V and C2,o-V), but there is
    no V-C2,c coordination (coupling) in codas
  • as a result, there is competition between CV and
    CC phasings for onsets, but not for codas

27
Proposed Coupling Graphs: CCV vs. VCC
  • CCV: C1-V, C2-V, and C1-C2 couplings
    (competitive coupling structure)
  • VCC: V-C1 and C1-C2 couplings only
    (no V-C2 coordination, no competition)
28
Stability of Relative Phasing
  • Browman & Goldstein (2000) additionally
    hypothesized that
  • Competitive coupling structures in syllable
    initial position may also help explain the
    greater stability of intergestural phasing in
    onsets than in codas

29
Outline of Simulation Experiments
  • C-center effect in CCV but not VCC?
  • Greater stability (lower variability) between
    consonants in CCV than VCC?
  • Effect of syllable boundary in heterosyllabic CC
    sequences

30
What do Oscillators Have to do with Speech?
  • Oscillatory units have a well-defined variable
    representing time: phase
  • dynamics of coupled limit cycle oscillators
    allow their relative timing to emerge in a
    self-organized manner due to intrinsic oscillator
    dynamics and the nature of the coupling.
  • the best developed theories of inter-unit timing
    come from work in (non-speech) rhythmic movement

31
What do oscillators have to do with speech?
(cont.)
  • Phase has also been adopted as a measure of
    intrinsic gestural time in speech gestures
    (Browman & Goldstein; Kröger et al.)
  • although point attractor models have been used to
    model these gestures, intrinsic gestural phase
    has been defined relative to an associated
    abstract, underlying gestural oscillator
  • Previously, the coordination of gestures in terms
    of their relative phase has been specified by
    hand in models of word production
  • we have been pursuing a model of speech timing
    that allows relative phasing to self-organize as
    it does in oscillatory systems

32
Task-dynamics of Intergestural Phasing
  • We assume that rhythmic and non-rhythmic speech
    behavior have a common underlying dynamical
    organization
  • here, we attempt to reconcile work in coupled
    oscillator dynamics and intergestural timing in
    speech.
  • Saltzman & Byrd (2000) implemented a task-dynamic
    approach to controlling (generalized) relative
    phase and (m:n) frequency ratio in a single pair
    of coupled nonlinear oscillators
  • For a pair of oscillators in 1:1 frequency
    locking
  • the component oscillators must be coupled to one
    another in a manner specific to the desired
    relative phasing
  • We have generalized the Saltzman & Byrd (2000)
    model to implement intergestural coupling among
    multiple (>2) gestures (Nam, Saltzman, &
    Goldstein, 2003)

33
Control of Relative Phase: General Approach
  • Intergestural coupling is defined in a pairwise
    manner among a set of oscillators in three steps
  • 1st: define a set of task-space potential functions,
    V(ψ),
  • the state variable represents relative phase
    (ψ = φi - φj)
  • the point minimum corresponds to the desired
    relative phase value, ψ0
  • 2nd: define the corresponding task-space (relative
    phase) dynamics
  • 3rd: transform these dynamics into the required
    coupling forces between the component oscillators
  • see Saltzman & Byrd (2000) for details
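The three steps can be sketched for a single pair of oscillators. This is a simplified illustration, not the Saltzman & Byrd (2000) implementation: it assumes pure phase oscillators (no amplitude dynamics), a cosine potential V(ψ) = -a cos(ψ - ψ0), first-order relaxation dψ/dt = -dV/dψ, and an equal split of the coupling force between the two oscillators. All names and values are hypothetical.

```python
import math


def settle_pair(psi0_deg, phi_i=0.0, phi_j=0.0, omega=2.0 * math.pi,
                a=4.0, dt=0.001, steps=10000):
    """Return the final relative phase (degrees) of two coupled phase oscillators."""
    psi0 = math.radians(psi0_deg)
    for _ in range(steps):
        # Step 1: task-space variable is the relative phase psi = phi_i - phi_j,
        #         with potential V(psi) = -a * cos(psi - psi0).
        psi = phi_i - phi_j
        # Step 2: task-space dynamics follow the potential's downhill slope.
        dpsi = -a * math.sin(psi - psi0)
        # Step 3: share that change between the oscillators as coupling terms
        #         added to their common natural frequency (1:1 locking).
        phi_i += dt * (omega + 0.5 * dpsi)
        phi_j += dt * (omega - 0.5 * dpsi)
    return math.degrees(phi_i - phi_j)


if __name__ == "__main__":
    for start in (0.0, 2.0):  # initial phase of oscillator i, in radians
        print(f"start psi = {math.degrees(start):6.1f} deg -> "
              f"final psi = {settle_pair(50.0, phi_i=start):.3f} deg")
```

Because the coupling is derived from the relative-phase potential, both oscillators keep rotating at their natural frequency while their phase difference relaxes to the desired value.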

34
Simulation Experiment 1: C-center effect in CCV
(competition)
  • Target relative phase
  • C1-V 50°
  • C2-V 50°
  • C1-C2 30°

  • Resultant relative phase (final output)
  • C1-V 59.94°
  • C2-V 39.96°
  • C1-C2 19.98°

[Figure: timing of C1, C2, and V under target and resultant phasing, marking the c-centers and the mean of c-centers; C-center effect]
35
Simulation Experiment 1: No C-center effect in
VCC (no competition)
  • Target relative phase
  • V-C1 50°
  • V-C2 none
  • C1-C2 30°

  • Resultant relative phase (final output)
  • V-C1 49.96°
  • V-C2 79.90°
  • C1-C2 29.94°

[Figure: timing of V, C1, and C2 under target and resultant phasing, marking the c-center and the mean of c-centers; no c-center effect]
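The competition in Simulation Experiment 1 can be illustrated with a deliberately reduced model: each gesture is a bare phase variable, each coupling contributes a cosine potential with its minimum at the target relative phase, and all couplings have equal strength. These are assumptions for illustration; the slides' simulations used coupled nonlinear oscillators. Phases are tracked in a frame rotating with the common frequency.

```python
import math


def settle(couplings, a=1.0, dt=0.01, steps=5000):
    """Relax phases under pairwise couplings.

    couplings maps (i, j) -> target value of phase[i] - phase[j] in degrees.
    """
    names = {n for pair in couplings for n in pair}
    phase = {n: 0.0 for n in names}
    for _ in range(steps):
        rate = {n: 0.0 for n in names}
        for (i, j), target_deg in couplings.items():
            # Downhill force on the relative phase of this pair.
            f = -a * math.sin(phase[i] - phase[j] - math.radians(target_deg))
            rate[i] += f
            rate[j] -= f
        for n in names:
            phase[n] += dt * rate[n]
    return phase


def rel(phase, i, j):
    """Relative phase of i with respect to j, in degrees."""
    return math.degrees(phase[i] - phase[j])


if __name__ == "__main__":
    ccv = settle({("C1", "V"): 50.0, ("C2", "V"): 50.0, ("C1", "C2"): 30.0})
    vcc = settle({("V", "C1"): 50.0, ("C1", "C2"): 30.0})
    print("CCV:", [round(rel(ccv, *p), 2)
                   for p in (("C1", "V"), ("C2", "V"), ("C1", "C2"))])
    print("VCC:", [round(rel(vcc, *p), 2)
                   for p in (("V", "C1"), ("V", "C2"), ("C1", "C2"))])
```

In this reduced model the three competing onset couplings cannot all be satisfied, so the system compromises near C1-V 60°, C2-V 40°, C1-C2 20°, with the mean of the two consonant phasings staying at the 50° target. The coda graph has no competition and reaches its targets (V-C1 50°, C1-C2 30°, hence V-C2 80°). Both patterns are close to the resultant values reported on the slides.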
36
Adding noise: Simulation Experiment 2
  • Source of noise
  • slight differences in frequencies of oscillators
    (detuning)
  • Noise modeled by adding a linear function to the
    potential energy function
  • V(ψ) = -a cos(ψ - ψ0) + b(ψ - ψ0)
  • b represents the amount of inter-oscillator
    detuning,
  • which perturbs the location of potential
    minimum
  • b randomly varied across simulation trials
    within conditions defined by a given standard
    deviation
  • standard deviation of b manipulated across
    simulation conditions
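A rough Monte Carlo sketch of this manipulation, under stated assumptions: gestures are reduced to phase variables, each coupling has a potential of the form -a cos(ψ - ψ0) + b(ψ - ψ0), and an independent Gaussian b is drawn for every coupling on every trial (the slides do not say how b was assigned to individual couplings). The same draws are reused for the onset and coda graphs so that the two conditions are directly comparable.

```python
import math
import random
import statistics


def settle(couplings, a=1.0, dt=0.05, steps=600):
    """couplings: list of (i, j, target_deg, b); returns settled phases (radians)."""
    phase = {}
    for i, j, _, _ in couplings:
        phase[i] = 0.0
        phase[j] = 0.0
    for _ in range(steps):
        rate = dict.fromkeys(phase, 0.0)
        for i, j, target_deg, b in couplings:
            psi = phase[i] - phase[j]
            # Force = -dV/dpsi for V = -a*cos(psi - psi0) + b*(psi - psi0).
            f = -a * math.sin(psi - math.radians(target_deg)) - b
            rate[i] += f
            rate[j] -= f
        for n in phase:
            phase[n] += dt * rate[n]
    return phase


def cc_phase_sd(sigma, trials=200, seed=0):
    """Std. dev. (radians) of C1-C2 phase for onsets (CCV) and codas (VCC)."""
    rng = random.Random(seed)
    onset, coda = [], []
    for _ in range(trials):
        # With a = 1, |b| must stay below 1 for a potential minimum to exist.
        b1, b2, bcc = (rng.gauss(0.0, sigma) for _ in range(3))
        ccv = settle([("C1", "V", 50.0, b1), ("C2", "V", 50.0, b2),
                      ("C1", "C2", 30.0, bcc)])
        vcc = settle([("V", "C1", 50.0, b1), ("C1", "C2", 30.0, bcc)])
        onset.append(ccv["C1"] - ccv["C2"])
        coda.append(vcc["C1"] - vcc["C2"])
    return statistics.stdev(onset), statistics.stdev(coda)


if __name__ == "__main__":
    onset_sd, coda_sd = cc_phase_sd(0.2)
    print(f"std of C1-C2 phase: onsets {onset_sd:.3f} rad, codas {coda_sd:.3f} rad")
```

In the onset graph the C1-C2 relative phase is restrained by three couplings rather than one, so a detuning of a given size displaces it less; that is the intuition behind the lower variability expected for onsets.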

37
Results: Simulation Experiment 2
  • Interconsonant phasing is more variable in
    syllable-final position

[Figure: std. of CC phase (radians) vs. std. of detuning b (.05 to .85), for onsets and codas]
  • Browman & Goldstein's hypothesis proved correct
  • Onsets in competition show greater stability

38
Simulation Experiment 3: Generalizing the Model
to Hetero-Syllabic Consonant Sequences
(# marks the word boundary)
  • V#CCV (onset cluster), e.g. a scab
  • VCC#V (coda cluster), e.g. mask amp
  • VC#CV (cluster spanning the boundary), e.g. bag sab

39
Results: Simulation Experiment 3
  • C-to-C phasing is more variable across boundaries

[Figure: std. of CC phase (radians) vs. std. of detuning b (.05 to .85), for onsets, codas, and cross-boundary (X-bound) sequences]
  • The result (V#CCV < VCC#V < VC#CV) corresponds to
    Byrd's (1994) findings

40
Conclusion: Importance of System Graph
  • Dynamic structure (the system graph's coupling
    structure) generates observed phonetic
    asymmetries of intergestural phasing (mean
    patterns and their stability)
  • Competitive coupling structure in onsets:
  • C-center effect (mean relative phasing)
  • Greater temporal stability
  • Consonants not directly coupled across boundaries:
  • Effect of boundaries (greater variability)

41
Future Directions: Where are the Underlying
Oscillators?
  • Hypothesis: Underlying oscillators live at the
    state-unit level of the hybrid model's recurrent
    network as members of an entrained oscillatory
    ensemble
  • Question: Is there a 1:1 association between
    oscillators and gestures?
  • Question: How are the mappings learned between
    oscillators and gestural activations?

42
Part 3: Transgestural Effects of Phrasal
Boundaries
  • It has been shown that prosodic boundaries induce
    temporally local contextual variation in ongoing
    articulation
  • prosodic boundaries are boundaries between words
    and higher order phrases in speech
  • Boundary effects on articulation include
  • lengthening of gestural durations
  • decreased overlap (coarticulation) between
    adjacent gestures
  • spatially larger gestures in phrase-initial
    positions
  • Boundary effects appear to be graded
  • stronger boundaries induce greater lengthening

43
Boundary Adjacent Slowing
  • It has been shown that speech gestural durations
    lengthen in the region of word and phrase
    boundaries
  • It also appears that stronger boundaries induce
    greater lengthening
  • Example (Byrd & Saltzman 1998)

44
Boundary Adjacent Slowing (Byrd & Saltzman 1998)
45
Boundary Adjacent Slowing (Byrd & Saltzman 1998)
[Figure: pre-boundary lip opening duration and post-boundary lip closing duration (ms, 0 to 300) by boundary type (none, word, list, vocative, utterance) for Speaker J and Speaker K]
46
Boundary Adjacent Relative Timing
  • Additionally, evidence exists suggesting that
    phrase boundaries affect the relative timing
    (i.e. overlap) between gestures.
  • Chitoran, Goldstein, & Byrd (to appear); Byrd
    (1996); Hardcastle (1985); Byrd, Kaun,
    Narayanan, & Saltzman (2000); Jun (1993);
    Keating et al. (in press)

[Figure: time between displacement extrema in CC]
47
Approach: Prosodic (p)-gestures
  • Question: How can we account for the variations
    of gestural timing associated with prosodic
    context?
  • p-gestures (prosodic gestures) influence the
    expression of all constriction gestures which are
    concurrently active with the p-gestures
  • Transgestural effect
  • Effect in proportion to the activation level of
    the p-gesture.
  • p-gesture activation determined by boundary
    strength.

Byrd, Kaun, Narayanan, & Saltzman (2000); Byrd
(2000); Byrd & Saltzman (subm)
48
Two constrictions spanning a phrase boundary
49
How is this Prosodic Action Effected? Parameter
Dynamics: Stiffness Lowering
  • Lowering of gestural stiffness values has been
    hypothesized to underlie gestural lengthening
    adjacent to phrasal boundaries.
  • Beckman et al. 1992, Byrd & Saltzman 1997
  • Local, transgestural on-line modulation of
    gestural parameter values.
  • E.g., locally lower stiffness leads to local
    slowing

50
But...
  • Changes in both duration and relative timing
    occur at phrase boundaries.
  • Stiffness scaling does not account for changes in
    relative timing.
  • modulates point-attractor parameter values, but
    does not specifically influence the domain of
    gestural activation.

51
How is this Prosodic Action Effected? Central
Clock Slowing
  • Hypothesis: Prosodic effects are induced by time
    slowing at the gestural control level.
  • slowing the timecourse of gestural activation
    (Byrd & Saltzman, subm)
  • Slowing the central clock has both intragestural
    and intergestural timing consequences.
  • Related work: V.-Bateson, Hirayama, Honda, &
    Kawato, 1992; Bailly, Laboissière, & Schwartz,
    1991; O'Dell & Nieminen, 1999; and especially
    Port & Cummins, 1992, and Barbosa & Bailly, 1994

52
Gestural Activation
53
Slowing Activation Timecourse
[Figure: gestural activation (0 to 1) over time (0 to 0.25), comparing the timecourse with no time slowing against the one stretched with time slowing]
Equation for time scaling/stretching/slowing
  • τ is scaled time,
  • t is unscaled time whose flow rate = 1, and
  • a(τ), the gestural activations (constriction and
    p-gestures), are functions of scaled time.
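The time-scaling equation itself is missing from this transcript. As a hedged illustration only, the sketch below assumes the scaled clock advances at rate dτ/dt = 1 - α·a_p(τ), where a_p is the p-gesture activation and α is a hypothetical slowing strength. Activations are simplified to rectangular on/off windows defined in scaled time, whereas the slides describe gradual edges.

```python
def boxcar(tau, start, end):
    """Rectangular activation (1 inside the window, 0 outside), in scaled time."""
    return 1.0 if start <= tau < end else 0.0


def active_duration(alpha, dt=0.0001, t_end=0.5):
    """Unscaled-time duration for which a constriction gesture stays active.

    alpha is an assumed slowing strength; alpha = 0 means no clock slowing.
    """
    tau, t, active = 0.0, 0.0, 0.0
    while t < t_end:
        a_p = boxcar(tau, 0.08, 0.12)      # p-gesture window (hypothetical)
        a_g = boxcar(tau, 0.05, 0.15)      # constriction gesture window
        active += dt * a_g                 # accumulate real time while active
        tau += dt * (1.0 - alpha * a_p)    # scaled clock slows under the p-gesture
        t += dt                            # unscaled time flows at rate 1
    return active


if __name__ == "__main__":
    for alpha in (0.0, 0.5):
        print(f"alpha={alpha}: gesture active for {active_duration(alpha):.3f} "
              f"units of unscaled time")
```

Because activations are functions of scaled time, slowing the clock stretches every gesture that is concurrently active with the p-gesture, which is what makes the effect transgestural.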

54
Simulation data: No p-gesture
[Figure: activation, position, and velocity of GESTURE 1 and GESTURE 2 over time (0 to 0.25), with gesture 1 duration and gesture 2 duration marked]
55
Simulation: p-gesture realized via clock slowing
[Figure: activation of GESTURE 1 (phrase-final), GESTURE 2 (phrase-initial), and the p-gesture, and position, over time (0 to 0.25); faint = unslowed, bold = slowed]
56
Initial Strengthening
  • Initial strengthening: apparently spatially
    larger gestures in phrase-initial positions.
  • E.g., more linguapalatal contact in lingual
    consonants; longer linguapalatal seal durations;
    longer VOTs (Keating, Jun, Fougeron, Cho, Hsu,
    others); more breathy /h/s (Pierrehumbert &
    Talkin, 1992); more lip rounding in rounded
    vowels (van Lieshout et al., 1995)
  • BUT what is the articulatory foundation for these
    very different types of effects?

Can we unite slowing, lesser overlap, and
strengthening in terms of articulatory
dynamics, specifically clock slowing?
57
Simulation: Clock slowing with two (same
constriction) phrase-initial gestures
  • Gesture 1: closing (e.g. lingual C)
  • Gesture 2: opening (e.g. following V)
  • Annotated: Gesture 1 duration, Gesture 2 duration,
    time between peak velocities, spatial
    strengthening (phrase-initial)
[Figure: activation (gest 1 consonant, gest 2 vowel, p-gesture) and position over time (0 to 0.25); faint = unslowed, bold = slowed; includes a reference line for plausible linguapalatal contact]
58
Summary: p-gestures
  • Local slowing of a central clock appears to be a
    plausible way to capture prosodically driven
    shaping of articulatory behavior.
  • Unlike stiffness modulation which only affects
    gestural durations, clock rate modulation
    generates several experimentally observed
    prosodic effects
  • gestural lengthening
  • reduced intergestural overlap
  • spatial strengthening

59
Theoretical Implications of Prosodic-Gestures
  • First step in conceiving a dynamical
    implementation of phrasal structure.
  • Just like articulatory gestures, phrasal
    junctures are viewed as
  • Having inherent durational properties
  • Being temporally coordinated with other gestures
  • Provides a theoretical reconciliation of what in
    the past has been an inconsistency in the manner
    in which prosodic structure and segmental
    structure have been conceptualized in
    Articulatory Phonology (Browman & Goldstein, 1992
    and elsewhere).

60
Part 4: Anticipatory Behavior of Speech Gestures
  • Question:
  • When does gestural motion begin relative to its
    required time of target attainment in an
    utterance?
  • Answer: Controversial
  • Look-ahead model: as early as possible given no
    other conflicting demands
  • Frame model: time-locked to the time of target
    attainment

61
Intragestural Effects: Gestural Anticipation
Intervals
  • Intragestural shaping of gestural anticipation
    intervals
  • Self-organization of gestural onsets given
    required times of target attainment
  • Emergent behavior from a bidirectionally coupled
    set of dynamical systems
  • Activation dynamics (recurrent neural network)
  • Primary responsibility: shaping gestural
    activation patterns
  • Acts as sequence-specific central controller
    (clock, c.p.g.)
  • Drives task-dynamic model (feedforward)
  • Interarticulator coordination dynamics (task
    dynamics)
  • Primary responsibility: shaping articulator
    trajectories
  • Ongoing state modulates recurrent controller
    (feedback)

62
Architecture of a Simple Hybrid Model
[Figure: network diagram. Legend: task-dynamic elements; sequential network elements; inter-element synapses; delay lines; numbers and symbols are fixed weights assigned to some synapses]
63
Network Training: Side Constraints and Interval
Types
  • Network training/programming:
  • backpropagation-in-time, distal supervised
    learning
  • Two constraint types during training
  • Task constraints: specific to current task,
    e.g., reach target at a specified time
  • Side constraints: generic constraints, e.g.,
    maximize smoothness, minimize effort, etc.
  • Two types of training interval
  • Care: task and side constraints
  • Don't care: only side constraints
  • We used a side constraint that minimized gestural
    activation.
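One way to picture how the two constraint types combine over the two interval types is a per-timestep cost, sketched below. The squared-error and squared-activation terms, the `side_weight` parameter, and the function name are assumptions for illustration; the slides do not give the actual cost function used in training.

```python
def training_cost(positions, activations, care, target, side_weight=0.1):
    """Hypothetical training cost over one utterance.

    positions   : tract-variable position at each time step
    activations : gestural activation at each time step
    care        : True where the step lies in a care interval
    target      : required position during care intervals
    """
    # Task constraint: only penalized inside care intervals.
    task = sum((z - target) ** 2 for z, c in zip(positions, care) if c)
    # Side constraint: penalized everywhere (care and don't-care intervals).
    side = side_weight * sum(a ** 2 for a in activations)
    return task + side


if __name__ == "__main__":
    care = [False, False, False, True]
    early = training_cost([0.2, 0.6, 0.9, 1.0], [1, 1, 1, 1], care, target=1.0)
    late = training_cost([0.0, 0.0, 0.6, 1.0], [0, 0, 1, 1], care, target=1.0)
    print(f"early onset cost {early:.2f}, late onset cost {late:.2f}")
```

With a nonzero `side_weight`, activating a gesture early in a don't-care interval costs more than activating it just before the care interval, even though both schedules meet the task constraint equally well. With `side_weight=0` the two schedules cost the same.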

64
Anticipatory Behavior: Effect of Side Constraints
[Figure: activation level and tract variable position across the don't-care and care intervals, left and right columns]
  • Left column: Look-ahead behavior occurs when
    side constraints are absent, and gestural onset
    occurs near the beginning of the don't-care
    interval, regardless of its length.
  • Right column: Frame model behavior occurs when
    side constraints are present, regardless of the
    don't-care interval's length, and gestural
    onsets are approximately time-locked to the
    care interval.

65
Constrained Temporal Elasticity in Speech
  • Data on anticipatory lip-protrusion in French
    speakers (e.g., Abry & Lallouache, 1995) suggest
    that anticipatory behavior may be neither rigidly
    time-locked nor totally unconstrained. This
    suggests a constrained temporal elasticity,
    intermediate between these two extremes.
  • Abry & Lallouache's Movement Expansion Model:
    a gesture's anticipatory interval lengthens
    as the preceding don't-care interval lengthens,
    but only fractionally. Different speakers show
    different lengthening fractions.
  • We generated temporally elastic behavior using
    intermediate values of side-constraints.

66
Constrained Elasticity in the Hybrid Network
67
Constrained Elasticity Lengthening Fractions
68
  • Thank you