1
Applications of Embodied Interaction
  • Presented by Ajit Kulkarni
  • 28th September 2006
  • CS6724

2
Papers for today
  • Badler, N., "LiveActor: A Virtual Training Environment with Reactive Embodied Agents," Workshop on Intelligent Human Augmentation and Virtual Environments, 2002, University of North Carolina at Chapel Hill.
  • J. Juster and D. Roy, "Elvis: Situated Speech and Gesture Understanding for a Robotic Chandelier," Proceedings of the ACM International Conference on Multimodal Interfaces, pp. 90-96.
  • Michael Katzenmaier, Rainer Stiefelhagen, and Tanja Schultz, "Identifying the Addressee in Human-Human-Robot Interactions Based on Head Pose and Speech," ICMI '04, October 13-15, 2004, State College, Pennsylvania, USA, pp. 144-151.
  • Louis-Philippe Morency and Trevor Darrell, "From Conversational Tooltips to Grounded Discourse: Head Pose Tracking in Interactive Dialog Systems," ICMI '04, October 13-15, 2004, State College, Pennsylvania, USA, pp. 32-37.

3
Paper 1: LiveActor
  • A virtual training environment where live and virtual people can mutually interact
  • Strong coupling between language and animation
  • Sensitizes users to the nuances of facial and gesture generation
  • The user's state is known to the virtual agent
  • Application: training for military and rescue personnel

4
Virtual Reality systems
  • Real-time training would be a major application of VR
  • Realistic virtual human models are required
  • The EMOTE gesture engine includes the arms, torso, and now the face
  • The body's communication channels reflect an agent's emotional state

5
Parameterized Action Representation (PAR)
  • PAR allows an agent to act, plan, and reason about its own actions or the actions of others
  • PAR is designed for building future behaviors into autonomous agents
  • PAR can also drive animation parameters that reflect an agent's internal state

6
Rapid Scenario Authoring
  • Enabled by PAR
  • PAR has an Actionary: a database of agents, objects, and actions
  • Uninstantiated PAR: the definition of an action
  • Instantiated PAR: bound to an agent, with specific information about the agent and objects
  • PAR definitions are context-sensitive and react dynamically to the current state of the world
  • PAR contains parameters for specifying agent traits (see the sketch after this list)
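
A minimal Python sketch of how the Actionary and the uninstantiated/instantiated PAR distinction might be represented; the class and field names are illustrative assumptions, not the paper's actual schema.

    from dataclasses import dataclass, field

    @dataclass
    class UninstantiatedPAR:
        # Definition of an action, as stored in the Actionary (hypothetical schema)
        name: str
        preconditions: list = field(default_factory=list)  # predicates on world state
        subactions: list = field(default_factory=list)     # decomposition into sub-PARs

    @dataclass
    class InstantiatedPAR:
        # An action definition bound to a concrete agent and objects
        definition: UninstantiatedPAR
        agent: str       # e.g. "trainee_1"
        objects: list    # objects participating in the action
        traits: dict     # parameters specifying agent traits (e.g. EMOTE-style values)

    # The Actionary acts as a database keyed by action name.
    actionary = {
        "open_door": UninstantiatedPAR(
            "open_door",
            preconditions=["agent_near_door"],
            subactions=["reach", "grasp", "pull"],
        )
    }

    # Binding the definition to an agent yields an instantiated PAR.
    par = InstantiatedPAR(
        actionary["open_door"], agent="trainee_1",
        objects=["door_3"], traits={"effort": 0.7, "shape": 0.4},
    )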

7
Reading Trainees' Interactions
  • Participants will react differently to situations
  • Focuses on conveying the trainee's non-verbal cues to the virtual players
  • This is done via EMOTE parameters, which provide the information to the virtual player
  • Language interfaces are used for agent interactions
  • Language is used to compress activity reports

8
Paper 2: Elvis
  • A robotic chandelier capable of creating and maintaining complex lighting environments
  • The system has a target goal state that it tries to maintain by monitoring the environment
  • Uses closed-loop feedback to maintain the state
  • Changes in state trigger an action in the robot, which tries to compensate and return to the original state
  • The user can change the target goal state through speech and gesture
  • If the environment changes, the system is simply retrained

9
Learning and Training
  • Uses direct inverse modeling
  • Capture a baseline image with all lights turned off
  • Resample the camera image with one light turned on
  • Get a difference map tied to the position and intensity of that light
  • Repeat the process for all 8 motors (2 per light)
  • The result is a sensorimotor contingency table (see the sketch after this list)
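
A minimal Python sketch of this direct-inverse-modeling training loop, assuming hypothetical camera and lights device interfaces (the paper does not publish an API) and NumPy-array images.

    def train_contingency_table(camera, lights, settings_per_motor):
        # Direct inverse modeling: record how each motor setting changes the image.
        lights.all_off()
        baseline = camera.capture().astype(float)   # baseline image, every light off
        table = {}                                  # (motor, setting) -> difference map
        for motor in range(8):                      # 8 motors, 2 per light
            for setting in settings_per_motor:
                lights.set(motor, setting)          # activate one light at this setting
                image = camera.capture().astype(float)
                table[(motor, setting)] = image - baseline
                lights.all_off()
        return table                                # the sensorimotor contingency table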

10
Goals and their maintenance
  • Simple goal: used when changing the environment
    • Each pixel represents the desired change at a point
  • Composite goal: e.g., moving a light to the left
    • Remove the light from its first position
    • Add it at the second position
  • Goal maintenance (see the sketch after this list)
    • The initial map is the target goal state
    • A new snapshot of the world state is taken every second
    • If there is a significant difference between the two, manipulate the world state
    • An intermediate difference state is created
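
A minimal Python sketch of the closed-loop maintenance cycle described above; the camera and controller interfaces and the threshold value are assumptions, and a real controller would invert the contingency table to choose compensating motor actions.

    import time

    def maintain_goal(camera, controller, target_map, threshold=10.0):
        while True:
            current = camera.capture().astype(float)
            diff = target_map - current        # intermediate difference state
            if abs(diff).mean() > threshold:   # significant departure from the goal
                controller.compensate(diff)    # act on the world to reduce the difference
            time.sleep(1.0)                    # new snapshot every second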

11
Goal maintenance
12
Goal Shifting
13
Speech Analysis
14
Gesture Analysis
  • The user is first detected by capturing skin color
  • Gestures are needed to supplement speech with additional information
  • Gestures are generally pointing, circles, or contours
  • Gestures are recorded, and the strokes are then weighted by probability (see the sketch after this list)
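
A minimal Python sketch of weighting recorded strokes by class probability; the classifier interface and class names are assumptions for illustration.

    def weight_gesture_strokes(strokes, classifier):
        weighted = []
        for stroke in strokes:
            # Hypothetical classifier returning per-class probabilities,
            # e.g. {"point": 0.8, "circle": 0.1, "contour": 0.1}
            probs = classifier.predict_proba(stroke)
            label = max(probs, key=probs.get)
            weighted.append({"stroke": stroke, "label": label, "weight": probs[label]})
        return weighted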

15
Example
16
Paper 3: Human-Human-Robot Interactions
  • Automatically determine whether a human addresses a robot or another human
  • Interpret human speech, gaze, and gestures
  • Three approaches are used:
    • Purely acoustic cues
    • The human's head pose
    • A combination of both of the above

17
Relation between visual target and addressee
  • Acoustic target: whom did the speaker talk to?
  • Visual target: whom did the speaker look at?
  • Results
    • The guest was looked at when addressed 99.5% of the time
    • The robot was looked at when addressed 95% of the time
    • The robot was looked at but not addressed 35% of the time
  • Head pose estimation
    • Two neural networks are used, one for pan and one for tilt
    • Input: grayscale intensity images of the head; output: rotation angles
    • The addressee is estimated using the a-posteriori probability (see the sketch after this list)
    • Head pose identifies the visual target 93% of the time
    • The acoustic addressee is identified 89% of the time
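
A minimal Python sketch of choosing the addressee by a-posteriori probability over the estimated head pose; the likelihood models and priors are assumptions about how such an estimator could be organized, not the paper's exact formulation.

    def estimate_addressee(pan, tilt, likelihoods, priors):
        # likelihoods[target] is a hypothetical density p(pan, tilt | target)
        posteriors = {t: likelihoods[t](pan, tilt) * priors[t] for t in likelihoods}
        total = sum(posteriors.values())
        posteriors = {t: p / total for t, p in posteriors.items()}  # normalize
        return max(posteriors, key=posteriors.get), posteriors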

18
Identification using Speech
  • Identify the likely addressee based on features extracted from the speech signal
  • Distinguish between commands and conversation; commands are shorter and more imperative
  • A context-free grammar is used for parsing the data and detecting words
  • Accuracy: German 0.87, English 0.49
  • A weighted combination of speech and head orientation improves classification (see the sketch after this list)
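
A minimal Python sketch of the weighted combination; the weight and the per-target score dictionaries are assumptions (in practice the weight would be tuned on held-out data).

    def fuse_addressee_scores(speech_scores, pose_scores, w=0.5):
        # speech_scores / pose_scores: per-target scores from the two classifiers
        fused = {t: w * speech_scores[t] + (1 - w) * pose_scores[t] for t in speech_scores}
        return max(fused, key=fused.get)

    # Example: both cues lean toward the robot, so the fused decision is "robot".
    # fuse_addressee_scores({"robot": 0.6, "guest": 0.4}, {"robot": 0.8, "guest": 0.2})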

19
Paper 4: Head Pose Tracking
  • Head pose and gesture cues
  • About the system:
    • The face processing system serves as a conversational grounding module in conversational dialog systems
    • Automatically initializes for new users
    • Builds a user-specific model to perform stable tracking
    • Provided in toolkit form

20
A Visual Grounding Module
  • Automatic initialization
  • User independence
  • Robustness to different environments
  • Sufficient sensitivity to recognize subtle
    gestures
  • Real-time processing
  • Stability over a long period of time

21
Head Pose Tracking
  • Uses a stereo camera for depth information
  • Motion-based tracking detects small movements accurately
  • Uses an adaptive view-based appearance model
  • It is possible to track the position, orientation, and velocity of the head with good accuracy over a long period

22
Head Gesture Recognition
  • The output velocity of the head pose tracker is the input to the gesture detector
  • The detector is trained using Hidden Markov Models (see the sketch after this list)
  • It considers head rotational velocity data
  • We get the user's head position and orientation, and also head nods and shakes
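
A minimal Python sketch of HMM-based nod/shake detection over rotational-velocity windows, using the third-party hmmlearn library for illustration; the model sizes and the placeholder training data are assumptions.

    import numpy as np
    from hmmlearn import hmm

    # One HMM per gesture class; pitch velocity dominates nods,
    # yaw velocity dominates shakes.
    nod_model = hmm.GaussianHMM(n_components=3)
    shake_model = hmm.GaussianHMM(n_components=3)
    nod_model.fit(np.random.randn(200, 2))    # placeholder: real nod velocity sequences
    shake_model.fit(np.random.randn(200, 2))  # placeholder: real shake velocity sequences

    def classify_window(velocity_window):
        # velocity_window: array of (pitch_vel, yaw_vel) samples from the tracker
        scores = {"nod": nod_model.score(velocity_window),
                  "shake": shake_model.score(velocity_window)}
        # A real detector would also threshold the winning score to allow "no gesture".
        return max(scores, key=scores.get)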

23
Visual Tooltips
  • The user's attention is estimated from head gaze
  • Three steps: deictic gesture, tooltip, and answer (see the sketch after this list)
  • The user's head orientation is used to find their focus of attention
  • A nod or shake of the user's head gives the answer
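
A minimal Python sketch of the three-step tooltip exchange; the agent and user interfaces are hypothetical stand-ins for the dialog system and the head pose/gesture tracker.

    def conversational_tooltip(agent, user):
        agent.point_at(agent.current_target)                 # step 1: deictic gesture
        if user.head_orientation_on(agent.current_target):   # gaze shows the user's focus
            agent.offer_tooltip()                            # step 2: offer more detail
            answer = user.head_gesture()                     # step 3: "nod" or "shake"
            if answer == "nod":
                agent.explain_in_detail()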

24
Face-to-Face Grounding
  • Embodied conversational agent: MACK
    • Uses head nods and gaze to give the user information on a map
  • Robot: Mel
    • Estimates head gaze accurately and detects head nods
    • Detected head nods tell the system that the user is engaged in the conversation, so it can explain in more detail

25
  • Thank You