Title: A POMDP Approach to Affective Dialogue Management
Slide 1: A POMDP Approach to Affective Dialogue Management
- Trung H. Bui
- Mannes Poel
- Anton Nijholt
- Job Zwiers
- University of Twente
- Vietri sul Mare, 10 September 2006
- International School "Neural Networks E. R. Caianiello", XI Course on "The Fundamentals of Verbal and Non-verbal Communication and the Biometrical Issue"
Slide 2: Outline
- Motivation
- MDP/POMDP dialogue management
- Affective dialogue modeling
- Example
- Conclusions and future work
Slide 3: Motivation
- An affective dialogue management (ADM) model is a dialogue manager that is able to take some aspects of the user's emotional state into account and act appropriately
- Scope of the ADM we are focusing on:
  - human-computer interaction using multimodal input/output
  - acting appropriately given uncertain knowledge of the user's emotional state and action (the task is not emotion recognition or dialogue-act recognition)
- The POMDP provides an elegant framework for this type of dialogue model
Slide 4: Markov Decision Process (Howard, 1960)
- Agent: frog
- Environment: lily pond
- States: lily pads
- Actions: jump, look
- Rewards: successful jump → +10, failed jump → -10, look → -1
(A value-iteration sketch for this kind of MDP follows below.)
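A minimal sketch of how such an MDP could be solved by value iteration. Only the reward values come from the slide; the pond layout, transition probabilities, and discount factor are assumptions for illustration.

```python
import numpy as np

# Toy frog MDP: 3 lily pads; layout and probabilities are assumed.
n_states = 3
gamma = 0.95                       # discount factor (assumed)
p_success = 0.8                    # assumed chance a jump lands as intended

# T[a][s, s'] = P(s' | s, a); "look" leaves the frog where it is.
T = {
    "jump": np.array([[0.1, 0.8, 0.1],
                      [0.1, 0.1, 0.8],
                      [0.8, 0.1, 0.1]]),
    "look": np.eye(n_states),
}
# Expected rewards from the slide: +10 successful jump, -10 failed, -1 look.
R = {
    "jump": np.full(n_states, p_success * 10 + (1 - p_success) * -10),
    "look": np.full(n_states, -1.0),
}

# Value iteration: V(s) <- max_a [ R(s, a) + gamma * sum_s' T(s, a, s') V(s') ]
V = np.zeros(n_states)
for _ in range(200):
    V = np.max([R[a] + gamma * T[a] @ V for a in T], axis=0)
print(V)   # converged state values
```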
Slide 5: Partially Observable Markov Decision Process
- Agent: frog
- Environment: fog-shrouded lily pond
- States: lily pads
- Action set (A): jump, look
- Observations (noisy, because of the fog)
- Reward model (R): successful jump → +10, failed jump → -10, look → -1
Slide 6: Partially Observable Markov Decision Process
- A POMDP is a tuple ⟨S, A, Z, T, O, R⟩:
  - S: state set, A: action set, Z: observation set
  - T: transition model, O: observation model, R: reward model
- Related notation:
  - b is the agent's belief state
  - π is the agent's policy for selecting actions
- Two main tasks:
  - computing the belief state
  - finding the optimal policy
(A minimal container for the tuple is sketched below.)
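One way to hold the tuple in code; the class and field names are illustrative, not taken from the authors' implementation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class POMDP:
    states: list        # S
    actions: list       # A
    observations: list  # Z
    T: np.ndarray       # T[a, s, s'] = P(s' | s, a)   (transition model)
    O: np.ndarray       # O[a, s', z] = P(z | s', a)   (observation model)
    R: np.ndarray       # R[a, s] = expected reward    (reward model)
```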
Slide 7: Example (Roy et al., 2000)
(Figure: a spoken dialogue system in which the POMDP observation is the output of the ASR)
Slide 8: Computing the belief state
- The new belief is obtained from the old belief via the transition and observation models, divided by a normalizing constant:

  b_{t+1}(s') = O(s', a_t, o_{t+1}) · Σ_{s∈S} T(s, a_t, s') · b_t(s) / P(o_{t+1} | a_t, b_t)

  where O is the observation model, T the transition model, b_t the old belief, b_{t+1} the new belief, and P(o_{t+1} | a_t, b_t) the normalizing constant (a code sketch follows below)
- Example: S = {s1, s2}, A = {a1, a2}, Z = {z1, z2, z3}
- (Figure: b_t(s1) and the updated b_{t+1}(s1), given a_t = a1 and o_{t+1} = z1, plotted over P(s1))
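A sketch of this update in code, with models stored as dictionaries keyed by action. The set sizes match the slide's example; the numeric values are assumptions.

```python
import numpy as np

def belief_update(b, a, z, T, O):
    """Bayes filter: b'(s') ∝ O(s', a, z) * sum_s T(s, a, s') * b(s)."""
    b_next = O[a][:, z] * (T[a].T @ b)   # numerator of the update
    return b_next / b_next.sum()         # divide by P(z | a, b), the normalizer

# Toy model for S = {s1, s2}, A = {a1, a2}, Z = {z1, z2, z3}; values assumed.
T = {"a1": np.array([[0.7, 0.3],
                     [0.4, 0.6]])}        # T[a][s, s']
O = {"a1": np.array([[0.6, 0.3, 0.1],
                     [0.2, 0.3, 0.5]])}   # O[a][s', z]
b = np.array([0.5, 0.5])                  # b_t
print(belief_update(b, "a1", 0, T, O))    # b_{t+1} after a_t = a1, o_{t+1} = z1
```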
Slide 9: Finding the optimal policy
- V^π(b): the expected total discounted future reward starting from belief b and following policy π
- γ is the discount factor
- The optimal policy: π* = argmax_π V^π(b)
- (Figure: the value function plotted over the belief b = P(s1); each linear segment is labeled with its maximizing action, a1 or a2. A sketch of this representation follows below.)
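The POMDP value function is piecewise linear and convex, so it can be represented as a set of alpha-vectors, each tied to an action; the greedy policy picks the action of the maximizing vector. The numbers below are made up for a two-state belief.

```python
import numpy as np

# Illustrative alpha-vectors over S = {s1, s2}; each is tagged with an action.
alphas = {"a1": np.array([10.0, -2.0]),
          "a2": np.array([-2.0, 10.0])}

def value(b):
    """V(b) = max over alpha-vectors of alpha · b (piecewise linear, convex)."""
    return max(alpha @ b for alpha in alphas.values())

def optimal_action(b):
    """pi*(b): the action whose alpha-vector achieves the max at belief b."""
    return max(alphas, key=lambda a: alphas[a] @ b)

b = np.array([0.8, 0.2])            # belief with P(s1) = 0.8
print(value(b), optimal_action(b))  # -> 7.6 a1
```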
Slide 10: POMDP dialogue management
- Focus: spoken dialogue management in a noisy environment
Slide 11: Proposed POMDP affective dialogue model
- Uses a factored POMDP
- The state set and observation set are composed of six features in total
- State features: user's goal (Gu), user's affective state (Eu), user's action (Au), user's dialogue state (Du)
- Observation features: observed user's action (OAu), observed user's affective state (OEu)
(A data-structure sketch follows.)
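A data-structure sketch of the factored state and observation; the field names mirror the slide's features, and the concrete values are assumptions.

```python
from collections import namedtuple

# Factored state: user's goal Gu, affective state Eu, action Au, dialogue state Du.
State = namedtuple("State", ["goal", "emotion", "action", "dialogue_state"])
# Factored observation: noisy views of the user's action (OAu) and emotion (OEu).
Obs = namedtuple("Obs", ["obs_action", "obs_emotion"])

s = State(goal="a", emotion="stress", action="a",
          dialogue_state="location-not-specified")
o = Obs(obs_action="a", obs_emotion="no-stress")   # the observation may be wrong
```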
Slide 12: Transition model and observation model
- No data available → use hand-set parameters:
  - pgc and pec are the probabilities that the user's goal and affective state change
  - pe is the probability of the user's action error being induced by emotion
  - poa and poe are the error probabilities of the observed action and the observed affective state (see the sketch below)
- Partial or full data available → construct and adjust the model from the collected data
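One way the hand-set parameters poa and poe could induce an observation model is to corrupt the true action and emotion with those probabilities. The uniform choice among the wrong values below is an assumption, not something stated in the talk.

```python
import random

ACTIONS = ["a", "b", "c", "yes", "no"]
EMOTIONS = ["stress", "no-stress"]

def observe(true_action, true_emotion, poa=0.1, poe=0.2):
    """Sample (OAu, OEu): flip to a uniformly chosen wrong value with prob poa/poe."""
    oa = (true_action if random.random() > poa
          else random.choice([x for x in ACTIONS if x != true_action]))
    oe = (true_emotion if random.random() > poe
          else random.choice([x for x in EMOTIONS if x != true_emotion]))
    return oa, oe

print(observe("a", "stress"))   # usually ('a', 'stress'), occasionally corrupted
```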
Slide 13: Example: simulated route navigation in an unsafe tunnel
(Figure: tunnel map with locations a, b, and c and the route-description actions rd-a, rd-b, and rd-c)
Slide 14: Model specification
- State space (including an absorbing end state):
  - Gu = {a, b, c}
  - Eu = {stress, no-stress}
  - Au = {a, b, c, yes, no}
  - Du = {location-specified, location-not-specified}
- System actions:
  - A = {ask, confirm-a, confirm-b, confirm-c, rd-a, rd-b, rd-c, fail}
- Observations:
  - OEu = {stress, no-stress}
  - OAu = {a, b, c, yes, no}
- Reward (sketched in code below):
  - any confirm action before the location is specified → -2
  - the fail action → -5
  - rd-x with Gu = x → +10, otherwise → -10
  - any action taken in the end state → 0
  - any other action → -1
- (rd means "give route description")
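The reward model above translates directly into a function. This sketch assumes the factored state fields from slide 11 and uses "end" to mark the absorbing state.

```python
def reward(system_action, user_goal, dialogue_state):
    """Reward model from the slide; 'end' marks the absorbing end state."""
    if dialogue_state == "end":
        return 0        # any action taken in the end state
    if system_action == "fail":
        return -5
    if system_action.startswith("confirm") and dialogue_state == "location-not-specified":
        return -2       # confirming before the location is specified
    if system_action.startswith("rd-"):
        return 10 if system_action == "rd-" + user_goal else -10
    return -1           # any other action, e.g. ask

print(reward("rd-a", "a", "location-specified"))            # -> 10
print(reward("confirm-b", "a", "location-not-specified"))   # -> -2
```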
Slide 15: Possible dialogue strategies
- ask (user: a) → rd-a
- ask (user: a) → confirm-a (user: yes) → rd-a
- ask (user: a) → ask (user: a) → confirm-a (user: yes) → rd-a
Some of them are useful. Which ones are optimal?
Slide 16: Optimal policy (using the standard PBVI algorithm, 27.83 s)
- Test case
- Reformulated model
Slide 17: Value function table
(Table: the optimal action to start with, given the initial belief)
Slide 18: Expected return vs. the user's action error induced by stress (pe)
(Plot: expected return as a function of pe, with three curves: no, low, and high observation error)
Test results were obtained with the Perseus algorithm on the full POMDP model (61 states, 8 actions, 10 observations)
Slide 19: Comparing the results (using the simulated user)
Slide 20: Conclusions
- The optimal dialogue strategy depends on the correlation between the user's affective state and the user's action
- The factored POMDP allows integrating the features of states, actions, and observations in a flexible way
- But: finding the optimal policy is computationally intractable, for both exact and some approximate algorithms, on anything beyond small, toy dialogue problems
- Recent advances in approximate POMDP techniques, plus heuristics in dialogue model design, are expected to make real-world dialogue applications tractable
Slide 21: Future work
- Scaling up the model with larger state, action, and observation sets for real-world dialogue management problems
- Extending the model representation, e.g. with correlations between the user's emotion and goal
- Collecting and generating both real and artificial data to build and train the model