Ashish Vaswani - PowerPoint PPT Presentation

About This Presentation

Title:

Ashish Vaswani

Description:

Mental state consists of beliefs and wants. ... How can one give precise definitions of speech acts using mental state and action? ...

Slides: 55

Provided by: ashishv

Learn more at: https://people.ict.usc.edu

Transcript and Presenter's Notes

1
Ashish Vaswani

Speech acts for Dialogue agents, Coding schemes
and dialogue act taxonomies

2
Speech acts for dialogue agents (Traum)

Talks about the role of speech acts in allowing
an agent to participate in dialogue with another
agent
A dialogue agent is one that can interact and
communicate with other agents in a coherent
manner, not just with one-shot messages but with
a sequence of related messages all on the same
topic or in the service of an overall goal.
In studying speech acts, the focus is on
pragmatics rather than semantics, i.e. how
language is used by agents, and not what the
sentences mean.

3
Foundational Philosophical speech act work

Began with philosophers of language interested in
issues in natural language pragmatics
Austin
Utterances are used to do things
Under favorable conditions, utterances can change
the mental and interactional state of the
participants.
Speaking is acting
Three main divisions of speech acts.
Locutionary act: the act of saying something.
Illocutionary act: the act performed in saying
something (viz. informing, warning etc.).
Composed of illocutionary force and propositional
content.
Indirect speech acts (Could you please pass the
salt?)
Perlocutionary act: the effect of the utterance
on the hearer (viz. persuasion, surprise etc.)
Classified illocutionary acts into several
categories based on illocutionary force
(verdictives, exercitives, commissives,
expositives and behabitives)

4
Speech act work continued

Searle
Extended and refined Austin's work on
illocutionary acts.
No necessary correspondence between illocutionary
acts and illocutionary verbs that a language
chooses to describe these acts.
Searle pointed out 12 different dimensions along
which speech acts could vary, suggesting an
alternative taxonomy based on purpose (his first
dimension)
Searle's taxonomy
Representatives
Directives
Commissives
Expressives
Declarations

5
AI models of speech acts

A problem with early speech act work was that
there were no formal accounts of actions and
mental states that could be used to design more
precise definitions of speech acts.
Bruce: the first to give an account of speech act
theory in terms of actions and plans (AI)
Natural language generation is social action
(beliefs, desires and wants)
Inform and request could be used in achieving
intentions to change states of belief.

6
AI models of speech acts

Cohen and Perrault
Defined speech acts as plan operators that change
the beliefs of the speaker and hearer
Enumerated goals for an account of speech acts
A plan based theory of speech acts should specify
a planning system and a definition of speech acts
as operators in the system
Mental state consists of beliefs and wants.
They used a modified version of the STRIPS
planning system
cando preconditions and want preconditions for
operators
They modeled REQUEST and INFORM within their
system
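
As a rough illustration of speech acts as plan operators, the sketch below encodes an operator with separate cando and want preconditions and applies it to a set of facts. The `Operator` class, the predicate strings, and the particular preconditions and effects given for REQUEST are illustrative assumptions, not Cohen and Perrault's actual definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A STRIPS-style operator with separate cando and want preconditions."""
    name: str
    cando: frozenset    # facts that must hold for the act to be possible
    want: frozenset     # facts stating that the agent wants to do the act
    effects: frozenset  # facts added to the state when the act is applied

    def applicable(self, state: frozenset) -> bool:
        # Both kinds of precondition must be contained in the current state.
        return self.cando <= state and self.want <= state

    def apply(self, state: frozenset) -> frozenset:
        if not self.applicable(state):
            raise ValueError(f"{self.name}: preconditions not satisfied")
        return state | self.effects


# Hypothetical encoding of a REQUEST from speaker S to hearer H.
request = Operator(
    name="REQUEST(S, H, open_door)",
    cando=frozenset({"channel(S, H)"}),
    want=frozenset({"want(S, REQUEST(S, H, open_door))"}),
    effects=frozenset({"believe(H, want(S, open_door))"}),
)

state = frozenset({"channel(S, H)", "want(S, REQUEST(S, H, open_door))"})
new_state = request.apply(state)
```

A planner would chain such operators, for example following a REQUEST with whatever step turns the hearer's new belief into an intention to act; that step is left out here.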

7
AI models of speech acts

Allen and Perrault
Used the same formalism as Cohen and Perrault
Recognizing other agents' plans is important for
interpreting utterances.
Hinkelman: uses linguistic cues to build partial
speech act templates, and plan inference for
utterance hypotheses

8
AI models

Perrault (non monotonic theory of speech acts)
Utterance itself is insufficient to determine the
effects of a speech act (prior context, mental
state of agent, actual utterance)
Stated the effects in terms of default logic.
Dynamic logic approaches
Cohen and Levesque showed how the effects of
illocutionary acts can be derived from general
principles of rational cooperative interaction
(sincerity and helpfulness)
Recognizing illocutionary force of an utterance
is not necessary, only cooperation.
Sadek uses a similar logic of rational action.

9
Extending speech acts to Dialogue (Dialogue
function as action)

Litman and Allen
Extend Allen and Perrault's work to include
dialogues and a hierarchy of plans
Domain plans and discourse plans (meta-plans)
Carberry and Lambert add problem solving plans to
domain plans and discourse plans.
Cohen and Levesque extend their work into a
theory of joint intention and multi agent action
Why confirmations appear in dialogue. (belief of
object of intention)

10
Multiple levels of interaction

Attempts to model different kinds of dialogue
phenomena at different strata. (from sentence
level and upwards)
One early classification
(transactions(exchanges(moves(acts))))
Moves: speech acts towards a particular purpose
The exchange structure was also called a dialogue
game
In Traum and Hinkelman, there were levels of acts
rather than ranks

11
Speech act based communicative languages

A language based on speech acts would itself be a
good agent communication language
KQML (Knowledge Query and Manipulation Language)
Each message has an identifier (kind of action)
and other parameters specifying content. Based on
Austin's performatives.
Problems with hidden speech acts.
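
To make the "identifier plus parameters" structure concrete, here is a minimal sketch that renders a message in KQML's parenthesized `:keyword value` surface form. The helper `kqml_message`, the agent names, and the content expression are hypothetical; `ask-one`, `:sender`, `:receiver`, `:reply-with` and `:content` are standard KQML names.

```python
def kqml_message(performative: str, **params: str) -> str:
    """Render a performative and its parameters as a KQML-style string.

    Python keyword names use underscores; KQML keywords use hyphens.
    """
    fields = " ".join(
        f":{key.replace('_', '-')} {value}" for key, value in params.items()
    )
    return f"({performative} {fields})"


# Hypothetical query from one agent to another.
msg = kqml_message(
    "ask-one",
    sender="agent-a",
    receiver="agent-b",
    reply_with="q1",
    content='"(price widget ?p)"',
)
print(msg)
```

The performative names the kind of action, while the remaining fields carry routing information and the content itself.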

12
Speech Acts in multi agent action theory

The main effects of speech acts are on the mental
and interactional states of the participants.
(BDI attitudes)
Social attitudes
We must also consider social attitudes (question:
are social attitudes basic?)
Mutual belief (Harman): a group of people have
mutual knowledge of p if each knows p and we know
this, where "this" refers to the whole fact known.
Mutual belief is achieved through the process of
grounding (Clark and Schaefer).
Obligations are necessary for modeling social
situations (viz. a hearer is obligated to answer
a question if posed one). What an agent should
do.
Problem: how do you decide social norms?
Obligations might conflict with the agent's goals
and the agent might choose to violate them (e.g.
interrogation)
Another social attitude is joint intention or
shared plan. Coordinated team activity depends on
more than only individual intentions and beliefs.
(How do shared intentions guide individual
action?)

13
Speech acts in multi agent action theory

Defining speech acts
How can one give precise definitions of speech
acts using mental state and action?
How can one recognize whether such an act has
been performed? (because of involvement of mental
states, an observer might not be able to tell)
How can agents plan to use speech acts to
accomplish their goals?
Traum: plan recipe for communication

14
continued

Planning speech acts
Acts can be planned as games, or single moves.
How far ahead should an agent plan?
Predictions of other agents' future actions are
inaccurate.
Negotiations, arguments (more planning), casual
conversation (no planning)
Recognizing speech acts
Combination of input utterance with aspects of
current context to decide what acts have been
performed (for example, current context says that
an INFORM act might be impending)
Should the agent recognize just the acts, or the
intentions also? (This might be necessary for
interpreting indirect speech acts.)
How much of the plan should be inferred? Deep
intention recognition might not be necessary;
instead, considering all possible actions and
their immediate effects is sufficient when
combined with a facility to repair erroneous
conclusions. (default logic?) (McRoy)
Grounding relaxes the need for intention
recognition, since the speaker is easily
accessible and can help clarify motivations.

15
The reliability of a Dialogue structure coding
scheme (Carletta et al.)

The paper introduces a scheme of dialogue coding
distinctions for the Map Task corpus and
describes its reliability.
In the Map Task, two participants have slightly
different versions of a simple map with
approximately fifteen landmarks on it. One
participant's map has a route printed on it; the
task is for the other participant to duplicate
the route.
The moves introduced are independent of the task.
They also attempt to classify dialogue structure
at higher levels (transactions and games).
The dialogue structure can be used with codings
of many other dialogue phenomena.

16
The dialogue structure coding

Transactions
Highest level
Subdialogues that accomplish one major step in
the participants' plan for achieving the task.
Size and shape depend on the task
Conversational games (dialogue games)
A conversational game is a set of utterances
starting with an initiation and encompassing all
utterances up until the purpose of the game has
been either fulfilled (e.g., the requested
information has been transferred) or abandoned.
Games can nest within each other
Games are made up of Conversational moves which
are different kinds of initiations and responses
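
The nesting of games and moves described above can be pictured as a small tree. The sketch below is one possible representation; the `Move` and `Game` classes are assumptions made for illustration, and the example dialogue is stitched together from separate Map Task utterances, so the arrangement itself is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Move:
    speaker: str
    label: str
    text: str


@dataclass
class Game:
    purpose: str
    parts: List[Union[Move, "Game"]] = field(default_factory=list)

    def depth(self) -> int:
        """Nesting depth: 1 for a game with no embedded games."""
        nested = [p.depth() for p in self.parts if isinstance(p, Game)]
        return 1 + max(nested, default=0)

    def moves(self) -> List[Move]:
        """All moves in dialogue order, including those in embedded games."""
        out: List[Move] = []
        for part in self.parts:
            out.extend(part.moves() if isinstance(part, Game) else [part])
        return out


# Hypothetical arrangement: a query game embedded in an instruct game.
game = Game("instruct", [
    Move("G", "INSTRUCT", "Go right round, ehm, until you get to just above them."),
    Game("query-yn", [
        Move("G", "QUERY-YN", "Do you have the west lake, down to your left?"),
        Move("F", "REPLY-N", "No."),
    ]),
    Move("F", "ACKNOWLEDGE", "Mmhmm."),
])
```

Transactions could be layered on top in the same way, as a list of top-level games.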

17
The move coding scheme

18
The move coding scheme (moves)

Instruct move
Commands the partner to carry out an action.
The expected response could be performance of the
action, if the participant knows the action.
G: Go right round, ehm, until you get to just
above them.
Explain move
States information that has not been directly
elicited by the partner.
Facts about the domain, state of plan or task,
including facts that help establish what is
mutually known
G: Where the dead tree is on the other side of
the stream there's farmed land.

19
Move coding scheme

Check move
Requests the partner to confirm information that
the speaker has some reason to believe, but is
not entirely sure about.

20
Move coding scheme

Align move
Checks the partner's attention, agreement, or
readiness for the next move.
The most common type of ALIGN move lets the
transferer know that the information has been
successfully transferred, so that they can close
that part of the dialogue and move on.

21
Move coding scheme

Query-YN move
Asks the partner any question that takes a yes or
no answer and does not count as a CHECK or an
ALIGN.
These questions are most often about what the
partner has on the map.
F: I've got Dutch Elm.
G: Dutch Elm. Is it written underneath the tree?

22
Move coding scheme

Query-W move
Any query not covered by the other categories.
Most moves classified as QUERY-W are wh-questions.

23
Move coding scheme (Response moves)

Used within games after an initiation; they try
to fulfill the expectations set up in the game
Acknowledge move
Verbal response that minimally shows that the
speaker has heard the move to which it responds,
and often also demonstrates that the move was
understood and accepted.
Of Clark and Schaefer's types of evidence for
acknowledgement, only the last three count as
ACKNOWLEDGE moves in this coding scheme.
G: Ehm, if you ... you're heading southwards.
F: Mmhmm.

24
Move coding scheme

Reply-Y move
Any reply to any query with a yes-no surface form
that means "yes", however that is expressed.
Normally only appears after QUERY-YN, ALIGN, and
CHECK moves.
G: See the third seagull along?
F: Yeah.
Reply-N move
Reply to a query with a yes-no surface form that
means "no".
G: Do you have the west lake, down to your left?
F: No.

25
Move coding scheme

Reply-W move
Any reply to any type of query that doesn't
simply mean "yes" or "no".
G: And then below that, what've you got?
F: A forest stream.
Clarify move
Reply to some kind of question in which the
speaker tells the partner something over and
above what was strictly asked.
Route givers tend to make CLARIFY moves when the
route follower seems unsure of what to do, but
there isn't a specific problem on the agenda.

26
Move coding scheme

Other possible responses
Utterances where the responder refuses to share
the same goal as the initiator (No, let's talk
about..)
ACKNOWLEDGE moves with a negative slant
These are rare in the corpora.
READY move
Moves that occur after the close of a dialogue
game and prepare the conversation for a new game
to be initiated.
G: Okay. Now go straight down.
Confusion: that could have been an ACKNOWLEDGE
move too.

27
Coding continued

Game coding scheme
Beginnings of new games are coded by purpose
Places where games end or are abandoned are marked
Games are marked as either occurring at top level
or being embedded in the game structure
Transaction coding scheme
Four transaction types
NORMAL: transaction serving a subtask, viz. a
route segment on the map.
REVIEW: transactions created when participants
return to parts of the route that have already
been completed.
OVERVIEW: overviewing an upcoming segment in
order to provide a context for the partner.
IRRELEVANT: subdialogues not relevant to the
route (maybe about the experimental setup).
Coding involves marking in the dialogue where the
transaction starts, except for IRRELEVANT
transactions.
Ends of transactions are not coded.

28
Reliability of coding scheme

Tests of reliability
Krippendorff's tests of reliability
Stability
Reproducibility
Accuracy
Agreement by coders on segmentation
Used kappa coefficient for reliability of
classification.

29
Reliability of coding

Reliability of move coding
Four coders
Each coder had access to the speech as well as
transcripts
All coders interacted verbally with the
developers
Reliability of move segmentation
Kappa = 0.92 using word boundaries as units
Pairwise percent agreement on locations where any
coder had marked a boundary was 89%.
Number of units: 4079. Number of boundaries: 796.
Most disagreements were over marking READY
separately or as part of the move that followed,
and over marking a reply as one move or splitting
it into a reply and an EXPLAIN, CLARIFY etc.

30
Reliability of coding

Reliability of move classification
Since the reliability of segmentation was good,
it gave a good foundation for move classification
Move classification was evaluated only over move
segments where the boundaries were agreed
Kappa for move coding: K = 0.83
Largest confusions between
CHECK and QUERY-YN
INSTRUCT and CLARIFY
ACKNOWLEDGE, READY and REPLY-Y
K = 0.89 for coding an initiation as a command, a
statement or a question

31
Reliability of coding

Reliability of move classification from written
instructions
K = 0.69
Reliability of move coding in another domain
Transcribed conversation between a hi-fi sales
assistant and a married couple intending to
purchase an amplifier
K = 0.95 for move segmentation
K = 0.81 for move classification
Reliability of game coding
Pairwise agreement on game beginnings: 70%
Reliability of transaction coding
Done from written instructions
K = 0.59

32
Coding Dialogues with the DAMSL Annotation scheme
(Mark Core and James F Allen)

DAMSL (Dialogue Act Markup in Several Layers)
Automatic analysis of dialogue is needed for
A computer acting as a participant with users
A computer as an observer interpreting human speech
DAMSL allows multiple labels in multiple layers
to be applied to an utterance
Communicative actions described here are high
level.

33
DAMSL annotation scheme

Forward communicative functions
Speech acts that affect the future of dialogue
These categories are independent
Divided into
Representatives (statements): making claims about
the world
Assert: speaker trying to affect the beliefs of
the hearer
Reassert: repeating information for emphasis or
acknowledgement
Influencing-Addressee-Future-Action
All utterances that discuss potential actions of
the addressee
Directives
Info-Request: questions and requests (tell me the
time)
Action-Directive: requests for action (Please
take out the trash)
Open-Option
Speaker gives a potential course of action but
does not show preference towards it
Commissives (Committing-Speaker-Future-Action)
Offers
Commitments
Performative category
Utterances that make a fact true in virtue of
their content (You are fired)

34
DAMSL annotation scheme

Backward communicative function
The speech act categories related to responses
The classes are independent
Agreement
Accept, accept-part, Maybe, Reject-part, reject,
hold
Understanding
Did the listener understand the speaker?
The listener may
Signal-non-understanding
Signal-understanding (Acknowledgements,
Repeat-Rephrase, Completion)
Correct-misspeaking
Answer
Supplying information explicitly requested by a
previous Info-Request act
Information relations
Describe how the information in the current
utterance relates to previous utterances

Utterance features
Information Level
Task (utterance about the task)
Task Management (utterance about the planning and
monitoring of task)
Communication management (Physical requirements
of dialogue)
Other
Communicative Status
Abandoned
Uninterpretable
Syntactic Features
Conventional form (hello, how may I help you)
Exclamatory form (wow)
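
One way to picture "multiple labels in multiple layers" is as a record with one slot per layer. The sketch below is a hypothetical data structure, not the DAMSL tool format; the class name, field names, sample utterance and its labels are all assumptions chosen for illustration, with label names borrowed from the categories listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class DamslAnnotation:
    """Labels attached to one utterance, grouped by layer."""
    utterance: str
    forward: Set[str] = field(default_factory=set)    # forward functions
    backward: Set[str] = field(default_factory=set)   # backward functions
    information_level: Optional[str] = None            # e.g. "Task"
    communicative_status: Optional[str] = None          # e.g. "Abandoned"

    def layers(self) -> List[Tuple[str, str]]:
        """Flatten the annotation into (layer, label) pairs."""
        pairs = [("forward", label) for label in sorted(self.forward)]
        pairs += [("backward", label) for label in sorted(self.backward)]
        if self.information_level is not None:
            pairs.append(("information-level", self.information_level))
        if self.communicative_status is not None:
            pairs.append(("communicative-status", self.communicative_status))
        return pairs


# Hypothetical utterance carrying several labels at once.
ann = DamslAnnotation(
    utterance="Okay, I'll send the engine to the city.",
    forward={"Assert", "Commit"},
    backward={"Accept"},
    information_level="Task",
)
```

Keeping the layers in separate fields means a coder can fill in one layer without committing to a label in another.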

36
Experiments

Used test dialogues from the TRAINS 91-93
dialogues.
A person was given a problem to solve viz.
shipping box cars to a city and another person
was instructed to act as a problem solving
system.

37
Results

Three statistics were used to measure
interannotator reliability.
PA: percent pairwise agreement
PE: expected pairwise agreement
Kappa = (PA - PE) / (1 - PE)
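
As a worked example of the kappa statistic, (PA - PE) / (1 - PE), the sketch below computes it for two annotators' label sequences. It assumes expected agreement is estimated from each annotator's own label frequencies (the two-coder Cohen variant); the slides do not say which estimate was used, and the label sequences are made up.

```python
from collections import Counter
from typing import Sequence


def kappa(pa: float, pe: float) -> float:
    """Kappa = (PA - PE) / (1 - PE)."""
    if pe >= 1.0:
        raise ValueError("expected agreement must be below 1")
    return (pa - pe) / (1.0 - pe)


def pairwise_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Kappa for two annotators who labelled the same utterances."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # PA: fraction of utterances on which the two annotators agree.
    pa = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # PE: chance agreement from each annotator's label distribution
    # (an assumption; other estimates pool the two distributions).
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    pe = sum((freq_a[c] / n) * (freq_b[c] / n) for c in freq_a)
    return kappa(pa, pe)


# Made-up annotations of four utterances.
coder_1 = ["Assert", "Assert", "Info-Request", "Assert"]
coder_2 = ["Assert", "Info-Request", "Info-Request", "Assert"]
k = pairwise_kappa(coder_1, coder_2)
```

Here PA is 0.75 and PE is 0.5, so kappa is 0.5: agreement is halfway between chance and perfect.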

38
Results

39
An empirical investigation of proposals in
collaborative dialogues (Di Eugenio et al.)

They use a slight modification of the DRI
(Discourse Resource Initiative) scheme.
Task (will be read out)
The DRI coding scheme
Similar to, and simpler than, the DAMSL scheme
discussed before.
Forward looking functions
This dimension characterizes the potential effect
that an utterance Ui has on the subsequent
dialogue.
Statement: make claims about the world.
Assert (Speaker trying to change Hearer's beliefs)
Reassert (if the claim has already been made
before)
Influence on hearer (I-on-H)
Influences H's future action
Open option
Info Request
Action directives
Influence on Speaker (I-on-S)
Commits S to some future course of action
Offer
Commit

40
DRI coding scheme

Backward looking functions
Ui has to do with response
Answer
Agreement
Accept/reject
Holds
Certain refinements were made to the core
features by adding heuristics for tagging
Statements, I-on-H and I-on-S.

41
Coding results

Their results on forward functions were better
than Core and Allen's (1997)
Very low kappa value for Agreement

42
Twenty questions for Dialogue act taxonomies
(Traum)

Defining dialogue acts
Question 1.
Which is more important: fit to intuitions or
formal rigor?
It is difficult to precisely formulate complex
intuitions using available formal techniques
Sacrifice intuition for formal rigor or vice
versa?
The answer will depend on the purpose of the
concept (experimentation or verification).

43
Questions 2 and 3

Is the definition of a dialogue act an issue of
lexical semantics or ontology of action?
Is defining an act a matter of giving an account
of when someone might be justified in describing
an event with a sentence headed by a particular
verb (inform, request), or of providing a
technical vocabulary to compactly describe
various types of occurrences? (the speech acts in
the third paper)
Under what conditions may an action be said to
have occurred?
Allwood uses 4 criteria
Intention of performer
Form of behavior (e.g. linguistic form; question
2?)
Achieved result
Context in which the behavior occurs.
Avoid defining DAs according to, say, a certain
set of results holding and then identifying
instances of these acts using one of the other
criteria, say, linguistic form. This would lead
to coding difficulties.

44
Questions 4 and 5

What is the role of speaker intention?
Some would define dialogue acts on the basis of
the intention behind them
Some would define them by the recognition of this
intention (illocutionary acts)
What is the role of addressee uptake?
Many dialogue act definitions require some
change to the addressee based on understanding
the utterance in a particular way

45
Question 6

What view should be taken regarding the
performance of acts?
Speaker's and listener's views
View of the speaker-addressee team; normative
conventional point of view.
Is one allowed to consider subsequent utterances
before deciding performance?
This has implications for coding.

46
Dialogue act components (questions 7 and 8)

How are actions used in a logic?
What is context?
What aspects of the situation are relevant as
potential conditions for defining types of
dialogue act performance and what aspects are
(directly) affected.
Special sorts of information used for conditions
and effects of dialogue acts
Dialogue state (precondition: the dialogue is in
a particular state; effect: transition to a new
dialogue state)
Mental states (effect: newly adopted beliefs)
Social obligations and commitments

47
Questions 9 and 10

What kind of preconditions are appropriate?
Most convenient dialogue acts have few, if any,
actual preconditions
How should an unsuccessful act be distinguished
from a failed attempt to perform an act?
Difference between the success and satisfaction
of a speech act

48
Relationships and complex acts (questions 11 and 12)

What is the relationship between dialogue acts
and other (e.g., physical) acts?
Different theories would maintain a crisp or more
blurred distinction between dialogue acts and
non-communicative acts.
What is the relationship between dialogue acts
and dialogue structure?
Wholly dependent on dialogue structure (grammar
based approaches)
Dialogue structure is primarily constructed from
the activity that the participants are engaged in
Dialogue structure is also used as context for
performance of dialogue act (question 8)

49
Questions 13 and 14

Are there multi-agent dialogue acts?
Some researchers view the performance of most
illocutionary acts as a collective performance of
multiple agents, in virtue of the grounding
process
Games, exchanges and collaborative completions.
Problems with tagging.
Can dialogue acts be composed of more primitive
acts?
Could a multiple strata dialogue act taxonomy
have levels or ranks?

50
Question 15

Can multiple dialogue acts occur at the same time
(performed through the same utterance) ?
Since utterances have multiple functions, yes.
It is a problem if the logical theory does not
support simultaneous action
It has complications in Tagging

51
Taxonomic considerations (question 16)

Can the same taxonomy be used for different kinds
of activities?
People have been designing taxonomies for
different dialogue activities.
A general theory might better allow one to use
act distributions to identify activities or
genres of activities as well as episodes within
an activity.

52
Percentage distributions of dialogue acts in
Corpus Coding

53
Questions 17 and 18

Can the same taxonomy be used for different kinds
of agents?
Could the same taxonomy cover communicative
activities between
Human with human
Human with machine
Humans with animals etc.
Modality of communication also matters
How detailed should a dialogue act taxonomy be?
How many distinctions in speech act verbs should
be captured within a dialogue act taxonomy (e.g.
state, assert, inform)
Trade off between proposing many acts for subtle
differences and reliability of coding

54
Questions 19 and 20

How should complexity be realized in a coding
taxonomy?
How to capture multiplicity of functions in a
Taxonomy?
Multiple labels for each utterance, one for each
function (DRI, Allen and Core)
Bundle dialogue functions into one label
(Verbmobil, Jekat et al.)
Intermediate approach (DAMSL)
Can a taxonomy used for tagging dialogue corpora
be given a formal semantics and/or be used in a
dialogue system?
The hope is yes.