Title: Ashish Vaswani
1Ashish Vaswani
- Speech acts for Dialogue agents, Coding schemes
and dialogue act taxonomies
2Speech acts for dialogue agents (Traum)
- Talks about the role of speech acts in allowing
an agent to participate in dialogue with another
agent
- A dialogue agent is one that can interact and
communicate with other agents in a coherent
manner, not just with one-shot messages but with
a sequence of related messages all on the same
topic or in the service of an overall goal.
- In studying speech acts, the focus is on
pragmatics rather than semantics, i.e. how
language is used by agents, and not what the
sentences mean.
3Foundational Philosophical speech act work
- Began with philosophers of language interested in
issues in natural language pragmatics
- Austin
- Utterances are used to do things
- Under favorable conditions, utterances can change
the mental and interactional state of the
participants.
- Speaking is acting
- Three main divisions of speech acts.
- Locutionary act: the act of saying something.
- Illocutionary act: the act performed in saying
something (viz. informing, warning etc.)
- Composed of illocutionary force and propositional
content.
- Indirect speech acts (Could you please pass the
salt?)
- Perlocutionary act: the effect of the utterance
on the hearer (viz. persuasion, surprise etc.)
- Classified illocutionary acts into several
categories based on illocutionary force
(verdictives, exercitives, commissives,
expositives and behabitives)
4Speech act work continued
- Searle
- Extended and refined Austin's work on
illocutionary acts.
- No necessary correspondence between illocutionary
acts and the illocutionary verbs that a language
chooses to describe these acts.
- Searle pointed out 13 different dimensions along
which speech acts could vary, suggesting an
alternate taxonomy based on purpose (his first
dimension)
- Searle's Taxonomy
- Representatives
- Directives
- Commissives
- Expressives
- Declarations
5AI models of speech acts
- Problem with early speech act work was that there
did not exist formal accounts of actions and
mental states that could be used to design more
precise definitions of speech acts.
- Bruce: first to give an account of speech act
theory in terms of actions and plans (AI)
- Natural language generation is social action
(beliefs, desires and wants)
- Inform and request could be used in achieving
intentions to change states of belief.
6AI models of speech acts
- Cohen and Perrault
- Defined speech acts as plan operators that change
the beliefs of the speaker and hearer
- Enumerated goals for an account of speech acts
- A plan-based theory of speech acts should specify
a planning system and a definition of speech acts
as operators in the system
- Mental state consists of beliefs and wants.
- They used a modified version of the STRIPS
planning system
- cando preconditions and want preconditions for
operators
- They modeled REQUEST and INFORM within their
system
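The split between cando and want preconditions can be illustrated with a small sketch. This is not Cohen and Perrault's actual formalization: facts are plain strings, `Operator`, `applicable`, `apply_operator` and `inform` are hypothetical names, and the wording of the preconditions and effect is only loosely modeled on their INFORM operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """A STRIPS-style operator with two kinds of preconditions."""
    name: str
    cando: frozenset    # conditions that make the act possible
    want: frozenset     # the speaker must want to perform the act
    effects: frozenset  # facts added once the act is performed


def applicable(op, state):
    # An operator applies only if both precondition sets hold in the state.
    return op.cando <= state and op.want <= state


def apply_operator(op, state):
    if not applicable(op, state):
        raise ValueError(f"{op.name} is not applicable")
    # Add-only effects; delete lists are left out of this sketch.
    return state | op.effects


def inform(speaker, hearer, prop):
    # Hypothetical string encoding, loosely after the INFORM operator.
    act = f"INFORM({speaker},{hearer},{prop})"
    return Operator(
        name=act,
        cando=frozenset({f"{speaker} believes {prop}"}),
        want=frozenset({f"{speaker} wants {act}"}),
        effects=frozenset({f"{hearer} believes {speaker} believes {prop}"}),
    )


op = inform("S", "H", "P")
state = frozenset({"S believes P", "S wants INFORM(S,H,P)"})
new_state = apply_operator(op, state)
```

A planner would chain such operators backwards from a goal like "H believes S believes P"; the sketch only shows a single forward application.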
7AI models of speech acts
- Allen and Perrault
- Used the same formalism as Cohen and Perrault
- Recognizing other agents' plans is important for
interpreting utterances.
- Hinkelman uses linguistic cues to build partial
speech act templates and plan inference for
utterance hypotheses
8AI models
- Perrault (non monotonic theory of speech acts)
- Utterance itself is insufficient to determine the
effects of a speech act (prior context, mental
state of agent, actual utterance)
- Stated the effects in terms of default logic.
- Dynamic logic approaches
- Cohen and Levesque showed how effects of
illocutionary acts can be derived from general
principles of rational cooperative interaction
(sincerity and helpfulness)
- Recognizing illocutionary force of an utterance
is not necessary, only cooperation.
- Sadek uses a similar logic of rational action.
9Extending speech acts to Dialogue (Dialogue
function as action)
- Litman and Allen
- Extend Allen and Perrault's work to include
dialogues and a hierarchy of plans
- Domain plans, discourse plans, meta plans
- Carberry and Lambert add problem solving plans to
domain plans and discourse plans.
- Cohen and Levesque extend their work into a
theory of joint intention and multi agent action
- Why confirmations appear in dialogue (belief of
object of intention)
10Multiple levels of interaction
- Attempts to model different kinds of dialogue
phenomena at different strata (from sentence
level and upwards)
- One early classification
- (transactions(exchanges(moves(acts))))
- Moves: speech acts towards a particular purpose
- The exchange structure was also called a dialogue
game
- In Traum and Hinkelman, there were levels of acts
rather than ranks
11Speech act based communicative languages
- A language based on speech acts would itself be a
good agent communication language
- KQML (Knowledge Query and Manipulation Language)
- Each message has an identifier (kind of action)
and other parameters specifying content. Based on
Austin's performatives.
- Problems with hidden speech acts.
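To make the message layout concrete, here is a minimal sketch that renders a KQML-style message: a performative name followed by keyword parameters. The helper `kqml_message`, the agent names and the content expression are hypothetical; only the general "(performative :keyword value ...)" shape and the `ask-one`, `:sender`, `:receiver`, `:reply-with` and `:content` names are taken from KQML.

```python
def kqml_message(performative, **params):
    """Render a KQML-style message as an s-expression string."""
    fields = []
    for key, value in params.items():
        # Parameters are written as ":name value" pairs; Python
        # underscores are turned into KQML-style hyphens.
        fields.append(f":{key.replace('_', '-')} {value}")
    return "(" + " ".join([performative] + fields) + ")"


# Hypothetical query from one agent to another.
msg = kqml_message(
    "ask-one",
    sender="agentA",
    receiver="agentB",
    reply_with="q1",
    content="(price widget ?p)",
)
```

The performative plays the role of the speech act type, while the content stays opaque to the message layer.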
12Speech Acts in multi agent action theory
- The main effects of speech acts are on the mental
and interactional states of the participants
(BDI attitudes)
- Social attitudes
- We must also consider social attitudes (question:
are social attitudes basic?)
- Mutual belief (Harman): a group of people have
mutual knowledge of p if each knows p and we know
this, where "this" refers to the whole fact known.
- Mutual belief is achieved through the process of
grounding (Clark and Schaefer)
- Obligations are necessary for modeling social
situations (viz. a hearer is obligated to answer
a question if posed one): what an agent should
do.
- Problem: how do you decide social norms?
- Obligations might conflict with the agent's goals
and the agent might choose to violate them (e.g.
interrogation)
- Another social attitude is joint intention or
shared plan. Coordinated team activity depends on
more than only individual intentions and beliefs
(how do shared intentions guide individual
action?)
13Speech acts in multi agent action theory
- Defining speech acts
- How can one give precise definitions of speech
acts using mental state and action?
- How can one recognize whether such an act has
been performed? (because of involvement of mental
states, an observer might not be able to tell)
- How can agents plan to use speech acts to
accomplish their goals?
- Traum: plan recipe for communication
14continued
- Planning speech acts
- Acts can be planned as games, or single moves.
- How far ahead should an agent plan?
- Predictions of other agents' future actions are
inaccurate.
- Negotiations, arguments (more planning), casual
conversation (no planning)
- Recognizing speech acts
- Combination of input utterance with aspects of
current context to decide what acts have been
performed (for example, current context says that
an INFORM act might be impending)
- Should the agent just recognize the acts or the
intentions also? (This might be necessary for
interpreting indirect speech acts)
- How much of the plan should be inferred? Deep
intention recognition might not be necessary;
instead, considering all possible actions and
their immediate effects is sufficient when
combined with a facility to repair erroneous
conclusions (default logic?) (McRoy)
- Grounding relaxes the need for intention
recognition: the speaker is easily accessible
and can help make motivations clear.
15The reliability of a Dialogue structure coding
scheme (Carletta et al)
- Paper aims at introducing and describing the
reliability of a scheme of dialogue coding
distinctions for the Map Task corpus
- In the Map Task, two participants have slightly
different versions of a simple map with
approximately fifteen landmarks on it. One
participant's map has a route printed on it; the
task is for the other participant to duplicate
the route.
- The moves introduced are independent of the task.
- They attempt to classify dialogue structure at
higher levels also (transactions and games)
- The dialogue structure can be used with codings
of many other dialogue phenomena.
16The dialogue structure coding
- Transactions
- Highest level
- Subdialogues that accomplish one major step in
the participants' plan for achieving the task.
- Size and shape depend on the task
- Conversational games (dialogue games)
- A conversational game is a set of utterances
starting with an initiation and encompassing all
utterances up until the purpose of the game has
been either fulfilled (e.g., the requested
information has been transferred) or abandoned.
- Games can nest within each other
- Games are made up of conversational moves, which
are different kinds of initiations and responses
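The nesting of games and moves described above can be pictured as a small tree. The sketch below is only an illustration: the class names `Move` and `Game`, the field names, and the example utterances are hypothetical and are not taken from the coding manual.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Move:
    speaker: str
    kind: str   # "initiation" or "response"
    text: str


@dataclass
class Game:
    purpose: str
    status: str = "open"  # "fulfilled" or "abandoned" once the game ends
    parts: List[Union["Move", "Game"]] = field(default_factory=list)


def moves_in(game):
    """Yield every move of a game, including moves of embedded games."""
    for part in game.parts:
        if isinstance(part, Game):
            yield from moves_in(part)
        else:
            yield part


def depth(game):
    """1 for a game with no embedded games, 2 for one level of nesting, ..."""
    nested = [depth(p) for p in game.parts if isinstance(p, Game)]
    return 1 + max(nested, default=0)


# Hypothetical top-level game with one embedded game.
game = Game("give an instruction", "fulfilled", [
    Move("G", "initiation", "go down past the lake"),
    Game("confirm a landmark", "fulfilled", [
        Move("F", "initiation", "the west lake?"),
        Move("G", "response", "yes"),
    ]),
    Move("F", "response", "okay"),
])
```

A transaction would sit one level above this, grouping the top-level games that serve one step of the task.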
17The move coding scheme
18The move coding scheme (moves)
- Instruct move
- Commands the partner to carry out an action.
- Expected response could be performance of the
action if the participant knows the action.
- G: Go right round, ehm, until you get to just
above them.
- Explain move
- States information that has not been directly
elicited by the partner.
- Facts about the domain, state of plan or task,
including facts that help establish what is
mutually known
- G: Where the dead tree is on the other side of
the stream there's farmed land.
19Move coding scheme
- Check move
- Requests the partner to confirm information that
the speaker has some reason to believe, but is
not entirely sure about.
20Move coding scheme
- Align move
- Checks the partner's attention, agreement, or
readiness for the next move.
- The most common purpose of an ALIGN move is for
the transferer to know that the information has
been successfully transferred, so that they can
close that part of the dialogue and move on.
21Move coding scheme
- Query-YN move
- Asks the partner any question that takes a yes or
no answer and does not count as a CHECK or an
ALIGN
- These questions are most often about what the
partner has on the map
- F: I've got Dutch Elm.
- G: Dutch Elm. Is it written underneath the tree?
22Move coding scheme
- The Query-W move
- is any query not covered by the other categories
- most moves classified as QUERY-W are wh-questions
23Move coding scheme (Response moves)
- Used within games after an initiation; they try
to fulfill expectations in the game
- Acknowledge move
- Verbal response that minimally shows that the
speaker has heard the move to which it responds,
and often also demonstrates that the move was
understood and accepted.
- Only the last three (from Clark and Schaefer's
kinds of evidence for acknowledgement) count as
ACKNOWLEDGE moves in this coding scheme
- G: Ehm, if you ... you're heading southwards.
- F: Mmhmm.
24Move coding scheme
- Reply-Y move
- Any reply to any query with a yes-no surface form
that means "yes", however that is expressed
- Normally only appears after QUERY-YN, ALIGN, and
CHECK moves.
- G: See the third seagull along?
- F: Yeah.
- Reply-N move
- Reply to a query with a yes-no surface form that
means "no"
- G: Do you have the west lake, down to your left?
- F: No.
25Move coding scheme
- Reply-W move
- Any reply to any type of query that doesn't
simply mean "yes" or "no".
- G: And then below that, what've you got?
- F: A forest stream.
- Clarify move
- Reply to some kind of question in which the
speaker tells the partner something over and
above what was strictly asked.
- Route givers tend to make CLARIFY moves when the
route follower seems unsure of what to do, but
there isn't a specific problem on the agenda
26Move coding scheme
- Other possible responses
- Utterances where the responder refuses to share
the same goal as the initiator (No, let's talk
about..)
- ACKNOWLEDGE moves with a negative slant
- Sufficiently rare in the corpora.
- READY move
- Moves that occur after the close of a dialogue
game and prepare the conversation for a new game
to be initiated.
- G: Okay. Now go straight down.
- Confusion: that could have been an ACKNOWLEDGE
move too
27Coding continued
- Game coding scheme
- Beginnings of new games are coded by purpose
- Places where games end or are abandoned are marked
- Marked as either occurring at top level or being
embedded in the game structure
- Transaction coding scheme
- Four transaction types
- NORMAL: Transaction serving a subtask, viz. a
route segment on the map.
- REVIEW: Transactions created when participants
return to parts of the route that have already
been completed
- OVERVIEW: Overviewing an upcoming segment in
order to provide a context for the partner.
- IRRELEVANT: Subdialogues not relevant to the
route (maybe about the experimental setup)
- Coding involves marking in the dialogue where the
transaction starts, except for IRRELEVANT
transactions.
- Ends of transactions are not coded.
28Reliability of coding scheme
- Tests of reliability
- Krippendorff's tests of reliability
- Stability
- Reproducibility
- Accuracy
- Agreement by coders on segmentation
- Used kappa coefficient for reliability of
classification.
29Reliability of coding
- Reliability of move coding
- Four coders
- Each coder had access to the speech as well as
transcripts
- All coders interacted verbally with the
developers
- Reliability of move segmentation
- Kappa .92 using word boundaries as units
- Pairwise percent agreement on locations where any
coder had marked a boundary was 89%.
- No. of units: 4079. No. of boundaries: 796
- Most errors were with marking READY separately or
marking it in the move that followed, and marking
a reply as one move or splitting it into a reply
and EXPLAIN, CLARIFY etc.
30Reliability of coding
- Reliability of move classification
- Since the reliability of segmentation was good,
it gave a good foundation for move classification
- Move classification was evaluated only over move
segments where the boundaries were agreed
- Kappa for move coding: 0.83
- Largest confusions between
- CHECK and QUERY-YN
- INSTRUCT and CLARIFY
- ACKNOWLEDGE, READY and REPLY-Y
- K = 0.89 for coding an initiation as a command, a
statement or a question
31Reliability of coding
- Reliability of move classification from written
instructions
- K = 0.69
- Reliability of move coding in another domain
- Transcribed conversation between a hi-fi sales
assistant and a married couple intending to
purchase an amplifier
- K = 0.95 for move segmentation
- K = 0.81 for move classification
- Reliability of game coding
- Pairwise agreement on game beginnings: 70%
- Reliability of transaction coding
- Done from written instructions
- K = 0.59
32Coding Dialogues with the DAMSL Annotation scheme
(Mark Core and James F Allen)
- DAMSL (Dialogue Act Markup In Several Layers)
- Automatic analysis of Dialogue needed for
- Computer acting as participant with users
- Computer as observer interpreting human speech
- DAMSL allows multiple labels in multiple layers
to be applied to an utterance
- Communicative actions described here are high
level.
33DAMSL annotation scheme
- Forward communicative functions
- Speech acts that affect the future of dialogue
- These categories are independent
- Divided into
- Representatives (statements): making claims about
the world
- Speaker trying to affect the beliefs of the
hearer: Assert
- Repeating information for emphasis or
acknowledgement: Reassert
- Influencing-Addressee-Future-Action
- All utterances that discuss potential actions of
the addressee
- Directives
- Info-Request: questions and requests (tell me the
time)
- Action-Directive: requests for action (Please
take out the trash)
- Open-Option
- Speaker gives a potential course of action but
does not show preference towards it
- Commissives (Committing-Speaker-Future-Action)
- Offers
- Commitments
- Performative category
- Utterances that make a fact true in virtue of
their content (You are fired)
34DAMSL annotation scheme
- Backward communicative function
- The speech act categories related to responses
- The classes are independent
- Agreement
- Accept, Accept-Part, Maybe, Reject-Part, Reject,
Hold
- Understanding
- Did the listener understand the speaker?
- The listener may
- Signal-Non-Understanding
- Signal-Understanding (Acknowledgements,
Repeat-Rephrase, Completion)
- Correct-Misspeaking
- Answer
- Supplying information explicitly requested by a
previous Info-Request act
- Information relations
- Describe how the information in the current
utterance relates to previous utterances
35- Utterance features
- Information Level
- Task (utterance about the task)
- Task Management (utterance about the planning and
monitoring of task)
- Communication Management (physical requirements
of dialogue)
- Other
- Communicative Status
- Abandoned
- Uninterpretable
- Syntactic Features
- Conventional form (hello, how may I help you)
- Exclamatory form (wow)
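Because DAMSL lets one utterance carry labels on several layers at once, an annotation is more naturally a record with one field per layer than a single tag. The following is a minimal sketch under that assumption; the class `DamslAnnotation`, its field names and the example utterance are hypothetical, while the label strings are the category names listed on the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DamslAnnotation:
    text: str
    # Forward and backward functions may each hold several labels.
    forward: Set[str] = field(default_factory=set)
    backward: Set[str] = field(default_factory=set)
    # Utterance features are single-valued in this sketch.
    information_level: Optional[str] = None
    communicative_status: Optional[str] = None

    def layers_used(self) -> List[str]:
        """Names of the layers that carry at least one label."""
        layers = []
        if self.forward:
            layers.append("forward")
        if self.backward:
            layers.append("backward")
        if self.information_level is not None:
            layers.append("information_level")
        if self.communicative_status is not None:
            layers.append("communicative_status")
        return layers


# Hypothetical utterance that both accepts a proposal and asks a question.
utt = DamslAnnotation(
    text="okay, which engine should I use?",
    forward={"Info-Request"},
    backward={"Accept"},
    information_level="Task",
)
```

A single-label scheme would have to pick either the acceptance or the question here; the layered record keeps both.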
36Experiments
- Used test dialogues from the TRAINS 91-93
dialogues.
- A person was given a problem to solve, viz.
shipping boxcars to a city, and another person
was instructed to act as a problem solving
system.
37Results
- Three statistics were used to measure
interannotator reliability.
- PA: percent pairwise agreement
- PE: expected pairwise agreement
- Kappa = (PA - PE) / (1 - PE)
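As a worked illustration of Kappa = (PA - PE) / (1 - PE), the sketch below computes the statistic for two annotators. The function name `pairwise_kappa` and the toy labels are hypothetical, and the way PE is estimated here (from each annotator's own label frequencies) is an assumption; the slides do not say which estimate of expected agreement was used.

```python
from collections import Counter


def pairwise_kappa(labels_a, labels_b):
    """Kappa = (PA - PE) / (1 - PE) for two annotators."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # PA: observed proportion of units on which the annotators agree.
    pa = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # PE: agreement expected by chance, assuming each annotator labels
    # independently according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    pe = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if pe == 1.0:
        raise ValueError("kappa is undefined when PE = 1")
    return (pa - pe) / (1 - pe)


# Toy data: the annotators agree on 3 of 4 utterances.
kappa = pairwise_kappa(
    ["Assert", "Assert", "Offer", "Offer"],
    ["Assert", "Offer", "Offer", "Offer"],
)
```

Here PA = 0.75 and PE = (2*1 + 2*3) / 16 = 0.5, so kappa = 0.25 / 0.5 = 0.5, noticeably lower than the raw agreement.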
38Results
39An empirical investigation of proposals in
Collaborative Dialogues (Di Eugenio et al.)
- They use a slight modification of the DRI
(Discourse Resource Initiative) scheme.
- Task (will be read out)
- The DRI coding scheme
- Similar to and simpler than the DAMSL scheme
discussed before.
- Forward looking functions
- This dimension characterizes the potential effect
that an utterance Ui has on the subsequent
dialogue.
- Statement: make claims about the world.
- Assert (Speaker trying to change Hearer's beliefs)
- Reassert (if the claim has already been made
before)
- Influence on hearer (I-on-H)
- Influences H's future action
- Open option
- Info Request
- Action directives
- Influence on Speaker (I-on-S)
- Commits S to some future course of action
- Offer
- Commit
40DRI coding scheme
- Backward looking functions
- Ui has to do with response
- Answer
- Agreement
- Accept/reject
- Holds
- Certain refinements were made to the core
features by adding heuristics for tagging
Statements, I-on-H and I-on-S.
41Coding results
- Their results on forward functions were better
than Core and Allen's (97)
- Very low Kappa value for agreement
42Twenty questions for Dialogue act taxonomies
(Traum)
- Defining dialogue acts
- Question 1.
- Which is most important: fit to intuitions or
formal rigor?
- Difficult to precisely formulate complex
intuitions using available formal techniques
- Sacrifice intuition for formal rigor or vice
versa?
- Answer will depend on the purpose of the concept
(experimentation or verification)
43Questions 2 and 3
- Is the definition of a dialogue act an issue of
lexical semantics or ontology of action?
- Is defining an act a matter of providing an
account of when someone might be justified in
describing an act with a sentence headed by a
particular verb (inform, request), or of providing
a technical vocabulary to compactly describe
various types of occurrences? (the speech
acts in the third paper)
- Under what conditions may an action be said to
have occurred?
- Allwood uses 4 criteria
- Intention of performer
- Form of behavior (e.g. linguistic form, question
2?)
- Achieved result
- Context in which the behavior occurs.
- Avoid defining DAs according to, say, a certain
set of results holding and then identifying
instances of these acts using one of the other
criteria, say linguistic form. This would lead to
coding difficulties
44Questions 4 and 5
- What is the role of speaker intention?
- Some would define dialogue acts on the basis of
the intention behind them
- Some would define them by the recognition of this
intention (illocutionary acts)
- What is the role of addressee uptake?
- Many dialogue act definitions require some
changes to the addressee based on understanding
of the utterance in a particular way
45Question 6
- What view should be taken regarding the
performance of acts?
- Speaker's and listener's view
- View of the speaker-addressee team; normative
conventional point of view.
- Is one allowed to consider subsequent utterances
before deciding performance?
- This has implications for coding.
46Dialogue act components(questions 7 and 8)
- How are actions used in a logic?
- What is context?
- What aspects of the situation are relevant as
potential conditions for defining types of
dialogue act performance and what aspects are
(directly) affected?
- Special sorts of information used for conditions
and effects of dialogue acts
- Dialogue state (precondition: the dialogue is in a
particular state; effect: transition to a new
dialogue state)
- Mental states (effect: newly adopted beliefs)
- Social obligations and commitments
47Questions 9 and 10
- What kinds of preconditions are appropriate?
- Most convenient dialogue acts have few, if any,
actual preconditions
- How should an unsuccessful act be distinguished
from a failed attempt to perform an act?
- Difference between the success and satisfaction
of a speech act
48Relationships and complex acts (questions 11 and 12)
- What is the relationship between dialogue acts
and other (e.g., physical) acts?
- Different theories would maintain a crisp or more
blurred distinction between dialogue acts and
non-communicative acts.
- What is the relationship between dialogue acts
and dialogue structure?
- Wholly dependent on dialogue structure (grammar
based approaches)
- Dialogue structure is primarily constructed from
the activity that the participants are engaged in
- Dialogue structure is also used as context for
performance of dialogue act (question 8)
49Questions 13 and 14
- Are there multi-agent dialogue acts?
- Some researchers view the performance of most
illocutionary acts as a collective performance of
multiple agents, in virtue of the grounding
process
- Games, exchanges and collaborative completions.
- Problems with tagging.
- Can dialogue acts be composed of more primitive
acts?
- Could a multiple strata dialogue act taxonomy
have levels or ranks?
50Question 15
- Can multiple dialogue acts occur at the same time
(performed through the same utterance)?
- Since utterances have multiple functions, yes.
- It is a problem if the logical theory does not
support simultaneous action
- It complicates tagging
51Taxonomic considerations (question 16)
- Can the same taxonomy be used for different kinds
of activities?
- People have been designing taxonomies for
different dialogue activities.
- A general theory might better allow one to use
act distributions to identify activities or
genres of activities as well as episodes within
an activity.
52Percentage distributions of dialogue acts in
Corpus Coding
53Questions 17 and 18
- Can the same taxonomy be used for different kinds
of agents?
- Could the same taxonomy cover communicative
activities between
- Human with human
- Human with machine
- Humans with animals etc.
- Modality of communication also matters
- How detailed should a dialogue act taxonomy be?
- How many distinctions in speech act verbs should
be captured within a dialogue act taxonomy (e.g.
state, assert, inform)?
- Trade-off between proposing many acts for subtle
differences and reliability of coding
54Questions 19 and 20
- How should complexity be realized in a coding
taxonomy?
- How to capture multiplicity of functions in a
taxonomy?
- Multiple labels for each utterance, one for each
function (DRI, Allen and Core)
- Bundle dialogue functions into one label
(Verbmobil, Jekat et al.)
- Intermediate approach (DAMSL)
- Can a taxonomy used for tagging dialogue
corpora be given a formal semantics and/or be
used in a dialogue system?
- Hope is yes