Title: Automated Performance Assessment for Interactive Question Answering
Department of Computer Science and Engineering, College of Engineering
Chen Zhang, Tyler Baldwin, and Joyce Chai
Introduction
Interactive question answering (QA) has been identified as one of the
important directions in QA research [1]. In interactive question
answering, users and QA systems take turns asking questions and
providing answers. In such an interactive setting, user questions
largely depend on the answers provided by the system.
Research question: Can user follow-up questions and the interaction
context provide feedback that lets a QA system automatically assess
its own performance?
Potential implication: Such self-awareness can make QA systems more
intelligent for information seeking, for example, by adopting better
strategies to cope with problematic situations.
Data Collection
- Wizard of Oz study
  - A human wizard controls and simulates the problematic situations to
    obtain data balanced between problematic and error-free situations.
  - Users are not aware of the human wizard and believe they are
    interacting with a real QA system.
  - Four topic scenarios, 11 users, and 44 interaction sessions.
Simulated Results
Performance Assessment
- Features
  - Target matching (TM): whether the target type of Qi+1 is the same
    as that of Qi
  - Named entity matching (NEM): whether all the named entities in Qi+1
    also appear in Qi
  - Similarity between questions Qi+1 and Qi (SQ)
  - Similarity between the content words of questions Qi+1 and Qi,
    excluding named entities (SQC)
  - Similarity between question Qi and answer Ai (SA)
  - Similarity between the content words of question Qi and answer Ai
    (SAC)
- Classifiers
  - Maximum Entropy Model (MaxEnt)
  - Support Vector Machine (SVM)
  - Decision Tree (Dtree)
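As a rough illustration of the feature set above, the sketch below packs the six features for one turn (a question, its answer, and the follow-up question) into a single record. The names TurnFeatures, named_entity_match, and target_match are hypothetical, as is the "DATE" target-type label. Named entities and target types are assumed to come from upstream components that the poster does not describe. The numeric values are the ones shown in the Example panel.

```python
from dataclasses import dataclass


@dataclass
class TurnFeatures:
    """Six features for one turn (hypothetical container, not the authors' code)."""
    tm: bool    # target type of the follow-up question matches the current one
    nem: bool   # all named entities of the follow-up appear in the current question
    sq: float   # similarity between follow-up and current question
    sqc: float  # same, content words only, excluding named entities
    sa: float   # similarity between current question and its answer
    sac: float  # same, content words only

    def as_vector(self):
        # Booleans become 0.0/1.0 so any of the listed classifiers could consume them.
        return [float(self.tm), float(self.nem), self.sq, self.sqc, self.sa, self.sac]


def named_entity_match(entities_next, entities_current):
    """NEM: True if every named entity in the follow-up also appears in the current question."""
    return set(entities_next) <= set(entities_current)


def target_match(target_next, target_current):
    """TM: True if both questions ask for the same (assumed, pre-computed) target type."""
    return target_next == target_current


# Values taken from the Example panel; "DATE" is an assumed target-type label.
example = TurnFeatures(
    tm=target_match("DATE", "DATE"),
    nem=named_entity_match(["Tom Cruise"], ["Tom Cruise"]),
    sq=0.89, sqc=0.68, sa=0.87, sac=0.01,
)
print(example.as_vector())
```

A vector of this shape, paired with a problematic or error-free label for the answer, is the kind of instance a MaxEnt, SVM, or decision tree classifier could be trained on.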
Similarity Measurement: The similarity between two chunks of text T1
and T2 is measured by the approach proposed by Lin [2].
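The exact formula is not reproduced on the poster. One common way to instantiate Lin's information-theoretic definition for two texts, assuming words are independent features, is to divide twice the information content of the shared words by the summed information content of both texts. The sketch below follows that assumption. The function names and the add-one smoothed word probabilities are illustrative choices, not the authors' setup.

```python
import math
from collections import Counter


def make_word_prob(corpus_tokens):
    """Return a toy estimator of P(w) with add-one smoothing over a token list."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def information(words, prob):
    """Information content of a set of words, assuming the words are independent."""
    return -sum(math.log(prob(w)) for w in words)


def lin_similarity(t1_tokens, t2_tokens, prob):
    """2 * I(shared words) / (I(T1) + I(T2)); returns 0.0 when both texts carry no information."""
    s1, s2 = set(t1_tokens), set(t2_tokens)
    denom = information(s1, prob) + information(s2, prob)
    if denom == 0:
        return 0.0
    return 2 * information(s1 & s2, prob) / denom


q_current = "when was tom cruise born".split()
q_next = "in what year was tom cruise born".split()
prob = make_word_prob(q_current + q_next)  # toy corpus: just the two questions
print(lin_similarity(q_current, q_next, prob))
```

Under this formulation rare shared words contribute more than frequent ones, identical texts score 1, and texts with no words in common score 0.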
Q1: When was Tom Cruise born?
A1 (Problematic): Based on the memoir of combat veteran Ron Kovic, the
film stars Tom Cruise as Kovic, whose gunshot wound in Vietnam left him
paralyzed from the chest down. ... a powerfully intimate portrait that
unfolds on an epic scale, Born on the Fourth of July is arguably
Stone's best film (if you can forgive its often strident tone), ...
Results
Example
Qi: When was Tom Cruise born?
Ai: ... the film stars Tom Cruise as Kovic, ... Born on the Fourth of
July is arguably Stone's best film ...
Qi+1: In what year was Tom Cruise born?
TM = true, NEM = true, SQ = 0.89, SQC = 0.68, SA = 0.87, SAC = 0.01
What's the performance of this answer? Problematic!
Re-try
Q2: In what year was Tom Cruise born?
A2 (Error-free): Thomas Cruise Mapother IV was born on the 3rd of
July, 1962 (eerily similar to his film Born on the 4th of July), in
Syracuse, New York. He was the only boy of four children.
Continue
Q3: What does Tom Cruise do for a living?
Conclusion
- User question behavior (e.g., rephrasing) and interaction context can
  provide important cues for automatically identifying problematic
  situations.
- The best classifier achieves 73.8% accuracy, which is 17% better than
  the baseline.
- Automatic performance assessment is an important step in moving
  interactive QA towards intelligent conversational QA.
References
[1] J. B. et al. Issues, tasks and program structures to roadmap
research in question answering. In NIST Roadmap Document, 2001.
[2] D. Lin. An information-theoretic definition of similarity. In
Proceedings of International Conference on Machine Learning, Madison,
Wisconsin, July 1998.