Ranking Automatically Generated Questions as a Shared Task

Slides: 24

Provided by: keving1

Transcript and Presenter's Notes

1
Ranking Automatically Generated Questions as a Shared Task

Michael Heilman and Noah A. Smith

2
In the summer of 55 BC, Julius Caesar invaded
Britain with an army of two legions
3

When invaded Britain?

Who invaded Britain?

In the summer of 55 BC, who invaded Britain with
an army of two legions?

6
An Oversimplified Possible User Interface
Click on the checkboxes to select questions to
include on your reading quiz.

In the summer of 55 BC, who invaded Britain with
an army of two legions?
Who invaded Britain?
When invaded Britain?

7
Outline

Domain-General Factors
Shared Task Description
Annotating Questions
Semi-automated Evaluation

8
Domain-General Factors

Questions in almost all applications must:
Be grammatical
Use the correct WH-word
Not have obvious answers
Avoid vagueness
Even domain-specific applications would benefit
from techniques to ensure basic question quality.

10
Shared Task Description
11
Our Overgenerator

We implemented a question overgenerator employing
various lexical and syntactic transformations.

13
Annotating Questions

Undergraduates annotated overgenerated questions
produced by our system.

15
Annotation Scheme

Moderate inter-annotator agreement for
Acceptable vs. Not Acceptable ( )
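
The transcript does not name the agreement statistic used here. Cohen's kappa is one common choice for two annotators assigning binary labels, so the following is only a minimal sketch under that assumption. The function name and the example labels are illustrative and do not come from the presentation.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("both annotators must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items on which the two labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label for everything.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Illustrative labels only (not data from the presentation).
annotator_1 = ["Acceptable", "Acceptable", "Not Acceptable", "Not Acceptable"]
annotator_2 = ["Acceptable", "Not Acceptable", "Not Acceptable", "Not Acceptable"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

With more than two annotators per question, a multi-rater statistic such as Fleiss' kappa would be the analogous choice.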

16
Annotation Scheme

Possible Improvements to the Categories
Better descriptions
More examples
Revising/merging ones with low agreement
Possible Improvements to the Process
More training and spot-checking of annotators
More redundancy (e.g., Mechanical Turk)

18
Semi-automated Evaluation

Once the questions are annotated, new systems can
be evaluated automatically.

(Figure: ranked list positions 1 through 8, each marked with a question mark)
19
Semi-automated Evaluation

Metric: Precision@N
The proportion of questions in the top N that are acceptable.
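
The metric can be sketched in a few lines of Python. This is a minimal illustration, assuming each annotated question has an identifier and a binary acceptability label, and that the system under evaluation assigns each question a score. The identifiers, scores, and labels below are hypothetical.

```python
from typing import Dict, List


def precision_at_n(ranked_ids: List[str], acceptable: Dict[str, bool], n: int) -> float:
    """Proportion of the top-n ranked questions that were annotated as acceptable."""
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_ids[:n]
    if not top:
        return 0.0
    hits = sum(1 for qid in top if acceptable[qid])
    # Divides by the number of questions actually ranked if fewer than n exist.
    return hits / len(top)


# Hypothetical annotations: True means annotators judged the question acceptable.
acceptable = {"q1": True, "q2": True, "q3": False}
# Hypothetical system scores: higher means the system ranks the question higher.
scores = {"q1": 0.9, "q2": 0.4, "q3": 0.7}

ranking = sorted(scores, key=scores.get, reverse=True)
p_at_2 = precision_at_n(ranking, acceptable, 2)
```

Because the annotations are fixed in advance, a new ranker only needs to produce scores for the same questions in order to be scored this way.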

20
Spectrum of feasible approaches to QG

Tailored to Domain/Task
Deep questions
Require expensive ontologies, etc.

Domain-general
Shallow questions
Easily generalizable

Our proposed task is over here.
21
Conclusion

The domain-generality of the proposed task and
its semi-automatic evaluation method are the
primary benefits that make it likely to succeed
in furthering QG research.

23
Annotation Scheme
Deficiency           Description
Ungrammatical        The question does not appear to be a valid English sentence.
Does not make sense  The question is grammatical but indecipherable.
Vague                The question is too vague to know exactly what it is asking about, even after reading the article.
Obvious Answer       The correct answer would be obvious even to someone who has not read the article.
Missing Answer       The answer to the question is not in the article.
Wrong WH word        The question would be acceptable if the wh-phrase were different.
Formatting           There are minor formatting errors.
Other                There are other errors in the question that are not covered by any of the categories.
Acceptable           None of the above deficiencies were present.
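
One way to represent this scheme in code is shown below. It is a minimal sketch: the enum mirrors the deficiency categories in the table, and a question counts as acceptable when an annotator marks no deficiency. The majority-vote helper is an assumption for the multi-annotator case and is not described in the presentation.

```python
from enum import Enum
from typing import Iterable, List, Set


class Deficiency(Enum):
    UNGRAMMATICAL = "Ungrammatical"
    DOES_NOT_MAKE_SENSE = "Does not make sense"
    VAGUE = "Vague"
    OBVIOUS_ANSWER = "Obvious Answer"
    MISSING_ANSWER = "Missing Answer"
    WRONG_WH_WORD = "Wrong WH word"
    FORMATTING = "Formatting"
    OTHER = "Other"


def is_acceptable(deficiencies: Iterable[Deficiency]) -> bool:
    """A question is acceptable when no deficiency was marked."""
    return len(set(deficiencies)) == 0


def majority_acceptable(annotations: List[Set[Deficiency]]) -> bool:
    """Hypothetical aggregation: acceptable if a strict majority found no deficiency."""
    votes = sum(1 for marked in annotations if is_acceptable(marked))
    return votes * 2 > len(annotations)


# Illustrative annotations from three hypothetical annotators for one question.
annotations = [set(), {Deficiency.VAGUE}, set()]
result = majority_acceptable(annotations)
```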
