Ranking Automatically Generated Questions as a Shared Task

1
Ranking Automatically Generated Questions as a
Shared Task
  • Michael Heilman and Noah A. Smith

2
In the summer of 55 BC, Julius Caesar invaded
Britain with an army of two legions
3
  • When invaded Britain?

4
  • Who invaded Britain?

5
  • In the summer of 55 BC, who invaded Britain with
    an army of two legions?

6
An Oversimplified Possible User Interface
Click on the checkboxes to select questions to
include on your reading quiz.
  • In the summer of 55 BC, who invaded Britain with
    an army of two legions?
  • Who invaded Britain?
  • When invaded Britain?

7
Outline
  • Domain-General Factors
  • Shared Task Description
  • Annotating Questions
  • Semi-automated Evaluation

8
Domain-General Factors
  • Questions in almost all applications must:
      • Be grammatical
      • Use the correct WH-word
      • Not have obvious answers
      • Avoid vagueness
  • Even domain-specific applications would benefit
    from techniques to ensure basic question quality.

9
Outline
  • Domain-General Factors
  • Shared Task Description
  • Annotating Questions
  • Semi-automated Evaluation

10
Shared Task Description
11
Our Overgenerator
  • We implemented a question overgenerator employing
    various lexical and syntactic transformations.

12
Outline
  • Domain-General Factors
  • Shared Task Description
  • Annotating Questions
  • Semi-automated Evaluation

13
Annotating Questions
  • Undergraduates annotated overgenerated questions
    produced by our system.

14
(No Transcript)
15
Annotation Scheme
  • Moderate inter-annotator agreement for
    Acceptable vs. Not Acceptable ( )
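
The transcript does not preserve which agreement statistic was reported here. As an illustration only, the sketch below assumes two annotators and uses Cohen's kappa, one common chance-corrected measure for binary Acceptable vs. Not Acceptable judgments. The function name and the sample labels are hypothetical.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two annotators' binary labels on the same items.

    Assumption: the slide does not name its statistic; kappa is used here
    only as an example of a chance-corrected agreement measure.
    """
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if each annotator labeled at random with
    # their own label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in (True, False)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Hypothetical labels: True means "Acceptable".
annotator_1 = [True, True, False, False]
annotator_2 = [True, False, False, False]
print(cohens_kappa(annotator_1, annotator_2))
```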

16
Annotation Scheme
  • Possible Improvements to the Categories
      • Better descriptions
      • More examples
      • Revising/merging ones with low agreement
  • Possible Improvements to the Process
      • More training and spot-checking of annotators
      • More redundancy (e.g., Mechanical Turk)

17
Outline
  • Domain-General Factors
  • Shared Task Description
  • Annotating Questions
  • Semi-automated Evaluation

18
Semi-automated Evaluation
  • Once the questions are annotated, new systems can
    be evaluated automatically.
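
A minimal sketch of that idea, assuming the annotations are stored as a lookup from question text to an acceptable/not-acceptable judgment: a new system's ranking can then be scored without further human effort. The storage format, the function name, and the labels assigned to the sample questions are all hypothetical.

```python
from typing import Dict, List

# Hypothetical annotated pool: question text -> judged acceptable?
# The labels are illustrative, not taken from the presentation.
annotations: Dict[str, bool] = {
    "In the summer of 55 BC, who invaded Britain with an army of two legions?": True,
    "Who invaded Britain?": True,
    "When invaded Britain?": False,
}


def labels_for_ranking(ranking: List[str], pool: Dict[str, bool]) -> List[bool]:
    """Look up the stored human judgment for each question a system ranked."""
    missing = [q for q in ranking if q not in pool]
    if missing:
        # A question outside the annotated pool would need new annotation.
        raise KeyError(f"unannotated questions: {missing}")
    return [pool[q] for q in ranking]


# A hypothetical system's ranking of the pooled questions, best first.
system_ranking = [
    "Who invaded Britain?",
    "When invaded Britain?",
    "In the summer of 55 BC, who invaded Britain with an army of two legions?",
]
print(labels_for_ranking(system_ranking, annotations))
```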

[Figure: a ranked list of eight items, numbered 1 to 8, each shown with a question mark]
19
Semi-automated Evaluation
  • Metric: Precision@N
  • The proportion of questions in the top N that are acceptable.
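
The metric can be sketched as follows, assuming a system's ranked output has already been mapped to acceptability judgments (True for acceptable). The function name, the choice to divide by N when fewer than N questions are returned, and the sample labels are assumptions for illustration.

```python
from typing import Sequence


def precision_at_n(ranked_labels: Sequence[bool], n: int) -> float:
    """Fraction of the top-n ranked questions that are annotated acceptable.

    ranked_labels[i] is True when the question ranked at position i + 1
    was annotated as acceptable.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranked_labels[:n]
    if not top:
        raise ValueError("no ranked questions to evaluate")
    # Dividing by n (not len(top)) treats missing positions as
    # unacceptable when a system returns fewer than n questions.
    return sum(top) / n


# Hypothetical annotations for a ranking of eight questions.
labels = [True, True, False, True, False, False, True, False]
print(precision_at_n(labels, 5))
```

With these sample labels, three of the top five questions are acceptable, so the call evaluates 3/5.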

20
Spectrum of feasible approaches to QG
  • Tailored to Domain/Task
      • Deep questions
      • Require expensive ontologies, etc.
  • Domain-general
      • Shallow questions
      • Easily generalizable

Our proposed task is over here.
21
Conclusion
  • The domain-generality of the proposed task and
    its semi-automatic evaluation method are the
    primary benefits that make it likely to succeed
    in furthering QG research.

22
(No Transcript)
23
Annotation Scheme
Deficiency           Description
Ungrammatical        The question does not appear to be a valid English sentence.
Does not make sense  The question is grammatical but indecipherable.
Vague                The question is too vague to know exactly what it is asking about, even after reading the article.
Obvious Answer       The correct answer would be obvious even to someone who has not read the article.
Missing Answer       The answer to the question is not in the article.
Wrong WH word        The question would be acceptable if the wh-phrase were different.
Formatting           There are minor formatting errors.
Other                There are other errors in the question that are not covered by any of the categories.
Acceptable           None of the above deficiencies were present.
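
One way to represent this scheme in code, as a rough sketch: each deficiency becomes an enum member, and a question counts as acceptable exactly when no deficiency was marked. The class and function names are hypothetical, and the sketch assumes an annotator may mark more than one deficiency per question, which the slide does not state.

```python
from enum import Enum
from typing import Iterable


class Deficiency(Enum):
    """Deficiency categories from the annotation scheme."""
    UNGRAMMATICAL = "Ungrammatical"
    DOES_NOT_MAKE_SENSE = "Does not make sense"
    VAGUE = "Vague"
    OBVIOUS_ANSWER = "Obvious Answer"
    MISSING_ANSWER = "Missing Answer"
    WRONG_WH_WORD = "Wrong WH word"
    FORMATTING = "Formatting"
    OTHER = "Other"


def is_acceptable(marked: Iterable[Deficiency]) -> bool:
    """Acceptable means none of the deficiencies were present."""
    return len(set(marked)) == 0


# Hypothetical annotations for two of the example questions.
print(is_acceptable([]))                          # no deficiencies marked
print(is_acceptable([Deficiency.UNGRAMMATICAL]))  # e.g. "When invaded Britain?"
```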