1
I256: Applied Natural Language Processing
Marti Hearst, Oct 2, 2006
2
Contents
  • Introduction and Applications
  • Types of summarization tasks
  • Basic paradigms
  • Single document summarization
  • Evaluation methods

3
Introduction
  • The problem: information overload
  • 4 billion URLs indexed by Google
  • 200 TB of data on the Web [Lyman and Varian 03]
  • Information is created every day in enormous
    amounts
  • One solution: summarization
  • Abstracts promote current awareness
  • save reading time
  • facilitate selection
  • facilitate literature searches
  • aid in the preparation of reviews
  • But what is an abstract??

4
Introduction
  • abstract:
  • brief but accurate representation of the
    contents of a document
  • goal:
  • take an information source, extract the most
    important content from it and present it to the
    user in a condensed form and in a manner
    sensitive to the user's needs.
  • compression:
  • the amount of text to present, or the ratio of the
    length of the summary to the length of the source.

5
History
  • The problem has been addressed since the
    '50s [Luhn 58]
  • Numerous methods are currently being suggested
  • Most methods still rely on 50s-70s algorithms
  • The problem is still hard, yet there are some
    applications
  • MS Word
  • www.newsinessence.com by Drago Radev's research
    group

6
(No Transcript)
7
MS Word AutoSummarize
8
Applications
  • Abstracts for Scientific and other articles
  • News summarization (mostly multiple document
    summarization)
  • Classification of articles and other written data
  • Web pages for search engines
  • Web access from PDAs, Cell phones
  • Question answering and data gathering

9
Types of Summaries
  • Indicative vs Informative
  • Informative: a substitute for the entire document
  • Indicative: gives an idea of what is there
  • Background
  • Does the reader have the needed prior knowledge?
  • Expert reader vs Novice reader
  • Query based or General
  • Query-based: a form is being filled in, and the
    questions should be answered
  • General: general-purpose summarization

10
Types of Summaries (input)
  • Single document vs multiple documents
  • Domain specific (chemistry) or general
  • Genre specific (newspaper items) or general

11
Types of Summaries (output)
  • extract vs abstract
  • Extracts: representative paragraphs/sentences/
    phrases/words, fragments of the original text
  • Abstracts: a concise summary of the central
    subjects in the document.
  • Research shows that sometimes readers prefer
    Extracts!
  • language chosen for summarization
  • format of the resulting summary
    (table/paragraph/key words)

12
Methods
  • Quantitative heuristics, manually scored
  • Machine-learning based statistical scoring
    methods
  • Higher semantic/syntactic structures
  • Network (graph) based methods
  • Other methods (rhetorical analysis, lexical
    chains, co-reference chains)
  • AI methods

13
Quantitative Heuristics
  • General method
  • score each entity (sentence, word), combine
    scores, choose the best sentence(s)
  • Scoring techniques
  • Word frequencies throughout the text (Luhn 58)
  • Position in the text (Edmundson 69, Lin & Hovy 97)
  • Title method (Edmundson 69)
  • Cue phrases in sentences (Edmundson 69)

14
Using Word Frequencies (Luhn 58)
  • Very first work in automated summarization
  • Assumptions
  • Frequent words indicate the topic
  • Frequent is measured with reference to the corpus
    frequency
  • Clusters of frequent words indicate summarizing
    sentence
  • Stemming based on similar prefix characters
  • Very common words and very rare words are ignored

15
Ranked Word Frequency
Zipf's curve
16
Word frequencies (Luhn 58)
  • Find consecutive sequences of high-weight
    keywords
  • Allow a certain number of gaps of low-weight
    terms
  • Sentences with the highest sum of cluster weights are
    chosen
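A minimal Python sketch of this cluster scoring (names and the gap limit are illustrative; the squared-count-over-span score roughly follows Luhn's significance factor). It assumes each sentence is already tokenized into lowercase words and that significant holds the high-frequency, non-stopword terms:

  # Find runs of significant words separated by at most max_gap
  # insignificant ones; score the sentence by its best cluster.
  def luhn_score(sentence_words, significant, max_gap=4):
      best, start, count, gap = 0.0, None, 0, 0
      for i, w in enumerate(sentence_words):
          if w in significant:
              if start is None:
                  start, count = i, 0
              count, gap = count + 1, 0
              best = max(best, count * count / (i - start + 1))
          elif start is not None:
              gap += 1
              if gap > max_gap:   # too many low-weight words: close the cluster
                  start, count, gap = None, 0, 0
      return best

Sentences would then be ranked by this score and the top ones extracted.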

17
Position in the text (Edmunson 69)
  • Claim: important sentences occur in specific
    positions
  • lead-based summary
  • inverse of position in the document works well for
    news (see the sketch below)
  • Important information occurs in specific sections
    of the document (introduction/conclusion)
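As a rough illustration (not from the slides), a lead-biased position feature can be as simple as:

  # Earlier sentences score higher, matching the lead-based heuristic
  # that works well for news.
  def position_score(index, n_sentences):
      return (n_sentences - index) / n_sentences   # 1.0 for the first sentence

Section-sensitive variants would instead boost sentences in the introduction and conclusion.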

18
Title method (Edmunson 69)
  • Claim: the title of a document indicates its content
  • Unless editors are being cute
  • Not true for novels usually
  • What about blogs ?
  • words in title help find relevant content
  • create a list of title words, remove stop words
  • Use those as keywords in order to find important
    sentences
  • (for example with Luhn's methods)
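A minimal sketch of the title method, assuming simple tokenization and a tiny illustrative stop-word list:

  # Count how many (non-stopword) title words a sentence contains;
  # higher counts suggest more relevant content.
  STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "for", "on", "is"}

  def title_score(sentence_words, title_words):
      keywords = {w.lower() for w in title_words} - STOPWORDS
      return sum(1 for w in sentence_words if w.lower() in keywords)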

19
Cue phrases method (Edmunson 69)
  • Claim: important sentences contain cue
    words / indicative phrases
  • The main aim of the present paper is to
    describe (IND)
  • The purpose of this article is to review (IND)
  • In this report, we outline (IND)
  • Our investigation has shown that (INF)
  • Some words are considered bonus, others stigma
  • bonus: comparatives, superlatives, conclusive
    expressions, etc.
  • stigma: negatives, pronouns, etc.
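A minimal sketch of a cue-word score; the bonus/stigma sets below are tiny illustrative stand-ins, not Edmundson's actual dictionaries:

  # Bonus words raise a sentence's score, stigma words lower it.
  BONUS = {"significant", "greatest", "best", "purpose", "conclude", "conclusion"}
  STIGMA = {"hardly", "impossible", "he", "she", "they"}

  def cue_score(sentence_words):
      words = [w.lower() for w in sentence_words]
      return sum(w in BONUS for w in words) - sum(w in STIGMA for w in words)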

20
Feature combination (Edmundson 69)
  • Linear contribution of 4 features
  • title, cue, keyword, position
  • the weights are adjusted using training data with
    any minimization technique
  • Evaluated on a corpus of 200 chemistry articles
  • Length ranged from 100 to 3900 words
  • Judges were told to extract 25% of the sentences,
    to maximize coherence and minimize redundancy.
  • Features
  • Position (sensitive to types of headings for
    sections)
  • cue
  • title
  • keyword
  • Best results obtained with
  • cue + title + position (see the sketch below)
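A minimal sketch of the linear combination and sentence selection. The per-sentence feature values are assumed to be computed elsewhere (e.g. by scorers like those sketched above), and the weights shown are placeholders that would be tuned on training data:

  # Combine per-sentence feature scores linearly, keep the top-k
  # sentences, then restore document order for readability.
  def combine(features, weights=(1.0, 1.0, 0.5, 1.0)):
      # features = (cue, title, keyword, position) for one sentence
      return sum(w * f for w, f in zip(weights, features))

  def select(feature_table, k):
      scored = [(i, combine(f)) for i, f in enumerate(feature_table)]
      best = sorted(scored, key=lambda x: x[1], reverse=True)[:k]
      return sorted(i for i, _ in best)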

21
Bayesian Classifier (Kupiec et al. 95)
  • Statistical learning method
  • Feature set
  • sentence length
  • S > 5
  • fixed phrases
  • 26 manually chosen
  • paragraph
  • sentence position in paragraph
  • thematic words
  • binary: whether sentence is included in a manual
    extract
  • uppercase words
  • not common acronyms
  • Corpus
  • 188 document/summary pairs from scientific
    journals

22
Bayesian Classifier (Kupiec et al. 95)
  • Uses a Bayesian classifier
  • Assuming statistical independence of the features
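The equation on this slide was not transcribed; in the standard naive Bayes form used by Kupiec et al., the probability that a sentence s belongs in the summary S, given its feature values F_1, ..., F_k, is

  P(s \in S \mid F_1, \dots, F_k) \approx \frac{P(s \in S) \prod_{j=1}^{k} P(F_j \mid s \in S)}{\prod_{j=1}^{k} P(F_j)}

with each probability on the right estimated from a training corpus, as the next slide notes.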

23
Bayesian Classifier (Kupiec et al. 95)
  • Each probability is calculated empirically from a
    corpus
  • Higher-probability sentences are chosen to be in
    the summary (see the scoring sketch below)
  • Performance
  • For 25% summaries, 84% precision
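A minimal Python sketch of applying that formula; the data structures are illustrative, and the per-feature probabilities are assumed to have been estimated by counting over a corpus of document/extract pairs:

  import math

  # Log-probability (up to a constant) that a sentence belongs in the summary.
  # p_f_given_s[j][v] = P(feature j has value v | sentence in summary)
  # p_f[j][v]         = P(feature j has value v)
  def kupiec_score(feature_values, p_f_given_s, p_f, p_in_summary):
      score = math.log(p_in_summary)
      for j, v in enumerate(feature_values):
          score += math.log(p_f_given_s[j][v]) - math.log(p_f[j][v])
      return score

Sentences are ranked by this score and the highest-scoring ones are kept until the target compression (e.g. 25%) is reached.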

24
Evaluation methods
  • When a manual summary is available
  • 1. choose a granularity (clause; sentence;
    paragraph),
  • 2. create a similarity measure for that
    granularity (word overlap; multi-word overlap;
    perfect match),
  • 3. measure the similarity of each unit in the new
    summary to the most similar unit(s) in the manual summary,
  • 4. measure Recall and Precision (a minimal
    sentence-level sketch follows this list).
  • Otherwise
  • 1. Intrinsic: how good is the summary as a
    summary?
  • 2. Extrinsic: how well does the summary help the
    user?
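A minimal sketch of steps 3-4 under the simplest choices above (sentence granularity, exact-match similarity):

  # Compare a system extract against a manual extract at the sentence level.
  def precision_recall(system_sents, manual_sents):
      system, manual = set(system_sents), set(manual_sents)
      overlap = len(system & manual)
      precision = overlap / len(system) if system else 0.0
      recall = overlap / len(manual) if manual else 0.0
      return precision, recall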

25
Intrinsic measures
  • Intrinsic measures (glass-box): how good is the
    summary as a summary?
  • Problem: how do you measure the goodness of a
    summary?
  • Studies compare to an ideal (Edmundson 69; Kupiec
    et al. 95; Salton et al. 97; Marcu 97) or
    supply criteria: fluency, informativeness,
    coverage, etc. (Brandow et al. 95).
  • Summary evaluated on its own or by comparing it with
    the source
  • Is the text cohesive and coherent?
  • Does it contain the main topics of the document?
  • Are important topics omitted?

26
Extrinsic measures
  • (Black-box): how well does the summary help a
    user with a task?
  • Problem: does summary quality correlate with
    performance?
  • Studies: GMAT tests (Morris et al. 92); news
    analysis (Miike et al. 94); IR (Mani and
    Bloedorn 97); text categorization (SUMMAC 98;
    Sundheim 98).
  • Evaluation in a specific task
  • Can the summary be used instead of the document?
  • Can the document be classified by reading the
    summary?
  • Can we answer questions by reading the summary?

27
The Document Understanding Conference (DUC)
  • This is really the Text Summarization Competition
  • Started in 2001
  • Task and Evaluation (for 2001-2004)
  • Various target sizes were used (10-400 words)
  • Both single and multiple-document summaries
    assessed
  • Summaries were manually judged for both content
    and readability.
  • Each peer (human or automatic) summary was
    compared against a single model summary
  • using SEE (http://www.isi.edu/~cyl/SEE/)
  • estimates the percentage of information in the
    model that was covered in the peer.
  • Also used ROUGE (Lin 04) in 2004
  • Recall-Oriented Understudy for Gisting Evaluation
  • Uses counts of n-gram overlap between candidate
    and gold-standard summary, assumes fixed-length
    summaries
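A minimal sketch of a ROUGE-N-style recall score against a single reference summary; the real ROUGE toolkit adds multiple references, stemming, and other options:

  from collections import Counter

  def ngrams(words, n):
      return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

  # Clipped n-gram overlap divided by the number of reference n-grams (recall).
  def rouge_n(candidate_words, reference_words, n=2):
      cand = Counter(ngrams(candidate_words, n))
      ref = Counter(ngrams(reference_words, n))
      overlap = sum(min(c, ref[g]) for g, c in cand.items())
      return overlap / sum(ref.values()) if ref else 0.0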

28
The Document Understanding Conference (DUC)
  • Made a big change in 2005
  • An extrinsic evaluation was proposed but rejected (write
    a natural-disaster summary)
  • Instead, a complex question-focused summarization
    task that required summarizers to piece together
    information from multiple documents to answer a
    question or set of questions as posed in a DUC
    topic.
  • Also indicated a desired granularity of
    information

29
The Document Understanding Conference (DUC)
  • Evaluation metrics for the new task
  • Grammaticality
  • Non-redundancy
  • Referential clarity
  • Focus
  • Structure and Coherence
  • Responsiveness (content-based evaluation)
  • This was a difficult task to do well in.

30
Let's make a summarizer!
  • Each person (or pair) writes code for one small
    part of the problem, using Kupiec et al.'s method.
  • We'll combine the parts in class.

31
Next Time
  • More on Bayesian classification
  • Other summarization approaches (Marcu paper)
  • Multi-document summarization (Goldstein et al.
    paper)
  • In-class summarizer!