Learning Extraction Patterns for Subjective Expressions - PowerPoint PPT Presentation

About This Presentation
Title:

Learning Extraction Patterns for Subjective Expressions

Description:

Title: NRRC Summer Workshop: Multi-Perspective Question Answering Author: Janyce M. Wiebe Last modified by: wiebe Created Date: 6/5/2002 1:23:10 PM – PowerPoint PPT presentation

Slides: 64
Provided by: Jany7

Transcript and Presenter's Notes

Title: Learning Extraction Patterns for Subjective Expressions


1
Learning Extraction Patterns for Subjective
Expressions
  • Ellen Riloff, University of Utah
  • Janyce Wiebe, University of Pittsburgh

2
Subjectivity
  • Subjective language includes opinions, rants,
    allegations, accusations, suspicions, and
    speculation
  • Distinguishing factual from subjective
    information could benefit many applications
  • information extraction
  • question answering
  • summarization

3
Goals
  • Sentence-level subjectivity classification
  • (Wiebe et al. 2001) found that 44% of sentences
    in news articles are subjective
  • Learning subjectivity clues from unannotated text
    corpora
  • Learning linguistically rich patterns

4
Previous Work: Subjectivity Analysis
  • Document-level subjectivity classification (e.g.,
    Turney 2002; Pang et al. 2002; Spertus 1997) and
    above (Tong 2001)
  • Genre classification (e.g., Karlgren and Cutting
    1994; Kessler et al. 1997; Wiebe et al. 2001)
  • Supervised sentence-level classification (Wiebe
    et al. 1999)
  • Learning adjectives, adjectival phrases, verbs,
    nouns, and N-grams (e.g., Turney 2002;
    Hatzivassiloglou & McKeown 1997; Wiebe et al.
    2001)

5
Recent Related Work
  • Yu and Hatzivassiloglou (EMNLP03): unsupervised
    sentence-level classification; complementary
    approach and features
  • Dave et al. (WWW03): reviews classified as
    positive or negative
  • Agrawal et al. (WWW03): newsgroup authors
    partitioned into camps based on quotation links
  • Gordon et al. (ACL03): manually developed
    grammars for some types of subjective language

6
Extraction Patterns
  • Extraction patterns are lexico-syntactic patterns
    to identify relevant information
  • Typically they represent role relationships
    surrounding noun and verb phrases
  • hijacking of <x>: hijacked vehicle
  • <x> was hijacked: hijacked vehicle
  • <x> hijacked: hijacker

7
Our Method
  • Subjective expressions represented as extraction
    patterns
  • get to know <dobj>; <subj> appear to be;
    <subj> was satisfied; <subj> complained
  • Supervised extraction pattern learning
  • Training data generated automatically
  • Entire process bootstrapped

8
[Diagram: bootstrapping pipeline. Components: Unannotated Text Collection; Known subjective vocabulary; Subjective Classifier and Objective Classifier (unlabeled sentences in, subjective and objective sentences out); Extraction Pattern Learner (subjective patterns out); Pattern-Based Subjective Classifier (unlabeled sentences in, subjective sentences out; subjective patterns also fed back to the Subjective Classifier)]
15
Unannotated Text Collection
English-language versions of FBIS news articles
from a variety of countries.
Size: 302,160 sentences
17
Known subjective vocabulary
  • From previous work
  • Manually identified (e.g., entries from Levin 1993)
  • Automatically identified (e.g., nouns from Riloff et al. 2003)
  • Strongly subjective: most instances are subjective
  • Weakly subjective: objective instances are also common
  • Any data used is separate from the data in this paper

20
[Diagram: Subjective Classifier and Objective Classifier stage]
  • Input: unlabeled sentences from the Unannotated Text Collection, plus the Known subjective vocabulary
  • Subjective Classifier: > 1 strongly subjective clue
  • Output: subjective sentences
  • 91.3% precision, 31.9% recall
  • Test set: 2197 sentences, 59% subjective
21
[Diagram: Subjective Classifier and Objective Classifier stage]
  • Subjective Classifier: > 1 strongly subjective clue
  • Objective Classifier: 0 strongly subjective clues; 0 or 1 weakly subjective clue in previous, current, next sentence
  • Output: objective sentences
  • 82.6% precision, 16.4% recall
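
The two high-precision classifiers described above could be sketched roughly as follows. This is only an illustration: the clue lists `STRONG_CLUES` and `WEAK_CLUES` are tiny hypothetical stand-ins for the real lexicons, tokenization is naive whitespace splitting, and it assumes the strong-clue test applies to the current sentence while the weak-clue count spans the previous, current, and next sentences (the slide does not spell this out).

```python
from typing import List, Optional, Set

# Hypothetical toy clue lists; the real vocabulary comes from prior work.
STRONG_CLUES: Set[str] = {"odious", "absurdities", "fabrications", "complained"}
WEAK_CLUES: Set[str] = {"fears", "surprised", "criticism"}


def count_clues(sentence: str, clues: Set[str]) -> int:
    """Count the tokens of a sentence that appear in the clue set."""
    tokens = [tok.strip('.,;:!?"\'').lower() for tok in sentence.split()]
    return sum(1 for tok in tokens if tok in clues)


def classify(sentences: List[str], i: int) -> Optional[str]:
    """Label sentence i as 'subjective', 'objective', or None (left unlabeled)."""
    strong = count_clues(sentences[i], STRONG_CLUES)
    if strong > 1:
        # Subjective rule: more than one strongly subjective clue.
        return "subjective"
    # Objective rule: no strong clue, and at most one weak clue in the
    # previous, current, and next sentences taken together (assumption).
    window = sentences[max(0, i - 1): i + 2]
    weak = sum(count_clues(s, WEAK_CLUES) for s in window)
    if strong == 0 and weak <= 1:
        return "objective"
    return None
```

Sentences that satisfy neither rule stay unlabeled, which is consistent with the high precision and low recall reported for both classifiers.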
24
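[Diagram: training data for the Extraction Pattern Learner]
  • Subjective Classifier: 17,000 subjective sentences, used as relevant texts
  • Objective Classifier: 17,000 objective sentences, used as irrelevant texts
  • Extraction Pattern Learner: AutoSlog-TS (Riloff 1996), which outputs subjective patterns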
25
Step 1: Apply Syntactic Templates
  • <subj> active-verb dobj, e.g. <subj> dealt blow
  • <subj> verb infinitive, e.g. <subj> appear to be
  • <subj> aux noun, e.g. <subj> has position
  • active-verb <dobj>, e.g. endorsed <dobj>
  • verb infinitive <dobj>, e.g. get to know <dobj>
  • noun prep <np>, e.g. opinion on <np>
  • infinitive prep <np>, e.g. to resort to <np>

27
Step 1: Apply Syntactic Templates
  • <subj> active-verb dobj, e.g. <subj> dealt blow
  • Matches any sentence with
  • a verb phrase with head = dealt
  • a direct object with head = blow
  • Example: The experience certainly dealt a stiff blow to
    his pride.
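
The matching rule above can be illustrated with a small sketch. It assumes a shallow parser has already reduced each clause to its head words; the `Clause` and `SubjVerbDobjPattern` types are hypothetical names introduced here for illustration, not part of AutoSlog-TS.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Clause:
    """Hypothetical shallow-parse output: head words only."""
    subj_head: Optional[str]
    verb_head: str
    voice: str  # "active" or "passive"
    dobj_head: Optional[str]


@dataclass(frozen=True)
class SubjVerbDobjPattern:
    """One instance of the '<subj> active-verb dobj' template."""
    verb_head: str
    dobj_head: str

    def matches(self, clause: Clause) -> bool:
        """True if the clause has the required verb head and direct-object head."""
        return (
            clause.voice == "active"
            and clause.verb_head == self.verb_head
            and clause.dobj_head == self.dobj_head
        )

    def extract(self, clause: Clause) -> Optional[str]:
        """Return the subject filler when the pattern matches, else None."""
        return clause.subj_head if self.matches(clause) else None


# "The experience certainly dealt a stiff blow to his pride."
clause = Clause(subj_head="experience", verb_head="dealt",
                voice="active", dobj_head="blow")
pattern = SubjVerbDobjPattern(verb_head="dealt", dobj_head="blow")
```

Because only head words are compared, modifiers such as "certainly" and "stiff" do not block the match.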

28
Step 2: Select Patterns
  • Apply all learned patterns to training data
  • Rank patterns
  • Prec(pattern) = P(subjective | pattern)
  • = (# instances in subjective sentences) / (total # instances)
  • Choose patterns with
  • Frequency > F
  • Prec > P
  • on the training data for some F and P
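
As a rough sketch of this selection step, the function below computes each pattern's frequency and precision from a list of pattern instances and keeps those above both thresholds. The function name, the input format (one `(pattern, in_subjective_sentence)` pair per instance), and the toy data in the usage note are assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def select_patterns(
    instances: Iterable[Tuple[str, bool]],
    min_freq: int,
    min_prec: float,
) -> List[Tuple[str, int, float]]:
    """Rank patterns by precision and keep those with freq > F and prec > P.

    instances: one (pattern, in_subjective_sentence) pair per pattern instance
    found in the automatically labeled training data.
    """
    total: Counter = Counter()
    subjective: Counter = Counter()
    for pattern, in_subjective in instances:
        total[pattern] += 1
        if in_subjective:
            subjective[pattern] += 1

    selected = []
    for pattern, freq in total.items():
        # Prec(pattern) = instances in subjective sentences / total instances
        prec = subjective[pattern] / freq
        if freq > min_freq and prec > min_prec:
            selected.append((pattern, freq, prec))

    # Highest precision first, then most frequent, then alphabetical.
    selected.sort(key=lambda row: (-row[2], -row[1], row[0]))
    return selected
```

With three subjective instances of `<subj> complained`, four instances of `<subj> asked` (two of them subjective), and `min_freq=1`, `min_prec=0.59`, only `<subj> complained` would survive. The strict inequalities follow the slide; a threshold such as P = 1.0 would need `>=` instead.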

29
Examples from Training Data
Pattern                     % SUBJ
<subj> was asked            100
<subj> asked                63
<subj> is talk              100
talk of <np>                90
<subj> will talk            71
was expected from <np>      100
<subj> was expected         42
<subj> is fact              100
fact is <dobj>              100
34
Test Data
  • Manual annotation to support a project
    investigating multiple perspective QA (ARDA
    AQUAINT NRRC)
  • 0.77 average pairwise kappa
  • 0.89 average pairwise kappa with borderline
    sentences removed (11% of the corpus)
  • Wilson & Wiebe, SIGDIAL 2003, describes the
    annotation scheme and agreement study
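
For readers unfamiliar with the agreement figure, here is a minimal sketch of an average pairwise kappa, assuming Cohen's kappa is the statistic meant and that every annotator labeled the same sentences in the same order. The function names and the toy labels in the usage note are illustrative only.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Sequence


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators over the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("annotations must be non-empty and aligned")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement from each annotator's own label distribution.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


def average_pairwise_kappa(annotations: Dict[str, List[str]]) -> float:
    """Mean of Cohen's kappa over all annotator pairs."""
    pairs = list(combinations(sorted(annotations), 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(cohen_kappa(annotations[x], annotations[y]) for x, y in pairs) / len(pairs)
```

For example, two annotators who label four sentences `s, s, o, o` and `s, s, o, s` agree on 3 of 4 items with chance agreement 0.5, which gives kappa = 0.5.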

35
Example
The Foreign Ministry said Thursday that it was
"surprised, to put it mildly" by the U.S. State
Department's criticism of Russia's human rights
record and objected in particular to the "odious"
section on Chechnya.
38
Evaluation of Learned Patterns
  • Test data: 3947 sentences, 54% subjective
  • Thresholds on training data, results on test data:
  • Train: F > 9, P = 100; Test: P = 85, Recall = 41
  • Train: F > 1, P > 59; Test: P = 71, Recall = 92

40
[Diagram, built up over several slides: second pass through the pipeline]
  • Subjective Classifier: 17000 subjective sentences; Objective Classifier: 17000 objective sentences
  • Pattern-Based Subjective Classifier: > 0 instances of patterns with F > 4, P = 1 on training data
  • 9500 new subjective sentences go to the Extraction Pattern Learner (the diagram also shows the count 7500 beside the Subjective Classifier)
  • New subjective patterns: 4248 patterns with P > .59 on training data; 308 patterns with P = 1.0 on training data
  • Evaluate new + old patterns on test set: Recall 24 Prec -0.52
46
[Diagram: feeding learned patterns back into the Subjective Classifier]
  • Subjective patterns with F > 9, P = 1.0 on training data are used by the Subjective Classifier alongside the Known subjective vocabulary
  • New subjective sentences: 1 old clue + 1 new, or > 1 new
  • Old + new subjective sentences go to the Extraction Pattern Learner
49
Evaluation on Test Data
  • Original subjective classifier:
    32.9% recall, 91.3% precision
  • Augmented subjective classifier:
    40.1% recall, 90.2% precision

50
Future Work
52
  • Improve original high-precision classifier
  • Identify new objective sentences during
    bootstrapping

[Diagram: the pipeline mirrored for objective sentences. Components: Known subjective vocabulary; Objective Classifier (objective sentences out); Extraction Pattern Learner; Pattern-Based Objective Classifier (unlabeled sentences in, objective sentences out)]
54
[Diagram: Unannotated Text Collection, unlabeled sentences, Known subjective vocabulary, and the Subjective Classifier at Iteration 0 and Iteration 1, producing subjective sentences]
55
  • Build up subjective lexicon as the process
    is applied to new corpora
  • Human review of high-precision patterns
  • Tough act to follow: linguistic subjectivity
  • Rush Limbaugh: opinionated source
  • police: lightning rod topic

Known subjective vocabulary
  • Richer representation with deeper knowledge
    (theta roles, polarity, tone, ambiguity, ...)

56
Conclusions
  • High-precision subjectivity classification can be
    used to generate large amounts of labeled
    training data
  • Extraction pattern learning techniques can learn
    linguistically rich subjective patterns
  • Bootstrapping process results in higher recall
    with little loss in precision
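
The overall bootstrapping loop can be sketched in a few lines. Everything here is a toy stand-in chosen to keep the example runnable: single words play the role of both clues and learned patterns, `label` mimics the high-precision classifiers, and `learn` replaces AutoSlog-TS with a trivial "seen only in subjective sentences" filter. None of these functions is the authors' implementation.

```python
from typing import List, Set, Tuple

SEED_CLUES = {"odious", "absurd"}  # hypothetical seed vocabulary


def label(sentences: List[str], clues: Set[str]) -> Tuple[List[str], List[str]]:
    """Toy high-precision labeling: > 1 clue is subjective, 0 clues is objective."""
    subjective, objective = [], []
    for sentence in sentences:
        hits = sum(1 for word in sentence.split() if word in clues)
        if hits > 1:
            subjective.append(sentence)
        elif hits == 0:
            objective.append(sentence)
    return subjective, objective


def learn(subjective: List[str], objective: List[str]) -> Set[str]:
    """Toy learner: keep words seen in subjective but never in objective sentences."""
    objective_words = {w for s in objective for w in s.split()}
    return {w for s in subjective for w in s.split()} - objective_words


def bootstrap(sentences: List[str], iterations: int) -> Tuple[Set[str], List[str]]:
    """Alternate labeling and learning; return final clues and subjective sentences."""
    clues = set(SEED_CLUES)
    subjective: List[str] = []
    for _ in range(iterations):
        subjective, objective = label(sentences, clues)
        clues |= learn(subjective, objective)
    return clues, subjective


corpus = [
    "odious absurd claims outraged critics",
    "absurd outraged critics",
    "meeting held thursday",
    "report released thursday",
]
```

On this toy corpus the first pass labels only the first sentence as subjective; the words learned from it let the second pass also label the second sentence, which mirrors the recall gain reported for the augmented classifier.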

57
Annotation Scheme
  • The annotation scheme was developed as part of a
    U.S. government-sponsored project (ARDA AQUAINT
    NRRC) to investigate multiple perspective
    question answering.
  • Annotators labeled private state expressions.
  • Each private state can have low, medium, or high
    strength.
  • Our gold standard considers a sentence to be
    subjective if it contains at least one private
    state expression of medium or higher strength.
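
The gold-standard rule above amounts to a one-line check. In this sketch the `Strength` enum and the `gold_label` function are hypothetical names, and each sentence is assumed to arrive as the list of strengths of its annotated private state expressions.

```python
from enum import IntEnum
from typing import Iterable


class Strength(IntEnum):
    """Strength of a private state expression (ordering: low < medium < high)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def gold_label(strengths: Iterable[Strength]) -> str:
    """A sentence is subjective if any private state is medium strength or higher."""
    if any(s >= Strength.MEDIUM for s in strengths):
        return "subjective"
    return "objective"
```

Under this rule a sentence whose only private state expression is of low strength counts as objective, as does a sentence with no private state expressions at all.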

58
Two Ways of Expressing Private States
  • Explicit mentions of private states and speech
    events
  • The United States fears a spill-over from the
    anti-terrorist campaign
  • Expressive subjective elements
  • The part of the US human rights report about
    China is full of absurdities and fabrications.

59
Nested Sources
60
OnlyFactive
OnlyFactive: yes
"The US fears a spill-over," said Xirao-Nima, a
professor of foreign affairs at the Central
University for Nationalities.