1
Finding strong and weak opinion clauses
Just how mad are you?
  • Theresa Wilson, Janyce Wiebe, Rebecca Hwa
  • University of Pittsburgh

AAAI-2004
2
Problem and Motivation
  • Problem: opinion extraction (automatically identify
    and extract attitudes, opinions, sentiments in text)
  • Applications: Information Extraction, Summarization,
    Question Answering, Flame Detection, etc.
  • Focus: individual clauses and strength

3
Motivating Example
"I think people are happy because Chavez has
fallen. But there's also a feeling of
uncertainty about how the country's obvious
problems are going to be solved," said Ms.
Ledesma.
4
Motivating Example
Though some of them did not conceal their
criticisms of Hugo Chavez, the member countries
of the Organization of American States condemned
the coup and recognized the legitimacy of the
elected president.
[Slide annotations mark parts of this sentence as
low strength, medium strength, and high strength.]
5
Our Goal
  • Identify opinions below sentence level
  • Characterize strength of opinions

6
Our Approach
  • Identify embedded sentential clauses
  • Dependency tree representation
  • Supervised learning to classify strength of
    clauses
  • Scale from NO OPINION to VERY STRONG:
    neutral, low, medium, high
  • Significant improvements over baseline
  • Mean-squared error < 1.0

7
Our Approach
I am furious that my landlord refused to return
my security deposit until I sued them.
[Figure: dependency tree of the sentence, with "am"
as the top node. Labels on the slide: Opinionated
Sentence (Riloff et al. (2003), Riloff and Wiebe
(2003)), High Strength, Medium Strength, Neutral.]
8
Outline
  • Introduction
  • Opinions and Emotions in Text
  • Clues and Features
  • Subjectivity Clues
  • Organizing Clues into Features
  • Experiments
  • Strength Classification
  • Results
  • Conclusions

9
Private States and Subjective Expressions
  • Private state: covering term for opinions,
    emotions, sentiments, attitudes, speculations,
    etc. (Quirk et al., 1985)
  • Subjective expressions: words and phrases that
    express private states (Banfield, 1982)

"The US fears a spill-over," said Xirao-Nima.
"The report is full of absurdities," he
complained.
10
Corpus of Opinion Annotations
  • Multi-perspective Question Answering (MPQA)
    Corpus
  • Sponsored by NRRC ARDA
  • Released November, 2003
  • http://nrrc.mitre.org/NRRC/publications.htm
  • Detailed expression-level annotations of private
    states: strength
  • See Wilson and Wiebe (SIGdial 2003)

Freely Available
12
Clues from Previous Work
  • 29 sets of clues
  • Culled from manually developed resources
  • Learned from annotated/unannotated data
  • Words, phrases, extraction patterns

Examples
  • SINGLE WORDS: bizarre, hate, concern,
    applaud, foolish, vexing
  • PHRASES: long for, stir up, grin and bear it,
    on the other hand
  • EXTRACTION PATTERNS: expressed
    (condolences|hope), show of (support|goodwill)
13
Syntax Clues Generation
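Parse, then convert to dependency
[Figure: dependency tree for "I think people are
happy". think,VBP is the head of I,PRP (subj) and
are,VBP (obj); are,VBP is the head of people,NNS
(subj) and happy,JJ (pred). Each head points to its
modifiers.]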
14
Syntax Clues Generation
Dependency Parse Tree
5 Classes of Clues
  • 1. root
  • 2. leaf
  • 3. node
  • 4. all-kids
  • 5. bilex

[Figure: dependency parse tree with edges labeled
subj, obj, pred, mod]
Example: bilex(are,VBP,pred,happy,JJ)
Example: allkids(fallen,VBN,subj,Chavez,NNP,mod,has,VBZ)
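A minimal sketch of how the five clue classes could be generated from a dependency tree. The `Node` class and the `clues` function are hypothetical names for this illustration. The bilex and allkids formats follow the two examples on the slide; the readings of root, leaf, and node (the top word, a word without children, any word) are assumptions, since the slide only names those classes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word of a dependency tree (hypothetical structure for this sketch)."""
    word: str
    tag: str
    # Each child is a (relation, Node) pair, e.g. ("subj", Node("I", "PRP")).
    children: list = field(default_factory=list)


def clues(node, is_root=True):
    """Yield clue strings for every word in the tree rooted at `node`."""
    head = f"{node.word},{node.tag}"
    yield f"node({head})"
    if is_root:
        yield f"root({head})"
    if not node.children:
        yield f"leaf({head})"
    else:
        kids = [f"{rel},{kid.word},{kid.tag}" for rel, kid in node.children]
        # One bilex clue per head/child pair, one allkids clue per head.
        for kid in kids:
            yield f"bilex({head},{kid})"
        yield f"allkids({head},{','.join(kids)})"
    for _, kid in node.children:
        yield from clues(kid, is_root=False)


# "I think people are happy", as on the previous slide.
are = Node("are", "VBP", [("subj", Node("people", "NNS")),
                          ("pred", Node("happy", "JJ"))])
tree = Node("think", "VBP", [("subj", Node("I", "PRP")), ("obj", are)])
sentence_clues = list(clues(tree))

# The "fallen" fragment from the allkids example, treated as a non-root subtree.
fallen = Node("fallen", "VBN", [("subj", Node("Chavez", "NNP")),
                                ("mod", Node("has", "VBZ"))])
fallen_clues = list(clues(fallen, is_root=False))
```

Both slide examples appear in the output: `bilex(are,VBP,pred,happy,JJ)` in `sentence_clues` and `allkids(fallen,VBN,subj,Chavez,NNP,mod,has,VBZ)` in `fallen_clues`.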
15
Syntax Clues Selection
[Flowchart: selecting syntax clues and assigning
reliability levels]
Decision boxes (each with YES and NO branches):
  • 70% of instances in subjective expressions in
    training data?
  • Frequency of at least 5?
  • Any instances in AUTOGEN Corpus?
  • 80% of instances in subjective sentences?
Outcomes: Highly Reliable, Somewhat Reliable,
Not Very Reliable, Discard
Parameters chosen on tuning set
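One plausible reading of this flowchart, as a sketch only. The transcript keeps the boxes but not the arrows, so the order of the tests and the branch each outcome hangs on are assumptions, as are the comparison operators and the reading of 70 and 80 as percentages. The function name `reliability` and its parameters are hypothetical.

```python
def reliability(frac_in_subj_expr, frequency, autogen_frac_in_subj_sent=None,
                min_frac=0.70, min_freq=5, min_autogen_frac=0.80):
    """Return a reliability level for one syntax clue, or None to discard it.

    frac_in_subj_expr: share of the clue's training instances that fall
        inside subjective expressions.
    frequency: number of training instances of the clue.
    autogen_frac_in_subj_sent: share of the clue's AUTOGEN instances found in
        subjective sentences, or None if the clue never occurs in AUTOGEN.
    """
    if frac_in_subj_expr < min_frac:
        return None  # assumed: fails the first test, discard
    if frequency >= min_freq:
        return "highly reliable"
    if autogen_frac_in_subj_sent is None:
        return "not very reliable"  # assumed: rare and unseen in AUTOGEN
    if autogen_frac_in_subj_sent >= min_autogen_frac:
        return "somewhat reliable"
    return None  # assumed: rare and not subjective enough in AUTOGEN


levels = [
    reliability(0.90, 12),
    reliability(0.90, 2),
    reliability(0.90, 2, autogen_frac_in_subj_sent=0.85),
    reliability(0.40, 50),
]
```

The thresholds are keyword arguments because the slide notes that the parameters were chosen on a tuning set.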
16
Syntax Clues
  • 15 sets of clues
  • 5 classes
  • root, leaf, node, bilex, allkids
  • 3 reliability levels
  • highly reliable, somewhat reliable, not very
    reliable

17
Organizing Clues into Features
SET1: believe, happy, sad, think, ...
SET2: although, because, however, ...
...
SET44: certainly, unlikely, maybe, ...
S1: I think people are happy because Chavez has
fallen
Input to Machine Learning Algorithm
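A minimal sketch of turning clue sets into features for sentence S1. It assumes each feature is the number of tokens in the sentence that belong to the corresponding set; the slide does not state how feature values are computed. The sets contain only the words shown on the slide, and `set_features` is a hypothetical helper.

```python
import re

# Clue sets copied from the slide; the real sets are larger.
TYPE_SETS = {
    "SET1": {"believe", "happy", "sad", "think"},
    "SET2": {"although", "because", "however"},
    "SET44": {"certainly", "unlikely", "maybe"},
}


def set_features(sentence, clue_sets):
    """Count, for each clue set, how many tokens of the sentence belong to it."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return {name: sum(tok in members for tok in tokens)
            for name, members in clue_sets.items()}


s1 = "I think people are happy because Chavez has fallen"
features = set_features(s1, TYPE_SETS)
print(features)  # {'SET1': 2, 'SET2': 1, 'SET44': 0}
```

This sketch only matches single words. Phrase, extraction-pattern, and syntax clues would need their own matchers.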
18
Organizing Clues by Strength
Training Data
NEUTRAL_SET: however, ...
LOW_SET: because, maybe, think, unlikely, ...
MEDIUM_SET: believe, certainly, happy, sad, ...
HIGH_SET: condemn, hate, tremendous, ...
S1: I think people are happy because Chavez has
fallen
Input to Machine Learning Algorithm
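A sketch of one way clues could be sorted into strength sets from training data. The rule used here, assigning each clue to the strength label it most often occurs with, is an assumption; the slide says only that the sets come from the training data. The observations below are invented for illustration and are not corpus data, and `organize_by_strength` is a hypothetical name.

```python
from collections import Counter, defaultdict


def organize_by_strength(observations):
    """Group clues by the strength label they most often occur with.

    observations: iterable of (clue, strength) pairs taken from annotated
    training data.
    """
    counts = defaultdict(Counter)
    for clue, strength in observations:
        counts[clue][strength] += 1
    strength_sets = defaultdict(set)
    for clue, by_strength in counts.items():
        most_common_strength = by_strength.most_common(1)[0][0]
        strength_sets[most_common_strength].add(clue)
    return dict(strength_sets)


# Invented toy observations, chosen to agree with the sets on the slide.
toy = [("think", "low"), ("think", "low"), ("think", "medium"),
       ("happy", "medium"), ("hate", "high"), ("however", "neutral")]
strength_sets = organize_by_strength(toy)
```

Counting how many clues from each resulting set occur in a sentence would then give the four STRENGTH features.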
19
Clues and Features Summary
  • Many Types/Sets of Subjectivity Clues
  • 29 from previous work
  • 15 new syntax clues
  • TYPE features correspond to type sets
  • 44 features
  • STRENGTH features correspond to strength sets
  • 4 features (neutral, low, medium, high)

21
Approaches to Strength Classification
  • Target Classes: neutral, low, medium, high

Classification
  • Boosting
  • BoosTexter (Schapire and Singer, 2000)
  • AdaBoost.MH
  • 1000 rounds of boosting
Regression
  • Support Vector Regression
  • SVMlight (Joachims, 1999)
  • Discretize output into ordinal strength
    classes
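A minimal sketch of the discretization step for the regression output. It assumes the classes are coded 0 (neutral) to 3 (high) and that a real-valued prediction is rounded to the nearest code and clamped to that range; the slide does not give the coding or the cut points.

```python
import math

LABELS = ["neutral", "low", "medium", "high"]  # assumed ordinal coding 0..3


def discretize(value):
    """Map a real-valued regression output to the nearest ordinal class."""
    nearest = math.floor(value + 0.5)              # round half up
    index = min(max(nearest, 0), len(LABELS) - 1)  # clamp to valid codes
    return LABELS[index]


predictions = [discretize(v) for v in (-0.7, 0.4, 1.5, 2.2, 3.9)]
print(predictions)  # ['neutral', 'neutral', 'medium', 'medium', 'high']
```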

22
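Approaches to Strength Classification: Evaluation
  • Target Classes: neutral, low, medium, high

Classification / Regression
$\text{Accuracy} = \frac{\text{total correct}}{N}$
$\text{Mean-Squared Error} = \frac{1}{N} \sum_{i=1}^{N} (t_i - p_i)^2$
where $t_i$ is the true strength and $p_i$ the
predicted strength of item $i$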
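The two evaluation measures named on this slide can be sketched as follows. The numeric coding of the classes (neutral 0 through high 3) is an assumption, and the gold and predicted labels are invented to show the arithmetic.

```python
STRENGTH = {"neutral": 0, "low": 1, "medium": 2, "high": 3}  # assumed coding


def accuracy(gold, predicted):
    """Fraction of items whose predicted class equals the gold class."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def mean_squared_error(gold, predicted):
    """Average squared distance between gold and predicted class codes."""
    squared = [(STRENGTH[g] - STRENGTH[p]) ** 2 for g, p in zip(gold, predicted)]
    return sum(squared) / len(gold)


# Invented labels, for illustration only.
gold = ["neutral", "low", "medium", "high"]
pred = ["neutral", "medium", "medium", "neutral"]
print(accuracy(gold, pred))            # 0.5
print(mean_squared_error(gold, pred))  # 2.5
```

Accuracy treats every mistake alike, while mean-squared error penalizes the high-versus-neutral confusion (distance 3) nine times as heavily as the low-versus-medium one (distance 1).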
23
Units of Classification
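[Figure: dependency tree for "I think people are
happy because Chavez has fallen", with nested
clauses marked Level 1, Level 2, and Level 3 and
the labels Train and Test. Tree rows from the top:
think,VBP; I,PRP and are,VBP; happy,JJ, people,NNS,
because,IN, and fallen,VBN; Chavez,NNP and has,VBZ.]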
24
Gold-standard Classes
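[Figure: dependency tree (think,VBP; I,PRP; are,VBP;
happy,JJ; people,NNS; because,IN; fallen,VBN;
Chavez,NNP; has,VBZ) annotated with gold-standard
classes.]
Level 1: medium
Level 2: medium
Level 3: neutral
Additional strength labels on the slide: low, medium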
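A sketch of how clause levels and gold-standard classes could be derived from the dependency tree. Three things are assumptions, not statements from the slides: a clause is taken to be the subtree under a verb that has children, the gold class of a clause is the strongest annotated strength inside it, and the stray "low" and "medium" labels belong to "think" and "happy". The tuple encoding of the tree is hypothetical.

```python
ORDER = ["neutral", "low", "medium", "high"]

# (word, tag, annotated strength or None, children): hypothetical encoding.
tree = ("think", "VBP", "low", [
    ("I", "PRP", None, []),
    ("are", "VBP", None, [
        ("people", "NNS", None, []),
        ("happy", "JJ", "medium", []),
        ("because", "IN", None, []),
        ("fallen", "VBN", None, [
            ("Chavez", "NNP", None, []),
            ("has", "VBZ", None, []),
        ]),
    ]),
])


def strongest(node):
    """Return the strongest annotated strength anywhere in the subtree."""
    _, _, strength, children = node
    best = strength or "neutral"
    for child in children:
        best = max(best, strongest(child), key=ORDER.index)
    return best


def clauses(node, level=1):
    """Yield (level, head word, gold class) for each verb-headed subtree."""
    word, tag, _, children = node
    if tag.startswith("VB") and children:
        yield level, word, strongest(node)
        level += 1  # clauses below this verb are one level deeper
    for child in children:
        yield from clauses(child, level)


gold_classes = list(clauses(tree))
print(gold_classes)
# [(1, 'think', 'medium'), (2, 'are', 'medium'), (3, 'fallen', 'neutral')]
```

Under these assumptions the sketch reproduces the classes shown on the slide: medium at Level 1, medium at Level 2, neutral at Level 3.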
25
Overview of Results
  • 10-fold cross validation over 9313 sentences
  • Bag-of-words (BAG)
  • Best results: All Clues + Bag-of-words
  • Boosting
  • MSE: 48% to 60% improvement over baseline
  • Accuracy: 23% to 79% improvement over baseline
  • Support Vector Regression
  • MSE: 57% to 64% improvement over baseline
  • Baseline: most frequent class
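A sketch of the baseline and of how an improvement over it could be expressed. It assumes the percentages are relative improvements, that is, the gain divided by the baseline score; the slide does not define them. The labels and numbers in the example are invented and are not the reported results.

```python
from collections import Counter


def most_frequent_class(train_labels):
    """Baseline classifier: always predict the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]


def improvement(baseline, system, lower_is_better):
    """Relative improvement over the baseline, in percent (assumed definition)."""
    gain = baseline - system if lower_is_better else system - baseline
    return 100.0 * gain / baseline


# Invented values, for illustration only.
baseline_label = most_frequent_class(["neutral", "low", "neutral", "high"])
mse_gain = improvement(baseline=2.0, system=1.0, lower_is_better=True)
acc_gain = improvement(baseline=40.0, system=50.0, lower_is_better=False)
print(baseline_label, mse_gain, acc_gain)  # neutral 50.0 25.0
```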

26
Results: Mean-Squared Error
[Chart: mean-squared error by clause level for SVM
and Boosting. Series: BAG (Bag-of-words), TYPE
Features, STRENGTH Features, STRENGTH + BAG.]
BASELINE: 1.9 to 2.5
Improvements over BASELINE: SVM 57% to 64%,
Boosting 48% to 60%
27
Results: Accuracy
[Chart: accuracy by clause level for SVM and
Boosting. Series: BAG (Bag-of-words), TYPE
Features, STRENGTH Features, STRENGTH + BAG.]
BASELINE: 30.8% to 48.3%
Improvements over BASELINE: SVM 57% (clause
level 1), Boosting 23% to 79%
28
Removing Syntax Clues: MSE
[Chart: MSE by clause level for SVM and Boosting,
comparing All Clues with MINUS Syntax Clues.]
29
Removing Syntax Clues: Accuracy
[Chart: accuracy by clause level for SVM and
Boosting, comparing All Clues with MINUS Syntax
Clues.]
30
Related Work
  • Types of attitude: Gordon et al. (2003), Liu
    et al. (2003)
  • Tracking sentiment timelines: Tong (2001)
  • Positive/negative language: Pang et al. (2002),
    Morinaga et al. (2002), Turney and Littman
    (2003), Yu and Hatzivassiloglou (2003), Dave et
    al. (2003), Nasukawa and Yi (2003), Hu and Liu
    (2004)
  • Public sentiment in message boards and stock
    prices: Das and Chen (2001)

31
Conclusions
  • Promising results
  • MSE under 0.80 for sentences
  • MSE near 1 for embedded clauses
  • Embedded clauses more difficult
  • less information
  • Wide range of features produces best results
  • syntax clues
  • Organizing features by strength is useful

32
Thank you!
  • MPQA Corpus
  • http://nrrc.mitre.org/NRRC/publications.htm