Finding strong and weak opinion clauses


1
Finding strong and weak opinion clauses
Just how mad are you?

Theresa Wilson, Janyce Wiebe, Rebecca Hwa
University of Pittsburgh

AAAI-2004
2
Problem and Motivation

Problem: Opinion Extraction
Automatically identify and extract attitudes, opinions, sentiments in text
Applications:
Information Extraction, Summarization, Question Answering, Flame Detection, etc.
Focus:
Individual clauses and strength

3
Motivating Example
"I think people are happy because Chavez has fallen. But there's also a feeling of uncertainty about how the country's obvious problems are going to be solved," said Ms. Ledesma.
4
Motivating Example
Though some of them did not conceal their criticisms of Hugo Chavez, the member countries of the Organization of American States condemned the coup and recognized the legitimacy of the elected president.
Strength labels shown on the slide: low strength, medium strength, high strength
5
Our Goal

Identify opinions below sentence level
Characterize strength of opinions

6
Our Approach

Identify embedded sentential clauses
Dependency tree representation
Supervised learning to classify strength of clauses
Strength scale, from NO OPINION to VERY STRONG: neutral, low, medium, high
Significant improvements over baseline
Mean-squared error < 1.0

7
Our Approach
I am furious that my landlord refused to return my security deposit until I sued them.
[Figure: dependency tree of the sentence, rooted at "am", with nested clauses headed by "refused" and "sued". Clause labels on the slide: High Strength, Medium Strength, Neutral.]
Opinionated Sentence (Riloff et al. (2003), Riloff and Wiebe (2003))
8
Outline

Introduction
Opinions and Emotions in Text
Clues and Features
Subjectivity Clues
Organizing Clues into Features
Experiments
Strength Classification
Results
Conclusions

9
Private States and Subjective Expressions

Private state: covering term for opinions, emotions, sentiments, attitudes, speculations, etc. (Quirk et al., 1985)
Subjective expressions: words and phrases that express private states (Banfield, 1982)

"The US fears a spill-over," said Xirao-Nima.
"The report is full of absurdities," he complained.
10
Corpus of Opinion Annotations

Multi-perspective Question Answering (MPQA) Corpus
Sponsored by NRRC ARDA
Released November, 2003
http://nrrc.mitre.org/NRRC/publications.htm
Freely Available
Detailed expression-level annotations of private states: strength
See Wilson and Wiebe (SIGdial 2003)
11
Outline

Introduction
Opinions and Emotions in Text
Clues and Features
Subjectivity Clues
Organizing Clues into Features
Experiments
Strength Classification
Results
Conclusions

12
Clues from Previous Work

29 sets of clues
Culled from manually developed resources
Learned from annotated/unannotated data
Words, phrases, extraction patterns

Examples
SINGLE WORDS: bizarre, hate, concern, applaud, foolish, vexing
PHRASES: long for, stir up, grin and bear it, on the other hand
EXTRACTION PATTERNS: expressed (condolences|hope), show of (support|goodwill)
13
Syntax Clues Generation
Parse, then convert to dependency
[Figure: dependency tree over the nodes think,VBP (head), I,PRP, are,VBP, people,NNS and happy,JJ. Edge labels on the slide: subj, obj, pred, modifiers.]
14
Syntax Clues Generation
Dependency Parse Tree
5 Classes of Clues

1. root
2. leaf
3. node
4. all-kids
5. bilex

[Figure: dependency parse tree with edges labeled subj, obj, pred, mod]
Example: bilex(are,VBP,pred,happy,JJ)
Example: allkids(fallen,VBN,subj,Chavez,NNP,mod,has,VBZ)
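The two examples on this slide can be reproduced with a small sketch. The `Node` class and the helper names are hypothetical and only mirror the string format shown above; how the original system represented its trees is not stated in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One word in a dependency tree (illustrative structure, not the authors')."""
    word: str
    tag: str
    rel: str = ""  # relation to the parent; empty for the root
    kids: list = field(default_factory=list)


def bilex(parent, child):
    """Clue for one parent-child pair: parent word/tag, relation, child word/tag."""
    return f"bilex({parent.word},{parent.tag},{child.rel},{child.word},{child.tag})"


def allkids(parent):
    """Clue for a word together with all of its children, in order."""
    parts = [parent.word, parent.tag]
    for kid in parent.kids:
        parts += [kid.rel, kid.word, kid.tag]
    return "allkids(" + ",".join(parts) + ")"


# Fragments of "I think people are happy because Chavez has fallen".
are = Node("are", "VBP", kids=[Node("people", "NNS", "subj"),
                               Node("happy", "JJ", "pred")])
fallen = Node("fallen", "VBN", kids=[Node("Chavez", "NNP", "subj"),
                                     Node("has", "VBZ", "mod")])

print(bilex(are, are.kids[1]))
print(allkids(fallen))
```

The root, leaf and node classes would follow the same pattern with only the word and tag of a single tree node, which is again an assumption about their format.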
15
Syntax Clues Selection
1. At least 70% of instances in subjective expressions in training data?
   NO: Discard
   YES: go to 2
2. Frequency ≥ 5?
   YES: Highly Reliable
   NO: go to 3
3. Any instances in AUTOGEN Corpus?
   NO: Not Very Reliable
   YES: go to 4
4. At least 80% of instances in subjective sentences?
   YES: Somewhat Reliable
   NO: Discard
Parameters chosen on tuning set
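The selection flowchart can be read as a short decision function. This is a sketch under stated assumptions: the 70 and 80 are taken to be percentages, the comparisons are taken to be "at least", and the function and parameter names are made up for illustration.

```python
def reliability(frac_subj_train, freq, autogen_count=0, frac_subj_autogen=0.0):
    """Assign a candidate syntax clue to a reliability level.

    frac_subj_train   : share of training instances inside subjective expressions
    freq              : number of instances in the training data
    autogen_count     : number of instances in the AUTOGEN corpus
    frac_subj_autogen : share of AUTOGEN instances inside subjective sentences
    Thresholds (0.70, 5, 0.80) follow the slide; ">=" is an assumption.
    """
    if frac_subj_train < 0.70:
        return "discard"
    if freq >= 5:
        return "highly reliable"
    if autogen_count == 0:
        return "not very reliable"
    if frac_subj_autogen >= 0.80:
        return "somewhat reliable"
    return "discard"


print(reliability(0.9, 12))          # frequent and mostly subjective
print(reliability(0.9, 2))           # rare, unseen in AUTOGEN
print(reliability(0.9, 2, 40, 0.85)) # rare, but confirmed by AUTOGEN
```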
16
Syntax Clues

15 sets of clues
5 classes: root, leaf, node, bilex, allkids
3 reliability levels: highly reliable, somewhat reliable, not very reliable

17
Organizing Clues into Features
SET1: believe, happy, sad, think, ...
SET2: although, because, however, ...
...
SET44: certainly, unlikely, maybe, ...
S1: I think people are happy because Chavez has fallen
Input to Machine Learning Algorithm
18
Organizing Clues by Strength
Training Data
NEUTRAL_SET: however, ...
LOW_SET: because, maybe, think, unlikely, ...
MEDIUM_SET: believe, certainly, happy, sad, ...
HIGH_SET: condemn, hate, tremendous, ...
S1: I think people are happy because Chavez has fallen
Input to Machine Learning Algorithm
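A minimal sketch of how S1 could be turned into four STRENGTH features. The sets contain only the example words from the slide, and treating each feature value as a count of matching clue instances is an assumption, as are the names `STRENGTH_SETS` and `strength_features`.

```python
# Example clue words from the slide; the real sets are much larger.
STRENGTH_SETS = {
    "neutral": {"however"},
    "low": {"because", "maybe", "think", "unlikely"},
    "medium": {"believe", "certainly", "happy", "sad"},
    "high": {"condemn", "hate", "tremendous"},
}


def strength_features(sentence):
    """Count, for each strength set, how many tokens of the sentence it contains."""
    tokens = sentence.lower().split()
    return {name: sum(tok in clues for tok in tokens)
            for name, clues in STRENGTH_SETS.items()}


s1 = "I think people are happy because Chavez has fallen"
print(strength_features(s1))
```

For S1 this matches "think" and "because" in the low set and "happy" in the medium set. TYPE features would be built the same way over the 44 type sets.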
19
Clues and Features Summary

Many Types/Sets of Subjectivity Clues
29 from previous work
15 new syntax clues
TYPE features correspond to type sets
44 features
STRENGTH features correspond to strength sets
4 features (neutral, low, medium, high)

20
Outline

Introduction
Opinions and Emotions in Text
Clues and Features
Subjectivity Clues
Organizing Clues into Features
Experiments
Strength Classification
Results
Conclusions

21
Approaches to Strength Classification

Target classes: neutral, low, medium, high

Classification: Boosting
BoosTexter (Schapire and Singer, 2000)
AdaBoost.MH
1000 rounds of boosting

Regression: Support Vector Regression
SVMlight (Joachims, 1999)
Discretize output into ordinal strength classes
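The slide does not say how the regression output is discretized. One plausible reading, shown here purely as an assumption, maps the classes to 0 through 3, rounds the real-valued prediction to the nearest integer, and clamps it to the valid range.

```python
import math

# Assumed numeric encoding: neutral=0, low=1, medium=2, high=3.
CLASSES = ["neutral", "low", "medium", "high"]


def discretize(score):
    """Map a real-valued regression output to the nearest ordinal strength class."""
    idx = math.floor(score + 0.5)              # round half up
    idx = max(0, min(len(CLASSES) - 1, idx))   # clamp to the valid class range
    return CLASSES[idx]


for score in (-0.3, 0.4, 1.5, 2.49, 7.2):
    print(score, discretize(score))
```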

22
Approaches to Strength Classification Evaluation

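Target classes: neutral, low, medium, high

Classification: Accuracy = (total correct) / N
Regression: Mean-Squared Error = (1/N) Σ_i (t_i - p_i)^2, where t_i is the true strength class and p_i the predicted class of instance i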
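Both evaluation measures are easy to state in code. The sketch below uses the standard definitions, with strength classes encoded as 0 (neutral) through 3 (high); that numeric encoding and the toy labels are assumptions for illustration.

```python
def accuracy(gold, pred):
    """Fraction of instances whose predicted class equals the gold class."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


def mse(gold, pred):
    """Mean of squared differences between gold and predicted class indices."""
    return sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold)


gold = [0, 1, 2, 3]
pred = [0, 2, 2, 1]
print(accuracy(gold, pred))  # 2 of 4 correct
print(mse(gold, pred))       # (0 + 1 + 0 + 4) / 4
```

Unlike accuracy, mean-squared error penalizes a high clause predicted as low more than one predicted as medium, which suits ordinal classes.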
23
Units of Classification
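[Figure: dependency tree of "I think people are happy because Chavez has fallen", divided into clause levels, with Train and Test marked per level]
Level 1: think,VBP; I,PRP; are,VBP
Level 2: happy,JJ; people,NNS; because,IN; fallen,VBN
Level 3: Chavez,NNP; has,VBZ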
24
Gold-standard Classes
Level 1: medium
Level 2: medium
Level 3: neutral
[Figure: the same dependency tree (think,VBP; I,PRP; are,VBP; happy,JJ; people,NNS; because,IN; fallen,VBN; Chavez,NNP; has,VBZ), with expression-level strength annotations low and medium]
25
Overview of Results

10-fold cross validation over 9313 sentences
Bag-of-words (BAG)
Best results: All Clues + Bag-of-words
Boosting
MSE: 48% to 60% improvement over baseline
Accuracy: 23% to 79% improvement over baseline
Support Vector Regression
MSE: 57% to 64% improvement over baseline
Baseline: most frequent class
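A sketch of the most-frequent-class baseline and of an improvement figure. Reading "improvement over baseline" as the relative reduction in MSE is an assumption, and the labels below are toy data, not numbers from the experiments.

```python
from collections import Counter


def most_frequent_class(train_labels):
    """Baseline prediction: the class seen most often in the training labels."""
    return Counter(train_labels).most_common(1)[0][0]


def mse(gold, pred):
    """Mean of squared differences between gold and predicted class indices."""
    return sum((g - p) ** 2 for g, p in zip(gold, pred)) / len(gold)


def relative_improvement(baseline_mse, model_mse):
    """Percent reduction in MSE relative to the baseline (assumed definition)."""
    return 100.0 * (baseline_mse - model_mse) / baseline_mse


# Toy labels with neutral=0 ... high=3.
gold = [0, 0, 1, 3]
majority = most_frequent_class(gold)
baseline_mse = mse(gold, [majority] * len(gold))

print(majority, baseline_mse)
print(relative_improvement(baseline_mse, 1.0))
```

For accuracy, where higher is better, the corresponding figure would be the relative increase over the baseline accuracy.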

26
Results Mean-Squared Error
[Chart: mean-squared error by clause level for SVM and Boosting. Feature sets compared: STRENGTH + BAG, STRENGTH Features, TYPE Features, BAG (Bag-of-words)]
Improvements over BASELINE: SVM 57% to 64%, Boosting 48% to 60%
BASELINE: 1.9 to 2.5
27
Results Accuracy
[Chart: accuracy by clause level for SVM and Boosting. Feature sets compared: STRENGTH + BAG, STRENGTH Features, TYPE Features, BAG (Bag-of-words)]
Improvements over BASELINE: SVM 57% (clause level 1), Boosting 23% to 79%
BASELINE: 30.8% to 48.3%
28
Removing Syntax Clues MSE
[Chart: MSE by clause level for SVM and Boosting, All Clues vs. MINUS Syntax Clues]
29
Removing Syntax Clues Accuracy
[Chart: accuracy by clause level for SVM and Boosting, All Clues vs. MINUS Syntax Clues]
30
Related Work

Types of attitude: Gordon et al. (2003), Liu et al. (2003)
Tracking sentiment timelines: Tong (2001)
Positive/negative language: Pang et al. (2002), Morinaga et al. (2002), Turney and Littman (2003), Yu and Hatzivassiloglou (2003), Dave et al. (2003), Nasukawa and Yi (2003), Hu and Liu (2004)
Public sentiment in message boards and stock prices: Das and Chen (2001)

31
Conclusions