1. Multi-Document Summarization of Evaluative Text
- Giuseppe Carenini, Raymond T. Ng, Adam Pauls
- Computer Science Dept.
- University of British Columbia
- Vancouver, CANADA
3. Motivation and Focus
- Large amounts of information expressed in text form are constantly produced
  - News, reports, reviews, blogs, emails
- Pressing need to summarize
- Considerable prior work, but mostly limited to factual information
4. Our Focus
- Evaluative documents (good vs. bad, right vs. wrong) about a single entity
  - Customer reviews (e.g. Amazon.com)
  - Travel logs about a destination
  - Teaching evaluations
  - User studies (!)
  - ...
5. Our Focus
- Review: "The Canon G3 is a great camera..."
- Review: "Though great, the G3 has bad menus..."
- Review: "I love the Canon G3! It..."
- Summary: "Most users liked the Canon G3. Even though some did not like the menus, many..."
6. Two Approaches
- Automatic summarizers generally produce two types of summaries
  - Extracts: a representative subset of text from the original corpus
  - Abstracts: generated text which contains the most relevant info from the original corpus
7. Two Approaches (cont'd)
- Extract-based summarizers generally fare better for factual summarization (cf. DUC 2005)
- But extracts aren't well suited to capturing evaluative info
  - Can't express the distribution of opinions (some/all)
  - Can't aggregate opinions either numerically or conceptually
- So we tried both
8. Two Approaches (cont'd)
- Extract-based approach (MEAD)
  - Based on the MEAD (Radev et al., 2003) framework for summarization
  - Augmented with knowledge of evaluative info (I'll explain later)
- Abstract-based approach (SEA)
  - Based on the GEA (Carenini & Moore, 2001) framework for generating evaluative arguments about an entity
9. Pipeline Approach (for both)
(Pipeline diagram: the first stage, extraction of evaluative info, is shared by both approaches; it feeds the Organization stage.)
10. Extracting evaluative info
- We adopt the previous work of Hu & Liu (2004) (but many others exist...)
- Their approach extracts
  - What features of the entity are evaluated
  - The strength and polarity of the evaluation on the -3..3 interval
- The approach is (mostly) unsupervised
11. Examples
- "the menus are easy to navigate and the buttons are easy to use. it is a fantastic camera"
- "the canon computer software used to download, sort, ... is very easy to use. the only two minor issues i have with the camera are the lens cap (it is not very snug and can come off too easily)..."
12. Feature Discovery
(Same example sentences, with the evaluated features identified: menus, buttons, camera, software, lens cap.)
- "the menus are easy to navigate and the buttons are easy to use. it is a fantastic camera"
- "the canon computer software used to download, sort, ... is very easy to use. the only two minor issues i have with the camera are the lens cap (it is not very snug and can come off too easily)..."
13. Strength/Polarity Determination
- "the menus are easy to navigate (2) and the buttons are easy to use (2). it is a fantastic (3) camera"
- "the canon computer software used to download, sort, ... is very easy to use (3). the only two minor issues i have with the camera are the lens cap (it is not very snug (-2) and can come off too easily (-2))..."
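The annotated sentences above suggest a simple representation for the extraction output. A minimal, hypothetical sketch (class and field names are mine, not from Hu & Liu 2004): each sentence is paired with the crude features it evaluates and a signed strength in the -3..3 range.

```python
# Illustrative container for the extraction output: every sentence carries
# the crude features it evaluates, each with a signed strength/polarity
# in -3..3. Names are hypothetical, not from Hu & Liu (2004).
from dataclasses import dataclass, field

@dataclass
class EvaluatedSentence:
    text: str
    evaluations: dict[str, int] = field(default_factory=dict)  # feature -> strength

s1 = EvaluatedSentence(
    "the menus are easy to navigate and the buttons are easy to use.",
    {"menus": 2, "buttons": 2},
)
s2 = EvaluatedSentence("it is a fantastic camera", {"camera": 3})
```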
14. Pipeline Approach (for both)
(Pipeline diagram, revisited: extraction is shared; the Organization stage is partially shared.)
15. Organizing Extracted Info
- Extraction provides a bag of features
- But
  - features are redundant
  - features may range from concrete and specific (e.g. resolution) to abstract and general (e.g. image)
- Solution: map features to a hierarchy (Carenini, Ng & Zwart, 2005)
16. Feature Ontology
(Hierarchy diagram: crude features such as "canon", "canon g3", and "digital camera" map to the root node Canon G3 Digital Camera, whose children include User Interface and Convenience; leaf features such as Menus, Buttons, Lever, Battery, Battery Life, and Battery Charging System collect the extracted strength/polarity evaluations, e.g. 2, 2, 2, 3 and -1, -1, -2.)
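A minimal sketch of how such a hierarchy might be represented (class and field names are illustrative assumptions, not the paper's implementation): each node collects the strength/polarity evaluations of the crude features mapped to it, and its children refine it into more specific features.

```python
# Minimal sketch of a feature-hierarchy node (hypothetical names).
from dataclasses import dataclass, field

@dataclass
class FeatureNode:
    name: str
    evaluations: list[int] = field(default_factory=list)    # extracted strengths, -3..3
    children: list["FeatureNode"] = field(default_factory=list)

menus = FeatureNode("Menus", evaluations=[2, 2, 2, 3])
ui = FeatureNode("User Interface",
                 children=[menus, FeatureNode("Buttons"), FeatureNode("Lever")])
root = FeatureNode("Canon G3 Digital Camera",
                   children=[ui, FeatureNode("Convenience")])
```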
17. Organization: SEA vs. MEAD
- SEA operates only on the hierarchical data and forgets about the raw extracted features
- MEAD operates on the raw extracted features and only uses the hierarchy for sentence ordering (I'll come back to this)
18. Pipeline Approach (for both)
(Pipeline diagram, revisited: extraction is shared, Organization is partially shared, and feature selection is not shared.)
19. Feature Selection: SEA
(Diagram: a measure of importance, computed from the polarity/strength evaluations ps_k of each node of the feature hierarchy and of its children, is used to select the most important features, e.g. Canon G3 Digital Camera, User Interface, Convenience.)
20. Selection Procedure
- Straightforward greedy selection would not work
  - if a node derives most of its importance from its child(ren), including both the node and the child(ren) would be redundant
- Similar to the redundancy-reduction step in many automatic summarization algorithms (a rough sketch follows below)
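A sketch of a redundancy-aware greedy selection in the spirit of this slide, reusing the FeatureNode sketch above. The importance score here (sum of absolute strengths at a node plus its descendants) is an assumption, not SEA's exact measure; the key point is that once a node is selected, its direct evaluations stop counting toward its relatives, so the procedure avoids picking both a node and the child it draws its importance from.

```python
# Sketch of redundancy-aware greedy feature selection over FeatureNode objects.
# The scoring function is an assumption (sum of |strength| at the node and its
# descendants), not the measure of importance used in SEA.

def importance(node, spent_ids):
    own = 0 if id(node) in spent_ids else sum(abs(s) for s in node.evaluations)
    return own + sum(importance(c, spent_ids) for c in node.children)

def select_features(root, k):
    # flatten the hierarchy into a candidate list
    nodes, stack = [], [root]
    while stack:
        n = stack.pop()
        nodes.append(n)
        stack.extend(n.children)

    selected, spent_ids = [], set()
    for _ in range(k):
        remaining = [n for n in nodes if id(n) not in spent_ids]
        if not remaining:
            break
        best = max(remaining, key=lambda n: importance(n, spent_ids))
        if importance(best, spent_ids) == 0:
            break
        selected.append(best)
        # its direct evaluations no longer count toward ancestors/descendants
        spent_ids.add(id(best))
    return selected
```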
21. Feature Selection: MEAD
- MEAD selects sentences, not features
- Calculate a score for each sentence s_i from the polarity/strength evaluations ps_k of the features it evaluates, feature(s_i)
  - e.g. "the menus are easy to navigate (2) and the buttons are easy to use (2)."
- Break ties with the MEAD centroid score (a common feature in multi-document summarization)
22. Feature Selection: MEAD
- We want to extract sentences for the most important features, and only one sentence per feature
- Put each sentence in a bucket for each feature in feature(s_i)
  - "I like the menus..." and "the menus are easy to navigate (2) and the buttons are easy to use (2)." both go into the bucket for "menus"
23. Feature Selection: MEAD
- Take the (single) highest-scoring sentence from the fullest buckets until the desired summary length is reached (sketched below)
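A sketch of this bucket heuristic, reusing the EvaluatedSentence sketch from the extraction slides. The per-sentence score (sum of absolute strengths of its evaluations, with an optional centroid score as tie-breaker) is a stand-in, since the slides do not spell out the exact formula.

```python
from collections import defaultdict

# Sketch of MEAD-style sentence selection for evaluative summaries:
# bucket sentences by the features they evaluate, then repeatedly take the
# single best sentence from the fullest remaining bucket.

def sentence_score(sentence):
    # stand-in score: strength of the opinions the sentence expresses
    return sum(abs(s) for s in sentence.evaluations.values())

def select_sentences(sentences, max_sentences, centroid_score=lambda s: 0.0):
    buckets = defaultdict(list)               # feature -> sentences evaluating it
    for s in sentences:
        for feat in s.evaluations:
            buckets[feat].append(s)

    chosen = []
    while buckets and len(chosen) < max_sentences:
        # fullest bucket = most frequently evaluated feature not yet covered
        feat = max(buckets, key=lambda f: len(buckets[f]))
        best = max(buckets.pop(feat),
                   key=lambda s: (sentence_score(s), centroid_score(s)))
        if best not in chosen:                # avoid adding the same sentence twice
            chosen.append(best)
    return chosen
```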
24. Pipeline Approach (for both)
(Pipeline diagram, revisited: extraction is shared, Organization is partially shared, and feature selection and presentation are not shared.)
25. Presentation: MEAD
- Display the selected sentences in order from most general (top of the feature hierarchy) to most specific
- That's it!
26. Presentation: SEA
- SEA (Summarizer of Evaluative Arguments) is based on GEA (Generator of Evaluative Arguments) (Carenini & Moore, 2001)
- GEA takes as input
  - a hierarchical model of features for an entity
  - objective values (good vs. bad) for each feature of the entity
- Adaptation is (in theory) straightforward
27. Possible GEA Output
- "The Canon G3 is a good camera. Although the interface is poor, the image quality is excellent."
28. Target SEA Summary
- "Most users thought the Canon G3 was a good camera. Although several users did not like the interface, almost all users liked the image quality."
29. Extra work
- What GEA gives us
  - High-level text plan (i.e. content selection and ordering)
  - Cue phrases for the argumentation strategy ("In fact", "Although", etc.)
- What GEA does not give us
  - Appropriate micro-planning (lexicalization)
  - A way to indicate the distribution of customer opinions
30. Microplanning (incomplete!)
- We generate one clause for each selected feature
- Each clause includes 3 key pieces of information
  - Distribution of customers who evaluated the feature (many, most, some, etc.)
  - Name of the feature (menus, image quality, etc.)
  - Aggregate of opinions (excellent, fair, poor, etc.)
- e.g. "most users found the menus to be poor"
31. Microplanning
- The distribution term is (roughly) based on the fraction of customers who evaluated the feature (and on their disagreement...)
- The name of the feature is straightforward
- The aggregate of opinions is based on a function similar in form to the measure of importance
  - average polarity/strength over all evaluations rather than summing
(A rough sketch of this step follows.)
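The sketch below illustrates the clause microplanner under stated assumptions: the thresholds and word scales are mine, not SEA's actual mappings. The fraction of reviewers who evaluated the feature picks the quantifier, and the mean strength of their evaluations picks the evaluative term.

```python
# Sketch of the clause microplanner. The thresholds and the word scales are
# illustrative assumptions, not SEA's actual mappings.

def quantifier(fraction):
    if fraction > 0.8:
        return "almost all users"
    if fraction > 0.5:
        return "most users"
    if fraction > 0.3:
        return "several users"
    return "some users"

def evaluation_term(mean_strength):
    scale = [(2.5, "excellent"), (1.5, "very good"), (0.5, "good"),
             (-0.5, "mediocre"), (-1.5, "poor")]
    for threshold, word in scale:
        if mean_strength >= threshold:
            return word
    return "very poor"

def realize_clause(feature, strengths, num_reviewers):
    frac = len(strengths) / num_reviewers        # distribution of customers
    mean = sum(strengths) / len(strengths)       # aggregate of opinions (averaged)
    return f"{quantifier(frac)} found the {feature} to be {evaluation_term(mean)}"

print(realize_clause("menus", [-2, -1, -1, -1], num_reviewers=6))
# -> most users found the menus to be poor
```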
32. Microplanning
- We glue clauses together using cue phrases from GEA
- We also perform basic aggregation
33. Formative Evaluation
- Goal: test users' perceived effectiveness
- Participants: 28 undergrad students
- Procedure
  - Pretend they worked for the manufacturer
  - Given 20 reviews (from either the Camera or the DVD corpus) and asked to write a summary (100 words) for the marketing dept
  - After 20 mins, given a summary of the 20 reviews
  - Asked to fill out a questionnaire assessing summary effectiveness (multiple choice and open form)
34. Formative Evaluation (cont'd)
- Conditions: each user was given one of 4 summaries
  - Topline summary (human)
  - Baseline summary (vanilla MEAD)
  - MEAD summary
  - SEA summary
35-43. Quantitative Results
- Responses on a scale from 1 (Strongly disagree) to 5 (Strongly agree)
(Charts of the questionnaire responses for the four summary conditions.)
44. Qualitative Results: MEAD
- Surprising: many participants didn't notice, or didn't mind, the verbatim text extraction
- Two major complaints about content
  1. The summary was not representative (a negative sentence was extracted even though the majority were positive)
  2. Evaluations of some features were repeated
- (2) could be addressed, but (1) can only partially be fixed with pure extraction
45. Qualitative Results: SEA
- Some complaints about the robotic feel of the summary, and about repetition / lack of pronouns
  - Need to do more complex microplanning
- Some wanted more details (which manual features...)
  - Note: this complaint was absent with MEAD
- Some disagreed with the feature selection (precision/recall), but this is a problem even with human summaries
46. Conclusions
- Extraction works surprisingly well even for evaluative summarization
  - Topline > MEAD ≈ SEA > Baseline
- Need to combine the strengths of SEA and MEAD for evaluative summarization
  - Need the detail, variety, and natural-sounding text provided by extraction
  - Need to generate opinion distributions
  - Need the argument structure from SEA (?)
47. Other Future Work
- Automatically induce the feature hierarchy
- Produce summaries tailored to the user's preferences about the evaluated entity
- Summarize corpora of evaluative documents about more than one entity
48. Examples
- MEAD: Bottom line , well made camera , easy to
use, very flexible and powerful features to
include the ability to use external flash and
lense / filters choices . It has a beautiful
design , lots of features, very easy to use ,
very configurable and customizable , and the
battery duration is amazing! Great colors ,
pictures and white balance. The camera is a dream
to operate in automode , but also gives
tremendous flexibility in aperture priority ,
shutter priority, and manual modes . I d highly
recommend this camera for anyone who is looking
for excellent quality pictures and a combination
of ease of use and the flexibility to get
advanced with many options to adjust if you like.
49. Examples
- SEA: Almost all users loved the Canon G3 possibly
because some users thought the physical
appearance was very good. Furthermore, several
users found the manual features and the special
features to be very good. Also, some users liked
the convenience because some users thought the
battery was excellent. Finally, some users found
the editing/viewing interface to be good despite
the fact that several customers really disliked
the viewfinder . However, there were some
negative evaluations. Some customers thought the
lens was poor even though some customers found
the optical zoom capability to be excellent.
- Most customers thought the quality of the images was very good.
50. Examples
- MEAD: I am a software engineer and am very keen
into technical details of everything i buy , i
spend around 3 months before buying the digital
camera and i must say , g3 worth every single
cent i spent on it . I do nt write many reviews
but i m compelled to do so with this camera . I
spent a lot of time comparing different cameras ,
and i realized that there is not such thing as
the best digital camera . I bought my canon g3
about a month ago and i have to say i am very
satisfied .
51. Examples
- Human: The Canon G3 was received exceedingly
well. Consumer reviews from novice photographers
to semi-professional all listed an impressive
number of attributes, they claim makes this
camera superior in the market. Customers are
pleased with the many features the camera offers,
and state that the camera is easy to use and
universally accessible. Picture quality, long
lasting battery life, size and style were all
highlighted in glowing reviews. One flaw in the
camera frequently mentioned was the lens which
partially obstructs the view through the view
finder, however most claimed it was only a minor
annoyance since they used the LCD screen.
52. Microplanning
- We glue clauses together using cue phrases from GEA
  - "Although", "however", etc. indicate opposing evidence
  - "Because", "in particular" indicate supporting evidence
  - "Furthermore" indicates elaboration
- We also perform basic aggregation (see the sketch below)
  - "most users found the menus to be poor" + "most users found the buttons to be poor" -> "most users found the menus and buttons to be poor"
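A minimal sketch of that aggregation step (the clause representation and joining rule are assumptions): clauses that share the same quantifier and the same evaluative term are merged by conjoining their feature names.

```python
from collections import defaultdict

# Sketch of basic aggregation: merge clauses that share the same quantifier
# and the same evaluative term by conjoining their feature names.
# The triple representation is an illustrative assumption.

def aggregate(clauses):
    # clauses: list of (quantifier, feature, evaluation) triples
    groups = defaultdict(list)
    for quant, feature, evaluation in clauses:
        groups[(quant, evaluation)].append(feature)
    return [f"{quant} found the {' and '.join(features)} to be {evaluation}"
            for (quant, evaluation), features in groups.items()]

print(aggregate([("most users", "menus", "poor"),
                 ("most users", "buttons", "poor")]))
# -> ['most users found the menus and buttons to be poor']
```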