MRI: Meaningful Interpretations of Collaborative Ratings

Mahashweta Das, Sihem Amer-Yahia, Cong Yu, Gautam Das. 37th International Conference on Very Large Data ...

Learn more at: https://www.vldb.org


1
MRI: Meaningful Interpretations of Collaborative Ratings

Mahashweta Das, Sihem Amer-Yahia, Cong Yu, Gautam Das

37th International Conference on Very Large Data Bases, 2011 @ Seattle
2
Roadmap

Introduction
Motivation
Problem: MRI
Sub-problem: DEM
Sub-problem: DIM
Data Model
Algorithms
Experiments
Quantitative
Qualitative
Conclusion and Future Work

7
Motivation

Examining reviews vs. trusting the overall aggregate rating
IMDb's demographic breakdown of ratings is not meaningful enough

8
MRI Problem

Examining reviews vs. trusting the overall aggregate rating
IMDb's demographic breakdown of ratings is not meaningful enough
Novel and powerful third option: Meaningful Rating Interpretation
Explain ratings by leveraging user and item attribute information

11
MRI Sub-problem

DEM: Meaningful Description Mining
Identify groups of reviewers who consistently share similar ratings on items

13
MRI Sub-problem

DIM: Meaningful Difference Mining
Identify groups of reviewers who consistently disagree on item ratings

16
Data Model

Collaborative rating site: <Set of Items, Set of Users, Ratings>
Rating tuple: <item attributes, user attributes, rating>
Group: set of ratings describable by a set of attribute values
Notion of group based on the data cube, from the OLAP literature for mining multidimensional data

ID  Title             Genre  Director          Name  Gender  Location  Rating
1   Titanic           Drama  James Cameron     Amy   Female  New York  8.5
2   Schindler's List  Drama  Steven Spielberg  John  Male    New York  7.0
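
The data model above can be illustrated with a minimal Python sketch. The field names and the `select_group` helper are hypothetical choices for illustration, not part of the original slides; the two rows are the ones from the table.

```python
# Each rating tuple: <item attributes, user attributes, rating>.
# Field names are assumptions made for this sketch.
ratings = [
    {"title": "Titanic", "genre": "Drama", "director": "James Cameron",
     "name": "Amy", "gender": "Female", "location": "New York", "rating": 8.5},
    {"title": "Schindler's List", "genre": "Drama", "director": "Steven Spielberg",
     "name": "John", "gender": "Male", "location": "New York", "rating": 7.0},
]


def select_group(ratings, **conditions):
    """Return the ratings that match every attribute=value condition.

    A group is the set of ratings describable by a set of attribute values.
    """
    return [r for r in ratings
            if all(r.get(attr) == value for attr, value in conditions.items())]


new_york = select_group(ratings, location="New York")
ny_female = select_group(ratings, location="New York", gender="Female")
print(len(new_york), len(ny_female))
```

Adding a condition can only shrink a group, which is what makes the lattice ordering on the following slides possible.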
18
Data Model

Notion of group based on data cube lattice

Each node in the lattice is a data cube (cuboid), i.e., a query condition on the database
Dimensions: A = Gender, B = Age, C = Location, D = Occupation
Figure: 4-dimensional data cube lattice
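
As a rough sketch of the lattice structure, the cuboids over the four dimensions can be enumerated as subsets of the dimension set. The function names below are hypothetical, and the code only models which dimensions a cuboid groups by, not the attribute values.

```python
from itertools import combinations

DIMENSIONS = ("Gender", "Age", "Location", "Occupation")  # A, B, C, D


def lattice_levels(dims):
    """Map each level (number of dimensions used) to the cuboids at that level."""
    return {k: list(combinations(dims, k)) for k in range(len(dims) + 1)}


def generalizations(cuboid):
    """Cuboids reached by dropping exactly one dimension (more general groups)."""
    return [tuple(d for d in cuboid if d != dropped) for dropped in cuboid]


levels = lattice_levels(DIMENSIONS)
total = sum(len(cuboids) for cuboids in levels.values())
print(total)  # number of nodes in the 4-dimensional lattice
```

With 4 dimensions there are 2^4 = 16 nodes, arranged in levels of size 1, 4, 6, 4, 1.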
19
Data Model
Each node (data cube/cuboid) in the lattice is a group
Selection query condition: A = Gender: Male, B = Age: Young, C = Location: CA, D = Occupation: Student
Figure: Partial rating lattice for a movie (M = Male, Y = Young, CA = California, S = Student)
21
Data Model
Task: Quickly identify good groups in the lattice that help users understand ratings effectively
Figure: Partial rating lattice for a movie (M = Male, Y = Young, CA = California, S = Student)
23
DEM: Meaningful Description Mining

For an input item covering $R_I$ ratings, return a set $C$ of cuboids such that the description error is minimized, subject to
$|C| \le k$
$\mathrm{coverage}(C, R_I) \ge \alpha$
Description Error: measures how well a cuboid's average rating approximates the numerical score of each individual rating belonging to it
Coverage: measures the percentage of ratings covered by the returned cuboids
DEM is NP-hard (proof details in the paper)
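
The two measures can be sketched in Python as follows. The slides give only the informal definitions, so the exact formula here is an assumption: the error of a rating is taken as its mean absolute deviation from the average of each cuboid that contains it, summed over all covered ratings. The `score` field and the sample data are hypothetical.

```python
from statistics import mean


def matches(cuboid, r):
    """A cuboid is modeled as a dict of attribute -> required value."""
    return all(r[attr] == value for attr, value in cuboid.items())


def coverage(cuboids, ratings):
    """Fraction of ratings that belong to at least one returned cuboid."""
    return sum(any(matches(c, r) for c in cuboids) for r in ratings) / len(ratings)


def description_error(cuboids, ratings):
    """Assumed formula: sum over ratings of the mean |score - cuboid average|."""
    averages = []
    for c in cuboids:
        scores = [r["score"] for r in ratings if matches(c, r)]
        averages.append(mean(scores) if scores else None)
    total = 0.0
    for r in ratings:
        devs = [abs(r["score"] - avg) for c, avg in zip(cuboids, averages)
                if avg is not None and matches(c, r)]
        if devs:
            total += mean(devs)
    return total


ratings = [
    {"gender": "M", "occupation": "Student", "score": 4},
    {"gender": "M", "occupation": "Student", "score": 5},
    {"gender": "F", "occupation": "Student", "score": 2},
    {"gender": "F", "occupation": "Engineer", "score": 3},
]
males = [{"gender": "M"}]
print(coverage(males, ratings), description_error(males, ratings))
```

Here the single cuboid <Male> covers half of the ratings, and its average of 4.5 is off by 0.5 for each of the two ratings it contains.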

24
DEM Algorithms

Exact Algorithm (E-DEM)
Brute force: enumerates all possible combinations of cuboids in the lattice to return the exact (i.e., optimal) set as rating descriptions
Random Restart Hill Climbing Algorithm
Often fails to satisfy the coverage constraint; a large number of restarts is required
Need an algorithm that handles both the coverage constraint and the description error objective simultaneously
Randomized Hill Exploration Algorithm (RHE-DEM)
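
The brute-force idea behind the exact algorithm can be sketched as below. This is an illustrative sketch under assumptions, not the authors' implementation: cuboids are tuples of (attribute, value) pairs, the lattice apex (the group of all ratings) is left out, and the error formula is the assumed mean-absolute-deviation form.

```python
from itertools import combinations
from statistics import mean


def matches(cuboid, r):
    return all(r[attr] == value for attr, value in cuboid)


def coverage(C, ratings):
    return sum(any(matches(c, r) for c in C) for r in ratings) / len(ratings)


def error(C, ratings):
    averages = [mean(r["score"] for r in ratings if matches(c, r)) for c in C]
    total = 0.0
    for r in ratings:
        devs = [abs(r["score"] - a) for c, a in zip(C, averages) if matches(c, r)]
        if devs:
            total += mean(devs)
    return total


def candidate_cuboids(ratings, attrs):
    """Every non-empty cuboid that contains at least one rating."""
    found = set()
    for r in ratings:
        for n in range(1, len(attrs) + 1):
            for subset in combinations(attrs, n):
                found.add(tuple((a, r[a]) for a in subset))
    return sorted(found)


def e_dem(ratings, attrs, k, alpha):
    """Try every set of at most k cuboids; keep the feasible one with least error."""
    best, best_err = None, float("inf")
    candidates = candidate_cuboids(ratings, attrs)
    for size in range(1, k + 1):
        for C in combinations(candidates, size):
            if coverage(C, ratings) >= alpha:
                e = error(C, ratings)
                if e < best_err:
                    best, best_err = C, e
    return best, best_err


ratings = [
    {"gender": "M", "occupation": "Student", "score": 4},
    {"gender": "M", "occupation": "Student", "score": 5},
    {"gender": "F", "occupation": "Student", "score": 2},
    {"gender": "F", "occupation": "Engineer", "score": 3},
]
best, best_err = e_dem(ratings, ["gender", "occupation"], k=2, alpha=0.75)
print(best, best_err)
```

The number of candidate sets grows combinatorially with the lattice size and k, which is why the slides turn to a heuristic.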

25
RHE-DEM Algorithm
Phases: Satisfy Coverage, Minimize Error
C = {<Male, Student>, <California, Student>}
Figure: Partial rating lattice for a movie, k = 2, α = 80% (M = Male, Y = Young, CA = California, S = Student)
26
RHE-DEM Algorithm
Phases: Satisfy Coverage, Minimize Error
C = {<Male, Student>, <California, Student>}
Say C does not satisfy the coverage constraint
Figure: Partial rating lattice for a movie, k = 2, α = 80% (M = Male, Y = Young, CA = California, S = Student)
27
RHE-DEM Algorithm
Phases: Satisfy Coverage, Minimize Error
C = {<Male, Student>, <California, Student>}
Candidate: C = {<Male>, <California, Student>}
Candidate: C = {<Student>, <California, Student>}
Figure: Partial rating lattice for a movie, k = 2, α = 80% (M = Male, Y = Young, CA = California, S = Student)
28
RHE-DEM Algorithm
Phases: Satisfy Coverage ✓, Minimize Error
C = {<Male>, <California, Student>}
Say C satisfies the coverage constraint
Figure: Partial rating lattice for a movie, k = 2, α = 80% (M = Male, Y = Young, CA = California, S = Student)
31
RHE-DEM Algorithm
Phases: Satisfy Coverage ✓, Minimize Error ✓
C = {<Male>, <Student>}
Figure: Partial rating lattice for a movie, k = 2, α = 80% (M = Male, Y = Young, CA = California, S = Student)
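
The two-phase walk through the lattice can be approximated by the sketch below. It is a simplified greedy variant written for illustration, not the RHE-DEM algorithm as published: the start is random, but each step deterministically picks the best single-cuboid replacement by a lattice neighbor. All names, the error formula, and the sample data are assumptions.

```python
import random
from itertools import product
from statistics import mean


def matches(c, r):
    return all(r[a] == v for a, v in c)


def coverage(C, ratings):
    return sum(any(matches(c, r) for c in C) for r in ratings) / len(ratings)


def error(C, ratings):
    avgs = {c: mean([r["score"] for r in ratings if matches(c, r)] or [0]) for c in C}
    total = 0.0
    for r in ratings:
        devs = [abs(r["score"] - avgs[c]) for c in C if matches(c, r)]
        total += mean(devs) if devs else 0.0
    return total


def neighbors(c, domain):
    """Lattice neighbors: drop one condition (parent) or add one (child)."""
    out = [c - {p} for p in c if len(c) > 1]
    used = {a for a, _ in c}
    out += [c | {(a, v)} for a in domain if a not in used for v in domain[a]]
    return out


def moves(C, domain):
    """All sets obtained by replacing one cuboid of C with a neighbor."""
    for i, c in enumerate(C):
        for n in neighbors(c, domain):
            yield C[:i] + (n,) + C[i + 1:]


def rhe_dem(ratings, attrs, k, alpha, seed=0):
    rng = random.Random(seed)
    domain = {a: sorted({r[a] for r in ratings}) for a in attrs}

    def random_cuboid():
        r = rng.choice(ratings)
        chosen = rng.sample(attrs, rng.randint(1, len(attrs)))
        return frozenset((a, r[a]) for a in chosen)

    C = tuple(random_cuboid() for _ in range(k))
    # Phase 1: satisfy coverage by moving to the neighbor set with most coverage.
    while coverage(C, ratings) < alpha:
        best = max(moves(C, domain), key=lambda M: coverage(M, ratings), default=C)
        if coverage(best, ratings) <= coverage(C, ratings):
            return None  # stuck; a caller could retry with another seed
        C = best
    # Phase 2: minimize error among neighbor sets that keep coverage >= alpha.
    while True:
        feasible = [M for M in moves(C, domain) if coverage(M, ratings) >= alpha]
        if not feasible:
            break
        best = min(feasible, key=lambda M: error(M, ratings))
        if error(best, ratings) >= error(C, ratings):
            break
        C = best
    return C


attrs = ["gender", "age", "location"]
ratings = [
    {"gender": g, "age": a, "location": l, "score": (4 if g == "M" else 2) + (a == "Young")}
    for g, a, l in product(["M", "F"], ["Young", "Old"], ["CA", "NY"])
]
result = rhe_dem(ratings, attrs, k=2, alpha=0.5)
print(result)
```

Both loops terminate because phase 1 strictly increases coverage and phase 2 strictly decreases error at every accepted move.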
32
DIM: Meaningful Difference Mining

For an input item covering $R_I^+ \cup R_I^-$ ratings, return a set $C$ of cuboids such that the difference balance is minimized, subject to
$|C| \le k$
$\mathrm{coverage}(C, R_I^+) \ge \alpha \wedge \mathrm{coverage}(C, R_I^-) \ge \alpha$
Difference Balance: measures whether the positive and negative ratings are "mingled together" (high balance) or "separated apart" (low balance)
Coverage: measures the percentage of +/- ratings covered by the returned cuboids
DIM is NP-hard (proof details in the paper)
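
One way to make the balance measure concrete is sketched below. The formula is an assumption made for illustration (the slides give only the informal definition): balance is taken as the fraction of (positive, negative) rating pairs that fall together in at least one returned cuboid. The sample ratings are hypothetical.

```python
def matches(cuboid, r):
    return all(r[attr] == value for attr, value in cuboid.items())


def coverage(C, ratings):
    return sum(any(matches(c, r) for c in C) for r in ratings) / len(ratings)


def balance(C, pos, neg):
    """Assumed formula: share of (+, -) pairs that share at least one cuboid."""
    mixed = sum(1 for p in pos for n in neg
                if any(matches(c, p) and matches(c, n) for c in C))
    return mixed / (len(pos) * len(neg))


pos = [  # positive ratings
    {"gender": "M", "occupation": "Student"},
    {"gender": "M", "occupation": "Student"},
    {"gender": "F", "occupation": "Student"},
]
neg = [  # negative ratings
    {"gender": "F", "occupation": "Engineer"},
    {"gender": "F", "occupation": "Engineer"},
    {"gender": "M", "occupation": "Engineer"},
]
by_occupation = [{"occupation": "Student"}, {"occupation": "Engineer"}]
by_gender = [{"gender": "M"}, {"gender": "F"}]
print(balance(by_occupation, pos, neg), balance(by_gender, pos, neg))
```

In this toy data, grouping by occupation separates the positive and negative ratings completely (low balance), while grouping by gender mingles them.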

33
DIM Algorithms

Exact Algorithm (E-DIM)
Randomized Hill Exploration Algorithm (RHE-DIM)
Unlike the DEM error, the DIM balance is expensive to compute
Quadratic computation: scans all pairs of positive and negative ratings for each set of cuboids
Introduce the concept of Fundamental Regions to speed up balance computation
Partition the space of all ratings and aggregate the rating tuples in each region

34
DIM Algorithms: Fundamental Regions
C1 = <Male, Student>, C2 = <California, Student>
Figure: Computing balance using fundamental regions. Set of k = 2 cuboids having 75 ratings (44+, 31-), 10 ratings (6+, 4-)
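
The fundamental-region idea can be sketched as follows, under the same assumed balance formula (fraction of positive/negative pairs sharing a cuboid). Each rating is mapped to a bitmask of the cuboids that contain it; ratings with the same bitmask fall in the same region, so only per-region counts are needed. The function names and sample data are hypothetical.

```python
from collections import Counter


def matches(cuboid, r):
    return all(r[attr] == value for attr, value in cuboid.items())


def balance_naive(C, pos, neg):
    """Quadratic scan over all (+, -) pairs."""
    mixed = sum(1 for p in pos for n in neg
                if any(matches(c, p) and matches(c, n) for c in C))
    return mixed / (len(pos) * len(neg))


def signature(C, r):
    """Bitmask of the cuboids in C that contain rating r (its region)."""
    return sum(1 << i for i, c in enumerate(C) if matches(c, r))


def balance_regions(C, pos, neg):
    """One pass to count ratings per region, then combine region counts."""
    pos_count = Counter(signature(C, r) for r in pos)
    neg_count = Counter(signature(C, r) for r in neg)
    mixed = sum(p * n
                for sp, p in pos_count.items()
                for sn, n in neg_count.items()
                if sp & sn)  # the two regions share at least one cuboid
    return mixed / (len(pos) * len(neg))


C = [
    {"gender": "M", "occupation": "Student"},    # C1
    {"location": "CA", "occupation": "Student"},  # C2
]
pos = [
    {"gender": "M", "occupation": "Student", "location": "CA"},
    {"gender": "M", "occupation": "Student", "location": "NY"},
    {"gender": "F", "occupation": "Student", "location": "CA"},
]
neg = [
    {"gender": "M", "occupation": "Student", "location": "CA"},
    {"gender": "F", "occupation": "Student", "location": "CA"},
    {"gender": "F", "occupation": "Engineer", "location": "NY"},
]
print(balance_naive(C, pos, neg), balance_regions(C, pos, neg))
```

With k cuboids there are at most 2^k regions, so the pairwise step no longer depends on the number of ratings.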
36
Experiments

Dataset
MovieLens: 100,000 ratings for 1,682 movies by 943 users
Each user has 4 attributes: Gender, Age, Occupation, Location
Binning the movies: order movies by number of ratings, then partition them into 6 bins
Bin 1: movies with the fewest ratings; Bin 6: movies with the most ratings
Evaluation
Quantitative indicators: efficiency, quality, and scalability
Qualitative indicator: Mechanical Turk user study
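
The binning step in the dataset description could look like the sketch below. The slides do not say how the bin boundaries were chosen, so equal-sized bins are an assumption here, and the movie ids and counts are made up.

```python
def bin_movies(rating_counts, n_bins=6):
    """Order movies by number of ratings, then split into n_bins near-equal bins.

    rating_counts maps a movie id to its number of ratings.
    Equal-sized bins are an assumption of this sketch.
    """
    ordered = sorted(rating_counts, key=rating_counts.get)
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


counts = {f"m{i}": i for i in range(1, 13)}  # hypothetical movies m1..m12
bins = bin_movies(counts)
print(bins[0], bins[-1])
```

Bin 1 then holds the movies with the fewest ratings and the last bin those with the most.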

37
Quantitative Experiments: DEM
39
Qualitative Experiments: User Study

Amazon Mechanical Turk study
Two sets: one for description mining, one for difference mining
Each set: 4 randomly chosen movies, 30 independent single-user tasks
Study 1: Users prefer simple aggregate ratings over rating interpretations
Study 2: Users prefer rating interpretations by the exact algorithm or the heuristic randomized hill exploration algorithm

42
Conclusion and Future Work

Novel problem of meaningful rating interpretation (MRI) in collaborative rating sites
Meaningful Description Mining
Meaningful Difference Mining
Heuristic algorithmic solutions that generate rating interpretations as good as those of the exact brute-force algorithm in much less execution time
Future work: meaningful interpretations of ratings by reviewers of interest
Future work: additional constraints such as diversity of rating explanations

43
Related Work

Data Cubes
Gray et al., A relational aggregation operator generalizing group-by, cross-tab, and sub-totals, ICDE 1996
Sathe et al., Intelligent rollups in multidimensional OLAP data, VLDB 2001
Lakshmanan et al., Quotient cube: how to summarize the semantics of a data cube, VLDB 2002
Ramakrishnan et al., Exploratory mining in cube space, ICDM 2006
Wu et al., Promotion analysis in multi-dimensional space, VLDB 2009
Clustering and Dimensionality Reduction
Agrawal et al., Automatic subspace clustering of high dimensional data for data mining applications, SIGMOD 1998
Recommendation Explanation
Herlocker et al., Explaining collaborative filtering recommendations, CSCW 2000
Bilgic et al., Explaining recommendations: Satisfaction vs. promotion, IUI 2005