Title: Is it all about Connections
1Is it all about Connections?
- Evaluating a Link-Based Recommender System
http://video.ils.unc.edu/recex
Miles Efron (efrom_at_ils.unc.edu)
Gary Geisler (geisg_at_ils.unc.edu)
2Is it all about Connections?
Evaluating a Link-Based Recommender System
How well does our method work?
What factors make recommendation difficult?
3Recommendation Explorer (RecEx)
- Currently recommends films from a database of
12,726 popular films
- Comprised of several modules
- This study is concerned with the film-film
similarity module
5Implied Relationship
Terminator
Star Wars
Alien
6Singular Value Decomposition
$A = T S D^{T}$
7Singular Value Decomposition
$A_k = T_k S_k D_k^{T}$
8RecEx Similarity Model
$\hat{T} = T_k S_k$
In the space defined by $\hat{T}$ where
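As a rough illustration of the model on this slide, the sketch below builds $\hat{T} = T_k S_k$ from a truncated SVD and compares films in that space. The cosine measure, the function name `film_similarity`, and the toy link matrix are all assumptions made for the example; the slides do not state which similarity measure RecEx uses.

```python
import numpy as np

def film_similarity(A, k):
    """Cosine similarity between films in the rank-k space T_hat = T_k S_k.

    A is a film-by-film link matrix; cosine is an assumed measure,
    not something the slides confirm.
    """
    # numpy returns A = T @ diag(s) @ Dt, with singular values sorted descending
    T, s, Dt = np.linalg.svd(np.asarray(A, dtype=float), full_matrices=False)
    T_hat = T[:, :k] * s[:k]                 # scale each kept column by its singular value
    norms = np.linalg.norm(T_hat, axis=1, keepdims=True)
    unit = T_hat / np.maximum(norms, 1e-12)  # guard against all-zero rows
    return unit @ unit.T

# Hypothetical toy data: A[i, j] = 1 if film i links to film j.
# Terminator and Star Wars are not linked to each other, only to Alien.
films = ["Terminator", "Star Wars", "Alien"]
A = np.array([
    [1, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
])

sim = film_similarity(A, k=2)
for i, name in enumerate(films):
    print(name, np.round(sim[i], 3))
```

In this toy case the two films that share only a common neighbour still come out with a positive similarity, which is the kind of implied relationship the earlier slide points to.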
9Experimental Evaluation
- N = 133 volunteer reviewers
- 10 seed films
- For each seed, reviewers create a key, i.e. the
right answer for that seed.
- Based on the key, we calculate precision/recall.
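The scoring step can be sketched as follows. This is a minimal example that assumes a key is simply a set of film titles and a recommendation list is a list of titles; the function name and the titles are hypothetical, and how the study aggregated scores across reviewers is not specified here.

```python
def precision_recall(recommended, key):
    """Precision and recall of a recommendation list against a reviewer key."""
    recommended = list(dict.fromkeys(recommended))  # drop duplicates, keep order
    key = set(key)
    hits = sum(1 for film in recommended if film in key)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(key) if key else 0.0
    return precision, recall

# Hypothetical seed: the system returns four films, the key holds three.
recommended = ["Aliens", "Star Wars", "Blade Runner", "Predator"]
key = {"Aliens", "Predator", "RoboCop"}

precision, recall = precision_recall(recommended, key)
print(precision, recall)  # 2 of 4 recommended are in the key; 2 of 3 key films are found
```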
10Experimental Evaluation
Average Precision of recommendations based on raw
link structure and links analyzed by SVD
11Experimental Evaluation
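$\mathrm{prec}_i = \beta_0 + \beta_1(\mathrm{in}_i) + \beta_2(\mathrm{out}_i) + \beta_3(\mathrm{out}_i/\mathrm{in}_i) + e_i$
$\mathrm{prec}_i$ = avg. precision for the $i$th seed film
$\mathrm{in}_i$ = number of films pointing to the $i$th seed film
$\mathrm{out}_i$ = number of films pointed to by the $i$th seed film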
Does the SVD-based similarity model privilege
well-connected seeds or candidates?
12Experimental Evaluation
$\mathrm{prec}_i = \beta_0 + \beta_1(\mathrm{in}_i) + \beta_2(\mathrm{out}_i) + \beta_3(\mathrm{out}_i/\mathrm{in}_i) + e_i$
Does the SVD-based similarity model privilege
well-connected seeds or candidates?
Probably not: $R^2 = 0.119$, p-value for $H_0$ = 0.845
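A regression of this form can be fitted with ordinary least squares. The sketch below is illustrative only: the helper `fit_ols` and all of the numbers are made up for the example and are not the study's data, and the p-value for $H_0$ (which would need an F-test) is not computed.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (coefficients, R^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])   # prepend the intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot

# Made-up values for 10 seed films (NOT the study's data).
in_links = np.array([12, 5, 30, 8, 21, 3, 17, 9, 26, 14], dtype=float)
out_links = np.array([7, 9, 15, 4, 11, 6, 10, 8, 13, 5], dtype=float)
avg_prec = np.array([0.42, 0.31, 0.38, 0.55, 0.29, 0.47, 0.36, 0.51, 0.33, 0.44])

# Predictors: in_i, out_i, and the ratio out_i / in_i
X = np.column_stack([in_links, out_links, out_links / in_links])
beta, r2 = fit_ols(X, avg_prec)
print("coefficients:", np.round(beta, 4))
print("R^2:", round(r2, 3))
```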
13Experimental Evaluation
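$\mathrm{prec}_i = \beta_0 + \beta_1(\mathrm{reviewers}_i) + \beta_2(\mathrm{total}_i) + \beta_3(\mathrm{unique}_i) + e_i$
$\mathrm{prec}_i$ = avg. precision for the $i$th seed film
$\mathrm{reviewers}_i$ = number of reviewers for the $i$th seed film
$\mathrm{total}_i$ = number of reviews made for the $i$th seed film
$\mathrm{unique}_i$ = num. of unique titles rec'd for the $i$th seed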
Does the amount of reviewer consensus bear on
average precision?
14Experimental Evaluation
$\mathrm{prec}_i = \beta_0 + \beta_1(\mathrm{reviewers}_i) + \beta_2(\mathrm{total}_i) + \beta_3(\mathrm{unique}_i) + e_i$
Does the amount of reviewer consensus bear on
average precision?
Probably so: $R^2 = 0.64$, p-value for $H_0$ = 0.034
15Experimental Evaluation
$\mathrm{prec}_i = \beta_0 + \beta_1(\mathrm{reviewers}_i) + \beta_2(\mathrm{total}_i) + \beta_3(\mathrm{unique}_i) + e_i$
Maybe so: for 1000 bootstrap samples, computed $R^2$
Mean $R^2$ = 0.793, $s(R^2)$ = 0.161
With such a small N, can we trust $R^2$?
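One way to run such a check is sketched below. It assumes a case-resampling bootstrap (seed films drawn with replacement, model refitted each time); the slides do not say which bootstrap scheme was used, and the function names and data are made up for illustration.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of a least-squares fit of y on X with an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    ss_res = ((y - design @ beta) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

def bootstrap_r_squared(X, y, n_samples=1000, seed=0):
    """R^2 values from refitting the model on resampled cases (assumed scheme)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if y.max() == y.min():
        raise ValueError("y is constant; R^2 is undefined")
    rng = np.random.default_rng(seed)
    n = len(y)
    values = []
    while len(values) < n_samples:
        idx = rng.integers(0, n, size=n)      # draw n cases with replacement
        if y[idx].max() == y[idx].min():      # skip resamples where R^2 is undefined
            continue
        values.append(r_squared(X[idx], y[idx]))
    return np.array(values)

# Made-up values for 10 seed films (NOT the study's data).
reviewers = np.array([14, 9, 20, 11, 17, 8, 13, 16, 10, 15], dtype=float)
total = np.array([60, 35, 95, 40, 80, 30, 55, 70, 38, 66], dtype=float)
unique = np.array([25, 22, 30, 18, 41, 20, 24, 27, 23, 33], dtype=float)
avg_prec = np.array([0.45, 0.30, 0.52, 0.41, 0.28, 0.33, 0.44, 0.47, 0.31, 0.36])

X = np.column_stack([reviewers, total, unique])
r2_values = bootstrap_r_squared(X, avg_prec, n_samples=1000, seed=0)
print("mean R^2:", round(r2_values.mean(), 3))
print("s(R^2):  ", round(r2_values.std(ddof=1), 3))
```

With only ten cases, many resamples contain few distinct films, so the fitted model can be close to saturated and the bootstrap $R^2$ values tend to run high; the spread is often more informative than the mean.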
16The English Patient: Some films are hard to recommend for
- Reviewers' similarity criteria were abstract
- Adaptations of romantic novels
- Period pieces
- Ralph Fiennes
- i.e. people's motivations for linking two films
varied widely. What constitutes a good
recommendation depends on the factors that are
important to the reviewer.
17How to cope with hard Items
- Tightly coupled algorithms and interface
- Lightweight profile specification and revision
18Flexible, Informative Interface
- Graphical or text views of results
- Users can filter and explore results
- Immediate preview
19Summary
- The number of incoming and outgoing links did not
heavily affect the quality of similarity
judgments in SVD space.
- Quality of similarity judgments was strongly
affected by the degree of reviewer consensus on a
given seed.
- Bootstrap analysis of modeling suggests that this
effect is worth pursuing in a larger study.
20Questions
- Are there other factors (i.e. besides connections
and consensus) that should be taken into
account?
- What does it mean for an item to be difficult
in the recommendation setting?
- Difficult to use as evidence? Difficult to
retrieve?
- Difficult for the machine vs. difficult for
humans to judge
- How should a recommender system behave when it
encounters difficult items?