Title: Introduction to Recommender Systems
1Introduction to Recommender Systems
- Adapted from
- FRANCESCO RICCI
- eCommerce and Tourism Research Laboratory
- Automated Reasoning Systems Division
- ITC-irst
- Trento Italy
- ricci_at_itc.it
2What movie should I see?
The Internet Movie Database (IMDb) provides
information about actors, films, television
shows, television stars, video games and
production crew personnel. Owned by Amazon.com
since 1998, as of June 21, 2006 IMDb featured
796,328 titles and 2,127,371 people.
3What travel should I do?
- I would like to escape from this ugly and tedious work life and relax for two weeks in a sunny place. I am fed up with crowded and noisy places; just the sand and the sea and some adventure.
- I would like to bring my wife and my children on a holiday; it should not be too expensive. I prefer mountainous places not too far from home. Children's parks, easy paths and good cuisine are a must.
- I want to experience contact with a completely different culture. I would like to be fascinated by the people and learn to look at my life in a totally different way.
4What book should I buy?
5What news should I read?
6What paper should I read?
7A Solution
8Information Overload
- Internet information overload, i.e., the state of having too much information to make a decision or remain informed about a topic
- Information retrieval technologies can assist a user in locating content if the user knows exactly what he is looking for (with some difficulties!)
- The user must be able to say "yes, this is what I need" when presented with the right result
- But in many information search tasks, e.g., product selection, the user
  - is not aware of the range of available options
  - may not know what to search for
  - if presented with some results, may not be able to choose
9Decisions
- When there are 100,000,000 options it is obvious we need tools for searching, filtering and ranking them
- But even when there are a few dozen options we need support
- Examples
- Where to go for dinner tonight?
- What flight for going to London?
- What Digital SLR camera should I buy?
10Non-personalized tools
- A printed catalogue of products, e.g., of books, clothes or travels
- A shop window
- A car exhibition in a car shop
- A movie finder database (search by title or actor)
- A podcast directory
11Personalized tools
- A printed catalogue of products, e.g., of books, clothes or travels
- THAT SHOWS ONLY THE PRODUCTS THAT YOU'LL LIKE TO HAVE
- A shop window
- THAT SHOWS IN THE BEST PLACES THE PRODUCTS YOU'RE SEARCHING FOR
- A car exhibition in a car shop
- THAT HAS EXACTLY THE MODELS YOU THOUGHT OF BUYING
- A movie finder database (search by title or actor)
- THAT DOES NOT SHOW THE MOVIES YOU HATE
- A podcast directory
- THAT LISTS ONLY PODCASTS YOU LIKE
12Personalization
- Personalization is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behavior
- Personalization is the capability to customize customer communication based on knowledge of preferences and behaviors at the time of interaction with the customer
- Personalization is about building customer loyalty by creating a meaningful one-to-one relationship: understanding the needs of each individual and helping satisfy a goal that efficiently and knowledgeably addresses each individual's need in a given context
13Personalization Goals
- Provide users with what they want or need without requiring them to ask explicitly
- Not a fully automated process
- Different approaches require different levels of user involvement and human-computer interactivity
- Personalization techniques try to leverage all available information about users to deliver a personal experience
14Personalization
- Context
  - pre-travel vs. during travel
  - on-the-net, on-the-move, on-the-tour
  - traveling, wandering and visiting
  - environment: in the hotel, train, car, airplane, at the conference
  - Time/Space coordinates
  - Business vs. Fun
- User: knowledge, experience, budget, travel party, cognitive capabilities, motivations (security, variety, fun, etc.), age, language, disabilities
- Need
  - buy a complete travel package
  - choose a restaurant
  - collect information on a location
  - find the route
  - find the means of transportation
  - communicate with other travelers
- Device/Capabilities: personal computer, PDA, smart phone; phone payment, movies, broadcast, instant messaging, email, position, etc.
15Suppliers' Motivations
- Making interactions faster and easier. Personalization increases usability, i.e., how well a web site allows people to achieve their goals.
- Increasing customer loyalty. A user should be loyal to a web site which, when visited, recognizes the returning customer and treats him as a valuable visitor.
- Increasing the likelihood of repeated visits. The longer the user interacts with the site, the more refined the user model maintained by the system becomes, and the more effectively the web site can be customized to match user preferences.
- Maximizing the look-to-buy ratio. This becomes the look-to-book ratio in the travel and tourism industry, where it is the essential indicator of personalization objectives.
16Recommender Systems
- A recommender system helps people make choices without sufficient personal experience of the alternatives
- To suggest products to customers
- To provide consumers with information to help them decide which products to purchase
- They are based on a number of technologies: information filtering, machine learning, adaptive and personalized systems, user modeling, ...
17Examples
- Some examples found on the Web
- Amazon.com looks at the user's past buying history and recommends products bought by users with similar buying behavior
- Tripadvisor.com quotes product reviews from a community of users
- Activebuyersguide.com asks questions about the benefits sought in order to reduce the number of candidate products
- Trip.com asks questions and exploits them to constrain the search (exploits standardized profiles)
- Smarter Kids: self-selection of a user profile; classification of products into user profiles
18Core Recommendation Techniques
U is a set of users; I is a set of items/products
19Evaluating Recommender Systems
- The majority of evaluations have focused on the system's accuracy in supporting the "find good items" user task
- Assumption: if a user could examine all available items, he could place them in an ordering of preference
- Measure how well the system predicts the exact rating value (value comparison)
- Measure how well the system can predict whether an item is relevant or not (relevant vs. not relevant)
- Measure how close the predicted ranking of items is to the user's true ranking (ordering comparison)
20How Accuracy Has Been Measured
- Split the available data (so you need to collect data first!), i.e., the user-item ratings, into two sets: training and test
- Build a model on the training data
- For instance, in nearest-neighbor (memory-based) CF, simply put the training ratings in a separate set
- Compare the predicted rating for each test item (user-item combination) with the actual rating stored in the test set
- You need a metric to compare the predicted and true ratings
21Accuracy Comparing Values
- Measure how close the recommender system's predicted ratings are to the true user ratings (for all the ratings in the test set)
- Predictive accuracy (rating): Mean Absolute Error, MAE = (1/N) Σi |pi − ri|, where pi is the predicted rating and ri is the true one
- Variation 1: Mean Squared Error (take the square of the differences) and Root Mean Squared Error (then take the square root). These emphasize large errors.
- Variation 2: Normalized MAE, i.e., MAE divided by the range of possible ratings, allowing comparison of results on data sets with different rating scales
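The three metrics above can be sketched directly from their definitions; the rating values below are made up for illustration.

```python
# Sketch of the accuracy metrics on slide 21: MAE, RMSE and normalized MAE.
import math

def mae(predicted, true):
    return sum(abs(p - r) for p, r in zip(predicted, true)) / len(true)

def rmse(predicted, true):
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, true)) / len(true))

def nmae(predicted, true, r_min, r_max):
    # Dividing by the rating range makes scores comparable across data sets
    # that use different rating scales.
    return mae(predicted, true) / (r_max - r_min)

predicted = [4.2, 3.0, 5.0, 2.5]
true      = [4,   3,   4,   1]

print(mae(predicted, true))         # average absolute error
print(rmse(predicted, true))        # emphasizes the large errors
print(nmae(predicted, true, 1, 5))  # on a 1-5 star scale
```

Note how RMSE exceeds MAE here precisely because of the single large error on the last item.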
22Precision and Recall
- Precision is the ratio of relevant items selected by the recommender to the number of items selected (Nrs/Ns)
- Recall is the ratio of relevant items selected to the number of relevant items (Nrs/Nr)
- Precision and recall are the most popular metrics for evaluating information retrieval systems
23Precision and Recall Example
- We assume to know the relevance of all the items in the catalogue for a given user
- The orange portion is the one recommended by the system
- P = 4/7 = 0.57, R = 4/9 = 0.44
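The example's numbers can be reproduced with a minimal sketch: 9 relevant items in the catalogue, 7 recommended, 4 of the recommendations relevant (the item names are made up).

```python
# Precision and recall as defined on slide 22: Nrs/Ns and Nrs/Nr.
def precision_recall(recommended, relevant):
    hits = len(set(recommended) & set(relevant))        # Nrs
    return hits / len(recommended), hits / len(relevant)

relevant    = ["i1", "i2", "i3", "i4", "i5", "i6", "i7", "i8", "i9"]
recommended = ["i1", "i2", "i3", "i4", "x1", "x2", "x3"]  # 4 hits out of 7

p, r = precision_recall(recommended, relevant)
print(round(p, 2), round(r, 2))  # 0.57 0.44
```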
24Rank Accuracy Metrics
- Rank is the position in the sorted list
- Spearman's rank correlation (ui and vi are the ranks of item i in the user order and the system order)
- Kendall's Tau (C is the number of concordant pairs, i.e., pairs in the same order in both ranked lists; D is the number of discordant pairs; TR is the number of tied pairs (same rank) in the user order; TP is the number of tied pairs in the predicted order)
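Both metrics can be computed directly from their definitions; the sketch below uses the tie-free simple forms (so the TR/TP terms drop out) on made-up rankings.

```python
# Spearman's rho and Kendall's tau from slide 24, for rankings without ties.
def spearman(user_ranks, system_ranks):
    # rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), d_i = rank difference
    n = len(user_ranks)
    d2 = sum((u - v) ** 2 for u, v in zip(user_ranks, system_ranks))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def kendall_tau(user_ranks, system_ranks):
    # tau = (C - D) / (n * (n - 1) / 2), C concordant and D discordant pairs
    n = len(user_ranks)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            agree = (user_ranks[i] - user_ranks[j]) * (system_ranks[i] - system_ranks[j])
            if agree > 0:
                c += 1
            elif agree < 0:
                d += 1
    return (c - d) / (n * (n - 1) / 2)

# Item ranks in the user's true ordering vs. the system's predicted ordering.
user   = [1, 2, 3, 4, 5]
system = [2, 1, 3, 4, 5]  # the system swaps the top two items

print(spearman(user, system))     # 0.9
print(kendall_tau(user, system))  # 0.8
```

A single adjacent swap costs little under both metrics; swapping distant items would be penalized much more by Spearman, which squares the rank differences.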
25Core Recommendation Techniques
U is a set of users; I is a set of items/products
26MovieLens
http://movielens.umn.edu
29The CF Ingredients
- A list of m users and a list of n items
- Each user has a list of items he/she has expressed an opinion about (can be an empty set)
- Explicit opinion: a rating score on a numerical scale
- Sometimes the rating is implicit: purchase records
- Active user: the user for whom the CF prediction task is performed
- A metric for measuring similarity between users
- A method for selecting a subset of neighbors for prediction
- A method for predicting a rating for items not currently rated by the active user
30Collaborative-Based Filtering
- The collaborative-based filtering recommendation technique proceeds in these steps:
- For a target/active user (the user for whom a recommendation has to be produced) the set of his ratings is identified
- The users most similar to the target/active user (according to a similarity function) are identified (neighborhood formation)
- The products bought by these similar users are identified
- For each of these products a prediction is generated of the rating that the target user would give to the product
- Based on the predicted ratings, a set of top-N products is recommended
31Nearest Neighbor Collaborative-Based Filtering
User Model: interaction history
32Collaborative-Based Filtering
- A collection of users ui, i = 1, …, n, and a collection of products pj, j = 1, …, m
- An n × m matrix of ratings vij, with vij = ? if user i did not rate product j
- The prediction for user i and product j is computed as a correlation-weighted combination of the other users' ratings for product j
- The correlation factor between two users i and k can be computed with the Pearson coefficient over their co-rated products
- The sums (and averages) are over the j such that vij and vkj are not ?
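The two formulas referenced on this slide are not reproduced in the transcript; in the standard memory-based formulation they are p_ij = v̄_i + κ Σ_k w_ik (v_kj − v̄_k), with w_ik the Pearson correlation between users i and k and κ a normalizing factor. A minimal sketch under that assumption, with made-up ratings:

```python
# Nearest-neighbor CF prediction: the predicted rating is the active user's
# mean plus a correlation-weighted average of the other users' deviations
# from their own means. The ratings below are made up.
import math

ratings = {  # user -> {product: rating}; missing products were not rated ("?")
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5, "d": 4},
    "carol": {"a": 1, "b": 5, "d": 2},
}

def pearson(u, k):
    common = set(ratings[u]) & set(ratings[k])  # products rated by both users
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings[u][j] for j in common) / len(common)
    mu_k = sum(ratings[k][j] for j in common) / len(common)
    num = sum((ratings[u][j] - mu_u) * (ratings[k][j] - mu_k) for j in common)
    den = math.sqrt(sum((ratings[u][j] - mu_u) ** 2 for j in common) *
                    sum((ratings[k][j] - mu_k) ** 2 for j in common))
    return num / den if den else 0.0

def predict(u, j):
    mu_u = sum(ratings[u].values()) / len(ratings[u])
    num = den = 0.0
    for k in ratings:
        if k != u and j in ratings[k]:
            w = pearson(u, k)
            mu_k = sum(ratings[k].values()) / len(ratings[k])
            num += w * (ratings[k][j] - mu_k)
            den += abs(w)  # kappa = 1 / sum of |weights| normalizes the sum
    return mu_u + num / den if den else mu_u

print(predict("alice", "d"))  # the rating alice is predicted to give "d"
```

Note that carol's correlation with alice is negative, so carol's below-average rating for "d" actually pushes alice's prediction up.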
33Collaborative-Based Filtering
- Pros: requires minimal knowledge-engineering effort (knowledge-poor)
- Users and products are symbols, without any internal structure or characteristics
- Cons:
  - Requires a large number of explicit and reliable ratings to bootstrap
  - Requires products to be standardized (users should have bought exactly the same product)
  - Assumes that prior behavior determines current behavior, without taking into account contextual (session-level) knowledge
  - Does not provide information about products or explanations for the recommendations
  - Does not support sequential decision making or the recommendation of good bundles, e.g., a travel package
34Problems of CF Sparsity
- Typically we have large product sets and user ratings for only a small percentage of them
- Example: Amazon has millions of books, and a user may have bought hundreds of them
- The probability that two users who have each bought 100 books have a book in common (in a catalogue of 1 million books) is about 0.01 (with 50 books each and 10 million books it is about 0.0002)
- We must have a number of users comparable to one tenth of the size of the product catalogue
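The slide's figures can be checked directly. Assuming each user's books are drawn uniformly at random from the catalogue, the chance that two baskets of size b from a catalogue of size n intersect is 1 − C(n−b, b)/C(n, b):

```python
# Quick check of the sparsity numbers on slide 34, assuming the two users
# pick their books uniformly at random from the catalogue.
from math import comb

def p_overlap(n_catalogue, b_books):
    # P(at least one common book) = 1 - C(n - b, b) / C(n, b)
    return 1 - comb(n_catalogue - b_books, b_books) / comb(n_catalogue, b_books)

print(p_overlap(1_000_000, 100))   # ~0.01, as claimed
print(p_overlap(10_000_000, 50))   # ~0.00025, the slide rounds to 0.0002
```

For b much smaller than n this is well approximated by b²/n, which is where the "one user per ten products" rule of thumb comes from.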
35Problems of CF Scalability
- Nearest-neighbor algorithms require computation that grows with both the number of customers and the number of products
- With millions of customers and products, a web-based recommender will suffer serious scalability problems
- The worst-case complexity is O(mn) (m customers and n products)
- But in practice the complexity is O(m + n), since for each customer only a small number of products is considered (one loop over the customers to compute similarities and one over the products to compute the prediction)
36Personalised vs Non-Personalised CF
- Collaborative-based recommendations are personalized, since the prediction is based on the ratings (for a given item) expressed by similar users
- A non-personalized collaborative-based recommendation can be generated by averaging the recommendations of ALL the users
- How would the two approaches compare?
37Personalised vs Non-Personalised CF
Not much difference indeed!
vij is the rating of user i for product j and vj
is the average rating for product j
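The non-personalized baseline being compared here is simply the per-item average rating vj over all users who rated item j; a minimal sketch with made-up ratings:

```python
# Non-personalized "collaborative" prediction from slide 37: every user gets
# the plain average rating of product j. The ratings below are made up.
ratings = {  # user -> {product: rating}
    "alice": {"a": 5, "b": 3},
    "bob":   {"a": 4, "b": 2, "d": 4},
    "carol": {"a": 1, "d": 2},
}

def item_average(j):
    scores = [r[j] for r in ratings.values() if j in r]
    return sum(scores) / len(scores)

print(item_average("a"))  # (5 + 4 + 1) / 3
print(item_average("d"))  # (4 + 2) / 2
```

Every user receives the same prediction for a given item, which is exactly what makes this baseline non-personalized.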
38Core Recommendation Techniques
U is a set of users; I is a set of items/products
39Content-Based Recommendation
- In content-based recommendation, the system tries to recommend items similar to those a given user has liked in the past
- In contrast, in collaborative recommendation the system identifies users whose tastes are similar to those of the given user and recommends items they have liked
- A pure content-based recommender system makes recommendations for a user based solely on a profile built by analyzing the content of items which that user has rated in the past
40Content-Based Recommender
- It is mainly used for recommending text-based products (web pages, Usenet news messages, ...)
- The items to recommend are described by their associated features (e.g., keywords)
- The user model can be structured in a similar way as the content: for instance, the features/keywords most likely to occur in the preferred documents (lazy approach)
- Then, text documents can be recommended based on a comparison between their content (words appearing in the text) and the user model (a set of preferred words)
- The user model can also be a classifier based on any technique (neural networks, naïve Bayes, C4.5, ...)
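The "lazy" approach above can be sketched in a few lines: the user model is the bag of keywords from liked documents, and new documents are scored by cosine similarity against it. The documents here are made up.

```python
# Lazy content-based recommendation from slide 40: keyword-count profile of
# liked documents, cosine matching of candidate documents against it.
import math
from collections import Counter

def words(text):
    return text.lower().split()

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values())) *
           math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

liked = ["digital camera review lens", "camera lens zoom review"]
profile = Counter(w for doc in liked for w in words(doc))  # the user model

candidates = ["new camera lens announced", "pasta recipe with tomato"]
scores = {doc: cosine(profile, Counter(words(doc))) for doc in candidates}

best = max(scores, key=scores.get)
print(best)  # the camera document matches the profile
```

A real system would weight terms by TF-IDF rather than raw counts, but the matching step is the same.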
41Syskill & Webert
- Assists a person in finding information that satisfies long-term, recurring goals (e.g., digital photography)
- Feedback on the interestingness of a set of previously visited sites is used to learn a profile
- The profile is used to predict the interestingness of unseen sites
42Supported Interaction
- The user identifies a topic (e.g., biomedical) and a page with many links to other pages on the selected topic (an index page)
- The user can then explore the Web with a browser that, in addition to showing a page:
  - offers a tool for collecting user ratings on displayed pages
  - suggests which links on the current page are (estimated to be) interesting
43Syskill & Webert User Interface
(screenshot: pages the user indicated interest in, pages the user indicated no interest in, and the system's predictions)
44Learning
- A document (HTML page) is described as a set of Boolean features (a word is present or not)
- They used a Bayesian classifier (one per user), where the probability that a document w1 = v1, …, wn = vn (e.g., car = 1, story = 0, …, price = 1) belongs to a class (cold or hot) is computed with Bayes' rule
- Both P(wj = vj | C = hot) (i.e., the probability that in the set of documents liked by the user the word wj is present or not) and P(C = hot) are estimated from the training data
- After training on 30-40 examples it can predict hot/cold with an accuracy between 70% and 80%
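A classifier of this kind (naive Bayes over Boolean word features) can be sketched as below. The vocabulary, pages and labels are made up, and Laplace smoothing is added so unseen feature values do not zero out the product, which the slide does not specify.

```python
# Naive Bayes over Boolean word features, in the style of slide 44:
# P(C) * product over words of P(w_j = v_j | C), computed in log space.
import math
from collections import defaultdict

vocab = ["camera", "lens", "pasta", "recipe"]

train = [  # (set of words present in the page, label)
    ({"camera", "lens"}, "hot"),
    ({"camera"}, "hot"),
    ({"pasta", "recipe"}, "cold"),
]

def fit(train):
    prior = defaultdict(int)                         # class -> #examples
    counts = defaultdict(lambda: defaultdict(int))   # class -> word -> #present
    for page_words, label in train:
        prior[label] += 1
        for w in vocab:
            counts[label][w] += int(w in page_words)
    return prior, counts

def predict(page_words, prior, counts):
    total = sum(prior.values())
    best, best_score = None, float("-inf")
    for c in prior:
        score = math.log(prior[c] / total)           # log P(C)
        for w in vocab:
            p_present = (counts[c][w] + 1) / (prior[c] + 2)  # Laplace smoothing
            score += math.log(p_present if w in page_words else 1 - p_present)
        if score > best_score:
            best, best_score = c, score
    return best

prior, counts = fit(train)
print(predict({"camera", "lens"}, prior, counts))  # hot
print(predict({"recipe"}, prior, counts))          # cold
```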
45Problems of Content-Based Recommenders
- Only a very shallow analysis of certain kinds of content can be performed
- Some kinds of items are not amenable to any feature extraction method with current technologies (e.g., movies, music)
- Even for texts (such as web pages), IR techniques cannot consider multimedia information, aesthetic qualities, download time
- Hence, if you rate a page positively, this may not be related to the presence of certain keywords!
46Problems of Content-Based Recommenders
- Over-specialization: the system can only recommend items scoring highly against the user's profile; the user is recommended items similar to those already rated
- Requires user feedback: the pure content-based approach (similarly to CF) requires user feedback on items in order to provide meaningful recommendations
- It tends to recommend expected items: this tends to increase trust, but can make the recommendations not very useful (no serendipity)
- Works better in those situations where the products are generated dynamically (news, email, events, etc.) and there is a need to check whether these items are relevant or not
47Core Recommendation Techniques
U is a set of users; I is a set of items/products
48Demographic Methods
- Aims to categorize the user based on personal attributes and make recommendations based on demographic classes
- Demographic groups can come from marketing research; hence experts decide how to model the users
- Demographic techniques form people-to-people correlations
49Demographic-based personalization
50Demographic-based personalization
51Demographic Methods (more sophisticated)
- Demographic features are generally asked for explicitly
- But they can also be induced by classifying a user from other user descriptions (e.g., the home page); you need some users for whom you know the class (e.g., male/female)
- Prediction can use whatever learning mechanism we like (nearest neighbor, naïve Bayes classifier, etc.)
52Core Recommendation Techniques
U is a set of users; I is a set of items/products
53Utility methods
- A utility function is a map from a state onto a real number, which describes the associated degree of happiness
- One can build a long-term utility function, but more often systems using this approach try to acquire a short-term utility function
- They must acquire the user's utility function, or the parameters defining such a function
54Utility related information
55Utility
- The item is described by a list of numerical attributes p1, …, pm, e.g., number of rooms, square meters, levels, (MaxCost − Cost), …
- It is generally assumed that higher values of an attribute correspond to higher utilities
- The user is modeled with a set of weights u1, …, um (in [0, 1]) on the same attributes
- The objective is to find (retrieve) the products with the largest (maximal) utility
- The problem is the elicitation or learning of the user model u1, …, um
56Utility and similarity
- If the user has some preferred values q1, …, qm for the attributes, one can substitute for the value of the product a local similarity function sim(qj, pj)
- A typical local similarity function is (1 − |qj − pj| / rangej), where rangej is the difference between the max and min value of attribute j
- A utility-based recommender becomes a similarity-maximization recommender (or a nearest-neighbor recommender)
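The two slides above combine into a short sketch: the score is a weighted sum over attributes, and with preferred values qj each attribute value is replaced by the local similarity 1 − |qj − pj| / rangej. The hotel attributes, weights and preferences below are made up.

```python
# Utility-as-similarity-maximization from slides 55-56.
hotels = {  # name -> (rooms, distance_km, price_eur)
    "alpina":   (3, 1.0, 80),
    "centro":   (2, 0.2, 120),
    "panorama": (4, 5.0, 60),
}
ranges    = (2, 4.8, 60)      # max - min of each attribute over the catalogue
weights   = (0.2, 0.5, 0.3)   # user model u_1..u_m, each in [0, 1]
preferred = (3, 0.5, 70)      # the user's preferred values q_1..q_m

def local_sim(q, p, rng):
    # 1 when the product matches the preference exactly, lower otherwise
    return 1 - abs(q - p) / rng

def utility(item):
    return sum(w * local_sim(q, p, r)
               for w, q, p, r in zip(weights, preferred, item, ranges))

best = max(hotels, key=lambda name: utility(hotels[name]))
print(best)
```

With these weights, distance dominates: the hotel closest to the preferred distance wins even though another hotel is cheaper.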
57Core Recommendation Techniques
U is a set of users; I is a set of items/products
58Knowledge Based Recommender
- Suggests products based on inferences about a user's needs and preferences
- Functional knowledge: about how a particular item meets a particular user need
- The user model can be any knowledge structure that supports this inference:
  - a query
  - a case (in a case-based reasoning system)
  - an adapted similarity metric (for matching)
  - a part of an ontology
- There is a large use of domain knowledge, encoded in a knowledge representation language/approach
59Hybrid Methods
- Try to address the shortcomings of several approaches and produce recommendations using a combination of those techniques
- There is large variability among these hybrid methods; there is no standard hybrid method
- We shall present some of them here, but many will come later
60Fab System
- The user profile is based on content analysis
- Two user profiles are compared to determine similar users
- Then a collaborative-based recommendation is generated
- Users receive items both when they score highly against their own profile and when they are rated highly by a user with a similar profile
- This is a mixed approach
61Fab Architecture
- Collection agents find pages relevant to a specific topic
- Selection agents find pages for a specific user
- The central router forwards pages to those users whose profiles they match above some threshold
62User Feedback
- When the user has requested, received, and looked over the recommendations, they are required to assign a rating (7-point scale)
- Ratings are stored in the personal agent's profile
- Ratings are sent to the collection agents to adapt the user profiles they store
- Highly rated pages are sent to similar users (collaborative)
63Collaboration via Content
- Problem addressed: in a collaborative-based recommender, the products rated by a pair of users may be very few, so the correlation between two users is not reliable
- In collaboration via content, the content-based profile of each user is exploited to detect similarities among users
64Comparison
- Content-based recommendation is done with a Bayes classifier
- Collaborative is standard, using Pearson correlation
- Collaboration via content uses the content-based user profiles
Averaged over 44 users. Precision is computed on the top 3 recommendations: (number of positively rated items in the recommendation list) / 3
65Hybridization Methods
66Weighted
- The score of a recommended item is computed from the results of all of the available recommendation techniques present in the system
- Example 1: a linear combination of recommendation scores
- Example 2: treat the output of each recommender (collaborative, content-based and demographic) as a set of votes, which are then combined in a consensus scheme
- The implicit assumption in this technique is that the relative value of the different techniques is more or less uniform across the space of possible items
- Not true in general: e.g., a collaborative recommender will be weaker for items with a small number of raters
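"Example 1" above can be sketched directly; the component scores, items and weights are made up, and each component score is assumed to be normalized to [0, 1].

```python
# Weighted hybrid from slide 66: a linear combination of the scores produced
# by several component recommenders.
collaborative = {"i1": 0.9, "i2": 0.4, "i3": 0.1}
content_based = {"i1": 0.3, "i2": 0.8, "i3": 0.2}
demographic   = {"i1": 0.5, "i2": 0.5, "i3": 0.9}

components = [(collaborative, 0.5), (content_based, 0.3), (demographic, 0.2)]

def hybrid_score(item):
    return sum(w * scores[item] for scores, w in components)

ranking = sorted(collaborative, key=hybrid_score, reverse=True)
print(ranking)
```

The fixed weights are exactly the "uniform relative value" assumption the slide criticizes; making the weights depend on, say, the number of raters per item would relax it.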
67Switching
- The system uses some criterion to switch between recommendation techniques
- Example: the DailyLearner system uses a content/collaborative hybrid in which a content-based recommendation method is employed first
- If the content-based system cannot make a recommendation with sufficient confidence, then a collaborative recommendation is attempted
- This switching hybrid does not completely avoid the ramp-up problem, since both the collaborative and the content-based systems have the new-user problem
- The main problem of this technique is identifying a GOOD switching condition
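The switching condition can be sketched as a confidence threshold. The two component recommenders below are hypothetical stubs with made-up scores; only the switching logic is the point.

```python
# Switching hybrid from slide 67: use the content-based score unless its
# confidence falls below a threshold, then fall back to collaborative.
def content_based(item):
    # hypothetical component: returns (score, confidence)
    return {"i1": (0.8, 0.9), "i2": (0.5, 0.2)}[item]

def collaborative(item):
    return {"i1": 0.6, "i2": 0.7}[item]

CONFIDENCE_THRESHOLD = 0.5  # the "GOOD switching condition" to be tuned

def switching_score(item):
    score, confidence = content_based(item)
    if confidence >= CONFIDENCE_THRESHOLD:
        return score            # content-based is confident enough
    return collaborative(item)  # otherwise switch to collaborative

print(switching_score("i1"))  # 0.8 (content-based, confident)
print(switching_score("i2"))  # 0.7 (fell back to collaborative)
```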
68Mixed
- Recommendations from more than one technique are presented together
- The mixed hybrid avoids the new-item start-up problem
- It does not get around the new-user start-up problem, since both the content and collaborative methods need some data about user preferences to start up
69Feature Combination
- Achieves the content/collaborative merger by treating collaborative information (the ratings of users) simply as additional feature data associated with each example, and using content-based techniques over this augmented data set
- The feature combination hybrid lets the system consider collaborative data without relying on it exclusively, so it reduces the system's sensitivity to the number of users who have rated an item
- The system has information about the inherent similarity of items that would otherwise be opaque to a collaborative system
70Cascade
- One recommendation technique is employed first to produce a coarse ranking of candidates, and a second technique refines the recommendation from among the candidate set
- Example: EntreeC uses its knowledge of restaurants to make recommendations based on the user's stated interests. The recommendations are placed in buckets of equal preference, and the collaborative technique is employed to break ties
- Cascading allows the system to avoid employing the second, lower-priority technique on items that are already well differentiated by the first
- But it requires a meaningful and constant ordering of the techniques
71Feature Augmentation
- One technique produces a rating or classification of an item, and that information is then incorporated into the processing of the next recommendation technique
- Example: the Libra system makes content-based recommendations of books based on data found on Amazon.com, using a naive Bayes text classifier
- The text data used by the system includes the "related authors" and "related titles" information that Amazon generates using its internal collaborative systems
- Very similar to the feature combination method
- Here the output of one recommender system is used as input for a second RS
- In feature combination, the representations used by the two systems are combined
72Meta-Level
- The model generated by one technique is used as the input for another
- Example: Fab
  - user-specific selection agents perform content-based filtering using Rocchio's method to maintain a term-vector model that describes the user's area of interest
  - collection agents, which gather new pages from the web, use the models of all users in their gathering operations
  - documents are first collected on the basis of their interest to the community as a whole and then distributed to particular users
- Example: collaboration via content; the model generated by the content-based approach is used for representing the users in a collaborative filtering approach
73Summary
- The collaborative-based method is one of the most popular
  - but it suffers from bootstrapping problems
- Content-based methods are well rooted in information retrieval
- Demographic methods are very simple and provide limited personalization (which can sometimes be sufficient)
- Utility-based methods go to the root of the decision problem: how to acquire the utility function?
- Hybrid methods are the most powerful and popular right now; there are plenty of options for hybridization
74New User Problem
- Personalization systems can work only if some information about the target user is available
- No user model, no personalization
- If the required user model data are not available, then try to obtain them via a shortcut:
  - a web of trust to identify similar users in a CF system
  - mining the publications of a user and of similar users helps in defining the topics that are interesting for a target user
- The output is always a sort of similarity function!!!
75New Item Problem
- Not much of a problem for content-based filtering; in fact, this is the main advantage of content-based filtering
- A critical problem for user-user and item-item collaborative filtering
- Solution: exploit similar items
- Example: NutKing/Dietorecs ranking based on double similarity; an implicit rating for a similar item is considered to be assigned to the target item
76Recommendation and User Context
- All RSs should adapt to the user's search context, but some methods cannot cope with that easily (e.g., CF)
- It depends on the definition of context, but in practice this includes:
  - short-term preferences ("tomorrow I want ...")
  - information related to the specific space-time position of the user
  - motivations of the search ("a present for my wife")
  - circumstances ("I've some time to spend here")
  - emotions ("I feel adventurous")
  - availability of data
77Privacy
- Recommender systems are based on user information
- There are laws that impose restrictions on the usage and distribution of information about people
- RSs must cope with these limitations; e.g., distributed recommender systems exchanging user profiles could be impossible for legal reasons!
- RSs must be developed in such a way as to limit the possibility that an attacker could learn personal data about some users
- There is a need to develop techniques that limit the number and type of personal data used in an RS
78Robustness
- The recommender should be robust against attacks aiming at modifying the system so that it recommends a product more often than others
  - shilling
  - nuking
- Some algorithms may be more robust than others
- Content-based methods are not influenced at all by false ratings
79Challenges
- Generic user models (multiple products and tasks)
- Generic recommender systems (multiple products and tasks)
- Distributed recommender systems (user and product data are distributed)
- Portable recommender systems (user data stored on the user's side)
- (User-)configurable recommender systems
- Multi-strategy, adapted to the user
- Privacy-protecting RSs
- Context-dependent RSs
- Emotion- and value-aware RSs
- Trust and recommendations
- Persuasion technologies
- Easily deployable RSs
- Group recommendations
80Challenges
- Interactive Recommendations sequential decision
making - Hybrid recommendation technologies
- Consumer Behavior and Recommender Systems
- Complex product recommendations
- Mobile Recommendations
- Business Models for Recommender Systems
- High risk and value recommender systems
- Recommendation and negotiation
- Recommendation and information search
- Recommendation and configuration
- Listening to customers
- Recommender systems and ontologies
81Thank you for your patience!
85Trip.com
86Trip.com
87Trip.com