Title: Introduction to Recommender Systems
1Introduction to Recommender Systems
- Adapted from
- FRANCESCO RICCI
- eCommerce and Tourism Research Laboratory
- Automated Reasoning Systems Division
- ITC-irst
- Trento Italy
- ricci_at_itc.it
2What movie should I see?
The Internet Movie Database (IMDb) provides
information about actors, films, television
shows, television stars, video games and
production crew personnel. Owned by Amazon.com
since 1998, as of June 21, 2006 IMDb featured
796,328 titles and 2,127,371 people.
3What travel should I do?
- I would like to escape from this ugly and tedious work life and relax for two weeks in a sunny place. I am fed up with crowded and noisy places; just the sand and the sea and some adventure.
- I would like to bring my wife and my children on a holiday; it should not be too expensive. I prefer mountainous places not too far from home. Children's parks, easy paths and good cuisine are a must.
- I want to experience contact with a completely different culture. I would like to be fascinated by the people and learn to look at my life in a totally different way.
4What book should I buy?
5What news should I read?
6What paper should I read?
7A Solution
8Information Overload
- Internet information overload, i.e., the state of having too much information to make a decision or remain informed about a topic
- Information retrieval technologies can assist a user in locating content if the user knows exactly what he is looking for (with some difficulties!)
- The user must be able to say "yes, this is what I need" when presented with the right result
- But in many information search tasks, e.g., product selection, the user
  - is not aware of the range of available options
  - may not know what to search for
  - if presented with some results, may not be able to choose
9Decisions
- When there are 100,000,000 options it is obvious we need tools for searching, filtering and ranking them
- But even when there are a few dozen options we need support
- Examples
- Where to go for dinner tonight?
- What flight for going to London?
- What Digital SLR camera should I buy?
10Non-personalized tools
- A printed catalogue of products, e.g., of books, clothes or travels
- A shop window
- A car exhibition in a car shop
- A movie finder database (search by title or actor)
- A podcast directory
11Personalized tools
- A printed catalogue of products, e.g., of books, clothes or travels
- THAT SHOWS ONLY THE PRODUCTS THAT YOU'LL LIKE TO HAVE
- A shop window
- THAT SHOWS IN THE BEST PLACES THE PRODUCTS YOU'RE SEARCHING FOR
- A car exhibition in a car shop
- THAT HAS EXACTLY THE MODELS YOU THOUGHT OF BUYING
- A movie finder database (search by title or actor)
- THAT DOES NOT SHOW THE MOVIES YOU HATE
- A podcast directory
- THAT LISTS ONLY PODCASTS YOU LIKE
12Personalization
- Personalization is the ability to provide content and services tailored to individuals based on knowledge about their preferences and behavior
- Personalization is the capability to customize customer communication based on knowledge of preferences and behaviors at the time of interaction with the customer
- Personalization is about building customer loyalty by creating a meaningful one-to-one relationship: understanding the needs of each individual and helping satisfy a goal that efficiently and knowledgeably addresses each individual's need in a given context
13Personalization Goals
- Provide users with what they want or need without requiring them to ask explicitly
- Not a fully automated process
- Different approaches require different levels of user involvement and human-computer interactivity
- Personalization techniques try to leverage all available information about users to deliver a personal experience
14Personalization
- Context
  - pre-travel vs. during travel
  - on-the-net, on-the-move, on-the-tour
  - traveling, wandering and visiting
  - environment: in the hotel, train, car, airplane, at the conference
  - Time/Space coordinates
  - Business vs. Fun
- User: knowledge, experience, budget, travel party, cognitive capabilities, motivations (security, variety, fun, etc.), age, language, disabilities
- Need
  - buy a complete travel package
  - choose a restaurant
  - collect information on a location
  - find the route
  - find the means of transportation
  - communicate with other travelers
- Device/Capabilities: personal computer, PDA, smart phone; phone payment, movies, broadcast, instant messaging, email, position, etc.
15Suppliers' Motivations
- Making interactions faster and easier. Personalization increases usability, i.e., how well a web site allows people to achieve their goals.
- Increasing customer loyalty. A user should be loyal to a web site which, when visited, recognizes the returning customer and treats him as a valuable visitor.
- Increasing the likelihood of repeated visits. The longer the user interacts with the site, the more refined the user model maintained by the system becomes, and the more effectively the web site can be customized to match user preferences.
- Maximizing the look-to-buy ratio. This becomes the look-to-book ratio in the travel and tourism industry, where it is the essential indicator of personalization objectives.
16Recommender Systems
- A recommender system helps people make choices without sufficient personal experience of the alternatives
- To suggest products to customers
- To provide consumers with information to help them decide which products to purchase
- They are based on a number of technologies: information filtering, machine learning, adaptive and personalized systems, user modeling, ...
17Examples
- Some examples found on the Web
- Amazon.com looks at the user's past buying history and recommends products bought by users with similar buying behavior
- Tripadvisor.com quotes product reviews from a community of users
- Activebuyersguide.com asks questions about the benefits sought in order to reduce the number of candidate products
- Trip.com asks questions and exploits them to constrain the search (exploits standardized profiles)
- Smarter Kids: self-selection of a user profile; classification of products into user profiles
18Core Recommendation Techniques
U is a set of users; I is a set of items/products
19Evaluating Recommender Systems
- The majority of evaluations have focused on the system's accuracy in supporting the "find good items" user task
- Assumption: if a user could examine all available items, he could place them in an ordering of preference
- Measure how well the system predicts the exact rating value (value comparison)
- Measure how well the system can predict whether an item is relevant or not (relevant vs. not relevant)
- Measure how close the predicted ranking of items is to the user's true ranking (ordering comparison)
20How Accuracy Has Been Measured
- Split the available data (so you need to collect data first!), i.e., the user-item ratings, into two sets: training and test
- Build a model on the training data
- For instance, in nearest-neighbor (memory-based) CF, simply put the training ratings in a separate set
- Compare the predicted rating for each test item (user-item combination) with the actual rating stored in the test set
- You need a metric to compare the predicted and true ratings
21Accuracy Comparing Values
- Measure how close the recommender system's predicted ratings are to the true user ratings (for all the ratings in the test set)
- Predictive accuracy (rating): Mean Absolute Error, MAE = (1/N) Σi |pi − ri|, where pi is the predicted rating and ri is the true one
- Variation 1: Mean Squared Error (take the square of the differences) and Root Mean Squared Error (then take the square root). These emphasize large errors.
- Variation 2: Normalized MAE, i.e., MAE divided by the range of possible ratings, allowing comparison of results on data sets with different rating scales
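The three metrics above can be sketched directly from their definitions; the rating values below are made up for illustration.

```python
# Sketch of the accuracy metrics on slide 21: MAE, RMSE and normalized MAE.
import math

def mae(predicted, true):
    return sum(abs(p - r) for p, r in zip(predicted, true)) / len(true)

def rmse(predicted, true):
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, true)) / len(true))

def nmae(predicted, true, r_min, r_max):
    # Dividing by the rating range makes scores comparable across data sets
    # that use different rating scales.
    return mae(predicted, true) / (r_max - r_min)

predicted = [4.2, 3.0, 5.0, 2.5]
true      = [4,   3,   4,   1]

print(mae(predicted, true))         # average absolute error
print(rmse(predicted, true))        # emphasizes the large errors
print(nmae(predicted, true, 1, 5))  # on a 1-5 star scale
```

Note how RMSE exceeds MAE here precisely because of the single large error on the last item.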
22Precision and Recall
- Precision is the ratio of relevant items selected by the recommender to the number of items selected (Nrs/Ns)
- Recall is the ratio of relevant items selected to the number of relevant items (Nrs/Nr)
- Precision and recall are the most popular metrics for evaluating information retrieval systems
23Precision and Recall Example
- We assume to know the relevance of all the items in the catalogue for a given user
- The orange portion is the one recommended by the system
- P = 4/7 = 0.57, R = 4/9 = 0.44
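The example's numbers can be reproduced with a minimal sketch: 9 relevant items in the catalogue, 7 recommended, 4 of the recommendations relevant (the item names are made up).

```python
# Precision and recall as defined on slide 22: Nrs/Ns and Nrs/Nr.
def precision_recall(recommended, relevant):
    hits = len(set(recommended) & set(relevant))        # Nrs
    return hits / len(recommended), hits / len(relevant)

relevant    = ["i1", "i2", "i3", "i4", "i5", "i6", "i7", "i8", "i9"]
recommended = ["i1", "i2", "i3", "i4", "x1", "x2", "x3"]  # 4 hits out of 7

p, r = precision_recall(recommended, relevant)
print(round(p, 2), round(r, 2))  # 0.57 0.44
```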
24Rank Accuracy Metrics
- Rank is the position in the sorted list
- Spearman's rank correlation (ui and vi are the ranks of item i in the user order and the system order)
- Kendall's Tau (C is the number of concordant pairs, i.e., pairs in the same order in both ranked lists; D is the number of discordant pairs; TR is the number of tied pairs (same rank) in the user order; TP is the number of tied pairs in the predicted order)
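Both metrics can be computed directly from their definitions; the sketch below uses the tie-free simple forms (so the TR/TP terms drop out) on made-up rankings.

```python
# Spearman's rho and Kendall's tau from slide 24, for rankings without ties.
def spearman(user_ranks, system_ranks):
    # rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), d_i = rank difference
    n = len(user_ranks)
    d2 = sum((u - v) ** 2 for u, v in zip(user_ranks, system_ranks))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def kendall_tau(user_ranks, system_ranks):
    # tau = (C - D) / (n * (n - 1) / 2), C concordant and D discordant pairs
    n = len(user_ranks)
    c = d = 0
    for i in range(n):
        for j in range(i + 1, n):
            agree = (user_ranks[i] - user_ranks[j]) * (system_ranks[i] - system_ranks[j])
            if agree > 0:
                c += 1
            elif agree < 0:
                d += 1
    return (c - d) / (n * (n - 1) / 2)

# Item ranks in the user's true ordering vs. the system's predicted ordering.
user   = [1, 2, 3, 4, 5]
system = [2, 1, 3, 4, 5]  # the system swaps the top two items

print(spearman(user, system))     # 0.9
print(kendall_tau(user, system))  # 0.8
```

A single adjacent swap costs little under both metrics; swapping distant items would be penalized much more by Spearman, which squares the rank differences.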
25Core Recommendation Techniques
U is a set of users; I is a set of items/products
26MovieLens
http://movielens.umn.edu
29The CF Ingredients
- A list of m users and a list of n items
- Each user has a list of items he/she has expressed an opinion about (can be an empty set)
- Explicit opinion: a rating score on a numerical scale
- Sometimes the rating is implicit: purchase records
- Active user: the user for whom the CF prediction task is performed
- A metric for measuring similarity between users
- A method for selecting a subset of neighbors for prediction
- A method for predicting a rating for items not currently rated by the active user
30Collaborative-Based Filtering
- The collaborative-based filtering recommendation technique proceeds in these steps:
- For a target/active user (the user for whom a recommendation has to be produced) the set of his ratings is identified
- The users most similar to the target/active user (according to a similarity function) are identified (neighborhood formation)
- The products bought by these similar users are identified
- For each of these products a prediction is generated of the rating that the target user would give to the product
- Based on the predicted ratings, a set of top-N products is recommended
31Nearest Neighbor Collaborative-Based Filtering
User Model: interaction history
32Collaborative-Based Filtering
- A collection of users ui, i = 1, …, n, and a collection of products pj, j = 1, …, m
- An n × m matrix of ratings vij, with vij = ? if user i did not rate product j
- The prediction for user i and product j is computed as a correlation-weighted combination of the other users' ratings for product j
- The correlation factor between two users i and k can be computed with the Pearson coefficient over their co-rated products
- The sums (and averages) are over the j such that vij and vkj are not ?
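The two formulas referenced on this slide are not reproduced in the transcript; in the standard memory-based formulation they are p_ij = v̄_i + κ Σ_k w_ik (v_kj − v̄_k), with w_ik the Pearson correlation between users i and k and κ a normalizing factor. A minimal sketch under that assumption, with made-up ratings:

```python
# Nearest-neighbor CF prediction: the predicted rating is the active user's
# mean plus a correlation-weighted average of the other users' deviations
# from their own means. The ratings below are made up.
import math

ratings = {  # user -> {product: rating}; missing products were not rated ("?")
    "alice": {"a": 5, "b": 3, "c": 4},
    "bob":   {"a": 4, "b": 2, "c": 5, "d": 4},
    "carol": {"a": 1, "b": 5, "d": 2},
}

def pearson(u, k):
    common = set(ratings[u]) & set(ratings[k])  # products rated by both users
    if len(common) < 2:
        return 0.0
    mu_u = sum(ratings[u][j] for j in common) / len(common)
    mu_k = sum(ratings[k][j] for j in common) / len(common)
    num = sum((ratings[u][j] - mu_u) * (ratings[k][j] - mu_k) for j in common)
    den = math.sqrt(sum((ratings[u][j] - mu_u) ** 2 for j in common) *
                    sum((ratings[k][j] - mu_k) ** 2 for j in common))
    return num / den if den else 0.0

def predict(u, j):
    mu_u = sum(ratings[u].values()) / len(ratings[u])
    num = den = 0.0
    for k in ratings:
        if k != u and j in ratings[k]:
            w = pearson(u, k)
            mu_k = sum(ratings[k].values()) / len(ratings[k])
            num += w * (ratings[k][j] - mu_k)
            den += abs(w)  # kappa = 1 / sum of |weights| normalizes the sum
    return mu_u + num / den if den else mu_u

print(predict("alice", "d"))  # the rating alice is predicted to give "d"
```

Note that carol's correlation with alice is negative, so carol's below-average rating for "d" actually pushes alice's prediction up.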
33Collaborative-Based Filtering
- Pros: requires minimal knowledge-engineering effort (knowledge-poor)
- Users and products are symbols, without any internal structure or characteristics
- Cons:
  - Requires a large number of explicit and reliable ratings to bootstrap
  - Requires products to be standardized (users should have bought exactly the same product)
  - Assumes that prior behavior determines current behavior, without taking into account contextual (session-level) knowledge
  - Does not provide information about products or explanations for the recommendations
  - Does not support sequential decision making or the recommendation of good bundles, e.g., a travel package
34Problems of CF Sparsity
- Typically we have large product sets and user ratings for only a small percentage of them
- Example: Amazon has millions of books, and a user may have bought hundreds of them
- The probability that two users who have each bought 100 books have a book in common (in a catalogue of 1 million books) is about 0.01 (with 50 books each and 10 million books it is about 0.0002)
- We must have a number of users comparable to one tenth of the size of the product catalogue
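The slide's figures can be checked directly. Assuming each user's books are drawn uniformly at random from the catalogue, the chance that two baskets of size b from a catalogue of size n intersect is 1 − C(n−b, b)/C(n, b):

```python
# Quick check of the sparsity numbers on slide 34, assuming the two users
# pick their books uniformly at random from the catalogue.
from math import comb

def p_overlap(n_catalogue, b_books):
    # P(at least one common book) = 1 - C(n - b, b) / C(n, b)
    return 1 - comb(n_catalogue - b_books, b_books) / comb(n_catalogue, b_books)

print(p_overlap(1_000_000, 100))   # ~0.01, as claimed
print(p_overlap(10_000_000, 50))   # ~0.00025, the slide rounds to 0.0002
```

For b much smaller than n this is well approximated by b²/n, which is where the "one user per ten products" rule of thumb comes from.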
35Problems of CF Scalability
- Nearest-neighbor algorithms require computation that grows with both the number of customers and the number of products
- With millions of customers and products, a web-based recommender will suffer serious scalability problems
- The worst-case complexity is O(mn) (m customers and n products)
- But in practice the complexity is O(m + n), since for each customer only a small number of products is considered (one loop over the customers to compute similarities and one over the products to compute the prediction)
36Personalised vs Non-Personalised CF
- Collaborative-based recommendations are personalized, since the prediction is based on the ratings (for a given item) expressed by similar users
- A non-personalized collaborative-based recommendation can be generated by averaging the recommendations of ALL the users
- How would the two approaches compare?
37Personalised vs Non-Personalised CF
Not much difference indeed!
vij is the rating of user i for product j and vj
is the average rating for product j
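The non-personalized baseline being compared here is simply the per-item average rating vj over all users who rated item j; a minimal sketch with made-up ratings:

```python
# Non-personalized "collaborative" prediction from slide 37: every user gets
# the plain average rating of product j. The ratings below are made up.
ratings = {  # user -> {product: rating}
    "alice": {"a": 5, "b": 3},
    "bob":   {"a": 4, "b": 2, "d": 4},
    "carol": {"a": 1, "d": 2},
}

def item_average(j):
    scores = [r[j] for r in ratings.values() if j in r]
    return sum(scores) / len(scores)

print(item_average("a"))  # (5 + 4 + 1) / 3
print(item_average("d"))  # (4 + 2) / 2
```

Every user receives the same prediction for a given item, which is exactly what makes this baseline non-personalized.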
38Core Recommendation Techniques
U is a set of users; I is a set of items/products
39Content-Based Recommendation
- In content-based recommendation, the system tries to recommend items similar to those a given user has liked in the past
- In contrast, in collaborative recommendation the system identifies users whose tastes are similar to those of the given user and recommends items they have liked
- A pure content-based recommender system makes recommendations for a user based solely on a profile built by analyzing the content of items which that user has rated in the past
40Content-Based Recommender
- It is mainly used for recommending text-based products (web pages, Usenet news messages, ...)
- The items to recommend are described by their associated features (e.g., keywords)
- The user model can be structured in a similar way as the content: for instance, the features/keywords most likely to occur in the preferred documents (lazy approach)
- Then, text documents can be recommended based on a comparison between their content (words appearing in the text) and the user model (a set of preferred words)
- The user model can also be a classifier based on any technique (neural networks, naïve Bayes, C4.5, ...)
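The "lazy" approach above can be sketched in a few lines: the user model is the bag of keywords from liked documents, and new documents are scored by cosine similarity against it. The documents here are made up.

```python
# Lazy content-based recommendation from slide 40: keyword-count profile of
# liked documents, cosine matching of candidate documents against it.
import math
from collections import Counter

def words(text):
    return text.lower().split()

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values())) *
           math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

liked = ["digital camera review lens", "camera lens zoom review"]
profile = Counter(w for doc in liked for w in words(doc))  # the user model

candidates = ["new camera lens announced", "pasta recipe with tomato"]
scores = {doc: cosine(profile, Counter(words(doc))) for doc in candidates}

best = max(scores, key=scores.get)
print(best)  # the camera document matches the profile
```

A real system would weight terms by TF-IDF rather than raw counts, but the matching step is the same.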
41Syskill & Webert
- Assists a person in finding information that satisfies long-term, recurring goals (e.g., digital photography)
- Feedback on the interestingness of a set of previously visited sites is used to learn a profile
- The profile is used to predict the interestingness of unseen sites
42Supported Interaction
- The user identifies a topic (e.g., biomedical) and a page with many links to other pages on the selected topic (an index page)
- The user can then explore the Web with a browser that, in addition to showing a page:
  - offers a tool for collecting user ratings on displayed pages
  - suggests which links on the current page are (estimated to be) interesting
43Syskill & Webert User Interface
(screenshot: pages the user indicated interest in, pages the user indicated no interest in, and the system's predictions)
44Learning
- A document (HTML page) is described as a set of Boolean features (a word is present or not)
- They used a Bayesian classifier (one per user), where the probability that a document w1 = v1, …, wn = vn (e.g., car = 1, story = 0, …, price = 1) belongs to a class (cold or hot) is computed with Bayes' rule
- Both P(wj = vj | C = hot) (i.e., the probability that in the set of documents liked by the user the word wj is present or not) and P(C = hot) are estimated from the training data
- After training on 30-40 examples it can predict hot/cold with an accuracy between 70% and 80%
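A classifier of this kind (naive Bayes over Boolean word features) can be sketched as below. The vocabulary, pages and labels are made up, and Laplace smoothing is added so unseen feature values do not zero out the product, which the slide does not specify.

```python
# Naive Bayes over Boolean word features, in the style of slide 44:
# P(C) * product over words of P(w_j = v_j | C), computed in log space.
import math
from collections import defaultdict

vocab = ["camera", "lens", "pasta", "recipe"]

train = [  # (set of words present in the page, label)
    ({"camera", "lens"}, "hot"),
    ({"camera"}, "hot"),
    ({"pasta", "recipe"}, "cold"),
]

def fit(train):
    prior = defaultdict(int)                         # class -> #examples
    counts = defaultdict(lambda: defaultdict(int))   # class -> word -> #present
    for page_words, label in train:
        prior[label] += 1
        for w in vocab:
            counts[label][w] += int(w in page_words)
    return prior, counts

def predict(page_words, prior, counts):
    total = sum(prior.values())
    best, best_score = None, float("-inf")
    for c in prior:
        score = math.log(prior[c] / total)           # log P(C)
        for w in vocab:
            p_present = (counts[c][w] + 1) / (prior[c] + 2)  # Laplace smoothing
            score += math.log(p_present if w in page_words else 1 - p_present)
        if score > best_score:
            best, best_score = c, score
    return best

prior, counts = fit(train)
print(predict({"camera", "lens"}, prior, counts))  # hot
print(predict({"recipe"}, prior, counts))          # cold
```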
45Problems of Content-Based Recommenders
- Only a very shallow analysis of certain kinds of content can be performed
- Some kinds of items are not amenable to any feature extraction method with current technologies (e.g., movies, music)
- Even for texts (such as web pages), IR techniques cannot consider multimedia information, aesthetic qualities, download time
- Hence, if you rate a page positively, this may not be related to the presence of certain keywords!
46Problems of Content-Based Recommenders
- Over-specialization: the system can only recommend items scoring highly against the user's profile; the user is recommended items similar to those already rated
- Requires user feedback: the pure content-based approach (similarly to CF) requires user feedback on items in order to provide meaningful recommendations
- It tends to recommend expected items: this tends to increase trust, but can make the recommendations not very useful (no serendipity)
- Works better in those situations where the products are generated dynamically (news, email, events, etc.) and there is a need to check whether these items are relevant or not
47Core Recommendation Techniques
U is a set of users; I is a set of items/products
48Demographic Methods
- Aims to categorize the user based on personal attributes and make recommendations based on demographic classes
- Demographic groups can come from marketing research; hence experts decide how to model the users
- Demographic techniques form people-to-people correlations
49Demographic-based personalization
50Demographic-based personalization
51Demographic Methods (more sophisticated)
- Demographic features are generally asked for explicitly
- But they can also be induced by classifying a user from other user descriptions (e.g., the home page); you need some users for whom you know the class (e.g., male/female)
- Prediction can use whatever learning mechanism we like (nearest neighbor, naïve Bayes classifier, etc.)
52Core Recommendation Techniques
U is a set of users; I is a set of items/products
53Utility methods
- A utility function is a map from a state onto a real number, which describes the associated degree of happiness
- One can build a long-term utility function, but more often systems using this approach try to acquire a short-term utility function
- They must acquire the user's utility function, or the parameters defining such a function
54Utility related information
55Utility
- The item is described by a list of numerical attributes p1, …, pm, e.g., number of rooms, square meters, levels, (MaxCost − Cost), …
- It is generally assumed that higher values of an attribute correspond to higher utilities
- The user is modeled with a set of weights u1, …, um (in [0, 1]) on the same attributes
- The objective is to find (retrieve) the products with the largest (maximal) utility
- The problem is the elicitation or learning of the user model u1, …, um
56Utility and similarity
- If the user has some preferred values q1, …, qm for the attributes, one can substitute for the value of the product a local similarity function sim(qj, pj)
- A typical local similarity function is (1 − |qj − pj| / rangej), where rangej is the difference between the max and min value of attribute j
- A utility-based recommender becomes a similarity-maximization recommender (or a nearest-neighbor recommender)
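The two slides above combine into a short sketch: the score is a weighted sum over attributes, and with preferred values qj each attribute value is replaced by the local similarity 1 − |qj − pj| / rangej. The hotel attributes, weights and preferences below are made up.

```python
# Utility-as-similarity-maximization from slides 55-56.
hotels = {  # name -> (rooms, distance_km, price_eur)
    "alpina":   (3, 1.0, 80),
    "centro":   (2, 0.2, 120),
    "panorama": (4, 5.0, 60),
}
ranges    = (2, 4.8, 60)      # max - min of each attribute over the catalogue
weights   = (0.2, 0.5, 0.3)   # user model u_1..u_m, each in [0, 1]
preferred = (3, 0.5, 70)      # the user's preferred values q_1..q_m

def local_sim(q, p, rng):
    # 1 when the product matches the preference exactly, lower otherwise
    return 1 - abs(q - p) / rng

def utility(item):
    return sum(w * local_sim(q, p, r)
               for w, q, p, r in zip(weights, preferred, item, ranges))

best = max(hotels, key=lambda name: utility(hotels[name]))
print(best)
```

With these weights, distance dominates: the hotel closest to the preferred distance wins even though another hotel is cheaper.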
57Core Recommendation Techniques
U is a set of users; I is a set of items/products
58Knowledge Based Recommender
- Suggests products based on inferences about a user's needs and preferences
- Functional knowledge: about how a particular item meets a particular user need
- The user model can be any knowledge structure that supports this inference:
  - a query
  - a case (in a case-based reasoning system)
  - an adapted similarity metric (for matching)
  - a part of an ontology
- There is a large use of domain knowledge, encoded in a knowledge representation language/approach
59Hybrid Methods
- Try to address the shortcomings of several approaches and produce recommendations using a combination of those techniques
- There is large variability among these hybrid methods; there is no standard hybrid method
- We shall present some of them here, but many will come later
60Fab System
- The user profile is based on content analysis
- Two user profiles are compared to determine similar users
- Then a collaborative-based recommendation is generated
- Users receive items both when they score highly against their own profile and when they are rated highly by a user with a similar profile
- This is a mixed approach
61Fab Architecture
- Collection agents find pages relevant to a specific topic
- Selection agents find pages for a specific user
- The central router forwards pages to those users whose profiles they match above some threshold
62User Feedback
- When the user has requested, received, and looked over the recommendations, they are required to assign a rating (7-point scale)
- Ratings are stored in the personal agent's profile
- Ratings are sent to the collection agents to adapt the user profiles they store
- Highly rated pages are sent to similar users (collaborative)
63Collaboration via Content
- Problem addressed: in a collaborative-based recommender, the products rated by a pair of users may be very few, so the correlation between two users is not reliable
- In collaboration via content, the content-based profile of each user is exploited to detect similarities among users
64Comparison
- Content-based recommendation is done with a Bayes classifier
- Collaborative is standard, using Pearson correlation
- Collaboration via content uses the content-based user profiles
Averaged over 44 users. Precision is computed on the top 3 recommendations: (number of positively rated items in the recommendation list) / 3
65Hybridization Methods
66Weighted
- The score of a recommended item is computed from the results of all of the available recommendation techniques present in the system
- Example 1: a linear combination of recommendation scores
- Example 2: treat the output of each recommender (collaborative, content-based and demographic) as a set of votes, which are then combined in a consensus scheme
- The implicit assumption in this technique is that the relative value of the different techniques is more or less uniform across the space of possible items
- Not true in general: e.g., a collaborative recommender will be weaker for items with a small number of raters
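"Example 1" above can be sketched directly; the component scores, items and weights are made up, and each component score is assumed to be normalized to [0, 1].

```python
# Weighted hybrid from slide 66: a linear combination of the scores produced
# by several component recommenders.
collaborative = {"i1": 0.9, "i2": 0.4, "i3": 0.1}
content_based = {"i1": 0.3, "i2": 0.8, "i3": 0.2}
demographic   = {"i1": 0.5, "i2": 0.5, "i3": 0.9}

components = [(collaborative, 0.5), (content_based, 0.3), (demographic, 0.2)]

def hybrid_score(item):
    return sum(w * scores[item] for scores, w in components)

ranking = sorted(collaborative, key=hybrid_score, reverse=True)
print(ranking)
```

The fixed weights are exactly the "uniform relative value" assumption the slide criticizes; making the weights depend on, say, the number of raters per item would relax it.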
67Switching
- The system uses some criterion to switch between recommendation techniques
- Example: the DailyLearner system uses a content/collaborative hybrid in which a content-based recommendation method is employed first
- If the content-based system cannot make a recommendation with sufficient confidence, then a collaborative recommendation is attempted
- This switching hybrid does not completely avoid the ramp-up problem, since both the collaborative and the content-based systems have the new-user problem
- The main problem of this technique is identifying a GOOD switching condition
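The switching condition can be sketched as a confidence threshold. The two component recommenders below are hypothetical stubs with made-up scores; only the switching logic is the point.

```python
# Switching hybrid from slide 67: use the content-based score unless its
# confidence falls below a threshold, then fall back to collaborative.
def content_based(item):
    # hypothetical component: returns (score, confidence)
    return {"i1": (0.8, 0.9), "i2": (0.5, 0.2)}[item]

def collaborative(item):
    return {"i1": 0.6, "i2": 0.7}[item]

CONFIDENCE_THRESHOLD = 0.5  # the "GOOD switching condition" to be tuned

def switching_score(item):
    score, confidence = content_based(item)
    if confidence >= CONFIDENCE_THRESHOLD:
        return score            # content-based is confident enough
    return collaborative(item)  # otherwise switch to collaborative

print(switching_score("i1"))  # 0.8 (content-based, confident)
print(switching_score("i2"))  # 0.7 (fell back to collaborative)
```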
68Mixed
- Recommendations from more than one technique are presented together
- The mixed hybrid avoids the new-item start-up problem
- It does not get around the new-user start-up problem, since both the content and collaborative methods need some data about user preferences to start up
69Feature Combination
- Achieves the content/collaborative merger by treating collaborative information (the ratings of users) simply as additional feature data associated with each example, and using content-based techniques over this augmented data set
- The feature combination hybrid lets the system consider collaborative data without relying on it exclusively, so it reduces the system's sensitivity to the number of users who have rated an item
- The system has information about the inherent similarity of items that would otherwise be opaque to a collaborative system
70Cascade
- One recommendation technique is employed first to produce a coarse ranking of candidates, and a second technique refines the recommendation from among the candidate set
- Example: EntreeC uses its knowledge of restaurants to make recommendations based on the user's stated interests. The recommendations are placed in buckets of equal preference, and the collaborative technique is employed to break ties
- Cascading allows the system to avoid employing the second, lower-priority technique on items that are already well differentiated by the first
- But it requires a meaningful and constant ordering of the techniques
71Feature Augmentation
- One technique produces a rating or classification of an item, and that information is then incorporated into the processing of the next recommendation technique
- Example: the Libra system makes content-based recommendations of books based on data found on Amazon.com, using a naive Bayes text classifier
- The text data used by the system includes the "related authors" and "related titles" information that Amazon generates using its internal collaborative systems
- Very similar to the feature combination method
- Here the output of one recommender system is used as input for a second RS
- In feature combination, the representations used by the two systems are combined
72Meta-Level
- The model generated by one technique is used as the input for another
- Example: Fab
  - user-specific selection agents perform content-based filtering using Rocchio's method to maintain a term-vector model that describes the user's area of interest
  - collection agents, which gather new pages from the web, use the models of all users in their gathering operations
  - documents are first collected on the basis of their interest to the community as a whole and then distributed to particular users
- Example: collaboration via content; the model generated by the content-based approach is used for representing the users in a collaborative filtering approach
73Summary
- The collaborative-based method is one of the most popular
  - but it suffers from bootstrapping problems
- Content-based methods are well rooted in information retrieval
- Demographic methods are very simple and provide limited personalization (which can sometimes be sufficient)
- Utility-based methods go to the root of the decision problem: how to acquire the utility function?
- Hybrid methods are the most powerful and popular right now; there are plenty of options for hybridization
74New User Problem
- Personalization systems can work only if some information about the target user is available
- No user model, no personalization
- If the required user model data are not available, then try to obtain them via a shortcut:
  - a web of trust to identify similar users in a CF system
  - mining the publications of a user and of similar users helps in defining the topics that are interesting for a target user
- The output is always a sort of similarity function!!!
75New Item Problem
- Not much of a problem for content-based filtering; in fact, this is the main advantage of content-based filtering
- A critical problem for user-user and item-item collaborative filtering
- Solution: exploit similar items
- Example: NutKing/Dietorecs ranking based on double similarity; an implicit rating for a similar item is considered to be assigned to the target item
76Recommendation and User Context
- All RSs should adapt to the user's search context, but some methods cannot cope with that easily (e.g., CF)
- It depends on the definition of context, but in practice this includes:
  - short-term preferences ("tomorrow I want ...")
  - information related to the specific space-time position of the user
  - motivations of the search ("a present for my wife")
  - circumstances ("I've some time to spend here")
  - emotions ("I feel adventurous")
  - availability of data
77Privacy
- Recommender systems are based on user information
- There are laws that impose restrictions on the usage and distribution of information about people
- RSs must cope with these limitations; e.g., distributed recommender systems exchanging user profiles could be impossible for legal reasons!
- RSs must be developed in such a way as to limit the possibility that an attacker could learn personal data about some users
- There is a need to develop techniques that limit the number and type of personal data used in an RS
78Robustness
- The recommender should be robust against attacks aiming at modifying the system so that it recommends a product more often than others
  - shilling
  - nuking
- Some algorithms may be more robust than others
- Content-based methods are not influenced at all by false ratings
79Challenges
- Generic user models (multiple products and tasks)
- Generic recommender systems (multiple products and tasks)
- Distributed recommender systems (user and product data are distributed)
- Portable recommender systems (user data stored on the user's side)
- (User-)configurable recommender systems
- Multi-strategy, adapted to the user
- Privacy-protecting RSs
- Context-dependent RSs
- Emotion- and value-aware RSs
- Trust and recommendations
- Persuasion technologies
- Easily deployable RSs
- Group recommendations
80Challenges
- Interactive Recommendations sequential decision
making - Hybrid recommendation technologies
- Consumer Behavior and Recommender Systems
- Complex product recommendations
- Mobile Recommendations
- Business Models for Recommender Systems
- High risk and value recommender systems
- Recommendation and negotiation
- Recommendation and information search
- Recommendation and configuration
- Listening to customers
- Recommender systems and ontologies
81Thank you for your patience!
85Trip.com
86Trip.com
87Trip.com