Transcript and Presenter's Notes

Title: Recommender Systems Session B


1
Recommender Systems: Session B
  • Robin Burke
  • DePaul University
  • Chicago, IL

2
Roadmap
  • Session A: Basic Techniques I
  • Introduction
  • Knowledge Sources
  • Recommendation Types
  • Collaborative Recommendation
  • Session B: Basic Techniques II
  • Content-based Recommendation
  • Knowledge-based Recommendation
  • Session C: Domains and Implementation I
  • Recommendation domains
  • Example Implementation
  • Lab I
  • Session D: Evaluation I
  • Evaluation
  • Session E: Applications
  • User Interaction
  • Web Personalization
  • Session F: Implementation II
  • Lab II

3
Content-Based Recommendation
  • Collaborative recommendation
  • requires only ratings
  • Content-based recommendation
  • all techniques that use properties of the items
    themselves
  • usually refers to techniques that only use item
    features
  • Knowledge-based recommendation
  • a sub-type of content-based
  • in which we apply knowledge
  • about items and how they satisfy user needs

4
Content-Based Profiling
  • Suppose we have no other users
  • but we know about the features of the items rated
    by the user
  • We can imagine building a profile based on user
    preferences
  • here are the kinds of things the user likes
  • here are the ones he doesn't like
  • Usually called content-based recommendation

5
Recommendation Knowledge Sources Taxonomy
[Diagram: taxonomy of recommendation knowledge sources]
  • Collaborative: Opinion Profiles, Demographic Profiles
  • User: Opinions, Demographics, Requirements (Query, Constraints, Preferences)
  • Content: Item Features, Domain Knowledge (Means-ends, Feature Ontology,
    Contextual Knowledge, Domain Constraints)
6
Content-based Profiling
[Diagram: classification pipeline over items described by features a1, a2, ..., ak]
  • To find relevant items
  • obtain the user's rated items
  • build a classifier from them
  • predict Y (liked) or N (not liked) for each unrated item
  • recommend the items predicted Y
7
Origins
  • Began with earliest forms of user models
  • Grundy (Rich, 1979)
  • Elaborated in information filtering
  • Selecting news articles (Dumais, 1990)
  • More recently spam filtering

8
Basic Idea
  • Record user ratings for items
  • Generate a model of user preferences over
    features
  • Give as recommendations other items with similar
    content

9
Movie Recommendation
  • Predictions for unseen (target) items are
    computed based on their similarity (in terms of
    content) to items in the user profile.
  • E.g., user profile Pu contains a set of rated movies;
    items whose content closely matches them are recommended
    highly, weaker matches only mildly
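A minimal sketch of this matching step (Python; the feature vectors
and item names are hypothetical, and cosine similarity is just one
common choice of content similarity):

    import math

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0

    # Hypothetical binary feature vectors (e.g., genre/actor indicators)
    profile = [1, 1, 0, 1, 0]   # aggregated from the user's liked items
    candidates = {"movie_x": [1, 1, 0, 0, 0],
                  "movie_y": [0, 0, 1, 0, 1]}

    scores = {m: cosine(profile, v) for m, v in candidates.items()}
    # Higher score = stronger recommendation
    print(sorted(scores.items(), key=lambda kv: -kv[1]))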

10
Content-Based Recommender Systems
11
Personalized Search
  • How can the search engine determine the user's
    context?
  • Query: "Madonna and Child"
  • Need to learn the user profile
  • User is an art historian?
  • User is a pop music fan?

12
Play List Generation
  • Music recommendations
  • Configuration problem
  • Must take into account other items already in the
    list

Example: Pandora
13
Algorithms
  • kNN
  • Naive Bayes
  • Neural networks
  • Any classification technique can be used

14
Naive Bayes
  • p(A): probability of event A
  • p(A,B): probability of event A and event B
  • joint probability
  • p(A|B): probability of event A given event B
  • we know B happened
  • conditional probability
  • Example
  • A is a student getting an "A" grade
  • p(A) = 20%
  • B is the event of a student coming to less than
    50% of meetings
  • p(A|B) is much less than 20%
  • p(A,B) would be the probability of both things
  • how many students are in this category?
  • Recommender system question
  • Li is the event that the user likes item i
  • B is the set of features associated with item i
  • Estimate p(Li|B)

15
Bayes Rule
  • p(A|B) = p(B|A) p(A) / p(B)
  • We can always restate a conditional probability
    in terms of
  • the reverse condition p(B|A)
  • and two prior probabilities
  • p(A)
  • p(B)
  • Often the reverse condition is easier to know
  • we can count how often a feature appears in items
    the user liked
  • frequentist assumption
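A tiny worked instance of the rule (Python; all numbers are
hypothetical, chosen only to illustrate the computation):

    # Hypothetical counts from a user's rating history
    p_L = 0.5           # p(L): user liked half of all rated movies
    p_f_given_L = 0.4   # p(f|L): feature f appears in 40% of liked movies
    p_f = 0.25          # p(f): feature f appears in 25% of all movies

    # Bayes rule: p(L|f) = p(f|L) * p(L) / p(f)
    p_L_given_f = p_f_given_L * p_L / p_f
    print(p_L_given_f)  # 0.8 -- the feature raises the chance of "liked"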

16
Naive Bayes
  • Probability of liking an item given its features
  • p(Li | a1, a2, ..., ak)
  • think of Li as the class for item i
  • By the theorem
  • p(Li | a1, ..., ak) = p(a1, ..., ak | Li) p(Li) / p(a1, ..., ak)

17
Naive Assumption
  • Independence
  • the features a1, a2, ... , ak are independent
  • independent means
  • p(A,B) = p(A)p(B)
  • Example
  • two coin flips: P(heads) = 0.5
  • P(heads,heads) = 0.25
  • Anti-example
  • appearance of the words "Recommendation" and
    "Collaborative" in papers by Robin Burke
  • P("Recommendation") = 0.6
  • P("Collaborative") = 0.3
  • P("Recommendation","Collaborative") = 0.3, not 0.18
  • In general
  • this assumption is false for items and their
    features
  • but pretending it is true works well

18
Naive Assumption
  • For joint probability
  • p(a1, ..., ak) = p(a1) p(a2) ... p(ak)
  • For conditional probability
  • p(a1, ..., ak | Li) = p(a1|Li) p(a2|Li) ... p(ak|Li)
  • Bayes' Rule becomes
  • p(Li | a1, ..., ak) = p(Li) p(a1|Li) ... p(ak|Li) / p(a1, ..., ak)

19
Frequency Table
  • Iterate through all examples
  • if example is "liked"
  • for each feature a
  • add one to the cell for that feature under L
  • similarly for ¬L (not liked)

       L    ¬L
  a1
  a2
  ...
  ak
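A minimal sketch of building such a table (Python; the items and
feature names are hypothetical):

    from collections import defaultdict

    # Hypothetical training data: (feature set, liked?) pairs
    rated_items = [({"Pitt", "Willis"}, False),
                   ({"Ford", "Pitt"}, True),
                   ({"Ford"}, True)]

    counts = {True: defaultdict(int), False: defaultdict(int)}
    class_totals = {True: 0, False: 0}

    for features, liked in rated_items:
        class_totals[liked] += 1
        for a in features:
            counts[liked][a] += 1   # one cell per (feature, class)

    # p(a|L) is then estimated as counts[True][a] / class_totals[True]
    print(counts[True]["Ford"] / class_totals[True])  # 1.0 on this toy data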
20
Example
  • Total number of movies: 20
  • 10 liked
  • 10 not liked

21
Classification MAP
  • Maximum a posteriori
  • Calculate the probabilities for each possible
    classification
  • pick the one with the highest probability
  • Examples
  • "12 Monkeys" Pitt Willis
  • p(L12 Monkeys)0.13
  • p(L12 Monkeys)1
  • not liked
  • "Devil's Own" Ford Pitt
  • p(LDevil's Own)0.67
  • p(LDevil's Own)0.53
  • liked

22
Classification LL
  • Log likelihood
  • For two possibilities
  • Calculate probabilities
  • Compute ln(p(Li | a1, ..., ak) / p(¬Li | a1, ..., ak))
  • If > 0, then classify as liked
  • Examples
  • "12 Monkeys": Pitt, Willis
  • ratio = 0.13
  • ln(ratio) = -2.1
  • not liked
  • "Devil's Own": Ford, Pitt
  • p(L | Devil's Own) = 0.67
  • p(¬L | Devil's Own) = 0.53
  • ratio = 1.25
  • ln(ratio) = 0.22
  • liked

23
Smoothing
  • If a feature never appears in a class
  • p(aj|L) = 0
  • that means it will always veto the
    classification
  • Example
  • new movie director
  • cannot be classified as "liked"
  • because there are no liked instances in which he
    is a feature
  • Solution
  • Laplace smoothing
  • add a small constant count to every feature/class
    cell before starting
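Putting the pieces together, a minimal end-to-end sketch (Python;
toy data, with add-one smoothing as one concrete choice of constant):

    import math
    from collections import defaultdict

    # Hypothetical rated items: feature sets plus liked/not-liked labels
    train = [({"Pitt", "Willis"}, False), ({"Ford", "Pitt"}, True),
             ({"Ford"}, True), ({"Willis"}, False)]

    counts = {True: defaultdict(int), False: defaultdict(int)}
    totals = {True: 0, False: 0}
    for feats, liked in train:
        totals[liked] += 1
        for a in feats:
            counts[liked][a] += 1

    def log_odds(features):
        # ln(p(L|a1..ak) / p(notL|a1..ak)) under the naive assumption,
        # smoothed so an unseen feature cannot veto a class
        score = math.log(totals[True] / totals[False])
        for a in features:
            p_given_L = (counts[True][a] + 1) / (totals[True] + 2)
            p_given_not = (counts[False][a] + 1) / (totals[False] + 2)
            score += math.log(p_given_L / p_given_not)
        return score

    print(log_odds({"Ford", "Pitt"}) > 0)  # True  -> classify as liked
    print(log_odds({"Willis"}) > 0)        # False -> not liked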

24
Naive Bayes
  • Works surprisingly well
  • used in spam filtering
  • Simple implementation
  • just counting and multiplying
  • requires O(|F|) space
  • where F is the set of features used
  • easy to update the profile
  • classification is very fast
  • Learned classifier can be hard-coded
  • used in voice recognition and computer games
  • Try this first

25
Neural Networks
26
Biological inspiration
[Diagram: biological neuron with dendrites, axon, and synapses]
Information transmission happens at the synapses.
27
How it works
  • Source (pre-synaptic)
  • Tiny voltage spikes travel along the axon
  • At the axon terminals, neurotransmitter is released
    into the synapse
  • Destination (post-synaptic)
  • Neurotransmitter absorbed by dendrites
  • Causes excitation or inhibition
  • Signals integrated
  • may produce spikes in the next neuron
  • Connections
  • Synaptic connections can be strong or weak

28
Artificial neurons
Neurons work by processing information. They
receive and provide information in form of
voltage spikes.
[Diagram: inputs x1, x2, ..., xn with synaptic weights w1, w2, ..., wn
feeding a single output y]
The McCulloch-Pitts model
29
Artificial neurons
Nonlinear generalization of the McCulloch-Pitts
neuron: y is the neuron's output, x is the vector of
inputs, and w is the vector of synaptic
weights. Examples:
sigmoidal neuron, Gaussian neuron
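A sketch of the usual forms of these two neuron types (Python; the
bias b and width sigma are free parameters):

    import math

    def sigmoidal(x, w, b=0.0):
        # y = 1 / (1 + exp(-(w . x - b)))
        s = sum(wi * xi for wi, xi in zip(w, x)) - b
        return 1.0 / (1.0 + math.exp(-s))

    def gaussian(x, w, sigma=1.0):
        # y = exp(-||x - w||^2 / (2 sigma^2))
        d2 = sum((xi - wi) ** 2 for xi, wi in zip(x, w))
        return math.exp(-d2 / (2.0 * sigma ** 2))

    print(sigmoidal([1, 0], [2.0, -1.0]))  # ~0.88
    print(gaussian([1, 0], [1.0, 0.0]))    # 1.0 at the center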
30
Artificial neural networks
Output
Inputs
An artificial neural network is composed of many
artificial neurons that are linked together
according to a specific network architecture. The
objective of the neural network is to transform
the inputs into meaningful outputs.
31
Learning with Back-Propagation
  • Biological system
  • seems to modify many synaptic connections
    simultaneously
  • we still don't totally understand this
  • A simplification of the learning problem
  • calculate first the changes for the synaptic
    weights of the output neuron
  • calculate the changes backward starting from
    layer p-1, and propagate backward the local error
    terms
  • Still relatively complicated
  • much simpler than the original optimization
    problem

32
Application to Recommender Systems
  • Inputs
  • features of products
  • binary features work best
  • otherwise tricky encoding is required
  • Output
  • liked / disliked neurons

33
NN Recommender
Item Features
Liked
Disliked

  • Calculate recommendation score as y(liked) -
    y(disliked)
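A minimal sketch of such a recommender network (NumPy; the item
features and targets are hypothetical, and squared-error
backpropagation is used as a concrete training choice):

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical binary item-feature rows; targets are
    # [liked, disliked] pairs
    X = np.array([[1, 1, 0], [1, 0, 1], [0, 1, 1], [0, 0, 1]], dtype=float)
    T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    W1 = rng.normal(scale=0.5, size=(3, 4))  # input -> hidden
    W2 = rng.normal(scale=0.5, size=(4, 2))  # hidden -> liked/disliked

    lr = 0.5
    for _ in range(5000):               # many iterations, as noted above
        H = sigmoid(X @ W1)             # forward pass
        Y = sigmoid(H @ W2)
        dY = (Y - T) * Y * (1 - Y)      # output-layer error terms
        dH = (dY @ W2.T) * H * (1 - H)  # propagated back one layer
        W2 -= lr * H.T @ dY
        W1 -= lr * X.T @ dH

    item = np.array([1, 1, 1], dtype=float)   # a new item's features
    y = sigmoid(sigmoid(item @ W1) @ W2)
    print(y[0] - y[1])   # recommendation score: y_liked - y_disliked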

34
Issues with ANN
  • Often many iterations are needed
  • 1000s or even millions
  • Overfitting can be a serious problem
  • No way to diagnose or debug the network
  • must relearn
  • Designing the network is an art
  • input and output coding
  • layering
  • often learning simply fails
  • system never converges
  • Stability vs plasticity
  • Learning is usually one-shot
  • Cannot easily restart learning with new data
  • (Actually many learning techniques have this
    problem)

35
Overfitting
  • The problem of training a learner too much
  • the learner continues to improve on the training
    data
  • but gets worse on the real task

36
Other classification techniques
  • Lots of other classification techniques have been
    applied to this problem
  • support vector machines
  • fuzzy sets
  • decision trees
  • Essentials are the same
  • learn a decision rule over the item features
  • apply the rule to new items

37
Content-Based Recommendation
  • Advantages
  • useful for large information-based sites (e.g.,
    portals) or for domains where items have
    content-rich features
  • can be easily integrated with content servers
  • Disadvantages
  • may miss important pragmatic relationships among
    items (based on usage)
  • avant-garde jazz / classical
  • not effective in small or specialized sites, or sites
    which are not content-oriented
  • cannot achieve serendipity (novel connections)

38
Break
  • 10 minutes

39
Roadmap
  • Session A: Basic Techniques I
  • Introduction
  • Knowledge Sources
  • Recommendation Types
  • Collaborative Recommendation
  • Session B: Basic Techniques II
  • Content-based Recommendation
  • Knowledge-based Recommendation
  • Session C: Domains and Implementation I
  • Recommendation domains
  • Example Implementation
  • Lab I
  • Session D: Evaluation I
  • Evaluation
  • Session E: Applications
  • User Interaction
  • Web Personalization
  • Session F: Implementation II
  • Lab II

40
Knowledge-Based Recommendation
  • Sub-type of content-based
  • we use the features of the items
  • Covers other kinds of knowledge, too
  • means-ends knowledge
  • how products satisfy user needs
  • ontological knowledge
  • what counts as similar in the product domain
  • constraints
  • what is possible in the domain and why

41
Recommendation Knowledge Sources Taxonomy
[Diagram: taxonomy of recommendation knowledge sources]
  • Collaborative: Opinion Profiles, Demographic Profiles
  • User: Opinions, Demographics, Requirements (Query, Constraints, Preferences)
  • Content: Item Features, Domain Knowledge (Means-ends, Feature Ontology,
    Contextual Knowledge, Domain Constraints)
42
Diverse Possibilities
  • Utility
  • some systems concentrate on representing the
    user's constraints in the form of utility functions
  • Similarity
  • some systems focus on detailed knowledge-based
    similarity calculations
  • Interactivity
  • some systems use knowledge to enhance the
    collection of requirement information
  • For our purposes
  • concentrate on case-based recommendation and
    constraint-based recommendation

43
Case-Based Recommendation
  • Based on ideas from case-based reasoning (CBR)
  • An alternative to rule-based problem-solving
  • A case-based reasoner solves new problems by
    adapting solutions used to solve old problems
  • -- Riesbeck & Schank, 1987

44
CBR Solving Problems
[Diagram: CBR cycle. A new problem is matched against the case
database to retrieve similar cases; the retrieved solution is
adapted, reviewed, and retained]
45
CBR System Components
  • Case-base
  • database of previous cases (experience)
  • episodic memory
  • Retrieval of relevant cases
  • index for cases in library
  • matching most similar case(s)
  • retrieving the solution(s) from these case(s)
  • Adaptation of solution
  • alter the retrieved solution(s) to reflect
    differences between new case and retrieved case(s)

46
Retrieval knowledge
  • Contents
  • features used to index cases
  • relative importance of features
  • what counts as similar
  • Issues
  • surface vs deep similarity

47
Analogy to the catalog
  • Problem
  • user need
  • Case
  • product
  • Retrieval
  • recommendation

48
Entree I
49
Entree II
50
Entree III
51
Critiquing Dialog
  • Mixed-initiative interaction
  • user offers input
  • system responds with possibilities
  • user critiques or offers additional input
  • Makes preference elicitation gradual
  • rather than all-at-once with a query
  • can guide user away from empty parts of the
    product space

52
CBR retrieval
  • Knowledge-based nearest-neighbor
  • similarity metric defines distance between cases
  • usually on an attribute-by-attribute basis
  • Entree
  • cuisine
  • quality
  • price
  • atmosphere

53
How do we measure similarity?
  • complex multi-level comparison
  • goal sensitive
  • multiple goals
  • retrieval strategies
  • non-similarity relationships
  • Can be strictly numeric
  • weighted sum of similarities of features
  • local similarities
  • May involve inference
  • reasoning about the similarity of items
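A minimal sketch of a weighted-sum metric built from per-attribute
local similarities (Python; Entree-style attributes, with weights
and local metrics chosen purely for illustration):

    # Local similarity functions, each returning a value in [0, 1]
    def sim_price(a, b, scale=50.0):
        return max(0.0, 1.0 - abs(a - b) / scale)  # closer prices match better

    def sim_exact(a, b):
        return 1.0 if a == b else 0.0              # categorical attributes

    weights = {"price": 0.5, "cuisine": 0.3, "atmosphere": 0.2}

    def similarity(source, target):
        # Global metric = weighted sum of attribute-by-attribute similarities
        return (weights["price"] * sim_price(source["price"], target["price"])
                + weights["cuisine"] * sim_exact(source["cuisine"], target["cuisine"])
                + weights["atmosphere"] * sim_exact(source["atmosphere"], target["atmosphere"]))

    source = {"price": 30, "cuisine": "French", "atmosphere": "quiet"}
    target = {"price": 40, "cuisine": "French", "atmosphere": "lively"}
    print(similarity(source, target))  # 0.5*0.8 + 0.3*1.0 + 0.2*0.0 = 0.7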

54
Price metric
55
Cuisine Metric
[Diagram: cuisine taxonomy with European (French, Nouvelle Cuisine),
Asian (Chinese, Japanese, Vietnamese, Thai), and Pacific New Wave
spanning the two branches]
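One way such an ontology can drive a metric is distance in the tree;
a minimal sketch (Python; the parent links below are an assumption
reconstructed from the diagram):

    parent = {"French": "European", "Nouvelle Cuisine": "French",
              "Chinese": "Asian", "Japanese": "Asian",
              "Vietnamese": "Asian", "Thai": "Asian",
              "European": "Cuisine", "Asian": "Cuisine"}

    def ancestors(node):
        path = [node]
        while node in parent:
            node = parent[node]
            path.append(node)
        return path

    def cuisine_sim(a, b):
        # Similarity grows with the number of shared ancestors
        pa, pb = ancestors(a), ancestors(b)
        return len(set(pa) & set(pb)) / max(len(pa), len(pb))

    print(cuisine_sim("Chinese", "Thai"))              # siblings: high
    print(cuisine_sim("Chinese", "Nouvelle Cuisine"))  # far apart: low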
56
Metrics
  • Goal-specific comparison
  • How similar is target product to the source with
    respect to this goal?
  • Asymmetric
  • directional effects
  • A small number of general-purpose types

57
Metrics
  • If they generate a true metric space
  • approaches using space-partitioning techniques
  • BSP trees, quad-trees, etc.
  • Not always the case
  • Hard to optimize
  • storing n^2 distances vs. recalculating
  • FindMe calculates similarity at retrieval time

58
Combining metrics
  • Global metric
  • combination of attribute metrics
  • Hierarchical combination
  • lower metrics break ties in upper
  • Benefits
  • simple to acquire
  • easy to understand
  • Somewhat inflexible
  • More typical would be a weighted sum
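A minimal sketch of the hierarchical variant, where a lower-priority
metric only breaks ties in the higher one (Python; data and metrics
are hypothetical):

    # Attribute metrics in priority order: cuisine first, then price
    def cuisine_score(item, query):
        return 1.0 if item["cuisine"] == query["cuisine"] else 0.0

    def price_score(item, query, scale=50.0):
        return max(0.0, 1.0 - abs(item["price"] - query["price"]) / scale)

    query = {"cuisine": "Thai", "price": 20}
    items = [{"name": "A", "cuisine": "Thai", "price": 45},
             {"name": "B", "cuisine": "Thai", "price": 25},
             {"name": "C", "cuisine": "French", "price": 20}]

    # Sorting on the tuple lets price break ties among equal cuisine scores
    ranked = sorted(items,
                    key=lambda i: (cuisine_score(i, query), price_score(i, query)),
                    reverse=True)
    print([i["name"] for i in ranked])  # ['B', 'A', 'C']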

59
Constraint-based Recommendation
  • Represent the user's needs as a set of constraints
  • Try to satisfy those constraints with products

60
Example
  • User needs a car
  • Gas mileage > 25 mpg
  • Capacity > 5 people
  • Price < $18,000
  • A solution would be a list of models satisfying
    these requirements
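A minimal sketch of constraint satisfaction over a catalog (Python;
the cars and attribute names are hypothetical):

    cars = [{"model": "X", "mpg": 28, "capacity": 7, "price": 17500},
            {"model": "Y", "mpg": 32, "capacity": 4, "price": 15000},
            {"model": "Z", "mpg": 22, "capacity": 8, "price": 21000}]

    # The user's needs expressed as a set of constraints
    constraints = [lambda c: c["mpg"] > 25,
                   lambda c: c["capacity"] > 5,
                   lambda c: c["price"] < 18000]

    solutions = [c["model"] for c in cars
                 if all(check(c) for check in constraints)]
    print(solutions)  # ['X'] -- the models satisfying every requirement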

61
Configurable Products
  • Constraints important where products are
    configurable
  • computers
  • travel packages
  • business services
  • (cars)
  • The relationships between configurable components
    need to be expressed as constraints anyway
  • e.g., a GT 6800 graphics card needs a power supply >
    300 W

62
Product Space
[Diagram: product space with axes Weight and Screen Size; the
constraints Weight < x and Screen > y bound the region of possible
recommendations]
63
Utility
  • In order to rank products
  • we need a measure of utility
  • can be slack
  • how much the product exceeds the constraints
  • can be another measure
  • price is typical
  • can be a utility calculation that is a function
    of product attributes
  • but generally this is user-specific
  • value of weight vs screen size

64
Product Space
[Diagram: the same product space, with candidate products A, B, and C
inside the feasible region]
65
Utility
  • Slack_A = (X - Weight_A) + (Size_A - Y)
  • not really commensurate
  • Price_A
  • ignores product differences
  • Utility_A = α(X - Weight_A) + β(Size_A - Y) +
    γ(X - Weight_A)(Size_A - Y)
  • usually we ignore γ and treat utilities as
    independent
  • how do we know what α and β are?
  • make assumptions
  • infer from user behavior
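A minimal sketch of ranking by weighted slack (Python; the bounds,
products, and the values of α and β are all hypothetical):

    # Constraint bounds: require Weight < X and Size > Y
    X, Y = 3.0, 13.0
    products = {"A": (2.0, 15.0), "B": (2.8, 14.0), "C": (1.5, 13.5)}

    alpha, beta = 0.5, 0.5   # assumed, or inferred from user behavior

    def utility(weight, size):
        # Weighted slack on each constraint; the γ cross term is ignored
        return alpha * (X - weight) + beta * (size - Y)

    for name, (w, s) in sorted(products.items(),
                               key=lambda kv: -utility(*kv[1])):
        print(name, round(utility(w, s), 2))  # A 1.5, C 1.0, B 0.6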

66
Knowledge-Based Recommendation
  • Hard to generalize
  • Advantages
  • no cold start issues
  • great precision possible
  • very important in some domains
  • Disadvantages
  • knowledge engineering required
  • can be substantial
  • expert opinion may not match user preferences

67
Next
  • Session C
  • 15:00
  • Need laptops
  • Install workspace