Title: IPTV Recommender Systems
1IPTV Recommender Systems
2Agenda
- IPTV architecture
- Recommender algorithms
- Evaluation of different algorithms
3IPTV architecture
Customers
Service Provider
Network Provider
Content Provider
Head end
Set-top-box (decoder)
VOD
4IPTV architecture
- IPTV is a video service supplied by a telecom
service provider that owns the network
infrastructure and controls content distribution
over the broadband network for reliable delivery
to the consumer (generally to the TV/IP STB).
- Services
- Broadcast TV (BTV) services which consist in the
simultaneous reception by the users of a
traditional TV channel, Free-to-air or Pay TV.
BTV services are usually implemented using IP
multicast protocols.
- Video On Demand (VOD) services, which consist in
viewing multimedia contents made available by the
Service Provider, upon request. VOD services are
usually implemented using IP unicast protocols.
5IPTV Platform Now
CUSTOMERS FACE DIFFICULTIES FINDING THE RIGHT
CONTENT
HUNDREDS LIVE CHANNELS
THOUSANDS VOD ITEMS
CUSTOMER PURCHASES
CUSTOMER FRUSTRATION
6IPTV Platform with a recommender system
From this.
Today recommendations, based on your personal
taste, are
To this.
7Recommender System how it works
USER DATA
USERS' TASTE, FRUITIONS AND RATINGS
CONTENT METADATA
RECOMMENDER SYSTEM
CONTENT RECOMMENDATIONS
8Benefits for the Provider
- Increased revenues
- Increase customer spontaneous purchases
- Attract new customers
- 35% of product sales result from the recommendation
system (source: Amazon.com)
- Improved customer satisfaction and retention
- Personalize IPTV experience
- Strengthen customer ties
- Optimized business with insights into customer
tastes
- Marketing and targeted advertising
- Planning of media content catalogs
9Benefits for the User
- Improved overall usability
- Reduced time to find the right content in VOD
catalogs
- No more user frustration
- Personalized experience
- Recommendations tailored to user preferences
- Suggests unheard-of content fitting customer
taste
- Enhanced collaborative platform
- Create sense of community
- Involve users by collecting their feedback
(ratings)
- Reduce the channel zapping problem
10Watch it VOD Recommendation
Recommend media contents based on the customer's
profile
Average Customers Rating
11Watch it Channel Recommendation
Recommend live channels based on the customer's
profile
12Watch it TV Program Recommendation
Watch the live event
Recommend TV programs based on the customer's profile
Schedule the recording of the recommended TV
program
13Watch it Media Content Rating
Collect the customer's explicit rating right after
content fruition
14Agenda
- IPTV architecture
- Recommender algorithms
- Evaluation of different algorithms
15Recommender systems overview
- Problem
- Users face a large amount of data and get
confused in the retrieval process
- Objectives
- Recommend to users just a list of relevant
items
- Good items
- All good items
[Diagram: Information Retrieval (query-based) vs. Information Filtering (profile-based), placed along two axes: information needs dynamism and information sources dynamism]
16Problem formulation
Recommender
Users ratings
Items metadata
Ranked list
- Item1
- Item2
- Item3
- .
- .
- .
- ItemX
Top N
17Problem solutions
Recommender
Users ratings
Items metadata
Collaborative Filtering
Content-based Filtering
18Recommendation techniques
Recommendation algorithm
Similar Items
Collaborative Filtering
Content-based Filtering
Users with similar taste
User based
Item based
19Collaborative Filtering
User-based: similar users rate an item similarly
Item-based: similar items are rated by a user
similarly
[Figure: user-based example with known ratings (5, 4, 3, 2, 2), one unknown rating (?), and the labels User, Item, Neighborhood]
NB: similarity means correlation
20Collaborative Filtering techniques
- For each user, compute a neighborhood by means of
- Cosine between user vectors (in the item space)
- Pearson correlation coefficient
- ...
- Then, for each unrated item, estimate its
rating based on the ratings given by users in the
neighbourhood.
- Alternative and prominent techniques include SVD
of the user-rating matrix, Bayesian
networks, ...
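The neighborhood step above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the function names, the toy matrix, the choice of two neighbors, and the convention that 0 means "not rated" are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def predict_rating(ratings, user, item, n_neighbors=2):
    """Estimate ratings[user][item] as the similarity-weighted mean of the
    ratings given to `item` by the most similar users.
    Assumption of this sketch: a 0 entry means "not rated"."""
    candidates = []
    for other, row in enumerate(ratings):
        if other == user or row[item] == 0:
            continue
        candidates.append((cosine(ratings[user], row), row[item]))
    candidates.sort(reverse=True)          # most similar users first
    neighborhood = candidates[:n_neighbors]
    weight = sum(sim for sim, _ in neighborhood)
    if weight == 0:
        return 0.0
    return sum(sim * r for sim, r in neighborhood) / weight

# Toy user-item matrix (rows: users, columns: items), hypothetical values.
ratings = [
    [5, 4, 0, 1],
    [5, 5, 4, 1],
    [1, 1, 2, 5],
    [4, 4, 5, 2],
]
estimate = predict_rating(ratings, user=0, item=2)
```

Here the two users most similar to user 0 rated item 2 with 4 and 5, so the estimate falls between those two values.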
21Singular Value Decomposition
$A = U S V^T$, where $A$ is $m \times n$ and $S$ is a diagonal matrix
Rank-$k$ approximation: $A_k = U_k S_k V_k^T$ ($m \times n$)
SVD complexity: $O(\min(nm^2, mn^2))$
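As a small illustration of the rank-k approximation, the sketch below uses NumPy's `numpy.linalg.svd`. The helper name `truncated_svd` and the toy matrix are assumptions made for the example.

```python
import numpy as np

def truncated_svd(A, k):
    """Return U_k, the first k singular values, and V_k^T of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

# Toy m x n matrix (hypothetical values).
A = np.array([
    [5, 4, 0, 1],
    [5, 5, 4, 1],
    [1, 1, 2, 5],
    [4, 4, 5, 2],
], dtype=float)

k = 2
Uk, sk, Vkt = truncated_svd(A, k)
Ak = Uk @ np.diag(sk) @ Vkt              # rank-k approximation, same shape as A

# The Frobenius error of the approximation is determined by the
# singular values that were discarded.
all_singular_values = np.linalg.svd(A, compute_uv=False)
error = np.linalg.norm(A - Ak)
expected_error = np.sqrt(np.sum(all_singular_values[k:] ** 2))
```

Keeping only the k largest singular values gives the best rank-k approximation of A in the Frobenius norm, which is why the two error values above coincide.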
22Collaborative Filtering SVD
- A: users in rows, items in columns
- SVD: $A \rightarrow A_k$
- Pseudo users: $U_k \sqrt{S_k}$
- Pseudo items: $V_k \sqrt{S_k}$
- Similarity: cosine (computed in the latent space)
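A minimal sketch of the pseudo-user and pseudo-item construction, assuming NumPy and a toy block-structured matrix. The names `latent_factors` and `cosine` are illustrative and do not come from the slides.

```python
import numpy as np

def latent_factors(A, k):
    """Return (pseudo_users, pseudo_items) = (U_k sqrt(S_k), V_k sqrt(S_k))."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    root = np.sqrt(s[:k])
    pseudo_users = U[:, :k] * root       # shape (n_users, k)
    pseudo_items = Vt[:k, :].T * root    # shape (n_items, k)
    return pseudo_users, pseudo_items

def cosine(a, b):
    """Cosine similarity between two latent vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Users 0 and 1 rate items 0-1; users 2 and 3 rate items 2-3 (hypothetical).
A = np.array([
    [5, 5, 0, 0],
    [4, 5, 0, 0],
    [0, 0, 5, 4],
    [0, 0, 4, 5],
], dtype=float)

pseudo_users, pseudo_items = latent_factors(A, k=2)
A_k = pseudo_users @ pseudo_items.T      # equals U_k S_k V_k^T
similar = cosine(pseudo_users[0], pseudo_users[1])
dissimilar = cosine(pseudo_users[0], pseudo_users[2])
```

Splitting the singular values evenly between the two factors means their product reproduces the rank-k matrix, while each factor can be compared with cosine in only k dimensions.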
23Folding-in
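- New rows/columns of A are projected (folded-in)
into the existing latent space without computing a
new SVD
- e.g., a new user u
- $\hat{u} = u V_k S_k^{-1}$
[Figure: a new row $u$ of $A_k$ is mapped to a new row $\hat{u}$ of $U_k$; $S_k$ and $V_k$ are unchanged]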
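The folding-in projection can be sketched with NumPy as below. The helper name `fold_in_user` and the toy data are assumptions made for the example. As a sanity check, folding in a row that was already part of A gives back the corresponding row of U_k.

```python
import numpy as np

def fold_in_user(u, Vk, sk):
    """Project a 1 x n rating row into the k-dimensional latent space:
    u_hat = u V_k S_k^{-1}. Assumes the k singular values are non-zero."""
    return (u @ Vk) / sk

# Toy user-item matrix (hypothetical values).
A = np.array([
    [5, 4, 0, 1],
    [5, 5, 4, 1],
    [1, 1, 2, 5],
    [4, 4, 5, 2],
], dtype=float)

k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vk = U[:, :k], s[:k], Vt[:k, :].T

new_user = np.array([5.0, 4.0, 0.0, 0.0])        # a user not present in A
new_user_hat = fold_in_user(new_user, Vk, sk)    # no new SVD is computed

existing_hat = fold_in_user(A[0], Vk, sk)        # should match Uk[0]
```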
24Collaborative Filtering pro cons
- Pro
- There is no need for content
- Cons
- Cold start: we need to have enough users in the
system to find a match.
- Sparsity: when the user/ratings matrix is sparse
it is hard to find a neighbourhood.
- First rater: cannot recommend an item that has
not been previously rated by anyone else
- Popularity bias: cannot recommend items to
someone with unique tastes. Tends to recommend
popular items (dataset coverage)
25Content-based Filtering
Bag of Words (BOW) representation
- Similar items contain the same terms
- The more a term occurs in an item, the more
representative it is
- The more a term occurs in the collection, the
less representative it is (i.e. it is less
important in order to distinguish a specific item)
Word
Item
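The two weighting rules above are commonly implemented as TF-IDF. The slides do not name a specific weighting scheme, so the formula below (raw term count times the log of inverse document frequency) is an assumption, as are the function name and the toy metadata.

```python
import math
from collections import Counter

def tf_idf(items):
    """items: dict item_id -> list of terms.
    Returns dict item_id -> {term: weight}, where
    weight = (occurrences in the item) * log(n_items / items containing the term)."""
    n_items = len(items)
    doc_freq = Counter()
    for terms in items.values():
        doc_freq.update(set(terms))      # count each term once per item
    weights = {}
    for item_id, terms in items.items():
        counts = Counter(terms)
        weights[item_id] = {
            term: count * math.log(n_items / doc_freq[term])
            for term, count in counts.items()
        }
    return weights

# Hypothetical movie metadata already split into terms.
items = {
    "m1": ["movie", "comedy", "comedy", "rome"],
    "m2": ["movie", "comedy", "thriller"],
    "m3": ["movie", "thriller", "crime"],
}
weights = tf_idf(items)
```

In this toy collection "movie" appears in every item and gets weight 0, while "comedy" weighs more in `m1`, where it occurs twice, than in `m2`.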
26Content-based Filtering pro cons
- Pro
- No need for data on other users
- No cold-start or sparsity problems, nor
first-rater problem
- Able to recommend to users with unique tastes
- Able to recommend new and unpopular items
- Can provide explanations about recommended items
- Well-known technology
- Cons
- Requires structured content
- Low efficiency of the BOW (bag of words)
representation
- Very high-dimensional
- Users' tastes must be represented as a function of
the content features to be learnt
- Unable to exploit quality judgments of other users
27Content-based Filtering techniques
User-item similarity
- Typically after a Latent Semantic Analysis
- Reduces space dimensionality
- Enhances BOW representation
28Content-based Filtering Latent Semantic Analysis
[Figure: the $m \times n$ word-by-item matrix is decomposed as $U S V^T$; items are then represented along latent dimensions]
29Content-based Filtering Latent Semantic Analysis
- A ($m \times n$): terms in rows, items in columns
- SVD: $A \rightarrow A_k$
- Pseudo terms: $U_k \sqrt{S_k}$
- Pseudo items: $V_k \sqrt{S_k}$
- Similarity: cosine (computed in the latent space)
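A sketch of how a content-based user-item similarity could be computed after Latent Semantic Analysis, assuming NumPy. The slides do not say how the user profile is built; representing it as the mean of the latent vectors of the items the user watched is an assumption of this example, as are all names and the toy term-by-item matrix.

```python
import numpy as np

def lsa_item_vectors(term_item, k):
    """Pseudo items V_k sqrt(S_k) from a term-by-item matrix."""
    U, s, Vt = np.linalg.svd(term_item, full_matrices=False)
    return Vt[:k, :].T * np.sqrt(s[:k])   # shape (n_items, k)

def cosine(a, b):
    """Cosine similarity between two latent vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Rows: terms, columns: items (hypothetical term counts).
term_item = np.array([
    [2, 1, 0, 0],   # "comedy"
    [1, 2, 0, 0],   # "romance"
    [0, 0, 3, 1],   # "thriller"
    [0, 0, 1, 3],   # "crime"
], dtype=float)

item_vectors = lsa_item_vectors(term_item, k=2)

# Assumed profile: mean latent vector of the items the user watched (item 0).
watched = [0]
profile = item_vectors[watched].mean(axis=0)
scores = [cosine(profile, vector) for vector in item_vectors]
```

With this toy data, item 1 shares its terms with the watched item and scores close to 1, while items 2 and 3 score close to 0.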
30Proposed model
[Figure: binary user-rating matrix (entries 0 or 1)]
- A zero does not necessarily mean that the user does not like an item
- We can replace some zeros with a value from CbF
[Diagram blocks: User ratings, Meta-data, Features extraction, SVD, CbF, CbfSimilarity, Content-based Recommendation, Ratings]
31Proposed model
[Figure: user-rating matrix in which some zeros have been replaced by CbF values between 0 and 1 (e.g. 0.4, 0.5, 0.6, 0.8)]
[Diagram blocks: User ratings, Meta-data, Features extraction, SVD, CbF, CbfSimilarity, Content-based Recommendation, Ratings, CF, SVD, Mixed Recommendation, Recommendations (sorted list)]
32Proposed model
- The result is a modular and flexible model
- CF may use any algorithm
- CbF may use any algorithm as well
[Diagram blocks: Meta-data, Features extraction, SVD, CbF, CbfSimilarity, Content-based Recommendation, Ratings, CF, SVD, Mixed Recommendation, Recommendations (sorted list)]
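One way to read the proposed model is sketched below: zeros in the rating matrix are replaced by content-based scores before the collaborative step runs. The slides do not specify the replacement rule, so the threshold parameter, the function name, and the toy values are assumptions made for the example.

```python
def fill_with_cbf(ratings, cbf_scores, threshold=0.4):
    """Return a copy of `ratings` where each 0 is replaced by the matching
    content-based score when that score is at least `threshold`.
    The threshold rule is an assumption of this sketch."""
    filled = []
    for rating_row, score_row in zip(ratings, cbf_scores):
        filled.append([
            float(r) if r != 0 else (s if s >= threshold else 0.0)
            for r, s in zip(rating_row, score_row)
        ])
    return filled

# Binary user-rating matrix and hypothetical CbF similarity scores.
ratings = [
    [1, 0, 0, 1],
    [0, 1, 0, 0],
]
cbf_scores = [
    [0.9, 0.8, 0.1, 0.7],
    [0.2, 0.9, 0.6, 0.3],
]
mixed = fill_with_cbf(ratings, cbf_scores)
# `mixed` would then be passed to the collaborative filtering step.
```

Because the content-based and collaborative parts only exchange a matrix, either side can be swapped for a different algorithm.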
33Agenda
- IPTV architecture
- Recommender algorithms
- Evaluation of different algorithms
34Recommender architecture
Resources management
Features extraction
Features representation
Items
Storage
Filter
Compute user-item correlation
Items retrieval
Items recommendation
Users management
Infer and learn profile
Interests/tastes representation
Users
35System Architecture
Offline inputs
Real time calls
Content Data/Metadata
Web Services
IPTV Interface Layer H/A
(A)
(B)
Users Data
Plain Http
Offline 24/7 Processing engine
Real-time Recommendation Engine
Users behaviour Fruitions Ratings(?)
EJB
RecommenderRepository
The recommendation process is split into two
steps. (A) Step one is an offline process that
analyzes the rating data and generates a model; it
runs in the background and updates the model on a
regular basis. (B) Step two is a real-time online
process that uses the model built in step one to
respond in real time to recommendation/personalization
queries.
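The offline/online split can be illustrated with the sketch below. It is not the system described in the slides: the class name, the method names, and the item co-occurrence model are assumptions chosen to keep the example small. The point is that the expensive work happens in `rebuild_model`, while `recommend` only reads the precomputed model.

```python
from collections import defaultdict

class TwoPhaseRecommender:
    """Hypothetical two-step recommender: offline model build, online lookup."""

    def __init__(self):
        self.model = {}   # item -> list of (co-rated item, count)

    def rebuild_model(self, user_items):
        """Step (A), offline: count how often two items are rated by the same user."""
        counts = defaultdict(lambda: defaultdict(int))
        for items in user_items.values():
            for a in items:
                for b in items:
                    if a != b:
                        counts[a][b] += 1
        self.model = {
            a: sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
            for a, neighbors in counts.items()
        }

    def recommend(self, seen_items, top_n=3):
        """Step (B), online: score unseen items using only the stored model."""
        scores = defaultdict(int)
        for item in seen_items:
            for other, count in self.model.get(item, []):
                if other not in seen_items:
                    scores[other] += count
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [item for item, _ in ranked[:top_n]]

# Hypothetical viewing data: user -> set of rated items.
user_items = {"u1": {"A", "B", "C"}, "u2": {"A", "B"}, "u3": {"B", "D"}}
recommender = TwoPhaseRecommender()
recommender.rebuild_model(user_items)             # run periodically in background
suggestions = recommender.recommend({"A"}, top_n=2)
```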
36Datasets
Real dataset composed of movies and user
fruitions, plus some extra information
- User-item rating matrix
- 23942 users
- 564 movies
- 56686 ratings
- Movie Meta-data (textual information)
- Title
- Genre
- Director
- Cast
- Duration
37Dataset users
[Figure: plot with axes "# of users" and "# of ratings"]
38Dataset movies
[Figure: plot with axes "# of movies" and "# of ratings"]
39System evaluation (1)
- Consider the user-rating matrix
- Remove some positive ratings (the removed items
are the sample items and the corresponding users
are the sample users)
- Run the algorithm and get a sorted list for the
sample users
[Figure: user-by-item rating matrix before and after the sample ratings are removed]
40System evaluation (2)
- For each sample user i, the local metric is the
position $p_i$ of his sample item inside his own
sorted list
- The closer $p_i$ is to 1, the better the
algorithm is performing for user i.
- For a dataset, the global metric
- Fix a threshold T (top T rated)
- Count the fraction of sample users with $p_i < T$
[Figure: sorted list for user i (itemC, itemD, itemA, itemB, itemH) showing the position $p_i$ of the sample item, next to the user-by-item matrices]
41System evaluation (3)
- N sample users, each one with one sample item
- Fix a threshold T
- Define
- $|\{p_i : p_i < T\}| \to N \Rightarrow p(l) \to 1$
- $|\{p_i : p_i < T\}| \to 0 \Rightarrow p(l) \to 0$
- Leave-one-out
- repeat for all sample users and all sample items
- N = number of non-zero elements in the rating matrix
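The leave-one-out procedure can be sketched as follows. The popularity-based ranker is only a stand-in so that the example runs on its own, since any recommender returning a sorted list could be plugged in. Function names and toy data are assumptions made for the example.

```python
def hit_rate(ratings, rank_items, threshold):
    """Leave-one-out: hide each non-zero rating in turn, ask the recommender
    for a sorted list, and return the fraction of trials in which the hidden
    item lands at a position p_i < threshold (positions start at 1)."""
    hits = 0
    trials = 0
    for user, row in enumerate(ratings):
        for item, value in enumerate(row):
            if value == 0:
                continue
            held_out = [list(r) for r in ratings]
            held_out[user][item] = 0               # remove the sample rating
            ranked = rank_items(held_out, user)
            position = ranked.index(item) + 1
            trials += 1
            if position < threshold:
                hits += 1
    return hits / trials if trials else 0.0

def popularity_ranker(ratings, user):
    """Stand-in recommender: rank the user's unrated items by popularity."""
    n_items = len(ratings[0])
    popularity = [sum(row[j] for row in ratings) for j in range(n_items)]
    unrated = [j for j in range(n_items) if ratings[user][j] == 0]
    return sorted(unrated, key=lambda j: (-popularity[j], j))

# Toy binary rating matrix with N = 5 non-zero elements (hypothetical).
ratings = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
]
top1_rate = hit_rate(ratings, popularity_ranker, threshold=2)
top2_rate = hit_rate(ratings, popularity_ranker, threshold=3)
```

Raising the threshold T can only keep or increase the fraction of hits, which is why results are typically reported as a function of T.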
42User-based collaborative Latent size k
43User-based collaborative number of ratings
Real time parameter
44User-based collaborative number of ratings
45Hybrid Content Based Threshold
46Hybrid Content Based Threshold
47Evaluation
res1A
48Evaluation
res1B
49Evaluation
Res5/6
50Evaluation
res3A
51Evaluation
res3B
52Any questions?