Title: ??a? ?
1 National and Kapodistrian University of Athens
School of Sciences, Department of Informatics and Telecommunications
Postgraduate Studies Program, Diploma Thesis
- A hybrid algorithm for movie recommendations in interactive television systems.
- G?a??a??p????? ?a?a???ta - Spa?? ?????s?a
2 System architecture - Block diagram
[Figure: block diagram. Recoverable labels: Server, Subscriber Profiles/Ratings, Media Content System]
3 System architecture - Server side
- Streaming Server: Apple's open source Darwin Streaming Server, which provides VoD and live streaming services [1].
- Media Content System: the system in which the video files are stored on the streaming server.
- Media Content Management System: assigns a unique code, the CRID (Content Reference IDentifier), to every new movie added to the media content system.
- Subscribers Profiles/Ratings: a database that holds
  - the users' personal details (first name, last name, username, password, age, gender, occupation, and a set of movie genres the user likes), and
  - their ratings.
  - Ratings are given by the users either implicitly or explicitly.
  - It consists of the tables users, profiles, ratings and cbratings.
4 System architecture - Server side
- Profiles Control System: handles user authentication (sign in) and user registration.
- Web Server: Apache Tomcat web server [2].
- Semantics Component: the part of the system that communicates with external sources and retrieves information about the movies. The information source is IMDB [3]. The data retrieved for each movie are the title, its genres, the plot, the leading actors, the year of production, the director, the producer, the screenwriter and the spoken language.
5 System architecture - Server side
- Data Set: the MovieLens dataset [4], used for the experiments and for evaluating the system. It contains 100K ratings given by 943 users for 1682 movies. The dataset consists of 3 tables:
  - users (UserID, Gender, Age, Occupation, ZipCode),
  - movies (MovieID, Title, Genres) and
  - ratings (UserID, MovieID, Rating, Timestamp).
- Recommendation System: a combination of two algorithms into one hybrid that recommends movies to the users. Combining algorithms has been shown to give better results than either algorithm would give on its own. The algorithms combined are collaborative filtering and content based. Along with each movie recommendation, the user also receives an explanation for it.
6 System architecture - Client side
- Proxy Server: the Jetty [5] open source web server was installed to fetch the EPG data from the BBC site.
- Web Browser: Mozilla Firefox 3.0.5 [6].
- Media Player: QuickTime Player Pro 7.5 [7] was embedded in the web browser for video playback. The client controls the media player through the QTJava API [8].
- BBC: provides the BBC TV and Radio schedule for 7 days [9] in TvAnytime format [10, 11].
7 Content based
- A bag-of-words naive Bayesian text classifier was implemented and extended to handle vectors of sets of words [12].
- The representation of a movie is split into fields (slots): leading actors, genre, director, producer, writer, year of production, language.
- A movie can have more than one value in some slots (hence a bag-of-words per slot). Overall, a movie is represented by a vector of bags-of-words.
- This is a probabilistic classification problem with 5 classes: the probability that a movie receives each rating from 1 to 5 is computed.
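The slot-based representation can be illustrated with a minimal sketch. The slot names mirror the fields listed on this slide; the helper `to_bags` and the sample movie record are hypothetical and only show one way to hold a movie as a vector of bags-of-words.

```python
from collections import Counter

# Slot names follow the fields on the slide; everything else is illustrative.
SLOTS = ["cast", "genre", "director", "producer", "writer", "year", "language"]

def to_bags(movie: dict) -> dict:
    """Turn a raw movie record into one bag of words (a Counter) per slot."""
    bags = {}
    for slot in SLOTS:
        values = movie.get(slot, [])
        if not isinstance(values, (list, tuple)):
            values = [values]  # single-valued slot, e.g. the year
        bags[slot] = Counter(str(v) for v in values)
    return bags

# Made-up record: multi-valued slots become bags with several entries.
movie = {
    "cast": ["Actor A", "Actor B"],
    "genre": ["Drama", "Crime"],
    "director": "Director X",
    "year": 1994,
    "language": "English",
}
bags = to_bags(movie)
```

Slots with no data (here `producer` and `writer`) simply yield empty bags, so every movie has the same vector shape.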
8 Content based
- Training phase (learning the user profile)
  - Build the bags-of-words of all slots (all possible words of each slot).
  - For each user:
    - Find the set of N movies the user has rated (training movies).
    - Compute the probability of each class. [equation not preserved]
    - Compute the conditional probability of each word of a slot, given the class and the slot. [equation not preserved] Its terms are the number of words in slot m of the movie and the number of times the word occurs in that slot of the movie.
    - Smooth the zero word probabilities.
9 Content based
- Prediction phase (rating prediction)
  - For all movies in the database:
    - Compute the posterior probability of the movie for every class, using Bayes' rule.
    - Assign the movie to the class c with the highest computed probability (rating c).
  - Build the user-ratings table.
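The training and prediction phases can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: every rated movie counts with weight 1 toward its rating class, Laplace (add-`alpha`) smoothing is used for the word probabilities and also for the class priors, and the names `train`, `predict` and `vocab_sizes` are made up for the example.

```python
import math
from collections import Counter, defaultdict

CLASSES = [1, 2, 3, 4, 5]  # one class per rating value

def train(rated_movies, vocab_sizes, alpha=1.0):
    """rated_movies: list of (bags, rating), bags maps slot -> Counter of words.
    vocab_sizes: slot -> number of distinct words known for that slot."""
    n = len(rated_movies)
    class_count = Counter(r for _, r in rated_movies)
    word_count = {c: defaultdict(Counter) for c in CLASSES}  # class -> slot -> word -> count
    slot_len = {c: Counter() for c in CLASSES}               # class -> slot -> total words
    for bags, r in rated_movies:
        for slot, bag in bags.items():
            word_count[r][slot].update(bag)
            slot_len[r][slot] += sum(bag.values())
    # Assumption: priors are smoothed too, so that log() is always defined.
    prior = {c: (class_count[c] + alpha) / (n + alpha * len(CLASSES)) for c in CLASSES}

    def cond(word, c, slot):
        """Smoothed P(word | class c, slot)."""
        return (word_count[c][slot][word] + alpha) / (slot_len[c][slot] + alpha * vocab_sizes[slot])

    return prior, cond

def predict(bags, prior, cond):
    """Return the rating class with the highest (log) posterior score."""
    scores = {}
    for c in CLASSES:
        score = math.log(prior[c])
        for slot, bag in bags.items():
            for word, k in bag.items():
                score += k * math.log(cond(word, c, slot))
        scores[c] = score
    return max(scores, key=scores.get)

# Toy profile: the user rated one drama with 5 and one horror movie with 1.
training = [({"genre": Counter(["Drama"])}, 5), ({"genre": Counter(["Horror"])}, 1)]
prior, cond = train(training, vocab_sizes={"genre": 2})
predicted = predict({"genre": Counter(["Drama"])}, prior, cond)
```

Working in log space avoids numeric underflow when a movie has many words across its slots.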
10 Content based - Problems
- Limited content analysis: content-based methods are limited by the features associated with the items. If two different items are represented by the same set of features, they are treated as identical.
- Overspecialization: only items that score highly against the user's profile can be recommended -> the user is restricted to recommendations similar to items already rated.
- Cold-Start Problem: the user must rate a sufficient number of items before the system can understand the user's preferences.
11 Collaborative filtering
- Collaborative filtering with cluster-based smoothing [13] was applied.
- It is a memory-based algorithm that uses the Pearson correlation coefficient.
- Summary of the algorithm:
  - Clustering: the users of the dataset are grouped into K clusters with k-means [14], using the Pearson correlation function as the criterion.
    - k-means terminates when the number of users that change cluster between two iterations is minimized.
  - Data smoothing: each user's ratings are filled in for the movies of the dataset that the user has not rated. The new ratings are based on the ratings given by the other users of the same cluster.
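The two building blocks named on this slide can be sketched as below. The Pearson function is the standard coefficient over co-rated items. The smoothing rule (the user's own mean plus the average deviation of cluster peers from their own means) is one common variant in the spirit of [13] and is an assumption here; the slide only states that missing ratings are derived from the other users in the cluster. Function names are illustrative.

```python
from math import sqrt

def pearson(a: dict, b: dict) -> float:
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def smooth(user: dict, cluster: list, items: list) -> dict:
    """Fill the user's missing ratings with: own mean + mean deviation of cluster peers."""
    mean_u = sum(user.values()) / len(user)
    filled = dict(user)
    for item in items:
        if item in user:
            continue  # keep real ratings untouched
        devs = [peer[item] - sum(peer.values()) / len(peer)
                for peer in cluster if item in peer]
        filled[item] = mean_u + (sum(devs) / len(devs) if devs else 0.0)
    return filled

u1 = {"a": 1, "b": 2, "c": 3}
u2 = {"a": 2, "b": 4, "c": 6}
similarity = pearson(u1, u2)  # perfectly correlated toy users
filled = smooth({"a": 4, "b": 2}, [{"a": 5, "b": 2, "c": 5}], ["a", "b", "c"])
```

Using deviations from each peer's mean, rather than raw ratings, compensates for users who rate systematically high or low.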
12 Collaborative filtering
- Neighbor pre-selection: every new user is placed in one of the clusters. First, the user's correlation with all the centroids is computed. The user is then placed in the cluster with which the correlation is highest.
- Nearest-neighbor selection: the set of nearest neighbors consists of the K users of the cluster with whom the new user has the highest similarity.
- Rating prediction: the new user's ratings for all the movies of the dataset are predicted from the ratings of the nearest neighbors.
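Neighbor selection and prediction can be sketched as below. The prediction rule shown (the active user's mean plus a similarity-weighted average of the neighbors' deviations from their own means) is the classic memory-based formula and is assumed here; the slide does not give the exact expression. The names `top_k` and `predict` and all sample values are illustrative.

```python
def top_k(sims: dict, k: int) -> list:
    """Return the k user ids with the highest similarity to the active user."""
    return sorted(sims, key=sims.get, reverse=True)[:k]

def predict(active_mean, neighbors, sims, ratings, means, item):
    """Active user's mean plus the similarity-weighted deviation of the neighbors."""
    rated = [n for n in neighbors if item in ratings[n]]
    den = sum(abs(sims[n]) for n in rated)
    if den == 0:
        return active_mean  # no usable neighbor: fall back to the user's mean
    num = sum(sims[n] * (ratings[n][item] - means[n]) for n in rated)
    return active_mean + num / den

# Toy data: similarities of three cluster members to the active user.
sims = {"u1": 0.9, "u2": 0.5, "u3": 0.1}
ratings = {"u1": {"m": 5}, "u2": {"m": 3}, "u3": {}}
means = {"u1": 4.0, "u2": 3.0, "u3": 2.0}
neighbors = top_k(sims, 2)
estimate = predict(3.0, neighbors, sims, ratings, means, "m")
```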
13 Collaborative filtering - Problems
- Cold-Start Problem: caused by new users who have not rated any item -> the system cannot find similar neighbor users and cannot make predictions.
- First-Rater Problem: caused by new items that have not been rated by any user. CF depends exclusively on the ratings of other users -> the system cannot recommend these items until they have received a sufficient number of ratings.
- Gray Sheep Problem: the user falls on the boundary between the existing user classes -> the user cannot be classified into any class.
- Data Sparsity: arises because users rate only a small number of items.
14 Hybrid approach - Which are the problems to solve?
- The problem with content based is that it recommends only movies similar to those the user has already seen. It also suffers from the cold start problem.
- The issues with collaborative filtering are the cold start, data sparsity and gray sheep problems.
- How can the two be combined so that each one suppresses the problems of the other?
15 Hybrid approach - In which order?
- According to Burke [15], the most effective scheme is the serial one (cascade hybrid).
- The first algorithm places its predictions into a series of ordered classes, indexed 0 to n [16]. Movies that are equally good recommendations fall into the same class.
- The second algorithm refines the results of the first.
- In which order should the two algorithms be combined?
16 Hybrid approach - In which order?
- Suppose collaborative filtering is applied first. We would have to deal with:
  - data sparsity and
  - cold start.
- If content based is applied first, only one problem remains:
  - cold start.
- This is because content based eliminates data sparsity.
- In addition, we placed a list of movie genres that the user can rate on the registration form. This also addresses the cold start problem.
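The chosen order can be sketched as a two-stage pipeline: content-based predictions first fill the gaps in the rating matrix, then collaborative filtering runs on the dense matrix. The function names and the callable signatures are assumptions made for this illustration, and the two toy predictors stand in for the real algorithms.

```python
from typing import Callable, Dict

Ratings = Dict[str, Dict[str, float]]  # user -> movie -> rating

def densify(ratings: Ratings, movies: list,
            cb_predict: Callable[[str, str], float]) -> Ratings:
    """Stage 1: keep real ratings, fill every gap with a content-based prediction."""
    return {
        user: {m: rated[m] if m in rated else cb_predict(user, m) for m in movies}
        for user, rated in ratings.items()
    }

def hybrid_predict(ratings, movies, cb_predict, cf_predict, user, movie):
    """Stage 2: run collaborative filtering on the matrix densified in stage 1."""
    dense = densify(ratings, movies, cb_predict)
    return cf_predict(dense, user, movie)

# Toy stand-ins for the two algorithms (not the real predictors).
def toy_cb(user, movie):
    return 3.0

def toy_cf(dense, user, movie):
    others = [r[movie] for u, r in dense.items() if u != user]
    return sum(others) / len(others)

ratings = {"u1": {"m1": 5.0}, "u2": {"m2": 1.0}}
dense = densify(ratings, ["m1", "m2"], toy_cb)
score = hybrid_predict(ratings, ["m1", "m2"], toy_cb, toy_cf, "u1", "m2")
```

Because stage 1 leaves no empty cells, the collaborative stage never faces a sparse matrix, which is the argument made on this slide for running content based first.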
17 Explanations - Why?
- So that the user knows why the movie was recommended.
- So that the user trusts the system's recommendations.
- So that the user better understands how the system works.
- So that the user does not waste time on movies they will not like.
18 Explanations - Content based
- The explanation consists of the features of the movie that make it similar to the other movies the user has rated positively.
- For the movie recommended by the algorithm, all the conditional probabilities of the movie's words are compared.
- Each slot of the movie can have more than one value (bags-of-words).
- The slot feature whose value has the highest conditional probability is the movie's strong feature (the one with the greatest discriminative power) and is the reason the movie was recommended.
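Picking the strong feature amounts to an argmax over the conditional probabilities of the movie's words. The sketch below assumes a callable `cond(word, rating_class, slot)` that returns such a probability; that signature, the function name and the toy probabilities are all hypothetical.

```python
def strongest_feature(bags, cond, predicted_class):
    """Return (slot, word, probability) for the word with the highest
    conditional probability given the predicted class, or None if no words."""
    best = None
    for slot, words in bags.items():
        for word in words:
            p = cond(word, predicted_class, slot)
            if best is None or p > best[2]:
                best = (slot, word, p)
    return best

# Made-up conditional probabilities for one recommended movie.
toy_probs = {
    ("genre", "Drama"): 0.4,
    ("director", "Director X"): 0.7,
    ("cast", "Actor A"): 0.2,
}

def toy_cond(word, rating_class, slot):
    return toy_probs[(slot, word)]

bags = {"genre": ["Drama"], "director": ["Director X"], "cast": ["Actor A"]}
reason = strongest_feature(bags, toy_cond, 5)
```

Here the explanation shown to the user would name the director, since that feature carries the highest probability.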
19 Explanations - Collaborative Filtering
- The explanation is the percentage of the nearest neighbors who have rated the movie positively: the Explanations Ratio (ER).
- The threshold is defined as the lowest rating that counts as a positive evaluation.
- The percentage counts those nearest neighbors who rated the movie above the threshold. For 1 ≤ i ≤ K, if neighbor i's rating passes the threshold, then PositiveCounter = PositiveCounter + 1.
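The counter loop for the Explanations Ratio can be sketched as below. Two points are assumptions: the counter is divided by the number of neighbors K to obtain the ratio, and a rating equal to the threshold counts as positive, since the threshold is described as the lowest rating considered positive. The function name is illustrative.

```python
def explanation_ratio(neighbor_ratings, threshold):
    """Share of the K nearest neighbors whose rating reaches the threshold.
    neighbor_ratings holds one entry per neighbor; None means 'did not rate'."""
    if not neighbor_ratings:
        return 0.0
    positive_counter = 0
    for rating in neighbor_ratings:
        if rating is not None and rating >= threshold:
            positive_counter += 1
    return positive_counter / len(neighbor_ratings)

# Five neighbors, two of them rated the movie 4 or higher.
er = explanation_ratio([5, 4, 2, None, 3], threshold=4)
```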
20 Experiments - Content based
- The initial dataset contains 943 users and 1682 movies.
- For each user, training used 50% of the movies the user has rated.
- The goal is to find the dataset that gives the best results for the collaborative filtering algorithm and, by extension, for the hybrid.
- For each user, the MAE (mean absolute error) was measured: the sum of the absolute differences between predicted and actual ratings, divided by the size of the test set.
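As a reference for the metric, the standard MAE definition is sketched below; the function name and the sample ratings are illustrative.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, actual)) / len(actual)

# Three test ratings: errors of 1, 0 and 2.
mae = mean_absolute_error([4, 3, 5], [5, 3, 3])
```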
21 Experiments - Content based
- Experiment 1: CB performance as a function of the total number of ratings per user.
22 Experiments - Content based
- Experiment 2: CB performance as a function of the features used to describe a movie.
23 Experiments - Content based
- Experiment 3: CB performance as a function of the number of ratings per movie (with user_ratings / user > 40).
- The number of ratings drops sharply -> CF cannot produce results.
24 Experiments - Content based
- Experiment 4: CB performance when criteria are placed on the movie features (with user_ratings / user > 40).
- The number of movies and users drops sharply -> CF cannot produce results.
25 Experiments - Collaborative filtering
- So that predictions can be made, only users with more than 40 ratings took part in the experiments: 622 of the 943.
- The first 200 users are always used for training, and a percentage of the remaining ones for evaluation.
- For the evaluation users, only part of their ratings was taken into account (Evaluation Ratings Per User, ERPU).
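The experimental protocol can be sketched as below. The thresholds (more than 40 ratings, 200 training users) come from this slide; ordering users by id and choosing the first ERPU items as the visible ones are assumptions, as are the function names.

```python
def split_users(ratings, min_ratings=40, n_train=200):
    """Keep users with more than min_ratings ratings; the first n_train
    (by user id, an assumption) train, the rest are evaluation candidates."""
    eligible = [u for u in sorted(ratings) if len(ratings[u]) > min_ratings]
    return eligible[:n_train], eligible[n_train:]

def given_and_hidden(user_ratings, erpu):
    """Expose only `erpu` ratings of an evaluation user; hide the rest for testing."""
    items = sorted(user_ratings)
    given = {i: user_ratings[i] for i in items[:erpu]}
    hidden = {i: user_ratings[i] for i in items[erpu:]}
    return given, hidden

# Toy data: user u has exactly u ratings, for u = 38..44.
toy = {u: {m: 3 for m in range(u)} for u in range(38, 45)}
train_users, eval_users = split_users(toy, min_ratings=40, n_train=2)
given, hidden = given_and_hidden({1: 5, 2: 4, 3: 3}, erpu=2)
```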
26 Experiments - Collaborative filtering
- 1st set of experiments: different values of the parameter λ (0 ≤ λ ≤ 1).
  - λ = 0: collaborative filtering uses the ratings of the nearest neighbors for predictions.
  - λ = 1: it uses the mean values of the nearest neighbors' ratings.
27 Experiments - Collaborative filtering
- 2nd set of experiments: different values of ERPU (5, 7, 10, 12, 15, 17 and 20).
28 Experiments - Hybrid
- In the 1st set of experiments, as in the 1st set for collaborative filtering, we varied the value of the parameter λ.
- The 2nd set of experiments is the same as the 2nd set for collaborative filtering: we varied the value of ERPU.
- The 3rd set of experiments is the same as the 3rd set for content based.
- How will the hybrid be affected by these changes?
29 Experiments - Hybrid
- The hybrid is not affected by the changes in collaborative filtering.
30 Experiments - Hybrid
- Best performance for ERPU = 5.
31 Experiments - Hybrid
- The hybrid is not affected by the changes in content based.
32 Hybrid vs Collaborative filtering
- The hybrid performs better and follows a more stable course.
- Mean MAE (colfilt) = 1.12; mean MAE (hybrid) = 0.89.
- Standard deviation (colfilt) = 0.014; standard deviation (hybrid) = 0.067.
33 References
- [1] http://developer.apple.com/opensource/server/streaming/index.html
- [2] http://tomcat.apache.org/
- [3] http://www.imdb.com
- [4] http://www.grouplens.org/node/73#attachments
- [5] http://www.mortbay.org/jetty/
- [6] http://www.mozilla.com/en-US/firefox/firefox.html
- [7] http://www.apple.com/quicktime/
- [8] http://developer.apple.com/quicktime/qtjava/
- [9] http://backstage.bbc.co.uk/data/7DayListingData?v=16wk
- [10] http://www.tv-anytime.org/
- [11] http://www.bbc.co.uk/opensource/projects/tv_anytime_api/
- [12] Mooney, R. J., P. N. Bennett, and L. Roy. Book recommending using text categorization with extracted information. In Recommender Systems: Papers from 1998 Workshop. Technical Report WS-98-08. AAAI Press, 1998.
- [13] Gui-Rong Xue, Chenxi Lin, Qiang Yang, WenSi Xi, Hua-Jun Zeng, Yong Yu and Zheng Chen. Scalable Collaborative Filtering Using Cluster-based Smoothing. In Proceedings of the 2005 ACM SIGIR Conference, Salvador, Brazil, 2005, pp. 114-121.
- [14] http://www.clustan.com/k-means_critique.html
- [15] Robin Burke. Hybrid Recommender Systems: Survey and Experiments. California State University, Fullerton, 2002.
- [16] Robin Burke. Integrating Knowledge-based and Collaborative-filtering Recommender Systems. In Workshop on AI and Electronic Commerce, AAAI, 1999.