1
NATIONAL AND KAPODISTRIAN UNIVERSITY OF ATHENS
SCHOOL OF SCIENCES, DEPARTMENT OF INFORMATICS
AND TELECOMMUNICATIONS, POSTGRADUATE STUDIES PROGRAM
DIPLOMA THESIS
  • A hybrid algorithm for movie recommendations in
    interactive television systems.
  • G?a??a??p????? ?a?a???ta - Spa?? ?????s?a

2
System architecture: Block diagram
  • [Block diagram; surviving labels: Server, Subscriber
    Profiles/Ratings, Media Content System]

3
System architecture: Server side
  • Streaming Server: Apple's open source Darwin
    Streaming Server provides VoD and live streaming
    services [1].
  • Media Content System: the system in which the video
    files are stored on the streaming server.
  • Media Content Management System: assigns a unique
    code, the CRID (Content Reference IDentifier), to
    every new movie added to the media content system.
  • Subscribers Profiles/Ratings: a database that holds
    the users' personal details (first name, last name,
    username, password, age, gender, occupation, and a
    set of movie genres the user likes) and their
    ratings.
  • Ratings are given by the users either implicitly or
    explicitly.
  • The database consists of the tables users, profiles,
    ratings, and cbratings.

4
System architecture: Server side
  • Profiles Control System: handles verification of the
    user's identity (user authentication/sign in) and
    user registration.
  • Web Server: Apache Tomcat web server [2].
  • Semantics Component: the part of the system that
    communicates with external resources and retrieves
    information about the movies. The information
    source is IMDB [3]. The data retrieved for each
    movie are:
  • the title, the genres, the plot, the lead actor, the
    production year, the director, the producer, the
    screenwriter, and the spoken language.

5
System architecture: Server side
  • Data Set: the MovieLens dataset [4], used for the
    experiments and for evaluating the system. It
    contains 100K ratings given by 943 users for 1682
    movies. The dataset consists of 3 tables:
  • users (UserID, Gender, Age, Occupation, ZipCode),
  • movies (MovieID, Title, Genres), and
  • ratings (UserID, MovieID, Rating, Timestamp).
  • Recommendation System: a combination of two
    algorithms into one hybrid that recommends movies
    to the users. Combining algorithms has been shown
    to give better results than each algorithm would
    give on its own. The algorithms combined are
    collaborative filtering and content based. Along
    with each movie recommendation, the user also
    receives an explanation for it.

6
System architecture: Client side
  • Proxy Server: the Jetty [5] open source web server
    was installed to fetch the EPG data from the BBC
    site.
  • Web Browser: Mozilla Firefox 3.0.5 web browser [6].
  • Media Player: QuickTime Player Pro 7.5 [7] was
    embedded in the web browser for video playback. The
    client controls the media player through the QTJava
    API [8].
  • BBC: provides the BBC TV and Radio schedule for 7
    days [9] in TvAnytime format [10,11].

7
Content based
  • A bag-of-words naive Bayesian text classifier was
    implemented, extended to handle vectors of bags of
    words [12].
  • The movie representation is split into the fields
    (slots): lead actor, genre, director, producer,
    writer, production year, and language.
  • A movie can have more than one value in some slots,
    so each slot is a bag-of-words. The movie as a whole
    is represented by a vector of bags-of-words.
  • This is a probabilistic classification problem with
    5 classes: the probability that a movie receives
    each rating from 1 to 5 is computed.

8
Content based
  • Training phase (building the user profiles)
  • Build the bags-of-words of all slots (all possible
    words of each slot).
  • For each user:
  • Find the set N of movies the user has rated
    (training movies).
  • Find the probability of each class [equation
    missing].
  • Find the conditional probability of each word of a
    slot, given that it belongs to the class and to the
    slot [equation missing], which depends on
  • the number of words in slot m of the movie, and
  • the number of times the word occurs in that slot of
    the movie.
  • Smoothing of the zero word probabilities.
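
The training steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `train_user_profile`, the input layout, the add-one (Laplace) smoothing, and building the vocabulary from the rated movies only are all choices made here, since the original equations did not survive in the transcript.

```python
from collections import Counter, defaultdict


def train_user_profile(rated_movies, ratings=(1, 2, 3, 4, 5)):
    """Build one user's profile (hypothetical helper).

    rated_movies: list of (slots, rating) pairs, where slots maps a slot
    name (e.g. "genre") to the list of words in that slot.
    Returns (priors, cond_prob, vocab).
    """
    n = len(rated_movies)
    class_counts = Counter(rating for _, rating in rated_movies)
    # Class priors with add-one smoothing (an assumption of this sketch),
    # so ratings the user never gave keep a nonzero probability.
    priors = {c: (class_counts[c] + 1) / (n + len(ratings)) for c in ratings}

    vocab = defaultdict(set)            # slot -> every word seen in it
    word_counts = defaultdict(Counter)  # (class, slot) -> word frequencies
    for slots, rating in rated_movies:
        for slot, words in slots.items():
            vocab[slot].update(words)
            word_counts[(rating, slot)].update(words)

    def cond_prob(word, c, slot):
        """Smoothed estimate of P(word | class c, slot)."""
        counts = word_counts.get((c, slot), Counter())
        denominator = sum(counts.values()) + len(vocab.get(slot, ()))
        # An unknown slot gives denominator 0; return 1.0 (uninformative).
        return (counts[word] + 1) / (denominator or 1)

    return priors, cond_prob, dict(vocab)
```

For example, calling `train_user_profile` on two rated movies returns priors that sum to 1 and a `cond_prob` function that never returns zero, which is the purpose of the smoothing step.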

9
Content based
  • Prediction phase (predicting the ratings)
  • For every movie in the database:
  • Find the posterior probability of the movie for all
    classes, using Bayes' rule.
  • Assign the movie to the class c with the highest
    computed probability (rating c).
  • Build the user-ratings table.
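
A minimal sketch of the prediction step, assuming the class priors and word probabilities are already available. The name `predict_rating` and the argument layout are hypothetical, and working in log space is a numerical convenience added here, not something the slides state.

```python
import math


def predict_rating(slots, priors, cond_prob):
    """Return the rating class with the highest posterior for one movie.

    slots: slot name -> list of words for the movie.
    priors: rating class -> prior probability (must be > 0).
    cond_prob(word, c, slot): smoothed P(word | class c, slot), > 0.
    """
    best_class, best_score = None, -math.inf
    for c, prior in priors.items():
        # Naive Bayes: log P(c) plus the sum of log P(word | c, slot).
        score = math.log(prior)
        for slot, words in slots.items():
            for word in words:
                score += math.log(cond_prob(word, c, slot))
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```

Running this for every movie and storing the winning class per user gives a filled-in user-ratings table of the kind the last bullet mentions.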

10
Content based - problems
  • Limited content analysis: content-based methods are
    limited by the features associated with the items.
    If two different items are represented by the same
    set of features, they are treated as identical.
  • Overspecialization: they can recommend only items
    that score highly against the user's profile, so
    the user is restricted to recommendations similar
    to the items already rated.
  • Cold-Start Problem: the user must rate a sufficient
    number of items before the user's preferences can
    be understood.

11
Collaborative filtering
  • Collaborative filtering with cluster-based smoothing
    [13] was applied.
  • It is a memory-based algorithm that uses the Pearson
    correlation coefficient. Summary of the algorithm:
  • Clustering: the users of the dataset are grouped
    into K clusters with k-means [14], using the Pearson
    correlation function as the criterion.
  • k-means terminates when the number of users that
    change cluster between two iterations is minimized.
  • Data smoothing: each user's ratings are filled in
    for the dataset movies the user has not rated. The
    new ratings are based on the ratings given by the
    other users in the same cluster.
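
The data smoothing step can be illustrated with the sketch below. The slides only say that missing ratings are derived from the ratings of the other users in the cluster; the specific rule used here (the user's own mean plus the cluster's average deviation for that movie) is an assumption of this sketch, and `smooth_ratings` is a hypothetical name.

```python
from statistics import mean


def smooth_ratings(ratings, clusters):
    """Fill in missing ratings from cluster information.

    ratings: {user: {movie: rating}}, every user has at least one rating.
    clusters: list of lists of users (output of a prior clustering step).
    Returns a new dict; the input is not modified.
    """
    user_mean = {u: mean(r.values()) for u, r in ratings.items()}
    movies = {m for r in ratings.values() for m in r}
    smoothed = {u: dict(r) for u, r in ratings.items()}
    for cluster in clusters:
        for movie in movies:
            # How far cluster members who rated the movie deviate
            # from their own average rating.
            devs = [ratings[v][movie] - user_mean[v]
                    for v in cluster if movie in ratings[v]]
            if not devs:
                continue  # nobody in the cluster rated it; stays missing
            delta = mean(devs)
            for u in cluster:
                if movie not in ratings[u]:
                    smoothed[u][movie] = user_mean[u] + delta
    return smoothed
```

Using deviations from each user's mean, rather than raw ratings, keeps a generous rater and a strict rater comparable inside the same cluster.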

12
Collaborative filtering
  • Neighbor pre-selection: every new user is placed in
    one of the clusters. First the user's correlation
    with all the centroids is computed. The user is
    then placed in the cluster with which the
    correlation is highest.
  • Nearest neighbor selection: the set of nearest
    neighbors is the K users of the cluster with whom
    the new user has the highest similarity.
  • Rating prediction: the new user's rating is
    predicted for all the movies of the dataset, based
    on the ratings of the nearest neighbors.
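
Neighbor selection and rating prediction can be sketched as below. This is a simplified illustration: it uses the classic mean-centered, similarity-weighted prediction, considers only cluster members who have a rating for the target movie, and ignores the smoothing weights of [13]. The names `pearson` and `predict` are hypothetical.

```python
import math


def pearson(ru, rv):
    """Pearson correlation over co-rated movies; 0.0 when undefined."""
    common = [m for m in ru if m in rv]
    if len(common) < 2:
        return 0.0
    mu = sum(ru[m] for m in common) / len(common)
    mv = sum(rv[m] for m in common) / len(common)
    num = sum((ru[m] - mu) * (rv[m] - mv) for m in common)
    den = math.sqrt(sum((ru[m] - mu) ** 2 for m in common)
                    * sum((rv[m] - mv) ** 2 for m in common))
    return num / den if den else 0.0


def predict(new_user, movie, cluster_ratings, k):
    """Predict new_user's rating for movie from the k most similar
    members of the user's cluster (cluster_ratings: {user: {movie: r}})."""
    sims = [(pearson(new_user, r), r)
            for r in cluster_ratings.values() if movie in r]
    neighbors = sorted(sims, key=lambda s: s[0], reverse=True)[:k]
    mean_u = sum(new_user.values()) / len(new_user)
    # Weighted average of each neighbor's deviation from their own mean.
    num = sum(s * (r[movie] - sum(r.values()) / len(r)) for s, r in neighbors)
    den = sum(abs(s) for s, _ in neighbors)
    return mean_u + num / den if den else mean_u
```

When no neighbor has a usable similarity, the sketch falls back to the new user's own mean rating.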

13
Collaborative filtering - problems
  • Cold-Start Problem: caused by new users in the
    system who have not rated any item. The system
    cannot find similar neighbor users and therefore
    cannot make predictions.
  • First-Rater Problem: caused by new items in the
    system that have not been rated by any user. CF
    depends exclusively on the ratings of other users,
    so the system cannot recommend these items to users
    until they receive a sufficient number of ratings.
  • Gray Sheep Problem: the user falls on the boundary
    between the existing user classes and cannot be
    assigned to any class.
  • Data Sparsity: caused by the fact that users rate
    only a small number of items.

14
Hybrid approach: Which are the problems to solve?
  • The problem of content based is that it recommends
    only movies similar to those the user has already
    seen. It also suffers from the cold start problem.
  • The problems of collaborative filtering are cold
    start, data sparsity, and gray sheep.
  • How can the two be combined so that each one
    suppresses the problems of the other?

15
Hybrid approach: In which order?
  • According to Burke [15], the most efficient scheme
    is the sequential one (cascade hybrid).
  • The first algorithm gives its predictions as a
    series of ordered classes [16]. Movies that are
    equally good recommendations fall into the same
    class.
  • The second algorithm refines the results of the
    first.
  • In which order should the two algorithms be
    combined?

16
Hybrid approach: In which order?
  • Suppose collaborative filtering is applied first.
    We would have to deal with:
  • Data sparsity and
  • Cold start
  • Whereas if content based is applied first:
  • Cold start
  • This is because content based eliminates data
    sparsity.
  • In addition, the user registration form includes a
    list of movie genres that the user can rate. This
    also addresses the cold start problem.
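
The general cascade idea can be sketched as a two-key sort: the first recommender assigns each movie to a coarse class, and the second one orders the movies inside each class. This shows the cascade concept only; how the presented system actually passes data from the content-based stage to collaborative filtering is not specified in the slides, and `cascade_rank` with its two callables is hypothetical.

```python
def cascade_rank(movies, coarse_class, refine_score):
    """Rank movies by the first recommender's class (higher is better),
    breaking ties inside a class with the second recommender's score."""
    return sorted(movies,
                  key=lambda m: (coarse_class(m), refine_score(m)),
                  reverse=True)


# Usage with made-up scores: content-based classes first, CF scores second.
cb_class = {"a": 5, "b": 5, "c": 3}
cf_score = {"a": 4.1, "b": 4.7, "c": 4.9}
ranking = cascade_rank(["a", "b", "c"], cb_class.get, cf_score.get)
```

Here movie "c" has the best second-stage score but stays last, because the second recommender only refines the order within a class and never overrides the first.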

17
Explanations: Why?
  • So that the user knows why the movie was
    recommended.
  • So that the user trusts the system's
    recommendations.
  • So that the user better understands how the system
    works.
  • So that the user does not waste time on movies the
    user will not like.

18
Explanations: Content based
  • The explanation consists of the movie features that
    make it similar to the other movies the user has
    rated positively.
  • For the movie the algorithm recommends, all the
    conditional probabilities of the movie's words are
    compared.
  • Each slot of the movie can have more than one value
    (bags-of-words).
  • The slot feature whose value has the highest
    conditional probability is the strong feature of
    the movie (the one with the greatest discriminative
    power) and is the reason why the movie was
    recommended.
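
Picking the strong feature can be sketched as a maximum over all (slot, word) pairs of the recommended movie. The function name `strongest_feature` and the `cond_prob(word, c, slot)` signature are assumptions made for this illustration.

```python
def strongest_feature(slots, rating_class, cond_prob):
    """Return the (slot, word) pair of the movie with the highest
    conditional probability for the predicted class, or None when the
    movie has no words at all."""
    candidates = [(cond_prob(word, rating_class, slot), slot, word)
                  for slot, words in slots.items()
                  for word in words]
    if not candidates:
        return None
    _, slot, word = max(candidates)
    return slot, word
```

The returned pair can then be rendered as an explanation such as "recommended because of director: allen".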

19
Explanations: Collaborative Filtering
  • The explanation is the percentage of the nearest
    neighbors who rated the movie positively: the
    Explanations Ratio (ER).
  • The threshold is defined as the lowest rating that
    counts as a positive rating.
  • The percentage counts the nearest neighbors who
    rated the movie above the threshold. For 1 < i < K,
    if [condition missing]:
    PositiveCounter = PositiveCounter + 1
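
A small sketch of the Explanations Ratio computation. The comparison is written as `>=` on the assumption that a rating equal to the threshold already counts as positive, since the threshold is defined as the lowest positive rating; the exact condition did not survive in the transcript.

```python
def explanations_ratio(neighbor_ratings, threshold):
    """Fraction of the nearest neighbors whose rating for the movie is
    at or above the threshold (0.0 when there are no neighbors)."""
    if not neighbor_ratings:
        return 0.0
    positive_counter = 0
    for rating in neighbor_ratings:
        if rating >= threshold:  # assumed reading of the lost condition
            positive_counter += 1
    return positive_counter / len(neighbor_ratings)
```

With neighbor ratings `[5, 4, 2, 3]` and a threshold of 4, two of the four neighbors count as positive.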

20
Experiments: Content based
  • The initial dataset that was used contains 943
    users and 1682 movies.
  • For training, 50% of the movies each user has rated
    were used for that user.
  • The goal is to find the dataset that gives the best
    results for the Collaborative Filtering algorithm
    and, by extension, for the hybrid.
  • For each user, the MAE was measured as [equation
    missing], normalized by the size of the test set.
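
The per-user error measure can be illustrated with the standard definition of the mean absolute error. The function and argument names are hypothetical, and the slide's own notation was lost.

```python
def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over one user's test set."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs(p - a) for p, a in zip(predicted, actual))
    return total / len(actual)
```

For predictions `[4, 3, 5]` against true ratings `[5, 3, 3]`, the absolute errors are 1, 0, and 2, so the result is 1.0.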

21
Experiments: Content based
  • Experiment 1: CB performance as a function of the
    total number of ratings per user.

22
Experiments: Content based
  • Experiment 2: CB performance as a function of the
    features used to describe a movie.

23
Experiments: Content based
  • Experiment 3: CB performance as a function of the
    number of ratings per movie (with user_ratings /
    user > 40).
  • The number of ratings drops sharply, so CF cannot
    produce results.

24
Experiments: Content based
  • Experiment 4: CB performance when criteria are set
    on the movie features (with user_ratings / user >
    40).
  • The number of movies and users drops sharply, so CF
    cannot produce results.

25
Experiments: Collaborative filtering
  • So that predictions can be made, only the users
    with more than 40 ratings took part in the
    experiments: 622 of the 943.
  • The first 200 users are always used for training,
    and a percentage of the last users is used for
    evaluation.
  • For the evaluation users, only part of their ratings
    was taken into account (Evaluation Ratings Per User
    - ERPU).

26
Experiments: Collaborative filtering
  • 1st set of experiments: different values of the
    parameter λ (0 < λ < 1).
  • λ = 0: collaborative filtering uses the ratings of
    the nearest neighbors for the predictions.
  • λ = 1: it uses the mean values of the ratings of the
    nearest neighbors.

27
Experiments: Collaborative filtering
  • 2nd set of experiments: different values of ERPU
    (5, 7, 10, 12, 15, 17, and 20).

28
Experiments - Hybrid
  • In the 1st set of experiments, as in the 1st set of
    collaborative filtering experiments, the values of
    the parameter λ were varied.
  • The 2nd set of experiments is the same as the 2nd
    set for collaborative filtering: the values of ERPU
    were varied.
  • The 3rd set of experiments is the same as the 3rd
    set of content based experiments.
  • How will the hybrid be affected by the changes?

29
Experiments - Hybrid
  • The hybrid is not affected by the changes to
    collaborative filtering.

30
Experiments - Hybrid
  • Best performance for ERPU = 5.

31
Experiments - Hybrid
  • The hybrid is not affected by the changes to content
    based.

32
Hybrid vs Collaborative filtering
  • The hybrid performs better and follows a more
    stable course.
  • Mean MAE(colfilt) = 1.12; mean MAE(hybrid) = 0.89
  • Standard deviation(colfilt) = 0.014; standard
    deviation(hybrid) = 0.067

33
References
  • [1] http://developer.apple.com/opensource/server/streaming/index.html
  • [2] http://tomcat.apache.org/
  • [3] http://www.imdb.com
  • [4] http://www.grouplens.org/node/73#attachments
  • [5] http://www.mortbay.org/jetty/
  • [6] http://www.mozilla.com/en-US/firefox/firefox.html
  • [7] http://www.apple.com/quicktime/
  • [8] http://developer.apple.com/quicktime/qtjava/
  • [9] http://backstage.bbc.co.uk/data/7DayListingData?v=16wk
  • [10] http://www.tv-anytime.org/
  • [11] http://www.bbc.co.uk/opensource/projects/tv_anytime_api/
  • [12] Mooney, R. J., P. N. Bennett, and L. Roy. Book
    recommending using text categorization with
    extracted information. In Recommender Systems:
    Papers from 1998 Workshop. Technical Report
    WS-98-08. AAAI Press, 1998.
  • [13] Gui-Rong Xue, Chenxi Lin, Qiang Yang, WenSi Xi,
    Hua-Jun Zeng, Yong Yu, and Zheng Chen. Scalable
    Collaborative Filtering Using Cluster-based
    Smoothing. In Proceedings of the 2005 ACM SIGIR
    Conference, Salvador, Brazil, 2005, pp. 114-121.
  • [14] http://www.clustan.com/k-means_critique.html
  • [15] Robin Burke. Hybrid Recommender Systems: Survey
    and Experiments. California State University,
    Fullerton, 2002.
  • [16] Robin Burke. Integrating Knowledge-based and
    Collaborative-filtering Recommender Systems. In
    Workshop on AI and Electronic Commerce, AAAI,
    1999.