1
Evaluating/Optimizing Search Engines using
Clickthrough Data
  • Shui-Lung Chuang
  • Mar. 28, 2003

2
The Reference Papers
  • Evaluating Retrieval Performance using Clickthrough Data
    • Technical Report, Cornell University, 2002
  • Optimizing Search Engines using Clickthrough Data
    • KDD-2002
  • Thorsten Joachims, Department of Computer Science,
    Cornell University

3
About the Author
  • Thorsten Joachims
  • Now assistant professor, Dept. of CS, Cornell
    University
  • 2001: finished his Ph.D. (1997: received a Diplom)
  • 2000-01: postdoctoral researcher (Knowledge Discovery Team)
  • 1994-96: visiting scholar with Prof. Tom Mitchell at CMU
  • To my knowledge, the first to apply support vector
    machines (SVMs) to text categorization (ECML-1998)
  • Author of SVMlight, a major cause of the wide
    popularity of SVMs
    • available at http://svmlight.joachims.org/

4
Outline
  • Things about clickthrough data
  • Evaluating search engines using clickthrough data
  • Optimizing search engines using clickthrough data

5
Search Engine Logs
(Figure: a user wondering "Where is the Web page of ICDM 2002?"
(1) submits query terms such as "ICDM", "ICDM02", "ICDM 2002" to the
search engine, (2) receives a ranked list of URLs, e.g.
http://kis.maebashi-it.ac.jp/icdm02/, http://www.wi-lab.com/icdm02,
http://www.computer.org/.../pr01754.htm, and (3) clicks through some
of them; the query terms and clicks are recorded in the logs.)
6
Clickthrough Data
  • Clickthrough data can be thought of as triplets
    (q, r, c)
    • the query q
    • the ranking r presented to the user
    • the set c of links the user clicked on
  • E.g., q = "support vector machine", r = the presented
    ranking, c = {link1, link3, link7}
  • Clickthrough data provide users' feedback for
    relevance judgment
7
A Mechanism to Record Clickthrough Data
  • query-log: the query words and the presented ranking
  • click-log: the query-ID and the clicked URL (recorded
    via a proxy server)
  • This process should be transparent to the user
  • This process should not influence system performance
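A minimal sketch of what the two log records could hold (the field
names are my illustrative assumptions, not a format from the papers):

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class QueryLogEntry:
        # query-log: one record per submitted query
        query_id: int
        query_words: List[str]
        presented_ranking: List[str]  # URLs in the order shown to the user

    @dataclass
    class ClickLogEntry:
        # click-log: one record per click, written by the proxy server;
        # joined back to the query via query_id
        query_id: int
        clicked_url: str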

8
Some Works on Search-Engine Logs
  • Real-world search engine
  • Direct Hit (http://www.directhit.com)
  • Analyzing search vocabularies/subjects/topics
  • C. Silverstein, M. Henzinger, H. Marais, and M.
    Moricz. Analysis of a very large AltaVista query
    log. Technical Report, Digital Systems Research
    Center, 1998.
  • N. Ross and D. Wolfram. End user searching on the
    Internet: An analysis of term pair topics
    submitted to the Excite search engine.
    JASIS-2000.
  • H.-T. Pu, S.-L. Chuang, and C. Yang. Subject
    categorization of query terms for exploring Web
    users' search interests. JASIS-2002, 53(8).
  • S.-L. Chuang and L.-F. Chien. Enriching Web
    taxonomies through subject categorization of
    query terms from search engine logs. DSS-2003.
  • Clustering query terms
  • D. Beeferman and A. Berger. Agglomerative
    clustering of a search engine query log.
    KDD-2000.
  • J.-R. Wen, J.-Y. Nie, and H.-J. Zhang.
    Clustering user queries of a search engine.
    WWW-2001, ACM TOIS-2002.
  • S.-L. Chuang and L.-F. Chien. Towards automatic
    generation of query taxonomy: A hierarchical
    query clustering approach. ICDM-2002.
  • Further . . . ?

9
Outline
  • Things about clickthrough data
  • Evaluating search engines using clickthrough data
  • Experiment setup for getting unbiased
    clickthrough data
  • Theoretical analysis
  • Optimizing search engines using clickthrough data

10
The Problem
  • A problem of statistical inference: a hypothesis test
  • Users are only rarely willing to give explicit
    feedback
  • Clickthrough data seem to provide users' implicit
    feedback. Are they suitable for relevance
    judgment? I.e., Click → Relevance?

Which search engine provides better results,
Google or MSNSearch?
11
EXP1: Regular Clickthrough Data
  • Clicks heavily depend on the ranking
    (presentation bias)

12
EXP2: Unbiased Clickthrough Data
  • The criteria for getting unbiased clickthrough data
    when comparing search engines:
  • Blind test: the interface should hide the random
    variables underlying the hypothesis test to avoid
    biasing the user's response
  • Click → preference: the interface should be
    designed so that a click demonstrates a
    particular judgment of the user
  • Low usability impact: the interface should not
    substantially lower the productivity of the user

13
EXP2: Unbiased Clickthrough Data
  • The top l links of the combined ranking contain
    the top ka and kb links from rankings A and B,
    with |ka - kb| ≤ 1.

14
Computing the Combined Ranking
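The slide showed the combination procedure as a figure. A minimal
Python sketch of the idea (my reconstruction of the interleaving
scheme under the |ka - kb| ≤ 1 constraint above, not the paper's
verbatim pseudocode):

    def combine(a, b):
        """Interleave rankings a and b so that, at any cutoff, the
        combined list contains the top ka links of a and the top kb
        links of b with |ka - kb| <= 1."""
        combined, ka, kb = [], 0, 0
        while ka < len(a) and kb < len(b):
            if ka <= kb:              # a is due to contribute its next link
                if a[ka] not in combined:
                    combined.append(a[ka])
                ka += 1
            else:                     # b is due to contribute its next link
                if b[kb] not in combined:
                    combined.append(b[kb])
                kb += 1
        # drain whatever remains once one ranking is exhausted
        combined += [d for d in a[ka:] + b[kb:] if d not in combined]
        return combined

Which ranking leads on ties (here: a) would be chosen at random per
query, so that the comparison remains a blind test.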
15
Experiment
  • Google vs. MSNSearch
  • Experiment data gathered from 3 users,
    9/25-10/18, 2001
  • 180 queries and 211 clicks (1.17 clicks/query,
    2.31 words/query)
  • The top k links for each query were manually
    judged for relevance
  • Questions to examine
  • Does the clickthrough evaluation agree with the
    manual relevance judgments?
  • Click → Preference?
  • Is the experiment design a blind test?

16
Theoretical Analysis
17
Theoretical Analysis Assumption 1
  • Intuitively, this assumption formalizes that
    users click on a relevant link more frequently
    than on a non-relevant link, with the click rates
    differing by at least ε.

18
Theoretical Analysis Assumption 2
  • Intuitively, the assumption states that the only
    reason a user clicks on a particular link is the
    relevance of the link, not other influence
    factors connected with a particular retrieval
    function.

19
Statistical Hypothesis Test
  • Two-tailed paired t-test
  • Binomial sign test

Please refer to the paper if you are interested.
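As a flavor of the simpler of the two, a binomial sign test over
per-query outcomes might look like this (the data and outcome coding
are my illustrative assumptions; the paper defines the exact
statistics):

    from scipy.stats import binomtest

    # Per-query outcomes: +1 if the user clicked more links from
    # engine A, -1 if more from engine B (ties dropped).
    outcomes = [+1, +1, -1, +1, -1, +1, +1, +1, -1, +1]

    wins_a = sum(1 for o in outcomes if o > 0)
    n = len(outcomes)

    # Two-sided sign test of H0: P(A preferred) = 0.5
    result = binomtest(wins_a, n, p=0.5, alternative='two-sided')
    print(f"A preferred on {wins_a}/{n} queries, p = {result.pvalue:.3f}")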
20
Clickthrough vs. Relevance
Google vs. MSNSearch (77 vs. 63)
Google vs. Default (85 vs. 18)
MSNSearch vs. Default (91 vs. 12)
21
Is Assumption 1 Valid?
  • Assumption 1: users click on relevant links more
    often than on non-relevant links, on average

22
Is Assumption 2 Valid?
23
Outline
  • Things about clickthrough data
  • Evaluating search engines using clickthrough data
  • Experiment setup for getting unbiased
    clickthrough data
  • Theoretical analysis and experiments
  • Optimizing search engines using clickthrough data
  • Relevance feedback by clickthrough data
  • A framework for learning of retrieval functions
  • An SVM algorithm for learning of ranking
    functions
  • Experiments

24
An Illustrative Scenario
25
Click ≠ Absolute Relevance Judgment
  • Clickthrough data as a triplet (q, r, c)
  • The presented ranking r depends on the query q, as
    determined by the retrieval function of the search
    engine
  • The set c of clicked-on links depends on both the
    query q and the presented ranking r
  • E.g., highly ranked links are more likely to be
    clicked
  • Hence a click on a particular link cannot be treated
    as an absolute relevance judgment

26
Click → Relative Preference Judgment
  • Assume that the user scanned the ranking from
    top to bottom
  • E.g., for c = {link1, link3, link7}, each clicked link
    is preferred over every unclicked link ranked above it
    (r* denotes the ranking preferred by the user); see the
    sketch after this list

link3 <r* link2
link7 <r* link2, link7 <r* link4, link7 <r* link5, link7 <r* link6
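A small sketch of this extraction rule (my own illustration; the
link names follow the example above):

    def preference_pairs(ranking, clicked):
        """From a presented ranking and the set of clicked links,
        derive pairs (a, b) meaning 'a should rank above b' in r*:
        each clicked link beats every unclicked link above it."""
        pairs = []
        for i, link in enumerate(ranking):
            if link in clicked:
                pairs.extend((link, above) for above in ranking[:i]
                             if above not in clicked)
        return pairs

    ranking = ["link1", "link2", "link3", "link4", "link5",
               "link6", "link7"]
    clicked = {"link1", "link3", "link7"}
    print(preference_pairs(ranking, clicked))
    # [('link3', 'link2'), ('link7', 'link2'), ('link7', 'link4'),
    #  ('link7', 'link5'), ('link7', 'link6')]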
27
A Framework for Learning Retrieval Fun.
  • r* is the optimal ordering; rf(q) is the ordering
    produced by retrieval function f on query q. Both
    r* and rf(q) are binary relations over D × D,
    where D = {d1, ..., dm} is the document collection
  • e.g., if di <r* dj, then (di, dj) ∈ r*
  • Kendall's τ as the performance measure (vs.
    average precision)
  • For a fixed but unknown distribution Pr(q, r*),
    the goal is to learn a retrieval function with
    the maximum expected Kendall's τ

τ(ra, rb) = (P - Q) / (P + Q), where P is the number of
concordant pairs and Q the number of discordant pairs
of the two orderings
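For instance, if two orderings of m = 5 documents disagree on Q = 2
of the C(5,2) = 10 pairs, then τ = (8 - 2)/10 = 0.6. A quick check
with scipy (my example, not from the slides):

    from scipy.stats import kendalltau

    # Rank positions of the same 5 documents under two orderings.
    ra = [1, 2, 3, 4, 5]
    rb = [1, 2, 4, 5, 3]

    tau, _ = kendalltau(ra, rb)
    print(tau)  # 0.6: P = 8 concordant pairs, Q = 2 discordant pairs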
28
An SVM Algo. for Learning Ranking Fun.
  • Given an independently and identically
    distributed training sample S of size n
    containing queries q with their target rankings r*
  • The learner selects a ranking function f from
    a family of ranking functions F that maximizes
    the empirical τ
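In symbols (following the KDD-2002 paper), the learner maximizes the
empirical τ averaged over the n training queries:

    τS(f) = (1/n) Σ_{i=1..n} τ(rf(qi), r*i)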

29
The Ranking SVM Algorithm
  • Consider the class of linear ranking functions
    (spelled out below)

w is a weight vector that is adjusted by learning;
Φ(q, d) is a mapping onto features describing the
match between q and d
  • The goal is to find a w so that the maximum
    number of the following inequalities is fulfilled
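The formulas on this slide, as given in the KDD-2002 paper: the
class of linear ranking functions is defined by

    di <f(q) dj  ⇔  w·Φ(q, di) > w·Φ(q, dj)

and the inequalities to fulfill are, for every training query qk and
every preference pair in its target ranking r*k,

    ∀ (di, dj) ∈ r*k :  w·Φ(qk, di) > w·Φ(qk, dj)

Finding the w that fulfills the maximum number of these inequalities
is intractable, so the paper's Optimization Problem 1 softens them
with slack variables, SVM-style:

    minimize:    V(w, ξ) = (1/2) w·w + C Σ ξijk
    subject to:  w·Φ(qk, di) ≥ w·Φ(qk, dj) + 1 - ξijk
                 for all (di, dj) ∈ r*k and all k
                 ξijk ≥ 0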

30
The Categorization SVM
(Figure: positive and negative training examples separated by the
maximum-margin hyperplane defined by w.)

Learning a hypothesis h(d) = sign(w·d + b) that separates the
examples.

Learning → an optimization problem:

    minimize:    (1/2) w·w + C Σi ξi
    subject to:  yi (w·di + b) ≥ 1 - ξi  and  ξi ≥ 0 for all i

where yi = +1 (-1) if di is in the class (is not)
31
The Ranking SVM Algorithm (cont.)
  • Optimization Problem 1 is convex and has no local
    optima. The constraints can be rearranged as
    w · (Φ(qk, di) - Φ(qk, dj)) ≥ 1 - ξijk
  • This is a classification SVM on the pairwise
    difference vectors Φ(qk, di) - Φ(qk, dj)
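A minimal sketch of this reduction (the paper used SVMlight; the
synthetic data and the scikit-learn classifier here are my own
illustrative substitutes):

    import numpy as np
    from sklearn.svm import LinearSVC

    # Each row is a difference vector Phi(q,di) - Phi(q,dj) for a
    # preference pair "di should rank above dj" (synthetic here).
    rng = np.random.default_rng(0)
    w_true = np.array([1.0, -0.5, 2.0])
    diffs = rng.normal(size=(200, 3))
    labels = np.where(diffs @ w_true > 0, 1, -1)

    # Train on the difference vectors. fit_intercept=False keeps the
    # hyperplane through the origin, as the ranking constraints require.
    clf = LinearSVC(fit_intercept=False, C=1.0)
    clf.fit(diffs, labels)
    w = clf.coef_[0]

    # Rank candidate documents for a new query by projecting onto w.
    docs = rng.normal(size=(5, 3))  # Phi(q, d) for 5 candidate docs
    print("ranking (best first):", np.argsort(-(docs @ w)))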

32
Experiment Setup: Meta Search

(Figure: the Striver meta-search engine forwards the query to
Google, MSNSearch, Excite, Altavista, and Hotbot and combines their
results.)
33
Features
34
Experiment Results
35
Learned Weights of Features
36
Conclusions
  • The first work (evaluating search engines) is
    crucial
  • The feasibility of using clickthrough data to
    evaluate retrieval performance has been verified
  • Clickthrough data (less effort) perform as well
    as manual relevance judgments (more effort) in
    this task.
  • The second (the KDD one) is an interesting piece
    of work on clickthrough data
  • Negative comments
  • The approaches have not been validated on a larger
    scale, so whether the techniques work in real
    settings is still uncertain.
  • Links that are relevant but ranked low remain
    invisible to the user, so they never receive
    clicks or feedback.

37
How Can Clickthrough Data Help?
  • Problem 1
  • How to measure the retrieval quality of a search
    engine? How to compare the performance of two
    search engines? E.g., which provides better
    results, Google or MSNSearch?
  • Users are only rarely willing to give explicit
    feedback.
  • Problem 2
  • How to improve the ranking function of a search
    engine?
  • Can we learn something like "for query q,
    document a should be ranked higher than document
    b"?