1
Evaluating/Optimizing Search Engines using
Clickthrough Data
  • Shui-Lung Chuang
  • Mar. 28, 2003

2
The Reference Papers
  • Evaluating Retrieval Performance using Clickthrough Data
    • Technical Report, Cornell University, 2002
  • Optimizing Search Engines using Clickthrough Data
    • KDD-2002
  • Thorsten Joachims, Department of Computer Science,
    Cornell University

3
About the Author
  • Thorsten Joachims
  • Now assistant professor, Dept. of CS, Cornell
    University
  • 2001: finished his Ph.D. (1997: received a Diplom)
  • 2000-01: postdoctoral researcher (Knowledge Discovery Team)
  • 1994-96: visiting scholar with Prof. Tom Mitchell at CMU
  • To my knowledge, the first to apply support vector
    machines (SVMs) to text categorization (ECML-1998)
  • Author of SVMlight, a major cause of the wide
    popularity of SVMs
    • available at http://svmlight.joachims.org/

4
Outline
  • Things about clickthrough data
  • Evaluating search engines using clickthrough data
  • Optimizing search engines using clickthrough data

5
Search Engine Logs
(Figure: a user wondering "Where is the Web page of ICDM 2002?"
(1) submits query terms such as "ICDM", "ICDM02", "ICDM 2002" to the
search engine, (2) receives a ranked list of URLs, e.g.
http://kis.maebashi-it.ac.jp/icdm02/, http://www.wi-lab.com/icdm02,
http://www.computer.org/.../pr01754.htm, and (3) clicks through some
of them; the query terms and clicks are recorded in the logs.)
6
Clickthrough Data
  • Clickthrough data can be thought of as triplets
    (q, r, c)
    • the query q
    • the ranking r presented to the user
    • the set c of links the user clicked on
  • E.g., q = "support vector machine", r = the presented
    ranking, c = {link1, link3, link7}
  • Clickthrough data provide users' feedback for
    relevance judgment
7
A Mechanism to Record Clickthrough Data
  • query-log: the query words and the presented ranking
  • click-log: the query-ID and the clicked URL (recorded
    via a proxy server)
  • This process should be transparent to the user
  • This process should not influence system performance
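A minimal sketch of what the two log records could hold (the field
names are my illustrative assumptions, not a format from the papers):

    from dataclasses import dataclass
    from typing import List

    @dataclass
    class QueryLogEntry:
        # query-log: one record per submitted query
        query_id: int
        query_words: List[str]
        presented_ranking: List[str]  # URLs in the order shown to the user

    @dataclass
    class ClickLogEntry:
        # click-log: one record per click, written by the proxy server;
        # joined back to the query via query_id
        query_id: int
        clicked_url: str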

8
Some Works on Search-Engine Logs
  • Real-world search engine
  • Direct Hit (http://www.directhit.com)
  • Analyzing search vocabularies/subjects/topics
  • C. Silverstein, M. Henzinger, H. Marais, and M.
    Moricz. Analysis of a very large AltaVista query
    log. Technical Report, Digital Systems Research
    Center, 1998.
  • N. Ross and D. Wolfram. End user searching on the
    Internet: An analysis of term pair topics
    submitted to the Excite search engine.
    JASIS-2000.
  • H.-T. Pu, S.-L. Chuang, and C. Yang. Subject
    categorization of query terms for exploring Web
    users' search interests. JASIS-2002, 53(8).
  • S.-L. Chuang and L.-F. Chien. Enriching Web
    taxonomies through subject categorization of
    query terms from search engine logs. DSS-2003.
  • Clustering query terms
  • D. Beeferman and A. Berger. Agglomerative
    clustering of a search engine query log.
    KDD-2000.
  • J.-R. Wen, J.-Y. Nie, and H.-J. Zhang.
    Clustering user queries of a search engine.
    WWW-2001, ACM TOIS-2002.
  • S.-L. Chuang and L.-F. Chien. Towards automatic
    generation of query taxonomy: A hierarchical
    query clustering approach. ICDM-2002.
  • Further . . . ?

9
Outline
  • Things about clickthrough data
  • Evaluating search engines using clickthrough data
  • Experiment setup for getting unbiased
    clickthrough data
  • Theoretical analysis
  • Optimizing search engines using clickthrough data

10
The Problem
  • A problem of statistical inference: a hypothesis test
  • Users are only rarely willing to give explicit
    feedback
  • Clickthrough data seem to provide users' implicit
    feedback. Are they suitable for relevance
    judgment? I.e., Click → Relevance?

Which search engine provides better results,
Google or MSNSearch?
11
EXP1: Regular Clickthrough Data
  • Clicks heavily depend on the ranking
    (presentation bias)

12
EXP2: Unbiased Clickthrough Data
  • The criteria for getting unbiased clickthrough data
    when comparing search engines:
  • Blind test: the interface should hide the random
    variables underlying the hypothesis test to avoid
    biasing the user's response
  • Click → preference: the interface should be
    designed so that a click demonstrates a
    particular judgment of the user
  • Low usability impact: the interface should not
    substantially lower the productivity of the user

13
EXP2: Unbiased Clickthrough Data
  • The top l links of the combined ranking contain
    the top ka and kb links from rankings A and B,
    with |ka - kb| ≤ 1.

14
Computing the Combined Ranking
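The slide showed the combination procedure as a figure. A minimal
Python sketch of the idea (my reconstruction of the interleaving
scheme under the |ka - kb| ≤ 1 constraint above, not the paper's
verbatim pseudocode):

    def combine(a, b):
        """Interleave rankings a and b so that, at any cutoff, the
        combined list contains the top ka links of a and the top kb
        links of b with |ka - kb| <= 1."""
        combined, ka, kb = [], 0, 0
        while ka < len(a) and kb < len(b):
            if ka <= kb:              # a is due to contribute its next link
                if a[ka] not in combined:
                    combined.append(a[ka])
                ka += 1
            else:                     # b is due to contribute its next link
                if b[kb] not in combined:
                    combined.append(b[kb])
                kb += 1
        # drain whatever remains once one ranking is exhausted
        combined += [d for d in a[ka:] + b[kb:] if d not in combined]
        return combined

Which ranking leads on ties (here: a) would be chosen at random per
query, so that the comparison remains a blind test.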
15
Experiment
  • Google vs. MSNSearch
  • Experiment data gathered from 3 users,
    9/25-10/18, 2001
  • 180 queries and 211 clicks (1.17 clicks/query,
    2.31 words/query)
  • The top k links for each query were manually
    judged for relevance
  • Questions to examine
  • Does the clickthrough evaluation agree with the
    manual relevance judgments?
  • Click → Preference?
  • Is the experiment design a blind test?

16
Theoretical Analysis
17
Theoretical Analysis Assumption 1
  • Intuitively, this assumption formalizes that
    users click on a relevant link more frequently
    than on a non-relevant link, with the click rates
    differing by at least ε.

18
Theoretical Analysis Assumption 2
  • Intuitively, the assumption states that the only
    reason a user clicks on a particular link is the
    relevance of the link, not other influence
    factors connected with a particular retrieval
    function.

19
Statistical Hypothesis Test
  • Two-tailed paired t-test
  • Binomial sign test

Please refer to the paper if you are interested.
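As a flavor of the simpler of the two, a binomial sign test over
per-query outcomes might look like this (the data and outcome coding
are my illustrative assumptions; the paper defines the exact
statistics):

    from scipy.stats import binomtest

    # Per-query outcomes: +1 if the user clicked more links from
    # engine A, -1 if more from engine B (ties dropped).
    outcomes = [+1, +1, -1, +1, -1, +1, +1, +1, -1, +1]

    wins_a = sum(1 for o in outcomes if o > 0)
    n = len(outcomes)

    # Two-sided sign test of H0: P(A preferred) = 0.5
    result = binomtest(wins_a, n, p=0.5, alternative='two-sided')
    print(f"A preferred on {wins_a}/{n} queries, p = {result.pvalue:.3f}")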
20
Clickthrough vs. Relevance
Google vs. MSNSearch (77 vs. 63)
Google vs. Default (85 vs. 18)
MSNSearch vs. Default (91 vs. 12)
21
Is Assumption 1 Valid?
  • Assumption 1: users click on relevant links more
    often than on non-relevant links, on average

22
Is Assumption 2 Valid?
23
Outline
  • Things about clickthrough data
  • Evaluating search engines using clickthrough data
  • Experiment setup for getting unbiased
    clickthrough data
  • Theoretical analysis and experiments
  • Optimizing search engines using clickthrough data
  • Relevance feedback by clickthrough data
  • A framework for learning of retrieval functions
  • An SVM algorithm for learning of ranking
    functions
  • Experiments

24
An Illustrative Scenario
25
Click ≠ Absolute Relevance Judgment
  • Clickthrough data as a triplet (q, r, c)
  • The presented ranking r depends on the query q, as
    determined by the retrieval function of the search
    engine
  • The set c of clicked-on links depends on both the
    query q and the presented ranking r
  • E.g., highly ranked links are more likely to be
    clicked
  • Hence a click on a particular link cannot be treated
    as an absolute relevance judgment

26
Click → Relative Preference Judgment
  • Assume that the user scanned the ranking from
    top to bottom
  • E.g., for c = {link1, link3, link7}, each clicked link
    is preferred over every unclicked link ranked above it
    (r* denotes the ranking preferred by the user); see the
    sketch after this list

link3 <r* link2
link7 <r* link2, link7 <r* link4, link7 <r* link5, link7 <r* link6
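A small sketch of this extraction rule (my own illustration; the
link names follow the example above):

    def preference_pairs(ranking, clicked):
        """From a presented ranking and the set of clicked links,
        derive pairs (a, b) meaning 'a should rank above b' in r*:
        each clicked link beats every unclicked link above it."""
        pairs = []
        for i, link in enumerate(ranking):
            if link in clicked:
                pairs.extend((link, above) for above in ranking[:i]
                             if above not in clicked)
        return pairs

    ranking = ["link1", "link2", "link3", "link4", "link5",
               "link6", "link7"]
    clicked = {"link1", "link3", "link7"}
    print(preference_pairs(ranking, clicked))
    # [('link3', 'link2'), ('link7', 'link2'), ('link7', 'link4'),
    #  ('link7', 'link5'), ('link7', 'link6')]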
27
A Framework for Learning Retrieval Fun.
  • r* is the optimal ordering; rf(q) is the ordering
    produced by retrieval function f on query q. Both
    r* and rf(q) are binary relations over D × D,
    where D = {d1, ..., dm} is the document collection
  • e.g., if di <r* dj, then (di, dj) ∈ r*
  • Kendall's τ as the performance measure (vs.
    average precision)
  • For a fixed but unknown distribution Pr(q, r*),
    the goal is to learn a retrieval function with
    the maximum expected Kendall's τ

τ(ra, rb) = (P - Q) / (P + Q), where P is the number of
concordant pairs and Q the number of discordant pairs
of the two orderings
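For instance, if two orderings of m = 5 documents disagree on Q = 2
of the C(5,2) = 10 pairs, then τ = (8 - 2)/10 = 0.6. A quick check
with scipy (my example, not from the slides):

    from scipy.stats import kendalltau

    # Rank positions of the same 5 documents under two orderings.
    ra = [1, 2, 3, 4, 5]
    rb = [1, 2, 4, 5, 3]

    tau, _ = kendalltau(ra, rb)
    print(tau)  # 0.6: P = 8 concordant pairs, Q = 2 discordant pairs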
28
An SVM Algo. for Learning Ranking Fun.
  • Given an independently and identically
    distributed training sample S of size n
    containing queries q with their target rankings r*
  • The learner selects a ranking function f from
    a family of ranking functions F that maximizes
    the empirical τ
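In symbols (following the KDD-2002 paper), the learner maximizes the
empirical τ averaged over the n training queries:

    τS(f) = (1/n) Σ_{i=1..n} τ(rf(qi), r*i)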

29
The Ranking SVM Algorithm
  • Consider the class of linear ranking functions
    (spelled out below)

w is a weight vector that is adjusted by learning;
Φ(q, d) is a mapping onto features describing the
match between q and d
  • The goal is to find a w so that the maximum
    number of the following inequalities is fulfilled
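The formulas on this slide, as given in the KDD-2002 paper: the
class of linear ranking functions is defined by

    di <f(q) dj  ⇔  w·Φ(q, di) > w·Φ(q, dj)

and the inequalities to fulfill are, for every training query qk and
every preference pair in its target ranking r*k,

    ∀ (di, dj) ∈ r*k :  w·Φ(qk, di) > w·Φ(qk, dj)

Finding the w that fulfills the maximum number of these inequalities
is intractable, so the paper's Optimization Problem 1 softens them
with slack variables, SVM-style:

    minimize:    V(w, ξ) = (1/2) w·w + C Σ ξijk
    subject to:  w·Φ(qk, di) ≥ w·Φ(qk, dj) + 1 - ξijk
                 for all (di, dj) ∈ r*k and all k
                 ξijk ≥ 0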

30
The Categorization SVM
(Figure: positive and negative training examples separated by the
maximum-margin hyperplane defined by w.)

Learning a hypothesis h(d) = sign(w·d + b) that separates the
examples.

Learning → an optimization problem:

    minimize:    (1/2) w·w + C Σi ξi
    subject to:  yi (w·di + b) ≥ 1 - ξi  and  ξi ≥ 0 for all i

where yi = +1 (-1) if di is in the class (is not)
31
The Ranking SVM Algorithm (cont.)
  • Optimization Problem 1 is convex and has no local
    optima. The constraints can be rearranged as
    w · (Φ(qk, di) - Φ(qk, dj)) ≥ 1 - ξijk
  • This is a classification SVM on the pairwise
    difference vectors Φ(qk, di) - Φ(qk, dj)
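A minimal sketch of this reduction (the paper used SVMlight; the
synthetic data and the scikit-learn classifier here are my own
illustrative substitutes):

    import numpy as np
    from sklearn.svm import LinearSVC

    # Each row is a difference vector Phi(q,di) - Phi(q,dj) for a
    # preference pair "di should rank above dj" (synthetic here).
    rng = np.random.default_rng(0)
    w_true = np.array([1.0, -0.5, 2.0])
    diffs = rng.normal(size=(200, 3))
    labels = np.where(diffs @ w_true > 0, 1, -1)

    # Train on the difference vectors. fit_intercept=False keeps the
    # hyperplane through the origin, as the ranking constraints require.
    clf = LinearSVC(fit_intercept=False, C=1.0)
    clf.fit(diffs, labels)
    w = clf.coef_[0]

    # Rank candidate documents for a new query by projecting onto w.
    docs = rng.normal(size=(5, 3))  # Phi(q, d) for 5 candidate docs
    print("ranking (best first):", np.argsort(-(docs @ w)))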

32
Experiment Setup: Meta Search

(Figure: the Striver meta-search engine forwards the query to
Google, MSNSearch, Excite, Altavista, and Hotbot and combines their
results.)
33
Features
34
Experiment Results
35
Learned Weights of Features
36
Conclusions
  • The first work (evaluating search engines) is
    crucial
  • The feasibility of using clickthrough data to
    evaluate retrieval performance has been verified
  • Clickthrough data (less effort) perform as well
    as manual relevance judgments (more effort) in
    this task.
  • The second (the KDD one) is an interesting piece
    of work on clickthrough data
  • Negative comments
  • The approaches have not been validated on a larger
    scale, so whether the techniques work in real
    settings is still uncertain.
  • Links that are relevant but ranked low remain
    invisible to the user, so they never receive
    clicks or feedback.

37
How Can Clickthrough Data Help?
  • Problem 1
  • How to measure the retrieval quality of a search
    engine? How to compare the performance of two
    search engines? E.g., which provides better
    results, Google or MSNSearch?
  • Users are only rarely willing to give explicit
    feedback.
  • Problem 2
  • How to improve the ranking function of a search
    engine?
  • Can we learn something like "for query q,
    document a should be ranked higher than document
    b"?