Diversifying Search Results

1
Diversifying Search Results
  • Rakesh Agrawal, Sreenivas Gollapudi, Alan
    Halverson, Samuel Ieong
  • Search Labs, Microsoft Research
  • WSDM, February 10, 2009

2
Ambiguity and Diversification
  • Many queries are ambiguous
  • Barcelona (City? Football team? Movie?)
  • Michael Jordan

Michael I. Jordan
Michael J. Jordan
3
Ambiguity and Diversification
  • Many queries are ambiguous
  • Barcelona (City? Football team? Movie?)
  • Michael Jordan (which one?)
  • How best to answer ambiguous queries?
  • Use context, make suggestions, ...
  • Under the premise of returning a single (ordered)
    set of results, how best to diversify the search
    results so that a user will find something useful?

4
Intuition behind Our Approach
  • Analyze click logs for classifying queries and
    docs
  • Maximize the probability that the average user
    will find a relevant document in the retrieved
    results
  • Use the analogy of marginal utility to determine
    whether to include more results from an already
    covered category

5
Outline
  • Problem formulation
  • Theoretical analysis
  • Metrics to measure diversity
  • Experiments

6
Assumptions
  • A taxonomy (categorization of intents) C
  • For each query q, P(c | q) denotes the
    distribution of intents
  • ∑_{c ∈ C} P(c | q) = 1
  • Quality assessment of documents at the intent
    level
  • For each doc d, V(d | q, c) denotes the
    probability of the doc satisfying intent c (see
    the sketch below)
  • Conditional independence
  • Users are interested in finding at least one
    satisfying document
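
A toy illustration of these inputs, assuming P(c | q) and V(d | q, c) are
stored as plain Python dictionaries; the query, categories, documents, and
probability values below are hypothetical, not taken from the paper.

    # Toy inputs for one ambiguous query (all values are hypothetical).
    # P[c]    = P(c | q):    distribution of intents for the query, sums to 1.
    # V[d][c] = V(d | q, c): probability that document d satisfies intent c.
    query = "jaguar"                      # hypothetical ambiguous query
    P = {"animal": 0.6, "car": 0.4}       # P(c | q)
    V = {
        "doc_zoo":    {"animal": 0.9, "car": 0.0},
        "doc_dealer": {"animal": 0.0, "car": 0.8},
        "doc_wiki":   {"animal": 0.5, "car": 0.5},
    }
    assert abs(sum(P.values()) - 1.0) < 1e-9   # ∑_{c ∈ C} P(c | q) = 1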

7
Problem Statement
  • Diversify(k)
  • Given a query q, a set of documents D, a
    distribution P(c | q), quality estimates
    V(d | q, c), and an integer k
  • Find a set of docs S ⊆ D with |S| = k that
    maximizes
    P(S | q) = ∑_{c ∈ C} P(c | q) (1 - ∏_{d ∈ S} (1 - V(d | q, c)))
  • Interpreted as the probability that the set S is
    relevant to the query over all possible intents

The outer sum over c accounts for the multiple
intents; the 1 - ∏ term is the probability of
finding at least one relevant doc for that intent
(see the sketch below).
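
A minimal sketch of evaluating this objective, assuming the dictionary
representation from the earlier sketch; an illustration under those
assumptions, not code from the paper.

    from typing import Dict, Iterable

    def p_set_relevant(S: Iterable[str],
                       P: Dict[str, float],
                       V: Dict[str, Dict[str, float]]) -> float:
        """P(S | q) = sum_c P(c | q) * (1 - prod_{d in S} (1 - V(d | q, c)))."""
        total = 0.0
        for c, p_c in P.items():
            miss = 1.0                      # probability no doc in S satisfies intent c
            for d in S:
                miss *= 1.0 - V[d].get(c, 0.0)
            total += p_c * (1.0 - miss)
        return total

    # Hypothetical check: two docs that each cover a different intent.
    P = {"animal": 0.6, "car": 0.4}
    V = {"doc_zoo": {"animal": 0.9}, "doc_dealer": {"car": 0.8}}
    print(p_set_relevant(["doc_zoo", "doc_dealer"], P, V))   # 0.6*0.9 + 0.4*0.8 = 0.86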
8
Discussion of Objective
  • Makes explicit use of taxonomy
  • In contrast, similarity-based [CG98, CK06,
    RKJ08]
  • Captures both diversification and doc relevance
  • In contrast, coverage-based [Z05, C08,
    V08]
  • Specific form of loss minimization [Z02,
    ZL06]
  • Diminishing returns for docs w/ the same intent
  • Objective is order-independent
  • Assumes that all users read k results
  • May want to optimize ∑_k P(k) P(S_k | q), where
    S_k is the top-k result set

9
Outline
  • Problem formulation
  • Theoretical analysis
  • Metrics to measure diversity
  • Experiments

10
Properties of the Objective
  • Diversify(k) is NP-Hard
  • Reduction from Max-Cover
  • No single ordering that will optimize for all k
  • Can we make use of diminishing returns?

11
A Greedy Algorithm
  • Input: k, q, C, D, P(c | q), V(d | q, c)
  • Output: set of documents S
  • S ← Ø
  • ∀c ∈ C, U(c | q) ← P(c | q)
  • while |S| < k do
  •   for d ∈ D do
  •     g(d | q, c) ← ∑_c U(c | q) V(d | q, c)
  •   end for
  •   d ← argmax g(d | q, c)
  •   S ← S ∪ {d}
  •   ∀c ∈ C, U(c | q) ← (1 - V(d | q, c)) U(c | q)
  •   D ← D \ {d}
  • end while

U(c | q): conditional prob of intent c given
query q
g(d | q, c): current prob of d satisfying q, c
Update the posterior (see the runnable sketch below)
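
A runnable sketch of the greedy selection above, under the same dictionary
representation as the earlier sketches; the usage example borrows the intent
distribution P(R | q) = 0.8, P(B | q) = 0.2 from the next slide, but the V
values are made up.

    from typing import Dict, List

    def greedy_diversify(k: int,
                         P: Dict[str, float],              # P(c | q)
                         V: Dict[str, Dict[str, float]]    # V[d][c] = V(d | q, c)
                         ) -> List[str]:
        """Greedily pick k docs, discounting intents already covered."""
        U = dict(P)                 # U(c | q): not-yet-satisfied intent mass
        D = set(V)                  # remaining candidate documents
        S: List[str] = []
        while D and len(S) < k:
            # g(d | q, c) = ∑_c U(c | q) V(d | q, c) for each remaining d
            g = {d: sum(U[c] * V[d].get(c, 0.0) for c in U) for d in D}
            best = max(g, key=g.get)
            S.append(best)
            for c in U:             # posterior update: prob. intent c is still unsatisfied
                U[c] *= 1.0 - V[best].get(c, 0.0)
            D.remove(best)
        return S

    # Usage with P(R | q) = 0.8, P(B | q) = 0.2 (V values hypothetical).
    P = {"R": 0.8, "B": 0.2}
    V = {"d1": {"R": 0.9}, "d2": {"R": 0.8, "B": 0.1}, "d3": {"B": 0.9}}
    print(greedy_diversify(3, P, V))    # ['d1', 'd3', 'd2']: covers B before a second R doc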
12
A Greedy Algorithm
  • Intent distribution: P(R | q) = 0.8, P(B | q) =
    0.2

[Worked-example table from the slide: candidate
documents D with their V(d | q, c) values, the
computed g(d | q, c) at each round, the updated
U(R | q) and U(B | q), and the selected set S]

  • Actually produces an ordered set of results
  • Results not proportional to intent distribution
  • Results not according to (raw) quality
  • Better results → fewer results need to be shown
13
Formal Claims
  • Lemma 1: P(S | q) is submodular
  • Same intuition as diminishing returns
    (spot-checked in the sketch below)
  • For sets of documents S ⊆ T and a document d ∉ T:
    P(S ∪ {d} | q) - P(S | q) ≥ P(T ∪ {d} | q) - P(T | q)
  • Theorem 1: Solution is a (1 - 1/e) approximation
    of optimal
  • Consequence of Lemma 1 and [NWF78]
  • Theorem 2: Solution is optimal when each document
    can only satisfy one category
  • Relative quality of docs does not change
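
The diminishing-returns claim can be spot-checked numerically. A minimal
sketch, assuming the same dictionary representation as the earlier sketches;
the intents, documents, and V values are random and hypothetical.

    import itertools
    import random

    def prod(xs):
        out = 1.0
        for x in xs:
            out *= x
        return out

    def p_set_relevant(S, P, V):
        """P(S | q) = sum_c P(c | q) * (1 - prod_{d in S} (1 - V(d | q, c)))."""
        return sum(p_c * (1.0 - prod(1.0 - V[d].get(c, 0.0) for d in S))
                   for c, p_c in P.items())

    random.seed(0)
    docs = [f"d{i}" for i in range(5)]
    P = {"c1": 0.7, "c2": 0.3}                              # hypothetical intents
    V = {d: {c: random.random() for c in P} for d in docs}  # hypothetical quality

    # For every S ⊆ T and every d ∉ T, adding d to the smaller set
    # must gain at least as much as adding it to the larger set.
    for r in range(len(docs) + 1):
        for T in itertools.combinations(docs, r):
            for s in range(r + 1):
                for S in itertools.combinations(T, s):
                    for d in docs:
                        if d in T:
                            continue
                        gain_S = p_set_relevant(S + (d,), P, V) - p_set_relevant(S, P, V)
                        gain_T = p_set_relevant(T + (d,), P, V) - p_set_relevant(T, P, V)
                        assert gain_S >= gain_T - 1e-12
    print("diminishing returns held on all checked subsets")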

14
Outline
  • Problem formulation
  • Theoretical analysis
  • Metrics to measure diversity
  • Experiments

15
How to Measure Success?
  • Many metrics for relevance
  • Normalized discounted cumulative gain at k
    (NDCG@k) [JK00]
  • Mean average precision at k (MAP@k)
  • Mean reciprocal rank (MRR)
  • Some metrics for diversity
  • Maximal marginal relevance (MMR) [CG98]
  • Nugget-based instantiation of NDCG [C08]
  • Want a metric that can take into account both
    relevance and diversity

16
Generalizing Relevance Metrics
  • Take expectation over distribution of intents
  • Interpretation: how will the average user feel?
  • Consider NDCG@k
  • Classic: NDCG@k = DCG@k / IDCG@k, with
    DCG@k = ∑_{j=1}^{k} f(rel(s_j)) · discount(j)
  • NDCG-IA depends on the intent distribution and the
    intent-specific NDCG (see the sketch below):
    NDCG-IA(S, k) = ∑_c P(c | q) NDCG(S, k | c)
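
A minimal sketch of this expectation, assuming the per-intent NDCG@k values
have already been computed; the intents and numbers below are hypothetical.

    from typing import Dict

    def ndcg_ia(P: Dict[str, float], ndcg_per_intent: Dict[str, float]) -> float:
        """NDCG-IA(S, k) = ∑_c P(c | q) * NDCG(S, k | c)."""
        return sum(p_c * ndcg_per_intent.get(c, 0.0) for c, p_c in P.items())

    # Hypothetical example: two intents with different per-intent NDCG@k.
    P = {"city": 0.7, "team": 0.3}                 # P(c | q)
    ndcg_per_intent = {"city": 0.9, "team": 0.4}   # NDCG@k computed per intent
    print(ndcg_ia(P, ndcg_per_intent))             # 0.7*0.9 + 0.3*0.4 = 0.75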

17
Outline
  • Problem formulation
  • Theoretical analysis
  • Metrics to measure diversity
  • Experiments

18
Setup
  • 10,000 queries randomly sampled from logs
  • Queries classified according to ODP (level 2)
    [F08]
  • Keep only queries with at least two intents
    (900)
  • Top 50 results from Live, Google, and Yahoo!
  • Documents are rated on a 5-pt scale
  • >90% of docs have ratings
  • Docs w/o ratings are assigned a random grade
    according to the distribution of rated documents

19
Experiment Detail
  • Documents are classified using a Rocchio
    classifier
  • Assumes that each doc belongs to only one
    category
  • Quality scores of documents are estimated based
    on textual and link features of the webpage
  • Our approach is agnostic of how quality is
    determined
  • Can be interpreted as a re-ordering of search
    results that takes into account ambiguities in
    queries
  • Evaluation using generalized NDCG, MAP, and MRR
  • f(relevance(d)) = 2^rel(d); discount(j) =
    1 / lg2(j) (see the sketch below)
  • Take P(c | q) as ground truth
20
NDCG-IA
21
MAP-IA and MRR-IA
22
Evaluation using Mechanical Turk
  • Created two types of HITs on Mechanical Turk
  • Query classification: workers are asked to choose
    among three interpretations
  • Document rating (under the given interpretation)
  • Two additional evaluations
  • MT classification + current ratings
  • MT classification + MT document ratings

23
Evaluation using Mechanical Turk
24
Concluding Remarks
  • Theoretical approach to diversification supported
    by empirical evaluation
  • What to show is a function of both intent
    distribution and quality of documents
  • Less is needed when quality is high
  • There are additional flexibilities in our
    approach
  • Not tied to any taxonomy
  • Can make use of context as well

25
Future Work
  • When is it right to diversify?
  • Users have certain expectations about the
    workings of a search engine
  • What is the best way to diversify?
  • Evaluate approaches beyond diversifying the
    retrieved results
  • Metrics that capture both relevance and diversity
  • Some preliminary work suggests that there will be
    certain trade-offs to make

26
Thanks
  • {rakesha, sreenig, alanhal, sieong}@microsoft.com