Relevance Models In Information Retrieval

1
Relevance Models In Information Retrieval
  • Victor Lavrenko and W. Bruce Croft
  • Center for Intelligent Information Retrieval
  • Department of Computer Science, University of
    Massachusetts Amherst

Presenter: Chia-Hao Lee
2
Outline
  • Introduction
  • Related Work
  • Relevance Models
  • Estimating a Relevance Model
  • Experimental Results
  • Conclusion

3
Introduction
  • The field of information retrieval has been
    primarily concerned with developing algorithms to
    identify relevant pieces of information in
    response to a user's information need.
  • The notion of relevance is central to information
    retrieval, and much research in the area has
    focused on developing formal models of relevance.

4
Introduction (cont.)
  • One of the most popular models, introduced by
    Robertson and Sparck Jones, ranks documents by
    their likelihood of belonging to the relevant
    class of documents for a query.
  • More recently, the language modeling approach
    has shifted the focus from developing heuristic
    weights for representing term importance to
    estimation techniques for the document model.

5
Related Work
  • Classical Probabilistic Approach
  • Underlying most research on probabilistic
    models of information retrieval is the probability
    ranking principle, advocated by Robertson,
    which suggests ranking the documents D by the
    odds of their being observed in the relevant
    class: $P(D|R) / P(D|N)$, where N denotes the
    non-relevant class.

6
Related Work (cont.)
  • Language Modeling Approach
  • Most of these approaches rank the documents
    in the collection by the probability that a query
    Q would be observed during repeated random
    sampling from the model of document D.

7
Related Work (cont.)
  • Cross-Language Approach
  • Language-modeling approaches have been extended
    to cross-language retrieval by Hiemstra and de
    Jong and Xu et al.
  • The model proposed by Berger and Lafferty
    applies to the translation of a document into a
    query in a monolingual environment, but it can
    readily accommodate a bilingual environment.

8
Relevance Models
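  • Define some parameters:
  • V: a vocabulary in some language
  • C: some large collection of documents
  • R: the subset of documents in C that are
    relevant ($R \subset C$)
  • $P(w|R)$: a relevance model, defined to be the
    probability distribution over the words w in V
    for the relevant class R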

9
Relevance Models (cont.)
  • The primary goal of Information Retrieval systems
    is to identify a set of documents relevant to
    some query Q.
  • Unigram language models ignore any short-range
    interactions between the words in a sample of
    text, so we cannot distinguish between
    grammatical and non-grammatical samples of text.
  • The attempts to use higher-order models were few
    and did not lead to noticeable improvements.

10
Relevance Models (cont.)
  • Two approaches to document ranking:
  • the probability ratio, advocated by the classical
    probabilistic models,
  • and cross-entropy.

11
Relevance Models (cont.)
  • Classical probabilistic models
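  • The Probability Ranking Principle suggests that
    we should rank the documents $D \in C$ in order
    of decreasing probability ratio
    $P(D|R) / P(D|N)$, where N is the non-relevant
    class.
  • If we assume a document D to be a sequence of
    independent words $d_1 \ldots d_n$, the probability
    ranking principle may be expressed as a product
    of the ratios:
    $P(D|R) / P(D|N) = \prod_{i=1}^{n} P(d_i|R) / P(d_i|N)$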
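
Ranking by the product of per-word ratios P(w|R) / P(w|N) can be sketched in Python by summing log-ratios, which avoids numerical underflow on long documents. This is a minimal illustration, not the authors' implementation: the toy distributions, the document names and the `floor` guard for unseen words are all assumptions made for the example.

```python
import math

def log_odds_score(doc_tokens, p_rel, p_nonrel, floor=1e-9):
    """Log of the product of ratios P(w|R) / P(w|N) over a document's words."""
    score = 0.0
    for w in doc_tokens:
        # `floor` is an illustrative guard against zero probabilities.
        score += math.log(p_rel.get(w, floor) / p_nonrel.get(w, floor))
    return score

# Toy word distributions for the relevant (R) and non-relevant (N) classes.
p_rel = {"retrieval": 0.5, "model": 0.3, "the": 0.2}
p_nonrel = {"retrieval": 0.1, "model": 0.1, "the": 0.8}

docs = {
    "d1": ["retrieval", "model", "the"],
    "d2": ["the", "the", "model"],
}

# Rank documents by decreasing log-odds of relevance.
ranking = sorted(
    docs,
    key=lambda name: log_odds_score(docs[name], p_rel, p_nonrel),
    reverse=True,
)
print(ranking)
```

Because the logarithm is monotonic, sorting by the summed log-ratios gives the same order as sorting by the product itself.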

12
Relevance Models (cont.)
  • Cross-entropy
  • Let $P(w|R)$ denote the language model of the
    relevant class, and for every document D let
    $P(w|D)$ denote the corresponding document
    language model.
  • Cross-entropy is a natural measure of divergence
    between two language models, defined as
    $H(R \| D) = -\sum_{w \in V} P(w|R) \log P(w|D)$

13
Relevance Models (cont.)
  • Intuitively, documents with small cross-entropy
    from the relevance model are likely to be
    relevant, so we rank the documents by increasing
    cross-entropy.
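  • Cross-entropy enjoys a number of attractive
    theoretical properties.
  • One property is of particular importance: suppose
    we estimate $P(w|R)$ as the relative frequency of
    the word w in the user query Q. Ranking by
    cross-entropy is then equivalent to ranking by
    the query likelihood used in the language
    modeling approach.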
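
The cross-entropy ranking, and its link to query likelihood when P(w|R) is the relative frequency of w in the query, can be checked with a small Python sketch. The document models, the query and the `floor` guard below are made up for illustration; the document models are assumed to be already smoothed.

```python
import math
from collections import Counter

def relative_freq(tokens):
    """Relative frequency of each word in a token list."""
    counts = Counter(tokens)
    return {w: c / len(tokens) for w, c in counts.items()}

def cross_entropy(p_rel, p_doc, floor=1e-9):
    """H(R||D) = -sum_w P(w|R) log P(w|D); `floor` guards unseen words."""
    return -sum(p * math.log(p_doc.get(w, floor)) for w, p in p_rel.items())

def query_log_likelihood(query, p_doc, floor=1e-9):
    """log P(Q|D) under a unigram document model."""
    return sum(math.log(p_doc.get(w, floor)) for w in query)

query = ["relevance", "model"]
doc_models = {
    "d1": {"relevance": 0.2, "model": 0.3, "the": 0.5},
    "d2": {"relevance": 0.05, "model": 0.05, "the": 0.9},
}

# Relevance model estimated from the query alone.
p_rel = relative_freq(query)

# Increasing cross-entropy versus decreasing query likelihood.
by_cross_entropy = sorted(doc_models, key=lambda d: cross_entropy(p_rel, doc_models[d]))
by_likelihood = sorted(
    doc_models, key=lambda d: query_log_likelihood(query, doc_models[d]), reverse=True
)
print(by_cross_entropy, by_likelihood)
```

With this estimate, the cross-entropy equals the negative query log-likelihood divided by the query length, so the two orderings coincide.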

14
Estimating a Relevance Model
  • We discuss a set of techniques that could be used
    to estimate the set of probabilities $P(w|R)$.
  • Estimation from a set of examples.
  • Estimation without examples.
  • Cross-lingual estimation.

15
Estimating a Relevance Model (cont.)
  • Estimation from a set of Examples
  • Let $P(D|R)$ denote the probability of
    randomly picking document D from the relevant
    set R.
  • We assume each relevant document is equally
    likely to be picked at random, so the estimate is
    $P(D|R) = 1/|R|$ if $D \in R$, and 0 otherwise.
  • The probability of observing a word w if we
    randomly pick some word from D is simply the
    relative frequency of w in D:
    $P(w|D) = \#(w,D) / |D|$

16
Estimating a Relevance Model (cont.)
  • Combining the estimates from the above two
    equations, the probability of randomly picking a
    document D and then observing the word w is
    $P(w,D|R) = P(w|D) P(D|R)$
  • The overall probability of observing the word w
    in the relevant class is
    $P(w|R) = \sum_{D \in R} P(w|D) P(D|R)$
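
Estimation from examples amounts to averaging the relative word frequencies of the relevant documents, with each document weighted by 1/|R|. A minimal Python sketch, using two made-up token lists as the relevant set:

```python
from collections import Counter, defaultdict

def doc_model(tokens):
    """P(w|D): relative frequency of w in document D."""
    counts = Counter(tokens)
    return {w: c / len(tokens) for w, c in counts.items()}

def relevance_model(relevant_docs):
    """P(w|R) = sum over D in R of P(w|D) * P(D|R), with P(D|R) = 1/|R|."""
    prior = 1.0 / len(relevant_docs)
    model = defaultdict(float)
    for tokens in relevant_docs:
        for w, p in doc_model(tokens).items():
            model[w] += prior * p
    return dict(model)

# Toy relevant set: two short documents as token lists.
relevant_docs = [["a", "b"], ["a", "a", "c", "c"]]
rm = relevance_model(relevant_docs)
print(rm)
```

Note that this averages per-document frequencies rather than pooling all counts, so a long document does not dominate the estimate.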

17
Estimating a Relevance Model (cont.)
  • Now suppose we have a sufficiently large, but
    incomplete subset of examples $S \subset R$, and
    would like to estimate the relevance model
    $P(w|R)$.
  • The resulting estimator $P(w|S)$ has a
    number of interesting properties:
  • $P(w|S)$ is an unbiased estimator of $P(w|R)$
    for a random subset $S \subset R$.
  • $P(w|S)$ is the maximum-likelihood estimator
    with respect to the set of examples S.
  • $P(w|S)$ is the maximum-entropy probability
    distribution constrained by S.

18
Estimating a Relevance Model (cont.)
  • Most smoothing methods center around a fairly
    simple idea: the estimate is interpolated with a
    background model.
  • $\lambda$ is a parameter that controls the degree
    of smoothing.
  • This connection allows us to interpret smoothing
    as a way of selecting a different prior
    distribution $P(D|R)$.
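
One common form of smoothing is linear interpolation between a document's relative frequencies and a collection-wide background model. The Python sketch below assumes that form; it is an illustration rather than the exact formula from the slide, and the names `p_background` and `lam` are hypothetical.

```python
def smooth(p_doc, p_background, lam):
    """Interpolate a document model with a background model.

    Assumed form: lam * P(w|D) + (1 - lam) * P(w|background).
    Smaller `lam` means heavier smoothing toward the background.
    """
    vocab = set(p_doc) | set(p_background)
    return {
        w: lam * p_doc.get(w, 0.0) + (1.0 - lam) * p_background.get(w, 0.0)
        for w in vocab
    }

# Toy models: the document only contains "a"; the background also knows "b".
p_doc = {"a": 1.0}
p_background = {"a": 0.5, "b": 0.5}
smoothed = smooth(p_doc, p_background, lam=0.8)
print(smoothed)
```

After smoothing, the unseen word "b" receives non-zero probability, which keeps ratios and logarithms in the ranking formulas well defined.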

19
Estimating a Relevance Model (cont.)
  • Estimation without Examples
  • Estimation of relevance models when no training
    examples are available.
  • As a running example we will consider the task
    of ad-hoc information retrieval, where we are
    given a short 2-3 word query, indicative of the
    user's information need, and no examples of
    relevant documents.

20
Estimating a Relevance Model (cont.)
  • Our best bet is to relate the probability of w
    to the conditional probability of observing w
    given that we just observed $q_1 \ldots q_k$:
    $P(w|R) \approx P(w|q_1 \ldots q_k) = P(w, q_1 \ldots q_k) / P(q_1 \ldots q_k)$
  • We are translating a set of words into a single
    word.

21
Estimating a Relevance Model (cont.)
  • Method 1: i.i.d. sampling
  • Assume that the query words $q_1 \ldots q_k$ and
    the words w in relevant documents are sampled
    identically and independently from a unigram
    distribution M.
  • We assumed that w and all $q_i$ are sampled
    independently and identically to each other:
    $P(w, q_1 \ldots q_k | M) = P(w|M) \prod_{i=1}^{k} P(q_i|M)$

22
Estimating a Relevance Model (cont.)
  • Combination:
    $P(w, q_1 \ldots q_k) = \sum_{M \in C} P(M) P(w|M) \prod_{i=1}^{k} P(q_i|M)$
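
Under i.i.d. sampling, the joint probability of w and the query words is a sum over unigram models M of P(M) P(w|M) times the product of P(q_i|M). The Python sketch below assumes a uniform prior P(M) and two made-up, already smoothed document models; it then normalizes the joint over the vocabulary to obtain P(w|q_1...q_k).

```python
def method1_joint(w, query, models):
    """P(w, q1..qk) = sum_M P(M) * P(w|M) * prod_i P(q_i|M), uniform P(M)."""
    prior = 1.0 / len(models)
    total = 0.0
    for m in models.values():
        term = prior * m.get(w, 0.0)
        for q in query:
            term *= m.get(q, 0.0)
        total += term
    return total

def relevance_model_m1(query, models):
    """Normalize the joint over the vocabulary to get P(w | q1..qk)."""
    vocab = set().union(*models.values())
    joint = {w: method1_joint(w, query, models) for w in vocab}
    norm = sum(joint.values())
    return {w: p / norm for w, p in joint.items()}

# Toy unigram models, one per document, assumed to be already smoothed.
models = {
    "M1": {"ir": 0.6, "model": 0.3, "cat": 0.1},
    "M2": {"ir": 0.1, "model": 0.1, "cat": 0.8},
}
rm = relevance_model_m1(["ir"], models)
print(rm)
```

Models that explain the query well contribute most of the mass, so words that co-occur with the query terms in those models rise in the estimated relevance model.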

23
Estimating a Relevance Model (cont.)
  • Method 2 conditional sampling
  • We fix a value of w according to some prior
    $P(w)$. Then perform the following process k
    times: pick a distribution $M_i$ according to
    $P(M_i|w)$, then sample the query word $q_i$ from
    $M_i$ with probability $P(q_i|M_i)$.
  • The effect of this sampling strategy is that we
    assume the query words $q_1 \ldots q_k$ to be
    independent of each other, while each still
    depends on w.

24
Estimating a Relevance Model (cont.)
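  • We compute the expectation over the universe C
    of our unigram models:
    $P(q_i|w) = \sum_{M_i \in C} P(q_i|M_i) P(M_i|w)$
  • Combination:
    $P(w, q_1 \ldots q_k) = P(w) \prod_{i=1}^{k} \sum_{M_i \in C} P(M_i|w) P(q_i|M_i)$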
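
Conditional sampling computes the joint as P(w) times a product, over the query words, of the expectation of P(q_i|M) under P(M|w). In the Python sketch below, P(M|w) is obtained by Bayes' rule from a uniform prior over models, and P(w) is the marginal under that same prior; both choices, and the toy models, are assumptions made for the example.

```python
def method2_joint(w, query, models):
    """P(w, q1..qk) = P(w) * prod_i sum_M P(q_i|M) * P(M|w)."""
    prior = 1.0 / len(models)
    # Marginal P(w) = sum_M P(w|M) P(M), assuming a uniform prior P(M).
    p_w = sum(prior * m.get(w, 0.0) for m in models.values())
    if p_w == 0.0:
        return 0.0
    joint = p_w
    for q in query:
        # Bayes' rule: P(M|w) = P(w|M) P(M) / P(w).
        joint *= sum(
            m.get(q, 0.0) * (prior * m.get(w, 0.0) / p_w) for m in models.values()
        )
    return joint

def relevance_model_m2(query, models):
    """Normalize the joint over the vocabulary to get P(w | q1..qk)."""
    vocab = set().union(*models.values())
    joint = {w: method2_joint(w, query, models) for w in vocab}
    norm = sum(joint.values())
    return {w: p / norm for w, p in joint.items()}

# Toy unigram models, assumed to be already smoothed.
models = {
    "M1": {"ir": 0.6, "model": 0.3, "cat": 0.1},
    "M2": {"ir": 0.1, "model": 0.1, "cat": 0.8},
}
rm = relevance_model_m2(["ir", "model"], models)
print(rm)
```

For a single-word query the two methods give the same joint; they differ once the query has several words, because here each query word may be drawn from a different model M_i.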

25
Estimating a Relevance Model (cont.)
Figure
26
Estimating a Relevance Model (cont.)
  • Cross-lingual Estimation
  • Goal: estimate the relevance model in some
    target language, different from the language of
    the query.
  • Let $Q = q_1 \ldots q_k$ be the query in the
    source language and
  • let $R_T$ be the unknown set of target documents
    that are relevant to that query.

27
Estimating a Relevance Model (cont.)
  • We need the probability distribution
    $P(t|R_T)$ for every word t in the vocabulary of
    the target language.
  • An implicit assumption behind equation (2.22) is
    that there exists a joint probabilistic model
    from which we can compute the joint
    probability $P(t, q_1 \ldots q_k)$.
  • The difficulty is that $q_i$ and t represent
    words from different languages, and so will not
    naturally occur in the same documents.

28
Estimating a Relevance Model (cont.)
  • 1. Estimation with a parallel corpus
  • Suppose we have at our disposal a parallel
    corpus C, a collection of document pairs
    $\{M_S, M_T\}$, where $M_S$ is a document in the
    source language, and $M_T$ is a document in the
    target language discussing the same topic as
    $M_S$.
  • Method 1
  • Method 2
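
With a parallel corpus, one natural way to apply i.i.d. sampling across languages is to score the query words against the source half of each document pair and the target word against the target half. The Python sketch below assumes exactly that, with a uniform prior over pairs; the formula it uses is a plausible reading rather than a transcription of the slide, and the English/French toy models are made up.

```python
def cross_lingual_joint(t, query, pairs):
    """P(t, q1..qk) = sum over pairs of P(pair) * P(t|M_T) * prod_i P(q_i|M_S).

    Assumed form: uniform P(pair); each pair is (source_model, target_model).
    """
    prior = 1.0 / len(pairs)
    total = 0.0
    for source_model, target_model in pairs:
        term = prior * target_model.get(t, 0.0)
        for q in query:
            term *= source_model.get(q, 0.0)
        total += term
    return total

def target_relevance_model(query, pairs):
    """Normalize the joint over the target vocabulary."""
    vocab = set()
    for _, target_model in pairs:
        vocab |= set(target_model)
    joint = {t: cross_lingual_joint(t, query, pairs) for t in vocab}
    norm = sum(joint.values())
    return {t: p / norm for t, p in joint.items()}

# Toy parallel corpus: (source-language model, target-language model) pairs.
pairs = [
    ({"cat": 0.7, "dog": 0.3}, {"chat": 0.8, "chien": 0.2}),
    ({"cat": 0.2, "dog": 0.8}, {"chat": 0.1, "chien": 0.9}),
]
rm_target = target_relevance_model(["cat"], pairs)
print(rm_target)
```

The query never has to be translated word by word: the pairing of documents carries the link between the two vocabularies.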

29
Estimating a Relevance Model (cont.)
  • 2. Estimation with a statistical lexicon
  • A statistical lexicon is a special dictionary
    which gives the translation
    probability $p(s|t)$ for every source word s and
    every target word t.
  • In this case we let C be the set of target
    documents $M_T$.
  • To compute $P(s|M_T)$ for a source word s in a
    target document $M_T$, the translation
    probabilities are combined with the document
    model, e.g. $P(s|M_T) = \sum_{t} p(s|t) P(t|M_T)$.
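
A statistical lexicon lets a target-language document model assign probability to a source-language word by summing translation probabilities over the target words. The Python sketch below assumes the unsmoothed mixture P(s|M_T) = sum over t of p(s|t) P(t|M_T); the lexicon entries and the document model are invented for illustration, and a practical system would likely smooth this estimate.

```python
def source_word_prob(s, target_doc_model, lexicon):
    """P(s|M_T) = sum_t p(s|t) * P(t|M_T), an assumed unsmoothed mixture.

    `lexicon` maps (source_word, target_word) to the translation
    probability p(s|t); missing entries count as zero.
    """
    return sum(
        lexicon.get((s, t), 0.0) * p_t for t, p_t in target_doc_model.items()
    )

# Toy lexicon: for each target word t, p(.|t) sums to 1 over source words.
lexicon = {
    ("cat", "chat"): 0.9,
    ("dog", "chat"): 0.1,
    ("cat", "chien"): 0.05,
    ("dog", "chien"): 0.95,
}
target_doc_model = {"chat": 0.8, "chien": 0.2}

p_cat = source_word_prob("cat", target_doc_model, lexicon)
p_dog = source_word_prob("dog", target_doc_model, lexicon)
print(p_cat, p_dog)
```

Because each p(.|t) sums to one over source words, the resulting P(s|M_T) is itself a proper distribution over the source vocabulary.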

30
Experiment
  • Experiment Setup
  • English Resources

31
Experiment (cont.)
  • Chinese Resources

32
Experiment (cont.)
  • Ad-hoc Retrieval Experiments

33
Experiment (cont.)
  • Comparison of Ranking Methods

34
Experiment (cont.)
35
Experiment (cont.)
36
Experiment (cont.)
  • Relevance Feedback

37
Experiment (cont.)
38
Experiment (cont.)
39
Experiment (cont.)
  • Cross-Lingual Experiments

40
Experiment (cont.)
41
Conclusions
  • In this work we introduced a formal framework for
    modeling the notion of relevance in Information
    Retrieval.
  • We defined a relevance model to be the
    language model that reflects word frequencies in
    the class of documents that are relevant to some
    given information need.