Title: Relevance Models In Information Retrieval

1
Relevance Models In Information Retrieval

Victor Lavrenko and W. Bruce Croft
Center for Intelligent Information Retrieval
Department of Computer Science, University of
Massachusetts Amherst

Presenter: Chia-Hao Lee
2
Outline

Introduction
Related Work
Relevance Models
Estimating a Relevance Model
Experimental Results
Conclusion

3
Introduction

The field of information retrieval has been
primarily concerned with developing algorithms to
identify relevant pieces of information in
response to a user's information need.
The notion of relevance is central to information
retrieval, and much research in the area has focused
on developing formal models of relevance.

4
Introduction (cont.)

One of the most popular models, introduced by
Robertson and Sparck Jones, ranks documents by
their likelihood of belonging to the relevant
class of documents for a query.
More recently, the language modeling approach has
shifted the focus from developing heuristic weights
for representing term importance to estimation
techniques for the document model.

5
Related Work

Classical Probabilistic Approach
Underlying most research on probabilistic
models of information retrieval is the Probability
Ranking Principle, advocated by Robertson,
which suggests ranking the documents D by the
odds of their being observed in the relevant
class: $P(D|R) / P(D|N)$, where N denotes the
non-relevant class.

6
Related Work (cont.)

Language Modeling Approach
Most of these approaches rank the documents
in the collection by the probability that a query
Q would be observed during repeated random
sampling from the model of document D: $P(Q|M_D)$.

7
Related Work (cont.)

Cross-Language Approach
Language-modeling approaches have been extended
to cross-language retrieval by Hiemstra and de
Jong and Xu et al.
The model proposed by Berger and Lafferty
applies to the translation of a document into a
query in a monolingual environment, but it can
readily accommodate a bilingual environment.

8
Relevance Models

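Define some notation:
V: a vocabulary in some language
C: some large collection of documents
R: the subset of documents in C that are relevant
to some information need ($R \subset C$)
$P(w|R)$: a relevance model, defined to be the
probability distribution over the words w in V
for the relevant class R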

9
Relevance Models (cont.)

The primary goal of Information Retrieval systems
is to identify a set of documents relevant to
some query Q.
Unigram language models ignore any short-range
interactions between the words in a sample of
text, so we cannot distinguish between
grammatical and non-grammatical samples of text.
The attempts to use higher-order models were few
and did not lead to noticeable improvements.

10
Relevance Models (cont.)

Two approaches to document ranking: the
probability ratio, advocated by the classical
probabilistic models, and cross-entropy.

11
Relevance Models (cont.)

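Classical probabilistic models
The Probability Ranking Principle suggests that
we should rank the documents $D \in C$
in order of decreasing probability ratio
$P(D|R) / P(D|N)$, where R and N denote the
relevant and non-relevant classes.
If we assume a document D to be a sequence of
independent words $d_1 \ldots d_n$, the probability
ranking principle may be expressed as a product
of the ratios:
$\frac{P(D|R)}{P(D|N)} \approx \prod_{i=1}^{n} \frac{P(d_i|R)}{P(d_i|N)}$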
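
As a rough illustration of ranking by a product of per-word ratios, the sketch below scores toy documents by the sum of log ratios, which gives the same ordering as the product. The word probabilities, the non-relevant model `p_nonrel`, and the `floor` used for unseen words are made-up assumptions for the example, not values from the presentation.

```python
import math

def log_probability_ratio(doc_words, p_rel, p_nonrel, floor=1e-9):
    """Sum over the document's words of log P(w|R) / P(w|N).

    `floor` is an assumed stand-in probability for words missing from a model.
    """
    return sum(
        math.log(p_rel.get(w, floor) / p_nonrel.get(w, floor))
        for w in doc_words
    )

# Hypothetical word distributions for the relevant and non-relevant classes.
p_rel = {"retrieval": 0.5, "model": 0.4, "cat": 0.1}
p_nonrel = {"retrieval": 0.2, "model": 0.2, "cat": 0.6}

docs = {
    "d1": ["retrieval", "model"],
    "d2": ["cat", "cat", "model"],
}

scores = {name: log_probability_ratio(words, p_rel, p_nonrel)
          for name, words in docs.items()}

# Rank by decreasing probability ratio.
ranking = sorted(docs, key=scores.get, reverse=True)
```

Working in log space avoids numerical underflow when documents are long.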

12
Relevance Models (cont.)

Cross-entropy
Let $P(w|R)$ denote the language model of the
relevant class, and for every document D let
$P(w|D)$ denote the corresponding document language
model.
Cross-entropy is a natural measure of divergence
between two language models, defined as
$H(R \| D) = -\sum_{w \in V} P(w|R) \log P(w|D)$

13
Relevance Models (cont.)

Intuitively, documents with small cross-entropy
from the relevance model are likely to be
relevant, so we rank the documents by increasing
cross-entropy.
Cross-entropy enjoys a number of attractive
theoretical properties.
One property is of particular importance: suppose
we estimate $P(w|R)$ as the relative frequency of
the word w in the user query Q. In this case,
ranking documents by cross-entropy is equivalent
to ranking them by the query likelihood.
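
The sketch below ranks toy documents by increasing cross-entropy from a model estimated as relative word frequencies in the query, and checks that the ordering matches ranking by query log-likelihood. The document models are hypothetical and assumed to be already smoothed, so every query word has non-zero probability.

```python
import math

def cross_entropy(p_rel, p_doc):
    """H(R || D) = -sum_w P(w|R) log P(w|D), over words with P(w|R) > 0."""
    return -sum(p * math.log(p_doc[w]) for w, p in p_rel.items() if p > 0)

def query_log_likelihood(query, p_doc):
    """log P(Q|D) under a unigram document model."""
    return sum(math.log(p_doc[w]) for w in query)

query = ["relevance", "model", "model"]
# Relevance model estimated as relative word frequency in the query.
p_query = {w: query.count(w) / len(query) for w in set(query)}

# Hypothetical, already smoothed document language models.
doc_models = {
    "d1": {"relevance": 0.3, "model": 0.5, "other": 0.2},
    "d2": {"relevance": 0.1, "model": 0.2, "other": 0.7},
}

by_cross_entropy = sorted(
    doc_models, key=lambda d: cross_entropy(p_query, doc_models[d]))
by_query_likelihood = sorted(
    doc_models, key=lambda d: query_log_likelihood(query, doc_models[d]),
    reverse=True)
```

With this estimate the cross-entropy equals the negative query log-likelihood divided by the query length, which is why the two orderings agree.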

14
Estimating a Relevance Model

We discuss a set of techniques that could be used
to estimate the set of probabilities $P(w|R)$:
Estimation from a set of examples.
Estimation without examples.
Cross-lingual estimation.

15
Estimating a Relevance Model (cont.)

Estimation from a set of Examples
Let $P(D|R)$ denote the probability of
randomly picking document D from the relevant
set R.
We assume each relevant document is equally
likely to be picked at random, so the estimate is
$P(D|R) = 1/|R|$ if $D \in R$, and 0 otherwise.
The probability of observing a word w if we
randomly pick some word from D is simply the
relative frequency of w in D:
$P(w|D) = \#(w,D) / |D|$

16
Estimating a Relevance Model (cont.)

Combining the estimates from the above two
equations, the probability of randomly picking a
document D and then observing the word w is
$P(w,D|R) = P(w|D) P(D|R)$
The overall probability of observing the word w
in the relevant class is
$P(w|R) = \sum_{D \in R} P(w|D) P(D|R)$
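
A minimal sketch of estimation from examples, assuming each relevant document is a list of word tokens and is picked with equal probability: the relevance model is the average of the per-document relative frequencies. The two example documents are invented for illustration.

```python
from collections import Counter

def relevance_model(relevant_docs):
    """P(w|R) = sum over D in R of P(w|D) * P(D|R), with P(D|R) = 1/|R|."""
    n_docs = len(relevant_docs)
    model = Counter()
    for words in relevant_docs:
        counts = Counter(words)
        for w, c in counts.items():
            # Relative frequency of w in D, weighted by the uniform P(D|R).
            model[w] += (c / len(words)) / n_docs
    return dict(model)

# Two hypothetical relevant documents.
examples = [["a", "b"], ["a", "a", "c", "c"]]
rm = relevance_model(examples)
```

Note that each document contributes equally regardless of its length, which is what the uniform choice of document implies.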

17
Estimating a Relevance Model (cont.)

Now suppose we have a sufficiently large, but
incomplete subset of examples $S \subset R$, and
would like to estimate the relevance model
$P(w|R)$.
We can apply the same estimate to S. The resulting
estimator $P(w|S)$ has a number of interesting
properties:
$P(w|S)$ is an unbiased estimator of $P(w|R)$
for a random subset $S \subset R$.
$P(w|S)$ is the maximum-likelihood estimator
with respect to the set of examples S.
$P(w|S)$ is the maximum-entropy probability
distribution constrained by S.

18
Estimating a Relevance Model (cont.)

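Most smoothing methods center around a fairly
simple idea: interpolating the estimate with a
background model, e.g.
$P_{smooth}(w|R) = (1 - \lambda) P_{ml}(w|R) + \lambda P_{bg}(w)$
$\lambda$ is a parameter that controls the degree of
smoothing.
This connection allows us to interpret smoothing
as a way of selecting a different prior
distribution.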
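
As one possible reading of the smoothing idea, the sketch below linearly interpolates a maximum-likelihood estimate with a background distribution. The interpolation form, the names `p_ml` and `p_bg`, and the value of `lam` are assumptions made for illustration; the presentation does not specify them.

```python
def smooth(p_ml, p_bg, lam):
    """Return (1 - lam) * p_ml + lam * p_bg over the union of both vocabularies.

    lam = 0 leaves the estimate unchanged; lam = 1 returns the background model.
    """
    vocab = set(p_ml) | set(p_bg)
    return {
        w: (1.0 - lam) * p_ml.get(w, 0.0) + lam * p_bg.get(w, 0.0)
        for w in vocab
    }

# Hypothetical estimates: "b" is unseen in the examples but present in the background.
p_ml = {"a": 1.0}
p_bg = {"a": 0.5, "b": 0.5}
smoothed = smooth(p_ml, p_bg, lam=0.2)
```

The unseen word receives non-zero probability after smoothing, which keeps log-based scores such as cross-entropy finite.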

19
Estimating a Relevance Model (cont.)

Estimation without Examples
We now consider estimation of relevance models
when no training examples are available.
As a running example we will consider the task
of ad-hoc information retrieval, where we are
given a short 2-3 word query, indicative of the
user's information need, and no examples of
relevant documents.

20
Estimating a Relevance Model (cont.)

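Our best bet is to relate the probability of w
to the conditional probability of observing w
given that we just observed $q_1 \ldots q_k$:
$P(w|R) \approx P(w|q_1 \ldots q_k) = \frac{P(w, q_1 \ldots q_k)}{P(q_1 \ldots q_k)}$
We are translating a set of words into a single
word.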

21
Estimating a Relevance Model (cont.)

Method 1: i.i.d. sampling
Assume that the query words $q_1 \ldots q_k$ and the
words w in relevant documents are sampled
identically and independently from a unigram
distribution M.
Since w and all $q_i$ are sampled independently
and identically to each other:
$P(w, q_1 \ldots q_k | M) = P(w|M) \prod_{i=1}^{k} P(q_i|M)$

22
Estimating a Relevance Model (cont.)

Combination:
$P(w, q_1 \ldots q_k) = \sum_{M \in C} P(M) P(w|M) \prod_{i=1}^{k} P(q_i|M)$
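
A minimal sketch of Method 1, assuming each document in the collection supplies one unigram model and the prior over models is uniform unless given. The function name `rm_iid`, the toy models, and the final normalization into $P(w|q_1 \ldots q_k)$ are illustrative choices; the models are assumed smoothed so that at least one of them gives the query non-zero probability.

```python
import math

def rm_iid(query, doc_models, prior=None):
    """P(w|q_1..q_k) from P(w, q_1..q_k) = sum_M P(M) P(w|M) prod_i P(q_i|M)."""
    names = list(doc_models)
    prior = prior or {m: 1.0 / len(names) for m in names}
    joint = {}
    for m in names:
        model = doc_models[m]
        # Likelihood of the whole query under this one model.
        query_lik = math.prod(model.get(q, 0.0) for q in query)
        for w, p_w in model.items():
            joint[w] = joint.get(w, 0.0) + prior[m] * p_w * query_lik
    total = sum(joint.values())  # equals P(q_1..q_k)
    return {w: p / total for w, p in joint.items()}

# Hypothetical unigram models for two documents.
models = {
    "m1": {"x": 0.5, "y": 0.5},
    "m2": {"x": 0.1, "z": 0.9},
}
rm1 = rm_iid(["x"], models)
```

Words from models that explain the query well (here `y` from `m1`) receive more weight than words from models that explain it poorly (`z` from `m2`).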

23
Estimating a Relevance Model (cont.)

Method 2: conditional sampling
We fix a value of w according to some prior
$P(w)$. Then perform the following process k
times: pick a distribution $M_i$ according to
$P(M_i|w)$, then sample the query word $q_i$ from
$M_i$ with probability $P(q_i|M_i)$.
The effect of this sampling strategy is that we
assume the query words to be independent of
each other, while keeping their dependence on w:
$P(w, q_1 \ldots q_k) = P(w) \prod_{i=1}^{k} P(q_i|w)$

24
Estimating a Relevance Model (cont.)

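To obtain $P(q_i|w)$, we compute the expectation
over the universe C of our unigram models:
$P(q_i|w) = \sum_{M_i \in C} P(q_i|M_i) P(M_i|w)$
Combination:
$P(w, q_1 \ldots q_k) = P(w) \prod_{i=1}^{k} \sum_{M_i \in C} P(M_i|w) P(q_i|M_i)$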
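
A minimal sketch of Method 2 under the same toy setup: one unigram model per document and a uniform prior over models. Here $P(w)$ and $P(M|w)$ are derived from the models by Bayes' rule, which is an assumption of this example rather than something the slides spell out; `rm_conditional` is a hypothetical name.

```python
import math

def rm_conditional(query, doc_models, prior=None):
    """P(w|q_1..q_k) from P(w, q_1..q_k) = P(w) prod_i sum_M P(q_i|M) P(M|w)."""
    names = list(doc_models)
    prior = prior or {m: 1.0 / len(names) for m in names}
    vocab = set().union(*(doc_models[m] for m in names))
    joint = {}
    for w in vocab:
        # P(w) = sum_M P(w|M) P(M)
        p_w = sum(prior[m] * doc_models[m].get(w, 0.0) for m in names)
        if p_w == 0.0:
            continue
        # P(M|w) by Bayes' rule.
        posterior = {m: prior[m] * doc_models[m].get(w, 0.0) / p_w
                     for m in names}
        joint[w] = p_w * math.prod(
            sum(doc_models[m].get(q, 0.0) * posterior[m] for m in names)
            for q in query
        )
    total = sum(joint.values())
    return {w: p / total for w, p in joint.items()}

# Hypothetical unigram models for two documents.
models = {
    "m1": {"x": 0.5, "y": 0.5},
    "m2": {"x": 0.1, "z": 0.9},
}
rm2 = rm_conditional(["x"], models)
rm2_two_words = rm_conditional(["x", "y"], models)
```

For a one-word query the two methods coincide; they differ for longer queries because each query word may be explained by a different model.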

25
Estimating a Relevance Model (cont.)
Figure
26
Estimating a Relevance Model (cont.)

Cross-lingual Estimation
Goal: estimate the relevance model in some
target language, different from the language of
the query.
Let $s_1 \ldots s_k$ be the query in the
source language and
let $R_T$ be the unknown set of target documents
that are relevant to that query.

27
Estimating a Relevance Model (cont.)

We want to estimate the probability distribution
$P(t|R_T)$ for every word t in the vocabulary of
the target language.
An implicit assumption behind equation (2.22) is
that there exists a joint probabilistic model
from which we can compute the joint
probability $P(t, s_1 \ldots s_k)$.
The difficulty is that $s_i$ and t represent words
from different languages, and so will not
naturally occur in the same documents.

28
Estimating a Relevance Model (cont.)

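1. Estimation with a parallel corpus
Suppose we have at our disposal a parallel
corpus C, a collection of document pairs
$\{M_S, M_T\}$, where $M_S$ is a document in the
source language, and $M_T$ is a document in the
target language discussing the same topic as $M_S$.
Method 1 (i.i.d. sampling):
$P(t, s_1 \ldots s_k) = \sum_{\{M_S, M_T\} \in C} P(\{M_S, M_T\}) P(t|M_T) \prod_{i=1}^{k} P(s_i|M_S)$
Method 2 (conditional sampling):
$P(t, s_1 \ldots s_k) = P(t) \prod_{i=1}^{k} \sum_{\{M_S, M_T\} \in C} P(s_i|M_S) P(M_T|t)$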
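
A rough sketch of how a parallel corpus could drive the i.i.d. variant: the source side of each document pair scores the query, and the target side supplies the target-word probabilities. The paired toy models, the uniform prior over pairs, and the name `cross_lingual_rm` are all assumptions made for this example.

```python
import math

def cross_lingual_rm(source_query, paired_models):
    """Estimate P(t | s_1..s_k) from (source_model, target_model) pairs.

    Each pair contributes P(t|M_T) weighted by the source-side query
    likelihood prod_i P(s_i|M_S), with a uniform prior over pairs.
    """
    prior = 1.0 / len(paired_models)
    joint = {}
    for source_model, target_model in paired_models:
        query_lik = math.prod(source_model.get(s, 0.0) for s in source_query)
        for t, p_t in target_model.items():
            joint[t] = joint.get(t, 0.0) + prior * p_t * query_lik
    total = sum(joint.values())
    return {t: p / total for t, p in joint.items()}

# Hypothetical parallel corpus: each entry pairs a source-language unigram
# model with the model of its target-language counterpart.
pairs = [
    ({"gato": 0.6, "casa": 0.4}, {"cat": 0.7, "house": 0.3}),
    ({"casa": 0.9, "gato": 0.1}, {"house": 0.8, "cat": 0.2}),
]
target_rm = cross_lingual_rm(["gato"], pairs)
```

No translation dictionary is needed here: the link between languages comes entirely from the document pairing.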

29
Estimating a Relevance Model (cont.)

2. Estimation with a statistical lexicon
A statistical lexicon is a special dictionary
which gives the translation
probability $P(s|t)$ for every source word s and
every target word t.
In this case we let C be the set of target
documents $M_T$.
In order to compute $P(s|M_T)$ for a source
word s in a target document $M_T$, we sum the
translation probabilities over the target words:
$P(s|M_T) = \sum_{t} P(s|t) P(t|M_T)$
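
The sketch below shows one way to compute the probability of a source word under a target document model using a statistical lexicon, by summing translation probabilities over target words. The nested-dictionary layout `lexicon[t][s]` for $P(s|t)$ and the toy entries are hypothetical.

```python
def source_word_probability(s, target_model, lexicon):
    """P(s|M_T) = sum over t of P(s|t) * P(t|M_T).

    lexicon[t][s] holds the translation probability P(s|t).
    """
    return sum(
        lexicon.get(t, {}).get(s, 0.0) * p_t
        for t, p_t in target_model.items()
    )

# Hypothetical lexicon and target document model.
lexicon = {
    "cat": {"gato": 0.9, "felino": 0.1},
    "house": {"casa": 1.0},
}
target_model = {"cat": 0.7, "house": 0.3}

p_gato = source_word_probability("gato", target_model, lexicon)
p_casa = source_word_probability("casa", target_model, lexicon)
p_felino = source_word_probability("felino", target_model, lexicon)
```

Because each lexicon row sums to one, the resulting source-word probabilities for a target document also sum to one.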

30
Experiment

Experiment Setup
English Resources

31
Experiment (cont.)

Chinese Resources

32
Experiment (cont.)

Ad-hoc Retrieval Experiments

33
Experiment (cont.)

Comparison of Ranking Methods

34
Experiment (cont.)
35
Experiment (cont.)
36
Experiment (cont.)

Relevance Feedback

37
Experiment (cont.)
38
Experiment (cont.)
39
Experiment (cont.)

Cross-Lingual Experiments

40
Experiment (cont.)
41
Conclusions

In this work we introduced a formal framework for
modeling the notion of relevance in Information
Retrieval.
We defined a relevance model to be the
language model that reflects word frequencies in
the class of documents that are relevant to some
given information need.
