1. Using Markov Chains to Exploit Word Relationships in Information Retrieval
- Guihong Cao, Jian-Yun Nie and Jing Bai
- Department of Computer Science and Operations Research, University of Montreal
2. Outline
- Introduction
- Statistical Language Models (SLMs) for IR
- Smoothing of Language Models
- Previous work to use word relationships for IR
- General Language Model based on Markov Chains
- Conclusion and Future Work
3. SLM for IR
- Model relevance in two ways
- Query likelihood with respect to a language model estimated from the document
- KL-divergence (cross entropy) between the query model and the document model
- Retrieval problem: document/query model estimation
- Smoothing is an important problem: avoid zero probabilities
- Interpolation between the MLE document model and the collection model, i.e., P(qi|D) = α·Pml(qi|D) + (1−α)·Pml(qi|C)
- Collection model: a language model estimated from the whole document collection
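The interpolation smoothing above can be sketched as follows; the function name, the default α = 0.7, and the toy token lists in the usage note are illustrative assumptions, not from the slides:

```python
from collections import Counter

def smoothed_prob(term, doc_tokens, coll_tokens, alpha=0.7):
    """Jelinek-Mercer-style smoothing: interpolate the MLE document model
    with the MLE collection model so unseen terms get nonzero probability.
    P(qi|D) = alpha * Pml(qi|D) + (1 - alpha) * Pml(qi|C)."""
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(coll_tokens)
    p_doc = doc_counts[term] / len(doc_tokens)     # MLE from the document
    p_coll = coll_counts[term] / len(coll_tokens)  # MLE from the collection
    return alpha * p_doc + (1 - alpha) * p_coll
```

With a document {tsunami, ocean} and a collection that also contains "disaster", the query term "disaster" receives mass from the collection component instead of probability zero.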
4. Effect of smoothing?
- D = {Tsunami, ocean, Asia}, Q = {natural disaster}
- Smoothing = probability redistribution
- Redistribution uniformly / according to the collection (also to unrelated terms)
- (Figure: probability mass over the terms Tsunami, ocean, Asia, computer, disaster)
5. Desired effect
- Using Tsunami → disaster
- Knowledge-based / semantic smoothing
- Relationships between terms
- (Figure: probability mass over the terms Tsunami, ocean, Asia, computer, disaster)
6. Outline
- Introduction
- Previous work to use word relationships for IR
- Document Expansion
- Query Expansion
- Limitations
- General Language Model based on Markov Chains
- Conclusion and Future Work
7. Document Expansion
- Inference from a document term to a different query term
- Translation model
- Inference: w → qi
- Key issue: estimate the translation probability t(qi|w)
- Estimation of the translation model
- Translation model (Berger et al., 1999): IBM1 with synthesized data for training
- Title language model (Jin et al., 2002): a title is viewed as a query relevant to the document; train the translation model with IBM1 on document-title pairs
- Nature: co-occurrence
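The document-expansion step above scores a query term by summing translation probabilities over document terms, p(q|D) = Σw t(q|w)·Pml(w|D). A minimal sketch; the function name and the toy translation table in the test are my assumptions:

```python
from collections import Counter

def translation_prob(query_term, doc_tokens, t):
    """Document expansion via a translation model:
    p(q|D) = sum over w of t(q|w) * Pml(w|D).
    `t` maps pairs (query_term, doc_term) to translation probabilities."""
    counts = Counter(doc_tokens)
    n = len(doc_tokens)
    return sum(t.get((query_term, w), 0.0) * c / n for w, c in counts.items())
```

A document containing "tsunami" can then contribute probability to the query term "disaster" even though "disaster" never occurs in it.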
8. Query Expansion
- Inference from one query term to a new query term sharing more terms with the document
- Using word relationships
- Use co-occurrence and information flow for query expansion (Bai et al., 2005): co-occurrence information
- Use WordNet (Voorhees, 1994; Liu et al., 2004): semantic information
- Pseudo-relevance feedback (Xu et al., 1996)
- Treat the top-n documents from a first retrieval as relevant and update the query model based on these documents
- Significant improvement
- Opposite inference direction to document expansion
- The two should be complementary
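The pseudo-relevance feedback update above can be sketched as a simple mixture of the original query model and a model of the top-ranked documents; the function name and the weight β = 0.5 are illustrative assumptions:

```python
from collections import Counter

def prf_query_model(query_tokens, feedback_docs, beta=0.5):
    """Pseudo-relevance feedback: interpolate the MLE query model with an
    MLE model built from the top-n first-retrieval documents, which are
    treated as relevant.  Returns a dict mapping terms to probabilities."""
    q = Counter(query_tokens)
    fb = Counter(tok for d in feedback_docs for tok in d)
    nq, nf = sum(q.values()), sum(fb.values())
    vocab = set(q) | set(fb)
    return {w: beta * q[w] / nq + (1 - beta) * fb[w] / nf for w in vocab}
```

Terms that appear only in the feedback documents (here "disaster") enter the expanded query model with nonzero weight.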
9. Limitations of previous work
- Limited to one kind of method for extracting word relationships: statistical methods or semantic methods
- Deals with only one of document expansion and query expansion
- The opposite inference processes are complementary
- Limited to one-step inference using direct relationships
- e.g., contract → agree and agree → negotiation ⇒ contract → negotiation
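Multi-step inference such as contract → agree → negotiation amounts to summing over intermediate terms in the relationship graph. A minimal two-step sketch; the nested-dict representation and the toy probabilities are my assumptions:

```python
def two_step(trans, source, target):
    """Two-step inference over word relationships: chain direct relations
    through every intermediate term, p2(target|source) =
    sum over mid of p(mid|source) * p(target|mid).
    `trans[a][b]` holds the one-step probability p(b|a)."""
    return sum(p_mid * trans.get(mid, {}).get(target, 0.0)
               for mid, p_mid in trans.get(source, {}).items())
```

Even when no direct contract → negotiation relation exists, the chained path through "agree" yields a nonzero two-step probability.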
10. Outline
- Introduction
- Previous work to use word relationships for IR
- General Language Model based on Markov Chains
- Model Description
- Parameter Estimation
- Experiments
- Conclusion and Future Work
11. Model Description
- General model combining document expansion and query expansion
- Document ranking formula: the negative cross-entropy between the expanded document model and the query model
- Special cases: document expansion; query expansion
- A Markov Chain (MC) is a mathematical tool for multi-step inference
- Represent the expanded document/query model by the stationary distribution of the corresponding MC
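The negative cross-entropy ranking formula above can be sketched directly; the dict representation of the models and the epsilon floor for unseen terms are my assumptions:

```python
import math

def score(query_model, doc_model, epsilon=1e-12):
    """Rank documents by negative cross-entropy between the query model
    and the (expanded, smoothed) document model:
    score(Q, D) = sum over w of P(w|Q) * log P(w|D).
    Higher (less negative) scores mean a better match."""
    return sum(p * math.log(doc_model.get(w, epsilon))
               for w, p in query_model.items() if p > 0)
```

A document model that gives the query terms more probability mass receives a higher score, which is why expanding the document model toward related terms helps.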
12. Illustration of the General Model
- Query expansion and document expansion are opposite inference processes
- The processes are complementary
- Query expansion: query → document
- Document expansion: document → query
13. Why use MCs?
- Document/query expansion corresponds to an MC: words → states; word relationships → state transition probabilities
- The stationary distribution of an MC is an ideal representation for a query/document
- MC stationary distributions have been used in PageRank (Brin and Page, 1998) and PP-attachment resolution (Toutanova et al., 2004)
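One standard way to approximate the stationary distribution the slides refer to is power iteration, as in PageRank. This sketch assumes a nested-dict transition matrix and is a generic illustration, not the authors' exact formulation:

```python
def stationary(trans, init, steps=200):
    """Power iteration toward the stationary distribution of a Markov chain:
    repeatedly apply p'(w) = sum over v of p(v) * P(w|v) until the
    distribution stops changing.  `trans[v][w]` is the transition
    probability from state v (a word) to state w."""
    p = dict(init)
    for _ in range(steps):
        nxt = {}
        for v, pv in p.items():
            for w, pw in trans[v].items():
                nxt[w] = nxt.get(w, 0.0) + pv * pw
        p = nxt
    return p
```

For an ergodic chain the result no longer depends on the initial distribution, which is why the stationary distribution serves as a stable expanded representation.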
14. The Process to Generate a Query/Document
- (Figure illustrating the process of generating a query)
15. Parameter Estimation
- Three kinds of parameters
- Initial distributions of the query/document expansion models
- Transition probabilities of the query/document expansion models
- Coefficients of the query/document expansion models
- Parameters for document expansion vs. query expansion
- Different initial distributions: P0(wi|D) vs. P0(wi|Q)
- Different state transition probabilities: P(wi|wj,D) vs. P(wi|wj,Q)
- Similar methods to estimate the coefficients: global optimization methods
16. Initial Distributions P0(wi|Q) and P0(wi|D)
- Query model
- The prior distribution of query terms
- Mixture model: interpolate the original query model with the pseudo-relevance feedback model
- Document model
- Interpolate the document model with the collection model
17. Transition Probabilities P(wi|wj,Q) and P(wi|wj,D)
- Query model
- Transition probability = word relationship
- Feedback documents are more informative
- Estimated from feedback documents
- Estimated from the whole collection
- Document model
- Assume the transition probability is independent of the document
18. Defining Word Relationships
- Combine different word relationships (resources) via language model smoothing
- A probabilistic framework makes it possible to estimate the parameters automatically
- Here: combine co-occurrence (statistical information) and WordNet (semantic information) in a two-component mixture model
- Pco is the co-occurrence model and PWN is the WordNet model
- The two models are complementary
- The mixture weight is estimated automatically according to specific contexts
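The two-component mixture above, P(wi|wj) = λ·Pco(wi|wj) + (1−λ)·PWN(wi|wj), can be sketched as follows; the function name and the toy probabilities are my assumptions:

```python
def combined_transition(wj, wi, p_co, p_wn, lam):
    """Two-component mixture for word relationships:
    P(wi|wj) = lam * Pco(wi|wj) + (1 - lam) * Pwn(wi|wj),
    mixing co-occurrence (statistical) and WordNet (semantic) evidence.
    `p_co` and `p_wn` are nested dicts: model[wj][wi] -> probability."""
    return (lam * p_co.get(wj, {}).get(wi, 0.0)
            + (1 - lam) * p_wn.get(wj, {}).get(wi, 0.0))
```

Because λ is a free parameter of a probabilistic model, it can be tuned automatically per context rather than fixed by hand.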
19. Illustration of Different Word Relationships
- (Figure: word wi connected to word wj by a co-occurrence relation, a WordNet relation, and other relationships)
20. Defining Word Relationships (Cont.)
- Estimating the co-occurrence model
- Terms co-occur within a window (8 words)
- Interpolated absolute discount smoothing
- Estimating the WordNet model
- Terms should co-occur within a window (one paragraph)
- Terms should be linked in WordNet
- Interpolated absolute discount smoothing
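The raw statistic behind the co-occurrence model is a count of term pairs within a sliding window (8 words in the slides). A minimal counting sketch; the function name is my assumption, and smoothing of the counts is left out:

```python
from collections import Counter

def cooccurrence_counts(tokens, window=8):
    """Count ordered co-occurrence pairs (w, v) where v follows w within
    `window` words.  These counts would then be normalized and smoothed
    (e.g., with interpolated absolute discounting) to form Pco(v|w)."""
    counts = Counter()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            counts[(w, v)] += 1
    return counts
```

The WordNet model would use the same counting over a one-paragraph window, but keep only pairs that are linked in WordNet.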
21. Estimating Coefficients
- Query expansion model
- Combining global and local probabilities
- Combining relations
- Document expansion model
- Combining relations
- Global optimization method to maximize Mean Average Precision on training data: the Simulated Annealing algorithm
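The slides name Simulated Annealing but give no details, so the following is a generic SA sketch over coefficients in [0, 1]; the function name, schedule, and the toy objective standing in for Mean Average Precision are all my assumptions:

```python
import math
import random

def anneal(objective, x0, step=0.1, t0=1.0, cooling=0.98, iters=500, seed=0):
    """Simulated annealing over a coefficient vector: propose a small
    perturbation, always accept improvements, and accept worse moves with
    probability exp(delta / T) so the search can escape local optima.
    `objective` plays the role of mean average precision on training data."""
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_v = list(x0), objective(x0)
    cur_v, t = best_v, t0
    for _ in range(iters):
        # Perturb each coefficient, clipped to the valid range [0, 1].
        cand = [min(1.0, max(0.0, xi + rng.uniform(-step, step))) for xi in x]
        v = objective(cand)
        if v > cur_v or rng.random() < math.exp((v - cur_v) / t):
            x, cur_v = cand, v
            if v > best_v:
                best_x, best_v = cand, v
        t *= cooling  # geometric cooling schedule
    return best_x
```

SA is a sensible choice here because MAP is non-differentiable in the mixture coefficients, ruling out gradient-based optimizers.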
22. Outline
- Introduction
- Previous work to use word relationships for IR
- General Language Model based on Markov Chains
- Model Description
- Parameter Estimation
- Experiments
- Conclusion and Future Work
23. Experimental Setting
- Corpus: three TREC datasets
- WSJ, SJM and AP
- WordNet: WordNet 2.0
- Obtained from http://wordnet.princeton.edu/
- Metrics
- Mean average precision
- Recall
- T-test for statistical significance
24. Experimental Setting (Cont.)
25. Research Problems
- Does the MC model work?
- Experiments on query expansion
- Is a multi-step inference model better than a one-step inference model?
- Experiments on query expansion
- Does the general model (document expansion + query expansion) work?
26. Does the MC Model Work?
- UM: unigram model
- MixM: mixture model, an interpolation between the original model and the pseudo-relevance feedback model
- MC-QE: query expansion based on Markov Chains
27. Is Multi-step Inference Better than One-step Inference?
- Performance with various numbers of iterations
28. Does the General Model (Document Expansion + Query Expansion) Work?
- UM: unigram model
- QE: multi-step query expansion
- DE: one-step document expansion
- GM: general model combining DE and QE
29. Outline
- Introduction
- Previous work to use word relationships for IR
- General Language Model based on Markov Chains
- Conclusion and Future Work
30. Conclusion and Future Work
- Conclusion
- The general model combining query expansion and document expansion is superior to using either alone
- Incorporating multiple word relationships helps IR performance
- The weight of each component should be set appropriately (SA algorithm)
- The multi-step inference model is superior to the one-step inference model
- Future work
- Integrate more relationships: syntactic relationships
- Refine the initial distribution of document expansion with other similar documents: document clustering
31. Thanks!