1
Using Markov Chains to Exploit Word Relationships
in Information Retrieval
  • Guihong Cao, Jian-Yun Nie and Jing Bai
  • Department of Computer Science and Operation
    Research University of Montreal

2
Outline
  • Introduction
  • Statistical Language Models (SLMs) for IR
  • Smoothing of Language Models
  • Previous work to use word relationships for IR
  • General Language Model based on Markov Chains
  • Conclusion and Future Work

3
SLM for IR
  • Model relevance in two ways
  • Query likelihood with respect to a language model
    estimated from the document
  • KL-divergence (cross entropy) between the query
    model and the document model
  • Retrieval problem = document/query model
    estimation
  • Smoothing is an important problem: avoid zero
    probabilities
  • Interpolation between the MLE document model and
    the collection model, i.e., P(qi|D) = a·Pml(qi|D)
    + (1 − a)·Pml(qi|C) (a toy sketch follows below)
  • Collection model: language model estimated from
    the whole document collection
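A minimal sketch of the interpolation smoothing and query-likelihood scoring above; the toy counts and the names `doc`/`coll` are illustrative, not from the presentation.

```python
from math import log

def p_smooth(term, doc_counts, coll_counts, a=0.7):
    """P(q|D) = a * Pml(q|D) + (1 - a) * Pml(q|C)."""
    p_doc = doc_counts.get(term, 0) / sum(doc_counts.values())
    p_coll = coll_counts.get(term, 0) / sum(coll_counts.values())
    return a * p_doc + (1 - a) * p_coll

def query_log_likelihood(query, doc_counts, coll_counts):
    """Rank a document by the log-likelihood of the query terms."""
    return sum(log(p_smooth(q, doc_counts, coll_counts)) for q in query)

doc = {"tsunami": 3, "ocean": 2, "asia": 1}            # toy document counts
coll = {"tsunami": 10, "ocean": 50, "asia": 40,
        "disaster": 30, "computer": 100}               # toy collection counts
print(query_log_likelihood(["disaster", "ocean"], doc, coll))
```

Smoothing keeps the score finite even for "disaster", which never occurs in the document.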

4
Effect of smoothing?
  • D = {Tsunami, ocean, Asia}, Q = {natural disaster}
  • Smoothing = probability redistribution
  • Redistributed uniformly/according to the
    collection (also to unrelated terms)

[Figure: probability mass redistributed over Tsunami, ocean, Asia, disaster, computer]
5
Desired effect
  • Using Tsunami → disaster
  • Knowledge-based/Semantic smoothing
  • Relationships between terms

[Figure: probability mass shifted toward the related term disaster]
6
Outline
  • Introduction
  • Previous work to use word relationships for IR
  • Document Expansion
  • Query Expansion
  • Limitations
  • General Language Model based on Markov Chains
  • Conclusion and Future Work

7
Document Expansion
  • Inference from a document term to a different
    query term
  • Translation model
  • Inference w → qi
  • Key issue: estimate the translation probability
    t(qi|w) (a toy sketch follows below)
  • Estimation of the translation model
  • Translation model (Berger et al., 1999)
  • IBM1 with synthesized data for training
  • Title language model (Jin et al., 2002)
  • A title is viewed as a query relevant to the
    document
  • Train the translation model with IBM1 on
    document-title pairs
  • Natural co-occurrence
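A hedged sketch of the translation-model idea: a query term absent from the document can still receive probability through t(qi|w). The toy translation table `t` is invented for illustration.

```python
def p_translated(q, doc_model, t):
    """P(q|D) = sum over w of t(q|w) * P(w|D)."""
    return sum(t.get((q, w), 0.0) * p_w for w, p_w in doc_model.items())

doc_model = {"tsunami": 0.5, "ocean": 0.3, "asia": 0.2}          # P(w|D), toy
t = {("disaster", "tsunami"): 0.4, ("disaster", "ocean"): 0.05}  # t(q|w), toy
print(p_translated("disaster", doc_model, t))  # 0.215, nonzero despite absence
```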

8
Query Expansion
  • Inference from a query term to new query terms,
    so that the expanded query shares more terms with
    the document
  • Using word relationships
  • Co-occurrence and information flow for query
    expansion (Bai et al., 2005): co-occurrence
    information
  • WordNet (Voorhees, 1994; Liu et al., 2004):
    semantic information
  • Pseudo-relevance feedback (Xu et al., 1996)
  • Treat the top-n documents of a first retrieval as
    relevant and update the query model based on
    those documents
  • Significant improvement
  • The opposite inference direction of document
    expansion
  • The two should be complementary

9
Limitations of previous work
  • Limited to one way of extracting word
    relationships: statistical methods or semantic
    methods
  • Deals with only one of document expansion or
    query expansion
  • These opposite inference processes are
    complementary
  • Limited to one-step inference using only direct
    relationships
  • e.g., contract → agree → negotiation
  • ⇒ contract → negotiation (see the toy matrix
    below)
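An illustrative two-step inference with a toy transition matrix: contract has no direct link to negotiation, but squaring the matrix produces one. The matrix values are invented.

```python
import numpy as np

terms = ["contract", "agree", "negotiation"]
T = np.array([[0.6, 0.4, 0.0],   # contract: no direct link to negotiation
              [0.0, 0.6, 0.4],   # agree -> negotiation
              [0.0, 0.0, 1.0]])  # toy row-stochastic transition matrix
two_step = np.linalg.matrix_power(T, 2)
print(two_step[0, 2])  # 0.16: contract reaches negotiation in two steps
```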

10
Outline
  • Introduction
  • Previous work to use word relationships for IR
  • General Language Model based on Markov Chains
  • Model Description
  • Parameter Estimation
  • Experiments
  • Conclusion and Future Work

11
Model Description
  • General model: combine document expansion and
    query expansion
  • Document ranking formula: the negative
    cross-entropy between the expanded document model
    and the query model
  • Special cases
  • Document expansion
  • Query expansion
  • A Markov chain (MC) is a mathematical tool for
    multi-step inference
  • Represent the expanded document/query model by
    the stationary distribution of the corresponding
    MC (a sketch follows below)
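A minimal sketch of computing such a stationary distribution by power iteration, assuming a PageRank-style restart at the initial query/document distribution; the restart weight and the toy matrix are assumptions, not values from the presentation.

```python
import numpy as np

def expanded_model(T, p0, restart=0.2, iters=100):
    """Stationary distribution of a chain that follows transitions T with
    probability 1 - restart and jumps back to the initial model p0 otherwise."""
    p = p0.copy()
    for _ in range(iters):
        p = (1 - restart) * T.T @ p + restart * p0
    return p

T = np.array([[0.7, 0.3],     # toy transitions between two terms
              [0.4, 0.6]])
p0 = np.array([1.0, 0.0])     # initial (unexpanded) model
print(expanded_model(T, p0))  # mass spreads to the related term
```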

12
Illustration of the General Model
  • Query expansion and document expansion are
    opposite inference processes
  • The processes are complementary
  • Query expansion: query → document
  • Document expansion: document → query

13
Why use MCs?
  • Document/query expansion corresponds to an MC:
    words → states; word relationships → state
    transition probabilities
  • The stationary distribution of an MC is an ideal
    representation for the query/document
  • MC stationary distributions have been used in
    PageRank (Brin and Page, 1998) and PP-attachment
    resolution (Toutanova et al., 2004)


14
The Process to Generate a Query/Document
  • A figure illustrating the process of generating
    a query

15
Parameter Estimation
  • Three kinds of parameters
  • Initial distribution of the query/document
    expansion model
  • Transition probabilities of the query/document
    expansion model
  • Coefficients of the query/document expansion
    model
  • Parameters for document expansion vs. query
    expansion
  • Different initial distributions: P0(wi|D) vs.
    P0(wi|Q)
  • Different state transition probabilities:
    P(wi|wj,D) vs. P(wi|wj,Q)
  • Similar methods to estimate coefficients: global
    optimization

16
Initial Distributions P0(wi|Q) and P0(wi|D)
  • Query model
  • The prior distribution of query terms
  • Mixture model: interpolate the original query
    model with the pseudo-relevance feedback model
  • e.g., P0(wi|Q) = λ·Pml(wi|Q) + (1 − λ)·PF(wi|Q)
  • Document model
  • Interpolate the document model with the
    collection model
  • e.g., P0(wi|D) = β·Pml(wi|D) + (1 − β)·Pml(wi|C)
    (a toy sketch of the interpolation follows below)
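A toy sketch of the interpolation used for the initial distributions, assuming simple linear mixing over the union vocabulary; the weights and model values are illustrative.

```python
def interpolate(p1, p2, lam):
    """lam * p1 + (1 - lam) * p2 over the union vocabulary."""
    vocab = set(p1) | set(p2)
    return {w: lam * p1.get(w, 0.0) + (1 - lam) * p2.get(w, 0.0)
            for w in vocab}

p_ml = {"natural": 0.5, "disaster": 0.5}                # original query model
p_fb = {"tsunami": 0.4, "disaster": 0.3, "ocean": 0.3}  # feedback model (toy)
p0_query = interpolate(p_ml, p_fb, lam=0.5)             # P0(wi|Q)
print(p0_query)
```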

17
Transition Probabilities P(wi|wj,Q) and P(wi|wj,D)
  • Query model
  • Transition probability = word relationship
  • Feedback documents are more informative
  • Estimated from the feedback documents
  • Estimated from the whole collection
  • Document model
  • Assume the transition probability is independent
    of the document
  • e.g., P(wi|wj,D) = P(wi|wj)

18
Defining Word Relationships
  • Combine different word relationships (resources)
    via language model smoothing
  • A probabilistic framework makes it possible to
    estimate the parameters automatically
  • Here, combine co-occurrence (statistical
    information) and WordNet (semantic information)
    in a two-component mixture model
  • e.g., P(wi|wj) = λ·Pco(wi|wj) + (1 − λ)·PWN(wi|wj)
  • Pco is the co-occurrence model and PWN is the
    WordNet model
  • The two models are complementary
  • λ is estimated automatically according to
    specific contexts

19
Illustration of Different Word Relationships
[Figure: word wi linked to word wj by a co-occurrence relation, a WordNet relation, and other relationships]
20
Defining Word Relationships(Cont.)
  • Estimating the co-occurrence model
  • Terms co-occur within a window (8 words)
  • Interpolated absolute discount smoothing (a
    sketch follows below)
  • Estimating the WordNet model
  • Terms must co-occur within a window (one
    paragraph)
  • Terms must be linked in WordNet
  • Interpolated absolute discount smoothing
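A hedged sketch of interpolated absolute-discount smoothing for P(wi|wj): subtract a constant discount from each observed co-occurrence count and give the freed mass to a backoff unigram model. The discount value and the choice of backoff model are assumptions.

```python
def abs_discount(pair_counts, unigram, wj, delta=0.7):
    """Smoothed P(.|wj) from raw co-occurrence counts within the window."""
    seen = {wi: c for (wi, w), c in pair_counts.items() if w == wj}
    total = sum(seen.values())
    freed = delta * len(seen) / total    # mass freed by discounting
    return {wi: max(seen.get(wi, 0) - delta, 0) / total
                + freed * unigram.get(wi, 0.0)
            for wi in set(seen) | set(unigram)}

pairs = {("disaster", "tsunami"): 5, ("ocean", "tsunami"): 3}  # toy counts
unigram = {"disaster": 0.3, "ocean": 0.5, "asia": 0.2}         # backoff model
print(abs_discount(pairs, unigram, "tsunami"))  # sums to 1.0
```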

21
Estimating Coefficients
  • Query Expansion Model
  • Combining global and local probabilities
  • Combining relations
  • Document Expansion Model
  • Combining relations
  • Global optimization to maximize the mean average
    precision of training data: the Simulated
    Annealing algorithm (an illustrative loop follows
    below)
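An illustrative simulated-annealing loop for tuning a single coefficient to maximize MAP on training data; `evaluate_map` is a hypothetical callback that runs retrieval with a candidate weight and returns MAP, and all schedule constants are assumptions.

```python
import math
import random

def anneal(evaluate_map, lam=0.5, temp=1.0, cooling=0.95, steps=200):
    """Search lam in [0, 1], accepting worse moves with annealed probability."""
    cur = best = evaluate_map(lam)   # hypothetical: MAP for this weight
    best_lam = lam
    for _ in range(steps):
        cand = min(max(lam + random.gauss(0, 0.1), 0.0), 1.0)  # local move
        score = evaluate_map(cand)
        if score > cur or random.random() < math.exp((score - cur) / temp):
            lam, cur = cand, score   # accept the candidate
            if score > best:
                best, best_lam = score, cand
        temp *= cooling              # cool the temperature
    return best_lam
```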

22
Outline
  • Introduction
  • Previous work to use word relationships for IR
  • General Language Model based on Markov Chains
  • Model Description
  • Parameter Estimation
  • Experiments
  • Conclusion and Future Work

23
Experimental Setting
  • Corpus: three TREC datasets
  • WSJ, SJM, and AP
  • WordNet: WordNet 2.0
  • Obtained from http://wordnet.princeton.edu/
  • Metrics
  • Mean average precision
  • Recall
  • t-test for statistical significance

24
Experimental Setting (Cont.)
25
Research Problems
  • Does the MC model work?
  • Experiments on query expansion
  • Is the multi-step inference model better than the
    one-step inference model?
  • Experiments on query expansion
  • Does the general model (document expansion +
    query expansion) work?

26
Does MC Model Work?
  • UM: unigram model
  • MixM: mixture model, interpolating the original
    model with the pseudo-relevance feedback model
  • MC-QE: query expansion based on Markov Chains

27
Is Multi-step Inference better than One-step
Inference?
  • Performance with various iterations

28
Does the general model (document expansion +
query expansion) work?
  • UM: unigram model
    QE: multi-step query expansion
  • DE: one-step document expansion
    GM: general model combining DE and QE

29
Outline
  • Introduction
  • Previous work to use word relationships for IR
  • General Language Model based on Markov Chains
  • Conclusion and Future Work

30
Conclusion and Future Work
  • Conclusion
  • The general model combining query expansion and
    document expansion is superior to using either
    alone
  • Incorporating multiple word relationships helps
    IR performance
  • The weight of each component should be set
    appropriately (the SA algorithm)
  • The multi-step inference model is superior to the
    one-step inference model
  • Future Work
  • Integrate more relationships, e.g., syntactic
    relationships
  • Refine the initial distribution for document
    expansion using similar documents (document
    clustering)

31
Thanks!