Nessun titolo diapositiva - PowerPoint PPT Presentation

About This Presentation
Title:

Nessun titolo diapositiva

Description:

Keyword-based IR and early conceptual approaches. Context and concepts in modern topical IR ... Matching the query against document clusters (Willett 1988) ...

Slides: 29
Provided by: ClaudioC67

Transcript and Presenter's Notes


1
Conceptual structures in modern information retrieval
Claudio Carpineto
Fondazione Ugo Bordoni, Roma
carpinet_at_fub.it
2
Overview
  • Keyword-based IR and early conceptual approaches
  • Context and concepts in modern topical IR
  • Emerging IR tasks requiring knowledge structures
  • Research at FUB
  • Conclusions

3
Vector-based IR
4
Term weighting
  • tf.idf and vector space model (Salton) very popular in the 70s and 80s
  • BM25 (Robertson) has been the state of the art in the 90s
  • Several recent term-weighting functions based on statistical language
    modeling (Ponte, Lafferty)
  • A new weighting framework based on deviation from randomness
    information gain (FUB, UG)
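
As a rough illustration of the first two bullets, the sketch below scores tokenized documents against a query with a plain tf.idf sum and with a common BM25 variant. The function names, the defaults k1=1.2 and b=0.75, and the smoothed idf form are assumptions made for this example; they are not taken from the slides.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, how many documents contain it."""
    return Counter(term for doc in docs for term in set(doc))


def tfidf_scores(query, docs):
    """Sum of tf * idf over query terms, one score per document."""
    n_docs = len(docs)
    df = document_frequencies(docs)
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(sum(tf[t] * math.log(n_docs / df[t])
                          for t in query if t in tf))
    return scores


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """BM25 with length normalization; k1 and b are assumed defaults."""
    n_docs = len(docs)
    avg_len = sum(len(doc) for doc in docs) / n_docs
    df = document_frequencies(docs)
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed idf (assumption): the "1 +" keeps the weight non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    toy_docs = [
        ["concept", "lattice", "retrieval"],
        ["keyword", "retrieval", "retrieval"],
        ["web", "search", "engine"],
    ]
    print(tfidf_scores(["retrieval"], toy_docs))
    print(bm25_scores(["retrieval"], toy_docs))
```

In both functions a document that contains none of the query terms scores zero, which is the vocabulary problem that motivates the conceptual approaches discussed later.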

6
Inherent limitations of keyword-based IR
  • Vocabulary problem
  • Relations are ignored

7
Early approaches to conceptual IR
  • n-grams (Salton 1975, Maarek 1989)
  • parse tree (Dillon 1983, Metzler 1989)
  • case relations (Fillmore 1968, Somers 1987)
  • conceptual graphs (Dick 1991)

8
Why early conceptual IR was not successful
  • No best representation scheme
  • Manual coding too costly
  • Automated coding too hard
  • Training required both for the indexer and the user
  • Effectiveness not clearly demonstrated
  • Retrieval task often not appropriate

9
Overview
  • Vector-based IR and early conceptual approaches
  • Context and concepts in modern topical IR
  • Emerging IR tasks requiring knowledge structures
  • Research at FUB
  • Conclusions

10
Evolution of topical IR
  • Very short queries
  • Heterogeneous collections
  • Unreliable sources
  • Interactive sessions

11
Model of modern topical IR
13
Performance of retrieval feedback versus query difficulty
14
Ranking based on interdocument similarity
  • Cluster hypothesis (van Rijsbergen 1978)
  • Approaches
    - Matching the query against document clusters (Willett 1988)
    - Matching the query against transformed document representations
      (GVSM, Wong 1987; LSI, Deerwester 1990)
    - Computing the conceptual distance between query and documents
      (order-theoretical ranking, Carpineto 2000)

15
Order-theoretical ranking
16
Performance of order-theoretical ranking
  • Better than hierarchic clustering and comparable to best matching on
    the whole collection
  • Markedly better than both hierarchic clustering and best matching on
    non-matching relevant documents
  • Order-theoretical ranking does not scale up well, but it is synergistic
    with best-matching document ranking
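
To make the idea of a conceptual distance between query and documents more concrete, here is a minimal sketch that builds the concept lattice of a tiny document-term context and ranks documents by the number of lattice edges separating them from the query. It is only an illustration under simplifying assumptions (the query is added as a pseudo-document, distance is the shortest path in the cover graph, ties are broken by document name); it is not a reproduction of the method in Carpineto 2000, and all function names are hypothetical.

```python
from collections import deque


def concept_intents(doc_terms):
    """Return all closed term sets: intersections of document term sets."""
    all_terms = frozenset().union(*doc_terms.values())
    intents = {all_terms}
    for terms in doc_terms.values():
        # Intersect the new document with every intent found so far.
        intents |= {frozenset(terms) & intent for intent in intents}
    return intents


def cover_graph(intents):
    """Undirected graph linking each intent to its immediate neighbours."""
    edges = {intent: set() for intent in intents}
    for a in intents:
        for b in intents:
            # b covers a if a is a proper subset with nothing in between.
            if a < b and not any(a < c < b for c in intents):
                edges[a].add(b)
                edges[b].add(a)
    return edges


def lattice_distance_ranking(query_terms, doc_terms):
    """Rank document ids by shortest-path distance from the query concept."""
    context = dict(doc_terms)
    context["__query__"] = set(query_terms)  # query as pseudo-document (assumption)
    edges = cover_graph(concept_intents(context))
    start = frozenset(query_terms)
    dist = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search over the lattice
        node = queue.popleft()
        for neighbour in edges[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return sorted(doc_terms,
                  key=lambda d: (dist[frozenset(doc_terms[d])], d))


if __name__ == "__main__":
    toy = {
        "d1": {"ir", "lattice"},
        "d2": {"ir", "web"},
        "d3": {"xml"},
        "d4": {"ir"},
    }
    print(lattice_distance_ranking({"ir"}, toy))
```

Enumerating every intent is exponential in the worst case, which is consistent with the remark above that this style of ranking does not scale up well.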

17
Overview
  • Vector-based IR and early conceptual approaches
  • Context and concepts in modern topical IR
  • Emerging IR tasks requiring knowledge structures
  • Research at FUB
  • Conclusions

18
Question Answering
Task: closed-class questions in unrestricted domains, with no guarantee of
an answer and results possibly scattered over multiple documents
19
Question Answering
  • Approach
    - Recognize the type of query
    - Retrieve relevant documents
    - Find sought entities near question words
    - Fall back to best-matching passage retrieval in case of failure
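
The steps above can be sketched as a toy pipeline. Everything in it is an assumption made for illustration: the question-word table, the deliberately naive regular expressions standing in for entity recognition, and the term-overlap passage ranking. A real system would use far richer components at each step.

```python
import re

# Hypothetical mapping from question word to expected answer type.
QUESTION_TYPES = {"who": "PERSON", "when": "DATE"}

# Naive stand-ins for an entity recognizer (assumption).
ENTITY_PATTERNS = {
    "DATE": re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b"),
    "PERSON": re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"),
}


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def answer(question, passages):
    """Return an entity of the expected type, or the best passage as fallback."""
    q_tokens = tokenize(question)
    q_type = QUESTION_TYPES.get(q_tokens[0]) if q_tokens else None
    q_terms = set(q_tokens[1:])
    # Step 2: rank passages by term overlap with the question.
    ranked = sorted(passages,
                    key=lambda p: len(q_terms & set(tokenize(p))),
                    reverse=True)
    pattern = ENTITY_PATTERNS.get(q_type)
    if pattern is not None:
        for passage in ranked:
            # Character offsets of question words inside the passage.
            positions = [m.start() for m in re.finditer(r"\w+", passage)
                         if m.group().lower() in q_terms]
            if not positions:
                continue
            # Step 3: keep the candidate entity closest to a question word.
            candidates = [(min(abs(m.start() - p) for p in positions), m.group())
                          for m in pattern.finditer(passage)]
            if candidates:
                return min(candidates)[1]
    # Step 4: fall back to the best-matching passage.
    return ranked[0] if ranked else None


if __name__ == "__main__":
    texts = [
        "The vector space model was proposed by Gerard Salton in 1975.",
        "Concept lattices support browsing.",
    ]
    print(answer("When was the vector space model proposed?", texts))
    print(answer("What supports browsing?", texts))
```

Because unknown question types skip entity search entirely, the function always returns something when passages exist, which mirrors the fallback in the last bullet.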

20
Web Information Retrieval
21
Web Information Retrieval
Current tasks: named-entity finding task, topic distillation task
  • Approach
    - Use of multiple methods
    - Combination of results via interpolation and normalization schemes
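
One simple way to combine results from multiple methods is to normalize each method's scores to a common range and then interpolate linearly. The sketch below assumes min-max normalization and a single interpolation weight; the slides do not say which schemes were actually used, and the function names and example scores are made up for illustration.

```python
def min_max_normalize(scores):
    """Map a {doc_id: score} dict onto the range [0, 1]."""
    if not scores:
        return {}
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        # All scores equal: no ranking information to preserve.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - low) / (high - low) for doc, s in scores.items()}


def interpolate(run_a, run_b, weight=0.5):
    """Fuse two runs: weight * a + (1 - weight) * b on normalized scores."""
    a = min_max_normalize(run_a)
    b = min_max_normalize(run_b)
    fused = {doc: weight * a.get(doc, 0.0) + (1 - weight) * b.get(doc, 0.0)
             for doc in set(a) | set(b)}
    # Highest fused score first; ties broken by document id.
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


if __name__ == "__main__":
    content_scores = {"d1": 12.0, "d2": 8.0, "d3": 4.0}   # e.g. text matching
    link_scores = {"d2": 0.9, "d3": 0.1, "d4": 0.5}       # e.g. link analysis
    print(interpolate(content_scores, link_scores))
```

A document missing from one run is treated as scoring zero in that run, which is one of several possible conventions.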

22
XML document retrieval
Goal: use document structure to improve precision and recall of
unstructured queries, e.g. "concerts this weekend at Sofia under 20 euros"
  • Approaches
    - Automatic inference of query structure
    - Semi-automatic query annotation
    - Hybrid query languages

23
Overview
  • Vector-based IR and early conceptual approaches
  • Context and concepts in modern topical IR
  • Emerging IR tasks requiring knowledge structures
  • Research at FUB
  • Conclusions

24
Recommender systems
Related keyword feature versus context-dependent query reformulation
27
Combining text retrieval and text mining with concept lattices
Goal: integration of multiple search strategies (querying, browsing,
thesaurus climbing, bounding) into a single Web interface
28
Conclusions
The use of conceptual structures surfaces in traditional topic relevance
retrieval, and it is at the heart of many non-topical retrieval tasks.
Towards conceptual search
  • Understand term meaning
  • Adapt to the user
  • Can translate between applications
  • Explainable
  • Capable of filtering and summarization