Collection Fusion - PowerPoint PPT Presentation

Slides: 12
Provided by: andre9
Learn more at: https://www.cs.jhu.edu

Transcript and Presenter's Notes


1
Collection Fusion
  • Parallel retrieval on different information sources (e.g., different search engines or different collections)
  • Merger of results
  • MetaCrawler - University of Washington (Selberg & Etzioni, 1995)
  • Towell & Voorhees

2
Collection Fusion
Merging results from different search engines (Web brokers)
[Figure: result lists from Lycos, Altavista, Infoseek, Excite, and Joe's Bot merged into a single ranking (ranks 1 to 12). Example score lists on different scales: .99 .98 .96 .94 .94 .92 .92; 4 4 4 3.5 3.2 3.0 2.1; .99 .97 .97 .95 .95 .92. Labels: Bayes Nets, Bag of words]
  • Different methods (good thing)
  • Merge by downloading all results and reranking them with a private relevance scheme
3
Collection Fusion
  • Issues
  • Different weighting and relevance scales
    (logarithmic, linear, different ranges)
  • No ranking or weighting in some cases
  • Different sizes of response set
  • Different biases of collections
  • Duplicate identification and removal
  • Cost (money) or latency/bandwidth (time) as
    factor in relevance ranking
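The first issue, scores reported on different scales and ranges, can be illustrated with a small normalization step. This is only a sketch: min-max scaling and the optional `log_scale` flag are assumptions chosen for illustration, not a method the slides prescribe, and `normalize_scores` is a hypothetical helper name.

```python
import math

def normalize_scores(scores, log_scale=False):
    """Map one engine's raw scores onto [0, 1] with min-max scaling.

    log_scale=True applies log(1 + s) first; this is an assumed way to
    compress a wide-ranging scale and expects non-negative scores.
    """
    if not scores:
        return []
    values = [math.log1p(s) for s in scores] if log_scale else list(scores)
    lo, hi = min(values), max(values)
    if hi == lo:
        # All scores equal: no ordering information, treat all as top-scored.
        return [1.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

# Two score lists on different scales end up on the same [0, 1] range.
engine_a = normalize_scores([0.99, 0.98, 0.96, 0.94, 0.94, 0.92, 0.92])
engine_b = normalize_scores([4, 4, 4, 3.5, 3.2, 3.0, 2.1])
```

After this step the top result of each list scores 1.0 and the bottom result 0.0, so the lists can be compared, although differences in collection bias and response-set size remain unaddressed.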

4
Goal
  • Learn
  • Ranking scale
  • Ranking reliability
  • Relevance ratio
  • Function
  • $\mathrm{Rank}(CF) = a_1 f(\mathrm{Rank}(A_1)) + a_2 f(\mathrm{Rank}(A_2))$

$a_i = 1/k$, with $k$ = number of collections
May need log transform and/or scale shift
5
Issues
  • Duplicate identification and removal
  • Link checking (reliability)
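Duplicate identification could be sketched as comparing results by a canonical form of their URL. The canonicalization rules below (lowercase host, drop the fragment, drop a trailing slash) are assumptions for illustration, and `canonical_url` and `remove_duplicates` are hypothetical names. Link checking would need network access and is left out.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url):
    """Reduce a URL to a comparison key using a few assumed rules."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    # Lowercase scheme and host, keep the query, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       path, parts.query, ""))

def remove_duplicates(results):
    """Keep the first occurrence of each canonical URL.

    results: list of (url, score) pairs, already in merged order.
    """
    seen = set()
    unique = []
    for url, score in results:
        key = canonical_url(url)
        if key not in seen:
            seen.add(key)
            unique.append((url, score))
    return unique

merged = [
    ("http://Example.com/a/", 0.99),
    ("http://example.com/a#top", 0.97),
    ("http://example.com/b", 0.95),
]
deduplicated = remove_duplicates(merged)
```

Matching on URLs alone misses mirrored pages with identical content; catching those would require comparing the documents themselves.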
6
Impact on Service Provider
  • Charge per access
  • Advertising solutions?
7
Rank-Driven Collection Fusion
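$\mathrm{Rank}(CF, d_i) = \sum_{j \in \text{collections}} a_j \, f(\mathrm{Rank}(\text{collection}_j, d_i))$
  • $f$: may need log transform or scale shift
  • $\mathrm{Rank}(\text{collection}_j, d_i)$: rank of document $i$ in collection $j$
  • $a_j$: will depend on the collection's overall relevance and the reliability of its rankings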
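The weighted sum above can be sketched in a few lines. Several choices here are assumptions rather than part of the slides: the reciprocal transform used for f (picked so that better ranks contribute more), the rule that a document absent from a collection contributes nothing for it, and the name `fuse_by_rank`. The sum is treated as a fused score, and the fused ranking is obtained by sorting on it.

```python
def fuse_by_rank(rankings, weights=None, f=lambda rank: 1.0 / rank):
    """Combine per-collection rankings into one fused ordering.

    rankings: dict mapping collection name -> list of document ids, best first.
    weights:  dict mapping collection name -> a_j; defaults to 1/k for each.
    f:        transform applied to a 1-based rank (assumed: reciprocal).
    """
    k = len(rankings)
    if weights is None:
        weights = {name: 1.0 / k for name in rankings}
    fused = {}
    for name, docs in rankings.items():
        for rank, doc in enumerate(docs, start=1):
            fused[doc] = fused.get(doc, 0.0) + weights[name] * f(rank)
    # Highest fused score first; ties broken by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))

rankings = {"A1": ["d1", "d2", "d3"], "A2": ["d2", "d4"]}
equal_weights = fuse_by_rank(rankings)
trusting_a1 = fuse_by_rank(rankings, weights={"A1": 0.9, "A2": 0.1})
```

With equal weights, d2 comes first because both collections return it; raising the weight of A1 moves d1 ahead, which shows how the a_j terms encode trust in a collection.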
8
[Diagram: Collection 1 and Collection 2 merged into CF]
Assuming
  • $\text{Relevance} \propto \text{rank}$
  • Collection sizes are equal
  • Smaller returned set? More selective
9
[Diagram: Collection 1 and Collection 2 merged into CF]
Assuming
  • $\text{Relevance} \propto \dfrac{\text{rank}}{\text{total returned}}$
  • Equal selectivity
  • Smaller returned set? Smaller collection
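One way to read the rank / total returned assumption is that results at the same relative position in their lists are treated as comparable. The sketch below merges lists on that basis. Treating a smaller fraction as better and breaking ties by collection name are interpretive assumptions, and the function names are hypothetical.

```python
def rank_fraction(rank, total_returned):
    """Relative position of a result: 1-based rank divided by the set size."""
    if not 1 <= rank <= total_returned:
        raise ValueError("rank must lie between 1 and total_returned")
    return rank / total_returned

def merge_by_rank_fraction(result_lists):
    """Interleave lists by rank / total returned, smallest fraction first.

    result_lists: dict mapping collection name -> list of document ids, best first.
    """
    merged = []
    for name, docs in result_lists.items():
        total = len(docs)
        for rank, doc in enumerate(docs, start=1):
            merged.append((rank_fraction(rank, total), name, doc))
    merged.sort()  # ties on the fraction fall back to collection name
    return [(doc, name) for _, name, doc in merged]

merged = merge_by_rank_fraction({
    "c1": ["a", "b", "c", "d"],
    "c2": ["x", "y"],
})
```

Here the top result of the two-item list sits at fraction 0.5, so it is placed level with the second result of the four-item list rather than with its first.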
10
Merge using relevance judgments of different search engines
  • $\lambda_{\text{Excite}} = .01$
  • $\lambda_{\text{Altavista}} = .50$
  • $\lambda = f(\text{correlation with my judgments})$
My rank or relevance = f(
  - service provider
  - their rankings / relevance judgments
  - their past performance
  - nature of scale used
)
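A per-engine weight based on correlation with one's own judgments could be sketched as follows. The slides do not say which correlation measure or which f is meant; Spearman rank correlation (for lists without ties) and clipping negative values to zero are assumptions made here, and `engine_weight` is a hypothetical name.

```python
def spearman(xs, ys):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length >= 2")

    def ranks(values):
        # Position of each item when the list is sorted ascending (1-based).
        order = sorted(range(n), key=lambda i: values[i])
        r = [0] * n
        for position, i in enumerate(order, start=1):
            r[i] = position
        return r

    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

def engine_weight(engine_scores, my_scores):
    """Assumed choice of f: use the correlation, clipped at zero."""
    return max(0.0, spearman(engine_scores, my_scores))

# Scores an engine gave to four documents, and my own judgments of them.
agreeing = engine_weight([0.9, 0.7, 0.4, 0.1], [4, 3, 2, 1])
disagreeing = engine_weight([0.1, 0.4, 0.7, 0.9], [4, 3, 2, 1])
```

An engine whose ordering matches the user's judgments gets full weight, while one that orders the same documents in reverse gets none.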
11
Sample-Based Relevance
[Figure: samples drawn from Collection 1 and Collection 2 (example scores .99 .96 .96 .95 .95 .94 .94 .93 .92 .91). Plots compare System Ranking against User Ranking (Collective Judgment) for System 1 through System 4, with an "ideal" line. Axes run from 0 to 100 and from 0.0 to 1.00]