Collaborative Query Profiling


1
Collaborative Query Profiling
  • Erik W. Selberg
  • University of Washington
  • Computer Science and Engineering

2
Outline
  • HuskySearch
  • The Vocabulary Problem
  • The Indexing Problem
  • Collaborative Query Profiling
  • Conclusions and Future Work

3
HuskySearch
  • Parallel Search Engine
  • Send query to all main services
  • AltaVista, Lycos, WebCrawler, etc.
  • Collate results into single list
  • Post-process results for quality
  • Key idea: Don't use one, use all!
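
The parallel-search idea on this slide can be sketched as follows. This is a minimal illustration, assuming each service can be wrapped as a function that takes a query and returns a ranked list of URLs. The two service functions, the example URLs, and the reciprocal-rank scoring are hypothetical stand-ins, not HuskySearch's actual collation method.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real search services; each returns a ranked URL list.
def service_a(query):
    return ["http://a.example/1", "http://shared.example/x"]

def service_b(query):
    return ["http://shared.example/x", "http://b.example/2"]

def metasearch(query, services):
    """Send the query to every service in parallel and collate one ranked list."""
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        result_lists = list(pool.map(lambda s: s(query), services))

    # Assumed scoring: each service contributes 1/(rank+1), summed across services,
    # so pages returned by several services rise toward the top.
    scores = {}
    for results in result_lists:
        for rank, url in enumerate(results):
            scores[url] = scores.get(url, 0.0) + 1.0 / (rank + 1)
    # Sort by score descending, then by URL for a deterministic order.
    return sorted(scores, key=lambda u: (-scores[u], u))

merged = metasearch("prince discography", [service_a, service_b])
```

Duplicates collapse into a single entry because scores are keyed by URL, which is one simple way to get the "single list" the slide describes.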

4
The Vocabulary Problem
  • Semantically equivalent search phrases don't
    yield the same results
  • Different search terms yield different results
  • Different people use different search terms for
    the same question
  • People expect precise and accurate answers

5
Example
  • What were Prince's last five albums, and in what
    year were they released?
  • prince
  • discography prince
  • prince albums
  • the artist formerly known as prince
  • prince album release dates

6
The Indexing Problem
  • Which terms should be used for indexing?
  • Hint: "All of them" is a bad choice
  • Which documents are indexed?
  • Is there higher level information?

7
Example
  • Garbage terms - typos, code, etc.
  • Stop words
  • Lycos: index only 100 relevant terms per page
  • AltaVista: finite global list of indexable terms
  • WebCrawler: search to depth 3
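
To make the term-selection choices above concrete, here is a minimal sketch of picking index terms for one page: stop words are dropped and only a capped number of terms is kept. The stop-word list is a tiny illustrative sample, and using raw frequency to decide which terms are "relevant" is an assumption; the slides do not say how any service actually chooses its terms.

```python
from collections import Counter
import re

STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}  # tiny illustrative list

def index_terms(text, max_terms=100):
    """Pick at most max_terms index terms for one page."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # Assumption: "relevant" is approximated by raw frequency; ties break alphabetically.
    ranked = sorted(counts, key=lambda w: (-counts[w], w))
    return ranked[:max_terms]

terms = index_terms(
    "The albums of Prince: Prince albums and the release dates", max_terms=2
)
```

With a cap this small, terms such as "release" and "dates" never reach the index, so a query that relies on them cannot match the page.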

8
HuskySearch

9
prince discography

10
albums Prince

11
prison 'us citizen' charge

12
US citizens foreign jails

13
Collaborative Query Profiling
  • Helping users find relevant documents with input
    from past queries
  • How are users helped?
  • Which queries are useful?
  • What are the metrics?

14
Implementation
  • Rank function twiddling: augment results
  • Relevance feedback
  • Create new searchable indices
  • Include indices with HuskySearch queries
  • Adds new pages to results list
  • Augments pages returned from other sources
  • Evaluate using passive testing
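
One way to read the two augmentation bullets above is as a merge step: hits from the new index are added to the results list, and pages that the regular services already returned get their score raised. The sketch below assumes scores are plain floats keyed by URL; the function name, the additive boost, and the example scores are all hypothetical.

```python
def augment_results(engine_results, collaborative_hits, boost=1.0):
    """Merge hits from an index built from past queries into the regular results.

    engine_results: dict of url -> score from the regular services.
    collaborative_hits: dict of url -> score from the collaborative index.
    """
    combined = dict(engine_results)
    for url, score in collaborative_hits.items():
        # New pages are added; pages already present get their score raised.
        combined[url] = combined.get(url, 0.0) + boost * score
    return sorted(combined, key=lambda u: (-combined[u], u))

ranked = augment_results(
    {"http://a.example": 0.9, "http://b.example": 0.5},
    {"http://b.example": 0.6, "http://c.example": 0.3},
)
```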

15
Indexed Results
  • Observation: Snippets in HuskySearch results
    pages highlight relevant terms
  • Experiment 1: Create an index of result pages
    and search them
  • Experiment 2: Create an index of good result
    pages and search them
  • Note: this creates an implicit searchable query
    history
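
A minimal sketch of the idea behind Experiment 1, assuming each past result is available as a URL plus its snippet text. The class name, the whitespace tokenization, and the all-terms-must-match search are illustrative assumptions rather than the experiment's actual design.

```python
from collections import defaultdict

class ResultIndex:
    """Tiny inverted index over snippets from past result pages."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of URLs

    def add_result(self, url, snippet):
        for term in snippet.lower().split():
            self.postings[term].add(url)

    def search(self, query):
        """Return URLs whose snippet contains every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self.postings.get(term, set())
        return hits

index = ResultIndex()
index.add_result("http://p.example/disc", "Prince discography albums and release dates")
index.add_result("http://j.example/info", "US citizens in foreign jails")
found = index.search("prince albums")
```

Because the snippets were produced in response to earlier queries, searching them is what gives the implicit query history mentioned in the note above.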

16
Collaborative Databases
  • Hypothesis: pages close to user searches create
    a good web snapshot
  • Exp. 3: Create an index of pages referenced in
    results
  • Exp. 4: Create an index of referenced pages
    clicked on
  • Exp. 5, 6: Expand 3, 4 to all pages within 3
    links
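
Expanding a set of pages to everything within 3 links is a bounded breadth-first traversal. The sketch below assumes the link structure is already available as a dictionary mapping each page to the pages it links to; the function name and the toy graph are hypothetical, and a real crawler would fetch and parse pages instead.

```python
from collections import deque

def pages_within(link_graph, seeds, max_depth=3):
    """Return the seed pages plus every page reachable in at most max_depth links."""
    seen = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links past the depth limit
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen

# Toy chain of links: r -> a -> b -> c -> d
graph = {"r": ["a"], "a": ["b"], "b": ["c"], "c": ["d"]}
snapshot = pages_within(graph, ["r"])
```

Here page "d" is four links from the seed, so it falls outside the snapshot.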

17
Metrics
  • Clicking implies relevance
  • Clicking top ranked results implies validation
  • Or are we just measuring what we've trained users
    to do?
  • Need user testing to confirm these metrics
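
The two click-based signals above could be computed from a passive click log roughly as follows. This is a sketch under stated assumptions: the log format (one list of clicked result ranks per query), the top_k cutoff of 3, and the function name are all hypothetical, and as the slide notes, such numbers would still need user testing before being trusted as relevance measures.

```python
def click_metrics(sessions, top_k=3):
    """Summarize a click log.

    sessions: one list per query holding the 1-based ranks of clicked results.
    Returns (share of queries with any click,
             share of clicked queries with a click in the top_k results).
    """
    with_click = [clicks for clicks in sessions if clicks]
    click_rate = len(with_click) / len(sessions) if sessions else 0.0
    top_hits = [clicks for clicks in with_click if min(clicks) <= top_k]
    top_rate = len(top_hits) / len(with_click) if with_click else 0.0
    return click_rate, top_rate

rates = click_metrics([[1, 4], [], [7], [2]])
```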

18
Current Status
  • Exp. 1 already integrated
  • Initial pilot study
  • Exp. 3, 5 coming quickly
  • Exp. 2, 4, 6 soon
  • Followup user study planned

19
Conclusions
  • The Vocabulary and Indexing Problems make it
    harder to find information
  • Collaborative Query Profiling aids users with
    previous searches
  • Integrated databases fit into user interface
    nicely
  • Collaborative aspect allows users to leverage
    other users' intelligence