Using ODP Metadata to Personalize Search

Slides: 13
Provided by: newl7

Transcript and Presenter's Notes


1
Using ODP Metadata to Personalize Search
  • Presented by Lan Nie
  • 09/21/2005, Lehigh University

2
Introduction
  • ODP (Open Directory Project) metadata
  • 4 million sites, 590,000 categories
  • Tree structure
  • Categories: inner nodes
  • Pages: leaf nodes, high quality, representative
  • Using ODP metadata to personalize search
  • 4 billion Web pages vs. 4 million ODP entries
  • Using ODP metadata for personalized search
  • Is biasing possible in the ODP context?
  • Extend ODP classifications from the current 4
    million entries to a 4 billion page Web automatically
    by biasing

3
Using ODP Metadata For Personalized Search
  • User profile: several topics from ODP selected by
    the user
  • Personalized search
  • Send query Q to a search engine S (e.g., Google, ODP
    Search)
  • Res = URLs returned by S
  • For i = 1 to size(Res)
  • Dist[i] = Distance(Res[i], Prof)
  • Re-sort Res based on Dist
  • Representation
  • Both the user profile and a URL (50 in Google
    directory) can be represented as a set of nodes
    in the directory tree
  • Distance(Profile, URL)
  • Minimum distance between the two sets of nodes
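
The re-ranking loop above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function names (`tree_distance`, `set_distance`, `personalize`), the slash-separated category paths, and the example URLs are all hypothetical. Placing unannotated URLs last is also an assumption made only for this sketch.

```python
from itertools import product

def tree_distance(a: str, b: str) -> int:
    """Number of tree edges between two category paths such as 'Top/Arts/Music'."""
    pa, pb = a.split("/"), b.split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    # Walk up from a to the deepest shared ancestor, then down to b.
    return (len(pa) - common) + (len(pb) - common)

def set_distance(profile, url_nodes, dist=tree_distance):
    """Minimum distance over all (profile node, URL node) pairs."""
    return min(dist(p, u) for p, u in product(profile, url_nodes))

def personalize(results, annotations, profile, dist=tree_distance):
    """Re-sort results by distance to the profile.

    sorted() is stable, so ties keep the search engine's original order.
    Assumption: URLs without any annotation are pushed to the end.
    """
    def key(url):
        nodes = annotations.get(url)
        return set_distance(profile, nodes, dist) if nodes else float("inf")
    return sorted(results, key=key)

# Hypothetical data for illustration only.
profile = {"Top/Computers/AI"}
annotations = {
    "a.example": {"Top/Arts/Music"},
    "b.example": {"Top/Computers/AI/ML"},
}
results = ["a.example", "c.example", "b.example"]
reranked = personalize(results, annotations, profile)
```

Passing the distance as a parameter keeps the re-sorting step independent of which distance measure is plugged in.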

4
  • Naïve distances
  • Minimum tree distance
  • Intra-topic links
  • Subsumer
  • Graph shortest path
  • Inter-topic links
  • Complex distance
  • The deeper the subsumer, the more closely
    related the nodes
  • Combining with Google PageRank
  • Some Google results are not annotated
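
To illustrate how the graph shortest path can differ from the tree distance, here is a small breadth-first search sketch. It assumes that tree edges and inter-topic links can both be treated as undirected edges of equal weight, which the slides do not state; the function name `shortest_path_length` and the toy categories are hypothetical.

```python
from collections import deque

def shortest_path_length(edges, start, goal):
    """BFS hop count between two nodes of an undirected graph, or None if unreachable."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, d = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == goal:
                return d + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, d + 1))
    return None

# Toy directory: parent-child edges of the tree plus one inter-topic link.
tree_edges = [("Top", "Arts"), ("Top", "Computers"),
              ("Arts", "Music"), ("Computers", "Audio")]
cross_links = [("Music", "Audio")]

tree_only = shortest_path_length(tree_edges, "Music", "Audio")
with_links = shortest_path_length(tree_edges + cross_links, "Music", "Audio")
```

In this toy directory the two categories are four edges apart in the tree but only one hop apart once the inter-topic link is included.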

5
Experimental Results

6
Extending ODP Annotations To The Web
  • Manual annotation for the whole web is impossible
  • Biasing is an implicit way for extending
    annotations to the Web
  • Is biasing possible in the ODP context?
  • Are ODP entries good biasing sets, i.e., do they
    yield relevant results and rankings that differ
    enough from the non-biased ranking?
  • When does biasing make a difference?
  • Find the characteristics the biasing set has to
    exhibit in order to obtain relevant results
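
As a rough illustration of what biasing means, the sketch below computes PageRank by power iteration and lets the random jump land only on a chosen biasing set. It is a simplified reading of biased PageRank, not the experimental code behind these slides: the function name `pagerank`, the damping factor 0.85, the fixed iteration count, the handling of dangling pages, and the toy link graph are all assumptions.

```python
def pagerank(links, bias=None, damping=0.85, iters=100):
    """Power-iteration PageRank.

    links: dict mapping each page to the list of pages it links to.
    bias:  pages that receive the random-jump mass; None means all pages
           (the ordinary, non-biased case). Assumed to be a subset of the graph.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    targets = set(bias) if bias else set(pages)
    teleport = {p: (1.0 / len(targets) if p in targets else 0.0) for p in pages}
    rank = dict(teleport)
    for _ in range(iters):
        new = {p: (1.0 - damping) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page: redistribute its mass along the teleport vector.
                for q in pages:
                    new[q] += damping * rank[p] * teleport[q]
        rank = new
    return rank

# Toy graph for illustration only.
links = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
unbiased = pagerank(links)
biased = pagerank(links, bias={"d"})
```

Page `d` has no in-links, so its score comes only from the random jump; concentrating the jump on `d` raises its score and changes the resulting ranking.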

7
Experimental Setup
  • Compare the similarity between top 100 non-biased
    PageRank results and biased results
  • Similarity Measure
  • OSim: degree of overlap between the top n
    elements of two rank lists
  • KSim: degree of agreement on ordering between
    the two rank lists
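
The two measures can be sketched as follows. The slides give only the one-line descriptions above, so this follows one common formulation and should be read as an assumption: OSim is the size of the intersection of the two top-n sets divided by n, and KSim is the fraction of item pairs on whose relative order the two lists agree, after each list is extended with the items it is missing. The function names and the choice to append missing items in the other list's order are illustrative.

```python
from itertools import combinations

def osim(r1, r2, n):
    """Overlap of the top-n elements of two rank lists, as a fraction of n."""
    return len(set(r1[:n]) & set(r2[:n])) / n

def ksim(r1, r2):
    """Fraction of item pairs ordered the same way by both lists.

    Assumes neither list contains duplicates. Items missing from a list are
    appended after its own items, in the order the other list gives them.
    """
    s1, s2 = set(r1), set(r2)
    e1 = list(r1) + [x for x in r2 if x not in s1]
    e2 = list(r2) + [x for x in r1 if x not in s2]
    pos1 = {x: i for i, x in enumerate(e1)}
    pos2 = {x: i for i, x in enumerate(e2)}
    pairs = list(combinations(e1, 2))
    if not pairs:
        return 1.0
    agree = sum((pos1[u] < pos1[v]) == (pos2[u] < pos2[v]) for u, v in pairs)
    return agree / len(pairs)
```

For example, `osim(["a", "b", "c"], ["b", "a", "d"], 3)` counts the two shared items `a` and `b` out of three, while `ksim` of a list with its exact reverse is 0.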

8
  • Choice of Biasing Sets
  • Top 0-10% PageRank pages
  • Top 0-2% PageRank pages
  • Randomly selected pages
  • Low PageRank pages
  • Varied the sum of scores within the set between
    0.000005% and 10% of the total sum over all pages
    (TOT).
  • Experiments were run on a crawl of 3 million
    pages and then repeated on the Stanford WebBase
    crawl.
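
The TOT quantity can be computed directly from a table of PageRank scores. The helper below is a hypothetical sketch: the name `tot_share` and the toy scores are not from the presentation, and the value is returned as a plain fraction of the total score mass.

```python
def tot_share(scores, biasing_set):
    """Sum of scores inside the biasing set divided by the sum over all pages."""
    total = sum(scores.values())
    return sum(scores.get(p, 0.0) for p in biasing_set) / total

# Toy scores for illustration only.
scores = {"a": 0.5, "b": 0.3, "c": 0.2}
share = tot_share(scores, {"b", "c"})
```

A set made of a few high-scoring pages and a set made of many low-scoring pages can have the same share, which is why the experiment varies both the kind of pages chosen and the total score they hold.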

9
Biasing set consists of good pages

10
Biasing set consists of randomly selected pages

11
  • According to the random model of biasing, every
    set with TOT below 0.015 is good for biasing.
  • Results are not influenced by the crawl size
  • (3 million page crawl vs. 120 million page WebBase
    crawl)
  • Entries in ODP have TOT below 0.015; thus
    biasing is possible in the ODP context

12
Conclusions
  • A personalized search algorithm that ranks URLs
    based on the distance between the user profile and
    the URL in the ODP taxonomy.
  • Biasing on ODP entries takes effect; thus it
    is feasible to extend the manual ODP
    classification to the Web.