CSE 450

1
CSE 450 Web Mining Seminar
Professor Brian D. Davison
Fall 2005
  • A Presentation on
  • When Experts Agree: Using Non-Affiliated Experts
    to Rank Popular Topics
  • K. Bharat and G. A. Mihaila
  • WWW10 Conference, May 2001, Hong Kong
  • by
  • Osama Ahmed Khan
  • 10/06/2005

2
Problem
  • Query on Popular Topic
  • Content Analysis

Solution
  • Most Authoritative Pages

3
Technical Terms
  • Expert
  • Recommendation
  • Non-affiliation

4
Hilltop Algorithm
  • Expert Lookup
  • Detecting Host Affiliation
  • Expert Selection
  • Expert Indexing
  • Target Ranking
  • Computing Expert Score
  • Computing Target Score

5
Detecting Host Affiliation
  • Conditions
  • Same first 3 octets of IP
  • e.g. 127.0.0.1 and 127.0.0.15
  • Same rightmost non-generic token of hostname
  • e.g. www.ibm.com and www.ibm.co.mx
  • Union-Find Algorithm
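
The two conditions above can be combined with a union-find structure, sketched below in Python. This is a minimal illustration, not the authors' implementation: the `GENERIC_TOKENS` set is a small hypothetical sample, and the function names and the host-to-IP input format are assumptions.

```python
# Hypothetical, incomplete sample of generic hostname tokens (assumption).
GENERIC_TOKENS = {"com", "co", "org", "net", "edu", "gov", "mx", "uk"}


def ip_prefix(ip):
    """Return the first three octets of a dotted-quad IPv4 address."""
    return tuple(ip.split(".")[:3])


def rightmost_non_generic_token(hostname):
    """Strip generic suffix tokens, then return the rightmost remaining token."""
    tokens = hostname.lower().split(".")
    while tokens and tokens[-1] in GENERIC_TOKENS:
        tokens.pop()
    return tokens[-1] if tokens else ""


class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b


def affiliation_classes(hosts):
    """hosts: dict mapping hostname -> IPv4 string. Returns a UnionFind
    in which affiliated hosts share the same root."""
    uf = UnionFind()
    first_seen = {}  # affiliation key -> first host seen with that key
    for host, ip in hosts.items():
        uf.find(host)  # register the host even if it matches nothing
        keys = [("ip", ip_prefix(ip))]
        token = rightmost_non_generic_token(host)
        if token:
            keys.append(("token", token))
        for key in keys:
            if key in first_seen:
                uf.union(host, first_seen[key])
            else:
                first_seen[key] = host
    return uf
```

Because union is transitive, two hosts end up affiliated if they are linked through any chain of shared IP prefixes or shared tokens.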

6
Expert Selection
  • Retrieve all webpages with
  • Out-degree > threshold (k)
  • (e.g. k = 5)
  • Expert will have
  • URLs pointing to k distinct non-affiliated hosts
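
A rough Python sketch of this selection step follows. The input format (a dict of page URL to out-link URLs) and the `affiliation_key` callback, which maps a hostname to its affiliation class, are assumptions made for illustration. The sketch only checks that the linked hosts are mutually non-affiliated.

```python
from urllib.parse import urlsplit


def host_of(url):
    """Extract the hostname from a URL ('' if it has none)."""
    return urlsplit(url).hostname or ""


def select_experts(pages, affiliation_key, k=5):
    """pages: dict mapping page URL -> list of out-link URLs.
    affiliation_key: hypothetical callback, hostname -> affiliation class.
    Returns the pages that qualify as experts."""
    experts = []
    for url, out_links in pages.items():
        # Condition 1: out-degree greater than the threshold k.
        if len(set(out_links)) <= k:
            continue
        # Condition 2: links reach at least k distinct non-affiliated hosts.
        classes = {affiliation_key(host_of(link)) for link in out_links}
        if len(classes) >= k:
            experts.append(url)
    return experts
```

For example, `select_experts(pages, affiliation_key=lambda host: host, k=5)` treats every hostname as its own affiliation class.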

7
Expert Indexing
  • Inverted Index
  • Mapping Keywords to Experts
  • Key Phrases
  • Match Positions
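
One way to picture such an index is sketched below in Python: each keyword maps to a list of (expert, key phrase, position) entries. The input format and the function name are assumptions, and whitespace tokenization stands in for whatever tokenizer a real system would use.

```python
from collections import defaultdict


def build_expert_index(experts):
    """experts: dict mapping expert URL -> list of key-phrase strings.
    Returns keyword -> list of (expert, phrase_id, position) postings,
    where position is the keyword's offset within the key phrase."""
    index = defaultdict(list)
    for expert, phrases in experts.items():
        for phrase_id, phrase in enumerate(phrases):
            for position, word in enumerate(phrase.lower().split()):
                index[word].append((expert, phrase_id, position))
    return index
```

Looking up a query keyword then yields every expert whose key phrases contain it, together with where the match occurs.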

8
Computing Expert Score
  • Condition
  • At least 1 URL with all query keywords
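  • Expert Score ($S_0$, $S_1$, $S_2$), where $k$ is the number of query terms
  • $S_i = \sum_{\text{key phrases } p \text{ with } k-i \text{ query terms}} \mathrm{LevelScore}(p) \cdot \mathrm{FullnessFactor}(p, q)$
  • $\mathrm{Expert\_Score} = 2^{32} \cdot S_0 + 2^{16} \cdot S_1 + S_2$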
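
The scoring formula can be illustrated with a short Python sketch. Several pieces are assumptions: each key phrase is passed in as a `(level_score, text)` pair with the level score chosen by the caller, `fullness_factor` is a simple placeholder (the fraction of phrase terms that are query terms), and the "at least 1 URL with all query keywords" condition is not checked.

```python
def fullness_factor(phrase_terms, query_terms):
    """Placeholder (assumption): fraction of phrase terms that are query terms."""
    matched = sum(1 for term in phrase_terms if term in query_terms)
    return matched / len(phrase_terms)


def expert_score(key_phrases, query):
    """key_phrases: list of (level_score, phrase_text) pairs.
    Computes S_0, S_1, S_2 and combines them as 2^32*S_0 + 2^16*S_1 + S_2."""
    query_terms = set(query.lower().split())
    k = len(query_terms)
    s = [0.0, 0.0, 0.0]
    for level_score, text in key_phrases:
        terms = text.lower().split()
        matched = len(query_terms & set(terms))
        if matched == 0:
            continue  # phrase contains no query term
        i = k - matched  # phrase contains k - i query terms
        if i <= 2:
            s[i] += level_score * fullness_factor(terms, query_terms)
    return 2**32 * s[0] + 2**16 * s[1] + s[2]
```

The large multipliers make the ranking close to lexicographic: under these placeholder values, one phrase containing every query term outweighs many phrases that each miss a term.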

9
Computing Target Score
  • Condition
  • At least 2 non-affiliated experts
  • Target Score
  • $\mathrm{Edge\_Score}(E, T) = \mathrm{Expert\_Score}(E) \cdot \sum_{\text{query keywords } w} \mathrm{occ}(w, T)$
  • $\mathrm{Target\_Score}(T) = \sum_{E} \mathrm{Edge\_Score}(E, T)$
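
A minimal Python sketch of the target-scoring step is given below. The edge tuple layout, the `affiliation_key` callback (expert to affiliation class), and the function names are assumptions. The sketch sums every edge exactly as the formula on the slide states and does nothing special when affiliated experts point to the same target.

```python
def edge_score(expert_score, occ):
    """occ: dict mapping query keyword w -> occ(w, T) for one expert-target edge."""
    return expert_score * sum(occ.values())


def target_scores(edges, affiliation_key, min_experts=2):
    """edges: list of (expert, expert_score, target, occ) tuples.
    affiliation_key: hypothetical callback, expert -> affiliation class.
    Returns target -> Target_Score for targets that are pointed to by
    at least min_experts mutually non-affiliated experts."""
    by_target = {}
    for expert, score, target, occ in edges:
        by_target.setdefault(target, []).append((expert, edge_score(score, occ)))

    result = {}
    for target, scored_edges in by_target.items():
        classes = {affiliation_key(expert) for expert, _ in scored_edges}
        if len(classes) >= min_experts:
            result[target] = sum(score for _, score in scored_edges)
    return result
```

Targets backed by a single expert, or only by experts from one affiliation class, are dropped rather than given a low score.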

10
Evaluation
  1. Locating Specific Popular Targets

11
Evaluation (Contd.)
  2. Gathering Relevant Pages

12
Conclusion
  • Characteristics
  • Popular Queries
  • Expert Subset
  • Hilltop vs. PageRank
  • Hilltop vs. Topic Distillation

13
  • Thank You