Title: Searching
1. Searching
- Google: PageRank and anchor text
- HITS: hubs and authorities
- MSN's RankNet: learning to rank
- Today's web dragons
2. How to search: Google's PageRank
- PageRank
- Anchor text
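
The slide names PageRank without spelling out the computation. As a rough illustration only, the usual power-iteration form can be sketched as below. The damping factor of 0.85, the iteration count, the handling of pages with no out-links, and the four-page `toy_web` graph are all assumptions made for this example, not details from the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration for PageRank on a dict {page: [pages it links to]}.

    `damping` is the assumed probability of following a link rather than
    jumping to a random page.
    """
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page first receives its share of the random jumps.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new_rank[q] += share
            else:
                # Assumption: a page with no out-links spreads its rank evenly.
                for q in pages:
                    new_rank[q] += damping * rank[p] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked from three pages, "d" from none.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph the most linked-to page ends up with the highest rank, and the page nobody links to keeps only its random-jump share.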
3. Chart of the web
vs. random searcher
4. Google search: anchor text
- PageRank
- Anchor text
me: "this is the best page ever"
you, linking to me: "that is the best page ever"
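
One way to read the me/you example is that the words other pages use when linking to a page are indexed as a description of the target. The sketch below is a minimal illustration of that idea. The `(source, target, anchor_text)` tuples and the choice to ignore a page's links to itself are assumptions for this example, not a description of Google's actual indexing.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Index each target page under the words of links pointing at it.

    links: iterable of (source_page, target_page, anchor_text) tuples.
    """
    index = defaultdict(set)
    for source, target, anchor_text in links:
        if source == target:
            # Assumption: what a page says about itself is not counted.
            continue
        for word in anchor_text.lower().split():
            index[word].add(target)
    return index


# Hypothetical links mirroring the slide's example.
links = [
    ("me", "me", "this is the best page ever"),
    ("you", "me", "that is the best page ever"),
]
index = build_anchor_index(links)
print(sorted(index["best"]))
```

Here the page "me" becomes findable under "best" only because "you" said so.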
5. HITS: hubs and authorities
Principal eigenvector → strongest community
Other eigenvectors → other communities
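
To make the eigenvector remark concrete, here is a minimal sketch of the standard HITS iteration, in which hub and authority scores reinforce each other until they settle on the principal eigenvector. The graph, the page names (`h1`, `a1`, and so on) and the iteration count are hypothetical. The toy web has a strong community (three hubs, two authorities) and a weak one (one hub, one authority).

```python
import math


def hits(links, iterations=50):
    """Iterate hub/authority updates on a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages linking in.
        auth = {p: 0.0 for p in pages}
        for p, outs in links.items():
            for q in outs:
                auth[q] += hub[p]
        # Hub score: sum of the authority scores of pages linked to.
        hub = {p: sum(auth[q] for q in links.get(p, [])) for p in pages}
        # Normalise both vectors to unit length so the scores stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth


# Hypothetical web with two disconnected communities.
toy_web = {
    "h1": ["a1", "a2"],
    "h2": ["a1", "a2"],
    "h3": ["a1", "a2"],
    "h4": ["a3"],
}
hub, auth = hits(toy_web)
print({p: round(s, 3) for p, s in auth.items()})
```

The scores of the weak community shrink towards zero, which is the sense in which the principal eigenvector picks out the strongest community. Recovering the other community would need the further eigenvectors, which this sketch does not compute.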
6. Using HITS: Ask's Teoma
[Figure: web communities for "jaguar"; labels: jaguar (repeated), team]
7. Using HITS: Ask's Teoma
[Figure: web communities for "jaguar"; labels: jaguar (repeated), team]
- Hub scores (lists of resources)
- Authority scores (target pages)
- Helps to deal with synonyms: pulls in other relevant pages (e.g. Toyota is an authority for "auto manufacturers" yet doesn't contain the term)
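
The Toyota example can be illustrated with the set-expansion step usually described for HITS: start from pages that contain the query, then add pages they link to and pages that link to them. The sketch below assumes that procedure. The page names, texts and links are invented for illustration, and nothing here is taken from Teoma's actual implementation.

```python
def root_set(pages, query):
    """Pages whose text contains the query string."""
    return {name for name, text in pages.items() if query in text.lower()}


def expand(root, links):
    """Add pages linked from, or linking to, any page in the root set."""
    base = set(root)
    for source, targets in links.items():
        if source in root:
            base.update(targets)  # pages the root set points at
        if any(target in root for target in targets):
            base.add(source)      # pages pointing into the root set
    return base


# Hypothetical pages: "toyota" never mentions the query term itself.
pages = {
    "car-guide": "a guide to auto manufacturers",
    "dealer-list": "auto manufacturers and their dealers",
    "toyota": "toyota cars and trucks",
    "recipes": "bread and soup",
}
links = {
    "car-guide": ["toyota"],
    "dealer-list": ["toyota"],
    "recipes": [],
}
root = root_set(pages, "auto manufacturers")
base = expand(root, links)
print(sorted(base))
```

The "toyota" page enters the candidate set through the hubs that list it, so it can then earn an authority score even though a plain text match would miss it.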
8. Learning to rank: MSN's RankNet
- Training set: queries with matching documents from human judges
- Discriminant function: e.g. weighted sum of features, plus threshold
- Machine learning: learn the weights
- Apply to real queries
- 17,000 queries
- 10 documents/query
- human judgement (1–5)
- 600 features
- pairs of docs with same query: which is more highly ranked?
- train a neural net (1-layer, 2-layer)
- Results? Pretty good
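
The pairwise idea above can be sketched with the simplest discriminant function the slide mentions, a weighted sum of features. The sketch trains the weights so that the document judged better in each pair gets the higher score, using a logistic loss on the score difference. It is a toy stand-in for the 1-layer case, not RankNet itself: the two-feature data, the learning rate and the epoch count are invented for illustration.

```python
import math
import random


def score(weights, features):
    """Discriminant function: weighted sum of the features."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(pairs, n_features, learning_rate=0.1, epochs=200, seed=0):
    """Learn weights from (better_doc, worse_doc) feature pairs.

    Both documents in a pair are assumed to belong to the same query.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for better, worse in pairs:
            diff = score(weights, better) - score(weights, worse)
            # Modelled probability that `better` outranks `worse`.
            p = 1.0 / (1.0 + math.exp(-diff))
            # Gradient step on the cross-entropy loss -log(p).
            for i in range(n_features):
                weights[i] += learning_rate * (1.0 - p) * (better[i] - worse[i])
    return weights


# Hypothetical judged pairs with two features per document.
pairs = [
    ([1.0, 0.2], [0.3, 0.9]),
    ([0.8, 0.5], [0.2, 0.4]),
    ([0.9, 0.1], [0.4, 0.1]),
]
weights = train_pairwise(pairs, n_features=2)
print([round(w, 2) for w in weights])
```

Applying the learned weights to a real query then amounts to scoring each candidate document and sorting by score. A 2-layer net would replace `score` with a small network but keep the same pairwise loss.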
9. Today's web dragons
- 49: Google (1998, 2004)
- 23: Yahoo (1994, 1996; Inktomi 2002; AltaVista 2003)
- 10: MSN (2005)
- 7: AOL (Excite since 1997, Google since 2002)
- 2: Ask (Jeeves) (Teoma 2001)