Title: Personalized Web Search: Uncommon Responses to Common Queries
1Personalized Web Search: Uncommon Responses to Common Queries
- Jaime Teevan, MIT
- with Susan T. Dumais
- and Eric Horvitz, MSR
3Personalizing Web Search
- Motivation
- Algorithms
- Results
- Future Work
5Study of Personal Relevancy
- 15 participants
- Microsoft employees
- Managers, support staff, programmers,
- Evaluate 50 results for a query
- Highly relevant
- Relevant
- Irrelevant
- 10 queries per person
6Study of Personal Relevancy
- Query selection
- Chose from 10 pre-selected queries
- Previously issued query
- Pre-selected queries: cancer, Microsoft, traffic, bison frise, Red Sox, airlines, Las Vegas, rice, McDonalds
[Figure labels: Mary, Joe]
- Total: 137
- 53 pre-selected (2-9 per query)
7Relevant Results Have Low Rank
[Chart legend: Highly Relevant, Relevant, Irrelevant]
8Relevant Results Have Low Rank
[Chart legend: Highly Relevant, Relevant, Irrelevant; series: Rater 1, Rater 2]
9Same Results Rated Differently
- Average inter-rater reliability (IRR): 56%
- Different from previous research
- Belkin: 94% IRR in TREC
- Eastman: 85% IRR on the Web
- Asked for personal relevance judgments
- Some queries more correlated than others
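A minimal sketch of one way to compute an average agreement figure like the one on this slide. The slide does not define the measure, so this assumes the simplest reading: the fraction of results that two raters labelled identically, averaged over all rater pairs. The function names and the dictionary layout are hypothetical.

```python
from itertools import combinations

def pairwise_agreement(ratings_a, ratings_b):
    """Fraction of results rated by both raters that received the same label."""
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return None  # the two raters have no result in common
    same = sum(1 for url in shared if ratings_a[url] == ratings_b[url])
    return same / len(shared)

def average_agreement(raters):
    """Mean pairwise agreement over every pair of raters with shared results."""
    scores = []
    for a, b in combinations(raters, 2):
        s = pairwise_agreement(a, b)
        if s is not None:
            scores.append(s)
    return sum(scores) / len(scores)

# Toy data (assumed, not from the study): two raters, two results.
rater_1 = {"u1": "relevant", "u2": "irrelevant"}
rater_2 = {"u1": "relevant", "u2": "relevant"}
print(average_agreement([rater_1, rater_2]))
```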
10Same Query, Different Intent
- Different meanings
- Information about the astronomical/astrological sign of cancer
- information about cancer treatments
- Different intents
- is there any new tests for cancer?
- information about cancer treatments
11Same Intent, Different Evaluation
- Query: Microsoft
- information about microsoft, the company
- Things related to the Microsoft corporation
- Information on Microsoft Corp
- 31/50 rated as not irrelevant
- For only 6 of the 31 does more than one rater agree
- All three agree only for www.microsoft.com
- Inter-rater reliability: 56%
12Search Engines are for the Masses
[Figure labels: Joe, Mary]
13Much Room for Improvement
- Group ranking
- Best improves on Web by 38%
- More people → less improvement
14Much Room for Improvement
- Group ranking
- Best improves on Web by 38%
- More people → less improvement
- Personal ranking
- Best improves on Web by 55%
- Remains constant
15Personalizing Web Search
- Motivation
- Algorithms
- Results
- Future Work
- Seesaw Search Engine
16Personalization Algorithms
- Related to relevance feedback
[Diagram labels: Query, Server, Document, Client, User]
17Personalization Algorithms
- Related to relevance feedback
[Diagram labels: Query, Server, Document, Client, User]
v. Result re-ranking
18Result Re-Ranking
- Ensures privacy
- Good evaluation framework
- Can look at rich user profile
- Look at lightweight user models
- Collected on server side
- Sent as query expansion
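A minimal sketch of client-side result re-ranking, assuming each result arrives as a dictionary with a `snippet` field and that per-term weights have already been derived from the user model. The `rerank` function, the field names, and the toy weights are hypothetical and only illustrate the idea that the user profile never has to leave the client.

```python
from collections import Counter

def rerank(results, weights):
    """Re-order search results locally by the sum of tf_i * w_i over snippet terms."""
    def score(text):
        tf = Counter(text.lower().split())
        return sum(count * weights.get(term, 0.0) for term, count in tf.items())
    # sorted() is stable, so results with equal scores keep the engine's order
    return sorted(results, key=lambda r: score(r["snippet"]), reverse=True)

# Toy data (assumed): weights favour a user interested in cancer treatments.
results = [
    {"url": "a", "snippet": "cancer horoscope sign"},
    {"url": "b", "snippet": "cancer treatment research"},
    {"url": "c", "snippet": "zodiac"},
]
weights = {"treatment": 2.0, "research": 1.0}
print([r["url"] for r in rerank(results, weights)])
```

Only the returned result list is touched, so the same harness can compare different user models against the unmodified Web ranking.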
19BM25 with Relevance Feedback
$$\text{Score} = \sum_i tf_i \cdot w_i, \qquad w_i = \log \frac{N}{n_i}$$
[Diagram labels: N, n_i, R, r_i]
20BM25 with Relevance Feedback
$$\text{Score} = \sum_i tf_i \cdot w_i$$
$$w_i = \log \frac{(r_i + 0.5)(N - n_i - R + r_i + 0.5)}{(n_i - r_i + 0.5)(R - r_i + 0.5)}$$
[Diagram labels: N, n_i, R, r_i]
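A minimal sketch of the term weight with relevance feedback shown on this slide, where N is the number of documents in the corpus, n_i the number containing term i, R the number of documents with relevance feedback, and r_i the number of those containing term i. The function names are hypothetical, and the natural logarithm is an assumption since the slide does not state a base.

```python
import math

def rf_term_weight(N, n_i, R, r_i):
    """Term weight with relevance feedback, using 0.5 smoothing in each factor."""
    numerator = (r_i + 0.5) * (N - n_i - R + r_i + 0.5)
    denominator = (n_i - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(numerator / denominator)

def score(term_freqs, weights):
    """Document score: sum over terms of tf_i * w_i."""
    return sum(tf * weights.get(term, 0.0) for term, tf in term_freqs.items())

# Toy numbers (assumed): 1000 documents, term in 100 of them,
# 10 feedback documents of which 8 contain the term.
w = rf_term_weight(N=1000, n_i=100, R=10, r_i=8)
print(score({"cancer": 3}, {"cancer": w}))
```

With no feedback (R = 0, r_i = 0) the weight reduces to log((N - n_i + 0.5) / (n_i + 0.5)), a smoothed inverse document frequency.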
21User Model as Relevance Feedback
$$\text{Score} = \sum_i tf_i \cdot w_i$$
$$N' = N + R, \qquad n_i' = n_i + r_i$$
$$w_i = \log \frac{(r_i + 0.5)(N' - n_i' - R + r_i + 0.5)}{(n_i' - r_i + 0.5)(R - r_i + 0.5)}$$
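A minimal sketch of treating the user model as relevance feedback, assuming the substitution on this slide is N' = N + R and n_i' = n_i + r_i, that is, the user's documents are counted as additions to the corpus rather than as a subset of it. The function name is hypothetical and the natural logarithm is an assumption.

```python
import math

def user_model_weight(N, n_i, R, r_i):
    """Relevance-feedback weight with the user's documents added to corpus counts."""
    N_prime = N + R        # corpus size including the user's documents
    n_prime = n_i + r_i    # documents containing the term, including the user's
    numerator = (r_i + 0.5) * (N_prime - n_prime - R + r_i + 0.5)
    denominator = (n_prime - r_i + 0.5) * (R - r_i + 0.5)
    return math.log(numerator / denominator)

# Toy numbers (assumed): a term that is rare on the Web (n_i = 2) but
# common in the user's own documents (r_i = 10 of R = 50).
print(user_model_weight(N=1000, n_i=2, R=50, r_i=10))
```

Under this assumption the expression simplifies to log((r_i + 0.5)(N - n_i + 0.5) / ((n_i + 0.5)(R - r_i + 0.5))), so the weight stays defined even when a term appears in more of the user's documents than in the corpus estimate (r_i > n_i).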
23User Model as Relevance Feedback
$$\text{Score} = \sum_i tf_i \cdot w_i$$
[Diagram: World (N, n_i); User (R, r_i)]
24User Model as Relevance Feedback
$$\text{Score} = \sum_i tf_i \cdot w_i$$
[Diagram: World (N, n_i); User (R, r_i); World related to query (N, n_i)]
25User Model as Relevance Feedback
$$\text{Score} = \sum_i tf_i \cdot w_i$$
[Diagram: World (N, n_i); User (R, r_i); World related to query (N, n_i); User related to query (R, r_i)]
Query Focused Matching
26User Model as Relevance Feedback
$$\text{Score} = \sum_i tf_i \cdot w_i$$
[Diagram: World (N, n_i); User (R, r_i); Web related to query (N, n_i); User related to query (R, r_i)]
World Focused Matching
Query Focused Matching
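A minimal sketch of the difference between world focused and query focused statistics, assuming documents are plain strings and that "related to query" means "contains at least one query term". The helper names are hypothetical; the same helpers could be applied to the user's documents to obtain R and r_i.

```python
from collections import Counter

def corpus_stats(docs):
    """Return (N, n) where N is the document count and n[t] the number of docs containing t."""
    n = Counter()
    for text in docs:
        n.update(set(text.lower().split()))  # count each term once per document
    return len(docs), n

def related_to_query(docs, query_terms):
    """Keep only the documents that contain at least one query term."""
    q = {t.lower() for t in query_terms}
    return [d for d in docs if q & set(d.lower().split())]

docs = ["cancer treatment news", "red sox score", "cancer sign horoscope"]

# World focused: statistics over every document.
N_world, n_world = corpus_stats(docs)
# Query focused: statistics over the query-related subset only.
N_query, n_query = corpus_stats(related_to_query(docs, ["cancer"]))
print(N_world, n_world["cancer"], N_query, n_query["treatment"])
```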
27Parameters
- Matching
- User representation
- World representation
- Query expansion
28Parameters
- Matching: query focused, world focused
- User representation
- World representation
- Query expansion
30User Representation
- Stuff I've Seen (SIS) index
- MSR research project (Dumais et al.)
- Index of everything a user has seen
- Recently indexed documents
- Web documents in SIS index
- Query history
- None
31Parameters
- Matching: query focused, world focused
- User representation: All SIS, Recent SIS, Web SIS, query history, none
- World representation
- Query expansion
33World Representation
- Document Representation
- Full text
- Title and snippet
- Corpus Representation
- Web
- Result set title and snippet
- Result set full text
34Parameters
- Matching: query focused, world focused
- User representation: All SIS, Recent SIS, Web SIS, query history, none
- World representation: document as full text or title and snippet; corpus as Web, result set full text, or result set title and snippet
- Query expansion
36Query Expansion
- All words in document
- Query focused
Example snippet:
The American Cancer Society is dedicated to eliminating cancer as a major health problem by preventing cancer, saving lives, and diminishing suffering through ...
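A minimal sketch of the two term-selection options on this slide, applied to its example snippet. It assumes "query focused" means keeping only the words within a small window around each occurrence of a query term; the window size of 2 and the function names are hypothetical.

```python
import re

def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def all_words(text):
    """'All words in document': every distinct token."""
    return set(tokenize(text))

def query_focused_words(text, query_terms, window=2):
    """'Query focused': tokens within `window` positions of any query term."""
    tokens = tokenize(text)
    q = {t.lower() for t in query_terms}
    keep = set()
    for i, tok in enumerate(tokens):
        if tok in q:
            keep.update(tokens[max(0, i - window): i + window + 1])
    return keep

snippet = ("The American Cancer Society is dedicated to eliminating cancer "
           "as a major health problem by preventing cancer, saving lives, "
           "and diminishing suffering through")
print(sorted(query_focused_words(snippet, ["cancer"])))
```

With this window, words such as "society" and "preventing" are kept, while "suffering", which is far from any occurrence of the query term, is dropped.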
37Parameters
- Matching: query focused, world focused
- User representation: All SIS, Recent SIS, Web SIS, query history, none
- World representation: document as full text or title and snippet; corpus as Web, result set full text, or result set title and snippet
- Query expansion: all words, query focused
39Personalizing Web Search
- Motivation
- Algorithms
- Results
- Future Work
40Best Parameter Settings
- Matching: query focused, world focused
- User representation: All SIS, Recent SIS, Web SIS, query history, none
- World representation: document as full text or title and snippet; corpus as Web, result set full text, or result set title and snippet
- Query expansion: all words, query focused
[Slide highlights the best setting for each parameter; the highlighting is not preserved in this transcript]
41Seesaw Improves Retrieval
- No user model
- Random
- Relevance Feedback
- Seesaw
42Text Alone Not Enough
43Incorporate Non-text Features
44Summary
- Rich user model important for search personalization
- Seesaw improves text-based retrieval
- Need other features to improve Web
- Lots of room for improvement
future
45Personalizing Web Search
- Motivation
- Algorithms
- Results
- Future Work
- Further exploration
- Making Seesaw practical
- User interface issues
46Further Exploration
- Explore larger parameter space
- Learn parameters
- Based on individual
- Based on query
- Based on results
- Give user control?
47Making Seesaw Practical
- Learn most about personalization by deploying a system
- Best algorithm reasonably efficient
- Merging server and client
- Query expansion
- Get more relevant results in the set to be re-ranked
- Design snippets for personalization
48User Interface Issues
- Make personalization transparent
- Give user control over personalization
- Slider between Web and personalized results
- Allows for background computation
- Creates problem with re-finding
- Results change as user model changes
- Thesis research: ReSearch Engine
49Thank you!
50Search Engines are for the Masses
- Best common ranking
- DCG(i):
$$DCG(i) = \begin{cases} Gain(i), & \text{if } i = 1 \\ DCG(i-1) + Gain(i)/\log(i), & \text{otherwise} \end{cases}$$
- Sort results by number marked highly relevant, then by relevant
- Measure distance with Kendall-Tau
- Web ranking more similar to common
- Individuals' ranking: distance 0.469
- Common ranking: distance 0.445
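A minimal sketch of the three ingredients on this slide: the DCG recurrence, a common ranking built by sorting on the number of "highly relevant" and then "relevant" votes, and a normalized Kendall-Tau distance between two rankings. The slide does not state the logarithm base or the normalization, so base 2 and "fraction of discordant pairs" are assumptions. All function names and the toy ratings are hypothetical.

```python
import math
from itertools import combinations

def dcg(gains, base=2):
    """DCG(1) = Gain(1); DCG(i) = DCG(i-1) + Gain(i)/log(i) for i > 1."""
    total = 0.0
    for i, gain in enumerate(gains, start=1):
        total += gain if i == 1 else gain / math.log(i, base)
    return total

def common_ranking(ratings):
    """Sort results by count of 'highly relevant' votes, then by 'relevant' votes."""
    return sorted(
        ratings,
        key=lambda url: (-ratings[url].count("highly relevant"),
                         -ratings[url].count("relevant")),
    )

def kendall_tau_distance(rank_a, rank_b):
    """Fraction of item pairs that the two rankings order differently (0 to 1)."""
    pos_b = {item: i for i, item in enumerate(rank_b)}
    pairs = list(combinations(rank_a, 2))  # (x, y) with x ahead of y in rank_a
    discordant = sum(1 for x, y in pairs if pos_b[x] > pos_b[y])
    return discordant / len(pairs)

# Toy data (assumed): three results, two raters each.
ratings = {
    "a": ["relevant", "irrelevant"],
    "b": ["highly relevant", "relevant"],
    "c": ["highly relevant", "irrelevant"],
}
group = common_ranking(ratings)
print(group, kendall_tau_distance(["a", "b", "c"], group), dcg([2, 1, 0, 1]))
```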