Title: Collaborative Filtering: Searching and Retrieving Web Information Together
1 Collaborative Filtering
Searching and Retrieving Web Information Together
Huimin Lu
December 2, 2004
INF 385D, Fall 2004
Instructor: Don Turnbull
2 Outline
- Introduction
- Collaborative Search Family
- Collaborative Filtering
- Systems
- Process
- Algorithm
- Problems and Solutions
- Privacy
3 Collaborative Search in the IR World
- Inverted Index
- Yellow-pages-like information gateway
- Internet search engine (Sun, 1999)
- Needs for collaborative retrieval
- Information-resources-focused systems
- - By CSCW structuring mechanisms
- - Recommendation techniques
- User-preferences-focused systems
4 Collaborative Search Types
- Collaborative browsing
- Mediated searching
- Collaborative information filtering
- Collaborative agents
- - Meta-search engines
- Collaborative re-use of results
- (Setten, 2000)
5 Collaborative Filtering
- User-based filtering
- Collects taste information from users who choose to collaborate in the search process, then automatically predicts or filters the information relevant to each user (Wikipedia, 2004).
- Store profile preferences
- Build the users database
- Produce the recommended list with the collaborative filter
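The three steps listed above (store preferences, build a users database, produce a recommended list) can be sketched roughly as follows. This is an illustrative toy, not the method of any system named in these slides: the names `profiles`, `store_preference`, and `recommend` are hypothetical, the 1-5 rating scale is an assumption, and "users who share at least one rated item" is a deliberately crude stand-in for a real similarity measure.

```python
from collections import defaultdict

# Hypothetical in-memory "users database": user -> {item: rating}.
# A 1-5 rating scale is assumed for illustration.
profiles = defaultdict(dict)

def store_preference(user, item, rating):
    """Record one rating in the user's profile."""
    profiles[user][item] = rating

def recommend(user, top_n=3):
    """Rank items the user has not rated by the mean rating given by
    users who share at least one rated item with them."""
    seen = profiles[user]
    totals, counts = defaultdict(float), defaultdict(int)
    for other, ratings in profiles.items():
        # Skip the user themselves and users with no overlapping items.
        if other == user or not set(ratings) & set(seen):
            continue
        for item, rating in ratings.items():
            if item not in seen:
                totals[item] += rating
                counts[item] += 1
    scores = {item: totals[item] / counts[item] for item in totals}
    # Highest mean rating first; ties broken by item name.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]
```

In use, one would call `store_preference` for each rating as it arrives and then call `recommend("ann")` to get the unseen items ranked by what overlapping users thought of them.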
6 Collaborative Filtering Systems
- Commercial
- - Amazon
- - Barnes and Noble
- - Netflix
- Non-commercial
- - Moonranker
- - MovieLens
- - AmphetaRate
- - Audioscrobbler
- - Findory
- - Gnomoradio
- - iRATE radio
7 System Example I: Amazon.com recommendation page
8 System Example II: Moonranker.com ranking page
9 System Example III: Movielens.com rating page
10 Collaborative Filtering Process
11 Collaborative Filtering Algorithm
- Goal
- - Suggest new items or predict the utility of an item for a user, based on the user's previous likings (Sarwar, 2001)
- Memory-based
- - Use the entire user-item database
- - Pearson-correlation-based approach, vector-similarity-based approach, the extended generalized vector space model
- Model-based
- - Develop a model of user ratings
- - Bayesian network approach, the aspect model
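As a rough illustration of the memory-based, Pearson-correlation approach named above, the sketch below predicts a rating from the whole user-item table. It assumes ratings are held in a plain dictionary (`user -> {item: rating}`); the function names `pearson` and `predict` are hypothetical, and the mean-offset weighting shown is one common textbook formulation rather than the exact scheme of any cited paper.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two users over their co-rated items."""
    common = sorted(set(a) & set(b))
    if len(common) < 2:
        return 0.0  # not enough overlap to correlate
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in common)
               * sum((b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(ratings, user, item):
    """Predicted rating = the user's own mean rating plus the
    correlation-weighted average of how far each neighbour's rating
    of `item` sits from that neighbour's own mean."""
    target = ratings[user]
    mean_u = sum(target.values()) / len(target)
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        w = pearson(target, r)
        mean_o = sum(r.values()) / len(r)
        num += w * (r[item] - mean_o)
        den += abs(w)
    # Fall back to the user's mean when no neighbour carries any weight.
    return mean_u + num / den if den else mean_u
```

Because every prediction scans all users, the cost grows with the size of the user-item table, which is the scalability concern usually raised against memory-based methods.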
12 Problems and Solutions
- Memory-based algorithm problems
- - Sparsity: insufficient user rating information
- - Scalability: nearest-neighbor algorithm (computation grows with the number of users and the number of items)
- - Solution: automatic weighting scheme by MSU and CMU
- Model-based algorithm problem
- - Inherent static structure: updating problem; problem of learning the exact cluster number and specifying user classes
- Systems problems
- - Scarcity: few ratings for some items
- - Early-rater: no recommendations for new items
- - Solution: collaborative information filtering (communicating agents, correlating profiles, and filterbots, i.e. automated rating robots)
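To make the sparsity and early-rater problems concrete, the sketch below measures how empty a rating table is and then adds a filterbot, here taken to mean a pseudo-user that rates every item automatically from item metadata. The `catalog` structure, the `words` field, the function names, and the length-based rule are all assumptions made for illustration; the slides do not specify how the filterbots rate.

```python
def sparsity(ratings, items):
    """Fraction of user-item cells that hold no rating."""
    filled = sum(len(r) for r in ratings.values())
    return 1.0 - filled / (len(ratings) * len(items))

def add_filterbot(ratings, catalog, name, rule):
    """Add a pseudo-user `name` that rates every catalog item by
    applying `rule` to the item's metadata (hypothetical design)."""
    ratings[name] = {item: rule(meta) for item, meta in catalog.items()}

# Hypothetical catalog: item -> metadata. Item "C" is new and unrated.
catalog = {"A": {"words": 300}, "B": {"words": 1500}, "C": {"words": 900}}
ratings = {"ann": {"A": 4}, "bob": {"A": 5, "B": 2}}

before = sparsity(ratings, catalog)

# Assumed rule: shorter articles get a high rating, longer ones a low one.
add_filterbot(ratings, catalog, "lengthbot",
              lambda meta: 5 if meta["words"] < 1000 else 2)

after = sparsity(ratings, catalog)
```

After the bot is added, the new item "C" has at least one rating, so a correlation-based recommender has something to work with before any human rates it.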
13 Privacy
- Unsafe server-based system
- Monopolies
- Peer-to-peer architecture
- - Multi-party computation
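One simple flavor of multi-party computation is additive secret sharing, sketched below: peers learn the total of their ratings for an item without any peer or server seeing an individual rating. This is a toy, single-process simulation under stated assumptions (honest peers, an arbitrary `MODULUS`, hypothetical function names) and is not presented as the protocol used by any particular peer-to-peer recommender.

```python
import secrets

# Assumed modulus; it only needs to exceed the largest possible sum.
MODULUS = 2**61 - 1

def split_into_shares(value, n_parties):
    """Split `value` into n random-looking shares that add up to
    `value` modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares

def secure_sum(private_values):
    """Simulate the exchange: each peer sends one share to every peer,
    and each peer publishes only the sum of the shares it received."""
    n = len(private_values)
    all_shares = [split_into_shares(v, n) for v in private_values]
    partial_sums = [
        sum(all_shares[sender][receiver] for sender in range(n)) % MODULUS
        for receiver in range(n)
    ]
    # Adding the published partial sums reveals only the overall total.
    return sum(partial_sums) % MODULUS
```

Dividing the result of `secure_sum` by the number of peers gives an average rating, which is the kind of aggregate a recommender needs, while each individual share on its own looks like a random number.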
14 Conclusion
The computing environment is becoming more ubiquitous and pervasive. To meet IR users' needs, future collaborative filtering systems should be easy to maintain, with well-designed algorithms and strongly protected user privacy.
15 References
19 Questions or Comments?