Title: Fine-tuning Ranking Models
1. Fine-tuning Ranking Models
- a two-step optimization approach
Vitor, Jan 29, 2008, Text Learning Meeting - CMU
With invaluable ideas from .
2. Motivation
- Rank, Rank, Rank
- Web retrieval, movie recommendation, NFL draft, etc.
- Einat's contextual search
- Richard's set expansion (SEAL)
- Andy's context-sensitive spelling correction algorithm
- Selecting seeds in Frank's political blog classification algorithm
- Ramnath's Thunderbird extension for:
- Email leak prediction
- Email recipient suggestion
3. Help your brothers!
- Try Cut Once!, our Thunderbird extension
- Works well with Gmail accounts
- It's working reasonably well
- We need feedback.
4. Thunderbird plug-in
- Leak warnings: hit "x" to remove a recipient
- Suggestions: hit to add
- Pause or cancel sending of a message
- Email recipient recommendation
- Timer: message is sent after 10 sec by default
- Classifiers/rankers written in JavaScript
5. Email Recipient Recommendation
36 Enron users
6. Email Recipient Recommendation
Threaded
Carvalho & Cohen, ECIR-08
7. Aggregating Rankings
(Aslam & Montague, 2001; Ogilvie & Callan, 2003; Macdonald & Ounis, 2006)
- Many data fusion methods
- Two types:
- Normalized scores: CombSUM, CombMNZ, etc.
- Unnormalized scores: BordaCount, Reciprocal Rank Sum, etc.
- Reciprocal Rank: the sum of the inverse of the rank of the document in each ranking.
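The Reciprocal Rank Sum idea above can be sketched in a few lines (a minimal illustration, not the plug-in's actual JavaScript implementation; the item names are made up):

```python
def reciprocal_rank_sum(rankings):
    """Aggregate ranked lists by summing 1/rank of each item.

    rankings: a list of ranked lists, best item first. Items absent
    from a ranking contribute nothing. Returns items sorted by the
    aggregated score, best first.
    """
    scores = {}
    for ranking in rankings:
        for pos, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / pos
    return sorted(scores, key=scores.get, reverse=True)

# Two rankers disagree, but "b" is near the top of both:
# b scores 1/2 + 1/1 = 1.5, a scores 1/1 + 1/3 ~ 1.33, c scores ~ 0.83.
print(reciprocal_rank_sum([["a", "b", "c"], ["b", "c", "a"]]))  # ['b', 'a', 'c']
```

Because only ranks are used, no score normalization across rankers is needed, which is why this method sits in the "unnormalized scores" family.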
8. Aggregated Ranking Results
Carvalho & Cohen, ECIR-08
9. Intelligent Email Auto-completion
TOCCBCC
CCBCC
10. Carvalho & Cohen, ECIR-08
11. Can we do better?
- Not by using other features, but by better ranking methods
- Machine learning to improve ranking: learning to rank
- Many (recent) methods:
- ListNet, Perceptrons, RankSVM, RankBoost, AdaRank, Genetic Programming, Ordinal Regression, etc.
- Mostly supervised
- Generally small training sets
- Learning-to-rank workshop at SIGIR-07 (Einat was on the PC)
12. Pairwise-based Ranking
Goal: induce a ranking function f(d) such that, for query q, the ranked documents d1, d2, d3, ..., dT satisfy f(d1) > f(d2) > ... > f(dT).
We assume a linear function f(d) = w · d.
Therefore, the constraints are: w · (di - dj) > 0 for every pair with i < j.
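The pairwise constraints just described (for a linear f(d) = w · d, each preferred pair di-above-dj yields w · (di - dj) > 0) can be checked on a toy example; the feature vectors and the weight vector below are made-up assumptions:

```python
import numpy as np

# Hypothetical feature vectors for documents ranked d1 > d2 > d3 for a query.
d = [np.array([2.0, 1.0]), np.array([1.0, 1.0]), np.array([0.0, 2.0])]

# Each preferred pair (i ranked above j) yields one linear constraint
# on the weight vector: w . (d[i] - d[j]) > 0.
pairs = [(i, j) for i in range(len(d)) for j in range(i + 1, len(d))]

w = np.array([1.0, 0.5])  # an example weight vector (an assumption)
satisfied = [bool(w @ (d[i] - d[j]) > 0) for i, j in pairs]
print(satisfied)  # this w happens to satisfy all three pairwise constraints
```

Learning a ranker then amounts to finding a w that satisfies as many of these pairwise constraints as possible on the training data.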
13. Ranking with Perceptrons
- Nice convergence properties and mistake bounds
- A bound on the number of mistakes/misranks
- Fast and scalable
- Many variants (Collins, 2002; Gao et al., 2005; Elsas et al., 2008)
- Voting, averaging, committee, pocket, etc.
- General update rule: on a misranked pair (di, dj), set w = w + (di - dj)
- Here: averaged version of the perceptron
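A minimal sketch of an averaged ranking perceptron under the update rule above (illustrative only; the preference pairs and epoch count are made-up assumptions, not the talk's setup):

```python
import numpy as np

def rank_perceptron(pairs, dim, epochs=10):
    """Averaged perceptron on pairwise ranking constraints.

    pairs: list of (x_pos, x_neg) feature vectors where x_pos should
    outrank x_neg. On a misrank (w.(x_pos - x_neg) <= 0) the update
    rule is w = w + (x_pos - x_neg). The averaged weights over all
    steps are returned, which reduces variance (Collins-style).
    """
    w = np.zeros(dim)
    w_sum = np.zeros(dim)
    n = 0
    for _ in range(epochs):
        for x_pos, x_neg in pairs:
            if w @ (x_pos - x_neg) <= 0:   # misrank: apply the update rule
                w = w + (x_pos - x_neg)
            w_sum += w                     # accumulate for averaging
            n += 1
    return w_sum / n

# Two hypothetical preference pairs (preferred vector listed first).
pairs = [(np.array([2.0, 1.0]), np.array([1.0, 1.0])),
         (np.array([1.0, 1.0]), np.array([0.0, 2.0]))]
w = rank_perceptron(pairs, dim=2)
```

After training, w should score each preferred vector above its counterpart, i.e. w · (x_pos - x_neg) > 0 for every training pair.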
14. Rank SVM
(Joachims, KDD-02; Herbrich et al., 2000)
- Equivalent to a classification SVM trained on the pairwise difference vectors (di - dj)
- Equivalent to maximizing AUC
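A sketch of the RankSVM objective as subgradient descent on the pairwise hinge loss (an illustration of the idea, not Joachims' SVMlight implementation; C, the learning rate, the epoch count, and the example pairs are all assumptions):

```python
import numpy as np

def rank_svm(pairs, dim, C=1.0, lr=0.01, epochs=200):
    """Subgradient descent on the RankSVM objective:

        0.5 * ||w||^2  +  C * sum_i max(0, 1 - w.(x_pos_i - x_neg_i))

    i.e. an L2-regularized SVM over pairwise difference vectors, each
    labeled positive (x_pos should outrank x_neg by a margin of 1).
    """
    w = np.zeros(dim)
    for _ in range(epochs):
        for x_pos, x_neg in pairs:
            diff = x_pos - x_neg
            if w @ diff < 1:          # margin violated: hinge is active
                grad = w - C * diff
            else:                     # only the regularizer contributes
                grad = w
            w = w - lr * grad
    return w

# The same two hypothetical preference pairs as before.
pairs = [(np.array([2.0, 1.0]), np.array([1.0, 1.0])),
         (np.array([1.0, 1.0]), np.array([0.0, 2.0]))]
w = rank_svm(pairs, dim=2)
```

The hinge loss is a convex upper bound on the 0/1 misrank indicator, which is what makes this formulation tractable (and what the sigmoid fine-tuning step later tightens).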
15-18. Loss Functions
(figure slides comparing loss functions) The 0/1 misrank loss is not convex.
19. Fine-tuning Ranking Models
- Step 1: train a base ranking model with a Base Ranker, e.g., RankSVM, Perceptron, etc.
- Step 2: fine-tune it into the final model with Sigmoid Rank
- Non-convex: minimizes a very close approximation of the number of misranks
20. Gradient Descent
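The two-step idea can be sketched as gradient descent on a sigmoid surrogate of the misrank count (a toy illustration; the steepness s, learning rate, starting weights, and example pair are assumptions, not the talk's actual settings):

```python
import numpy as np

def sigmoid_rank_finetune(pairs, w0, s=2.0, lr=0.1, epochs=500):
    """Fine-tune a base ranker's weights w0 by gradient descent on a
    sigmoid approximation of the number of misranks:

        L(w) = sum_pairs sigmoid(-s * w.(x_pos - x_neg))

    As s grows, each term approaches the 0/1 misrank indicator, so L
    is a smooth (but non-convex) surrogate for the misrank count.
    Starting from a good base model w0 mitigates the non-convexity.
    """
    w = np.array(w0, dtype=float)
    for _ in range(epochs):
        grad = np.zeros_like(w)
        for x_pos, x_neg in pairs:
            diff = x_pos - x_neg
            sig = 1.0 / (1.0 + np.exp(s * (w @ diff)))  # sigmoid(-s * w.diff)
            # d/dw sigmoid(-s * w.diff) = -s * sig * (1 - sig) * diff
            grad += -s * sig * (1.0 - sig) * diff
        w -= lr * grad
    return w

# Hypothetical base-ranker weights that initially misrank the pair below.
w0 = [-0.1, 0.1]
pairs = [(np.array([1.0, 0.0]), np.array([0.0, 1.0]))]
w = sigmoid_rank_finetune(pairs, w0)
```

After fine-tuning, w · (x_pos - x_neg) should be positive for the training pair, i.e. the misrank made by the base weights has been corrected.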
21. Results in CC prediction
36 Enron users
22. Set Expansion (SEAL) Results
(Wang & Cohen, ICDM-2007)
(ListNet: Cao et al., ICML-07)
23. Results in LETOR
24. Results in LETOR
25. Learning Curve: TOCCBCC, Enron user lokay-m
26. Learning Curve: CCBCC, Enron user campbel-m
27. Learning Curve: CCBCC, Enron user campbel-m
28. Regularization Parameter (s2)
Results on TREC3, TREC4, and Ohsumed.
29. Some Ideas
- Instead of the number of misranks, optimize other loss functions: Mean Average Precision, MRR, etc.
- Rank Term
- Some preliminary results with Sigmoid-MAP
- Does it work for classification?
30. Thanks