Title: Rank Aggregation Methods for the Web
1 Rank Aggregation Methods for the Web
2 Web Page Ranking Methods Reviewed
- PageRank: global link analysis
- Indegree: local link analysis
- HITS: topic-based link analysis
- Voting NNN and Correlation
- Graph distance from seed
- URL length and depth
- Text-based methods (e.g., tf-idf)
3 Rank Aggregation
[Figure: the rankings B D C A, A B D C F E, and B C D A F E, shown together with the list B D C A F E, captioned "Consensus ranking of all".]
4 Notations for Ranking
- Given a universe $U$, an ordered list $\tau$ of a subset $S$ of $U$ is $\tau = [x_1 \ge x_2 \ge \dots \ge x_d]$, with each $x_i \in S$
- $\tau(i)$: the position (rank) of element $i$ in $\tau$
- $|\tau|$: the number of elements in $\tau$
- Full list: $\tau$ contains all the elements of $U$
- Partial list: $\tau$ ranks only some of the elements of $U$
- Top-$d$ list: all $d$ ranked elements are above all unranked elements
- Question: when are two orderings similar? Can you give a distance measure?
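5 Measuring Distance Between Orderings
- Spearman's footrule distance
- $\sigma$, $\tau$: two full lists
- $\sigma(i)$: rank of candidate $i$
- $F(\sigma, \tau) = \sum_i |\sigma(i) - \tau(i)|$
- Kendall tau distance
- Count the number of pairwise disagreements between the two lists
- $K(\sigma, \tau) = |\{(i, j) : \sigma(i) < \sigma(j),\ \tau(i) > \tau(j)\}|$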
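The two distances can be sketched in a few lines of Python. This is a minimal illustration for full lists over the same candidates; the function names and the sample lists `sigma` and `tau` are hypothetical and not taken from the slides.

```python
from itertools import combinations

def positions(ranking):
    """Map each candidate to its 1-based position in a full list."""
    return {c: i for i, c in enumerate(ranking, start=1)}

def footrule(sigma, tau):
    """Spearman's footrule: sum of absolute position differences."""
    ps, pt = positions(sigma), positions(tau)
    return sum(abs(ps[c] - pt[c]) for c in ps)

def kendall_tau(sigma, tau):
    """Kendall tau: number of candidate pairs the two lists order differently."""
    ps, pt = positions(sigma), positions(tau)
    return sum(1 for a, b in combinations(ps, 2)
               if (ps[a] - ps[b]) * (pt[a] - pt[b]) < 0)

# Hypothetical sample lists (not the lists from the slide example).
sigma = ["A", "B", "C", "D"]
tau = ["B", "A", "D", "C"]
print(footrule(sigma, tau), kendall_tau(sigma, tau))  # prints: 4 2
```

Here every candidate is displaced by one position (footrule 4), and the pairs (A, B) and (C, D) are the two disagreements (Kendall tau 2).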
6 Example of Ordered-List Distance
- Example
- $S = \{A, B, C, D, E\}$
- $\sigma$, $\tau$: two full lists
- Spearman's footrule distance
- $F(\sigma, \tau) = 1 + 2 + 1 + 0 + 2 = 6$
- Kendall tau distance
- $K(\sigma, \tau) = |\{(A,C), (B,D), (B,E), (D,E)\}| = 4$
7 Optimal ranking aggregation
- Optimality depends on the distance measure we use.
- Optimizing with the Kendall tau distance, we obtain the Kemeny optimal aggregation.
- It can be shown to satisfy neutrality and consistency, which are important properties of rank aggregation functions.
- Useful but computationally hard: computing a Kemeny optimal aggregation is NP-hard.
- We will show that footrule-optimal aggregation is in P.
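For intuition, a Kemeny optimal aggregation can be found by exhaustive search when the number of candidates is tiny. The sketch below assumes full lists and simply tries every permutation, so it takes exponential time (consistent with NP-hardness) and is not a practical algorithm. The names `kemeny_optimal` and `voters` are hypothetical.

```python
from itertools import combinations, permutations

def kendall_tau(sigma, tau):
    """Number of candidate pairs ordered differently by the two full lists."""
    ps = {c: i for i, c in enumerate(sigma)}
    pt = {c: i for i, c in enumerate(tau)}
    return sum(1 for a, b in combinations(ps, 2)
               if (ps[a] - ps[b]) * (pt[a] - pt[b]) < 0)

def kemeny_optimal(lists):
    """Brute force: the permutation minimizing total Kendall tau distance."""
    candidates = sorted(lists[0])
    return min(permutations(candidates),
               key=lambda s: sum(kendall_tau(s, t) for t in lists))

# Hypothetical input: three voters ranking three candidates.
voters = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
best = kemeny_optimal(voters)
print(best)  # prints: ('A', 'B', 'C')
```

The ranking A B C has total distance 0 + 1 + 1 = 2 to the three lists, and every other permutation has a larger total.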
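8 Two properties relate K and F
- For any full lists $\sigma$, $\tau$:
- $K(\sigma, \tau) \le F(\sigma, \tau) \le 2K(\sigma, \tau)$
- So we get a 2-approximation to Kemeny optimality.
- Write $K(\sigma, \tau_1, \dots, \tau_k)$ for the total distance from $\sigma$ to the lists, and likewise for $F$.
- If $\sigma$ is the Kemeny optimal aggregation of full lists $\tau_1, \dots, \tau_k$ and $\sigma'$ optimizes the footrule aggregation, then
- $K(\sigma', \tau_1, \dots, \tau_k) \le F(\sigma', \tau_1, \dots, \tau_k) \le F(\sigma, \tau_1, \dots, \tau_k) \le 2K(\sigma, \tau_1, \dots, \tau_k)$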
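The pairwise inequality between K and F can be checked numerically on small cases. The sketch below checks every permutation of five elements against the identity list; this is a sanity check under that assumption, not a proof. Both distances depend only on the relative order of the two lists, so comparing against the identity covers all pairs of full lists on five elements.

```python
from itertools import combinations, permutations

def footrule(sigma, tau):
    """Sum of absolute position differences between two full lists."""
    ps = {c: i for i, c in enumerate(sigma)}
    pt = {c: i for i, c in enumerate(tau)}
    return sum(abs(ps[c] - pt[c]) for c in ps)

def kendall_tau(sigma, tau):
    """Number of candidate pairs ordered differently by the two full lists."""
    ps = {c: i for i, c in enumerate(sigma)}
    pt = {c: i for i, c in enumerate(tau)}
    return sum(1 for a, b in combinations(ps, 2)
               if (ps[a] - ps[b]) * (pt[a] - pt[b]) < 0)

identity = tuple(range(5))
holds = all(
    kendall_tau(identity, p) <= footrule(identity, p) <= 2 * kendall_tau(identity, p)
    for p in permutations(identity)
)
print(holds)  # prints: True
```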
9 Condorcet Criteria and SPAM Filters
- Condorcet Criterion
- An element of $S$ that beats every other element in pairwise simple majority voting should be ranked first.
- Extended Condorcet Criterion (XCC)
- If most voters prefer candidate $a$ to candidate $b$ (i.e., the number of $i$ such that $\tau_i(a) < \tau_i(b)$ is at least $n/2$), then $\sigma$ should also prefer $a$ to $b$ (i.e., $\sigma(a) < \sigma(b)$).
- XCC is effective in spam-fighting and thus good to use in meta-search.
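10 XCC Not always realizable
Each column is one voter's ranking, read from top to bottom.
Realizable, with $\sigma(a) < \sigma(b) < \sigma(c)$:
c b a
a a b
b c c
Not realizable (the majority prefers a to b, b to c, and c to a):
c b a
a c b
b a c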
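The two vote profiles on this slide can be checked mechanically. The sketch below reads each column of the tables as one voter's ranking (an assumption about the slide layout) and lists every full ranking that agrees with all strict pairwise majorities. The function names are hypothetical, and a strict majority (more than n/2) is assumed, which coincides with "at least n/2" for an odd number of voters.

```python
from itertools import permutations

def majority_prefers(lists, a, b):
    """True if a strict majority of the lists rank a above b."""
    wins = sum(1 for t in lists if t.index(a) < t.index(b))
    return wins > len(lists) / 2

def xcc_rankings(lists):
    """All full rankings that agree with every strict pairwise majority."""
    candidates = sorted(lists[0])
    result = []
    for perm in permutations(candidates):
        pos = {c: i for i, c in enumerate(perm)}
        if all(pos[a] < pos[b]
               for a, b in permutations(candidates, 2)
               if majority_prefers(lists, a, b)):
            result.append(perm)
    return result

# Columns of the first and second table, one list per voter.
realizable = [["c", "a", "b"], ["b", "a", "c"], ["a", "b", "c"]]
cyclic = [["c", "a", "b"], ["b", "c", "a"], ["a", "b", "c"]]

print(xcc_rankings(realizable))  # prints: [('a', 'b', 'c')]
print(xcc_rankings(cyclic))      # prints: []
```

In the second profile the pairwise majorities form a cycle, so no full ranking can respect all of them.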
11 Voting Theory: Desired Properties
- Given a set of candidates and voter preferences, seek an algorithm that ranks the candidates and satisfies a set of desired properties.
- Which combinations of properties are realizable?
- 1) Independence from Irrelevant Alternatives
- The relative order of $a$ and $b$ in $\sigma$ should depend only on the relative order of $a$ and $b$ in $\tau_1, \dots, \tau_n$.
- Example: if $\tau_i = (a\ b\ c)$ changes to $(a\ c\ b)$, the relative order of $a$ and $b$ in $\sigma$ should not change.
12 Desired Properties
- 2) Neutrality
- No candidate should be favored over the others.
- If two candidates switch positions in $\tau_1, \dots, \tau_n$, they should also switch positions in $\sigma$.
- 3) Anonymity
- No voter should be favored over the others.
- If two voters switch their orderings, $\sigma$ should remain the same.
13 Desired Properties
- 4) Monotonicity
- If a voter improves the ranking of a candidate, that candidate's ranking in $\sigma$ can only improve.
- 5) Consistency
- If the voters are split into two disjoint sets, $S$ and $T$, and both the aggregation of the voters in $S$ and the aggregation of the voters in $T$ prefer $a$ to $b$, then the aggregation of all voters should also prefer $a$ to $b$.
14 Desired Properties
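- 6) No Dictatorship: $f(\tau_1, \dots, \tau_n) \ne \tau_i$ (no single voter $i$ always determines the outcome)
- 7) Unanimity (a.k.a. Pareto optimality)
- If all voters prefer candidate $a$ to candidate $b$ (i.e., $\tau_i(a) < \tau_i(b)$ for all $i$), then $\sigma$ should also prefer $a$ to $b$ (i.e., $\sigma(a) < \sigma(b)$).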
15 Desired Properties
- 8) Democracy: satisfies the extended Condorcet criterion (XCC).
- Always works for $m = 2$.
- Not always realizable for $m \ge 3$.
- Theorem [May, 1952]: For $m = 2$, Democracy is the only rank aggregation function that is monotone, neutral, and anonymous.
16 Arrow's Impossibility Theorem [Arrow, 1951]
- Theorem: If $m \ge 3$, then the only rank aggregation function that is unanimous and independent from irrelevant alternatives is dictatorship.
- Won the Nobel prize (1972)
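17 Borda's method
- Easy and intuitive
- Several score-based variants
- 1781
- Violates independence from irrelevant alternatives
- $B_i(c)$ = the number of candidates ranked below $c$ in $\tau_i$
- $B(c) = \sum_i B_i(c)$; candidates are sorted in decreasing order of $B(c)$
[Figure: example Borda scores $B_i(C8)$ for candidate C8; values shown: 1, 2, 0, and 13.]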
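Borda's method as defined on this slide can be sketched as follows, assuming full lists. The function name `borda`, the alphabetical tie-break, and the sample `voters` are hypothetical choices made for the illustration.

```python
def borda(lists):
    """Borda scores: each voter gives a candidate one point per candidate ranked below it."""
    scores = {}
    for t in lists:
        n = len(t)
        for pos, c in enumerate(t):
            scores[c] = scores.get(c, 0) + (n - 1 - pos)
    # Sort by decreasing score; ties are broken alphabetically (an arbitrary choice).
    ranking = sorted(scores, key=lambda c: (-scores[c], c))
    return ranking, scores

# Hypothetical input: three voters ranking three candidates.
voters = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
ranking, scores = borda(voters)
print(ranking, scores)  # prints: ['A', 'B', 'C'] {'A': 5, 'B': 3, 'C': 1}
```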
18 Partial lists
- Handle partial lists by dividing the excess scores equally among all unranked candidates.
- Example: number of candidates: 100
- Number of ranked candidates: 70 (scores 31 to 100)
- => Assign score (1 + 2 + ... + 30)/30 = 31/2 to each of the 30 unranked candidates
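A sketch of this rule, assuming the scoring convention suggested by the example: with n candidates, the top-ranked candidate gets n, the next gets n - 1, and so on, and the unused low scores are averaged over the unranked candidates. The function name `borda_partial` and the candidate labels are hypothetical.

```python
def borda_partial(universe, lists):
    """Borda scores for partial lists; unranked candidates share the unused scores equally."""
    n = len(universe)
    scores = {c: 0.0 for c in universe}
    for t in lists:
        ranked = set(t)
        for pos, c in enumerate(t):
            scores[c] += n - pos            # top gets n, next gets n - 1, ...
        unranked = [c for c in universe if c not in ranked]
        if unranked:
            leftover = len(unranked)        # unused scores are 1..leftover
            share = (leftover + 1) / 2      # their average
            for c in unranked:
                scores[c] += share
    return scores

# Mirror the slide's example: 100 candidates, one list ranking 70 of them.
universe = [f"c{i}" for i in range(100)]
scores = borda_partial(universe, [universe[:70]])
print(scores["c0"], scores["c69"], scores["c70"])  # prints: 100.0 31.0 15.5
```

Under this convention the scores still sum to 1 + 2 + ... + 100 = 5050, so a partial list carries the same total weight as a full one.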
19 Footrule optimal aggregation
- Footrule optimal aggregation can be computed in polynomial time.
- It is a good approximation of Kemeny optimal aggregation.
- Proof: via minimum-cost perfect matching.
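The matching formulation can be sketched as follows, assuming full lists: build a bipartite graph between candidates and positions, where placing candidate c at position p costs the sum over voters of |τ_i(c) - p|, and take a minimum-cost perfect matching. The solver below is a standard O(n^3) Hungarian-style assignment routine written out for self-containment. It is one possible choice, not necessarily the one the slides had in mind, and all function names are hypothetical.

```python
def hungarian(cost):
    """Minimum-cost perfect matching on a square cost matrix.
    Returns assign, where assign[row] is the column matched to that row."""
    n = len(cost)
    INF = float("inf")
    u = [0] * (n + 1)          # row potentials (1-based)
    v = [0] * (n + 1)          # column potentials (1-based)
    p = [0] * (n + 1)          # p[j] = row matched to column j, 0 if none
    way = [0] * (n + 1)        # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [INF] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = INF
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:              # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assign = [0] * n
    for j in range(1, n + 1):
        assign[p[j] - 1] = j - 1
    return assign

def footrule_aggregate(lists):
    """Footrule-optimal full ranking via candidate-to-position matching."""
    candidates = sorted(lists[0])
    n = len(candidates)
    pos = [{c: i for i, c in enumerate(t)} for t in lists]
    # cost[c][slot] = total displacement if candidate c is placed at slot.
    cost = [[sum(abs(p[c] - slot) for p in pos) for slot in range(n)]
            for c in candidates]
    result = [None] * n
    for row, slot in enumerate(hungarian(cost)):
        result[slot] = candidates[row]
    return result

# Hypothetical input: three voters ranking three candidates.
voters = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(footrule_aggregate(voters))  # prints: ['A', 'B', 'C']
```

For this input the matching A to slot 0, B to slot 1, C to slot 2 has total cost 1 + 2 + 1 = 4, which is the unique minimum.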
20 Markov Chain method for rank aggregation
- States = candidates
- Transitions depend on the preference orders given by voters
- Basic idea: probabilistically switch to a better candidate
- Rank candidates based on stationary probabilities!
21 Markov chain advantages
- Handles partial lists and top-$d$ lists by using available comparisons to infer new ones
- Handles uneven comparisons and list lengths
- Computational efficiency
- $O(NK)$ preprocessing, $O(K)$ per step, for about $O(N)$ steps
22 Four ways to build the transition matrix
- Current state is candidate $a$.
- MC1: Choose uniformly from the multiset of all candidates that were ranked at least as high as $a$ by some voter.
- Probability to stay at $a$ ~ average rank of $a$.
- MC2: Choose a voter $i$ uniformly at random, and pick uniformly at random from among the candidates that the $i$-th voter ranked at least as high as $a$.
- MC3: Choose a voter $i$ uniformly at random, and pick a candidate $b$ uniformly at random. If the $i$-th voter ranked $b$ higher than $a$, go to $b$. Otherwise, stay at $a$.
- MC4: Choose a candidate $b$ uniformly at random. If most voters ranked $b$ higher than $a$, go to $b$. Otherwise, stay at $a$.
- Rank of $a$ ~ number of pairwise contests $a$ wins.
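As an illustration, MC4 can be sketched for full lists as below. Two choices here are assumptions rather than part of the slides: "most voters" is read as a strict majority, and a small uniform restart (`damping` below 1) is mixed in so that the power iteration converges to a unique stationary distribution. All names are hypothetical.

```python
def mc4_matrix(lists):
    """MC4 transition matrix: from a, pick b uniformly; move if a majority ranks b above a."""
    candidates = sorted(lists[0])
    n = len(candidates)
    pos = [{c: i for i, c in enumerate(t)} for t in lists]
    P = [[0.0] * n for _ in range(n)]
    for i, a in enumerate(candidates):
        for j, b in enumerate(candidates):
            if i != j:
                b_wins = sum(1 for p in pos if p[b] < p[a])
                if b_wins > len(lists) / 2:
                    P[i][j] = 1.0 / n
        P[i][i] = 1.0 - sum(P[i])   # remaining probability: stay at a
    return candidates, P

def stationary(P, damping=0.95, iters=500):
    """Power iteration with a uniform restart (the restart is an added assumption)."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1 - damping) / n] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += damping * x[i] * P[i][j]
        x = nxt
    return x

def mc4_rank(lists):
    """Rank candidates by decreasing stationary probability."""
    candidates, P = mc4_matrix(lists)
    pi = stationary(P)
    return sorted(candidates, key=lambda c: -pi[candidates.index(c)])

# Hypothetical input: three voters ranking three candidates.
voters = [["A", "B", "C"], ["A", "C", "B"], ["B", "A", "C"]]
print(mc4_rank(voters))  # prints: ['A', 'B', 'C']
```

In this example A beats both other candidates by majority, so the chain never leaves A except through the restart, and A collects most of the stationary mass.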
23 A locally Kemeny optimal aggregation is a relaxation of Kemeny optimality
- A locally Kemeny optimal aggregation satisfies the extended Condorcet property and can be computed in $O(kn \log n)$ time (worst case, $O(n^2)$).
- Many existing aggregation methods do not satisfy the extended Condorcet criterion (XCC).
- => Given $\tau_1, \dots, \tau_k$, use your favorite aggregation method to obtain a full list $\mu$, and then apply local Kemenization to $\mu$ with respect to $\tau_1, \dots, \tau_k$.
24 Local Kemenization is a procedure to get a locally Kemeny optimal aggregation
- A local Kemenization of a full list $\mu$ with respect to $\tau_1, \dots, \tau_k$: compute a locally Kemeny optimal aggregation of $\tau_1, \dots, \tau_k$ that is maximally consistent with $\mu$.
- This approach
- (1) preserves the strengths of the initial aggregation $\mu$.
- (2) ranks non-spam above spam.
- (3) gives a result that disagrees with $\mu$ on any pair $(i, j)$ only if a majority of the $\tau$s endorse this disagreement.
- (4) for every $d$, $1 \le d \le |\mu|$, the restriction of the output to its top $d$ elements is a local Kemenization of the top $d$ elements of $\mu$.
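25 How do we perform local Kemenization?
- Local Kemenization example
Input lists:
$\tau_1$: A B F E C D
$\tau_2$: B C A E F D
$\tau_3$: A C F D E B
$\tau_4$: B F D C A E
$\tau_5$: C A B F E D
Initial aggregation $\mu$: B A D C E F
Steps:
B
B A
A B (the lists disagree with $\mu$: A > B in 3 lists, A < B in 2)
A B D (B > D in 4 lists, B < D in 1)
A B D C
A B C D
Result: A B C F E D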
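The example can be reproduced with a short insertion procedure: take the elements of µ in order, append each to the bottom of the current list, and move it up while a majority of the input lists prefer it to the element just above it. The sketch assumes full input lists and reads the first five lists on the slide as the inputs and B A D C E F as µ, an interpretation supported by the vote counts shown. The function names are hypothetical.

```python
def majority_prefers(lists, a, b):
    """True if a strict majority of the lists rank a above b."""
    wins = sum(1 for t in lists if t.index(a) < t.index(b))
    return wins > len(lists) / 2

def local_kemenization(mu, lists):
    """Insert the elements of mu one at a time, bubbling each up past
    predecessors that a majority of the lists rank below it."""
    result = []
    for x in mu:
        result.append(x)
        i = len(result) - 1
        while i > 0 and majority_prefers(lists, x, result[i - 1]):
            result[i - 1], result[i] = result[i], result[i - 1]
            i -= 1
    return result

taus = [
    list("ABFECD"),
    list("BCAEFD"),
    list("ACFDEB"),
    list("BFDCAE"),
    list("CABFED"),
]
mu = list("BADCEF")
print("".join(local_kemenization(mu, taus)))  # prints: ABCFED
```

The output matches the final list on the slide: A moves above B (3 votes to 2), D stays below B (1 vote to 4), C moves above D, E moves above D, and F moves above both D and E.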
26 Experiments: meta-search
Legend: K = Kendall distance; SF = scaled footrule distance; IF = induced footrule distance; LK = local Kemenization