Title: Strategy-Proof Classification
1. Strategy-Proof Classification
- Reshef Meir
- School of Computer Science and Engineering, Hebrew University
- Joint work with Ariel D. Procaccia and Jeffrey S. Rosenschein
2. Strategy-Proof Classification (Outline)
- An example of strategic labels in classification
- Motivation
- Our model
- Previous work (positive results)
- An impossibility theorem
- More results (if there is time)
(12 minutes)
3. Strategic labeling: an example
(Figure: the ERM classifier on the joint dataset makes 5 errors)
5. Strategic labeling: an example (cont.)
- "If I only change my labels..."
(Figure: after the manipulation, the ERM makes 6 errors)
6. Classification
- The supervised classification problem:
  - Input: a set of labeled data points (x_i, y_i), i = 1..m
  - Output: a classifier c from some predefined concept class C (functions of the form f: X → {+, −})
- We usually want c not just to classify the sample correctly, but to generalize well, i.e. to minimize R(c), the expected number of errors w.r.t. the distribution D:
  R(c) = E_(x,y)~D [ c(x) ≠ y ]
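The empirical counterpart of this objective can be made concrete in a few lines. A minimal sketch (the toy threshold class and point values are illustrative, not from the talk):

```python
# Sketch: empirical risk and ERM over a small finite concept class.
# Concepts are arbitrary functions X -> {+1, -1}.

def empirical_risk(c, sample):
    """Fraction of labeled points (x, y) that c misclassifies."""
    return sum(1 for x, y in sample if c(x) != y) / len(sample)

def erm(concepts, sample):
    """Return the concept in the class with minimal empirical risk."""
    return min(concepts, key=lambda c: empirical_risk(c, sample))

# Toy 1-D example: threshold classifiers c_t(x) = +1 iff x >= t.
thresholds = [0.0, 0.5, 1.0]
concepts = [lambda x, t=t: 1 if x >= t else -1 for t in thresholds]
sample = [(0.2, -1), (0.6, 1), (0.8, 1), (0.4, -1)]
best = erm(concepts, sample)   # the threshold-0.5 concept fits perfectly
```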
7. Classification (cont.)
- A common approach is to return the ERM, i.e. the concept in C that is best w.r.t. the given samples (has the lowest number of errors)
- Generalizes well under some assumptions on the concept class C
- With multiple experts, we can't trust our ERM!
8. Where do we find experts with incentives?
- Example 1: a firm learning purchase patterns
  - Information gathered from local retailers
  - The resulting policy affects them: "the best policy is the policy that fits my pattern"
9. Example 2: Internet polls / expert systems
(Diagram: Users → Reported Dataset → Classification Algorithm → Classifier)
10. Related work
- A study of SP mechanisms in regression learning:
  - O. Dekel, F. Fischer and A. D. Procaccia, Incentive Compatible Regression Learning, SODA 2008
- No SP mechanisms for clustering:
  - J. Perote-Peña and J. Perote, The impossibility of strategy-proof clustering, Economics Bulletin, 2003
11. A problem instance is defined by
- A set of agents I = {1, ..., n}
- A partial dataset for each agent i ∈ I: X_i = {x_i1, ..., x_i,m(i)} ⊆ X
- For each x_ik ∈ X_i, agent i has a label y_ik ∈ {+, −}
- Each pair s_ik = (x_ik, y_ik) is an example
- All the examples of a single agent compose the labeled dataset S_i = {s_i1, ..., s_i,m(i)}
- The joint dataset S = (S_1, S_2, ..., S_n) is our input; m = |S|
- We denote the dataset with the reported labels by S′
12. Input example
X_1 ∈ X^{m_1},  Y_1 ∈ {−, +}^{m_1}
X_2 ∈ X^{m_2},  Y_2 ∈ {−, +}^{m_2}
X_3 ∈ X^{m_3},  Y_3 ∈ {−, +}^{m_3}
S = (S_1, S_2, ..., S_n) = ((X_1, Y_1), ..., (X_n, Y_n))
13. Incentives and mechanisms
- A mechanism M receives a labeled dataset S and outputs c ∈ C
- Private risk of agent i: R_i(c, S) = |{k : c(x_ik) ≠ y_ik}| / m_i
- Global risk: R(c, S) = |{(i, k) : c(x_ik) ≠ y_ik}| / m
- We allow non-deterministic mechanisms:
  - The outcome is a random variable
  - We measure the expected risk
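The two risk measures above differ only in which points are counted. A minimal sketch of both definitions (the fixed classifier and data are illustrative):

```python
# Sketch of the risk definitions: S is a list of per-agent labeled datasets,
# S[i] = [(x, y), ...]; c is a classifier X -> {+1, -1}.

def private_risk(i, c, S):
    """R_i(c, S): fraction of agent i's own points that c misclassifies."""
    return sum(1 for x, y in S[i] if c(x) != y) / len(S[i])

def global_risk(c, S):
    """R(c, S): fraction of all m points (over all agents) misclassified."""
    m = sum(len(Si) for Si in S)
    return sum(1 for Si in S for x, y in Si if c(x) != y) / m

c = lambda x: 1 if x >= 0 else -1          # a fixed threshold classifier
S = [[(-1, -1), (2, 1)],                   # agent 0: classified perfectly
     [(1, -1), (3, -1)]]                   # agent 1: both points misclassified
```

Note that an agent can have zero private risk while the global risk is high, which is exactly the tension the mechanism must manage.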
14. ERM
- We compare the outcome of M to the ERM:
  - c* = ERM(S) = argmin_{c ∈ C} R(c, S)
  - r* = R(c*, S)
Can our mechanism simply compute and return the ERM?
15. Requirements
- Good approximation:
  - ∀S: R(M(S), S) ≤ β·r*
- Strategy-proofness (SP):
  - ∀i, S, S_i′: R_i(M(S_{-i}, S_i′), S) ≥ R_i(M(S), S)
- ERM(S) is 1-approximating but not SP
- ERM(S_1) is SP but gives a bad approximation
Are there any mechanisms that guarantee both SP and good approximation?
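The claim that ERM is not SP can be checked mechanically on a tiny instance: enumerate every possible label report of one agent and see whether a lie lowers its true private risk. A minimal sketch (the two constant concepts and the specific datasets are an illustrative construction, not the talk's example):

```python
from itertools import product

# Sketch showing ERM is not strategy-proof: agent 1 gains by exaggerating.
# Concept class: the two constant classifiers (everything +1 / everything -1).
c_plus = lambda x: 1
c_minus = lambda x: -1
concepts = [c_plus, c_minus]

def erm(S):
    """ERM on the joint reported dataset S (list of per-agent datasets)."""
    points = [p for Si in S for p in Si]
    return min(concepts, key=lambda c: sum(c(x) != y for x, y in points))

def private_risk(c, Si):
    return sum(c(x) != y for x, y in Si) / len(Si)

true_S1 = [(1, 1), (2, 1), (3, 1), (4, -1)]    # agent 1: mostly +
S2      = [(1, -1), (2, -1), (3, -1)]          # agent 2: all -

honest = private_risk(erm([true_S1, S2]), true_S1)   # risk when truthful

# Enumerate all of agent 1's possible label reports; risk is always measured
# against agent 1's TRUE labels.
best_lie = min(
    private_risk(erm([[(x, y) for (x, _), y in zip(true_S1, ys)], S2]), true_S1)
    for ys in product([1, -1], repeat=len(true_S1))
)
```

Truthfully, the all-minus concept wins (3 errors vs. 4) and agent 1 suffers risk 3/4; by flipping its single minus label, agent 1 makes the all-plus concept win and drops its true risk to 1/4.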
16. Restricted settings
- A very small concept class: |C| = 2
  - There is a deterministic SP mechanism that obtains a 3-approximation ratio
  - This bound is tight
  - Randomization can improve the bound to 2
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification under Constant Hypotheses: A Tale of Two Functions, AAAI 2008
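The slides do not spell out the AAAI-08 mechanism, but the reason |C| = 2 is tractable can be illustrated with a simpler rule: let each agent cast one vote for the concept with lower private risk on its own data, and return the majority winner. With only two alternatives, voting truthfully is a dominant strategy, so this rule is SP. This is only a sketch in the same spirit; it is not the paper's 3-approximation mechanism:

```python
# Hedged sketch: a strategy-proof rule for a two-concept class.
# Each agent's report only determines its own vote, and with two candidates
# a truthful vote is dominant, so no label misreport can help.

def majority_vote(S, c0, c1):
    """Return whichever of c0, c1 is preferred by a majority of agents."""
    def prefers_c1(Si):
        e0 = sum(c0(x) != y for x, y in Si)   # agent's errors under c0
        e1 = sum(c1(x) != y for x, y in Si)   # agent's errors under c1
        return e1 < e0
    votes = sum(prefers_c1(Si) for Si in S)
    return c1 if votes > len(S) / 2 else c0

c_plus, c_minus = (lambda x: 1), (lambda x: -1)
S = [[(0, 1), (1, 1)],        # agent 0 prefers c_plus
     [(0, -1)],               # agent 1 prefers c_minus
     [(2, 1)]]                # agent 2 prefers c_plus
winner = majority_vote(S, c_minus, c_plus)   # two of three prefer c_plus
```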
17. Restricted settings (cont.)
- Agents with similar interests:
  - There is a randomized SP 3-approximation mechanism (works for any class C)
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification with Shared Inputs, IJCAI 2009
18. But not everything shines...
- Without restrictions on the input, we cannot guarantee a constant approximation ratio
- Our main result:
  - Theorem: There is a concept class C for which there are no deterministic SP mechanisms with an o(m)-approximation ratio
19. Deterministic lower bound
- Proof idea:
  - First, construct a classification problem that is equivalent to a voting problem with 3 candidates
  - Then, use the Gibbard-Satterthwaite theorem to prove that there must be a dictator
  - Finally, the dictator's opinion might be very far from the optimal classification
20. Proof (1)
- Construction:
  - We have X = {a, b}, and 3 classifiers: c_a, c_b, c_ab (table omitted)
  - The dataset contains two types of agents, with samples distributed unevenly over a and b
- We do not set the labels. Instead, we denote by Y all the possible labelings of an agent's dataset.
21. Proof (2)
- Let P be the set of all 6 orders over C
- A voting rule is a function of the form f: P^n → C
- But our mechanism is a function M: Y^n → C!
  - (its input is labels, not orders)
- Lemma 1: there is a valid mapping g: P^n → Y^n, s.t. M∘g is a voting rule
22. Proof (3)
- Lemma 2: If M is SP and guarantees any bounded approximation ratio, then f = M∘g is dictatorial
- Proof:
  - (f is onto) any profile that some c classifies perfectly must induce the selection of c
  - (f is SP) suppose there is a manipulation; by mapping this profile to labels with g, we find a manipulation of M, in contradiction to its SP
  - From the G-S theorem, f must be dictatorial
23. Proof (4)
- Finally, f (and thus M) can only be dictatorial.
- We assume w.l.o.g. that the dictator is agent 1, of type I_a. We now label the data points as follows (labeling figure omitted):
  - The optimal classifier is c_ab, which makes 2 errors
  - The dictator selects c_a, which makes m/2 errors
24. Real concept classes
- We managed to show that there are no good (deterministic) SP mechanisms, but only for a synthetically constructed class.
- We are interested in more common classes that are actually used in machine learning, for example:
  - Linear classifiers
  - Boolean conjunctions
25. Linear classifiers
(Figure: points a and b with the classifiers c_a, c_b, c_ab)
26. A lower bound for randomized SP mechanisms
- A lottery over dictatorships is still bad:
  - Ω(k) instead of Ω(m), where k is the size of the largest dataset controlled by a single agent (m ≤ kn)
- However, it is not clear how to eliminate other mechanisms:
  - G-S works only for deterministic mechanisms
  - Another theorem, by Gibbard '79, can help, but only under additional assumptions
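The "lottery over dictatorships" the bound refers to can be sketched concretely: pick one agent with probability proportional to its dataset size and return the ERM on that agent's data alone. An agent's report only matters when it is chosen, and then truthful labels already minimize its own risk, so the lottery is SP. A minimal sketch with 1-D threshold classifiers (the threshold class and helper names are illustrative):

```python
import random

# Hedged sketch of a random-dictator lottery over 1-D threshold classifiers.
# SP: each agent's report affects the outcome only when that agent is the
# dictator, and then reporting truthfully is already optimal for it.

def threshold_classifier(t):
    return lambda x: 1 if x >= t else -1

def erm_1d(Si):
    """Best threshold classifier for a single agent's labeled points."""
    xs = sorted(x for x, _ in Si)
    # One candidate cut below all points, plus one just above each point.
    candidates = [xs[0] - 1] + [x + 1e-9 for x in xs]
    def errors(t):
        c = threshold_classifier(t)
        return sum(c(x) != y for x, y in Si)
    return threshold_classifier(min(candidates, key=errors))

def random_dictator(S, rng=random):
    """Pick agent i with probability m_i / m; return the ERM on S_i alone."""
    weights = [len(Si) for Si in S]
    i = rng.choices(range(len(S)), weights=weights)[0]
    return erm_1d(S[i])
```

As the slide notes, such a lottery is still only an Ω(k)-style approximation in the worst case: the chosen dictator can be arbitrarily unrepresentative of the other agents.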
27. Upper bounds
- So, our lower bounds do not leave much hope for good SP mechanisms
- We would still like to know if they are tight
- A deterministic SP O(m)-approximation is easy:
  - break ties iteratively according to dictators
- What about randomized SP O(k) mechanisms?
28. The iterative random dictator (IRD)
- (Example with linear classifiers on ℝ¹; figure omitted)
32. The iterative random dictator (IRD)
- (Example with linear classifiers on ℝ¹; figure omitted)
- Iteration 1: 2 errors
- Iteration 2: 5 errors
- Iteration 3: 0 errors
- Iteration 4: 0 errors
- Iteration 5: 1 error
Theorem: The IRD is O(k²)-approximating for linear classifiers in ℝ¹
33. Future work
- Other concept classes
- Other loss functions
- Alternative assumptions on the structure of the data
- Other models of strategic behavior
34. Thank you...