1. Activized Learning: Transforming Passive to Active with Improved Label Complexity
- Steve Hanneke
- Machine Learning Department
- Carnegie Mellon University
- shanneke_at_cs.cmu.edu
2. Passive Learning
[Diagram: the Learning Algorithm receives raw unlabeled data from the Data Source and labeled examples from the Expert/Oracle, then outputs a classifier.]
3. Active Learning
[Diagram: the Learning Algorithm receives raw unlabeled data from the Data Source, then repeatedly requests the label of an example from the Expert/Oracle and receives that label; finally it outputs a classifier.]
4. Active Learning
[Diagram: as above, the Learning Algorithm interacts with the Expert/Oracle through a sequence of label requests, then outputs a classifier.]
How many label requests are required to learn? (Label Complexity)
e.g., [Das04, Das05, DKM05, BBL06, Kaa06, Han07ab, BBZ07, DHM07, BHW08]
5. Activized Learning
[Diagram: the Activizer meta-algorithm receives raw unlabeled data from the Data Source and makes a sequence of label requests to the Expert/Oracle; it feeds a sequence of constructed datasets to a passive learning algorithm (supervised or semi-supervised), collects the resulting classifiers, and outputs a classifier.]
6. Activized Learning
[Diagram: as above, with the Activizer meta-algorithm wrapped around a passive learning algorithm (supervised or semi-supervised).]
Are there general-purpose activizers that strictly improve the label complexity of any passive algorithm?
7. An Example: Threshold Classifiers
- A simple activizer for any threshold-learning algorithm.
8. An Example: Threshold Classifiers
- A simple activizer for any threshold-learning algorithm:
  - Take n/2 unlabeled examples and request their labels.
  - Locate the closest -/+ pair of points a, b.
  - Estimate P([a, b]), and sample roughly n/(4 P([a, b])) more unlabeled examples.
  - Request the labels of those falling in [a, b].
  - Label the rest ourselves (outside [a, b], the label is forced).
  - Train the passive algorithm on all of these examples.
[Figure: labeled points on a line; a is the closest negative point and b the closest positive point, bracketing the threshold.]
We used only n label requests, but get a classifier trained on Θ(n²) examples: an improvement in label complexity over passive learning. (In this case, applying the idea sequentially yields an exponential improvement.)
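The recipe above can be sketched in code. This is a minimal illustration under simplifying assumptions, not the talk's actual procedure: `sample_x`, `query_label`, and `passive_learn` are hypothetical stand-ins for the unlabeled-data source, the labeling oracle, and the wrapped passive algorithm.

```python
import random

def threshold_activizer(passive_learn, query_label, sample_x, n):
    """One round of the activizer sketched above: spend n/2 queries on a
    coarse pass, then label a much larger sample, querying only inside
    the disagreement interval [a, b]."""
    # Step 1: request labels for n/2 unlabeled examples.
    # (Sketch assumes both labels appear in this first pass.)
    first = sorted(sample_x() for _ in range(n // 2))
    first_labeled = [(x, query_label(x)) for x in first]
    # Step 2: the closest -/+ pair (a, b) brackets the true threshold.
    a = max(x for x, y in first_labeled if y == -1)
    b = min(x for x, y in first_labeled if y == +1)
    # Step 3: estimate P([a, b]) from fresh unlabeled data.
    fresh = [sample_x() for _ in range(20 * n)]
    p_ab = max(sum(a <= x <= b for x in fresh), 1) / len(fresh)
    # Step 4: draw about n/(4 P([a, b])) more unlabeled examples.
    extra = []
    budget = n - n // 2          # remaining label budget
    for _ in range(int(n / (4 * p_ab))):
        x = sample_x()
        if a <= x <= b:          # inside [a, b]: the label must be queried
            if budget > 0:
                budget -= 1
                extra.append((x, query_label(x)))
        else:                    # outside [a, b]: the label is forced
            extra.append((x, -1 if x < a else +1))
    # Step 5: train the passive algorithm on the full labeled set.
    return passive_learn(first_labeled + extra)
```

With a uniform marginal the extra sample has size about n/(4 P([a, b])), while only its fraction inside [a, b] costs label requests, which is how n queries buy a training set of size Θ(n²).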
9. Outline
- Formal model
- Exciting New Results
- Dealing with noise
- Conclusions & open problems
10. Formal Model
11. Formal Model
12. Naïve Approach
Produces a perfectly labeled data set, which we can feed into any passive algorithm, so we get a natural fallback guarantee.
But does it always improve over the passive algorithm?
13. Naïve Approach
A more subtle example: Intervals
[Figure: the interval [0, 1] with a few labeled points.]
14. Naïve Approach
A more subtle example: Intervals
[Figure: the interval [0, 1] with many labeled points.]
Suppose the target labels everything -1. The passive algorithm is still trained with just O(n) examples: no improvement.
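To see why no label is ever forced here: a tiny interval around any unqueried point remains consistent with any set of -1 labels elsewhere, so the version space disagrees everywhere and the naïve approach must query essentially every point. A quick check with a small finite class of intervals (a hypothetical construction for illustration):

```python
# Finite class of intervals [l, r] on a grid; pairs with l > r act as the
# all-negative classifier.
grid = [i / 10 for i in range(11)]
intervals = [(l, r) for l in grid for r in grid]
classify = lambda lr, x: 1 if lr[0] <= x <= lr[1] else -1

# Labels observed so far: three points, all -1.
labeled = [(0.2, -1), (0.4, -1), (0.9, -1)]
consistent = [lr for lr in intervals
              if all(classify(lr, x) == y for x, y in labeled)]

# At an unqueried point, both labels are still achievable
# (e.g., the point interval [0.6, 0.6] is consistent), so the
# naive approach can never infer a label for free.
preds_at_06 = {classify(lr, 0.6) for lr in consistent}
assert preds_at_06 == {-1, 1}
```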
15. A Simple Activizer
16. A Simple Activizer
Intervals revisited:
[Figure: the interval [0, 1] with many labeled points; a queried point x1.]
Again, suppose the target labels everything -1. Now the passive algorithm is trained on Θ(n²) samples: improved label complexity.
(We can apply steps 0/1 and 5 sequentially, updating V after every label request, for further savings.)
17. Does This Activize Any Passive Algorithm?
18. This Activizes Any Passive Algorithm!
The HLW94 passive algorithm has O(1/ε) sample complexity.
19. This Activizes Any Passive Algorithm!
20. Efficiency?
- Need to be able to test shatterability of a set of d points, subject to consistency with a set of O(n) labeled examples.
- For some concept spaces, this could take time exponential in d (or worse).
- But in many cases, it may be efficient (e.g., linear separators?).
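For intuition, here is a brute-force version of that test for a finite class; real concept spaces need something cleverer, which is the slide's point. The names are hypothetical.

```python
def is_shattered(hypotheses, labeled, points):
    """Do the hypotheses consistent with `labeled` realize all 2^d
    labelings of `points`?  Brute force: the number of labelings to
    check grows as 2^d, matching the worst case noted above."""
    consistent = [h for h in hypotheses
                  if all(h(x) == y for x, y in labeled)]
    achieved = {tuple(h(x) for x in points) for h in consistent}
    return len(achieved) == 2 ** len(points)
```

For example, with a finite class of intervals on a grid, two points are shattered but three are not (the labeling +, -, + is unachievable), and adding a consistency constraint can break shatterability of even two points.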
21. Dealing with Noise
22. Dealing with Noise
¹ Technically, an additional slight modification is needed to handle the case where the Bayes optimal classifier is not in C. Details are included in a forthcoming paper.
23. Conclusions & Open Questions
- Can activize any passive learning algorithm (in the zero-error, finite-VC-dimension case).
- Question: What about infinite VC dimension?
- Question: Can we give more detailed bounds on the improvement factor?
- Question: Can we always activize, even when there is noise?
24. Thank You