Title: Bayesian Multiple Instance Learning for Object Recognition
1 Bayesian Multiple Instance Learning for Object Recognition
- Nando de Freitas
- Peter Carbonetto, Gyuri Dorko, Hendrik Kueck, Cordelia Schmid
University of British Columbia
2 Probabilistic annotation
[Figure: an input image and its segment/local features, each assigned P(lion | features): 0.8, 0.05, 0.55, 0.3, 0.22, 0.25]
3 Probabilistic annotation
[Figure: the same input image with NCut segments and local features, each assigned P(lion | features): 0.8, 0.05, 0.55, 0.3, 0.22, 0.25, 0.9]
Output probabilities are required to make
decisions under uncertainty (e.g. active
learning). Individual probabilistic classifiers
can be combined to account for context.
4 Training data images with captions
- Problem associations are unknown.
- Typical solution mixture models (and EM). For
example, Duygulu et al Fergus et al. - These models are difficult to train in large
domains, regardless of learning technique.
Parameter distributions in mixture models have a
factorial number of modes.
5 Hard multiple instance learning (MIL)
[Figure: example training images, labelled at the image level as "boat" or "no boat"]
- There are constraints on the unknown labels.
6 Sparse probit kernel classification
We use a sparse kernel machine to classify the features. The classification depends on the feature x and its relation to other prototype features x_i.
The outputs of the classifier are mapped to class
probabilities using a probit model.
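A concrete sketch of this classifier (ours, not from the slides; the RBF kernel choice and all names are illustrative assumptions):

```python
# Minimal sketch of a sparse kernel probit classifier, assuming
# f(x) = sum_i beta_i k(x, x_i) over prototype features x_i, with most
# entries of beta exactly zero (the sparsity).
import numpy as np
from scipy.stats import norm

def rbf_kernel(x, prototypes, scale=1.0):
    """k(x, x_i) = exp(-||x - x_i||^2 / (2 * scale^2))."""
    d2 = np.sum((prototypes - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * scale ** 2))

def class_probability(x, prototypes, beta, scale=1.0):
    """Probit link: P(y = 1 | x) = Phi(f(x)), f(x) = k(x, .)' beta."""
    f = rbf_kernel(x, prototypes, scale) @ beta
    return norm.cdf(f)
```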
7 Naïve graphical model
- We introduce a hierarchical Bayesian prior to ensure robustness with respect to parameter settings and to avoid over-fitting.
- Problem: beta is a high-dimensional, correlated Gaussian variable.
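One plausible reading of the hierarchy (notation ours; the hyperprior choices shown here are illustrative assumptions, not taken from the slides):

```latex
\begin{align*}
y_i &= \mathbb{I}(z_i > 0), \\
z_i \mid \beta &\sim \mathcal{N}\!\big(k(x_i, \cdot)^\top \beta,\; 1\big), \\
\beta \mid \delta^2 &\sim \mathcal{N}(0,\, \delta^2 I),
\qquad \delta^2 \sim \mathcal{IG}(a, b).
\end{align*}
```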
8 Data augmentation and Rao-Blackwellisation
Given the one-dimensional, uncorrelated Gaussian variables z, the posterior for beta can be computed analytically. No need for approximation in high dimensions!
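A minimal sketch of this construction in the style of Albert-and-Chib probit data augmentation (a standard scheme; variable names and the prior scale delta2 are our assumptions):

```python
# Data augmentation for probit classification: sample the latent z given
# beta, then exploit that beta | z is an exact Gaussian, which is what
# Rao-Blackwellisation uses instead of sampling beta in high dimensions.
import numpy as np
from scipy.stats import truncnorm

def sample_z(K, beta, y):
    """z_i ~ N(m_i, 1) truncated to z_i > 0 if y_i = 1, z_i < 0 otherwise."""
    m = K @ beta
    lo = np.where(y == 1, -m, -np.inf)   # standardised truncation bounds
    hi = np.where(y == 1, np.inf, -m)
    return m + truncnorm.rvs(lo, hi)

def beta_posterior(K, z, delta2):
    """Closed-form Gaussian posterior beta | z ~ N(mu, Sigma),
    under a N(0, delta2 * I) prior on beta."""
    Sigma = np.linalg.inv(K.T @ K + np.eye(K.shape[1]) / delta2)
    mu = Sigma @ K.T @ z
    return mu, Sigma
```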
9 Hard MIL
Given n_d images with n_g features per image, we either have labels for the features in an image, or hard constraints C indicating that at least one feature has a positive label and at least one has a negative label.
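Written out (indexing ours), for an image d with unknown feature labels y_{d1}, ..., y_{d n_g} in {0,1} subject to such a constraint:

```latex
\begin{equation*}
\sum_{j=1}^{n_g} y_{dj} \ge 1
\qquad \text{and} \qquad
\sum_{j=1}^{n_g} (1 - y_{dj}) \ge 1 .
\end{equation*}
```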
10 Hard MIL
This is a hard non-convex (disjunctive)
optimization problem. Existing MIL techniques
perform poorly because of this. MCMC can handle
it, but we can do better by introducing a new
model for MIL.
11 New model: soft MIL
We introduce a probabilistic (Beta) measurement
model that incorporates the expected number of
positives and negatives.
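A hedged sketch of what such a measurement model can look like (our notation, not necessarily the paper's exact parameterisation):

```latex
% Let \bar{y}_d = \frac{1}{n_g} \sum_j y_{dj} be the fraction of
% positive features in image d. The hard constraint is replaced by a
% Beta-shaped measurement density over that fraction,
\begin{equation*}
p(c_d \mid y_{d,1:n_g}) \;\propto\;
\bar{y}_d^{\,\alpha-1} \, (1 - \bar{y}_d)^{\,\gamma-1},
\end{equation*}
% with \alpha and \gamma chosen to reflect the expected numbers of
% positives and negatives.
```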
12 Soft MIL
Easier, smooth optimization problem. We can now develop and apply MAP-EM optimization and variational techniques.
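A toy illustration (ours) of why the soft model is easier to optimise: the hard constraint is a 0/1 indicator in the labels, while the Beta measurement gives a smooth log-weight in the expected fraction of positives.

```python
import numpy as np

def hard_constraint(y):
    """1 if at least one positive and one negative label, else 0."""
    return float(y.any() and not y.all())

def soft_log_weight(p, alpha=2.0, gamma=2.0):
    """Smooth log-weight on the expected fraction p of positive labels."""
    return (alpha - 1) * np.log(p) + (gamma - 1) * np.log(1 - p)
```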
13 Synthetic example
14 Synthetic example
15 The Bare Results
16 ROC Curves
[Figure: ROC curves for MCMC and EM]
17 ROC Curves for test set
- Standard deviation across 20 runs for "sky"
18 Local features
(Carbonetto, Dorko, Schmid, de Freitas)
19 Preliminary results
20 Finding cars and arctic pooches
21 How much supervision is needed?
22 Active learning: what question should be asked?
23-28 Active learning
[Figure sequence: successive rounds of active learning]
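The slides do not spell out the query criterion; one standard choice, sketched below, is to ask about the feature whose predicted class probability is most uncertain (all names here are illustrative):

```python
import numpy as np

def most_uncertain(probs):
    """Index of the unlabelled feature whose P(class | feature) is
    closest to 0.5, i.e. the most informative one to ask about."""
    return int(np.argmin(np.abs(np.asarray(probs) - 0.5)))

# Example with the probabilities from slide 3: queries the 0.55 feature.
print(most_uncertain([0.8, 0.05, 0.55, 0.3, 0.22, 0.25]))
```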
29 Conclusions and other work
- Can scale to thousands of object categories: fast training using Rao-Blackwellisation and dual-tree recursions.
- Background is modeled using the same relational kernel structure as the foreground.
- Requires little supervision and lends itself to active learning.
- Can combine many different features (statistics, detectors). Some features are shared, as we have multiple labels per image.
- Can incorporate context, either with random fields (Carbonetto et al.) to combine all classifiers, or using the method of Antonio Torralba et al.
30 Thanks!
31 Synthetic examples
32 Synthetic examples
33 Spatial context models
34 Spatial context models