Using Background Knowledge to Improve Visual Learning - PowerPoint PPT Presentation

Slides: 45

Provided by: dere114

Transcript and Presenter's Notes

Title: Using Background Knowledge to Improve Visual Learning

1
Using Background Knowledge to Improve Visual
Learning

Derek Hoiem
Beckman Directors Seminar
March 11, 2009

Work with Ali Farhadi, Ian Endres, Gang Wang,
Santosh Divvala, James Hays, David Forsyth,
Alexei Efros, Martial Hebert
2
What I'd like to make possible with computer
vision
Household Robot
Intelligent Vehicle
Security
Photo Organization
3
What we can do (with the right dataset)

Recognize faces
Categorize scenes
Detect, segment, and track objects
3D from multiple images or stereo
Classify actions

4
What we can do
BEACH
Detect and Localize Objects
Categorize Scenes
Face Detection and Recognition
5
But we're a long way from Rosie

Computer vision has been divided into many task-
and dataset-specific problems
Difficult to coordinate pieces
Poor generalization to unfamiliar environments
Massive engineering and data collection effort
required for every task/dataset

6
Goal

Use background knowledge to generalize known
solutions to new problems or datasets

7
The Challenge

How can we use what we know to make learning new
things easier and more robust?

8
This Talk

Three uses of background knowledge
Contextual knowledge
Compositional knowledge
Organizational knowledge

9
I. Contextual Knowledge

Goal: Use knowledge of objects and spatial
layout to better detect a new object.

Work with Santosh Divvala, James Hays, Alexei
Efros, Martial Hebert
10
Object Detection without Context
Search over many positions and scales
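
A minimal sketch of the search over positions and scales, assuming square windows and a fixed stride. The function name, the base window size, the scale set, and the stride are illustrative assumptions, not values from the talk.

```python
def sliding_windows(image_w, image_h, base=64, scales=(1.0, 2.0), stride_frac=0.5):
    """Yield (x, y, w, h) square windows covering the image at several scales.

    base, scales and stride_frac are hypothetical defaults chosen for illustration.
    """
    for scale in scales:
        size = int(round(base * scale))
        if size > image_w or size > image_h:
            continue  # a window of this size does not fit inside the image
        stride = max(1, int(size * stride_frac))
        for y in range(0, image_h - size + 1, stride):
            for x in range(0, image_w - size + 1, stride):
                yield (x, y, size, size)


# Every window would be passed to the classifier ("is this a cat?").
windows = list(sliding_windows(256, 128))
```

Even this small image yields dozens of candidate windows, which is why a detector without context has many chances to fire on background.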

11
Object Detection without Context
In each window: is this a cat?
Cat?

Cat?
Cat?
12
Training a Detector
Classifier
Features
Examples
Color
Edges
Texture
13
Object Detection without Context
In each window: is this a cat?
14
Object Detection without Context

Top five cat detections in a challenging dataset

Detector: Felzenszwalb et al., CVPR 2008
Dataset: PASCAL VOC 2008
15
What do we know that can help us?
16
What do we know that can help us?
Knowledge of Other Objects and Scenes
Similar Images
Large Set of Loosely Annotated Images
Associated Keywords
Helps tell us how likely the object is to appear
in this image.
Kitten
House
Baby
Puppy
Sand
17
What do we know that can help us?
Knowledge of Spatial Layout
Hoiem et al. 2005, 2007
Surface Layout
Occlusion Boundaries
Depth Estimates
Helps tell us where and how big the object is
likely to be.
18
Context: Likelihood of Presence

Object presence

Contains Cat
No Cat
19
Context: Likelihood of Presence
Gist
Image
Surface Layout
Likely to contain a cat?
Associated Keywords
House
Kitten
Baby
Puppy
Sand
Gist: Torralba & Oliva 2003
20
Context: Likelihood of Position

Predict likelihood that object appears at each
position given surface layout and gist

21
Context: Likelihood of Size

Predict height of object based on depth, surface
orientations, gist, and image position

Size from Gist: Torralba & Oliva 2003
22
Rescoring Candidate Objects
Independently Trained Classifiers
Appearance Score (from detector)
Presence Scores
Linear Weights: L1-Regularized Logistic Regression
Bounding Box Score
Position Scores
Size Scores
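
The rescoring step can be sketched as a logistic model over the four scores. The sketch below assumes each candidate box is described by [appearance, presence, position, size] scores in that order and fits the weights with a plain proximal-gradient loop. The toy data, the feature order, the hyperparameters, and the function names are assumptions made for illustration; this is not the authors' training code.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal step for the L1 penalty: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def train_l1_logistic(X, y, lam=0.001, lr=0.5, iters=2000):
    """Fit weights w and bias b by proximal gradient descent (bias is unpenalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rescore(features, w, b):
    """Bounding box score: probability that the candidate is a true object."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, features)) + b)


# Hypothetical toy candidates: [appearance, presence, position, size]; label 1 = true object.
X = [[0.9, 0.8, 0.7, 0.6], [0.8, 0.9, 0.6, 0.7], [0.7, 0.7, 0.8, 0.8], [0.6, 0.8, 0.7, 0.5],
     [0.8, 0.2, 0.3, 0.4], [0.7, 0.1, 0.2, 0.3], [0.3, 0.3, 0.4, 0.2], [0.2, 0.2, 0.1, 0.3]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
w, b = train_l1_logistic(X, y)
```

In this toy set, two candidates with the same appearance score (0.8) end up with different bounding box scores because their context scores differ, which is the effect the slide describes.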
23
Context improves detection
Top 5 Before Context
Top 5 After Context
24
Context improves detection accuracy
Average Precision (Higher is Better)
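
For readers unfamiliar with the metric, here is a minimal sketch of average precision over a ranked list of detections. It computes the non-interpolated form and assumes every true object appears somewhere in the candidate list; benchmark protocols such as PASCAL VOC define their own variant, which this sketch does not reproduce.

```python
def average_precision(scores, labels):
    """Mean of the precision measured at the rank of each true positive.

    scores: detector confidences; labels: 1 for a correct detection, 0 otherwise.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_pos
```

Moving a correct detection above a false one raises the score, so rescoring with context can improve average precision without changing the set of candidates.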
25
Context changes the error patterns

More confusion
Cats and Dogs
Dogs and Sheep
Motorbike and Bicycle
Less confusion
Objects and background

26
II. Compositional Knowledge

Goal: Describe new objects using attributes
learned from other objects.

Work with Ali Farhadi, Ian Endres, David Forsyth
27
A name doesn't tell us much
Known Objects
New Object
Name: Cat
Name: Unknown
Name: Dog
Name: Horse
28
But what if we learn attributes?
Known Objects
New Object
Name: Cat
Properties: four legs, tail, eyes, ears, furry,
has stripes, gray
Name: Unknown
Name: Dog
Properties: four legs, eyes, ears, snout, tan,
muscular
Name: Horse
Properties: four legs, tail, mane, eyes, ears,
snout, tan
29
We can infer what the object is like
Known Objects
New Object
Name: Cat
Properties: four legs, tail, eyes, ears, furry,
has stripes, gray
Name: Unknown
Name: Dog
Properties: four legs, eyes, snout, tan, muscular
Properties: four legs, eyes, ears, snout,
stripes, mane
Name: Horse
Properties: four legs, tail, mane, eyes, ears,
snout, tan
30
Learning Attributes

Learn to distinguish between things that have an
attribute and things that don't
Train one classifier per attribute
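
A minimal sketch of training one classifier per attribute. A nearest-mean classifier stands in here for whatever classifier was actually used, and the feature vectors, attribute names, and function names are hypothetical.

```python
def mean_vector(rows):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_attribute_classifiers(features, attribute_labels):
    """attribute_labels maps attribute name -> list of 0/1 labels, one per example.

    Each attribute needs at least one positive and one negative example.
    """
    models = {}
    for name, labels in attribute_labels.items():
        pos = [f for f, has in zip(features, labels) if has]
        neg = [f for f, has in zip(features, labels) if not has]
        models[name] = (mean_vector(pos), mean_vector(neg))
    return models


def predict_attributes(models, feature):
    """An attribute is predicted present if the feature is closer to its positive mean."""
    return {name: sq_dist(feature, pos) < sq_dist(feature, neg)
            for name, (pos, neg) in models.items()}


# Hypothetical 2-D features for four training examples.
features = [[1.0, 0.0], [0.9, 0.1], [0.1, 1.0], [0.0, 0.9]]
labels = {"furry": [1, 1, 0, 0], "has wheels": [0, 0, 1, 1]}
models = train_attribute_classifiers(features, labels)
```

Because each attribute has its own model, a new object can receive a description even when no single category classifier recognizes it.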

31
Learning Correlated Attributes

Problem
Many attributes are strongly correlated through
the object category

Most cars are made of metal and have wheels
When we try to learn "has wheels", we may
accidentally learn "made of metal"
Has Wheels, Made of Metal?
32
Decorrelating Attributes

Solution
Select features that can distinguish between two
classes
Things that have wheels
Things that do not, but have other attributes in
common

Vs.
No Wheels
Has Wheels
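
One way to picture the decorrelation idea: build the negative set only from examples that lack the target attribute but share other attributes with the positives, then keep the features that separate the two sets. The sketch below ranks features by the gap between class means, which is a simple stand-in and not necessarily the selection method used in this work; the data layout and function names are assumptions.

```python
def contrast_sets(examples, target, min_shared=1):
    """Positives have the target attribute; negatives lack it but share other attributes."""
    pos = [e for e in examples if target in e["attributes"]]
    pos_other = set().union(*(e["attributes"] for e in pos)) - {target}
    neg = [e for e in examples
           if target not in e["attributes"]
           and len(e["attributes"] & pos_other) >= min_shared]
    return pos, neg


def select_features(pos, neg, k=1):
    """Return the k feature indices with the largest gap between class means."""
    dim = len(pos[0]["features"])

    def mean(rows, j):
        return sum(r["features"][j] for r in rows) / len(rows)

    gaps = [(abs(mean(pos, j) - mean(neg, j)), j) for j in range(dim)]
    return [j for _, j in sorted(gaps, reverse=True)[:k]]


# Hypothetical features: [wheel-like response, metal-like response].
examples = [
    {"features": [0.9, 0.9], "attributes": {"has wheels", "made of metal"}},
    {"features": [0.8, 0.8], "attributes": {"has wheels", "made of metal"}},
    {"features": [0.1, 0.9], "attributes": {"made of metal"}},
    {"features": [0.2, 0.8], "attributes": {"made of metal"}},
    {"features": [0.1, 0.1], "attributes": {"furry"}},
]
pos, neg = contrast_sets(examples, "has wheels")
selected = select_features(pos, neg, k=1)
```

Since the negatives are also made of metal, the metal-like feature no longer separates the two sets and only the wheel-like feature is selected.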
33
Learning to Describe Objects
34
Describing New Objects
35
Identifying Unusual Attributes
Absence of Typical Attributes
752 reports, 68% are correct
Presence of Atypical Attributes
951 reports, 47% are correct
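
Both kinds of report can be sketched as set differences between the attributes predicted for an object and the attributes typical of its category. The typical-attribute set is assumed to be given, and the function name and example values are hypothetical.

```python
def unusual_attributes(predicted, typical):
    """Compare predicted attributes with those typical of the object's category.

    Returns (missing, unexpected): typical attributes that were not detected,
    and detected attributes that are atypical for the category.
    """
    missing = sorted(typical - predicted)
    unexpected = sorted(predicted - typical)
    return missing, unexpected


missing, unexpected = unusual_attributes(
    predicted={"four legs", "tail", "horn"},
    typical={"four legs", "tail", "furry"},
)
```

Each reported item is only as reliable as the underlying attribute classifier, which is consistent with a sizable share of reports being wrong.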
36
Recognition from Description

Learn new classes by describing them to the
algorithm
Goat: is furry, four-legged, has snout, has
horn
12-Class Classification Accuracy: 32.5%
Chance: 8%
As good as having 8 visual examples with original
image features
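
A minimal sketch of recognition from a description: predicted attributes are matched against a textual description of each class, and the best-matching class wins. The Jaccard overlap used for matching is an assumption, as are the function name and the second class description; the talk does not say how descriptions are compared.

```python
def classify_from_description(predicted, descriptions):
    """Return the class whose described attributes best overlap the predictions."""

    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    return max(descriptions, key=lambda name: jaccard(predicted, descriptions[name]))


# The goat description follows the slide; the car description is illustrative.
descriptions = {
    "goat": {"furry", "four legged", "has snout", "has horn"},
    "car": {"has wheels", "made of metal"},
}
label = classify_from_description({"furry", "four legged", "has snout"}, descriptions)
```

No image of a goat is needed to add the class, only its attribute list, which is what makes the comparison with a handful of visual examples meaningful.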

37
III. Organizational Knowledge

Goal: Help a person organize his photos using
image similarity learned from Flickr groups.

Work with Gang Wang, David Forsyth
38
Taming the Digital Explosion

Photos are easy to take and store.
But it's still difficult to organize them.

39
Solution: Learn from photo-sharing sites

Billions of images in Flickr
Hundreds of thousands of categories

40
Learn similarity

Download hundreds of groups, each containing
thousands of photos
Train classifier to predict whether a photo is
likely to belong in each group
Gang Wang created a super-fast online training
method for kernelized SVMs
Images are similar if they are likely to belong
to the same group
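
A minimal sketch of the similarity idea, assuming each image is represented by its per-group membership scores and that similarity is the sum of per-group products. The product form, the function names, and the toy scores are assumptions; the group names are borrowed from the examples in this talk.

```python
def group_similarity(p, q):
    """Similarity of two images from their group-membership scores (dicts group -> score)."""
    return sum(p[g] * q[g] for g in p.keys() & q.keys())


def shared_groups(p, q, top=3):
    """Groups that contribute most to the similarity, largest contribution first."""
    contrib = {g: p[g] * q[g] for g in p.keys() & q.keys()}
    return sorted(contrib, key=lambda g: contrib[g], reverse=True)[:top]


# Hypothetical membership scores for three images.
a = {"Fireworks": 0.9, "Christmas": 0.6, "Horses": 0.1}
b = {"Fireworks": 0.8, "Christmas": 0.5, "Horses": 0.1}
c = {"Fireworks": 0.1, "Christmas": 0.1, "Horses": 0.9}
```

Here `a` is more similar to `b` than to `c`, and the per-group contributions give a readable account of why two images match.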

41
We can find similar images
Retrieved Images Using Feature Similarity
Retrieved Images Using Similarity Learned from
Flickr
Query Image
42
We can say how two images are similar
Fireworks (15.6) Christmas (7.6) Rain (4.0) Water
drops (2.5) Candles (2.0)
Sports (2.6) Dances (2.0) Weddings (1.0) Toys
(0.5) Horses (0.5)
Painting (2.4) Art (1.2) Macro-flowers
(0.9) Hands (0.9) Skateboarding (0.6)
43
Conclusions

Background knowledge is a key missing component
in today's computer vision algorithms
Existing knowledge can make learning easier
Provides new abilities (say two things are
similar or different)
More complete visual models (better accuracy,
more reasonable mistakes)
Better able to handle new objects and situations
We need to start designing systems that
accumulate visual knowledge

44
Thank you
45
(No Transcript)
