Title: An opposition to Window-Scanning Approaches in Computer Vision
1An opposition to Window-Scanning Approaches in
Computer Vision
- Presented by Tomasz Malisiewicz
- March 6, 2006
- Advanced Perception at The Robotics Institute
3Two Problems
- Does scanning windows across an image work?
- What types of objects does it work for?
4What are window-scanning approaches missing?
5Quick Question: What is this?
6What is context?
- Any data or meta-data not directly produced by
the presence of an object
- Nearby image data
7What is context?
- Any data or meta-data not directly produced by
the presence of an object
- Nearby image data
- Scene information
Context
8What is context?
- Any data or meta-data not directly produced by
the presence of an object
- Nearby image data
- Scene information
- Presence, locations of other objects
Tree
9Clues for Function
10Clues for Function
- What is this?
- Now can you tell?
11Low-Res Scenes
12Low-Res Scenes
- What is this?
- Now can you tell?
13More Low-Res
14More Low-Res
15Why is context useful?
- Objects defined at least partially by function
- Trees grow in ground
- Birds can fly (usually)
- Door knobs help open doors
16Why is context useful?
- Objects defined at least partially by function
- Context gives clues about function
- Not rooted in the ground → not a tree
- Object in sky → cloud, bird, UFO, plane,
Superman
- Door knobs always on doors
17Why is context useful?
- Objects defined at least partially by function
- Context gives clues about function
- Objects like some scenes better than others
- Toilets like bathrooms
- Fish like water
18Why is context useful?
- Objects defined at least partially by function
- Context gives clues about function
- Objects like some scenes better than others
- Many objects are used together and, thus, often
appear together
- Kettle and stove
- Keyboard and monitor
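
The idea that objects "like" some scenes better than others can be sketched as a scene prior that reweights appearance-only scores. This is a minimal illustration, not a method from the talk: the labels, the scores, the prior values, and the helper names `rescore` and `best_label` are all hypothetical.

```python
# Hypothetical numbers throughout: none of these values come from the slides.

# Appearance-only scores from a window classifier for a single window.
# The window is ambiguous: no label clearly wins on appearance alone.
appearance_score = {"toilet": 0.40, "fish": 0.35, "kettle": 0.25}

# Assumed scene prior P(object | scene): objects "like" some scenes more.
scene_prior = {
    "bathroom": {"toilet": 0.80, "fish": 0.05, "kettle": 0.15},
    "kitchen": {"toilet": 0.02, "fish": 0.28, "kettle": 0.70},
}


def rescore(appearance, prior):
    """Multiply appearance scores by a scene prior and renormalize to sum to 1."""
    unnormalized = {obj: appearance[obj] * prior[obj] for obj in appearance}
    total = sum(unnormalized.values())
    return {obj: value / total for obj, value in unnormalized.items()}


def best_label(scores):
    """Return the label with the highest score."""
    return max(scores, key=scores.get)


bathroom_scores = rescore(appearance_score, scene_prior["bathroom"])
kitchen_scores = rescore(appearance_score, scene_prior["kitchen"])
print(best_label(bathroom_scores), best_label(kitchen_scores))
```

With these made-up values the same window is labeled differently depending on the scene, which is the effect the slide describes: context constrains which objects are plausible.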
19The other problem
- What types of objects does it work for?
Assume, for now, that we can sidestep the first
problem
20- "Our goal is to develop a system that detects and
recognizes many kinds of objects in photographs
and video including everyday office objects, text
captions in video, and various structures in
biomedical imagery." (Schneiderman and Kanade,
"Object Detection Using the Statistics of
Parts")
- "However, such approaches seem unlikely to scale
up to the detection of hundreds or thousands of
different object classes because each classifier
is trained and run independently." (Torralba,
Murphy, and Freeman, "Sharing features:
efficient boosting procedures for multiclass
object detection")
How many different classifiers must one
construct? A different classifier for each
object? A different classifier for each pose of
an object? How many poses do we need per object?
21Too many windows
- Now imagine scanning a window and applying 100K
independent classifiers at each window
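
A rough back-of-the-envelope sketch of that cost follows. Only the 100K classifier count comes from the slide; the image size, window size, stride, scale step, and the helper name `count_windows` are assumptions chosen for illustration.

```python
def count_windows(image_w, image_h, win_w, win_h, stride, scale_step, num_scales):
    """Count window positions over an image pyramid.

    The window size stays fixed while the image shrinks by `scale_step`
    at each pyramid level; levels smaller than the window are skipped.
    """
    total = 0
    for level in range(num_scales):
        factor = scale_step ** level
        w = int(image_w / factor)
        h = int(image_h / factor)
        if w < win_w or h < win_h:
            break  # the window no longer fits at this scale
        cols = (w - win_w) // stride + 1
        rows = (h - win_h) // stride + 1
        total += cols * rows
    return total


# Assumed setup: 640x480 image, 24x24 window, stride of 4 pixels,
# 10 pyramid levels with a 1.25x shrink per level.
windows = count_windows(640, 480, 24, 24, stride=4, scale_step=1.25, num_scales=10)

num_classifiers = 100_000  # figure quoted on the slide
evaluations = windows * num_classifiers
print(f"{windows} windows x {num_classifiers} classifiers = {evaluations} evaluations")
```

Because the classifiers are independent, the total work is simply the product of the two counts, so every added object class or pose multiplies the cost of the whole scan.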
22Conclusion
- Without context, we can't find all the things we
want to find. We need context to help constrain the
search for objects.
- With independent classifiers per object (and per
pose), we can't detect a large number of objects.
Should cow detectors and horse detectors be
built independently? Consider that a horse and a
cow are both animals that often occur in similar
contexts.
- Remember that complex and deformable objects
would require many poses if we are to adhere to the
window-based classifier paradigm.
23Thank you.
Image: PASCAL Visual Object Classes Challenge 2006