Title: Images Labeling and its Applications
1Images Labeling and its Applications
- Presenter
- Adam J Bechle
- Ba-Quy Vuong
- Nan Chen
2Outline
- How to obtain image/object labels in a fun and efficient way (Nan)
- How to improve, process, and organize labels (Adam)
- How to utilize the labels for search and recognition (Bryan)
3Motivation
- Huge need for labeled image database
- Image search/filtering
- Object recognition
- Categorization
- Labels are not readily available
- Difficult to automatically label images
- Expensive and biased for manual labeling
4Tools for Data Collection
- ESP game
- Collect the labels related to the objects in the image
- Peekaboom game
- Select the image regions related to a given label
- LabelMe web tool
- Select the region and name the label
5ESP Game
Score you earned in this play
Time left
History of your typing
Labels identified by others
Your guess
Skip to the next image
6Mechanism of ESP
- Players tend to guess the objects in the image to earn credit
- Labels are accepted if many independent pairs agree on them
- Strategies to avoid poisoned data
- Strategies to enhance usability and effectiveness
- http://www.gwap.com/gwap/gamesPreview/espgame/
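The agreement rule above could be sketched roughly as follows. This is only an illustration: the record format `(pair_id, label)` and the threshold `min_pairs` are assumptions, since the slide does not say how many pairs must agree.

```python
def accepted_labels(pair_agreements, min_pairs=2):
    """Return labels that at least `min_pairs` distinct player pairs agreed on.

    pair_agreements: iterable of (pair_id, label) records, one per label
    that both players of a pair typed (hypothetical format).
    """
    pairs_per_label = {}
    for pair_id, label in pair_agreements:
        key = label.strip().lower()  # treat "Dog" and "dog" as the same label
        pairs_per_label.setdefault(key, set()).add(pair_id)
    return sorted(
        label for label, pairs in pairs_per_label.items() if len(pairs) >= min_pairs
    )


agreements = [("p1", "dog"), ("p2", "Dog"), ("p3", "grass"), ("p1", "dog"), ("p4", "dog")]
print(accepted_labels(agreements, min_pairs=2))  # ['dog']
```

Counting distinct pairs rather than raw agreements means a single pair repeating a label cannot push it over the threshold.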
7Evaluation Results
- 13,630 people labeled 293,760 images with 1,271,451 labels in the first 4 months
- 3.89 labels/min per pair of players
- Search precision is satisfactory
- Labels are consistent with human descriptions
8Peekaboom Game
9Features of Peekaboom
- Obtain the tiny regions most distinctive for a given label
- Can be used to get the bounding box
- Get pointers from game play data
10Evaluation of Peekaboom
- 14,153 people generated 1,122,998 pieces of data in one month
- The area of the manually selected bounding box is 0.754 of the generated one
- The accuracy of the pointers derived from ping data is 100%
11Common/different features
12LabelMe web annotation tool
13Effectiveness of the labels
14Consensus of word description
15Complexity of the polygons
25th, 50th, and 75th percentiles of polygon complexity for different objects
16Location and size distribution
17LabelMe: Extending the Dataset
18Improving LabelMe Results
- Categorizing Labels
- Resolving Overlap Ambiguity
- Semi-Automatic Labeling
19Raw Labels
suv
car rear
van
automobile
taxi
car
car side
20Categorized Labels
Car
suv
car
car side
taxi
car rear
van
automobile
21Improving Labels with Categories
- WordNet: electronic dictionary
- Organize labels into category tree
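As a rough illustration of organizing raw labels into a category tree, the sketch below climbs a small hand-written hypernym table until it reaches a chosen category. The table `HYPERNYM`, the `VIEW_WORDS` set, and both function names are hypothetical stand-ins for a real WordNet lookup; they are not taken from LabelMe.

```python
# Toy hypernym links standing in for WordNet (assumed for illustration only).
HYPERNYM = {
    "suv": "car",
    "taxi": "car",
    "van": "car",
    "automobile": "car",
    "car": "vehicle",
}
# Viewpoint words that describe the same object and can be dropped (assumed).
VIEW_WORDS = {"rear", "side", "front"}


def normalize(label):
    """Lowercase the label and strip viewpoint words such as 'rear' or 'side'."""
    words = [w for w in label.lower().split() if w not in VIEW_WORDS]
    return " ".join(words)


def category(label, categories=frozenset({"car"})):
    """Climb the hypernym table until a chosen category is reached, else None."""
    node = normalize(label)
    seen = set()
    while node not in categories and node in HYPERNYM and node not in seen:
        seen.add(node)
        node = HYPERNYM[node]
    return node if node in categories else None


raw = ["suv", "car rear", "van", "automobile", "taxi", "car", "car side"]
print({label: category(label) for label in raw})
```

With this toy table, every raw label from the "Raw Labels" slide maps to the single category "car", while an unrelated label such as "tree" maps to None.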
22Label Category Results
23Resolving Overlap Ambiguity
- What does an overlapping polygon signify?
- Object parts
- Depth order
24Determining Object Parts
25Object Parts Results
26Depth Ordering Rules
- Background object (sky, road) → Bottom
- Inside another object → Top
- Most control points in overlap → Top
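The three rules could be applied to a pair of overlapping polygons roughly as below. This is a sketch under assumptions: "inside another object" is approximated as "every control point lies inside the other polygon", ties in the last rule go to the first object, and the function names are hypothetical.

```python
BACKGROUND = {"sky", "road"}  # background examples named on the slide


def point_in_polygon(pt, poly):
    """Ray-casting test: True if pt lies strictly inside the polygon."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def on_top(name_a, poly_a, name_b, poly_b):
    """Return the name of the object judged to be in front of the other."""
    # Rule 1: background objects go to the bottom.
    if name_a in BACKGROUND and name_b not in BACKGROUND:
        return name_b
    if name_b in BACKGROUND and name_a not in BACKGROUND:
        return name_a
    a_in_b = sum(point_in_polygon(p, poly_b) for p in poly_a)
    b_in_a = sum(point_in_polygon(p, poly_a) for p in poly_b)
    # Rule 2: an object entirely inside another one is on top.
    if a_in_b == len(poly_a) and b_in_a < len(poly_b):
        return name_a
    if b_in_a == len(poly_b) and a_in_b < len(poly_a):
        return name_b
    # Rule 3: the object with more control points in the overlap is on top.
    return name_a if a_in_b >= b_in_a else name_b


big = [(0, 0), (10, 0), (10, 10), (0, 10)]
small = [(2, 2), (4, 2), (4, 4), (2, 4)]
print(on_top("road", big, "car", small))        # car
print(on_top("building", big, "window", small))  # window
```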
27Depth Ordering Results
28Semi-Automatic Labeling
- Objective: detect and segment objects
- Query LabelMe and search engine for images
- Train coarse detector with LabelMe
- Detect objects in search engine images
- User verifies detections
29Semi-Automatic Labeling Procedure
- Train classifier with LabelMe images/regions
30Semi-Automatic Labeling Procedure
Search engine image
Segment
Image
31Semi-Automatic Labeling Procedure
Combine segments to form bounding boxes
32Semi-Automatic Labeling Procedure
Classifier scores each bounding box
Max score object
33Semi-Automatic Labeling Results
- LabelMe performs better than search engine rank
34LabelMe Improvements
- Categorize labels with WordNet
- Define object parts
- Determine object depth order
- Train semi-automatic labeler
35Object Recognition by Scene Alignment
- B. C. Russell, A. Torralba, C. Liu, R. Fergus, W. T. Freeman
- Advances in Neural Information Processing Systems, 2007.
36Object Recognition by Scene Alignment
- Objective: use the background of the image to guide object detection
- If scenes are similar, objects should be similar
- Improve object detection efficiency
37Matching Scenes: Gist Feature
- Divide image into 4x4 grid
- Run Gabor filter on each cell
- 4 scales and 8 orientations
- Gives 512-dimensional score vector
- 4x4x4x8 = 512 dimensions
- Retrieve set of images with similar gist feature score
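A minimal NumPy sketch of a gist-like descriptor is shown here to make the 4x4x4x8 = 512 count concrete. The Gabor wavelengths, bandwidths, kernel sizes, and the use of circular FFT convolution are all assumptions for illustration, not the parameters of the original gist implementation.

```python
import numpy as np


def gabor_kernel(size, wavelength, theta, sigma):
    """Complex Gabor kernel: Gaussian envelope times an oriented sinusoid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.exp(1j * 2 * np.pi * xr / wavelength)


def gist_descriptor(image, n_scales=4, n_orient=8, grid=4):
    """Average Gabor magnitude per grid cell, for every scale and orientation."""
    image = np.asarray(image, dtype=float)
    h, w = image.shape
    image_f = np.fft.fft2(image)
    features = []
    for s in range(n_scales):
        wavelength = 4.0 * 2 ** s      # assumed scale spacing
        sigma = 0.5 * wavelength       # assumed bandwidth
        size = int(6 * sigma) | 1      # odd kernel size covering about 3 sigma
        for o in range(n_orient):
            theta = np.pi * o / n_orient
            kernel_f = np.fft.fft2(gabor_kernel(size, wavelength, theta, sigma), s=(h, w))
            # Circular convolution via the FFT; keep the response magnitude.
            response = np.abs(np.fft.ifft2(image_f * kernel_f))
            for rows in np.array_split(response, grid, axis=0):
                for cell in np.array_split(rows, grid, axis=1):
                    features.append(cell.mean())
    return np.array(features)


desc = gist_descriptor(np.random.default_rng(0).random((128, 128)))
print(desc.shape)  # (512,)
```

Scenes can then be compared by the distance between their descriptors, for example the Euclidean distance between two 512-dimensional vectors.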
38Gist Feature Results
39Object Detection
- Want to model the relationship between
- Object category, o
- Object spatial location in image, x
- Object appearance in image, g
- Model based on the product of
- Probability that object category l appears in the image, θ
- Likely spatial location of object category l, φ
- Appearance likelihood of object category l, η
40Gist Scenes
41Object Detection
42Object Detection - Clusters
- Organize training images into clusters, s
- Based on similar labels and object locations
- Similar clusters → similar scene
43Object Detection - Clusters
44Object Detection Results
45Object Detection Failures
46Comparison with SVM
Plot axes: log(true positives) vs. log(false positives)
47Object Recognition
- Train a classifier to locate objects based on scene characteristics
- Automatically annotate images based on similar images
48Efficient Image Recognition and Image Search
- Ba-Quy Vuong, Bryan
- Department of Computer Sciences
49Image recognition using millions of images
50Problem
- Given a query image, how can we efficiently answer recognition questions such as:
- Does the image contain a person?
- Where is the person in the image?
- Is the image a scene or a picture of an object?
51Motivation
- Using huge amounts of data to solve problems without sophisticated algorithms
- "Did you mean" in Google
- Higher chance of finding similar images with a bigger image database
52Image Size
- How to store and search a huge database efficiently?
- Solution: low-dimensional image representation (32x32)
- Why 32x32?
- A threshold for good recognition by humans
53Data Collection
- Use WordNet to extract non-abstract nouns -> 75,062 nouns
- Use 7 independent image search engines: Altavista, Ask, Flickr, Cydral, Google, Picsearch, and Webshots
- Run over 8 months -> 79,302,017 images after cleaning
- Reduce image sizes to 32x32 -> 760 GB in total
54Image Similarity Metrics
- Sum of squared differences
- Incorporate invariances (translations, scaling, image mirror)
- Allow additional distortion
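The first two metrics could look roughly like the sketch below: a plain sum of squared differences, then the minimum SSD over small translations and a horizontal mirror. Scaling and the additional distortion step are left out, and the wrap-around shift via `np.roll` and the `max_shift` parameter are simplifying assumptions.

```python
import numpy as np


def ssd(a, b):
    """Sum of squared pixel differences between two equally sized images."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sum((a - b) ** 2))


def invariant_ssd(a, b, max_shift=2):
    """Minimum SSD over small shifts and a horizontal mirror of b."""
    b = np.asarray(b, dtype=float)
    best = np.inf
    for candidate in (b, b[:, ::-1]):  # original and mirrored image
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                # np.roll wraps pixels around the border (a simplification).
                shifted = np.roll(candidate, (dy, dx), axis=(0, 1))
                best = min(best, ssd(a, shifted))
    return best


img = np.random.default_rng(0).random((32, 32))
print(invariant_ssd(img, img[:, ::-1]))  # 0.0
```

Because the unshifted, unmirrored image is one of the candidates, the invariant distance is never larger than the plain SSD.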
55Recognition
- High-level idea: simple algorithm, let data do the work for us
- WordNet voting scheme
- Reduce the WordNet graph to a tree
- Recognition is performed at multiple levels
- Procedure
- From a query image, find neighbors
- Each neighbor votes for its branch
- Assign the label with the most votes
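The voting procedure could be sketched as follows, assuming each neighbor image carries one noun label. The tiny `PARENT` tree is a hypothetical stand-in for the reduced WordNet tree, and ties are broken by dictionary order, which the slide does not specify.

```python
from collections import Counter

# Hypothetical fragment of a WordNet-like tree: child -> parent (root has None).
PARENT = {
    "entity": None,
    "organism": "entity",
    "artifact": "entity",
    "person": "organism",
    "animal": "organism",
    "vehicle": "artifact",
    "dog": "animal",
    "car": "vehicle",
}


def branch(label):
    """Path from the root down to the label."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path[::-1]


def collect_votes(neighbor_labels):
    """Each neighbor votes for every node on its branch."""
    votes = Counter()
    for label in neighbor_labels:
        votes.update(branch(label))
    return votes


def classify(neighbor_labels):
    """Walk down the tree, picking the most-voted child at each level."""
    votes = collect_votes(neighbor_labels)
    children = {}
    for node, parent in PARENT.items():
        children.setdefault(parent, []).append(node)
    path, node = [], None
    while True:
        candidates = [c for c in children.get(node, []) if votes[c] > 0]
        if not candidates:
            return path
        node = max(candidates, key=lambda c: votes[c])
        path.append((node, votes[node]))


neighbors = ["person", "person", "dog", "person", "car"]
print(classify(neighbors))  # [('entity', 5), ('organism', 4), ('person', 3)]
```

Votes accumulate toward the root, so a coarse level such as "organism" can be recognized with more support than any single leaf label.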
56Recognition
57Person Detection
- Goal: label an image as containing a person or not
- Collect votes from 80 nearest neighbors
- Use words related to person to evaluate
58Person Localization
- Crop the high resolution image
- Resize each crop and query the sibling set
- The more votes for the person class, the more likely the crop contains a person
59Scene Recognition
- Decide whether an image is a scene rather than a picture of an object
- Count the number of votes accumulated at location
60Summary
- Using a huge image database for recognition
- With lots of images, there's no need for sophisticated algorithms
- Using WordNet to label images
- Many applications: person detection, person localization, scene recognition, etc.
61Image Search: Small Code Representation
62Problem
- Given an input image
- How to retrieve similar images
- How to do that efficiently
- Naïve approach
- Compare the binary representations
- Too inefficient: time, space
- Can we do better? -> YES
63Global Image Representation
- High-level idea: use fixed-length codes to represent images
- How?
- Decompose the image by multiscale oriented filters: 8 orientations, 4 scales
- Average the output magnitude over 4x4 non-overlapping windows
- 4x8x16 = 512 dimensions
- Think of it as using a single SIFT feature for the entire image
- Compact code but not really efficient
64Learning Binary Code
- What is the minimal number of bits?
- Learning problem
- Given a database of images xi, a distance function D(i,j)
- Learn a function yi = f(xi) that preserves nearest neighbor relationships
- Formally
- Let N100(xi), N100(yi) be the 100 nearest neighbors of xi, yi respectively
- Ideally N100(xi) = N100(yi) for all i
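One way to check how close a code comes to the ideal N100(xi) = N100(yi) is to measure the average neighbor overlap, sketched below. Euclidean distance stands in for D(i,j), Hamming distance is used between codes, and `random_projection_codes` is a swapped-in baseline (sign of random projections) used only to produce some binary codes; it is not BoostSSC or an RBM.

```python
import numpy as np


def knn_indices(dist, k):
    """Indices of the k nearest neighbors per row, excluding the point itself."""
    d = dist.astype(float)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1, kind="stable")[:, :k]


def neighbor_overlap(x, codes, k):
    """Mean fraction of the k nearest neighbors of xi that are kept by yi."""
    d_x = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)   # Euclidean
    d_y = (codes[:, None, :] != codes[None, :, :]).sum(axis=2)    # Hamming
    n_x = knn_indices(d_x, k)
    n_y = knn_indices(d_y, k)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(n_x, n_y)]
    return float(np.mean(overlaps))


def random_projection_codes(x, n_bits, seed=0):
    """Baseline binary codes: signs of random projections of centered data."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[1], n_bits))
    return (x - x.mean(axis=0)) @ w > 0


x = np.random.default_rng(1).random((200, 16))
for n_bits in (8, 64):
    codes = random_projection_codes(x, n_bits)
    print(n_bits, neighbor_overlap(x, codes, k=10))
```

An overlap of 1.0 would mean the codes preserve every neighborhood exactly; a learned code aims to get close to that with as few bits as possible.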
65Learning Binary Code
- BoostSSC
- Restricted Boltzmann Machines
66Results
67Summary
- Use small codes to represent images
- Efficient time and space
68The presentation today
- Different systems for labeling images
- Techniques to further process and enrich the labeled set
- Applications of the labeled images
69Thank You