Images Labeling and its Applications - PowerPoint PPT Presentation

1
Images Labeling and its Applications
  • Presenter
  • Adam J Bechle
  • Ba-Quy Vuong
  • Nan Chen

2
Outline
  • How to obtain image/object labels in a fun and
    efficient way (Nan)
  • How to improve, process, and organize labels
    (Adam)
  • How to utilize the labels for search and
    recognition (Bryan)

3
Motivation
  • Huge need for labeled image databases
      • Image search/filtering
      • Object recognition
      • Categorization
  • Labels are not readily available
      • Automatic image labeling is difficult
      • Manual labeling is expensive and biased

4
Tools for Data Collection
  • ESP game
      • Collect labels related to the objects in the
        image
  • Peekaboom game
      • Select the image regions related to a given label
  • LabelMe web tool
      • Select the region and name the label

5
ESP Game
Screenshot callouts:
  • Score you earned in this play
  • Time left
  • History of your typing
  • Labels identified by others
  • Your guess
  • Skip to the next image

6
Mechanism of ESP
  • Players tend to guess the objects in the image
    to earn credit
  • Labels are accepted if many independent pairs
    agree on them
  • Strategies to avoid poisoned data
  • Strategies to enhance usability and effectiveness
  • http://www.gwap.com/gwap/gamesPreview/espgame/
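
The acceptance rule above (a label counts only when enough independent pairs agree on it) can be sketched as follows. This is a minimal illustration, not the ESP game's actual implementation: the function name `accepted_labels`, the `min_pairs` threshold, and the tuple layout of the input are all assumptions.

```python
from collections import defaultdict


def accepted_labels(agreements, min_pairs=2):
    """Return, per image, the labels agreed on by at least `min_pairs` pairs.

    `agreements` is an iterable of (image_id, pair_id, label) tuples, one for
    each label that both players of a pair typed for an image.
    """
    pairs_per_label = defaultdict(set)
    for image_id, pair_id, label in agreements:
        # Normalise case and whitespace so "Dog" and "dog " count as one label.
        pairs_per_label[(image_id, label.strip().lower())].add(pair_id)

    accepted = defaultdict(set)
    for (image_id, label), pairs in pairs_per_label.items():
        if len(pairs) >= min_pairs:
            accepted[image_id].add(label)
    return dict(accepted)


# Hypothetical game log: two pairs agree on "dog" for img1, nothing else repeats.
sample = [
    ("img1", "pairA", "dog"),
    ("img1", "pairB", "Dog"),
    ("img1", "pairA", "grass"),
    ("img2", "pairC", "car"),
]
result = accepted_labels(sample, min_pairs=2)
```

Counting distinct pair identifiers rather than raw agreements means a single pair cannot push a label through by replaying the same image.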

7
Evaluation Results
  • 13,630 people labeled 293,760 images with
    1,271,451 labels in the first 4 months
  • 3.89 labels/min per pair of players
  • Search precision is satisfactory
  • Labels are consistent with human descriptions

8
Peekaboom Game
9
Features of Peekaboom
  • Obtain the small regions most distinctive for a
    given label
  • Can be used to get the bounding box
  • Get pointers from gameplay data

10
Evaluation of Peekaboom
  • 14,153 people generated 1,122,998 pieces of data
    in one month
  • The area of the manually selected bounding box is
    0.754 of the generated one
  • The accuracy of the pointers is 100% according to
    ping data
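
One way to derive a bounding box from gameplay data is to enclose all the points players revealed for a label. The sketch below assumes the data is a plain list of (x, y) click coordinates. The `overlap_ratio` helper uses intersection over union, a common way to compare two boxes that may differ from the measure behind the figures on this slide.

```python
def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) enclosing all (x, y) points."""
    if not points:
        raise ValueError("need at least one point")
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def overlap_ratio(a, b):
    """Intersection area divided by union area of two boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


# Hypothetical click coordinates collected for one label.
clicks = [(12, 30), (40, 22), (25, 55), (18, 41)]
box = bounding_box(clicks)
```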

11
Common/different features
12
LabelMe web annotation tool
13
Effectiveness of the labels
14
Consensus of word description
15
Complexity of the polygons
25th, 50th, and 75th percentiles of polygon
complexity for different objects
16
Location and size distribution
17
LabelMe: Extending the Dataset
18
Improving LabelMe Results
  • Categorizing Labels
  • Resolving Overlap Ambiguity
  • Semi-Automatic Labeling

19
Raw Labels
suv
car rear
van
automobile
taxi
car
car side
20
Categorized Labels
Car
suv
car
car side
taxi
car rear
van
automobile
21
Improving Labels with Categories
  • WordNet: electronic dictionary
  • Organize labels into category tree
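
Grouping raw labels under a category can be sketched by walking up a hypernym chain until the category is reached. The `HYPERNYM` map below is a hand-written stand-in that mirrors the car example on the earlier slides; it is not real WordNet data, and `head_noun` with its list of view words is likewise an assumption.

```python
# Hypothetical hand-written hypernym links standing in for WordNet lookups.
HYPERNYM = {
    "suv": "car",
    "taxi": "car",
    "van": "car",
    "automobile": "car",
    "car": "motor vehicle",
}

VIEW_WORDS = {"rear", "side", "front"}  # assumed view descriptors to ignore


def head_noun(label):
    """Drop view descriptors, so "car rear" becomes "car"."""
    words = [w for w in label.lower().split() if w not in VIEW_WORDS]
    return " ".join(words)


def categorize(labels, category):
    """Return the labels whose hypernym chain reaches `category`."""
    matched = []
    for label in labels:
        node = head_noun(label)
        seen = set()
        # Climb the chain until the category is found or the chain ends.
        while node is not None and node not in seen:
            if node == category:
                matched.append(label)
                break
            seen.add(node)
            node = HYPERNYM.get(node)
    return matched


raw = ["suv", "car rear", "van", "automobile", "taxi", "car", "car side", "tree"]
cars = categorize(raw, "car")
```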

22
Label Category Results
23
Resolving Overlap Ambiguity
  • What does an overlapping polygon signify?
  • Object parts
  • Depth order

24
Determining Object Parts
25
Object Parts Results
26
Depth Ordering Rules
  • Background object (sky, road) → Bottom
  • Inside another object → Top
  • Most control points in overlap → Top
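
The three rules can be sketched for a pair of overlapping polygons as below. This is one possible reading of the rules, not the LabelMe implementation: "control points in overlap" is interpreted as the vertices of one polygon that fall inside the other, and the names `on_top`, `points_inside`, and `BACKGROUND` are assumptions.

```python
BACKGROUND = {"sky", "road"}  # background names taken from the rules above


def point_in_polygon(point, polygon):
    """Ray-casting test: True if `point` lies inside `polygon`."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def points_inside(poly_a, poly_b):
    """Count the control points of poly_a that fall inside poly_b."""
    return sum(point_in_polygon(p, poly_b) for p in poly_a)


def on_top(name_a, poly_a, name_b, poly_b):
    """Return the name of the object guessed to be in front."""
    # Rule 1: background objects go to the bottom.
    if name_a in BACKGROUND and name_b not in BACKGROUND:
        return name_b
    if name_b in BACKGROUND and name_a not in BACKGROUND:
        return name_a
    a_in_b = points_inside(poly_a, poly_b)
    b_in_a = points_inside(poly_b, poly_a)
    # Rule 2: an object lying entirely inside another goes on top.
    if a_in_b == len(poly_a):
        return name_a
    if b_in_a == len(poly_b):
        return name_b
    # Rule 3: the object with more control points in the overlap goes on top.
    return name_a if a_in_b >= b_in_a else name_b


# Hypothetical polygons, each a list of (x, y) control points.
sky = [(0, 0), (100, 0), (100, 50), (0, 50)]
building = [(20, 30), (40, 30), (40, 80), (20, 80)]
car = [(0, 0), (10, 0), (10, 10), (0, 10)]
wheel = [(2, 2), (4, 2), (4, 4), (2, 4)]
cup = [(8, 4), (9, 3), (12, 3), (12, 7), (9, 7)]
```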

27
Depth Ordering Results
28
Semi-Automatic Labeling
  • Objective: Detect and segment objects
  • Query LabelMe and a search engine for images
  • Train a coarse detector with LabelMe
  • Detect objects in the search engine images
  • User verifies the detections

29
Semi-Automatic Labeling Procedure
  • Train classifier with LabelMe images/regions

30
Semi-Automatic Labeling Procedure
Search engine image
Segment image
31
Semi-Automatic Labeling Procedure
Combine segments to form bounding boxes
32
Semi-Automatic Labeling Procedure
Classifier scores each bounding box
Max score object
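
The last two steps of the procedure (merge segments into candidate boxes, then keep the best-scoring box) can be sketched as below. The classifier is passed in as `score_fn`, which is a placeholder. The toy scorer in the example is purely hypothetical, and merging at most two segments per candidate is an arbitrary choice.

```python
from itertools import combinations


def union_box(boxes):
    """Smallest (x0, y0, x1, y1) box covering all the given boxes."""
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )


def candidate_boxes(segment_boxes, max_group=2):
    """Boxes formed by merging every group of up to `max_group` segments."""
    candidates = set()
    for size in range(1, max_group + 1):
        for group in combinations(segment_boxes, size):
            candidates.add(union_box(group))
    return sorted(candidates)


def best_detection(segment_boxes, score_fn, max_group=2):
    """Return (box, score) for the highest-scoring candidate box."""
    scored = [(box, score_fn(box)) for box in candidate_boxes(segment_boxes, max_group)]
    return max(scored, key=lambda item: item[1])


def toy_score(box):
    """Hypothetical stand-in classifier: prefers boxes twice as wide as tall."""
    width, height = box[2] - box[0], box[3] - box[1]
    return -abs(width / height - 2.0)


segments = [(0, 0, 10, 10), (10, 0, 20, 10), (0, 20, 5, 25)]
best_box, best_score = best_detection(segments, toy_score)
```
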
33
Semi-Automatic Labeling Results
  • LabelMe performs better than search engine rank

34
LabelMe Improvements
  • Categorize labels with WordNet
  • Define object parts
  • Determine object depth order
  • Train semi-automatic labeler

35
Object Recognition by Scene Alignment
  • B. C. Russell, A. Torralba, C. Liu, R. Fergus, W.
    T. Freeman
  • Advances in Neural Information Processing
    Systems, 2007.

36
Object Recognition by Scene Alignment
  • Objective: Use the image background to guide
    object detection
  • If scenes are similar, their objects should be similar
  • Improve object detection efficiency

37
Matching Scenes: Gist Feature
  • Divide image into 4x4 grid
  • Run Gabor filters on each cell
  • 4 scales and 8 orientations
  • Gives a 512-dimensional score vector
  • 4x4x4x8 = 512 dimensions
  • Retrieve the set of images with similar gist
    feature scores
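
A gist-like descriptor with the dimensions listed above can be sketched in plain Python. This is an illustration under several assumptions, not the original gist code: the kernel size, the wavelengths chosen for the 4 scales, and the envelope width are arbitrary, and the sketch filters the whole image before averaging the response magnitude within each grid cell.

```python
import math


def gabor_kernel(wavelength, theta, size=5):
    """Return (even, odd) Gabor kernels as nested lists."""
    half = size // 2
    sigma = 0.5 * wavelength
    even, odd = [], []
    for dy in range(-half, half + 1):
        row_e, row_o = [], []
        for dx in range(-half, half + 1):
            # Coordinate along the filter's orientation.
            u = dx * math.cos(theta) + dy * math.sin(theta)
            envelope = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            phase = 2 * math.pi * u / wavelength
            row_e.append(envelope * math.cos(phase))
            row_o.append(envelope * math.sin(phase))
        even.append(row_e)
        odd.append(row_o)
    return even, odd


def filter_magnitude(image, even, odd):
    """Magnitude of the Gabor response at every pixel (zero padding)."""
    h, w = len(image), len(image[0])
    half = len(even) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            re = im = 0.0
            for ky, (row_e, row_o) in enumerate(zip(even, odd)):
                yy = y + ky - half
                if not 0 <= yy < h:
                    continue
                for kx, (e, o) in enumerate(zip(row_e, row_o)):
                    xx = x + kx - half
                    if 0 <= xx < w:
                        re += e * image[yy][xx]
                        im += o * image[yy][xx]
            out[y][x] = math.hypot(re, im)
    return out


def gist(image, wavelengths=(2, 3, 4, 6), n_orient=8, grid=4):
    """Concatenate the grid-cell averages of every filter response."""
    h, w = len(image), len(image[0])
    features = []
    for wavelength in wavelengths:
        for k in range(n_orient):
            even, odd = gabor_kernel(wavelength, math.pi * k / n_orient)
            mag = filter_magnitude(image, even, odd)
            for gy in range(grid):
                for gx in range(grid):
                    ys = range(gy * h // grid, (gy + 1) * h // grid)
                    xs = range(gx * w // grid, (gx + 1) * w // grid)
                    cell = [mag[y][x] for y in ys for x in xs]
                    features.append(sum(cell) / len(cell))
    return features


# Small synthetic 16x16 grayscale image used only to exercise the code.
image = [[(x * y) % 7 for x in range(16)] for y in range(16)]
descriptor = gist(image)  # 4 scales * 8 orientations * 16 cells
```

Retrieval then reduces to comparing these fixed-length vectors, for example by Euclidean distance.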

38
Gist Feature Results
39
Object Detection
  • Want to model the relationship between
      • Object category, o
      • Object spatial location in image, x
      • Object appearance in image, g
  • Model based on the product of
      • Probability that object category l appears in the image, θ
      • Likely spatial location of object category l, φ
      • Appearance likelihood of object category l, η
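
Read literally, the product described above can be written as follows. This is a simplified reading of the slide: it treats location and appearance as conditionally independent given the category, and the full model in the cited paper may contain further terms.

```latex
p(o = l,\, x,\, g) \;\propto\; p(o = l)\; p(x \mid o = l)\; p(g \mid o = l)
```

The three factors correspond, in order, to how often category l appears, where it tends to appear, and how well the observed appearance matches it.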

40
Gist Scenes
41
Object Detection
42
Object Detection - Clusters
  • Organize training images into clusters, s
  • Based on similar labels and object locations
  • Similar clusters → similar scene

43
Object Detection - Clusters
44
Object Detection Results
45
Object Detection Failures
46
Comparison with SVM
Plot axes: log(true positives) and log(false positives)
47
Object Recognition
  • Train a classifier to locate objects based on
    scene characteristics
  • Automatically annotate images based on similar
    images

48
Efficient Image Recognition and Image Search
  • Ba-Quy Vuong, Bryan
  • Department of Computer Sciences

49
Image recognition using millions of images
50
Problem
  • Given a query image, how can we efficiently
    answer questions such as
      • Does the image contain a person?
      • Where is the person in the image?
      • Is the image a scene or a picture of an object?

51
Motivation
  • Using huge data to solve problems without
    sophisticated algorithms
  • Example: "Did you mean" in Google
  • Higher chance to find similar images with a
    bigger image database

52
Image Size
  • How to store and search a huge database
    efficiently?
  • Solution: Low-dimensional image representation
    (32x32)
  • Why 32x32?
  • The threshold resolution at which humans still
    recognize images well

53
Data Collection
  • Use WordNet to extract non-abstract nouns →
    75,062 nouns
  • Use 7 independent image search engines:
    Altavista, Ask, Flickr, Cydral, Google, Picsearch,
    and Webshots
  • Run over 8 months → 79,302,017 images after
    cleaning
  • Reduce image sizes to 32x32 → 760 GB in total

54
Image Similarity Metrics
  • Sum of squared differences
  • Incorporate invariances (translations, scaling,
    image mirror)
  • Allow additional distortion
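
The first two metrics can be sketched on small grayscale images stored as nested lists. This is a minimal sketch: it covers mirroring and small translations only (scaling and further distortions are left out), and the function names and the `max_shift` parameter are assumptions.

```python
def ssd(a, b):
    """Sum of squared differences between two equally sized grayscale images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def mirror(image):
    """Flip the image left to right."""
    return [row[::-1] for row in image]


def shift(image, dx, dy, fill=0):
    """Translate the image by (dx, dy) pixels, padding exposed pixels with `fill`."""
    h, w = len(image), len(image[0])
    return [
        [
            image[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
            for x in range(w)
        ]
        for y in range(h)
    ]


def invariant_ssd(query, candidate, max_shift=1):
    """Smallest SSD over mirrored and slightly shifted versions of `candidate`."""
    best = float("inf")
    for variant in (candidate, mirror(candidate)):
        for dy in range(-max_shift, max_shift + 1):
            for dx in range(-max_shift, max_shift + 1):
                best = min(best, ssd(query, shift(variant, dx, dy)))
    return best


# Tiny 3x3 example: the candidate is the mirror image of the query.
query = [[0, 0, 9], [0, 0, 9], [0, 0, 0]]
flipped = mirror(query)
plain = ssd(query, flipped)
tolerant = invariant_ssd(query, flipped)
```

Taking the minimum over transformed candidates trades extra comparisons for robustness to small misalignments.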

55
Recognition
  • High-level idea: Simple algorithm, let the data do
    the work for us
  • WordNet voting scheme
      • Reduce the WordNet graph to a tree
      • Recognition is performed at multiple levels
  • Procedure
      • From a query image, find neighbors
      • Each neighbor votes for its branch
      • Assign the label with the most votes
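
The voting procedure can be sketched with a small hand-made tree. The `PARENT` map below is a hypothetical miniature hierarchy, not real WordNet data. Each neighbor's label votes for its own node and all of its ancestors, and the winner is read off separately at each depth.

```python
from collections import Counter

# Hypothetical miniature tree standing in for the reduced WordNet tree.
PARENT = {
    "entity": None,
    "organism": "entity",
    "artifact": "entity",
    "person": "organism",
    "animal": "organism",
    "dog": "animal",
    "vehicle": "artifact",
    "car": "vehicle",
}


def branch(label):
    """Return the path from `label` up to the root, label first."""
    path = []
    while label is not None:
        path.append(label)
        label = PARENT[label]
    return path


def depth(label):
    """Number of edges between `label` and the root."""
    return len(branch(label)) - 1


def vote(neighbor_labels):
    """Each neighbor votes for its own node and every ancestor."""
    votes = Counter()
    for label in neighbor_labels:
        votes.update(branch(label))
    return votes


def best_per_level(votes):
    """For each tree depth, the node with the most votes."""
    best = {}
    for node, count in votes.items():
        level = depth(node)
        if level not in best or count > votes[best[level]]:
            best[level] = node
    return best


# Labels of the nearest neighbors of some hypothetical query image.
neighbors = ["dog", "dog", "person", "car", "animal"]
votes = vote(neighbors)
winners = best_per_level(votes)
```

Because votes accumulate up the branch, a coarse label such as "organism" can win clearly even when the neighbors disagree on the fine-grained label.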

56
Recognition
57
Person Detection
  • Goal: Label an image as containing a person or
    not
  • Collect votes from 80 nearest neighbors
  • Use words related to person to evaluate
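
A minimal sketch of this detector: find the k nearest neighbors of the query and measure what fraction carry a person-related label. The `PERSON_WORDS` list, the flat pixel-vector format, and the use of plain SSD are assumptions. A decision threshold on the returned fraction would still have to be chosen.

```python
import heapq

PERSON_WORDS = {"person", "man", "woman", "child"}  # hypothetical word list


def nearest_labels(query, database, k=80):
    """Labels of the k database entries closest to `query` by SSD.

    `database` is a list of (pixel_vector, label) pairs.
    """
    def ssd(vec):
        return sum((a - b) ** 2 for a, b in zip(query, vec))

    nearest = heapq.nsmallest(k, database, key=lambda item: ssd(item[0]))
    return [label for _, label in nearest]


def person_vote_fraction(query, database, k=80):
    """Fraction of the k nearest neighbors whose label is person-related."""
    labels = nearest_labels(query, database, k)
    hits = sum(label in PERSON_WORDS for label in labels)
    return hits / len(labels) if labels else 0.0


# Toy database of 2-pixel "images" with labels.
database = [
    ([0, 0], "man"),
    ([1, 0], "woman"),
    ([0, 1], "tree"),
    ([9, 9], "car"),
]
fraction = person_vote_fraction([0, 0], database, k=3)
```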

58
Person Localization
  • Crop the high resolution image
  • Resize each crop and query the sibling set
  • More votes for person class, more likely the crop
    contains a person

59
Scene Recognition
  • Decide whether an image is a scene rather than a
    picture of an object
  • Count the number of votes accumulated at
    location

60
Summary
  • Using a huge image database for recognition
  • With lots of images, there's no need for
    sophisticated algorithms
  • Using WordNet to label images
  • Many applications: person detection, person
    localization, scene recognition, etc.

61
Image Search: Small Code Representation
62
Problem
  • Given an input image
      • How to retrieve similar images
      • How to do that efficiently
  • Naïve approach
      • Compare the binary representations
      • Too inefficient in time and space
  • Can we do better? → Yes

63
Global Image Representation
  • High-level idea: Use fixed-length codes to
    represent images
  • How?
      • Decompose the image with multiscale oriented
        filters: 8 orientations, 4 scales
      • Average the output magnitude over 4x4
        non-overlapping windows
      • 4x8x16 = 512 dimensions
  • Think of it as using a single SIFT feature for the
    entire image
  • Compact code, but not really efficient

64
Learning Binary Code
  • What is the minimal number of bits?
  • Learning problem
      • Given a database of images x_i and a distance
        function D(i,j)
      • Learn a function y_i = f(x_i) that preserves nearest
        neighbor relationships
  • Formally
      • Let N100(x_i), N100(y_i) be the 100 nearest
        neighbors of x_i, y_i respectively
      • Ideally N100(x_i) = N100(y_i) for all i
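
How well a code preserves neighborhoods can be measured by comparing each item's k nearest neighbors before and after coding. The sketch below assumes squared Euclidean distance for D and Hamming distance on the codes. `sign_code` is a hypothetical thresholding function used only as a stand-in for the learned f.

```python
def knn(index, distance, n_items, k):
    """Indices of the k nearest neighbors of item `index`, excluding itself."""
    others = [j for j in range(n_items) if j != index]
    # Ties are broken by index so the result is deterministic.
    others.sort(key=lambda j: (distance(index, j), j))
    return set(others[:k])


def hamming(a, b):
    """Number of positions where two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def neighbor_preservation(points, codes, k):
    """Average fraction of each item's k nearest neighbors kept by the codes."""
    n = len(points)

    def d_orig(i, j):
        return sum((p - q) ** 2 for p, q in zip(points[i], points[j]))

    def d_code(i, j):
        return hamming(codes[i], codes[j])

    kept = 0
    for i in range(n):
        kept += len(knn(i, d_orig, n, k) & knn(i, d_code, n, k))
    return kept / (n * k)


def sign_code(point, thresholds):
    """Hypothetical stand-in for the learned f: one bit per thresholded value."""
    return tuple(int(v > t) for v, t in zip(point, thresholds))


# Two tight clusters; thresholding at 5 separates them perfectly.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
codes = [sign_code(p, (5, 5)) for p in points]
score = neighbor_preservation(points, codes, k=1)
```

A score of 1.0 corresponds to the ideal case stated above, where both neighbor sets coincide for every item.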

65
Learning Binary Code
  • BoostSSC
  • Restricted Boltzmann Machines

66
Results
67
Summary
  • Use small codes to represent images
  • Efficient in time and space
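
The time and space benefit comes from comparing codes with bit operations. A minimal sketch, assuming each code is packed into a single Python integer and searched by linear scan; the function names and the `max_distance` parameter are illustrative.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into one Python integer."""
    code = 0
    for bit in bits:
        code = (code << 1) | bit
    return code


def hamming_distance(code_a, code_b):
    """Number of differing bits: XOR the codes, then count the set bits."""
    return bin(code_a ^ code_b).count("1")


def search(query_code, database_codes, max_distance):
    """Indices of database codes within `max_distance` bits of the query."""
    return [
        i
        for i, code in enumerate(database_codes)
        if hamming_distance(query_code, code) <= max_distance
    ]


# Three hypothetical 4-bit image codes and a query identical to the first.
database_codes = [
    pack_bits([1, 0, 1, 1]),
    pack_bits([1, 0, 0, 1]),
    pack_bits([0, 1, 0, 0]),
]
query_code = pack_bits([1, 0, 1, 1])
matches = search(query_code, database_codes, max_distance=1)
```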

68
The presentation today
  • Different systems for labeling images
  • Techniques to further process and enrich the
    labeled set
  • Applications of the labeled images

69
Thank You