Scene-Based Vision Localization

Slides: 39
Provided by: siag
1
Scene-Based Vision Localization
  • Christian Siagian

2
Outline
  • Localization as Landmark Detection
  • Scenes as Landmarks
  • Scene-based localization system 1
  • Scene-based localization system 2
  • Overall discussion

3
Mobile Robot Localization
  • Localization as landmark recognition
  • Characteristics of good landmarks (and the
    recognition process)
    • Uniqueness: scalability in identifying large
      sets of unique landmarks
    • Permanency: looking for static and reliable
      landmarks
    • View-point invariance and measurability: need
      accurate estimation for the observation model
    • Fast and efficient computation: real-time
      decision-making constraint

4
Detecting Scenes as Landmarks
  • Treating scene as a whole to obtain global
    features, no need for brute force search.
  • Obtaining the gist of the scene.
  • Bypassing segmentation and grouping steps.
  • Not as susceptible to dynamic changes.
  • Background as source of information, foreground
    as source of noise/distraction.
  • Ideal with a peripheral (wide-angle) vision
    system, because the foreground then occupies a
    smaller percentage of the image area.

5
Gist
  • Definition
  • Essence, holistic characteristics of an image
  • Context information obtained within an eye saccade
    (approx. 150 ms)
  • Evidence of place-recognizing cells in the
    Parahippocampal Place Area (PPA).
  • No biologically plausible models of gist yet
  • Tasks that have been shown to use gist
  • Scene categorization/context recognition
  • Region priming, layout recognition

6
Scene-Based Recognition
  • Advantages
  • Using more stable features, no need to rely on
    permanency of objects chosen
  • Foreground noise is averaged out.
  • Scale and rotational invariance
  • Can add coarse layout information
  • Disadvantages
  • Illumination normalization is a must
  • Loss of the detailed information needed for
    localization resolution may result in loss of
    feature expressiveness.
  • Formulation of observation model

7
Example of Approaches
  • Color histogram of omni-view image (Ulrich and
    Nourbakhsh 2002)
  • Wavelet transform of grid of sub-regions
    (Torralba 2003)
  • Fourier transform of grid of sub-regions (Oliva
    and Torralba 2001)
  • Histogram of learned prime-textures (Renninger
    and Malik 2004)

8
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Exploits visual context (a low-dimensional
    representation of the image), i.e. the gist of
    the image
  • Argues that color is less constraining than the
    textural properties of an image and their spatial
    layout
  • Coarse layout of scenes is included
  • Lends itself to a topological map
  • Platform is a wearable camera, with the user
    moving through campus to obtain training and
    testing data.

9
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Representation
  • Input is a low-resolution (160 x 120), blurred,
    low-contrast image without normalization.
  • Wavelet image decomposition: 6 orientations x 4
    scales x (4 x 4) grid of sub-regions = 384 features
  • Reduced using PCA to 80 features -> the raw
    features contain a lot of redundancy
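
The representation above can be sketched as follows. This is only an illustration under stated assumptions: a simple Gabor filter bank stands in for the wavelet decomposition used by the authors, and the filter wavelengths, bandwidths, and the names `gabor_kernel`, `filter_fft`, `gist_features`, and `pca_project` are hypothetical choices, not taken from the slides.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Zero-mean, even-symmetric Gabor kernel (assumed stand-in for the wavelets)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    kernel = envelope * np.cos(2 * np.pi * xr / wavelength)
    return kernel - kernel.mean()

def filter_fft(image, kernel):
    """Circular convolution via the FFT; the kernel must fit inside the image."""
    padded = np.zeros(image.shape)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))

def gist_features(image, n_orient=6, n_scales=4, grid=4):
    """Average filter-response magnitude over a grid x grid layout of sub-regions."""
    feats = []
    for s in range(n_scales):
        wavelength = 4.0 * 2 ** s          # assumed scale spacing
        sigma = 0.5 * wavelength           # assumed bandwidth
        size = int(6 * sigma) | 1          # force an odd kernel size
        for o in range(n_orient):
            theta = np.pi * o / n_orient
            kernel = gabor_kernel(size, wavelength, theta, sigma)
            resp = np.abs(filter_fft(image, kernel))
            for rows in np.array_split(resp, grid, axis=0):
                for cell in np.array_split(rows, grid, axis=1):
                    feats.append(cell.mean())
    return np.array(feats)                 # 6 * 4 * 16 = 384 values

def pca_project(X, k):
    """Project the rows of X onto the top-k principal components."""
    centered = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T
```

With a 120-row by 160-column grayscale array, `gist_features` returns a 384-vector; stacking such vectors from many frames and calling `pca_project(X, 80)` gives the reduced 80-dimensional features.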

10
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Observation model
  • Each place is modeled as a mixture of K spherical
    Gaussians over features taken from trial runs.
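
A minimal sketch of such an observation model, assuming equally weighted mixture components and a single shared standard deviation `sigma` (both assumptions; the slide does not give the weights or variances). The function name `log_likelihood` is hypothetical.

```python
import numpy as np

def log_likelihood(v, prototypes, sigma):
    """log p(v | place) for an equally weighted mixture of K spherical Gaussians.

    v          : feature vector of length d
    prototypes : (K, d) array of Gaussian centers taken from training runs
    sigma      : shared standard deviation (assumed)
    """
    _, d = prototypes.shape
    sq_dist = np.sum((prototypes - v) ** 2, axis=1)
    log_comp = -0.5 * sq_dist / sigma ** 2 - 0.5 * d * np.log(2 * np.pi * sigma ** 2)
    # log-mean-exp for numerical stability
    m = log_comp.max()
    return m + np.log(np.mean(np.exp(log_comp - m)))
```

Working in the log domain avoids underflow, which matters because the Gaussian densities become extremely small in an 80-dimensional feature space.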

11
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Localization/recognition framework
  • Hidden Markov Model (HMM)
  • The transition matrix A(q', q) acts as a map; it
    is obtained from trial runs by counting the number
    of transitions to and from each location.
  • The transition matrix is further smoothed with
    Dirichlet smoothing
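
The framework above corresponds to the standard HMM forward (filtering) recursion: the previous belief is propagated through the transition matrix and then reweighted by the observation likelihood. The sketch below assumes that Dirichlet smoothing is implemented as a uniform pseudo-count `alpha` added to every transition count; the value of `alpha` and the function names are hypothetical.

```python
import numpy as np

def transition_matrix(label_sequences, n_places, alpha=1.0):
    """Count place-to-place transitions and smooth them with pseudo-counts."""
    counts = np.full((n_places, n_places), alpha)   # Dirichlet pseudo-counts
    for seq in label_sequences:
        for prev, nxt in zip(seq[:-1], seq[1:]):
            counts[prev, nxt] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def forward_step(belief, A, likelihood):
    """One filtering step: predict with A, then weight by p(v_t | q)."""
    predicted = belief @ A            # sum over q' of belief[q'] * A[q', q]
    posterior = likelihood * predicted
    return posterior / posterior.sum()
```

Calling `forward_step` once per frame keeps a running posterior over places; the smoothing keeps transitions never seen in training from being assigned zero probability.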

12
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Overall model

13
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Results and discussion
  • Recognizes 63 different locations at >70% accuracy
  • Recognizes novel places under place categories
  • Recognizes indoor vs. outdoor scenes.

14
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Trial run for familiar locations
  • Top. The solid line represents the true location,
    and the dots represent the posterior probability
    associated with each location where shading
    intensity is proportional to probability.
  • Middle. Estimated category of each location
  • Bottom. Estimated probability of being indoors or
    outdoors.

15
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Trial run for unfamiliar locations (t = 1-1500)
  • Place recognition system has low confidence
    everywhere
  • Place categorization system is still able to
    classify offices, corridors and conference rooms.
  • After returning to a known environment (after
    t = 1500), performance returns to normal

16
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • HMM improved performance from 50% to 70%.
  • Filter bank features work better than color and
    monochrome histograms. Note that they may not be
    normalized.

17
System 2 Ulrich and Nourbakhsh - 2002
  • Inspired by image retrieval techniques; treats
    scene features as a way to reduce storage.
  • Takes advantage of color's invariance to
    orientation and of the diagnosticity of color
    (Oliva).
  • Also lends itself to topological map
  • Platform is a passive robot pulled around
    campus.

18
System 2 Ulrich and Nourbakhsh - 2002
  • Representation
  • Panoramic color omni-camera as a simulation of
    peripheral vision, although the high level of
    distortion makes edge-based histograms
    difficult.

19
System 2 Ulrich and Nourbakhsh - 2002
  • Representation, continued.
  • Calculate HSV and RGB/normalized RGB values
    (6 channels)
  • Build 6 one-dimensional histograms
  • All histograms are low-pass filtered with an
    averaging kernel
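
A minimal sketch of this representation, assuming the six channels are H, S, V plus the three normalized RGB chromaticities, with 32 bins per histogram and a 3-tap averaging kernel. The bin count, kernel width, and function names are assumptions, not values from the slides.

```python
import colorsys
import numpy as np

def six_channels(rgb):
    """rgb: float array of shape (H, W, 3) with values in [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # colorsys works on scalars, so vectorize it (fine for small images)
    h, s, v = np.vectorize(colorsys.rgb_to_hsv)(r, g, b)
    total = r + g + b + 1e-12          # avoid division by zero on black pixels
    return [h, s, v, r / total, g / total, b / total]

def smoothed_histograms(rgb, bins=32, kernel_width=3):
    """Six normalized 1-D histograms, each low-pass filtered by averaging."""
    kernel = np.ones(kernel_width) / kernel_width
    hists = []
    for chan in six_channels(rgb):
        hist, _ = np.histogram(chan, bins=bins, range=(0.0, 1.0))
        hist = np.convolve(hist.astype(float), kernel, mode="same")
        hists.append(hist / hist.sum())
    return np.array(hists)             # shape (6, bins)
```

Smoothing makes the bin-by-bin comparison less sensitive to small shifts in color, since neighboring bins share some of their mass.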

20
System 2 Ulrich and Nourbakhsh - 2002
  • Observation Model
  • Use the bin-by-bin Jeffrey divergence as the
    similarity measure for comparison
  • Each of the 6 color bands votes for the location
    with the minimum distance
  • Calculate the confidence of the vote
  • Only produce an answer if the vote is unanimous
    and above the confidence threshold.
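
The voting scheme can be sketched as below. The Jeffrey divergence follows its usual definition; everything else is an assumption made for illustration: one reference histogram set per place, a confidence defined as one minus the ratio of the best to the second-best distance, and the threshold value. The slides do not specify the confidence formula.

```python
import numpy as np

def jeffrey_divergence(h, k, eps=1e-12):
    """Symmetric, bin-by-bin divergence between two normalized histograms."""
    h = np.asarray(h, dtype=float) + eps
    k = np.asarray(k, dtype=float) + eps
    m = (h + k) / 2.0
    return float(np.sum(h * np.log(h / m) + k * np.log(k / m)))

def classify(query, references, threshold=0.1):
    """query: (6, bins) array; references: dict place -> (6, bins) array.

    Returns the winning place, or None if the bands disagree or any
    band's confidence is below the threshold. Needs at least two places.
    """
    places = list(references)
    votes, confidences = [], []
    for band in range(query.shape[0]):
        d = np.array([jeffrey_divergence(query[band], references[p][band])
                      for p in places])
        order = np.argsort(d)
        votes.append(places[order[0]])
        # hypothetical confidence: how much better the best match is
        # than the runner-up
        confidences.append(1.0 - d[order[0]] / (d[order[1]] + 1e-12))
    if len(set(votes)) == 1 and min(confidences) >= threshold:
        return votes[0]
    return None
```

Returning `None` rather than a low-confidence guess matches the idea of only producing an answer when all bands agree.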

21
System 2 Ulrich and Nourbakhsh - 2002
  • Localization/recognition framework
  • Training run-through to collect images from each
    location
  • The user also creates an adjacency map to indicate
    the topological relationships between locations
  • The images' scene representations are stored to be
    compared with incoming features
  • Computation is sped up by checking only the
    previous location and its neighbors -> needs an
    initial location and cannot handle the
    kidnapped-robot case.
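
The neighbor restriction can be illustrated with a few lines of code. The adjacency map is assumed to be a dictionary from each place to its neighbors, and `distance_to` is a hypothetical callable returning the distance between the current image and a place's stored representation.

```python
def localize(distance_to, adjacency, previous):
    """Pick the best match among the previous place and its neighbors only."""
    candidates = [previous] + list(adjacency[previous])
    return min(candidates, key=distance_to)

# Hypothetical three-room corridor: A - B - C
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
distances = {"A": 0.5, "B": 0.4, "C": 0.1}

# Starting from A, place C is never checked even though it matches best,
# which is why a wrong or missing initial location cannot be recovered.
print(localize(distances.get, adjacency, "A"))
```

The search cost then depends on the number of neighbors rather than on the total number of places in the map.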

22
System 2 Ulrich and Nourbakhsh - 2002
  • Example adjacency map

23
System 2 Ulrich and Nourbakhsh - 2002
  • Result and discussion
  • Tests at 3 indoor and 1 outdoor locations
    produced between 87 and 98 percent correct
    classifications, with no incorrect classification
    made at high confidence.

24
System 2 Ulrich and Nourbakhsh - 2002
  • Result and discussion, continued.
  • Needs 250 ms to compare an input image to 100
    reference images.
  • Quick map-making/labeling process since the
    dimensions of locations are not specified
  • System is sensitive to illumination, which is
    part of the problem for outdoor navigation.

25
Overall discussion
  • Key limitation in scene-based localization:
    resolution
  • Eliminating top-down labeling

26
Key limitation in scene-based localization: resolution
  • The features do not provide a fine enough pose
    estimate, only an indication of a sub-region.
  • Recognizing views from the features may stress
    the limits of the system.
  • Include local features/objects for within-place
    localization, complementing inter-place
    localization

27
Key limitation in scene-based localization: resolution
  • System 1 is able to prime detection and location
    of objects using context that is provided by the
    same scene features
  • Can help in moving from the topological to the
    metric localization domain
  • Still need to work on the pose estimation
  • Have sub-maps within a place. Combining them into
    a global map could be an issue.

28
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Using context as a prior to infer attributes of
    objects in the image
  • The second term is the output of the place
    recognition module, and the first term can be
    computed using Bayes' rule

29
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • The authors then made two assumptions
  • Objects are a priori conditionally independent
  • Object properties only influence local features
    (and not, to a significant extent, global
    features), and thus they are independent of each
    other
  • which allows them to discard local features and
    focus only on global effects

30
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • That is
  • where the second term is the output of the place
    recognition module and the first term
  • is the object system discussed in the following
    slides

31
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Using context as a prior for detecting objects:
    O_{t,i} is a binary random variable
  • where F_i(q) = P(O_{t,i} = 1 | Q_t = q), which can
    be obtained from data. The other term can be
    approximated using a mixture of spherical
    Gaussians
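
One way to read the slide is that object presence is inferred per place with Bayes' rule and then averaged over places using the place posterior. The sketch below assumes exactly that structure; the function name, argument names, and the way the likelihoods are supplied are hypothetical, and the slide's own equation is not reproduced here.

```python
import numpy as np

def object_presence(place_posterior, prior_present, lik_present, lik_absent):
    """P(object i present | features), marginalized over places q.

    place_posterior : P(Q_t = q | v_1..t), output of the place recognizer
    prior_present   : F_i(q) = P(O_{t,i} = 1 | Q_t = q), estimated from data
    lik_present     : p(v_t | O_{t,i} = 1, Q_t = q), e.g. a Gaussian mixture
    lik_absent      : p(v_t | O_{t,i} = 0, Q_t = q)
    All arguments are arrays indexed by place q.
    """
    numerator = lik_present * prior_present
    denominator = numerator + lik_absent * (1.0 - prior_present)
    per_place = numerator / denominator       # Bayes' rule within each place
    return float(np.sum(per_place * place_posterior))
```

When the features are uninformative (equal likelihoods), the result reduces to the place-weighted average of the priors F_i(q), so the context alone already shifts the expectation of seeing an object.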

32
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Results: prediction over time

33
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Results: ROC curve for each object

34
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Using context as a prior for locating objects
    Ot,i Xt,i
  • Uses an 8 x 10 bit mask (M_{t,i}) of grid
    occupancy as a crude way to represent location
    and size/shape
  • where the first term is from the object
    detection module

35
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • and, if the object is absent the second term is
    0, if it is present
  • Adopts a product kernel density estimator to
    model the joint density of v_t^G and M_{t,i}

36
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • The resulting expected map is a set of weighted
    prototypes, where the weights are given by how
    similar the image is to previous ones with
    this object and place combination
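
A minimal sketch of an expected mask as a weighted sum of prototypes, assuming a Gaussian kernel on the global features with bandwidth `sigma` and masks stored as 8 x 10 arrays. The kernel choice, the array layout, and the name `expected_mask` are assumptions.

```python
import numpy as np

def expected_mask(v, train_feats, train_masks, sigma):
    """Weighted average of training masks for one object/place combination.

    v           : global feature vector of the current image, length d
    train_feats : (N, d) features of training images containing the object
    train_masks : (N, 8, 10) occupancy masks for those images
    sigma       : kernel bandwidth (assumed)
    """
    sq_dist = np.sum((train_feats - v) ** 2, axis=1)
    # subtract the minimum before exponentiating for numerical stability
    weights = np.exp(-0.5 * (sq_dist - sq_dist.min()) / sigma ** 2)
    weights /= weights.sum()
    return np.tensordot(weights, train_masks, axes=1)   # shape (8, 10)
```

Each cell of the result lies between 0 and 1 and can be read as the expected occupancy of that grid cell by the object.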

37
System 1 Torralba, Murphy, Freeman, Rubin - 2003
  • Preliminary results

38
Eliminating top-down labeling
  • Training is needed to label locations
  • Need bottom-up clustering of places to truly
    deal with novel locations
  • Needs a similarity measure, which depends on the
    nature of the information
  • Recognition of gateways
  • Know when to add a new landmark/place
  • Need to take out useless (indistinct) images
    on-line
  • Landmark quality measure to decide which frames
    are featureless or dominated by moving objects.