Early Cognitive Vision

Slides: 62

Provided by: psych153

Transcript and Presenter's Notes

1
Early Cognitive Vision

Recursive Mid-Level Vision
ECOVISION Summary from year 3
ECOVISION Highlights in year 3
Hardware implementation of flow and stereo
IMO detection and space variant mappings
Motion-Stereo Gestalts for scene disambiguation
Conclusions

2
Hierarchical Image Processing
Pixels
Data and noise reduction; extraction of
meaningful information (first steps)
Low-Level Vision
Features
Spatio-temporal context; grouping,
segmentation, task-dependent attention;
self-emergence of entities
Primitives
Mid-Level Vision
Gestalts
High-Level Vision (Cognition)
Higher Cognitive Aspects Reasoning
Objects
3
Summary: Motion Part
Normal Flow, Hardware Implementation
Smoothing by MT-cell filtering (Neuro)
First extract heading, then subtract it, and
then extract all other coarse flow segments.
Fine structure analysis relying on the
RBM principle.
4
Summary: Stereo Part
Early vision steps (year 1)
Gestalts in Space
Gestalts in Space-Time: recursive, predictive
processing
5
Real-time processing

Hardware implementation (FPGA)

6
Motivation

Massive parallel processing
Taking advantage of advances in digital
technology
Specific purpose processing architectures
6 Mgates on a single chip
Motion processing
Stereo processing
Space variant mapping
Motion-driven object tracking

7
System-on-Chip: real-time processing
8
Motion on chip
Different motion processing schemes evaluated in
software (Lucas-Kanade, McGM, Horn-Schunck,
Simoncelli-Heeger, etc.)

Only two approaches have been addressed in
hardware:
McGM
Motivation: robust optic flow estimation.
Status: only the first convolution stages
implemented (towards a hybrid
software/hardware approach)
Lucas-Kanade
Motivation: good trade-off between quality and
computational complexity
Status: fully working on an FPGA (system-on-chip)

Motion chip (LK) accuracy evaluated with
benchmark sequences and tested in a real-world
application scenario.
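
As a rough software illustration of the Lucas-Kanade scheme named on this slide, here is a minimal NumPy sketch. It is not the FPGA design: the function names, the plain central-difference gradients, and the `min_det` threshold are assumptions made for the example. The `window` parameter plays the role of the 3x3 or 5x5 averaging stage.

```python
import numpy as np

def box_sum(a, size):
    """Sum of `a` over a size x size window centred on each pixel (edge-padded)."""
    r = size // 2
    p = np.pad(a, r, mode="edge")
    out = np.zeros(a.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def lucas_kanade(frame0, frame1, window=5, min_det=1e-6):
    """Dense flow (u, v) in pixels per frame, plus a mask of well-conditioned pixels."""
    f0 = np.asarray(frame0, dtype=float)
    f1 = np.asarray(frame1, dtype=float)
    # Spatial gradients of the frame average, temporal difference between frames.
    iy, ix = np.gradient(0.5 * (f0 + f1))
    it = f1 - f0
    # Windowed sums that form the 2x2 normal equations at every pixel.
    sxx, syy, sxy = box_sum(ix * ix, window), box_sum(iy * iy, window), box_sum(ix * iy, window)
    sxt, syt = box_sum(ix * it, window), box_sum(iy * it, window)
    det = sxx * syy - sxy ** 2
    valid = det > min_det
    safe = np.where(valid, det, 1.0)  # avoid division by zero where the system is singular
    u = np.where(valid, (-syy * sxt + sxy * syt) / safe, 0.0)
    v = np.where(valid, (sxy * sxt - sxx * syt) / safe, 0.0)
    return u, v, valid
```

A larger window gives smoother but spatially coarser estimates, which is the quality versus cost trade-off the slide refers to.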
9
Hardware Implementation of Lucas Kanade
Version          Kpps   Resolution          Fps
Medium Quality   1776   160x120, 320x240    97, 26
Medium Quality   625    160x120             33
High Quality     1776   160x120, 320x240    95, 25
Low cost         400    120x90              38
< 20 ?
Kpps: kilo pixels per second. Fps: frames per
second. Averaging stage: Medium Quality (3x3),
High Quality (5x5)
Spatial vs. temporal resolution
10
Hardware optic flow results

The estimation is correct when the overtaking
relative velocity is significant
The system has been tested on the instrumented car

11
Stereo on chip
Different stereo algorithms considered: Lucas-Kanade,
phase-based (Silvio et al.), block
matching.

Phase-based stereo processing (Silvio et al.)
Motivation: know-how at ECOVISION, low
computational complexity
Status: fully working on an FPGA platform

[Block diagram of the phase-based stereo system: Frame Grabber (x2), Local contrast, Gabor Filters, Direct phase difference calculation, VGA controller. Precision labels: 8, 9, 9, 11]
12
Phase-Based Dynamic Stereopsis
Disparity as phase difference
Direct phase difference computation
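
The slide's formulas did not survive in this transcript. As a hedged illustration of the general idea, the sketch below filters each image row with a complex Gabor kernel and computes the phase difference directly from the real and imaginary responses, without extracting the two phases separately. The function names, the parameters `k0` and `sigma`, and the sign convention (a positive result means `left(x)` matches `right(x + d)`) are assumptions for this example, not the project's definitions.

```python
import numpy as np

def gabor_responses(image, k0, sigma):
    """Row-wise complex Gabor responses; real part = even filter, imaginary part = odd filter."""
    half = int(4 * sigma)
    x = np.arange(-half, half + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2)) * np.exp(1j * k0 * x)
    return np.array([np.convolve(row, kernel, mode="same") for row in image])

def phase_disparity(left, right, k0=np.pi / 4, sigma=4.0):
    """Disparity estimate d = (phi_L - phi_R) / k0, with the difference wrapped to (-pi, pi]."""
    ql = gabor_responses(np.asarray(left, dtype=float), k0, sigma)
    qr = gabor_responses(np.asarray(right, dtype=float), k0, sigma)
    # Direct phase difference: arg(ql * conj(qr)) expressed with real and imaginary parts.
    num = ql.imag * qr.real - ql.real * qr.imag
    den = ql.real * qr.real + ql.imag * qr.imag
    return np.arctan2(num, den) / k0
```

Because the phase difference wraps at plus or minus pi, a single filter of peak frequency `k0` can only measure disparities up to `pi / k0` pixels in magnitude.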
13
Stereo hardware
SPECS          Device occupation  On-chip memory  Mpps  Embedded multipliers  Image resol.  Fps  Max. Fclk (MHz)
Global system  6165 (18%)         23 (15%)        31.5  26 (18%)              640x480       102  31.5
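
The throughput figures in this table are mutually consistent if the pipeline delivers one pixel per clock cycle. That is an assumption, suggested by the table listing 31.5 Mpps next to a 31.5 MHz clock. A small check, with a hypothetical helper name:

```python
def frames_per_second(pixel_rate_hz: float, width: int, height: int) -> float:
    """Frame rate of a pipeline that outputs `pixel_rate_hz` pixels per second."""
    return pixel_rate_hz / (width * height)

# 31.5 Mpps at 640x480, assuming one pixel per clock cycle at 31.5 MHz.
stereo_fps = frames_per_second(31.5e6, 640, 480)
print(f"{stereo_fps:.1f} frames per second")  # 102.5, in line with the 102 Fps in the table
```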
14
Direct phase difference calculation module
15
Playing with stereo in real time: manipulation of
objects
16
Stereo system data flow
FPGA COARSE GRAIN PIPELINE
17
Stereo hardware
SPECS          Device occupation  On-chip memory  Mpps  Embedded multipliers  Image resol.  Fps  Max. Fclk (MHz)
Global system  6165 (18%)         23 (15%)        31.5  26 (18%)              640x480       102  31.5

Subcircuits                          Device occupation  On-chip memory  Embedded multipliers  Cycles required  Max. Fclk (MHz)
2 Frame-grabbers, 2 VGA controllers  1921 (5%)          0               0                     1                50
Local contrast                       792 (2%)           11 (7%)         5 (3%)                1                120
Gabor filters                        610 (1%)           0               8 (5%)                1                83
Direct phase difference calculation  861 (2%)           1 (1%)          8 (5%)                1                47
Camera calibration                   1070 (3%)          0               0                     -                50
18
Hardware Implementation SPECS (Virtex-E 2000)

3x3 model
Fast version (1776 Kpps)
Hardware slices occupation: 54%
BlockRAM memory occupation: 17%
Slow version (625 Kpps)
Hardware slices occupation: 43%
BlockRAM memory occupation: 17%
5x5 model
Fast version (1776 Kpps)
Hardware slices occupation: 82%
BlockRAM memory occupation: 23%
Low cost (3x3 model, 400 Kpps)
Hardware slices occupation: 36%
BlockRAM memory occupation: 8%

19
Extracting speed from raw optic flow data: solutions
It is possible to compensate for the effect of
perspective by doing a remapping.
Reduce this area
Expand this area

The advantages of the remapping are:
The speed of the car is more uniform.
The divergence caused by the expansion of the car
is reduced.

20
Space Variant Mapping (SVM): 102 Fps with the
circuit running at 31.5 MHz.
Pipeline stage                           Number of slices  Device occupation (%)  On-chip memory  Image size  Max. clk (MHz)
Frame-Grabber and Manage Memory modules  883               2.8                    0               640 x 480   94.0
IPM                                      2,454             7.8                    0               640 x 480   69.6
Total system                             3,564             11.5                   0               640 x 480   44.8
21
Tracking examples

Static overtaking.

Foggy and rainy day.
22
Tracking examples II

Car with its lights switched off

Truck overtaking
23
Tracking examples III
Multiple car fast overtaking
24
Extracting speed from raw optic flow data: difficulties

Due to the effect of perspective:
The car will appear to move faster as it
approaches the camera (even though its real speed
is constant)
A spurious expansion is added to the
translational movement of the car.

Car is dark grey
Car is white
25
Stereo and Motion stand-alone platforms
Motion processing platform
Stereo processing platform
26
Segmenting Independent Motion: Overview

Robust extraction of egomotion from optic flow
Spatio-temporal filtering of residual flow field
using motion angle
using Kalman filter (partner Ita)
Task-specific remapping
improved optic flow computation (partner Eng)

27
Egomotion Extraction

Novel algorithm
outperforms all linear algorithms
performs close to optimal algorithms

proposed method
linear
optimal
28
Egomotion Extraction

Advantage over optimal algorithms: largely
increased robustness to local minima
Important when using robust estimation techniques
that introduce additional local minima (e.g.
Tukey M-estimator)
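
For reference, the standard Tukey biweight weight function is sketched below. The tuning constant `c = 4.685` is the usual textbook choice for residuals scaled to unit standard deviation; it is an assumption here, not a value taken from the slides. Residuals beyond `c` get zero weight, which makes the corresponding cost function non-convex and explains the extra local minima.

```python
import numpy as np

def tukey_biweight_weights(residuals, c=4.685):
    """Weights w(r) = (1 - (r/c)^2)^2 for |r| <= c and 0 otherwise."""
    r = np.asarray(residuals, dtype=float) / c
    w = (1.0 - r ** 2) ** 2
    return np.where(np.abs(r) <= 1.0, w, 0.0)
```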

29
Motion Segmentation

After egomotion computation, each optic flow
vector is decomposed into a static (environment)
and a moving (independent motion) component
Spatio-temporal smoothing of angular deviations
from the static components yields independently
moving regions
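
A minimal sketch of this step, assuming the static flow field predicted from the estimated egomotion is already available as an array. All names, the box-filter smoothing, and the threshold are illustrative assumptions. For brevity the smoothing here is spatial only, whereas the slide describes spatio-temporal smoothing.

```python
import numpy as np

def angular_deviation(flow, static_flow, eps=1e-9):
    """Angle (radians) between the measured flow and the static flow at each pixel."""
    dot = (flow * static_flow).sum(axis=-1)
    norms = np.linalg.norm(flow, axis=-1) * np.linalg.norm(static_flow, axis=-1)
    return np.arccos(np.clip(dot / (norms + eps), -1.0, 1.0))

def box_mean(a, size=5):
    """Mean of `a` over a size x size neighbourhood (edge-padded)."""
    r = size // 2
    p = np.pad(a, r, mode="edge")
    out = np.zeros(a.shape, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += p[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out / (size * size)

def segment_independent_motion(flow, static_flow, threshold=0.5, size=5):
    """Boolean mask of pixels whose smoothed angular deviation exceeds `threshold`."""
    return box_mean(angular_deviation(flow, static_flow), size) > threshold
```

Both `flow` and `static_flow` are arrays of shape (height, width, 2). Temporal smoothing could be added by averaging the deviation maps of consecutive frames before thresholding.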

30
Motion Segmentation

Using Kalman filtering, elementary flow
components can be matched to residual flow
vectors (vectors obtained after subtraction of
static components) (partner Ita)
incorporates spatio-temporal contextual
information
object-based segmentation

31
Task-specific Remapping

Rear view mirror scenario
Inverse perspective mapping (partner Eng)
Optic flow computed both in original and remapped
space
large velocities in original space are smaller in
remapped space, which facilitates their
calculation

original flow, original space
remapped flow, remapped space
remapped flow, original space
32
Task-specific Remapping

Fuse flow in original space and segment
independent motion

original flow only
fused flow
33
Speed constancy achieved by the Inverse
Perspective Mapping

As the car approaches the camera, it appears to
be accelerating although it travels at constant
speed.
It can be seen that in the speed image (bottom
left) dark grey (low speed) progressively becomes
white (high speed).

In the remapped sequence the increase in the
image speed of the car is significantly reduced.

34
Speed constancy: a quantitative analysis
Methodology

We manually segment the car in 20 frames in the
original and remapped sequences.
Over the car, we compute the mean speed and its
standard deviation.

Results

The car's mean speed is considerably more stable
in the remapped sequence. The ratio between
maximum and minimum speed is 1.35, compared to
7.71 in the original sequence
The dispersion of speed values over the car also
shows remarkable stability in the remapped
sequence.
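
The measurement described above could be computed along these lines. This is a sketch with hypothetical function names; it assumes a flow field of shape (height, width, 2) and a manually drawn boolean car mask for each frame.

```python
import numpy as np

def car_speed_stats(flow, mask):
    """Mean and standard deviation of the image speed over the masked car pixels."""
    speed = np.linalg.norm(flow, axis=-1)
    values = speed[mask]
    return values.mean(), values.std()

def max_to_min_ratio(mean_speeds):
    """Ratio between the largest and smallest per-frame mean speed."""
    s = np.asarray(mean_speeds, dtype=float)
    return s.max() / s.min()
```

Calling `car_speed_stats` once per segmented frame and passing the resulting means to `max_to_min_ratio` gives a single stability figure per sequence. A value close to 1 indicates near-constant image speed.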

remapped sequence
original sequence
35
Recursive mid-level vision
36
The Primitive Extraction Scheme
37
Stereo and Grouping (1)

Why use primitive grouping for stereo?
Line primitives are ambiguous along an edge.
Consistency in primitives should be conserved by
stereo.
Considering groups for stereo largely reduces the
number of candidates.

38
Stereo and Grouping (2)
Without grouping constraint
With grouping constraint
39
Quantification Method

Generated stereo colour sequences with ground
truth using colour range data provided by Riegl
(www.riegl.com).
Advantages
Natural images: natural textures, surfaces and
illumination
Accurate ground truth for stereo and motion

40
Performance

Grouping for stereo
Consistently improves stereo performance.
Offers larger improvements for low similarity
thresholds
Combined use of lower thresholds yields better
reliability / density trade-offs.
The optimal choice of threshold depends on the
application (need for reliability vs density)

41
Performance
Performance: correct / total stereo matches
[Plot labels: Similarity between Groups, Inner Similarity]
42
Performance
[Plot labels: Performance, Density]
43
Grouping and Interpolation
44
RBM Estimation (1)
Formalisation of RBM
Visual Entities
3D Point / 3D Line, 3D Point / 3D Plane
Twists
Numerical Optimisation
Householder
Taylor Approximation
Constraints
System of Linear Equations
Shortest Euclidean Distance of 3D Entities
Rosenhahn, Granert and Sommer 2002
45
RBM Estimation (2)

One needs only some twenty 3D-point-to-2D-line
correspondences to compute an RBM, compared to
some 10,000 primitives extracted!
The search for the ego-motion proceeds as
follows:
Use a strong grouping constraint to select a set of
highly reliable correspondences, over time and
stereo
Estimate the quality of the computed motion using
reprojection of the 3D hypotheses.
Discard correspondences leading to wrong motion.
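
The estimate, check, discard loop can be illustrated with a simplified stand-in. The sketch below swaps the slides' 3D-point-to-2D-line constraints and twist formulation for plain 3D-to-3D point correspondences solved with the SVD-based Kabsch method, and uses the 3D residual in place of a reprojection error. The function names, the tolerance `tol`, and the 2.5 x median rule are assumptions made for this example only.

```python
import numpy as np

def estimate_rbm(P, Q):
    """Least-squares rotation R and translation t such that Q ~ P @ R.T + t (Kabsch)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cq - R @ cp

def robust_rbm(P, Q, tol=0.05, rounds=5):
    """Re-estimate the motion while discarding correspondences that disagree with it."""
    keep = np.ones(len(P), dtype=bool)
    for _ in range(rounds):
        R, t = estimate_rbm(P[keep], Q[keep])
        err = np.linalg.norm(P @ R.T + t - Q, axis=1)
        # Keep correspondences whose residual is small relative to the current fit.
        new_keep = err < max(tol, 2.5 * np.median(err[keep]))
        if new_keep.sum() < 3 or np.array_equal(new_keep, keep):
            break
        keep = new_keep
    return R, t, keep
```

`P` and `Q` are (n, 3) arrays of corresponding points before and after the motion. The returned mask marks the correspondences that survived.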

46
Stereo over time (1)

Stereo problem: structures parallel to the epipolar
line (horizontal)
Due to the physical set-up of the cameras
(fronto-parallel).
If the motion is known, then stereo between the
same camera at instants T and T+N can be
computed.
If the motion is not a pure lateral translation
then the epipolar lines will have different
orientations.
Tri-focal constraint

47
Reconstruction: all hypotheses
48
Stereo Over Time (2)
Reconstruction of both stereos (Accumulation over
5 frames)
Standard stereo reconstruction (no horizontal
structure)
Reconstruction using Stereo over time (no radial
structures)
49
Reconstruction combining standard (advanced)
stereo with stereo-over-time
50
Final step: 3D Accumulation over Time, doing
everything

If there is a correspondence under transformation T
  increase confidence and
  merge the two entities
else
  decrease confidence
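
One way to turn this rule into code is sketched below. The `Entity` class, the nearest-neighbour matching with distance tolerance `tol`, the averaging used for merging, and the confidence step are all assumptions for illustration. The transformation T is taken to be a rigid motion given by a rotation `R` and a translation `t`.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Entity:
    position: np.ndarray  # 3D position
    confidence: float     # value in [0, 1]

def accumulate(memory, observed, R, t, tol=0.1, step=0.2):
    """Move remembered entities by (R, t), match them to new observations, update confidences."""
    unmatched = [np.asarray(o, dtype=float) for o in observed]
    updated = []
    for ent in memory:
        predicted = R @ ent.position + t
        dists = [np.linalg.norm(predicted - o) for o in unmatched]
        if dists and min(dists) < tol:
            # Correspondence found: merge the two entities and raise the confidence.
            obs = unmatched.pop(int(np.argmin(dists)))
            updated.append(Entity(0.5 * (predicted + obs), min(1.0, ent.confidence + step)))
        else:
            # No correspondence: keep the prediction but lower the confidence.
            updated.append(Entity(predicted, max(0.0, ent.confidence - step)))
    # Observations without a match start as new low-confidence entities.
    updated.extend(Entity(o, step) for o in unmatched)
    return updated
```

Calling `accumulate` once per frame and pruning entities whose confidence reaches zero would gradually remove spurious hypotheses.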

51
3D Accumulation Over Time: original frame
52
3D Accumulation Over Time: 1st iteration
53
3D Accumulation Over Time: 2nd iteration
54
3D Accumulation Over Time: 3rd iteration
55
3D Accumulation Over Time: 4th iteration
56
3D Accumulation Over Time: 5th iteration
57
Some final result
58
Conclusion

The individual parts of the ECOVISION system work
well and have been quantitatively tested.
Some parts have been tested directly in cars
Other parts have been tested off-line with real
driving scenes
Integration of Stereo and Motion using RBM was
successful
Integration of IMO detection and space variant
mapping, too
Integration of the above two parts has not been
achieved within the tenure of this grant
Real-time performance has been achieved with the
hardware front ends.
Real-time performance of the complete system
would require about 12 more person-months (PMs) of programming
Several grant proposals have been submitted (locally
and at the Commission) to continue this work.

60
The pixel at coordinates (X, Y) in the remapped
image comes from coordinates (x, y) in the
original image.
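
This backward (output to input) lookup can be sketched as follows. The actual mapping equations are not in the transcript, so the example assumes a hypothetical 3x3 homography `H` that takes remapped coordinates (X, Y) to original coordinates (x, y), with nearest-neighbour sampling.

```python
import numpy as np

def remap_backward(image, H, out_shape):
    """For every output pixel (X, Y), copy the source pixel (x, y) given by H @ (X, Y, 1)."""
    out_h, out_w = out_shape
    Y, X = np.mgrid[0:out_h, 0:out_w]
    pts = np.stack([X.ravel(), Y.ravel(), np.ones(X.size)])
    src = H @ pts  # homogeneous source coordinates; assumes the third row never yields zero
    x = np.rint(src[0] / src[2]).astype(int)
    y = np.rint(src[1] / src[2]).astype(int)
    inside = (x >= 0) & (x < image.shape[1]) & (y >= 0) & (y < image.shape[0])
    out = np.zeros(out_h * out_w, dtype=image.dtype)
    out[inside] = image[y[inside], x[inside]]  # pixels mapped from outside the image stay 0
    return out.reshape(out_shape)
```

Iterating over output pixels rather than input pixels guarantees that every remapped pixel receives exactly one value, which also suits a pipelined hardware implementation.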
61
Advantages of Grouping for RBM Estimation

a) For each entity in the top row there are 6
correspondences. Grouping leads to a reduction
from 6^6 = 46,656 to only 224 correspondences.
b) Correspondences with no fitting attributes (e.g.
colour) can be discarded.
c) Local position and orientation can be quite
distorted. Grouping can improve the accuracy of
such local estimates (cf. interpolation).
