Title: Computer Vision 2 mm2
1Computer Vision 2 mm2
- Agenda
- Model-based Computer Vision
- What is it
- How does it work (brick example)
- What to remember
- Advanced tracking
- Multiple hypotheses tracking
- What to remember
- Exercise
2Model-based CV According to TBM
- What can it be used for?
- Pose estimation
- Tracking (pose estimation over time)
- Object recognition
- What is it?
- Everything is based on a model
- Contains a geometrical model
- Analysis-by-synthesis approach
3Model-based CV Characteristics
- Contains a (3D) geometrical model
- Cylinder, ellipsoid, box, truncated cones, etc.
- Brick represented by a box
- Analysis-by-synthesis approach (AbS)
- Project different configurations of the brick into the image
- Assume camera calibration
- Compare with image data via similarity measure
- Highest similarity => pose of brick
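The "project into the image" step can be sketched with a pinhole camera. This is a minimal illustration, not the lecture's implementation: the focal length `f`, the principal point `(cx, cy)`, and the restriction to a single rotation angle about the z-axis are all assumptions made to keep the example short.

```python
import math

def box_corners(length, width, height):
    """Eight corners of a box centred on its centre of gravity (object frame)."""
    hx, hy, hz = length / 2, width / 2, height / 2
    return [(sx * hx, sy * hy, sz * hz)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]

def rotate_z(point, angle_deg):
    """Rotate a 3D point about the z-axis (one external DoF, for brevity)."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (math.cos(a) * x - math.sin(a) * y,
            math.sin(a) * x + math.cos(a) * y,
            z)

def project(point, f, cx, cy):
    """Pinhole projection of a camera-frame point (z > 0) to pixel coordinates."""
    x, y, z = point
    return (f * x / z + cx, f * y / z + cy)

def project_box(dims, position, yaw_deg, f=800.0, cx=320.0, cy=240.0):
    """Project the corners of one brick configuration into the image.

    f, cx, cy are hypothetical calibration values (assumed known)."""
    pixels = []
    for corner in box_corners(*dims):
        x, y, z = rotate_z(corner, yaw_deg)
        cam = (x + position[0], y + position[1], z + position[2])
        pixels.append(project(cam, f, cx, cy))
    return pixels

# One configuration: a 100 x 200 x 300 mm brick, 1 m in front of the camera.
corners_px = project_box((100, 200, 300), (0, 0, 1000), yaw_deg=0)
```

Each candidate configuration yields one such set of projected corners, which is then compared with the image data.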
4Analysis-by-Synthesis
- Model representation
- Image representation
- Matching
5Representation
- Pixels: template
- Particle: feature vector
- Low level token: geometric primitives
- Edges, lines, corners, circles, ellipses, etc.
- High level token
- Geometric
[Figure: complexity axis]
6Representation
7AbS Model Representation
- Model representation = state-space representation
- Degrees of freedom (DoF)
- External and internal DoF
- State-space is spanned by the DoF
- One point in state-space = one state, which defines one configuration of the object
- DoF for a brick?
- How would you represent these DoF?
8AbS Model Representation
- Internal (geometric shape)
- Length, width, height (3 DoF)
- 1, relative width, relative height, scale (3 DoF)
- Relative width and height known, scale (1 DoF)
- Known (0 DoF)
- External (pose)
- Position: CoG or corner, Cartesian (3 DoF)
- Angles: around fixed axes, Euler angles, screw axis (3 DoF); Rodrigues' par., Euler's par., quaternions (4 DoF)
- Screw axis representation (helical axis rep.) (6 DoF)
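One possible answer to "how would you represent these DoF" is a plain state vector. The sketch below is an assumption for illustration: the class name `BrickState`, the choice of roll/pitch/yaw angles, and the field order are all hypothetical.

```python
from dataclasses import dataclass, astuple

@dataclass
class BrickState:
    """One point in the 9-dimensional state-space of a brick."""
    # Internal DoF (geometric shape), in mm
    length: float
    width: float
    height: float
    # External DoF (pose): position of the CoG in mm ...
    x: float
    y: float
    z: float
    # ... and orientation as three angles in degrees
    roll: float
    pitch: float
    yaw: float

    def as_vector(self):
        """The state as a flat list, one entry per DoF."""
        return list(astuple(self))

    def relative_shape(self):
        """Alternative internal representation: (1, rel. width, rel. height, scale)."""
        return (1.0, self.width / self.length, self.height / self.length, self.length)

state = BrickState(100, 200, 300, 0, 0, 1000, 0, 0, 0)
```

If the relative width and height are known beforehand, only the scale remains, and the internal part drops from 3 DoF to 1 DoF.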
9AbS Image Representation
- Image representation
- Edges, contours, silhouettes
10AbS - Matching
- Compare every possible model configuration (pose + geometry) with the image data
- This is done by projecting each configuration into the image and calculating a similarity measure
- Match = configuration most similar to image data
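Brute-force matching can be sketched on a toy problem. Everything here is an assumption chosen for brevity: the "object" is a 1D bar with two DoF (position and width), the "image" is a binary row of pixels, and the similarity measure is the fraction of agreeing pixels.

```python
from itertools import product

def render(position, width, size=20):
    """Project a 1D bar configuration into a binary image row."""
    return [1 if position <= i < position + width else 0 for i in range(size)]

def similarity(a, b):
    """Fraction of pixels on which two binary rows agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def brute_force_match(image, positions, widths):
    """Try every configuration and return the one most similar to the image."""
    return max(product(positions, widths),
               key=lambda cfg: similarity(render(*cfg, size=len(image)), image))

# Synthetic "image data": a bar of width 5 starting at pixel 7.
image = render(7, 5)
best = brute_force_match(image, range(20), range(1, 10))
```

With two DoF and 20 pixels this is 180 configurations. The number of configurations grows multiplicatively with every added DoF.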
- Why is this in general difficult?
11Why is matching so difficult?
- Huge state-space => too many configurations
- Brick: 9 DoF
- Resolution: 1 mm and 1 deg
- Limits
- Internal: 0-100 mm, 0-200 mm and 0-300 mm
- External: 0-1000 mm and 0-360 deg
- Size of state-space (# of different configurations)
- $100 \cdot 200 \cdot 300 \cdot 1000^3 \cdot 360^3 \approx 2.8 \cdot 10^{23}$ !!!
- Human skeleton: 20 DoF => $360^{20} \approx 10^{51} \approx$ infinity
- Brute force: not feasible!
- Search space = solution space = state-space + 1 dimension
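The numbers can be checked directly. The evaluation rate of one billion configurations per second is a hypothetical figure, used only to put the size into perspective.

```python
import math

# Number of discrete values per DoF at 1 mm / 1 deg resolution
internal = [100, 200, 300]          # length, width, height
external = [1000] * 3 + [360] * 3   # position (x, y, z) and three angles

n_brick = math.prod(internal + external)   # brick, 9 DoF
n_skeleton = 360 ** 20                     # human skeleton, 20 angular DoF

# Assumed (hypothetical) evaluation rate: 1e9 configurations per second
seconds_per_year = 365 * 24 * 3600
years_brick = n_brick / 1e9 / seconds_per_year

print(f"brick:    {n_brick:.1e} configurations, {years_brick:.1e} years")
print(f"skeleton: {n_skeleton:.1e} configurations")
```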
12What can we do about it? (1)
- Reduce
- Resolution, DoF (measured beforehand)
- Constraints on state-space parameters
- Based on setup and physics
- Based on image pre-processing
13What can we do about it? (2)
- Assume a smooth and uni-modal surface in the solution space
- Apply an iterative approach
- Coarser-to-finer search
- Gradient search in solution space
- Other methods exist
- Be aware of local minima!
- After the break.
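A coarser-to-finer search over a single DoF might look as follows. This is a sketch under the slide's own assumption of a smooth, uni-modal similarity surface; the function name, the grid size, and the number of levels are illustrative choices.

```python
def coarse_to_fine(score, lo, hi, steps=10, levels=4):
    """Maximise score on [lo, hi] by sampling a grid and zooming in on the best cell.

    Only reliable when score is smooth and uni-modal on [lo, hi]."""
    best = lo
    for _ in range(levels):
        step = (hi - lo) / steps
        candidates = [lo + i * step for i in range(steps + 1)]
        best = max(candidates, key=score)
        # Next level: search one grid cell to either side of the current best
        lo, hi = best - step, best + step
    return best

# Toy uni-modal similarity with its peak at an angle of 123.4 deg
def toy_score(angle):
    return -(angle - 123.4) ** 2

estimate = coarse_to_fine(toy_score, 0.0, 360.0)
```

Four levels of 11 samples each give 44 evaluations, against more than a thousand for a brute-force scan at the same final resolution. On a multi-modal surface the first coarse grid can select the wrong peak, and every later level then refines a local optimum.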
14What to remember
- Model-based Computer Vision
- Usage: pose estimation, object recognition and tracking
- Geometrical model: cylinders, boxes, ...
- Analysis-by-synthesis approach
- Project model into the image and compare
- Model representation
- State-space representation, degrees-of-freedom (DoF)
- Image representation
- Edges, contours, silhouettes
- Matching
- Brute force is not possible!
- Apply constraints and some kind of search strategy
15Multiple-Hypotheses Tracking
- Why care about this?
- Theory
- The principle of factored sampling
- The Condensation algorithm
- Examples
- What to remember
16Why care about Multiple Hypotheses Tracking?
- Tracking = pose estimation over time
- Iterative approaches work well in uni-modal solution spaces. Initial guess from prediction.
- BUT in practice the solution space is multi-modal due to
- Many DoF in state space
- Local min/max
- Noise in the image
- The background is similar to the object
- The object is affected (occlusion, bad measurements, ...)
- Result: we will lose track!
- Solution: we need to support all likely modes, i.e., multiple hypotheses
17Multiple-Hypotheses Tracking
- Concepts
- Predict all likely hypotheses and compare them with the image data (measurements)
- That is, project all predicted hypotheses into the image and calculate a similarity measure between each projection and the image data (measurements)
- Think of everything as Probability Density Functions (PDF)
18Multiple-Hypotheses Tracking
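- The best match (pose) is found where the posterior $P(x \mid z)$ is maximum. This is denoted Maximum A Posteriori (MAP)
- Bayes' rule: $P(x \mid z) = c \cdot P(z \mid x)\,P(x)$
- Ignoring c and adding a time index: $P(x_t \mid z_t) \propto P(z_t \mid x_t)\,P(x_t)$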
19Multiple-Hypotheses Tracking
- A priori information = prediction from the information in the previous frame
20Multiple-Hypotheses Tracking
- Bayes
- A priori
- Problem
- We cannot calculate these densities exactly due to the huge solution space
- Solution: estimate the posterior using the CONDENSATION algorithm
- Aka Sequential Monte Carlo, Particle Filter, Multiple hypotheses tracking
21The Condensation algorithm
- Condensation
- Conditional Density Propagation
- Based on the principle of factored sampling
22The Condensation algorithm
- Condensation = factored sampling over time
- Meaning that the a priori density is predicted from the posterior in the previous frame
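One time step of such a sampler can be sketched in 1D with the three steps sampling, predicting, and weighting. This is a minimal illustration, not the original Condensation code: the constant-drift motion model, the Gaussian likelihood, and all parameter values are assumptions.

```python
import math
import random

def condensation_step(particles, weights, measurement, rng,
                      drift=1.0, process_std=0.5, meas_std=1.0):
    """Propagate a weighted sample set (the posterior) by one frame."""
    n = len(particles)
    # 1) Sampling: draw N samples from the previous posterior, by weight
    sampled = rng.choices(particles, weights=weights, k=n)
    # 2) Predicting: deterministic drift plus stochastic diffusion
    predicted = [x + drift + rng.gauss(0.0, process_std) for x in sampled]
    # 3) Weighting: likelihood of the measurement given each hypothesis
    new_w = [math.exp(-0.5 * ((measurement - x) / meas_std) ** 2) for x in predicted]
    total = sum(new_w)
    if total == 0.0:
        # All hypotheses far from the measurement: fall back to uniform weights
        new_w = [1.0 / n] * n
    else:
        new_w = [w / total for w in new_w]
    return predicted, new_w

def run_demo(seed=0, steps=20, n=500):
    """Track a point moving 1 unit per frame from noisy measurements."""
    rng = random.Random(seed)
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n)]  # uniform init
    weights = [1.0 / n] * n
    true_x = 0.0
    for _ in range(steps):
        true_x += 1.0
        z = true_x + rng.gauss(0.0, 1.0)
        particles, weights = condensation_step(particles, weights, z, rng)
    mean_estimate = sum(p * w for p, w in zip(particles, weights))
    map_estimate = max(zip(weights, particles))[1]  # highest-weight hypothesis
    return mean_estimate, map_estimate, true_x
```

The weighted sample set returned by one call is the posterior at time K and serves as input for the next frame.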
23The Condensation algorithm
Posterior at time K-1
Predicted state at time K
Posterior at time K
24Illustration of Condensation
25Tracking multiple objects
- Track one object => track multiple objects for free
26Example Pointing Gesture
State-space
27Example Pointing Gesture
Input
Init.
Result
MAP
28Condensation demos
29What to remember
- Many DoF => multi-modal PDF
- We need multiple hypotheses tracking
- Solution: Bayes' rule
- Estimate posterior via Condensation
- Factored sampling over time
- 3 steps: sampling, predicting, weighting
- Condensation = particle filter = sequential Monte Carlo = multiple hypotheses tracker
30Implementation issues
- Init: the algorithm requires $P(x_0 \mid z_0)$
- $P(x_0 \mid z_0) \propto P(z_0 \mid x_0)$
- $P(x_0 \mid z_0)$ = uniform density
- $P(x_0 \mid z_0)$ = constant density (trained off-line)
- Motion model: the more correct the better
- Number of samples: N
- Depends on the solution space (dim(x) and resolution), the quality of the predictions (motion model and process noise), and the quality of the measurements (measurement noise)
- E.g. N = 100, N = 1000, N = 500-1500
- N can be changed from frame to frame, e.g. N(uncertainty)
31The Condensation algorithm
- Visual tracking in complex scenes
- Based on particle filtering => estimation of Bayes' rule
- General: non-Gaussian densities and/or high-dimensional problems
- Condensation = Conditional Density Propagation
32The Kalman Filter
Deterministic drift
Stoc. diffusion
Effect of measurements
Model
33Propagation of densities in the KF
- Gaussian densities. Only 2 parameters: mean and covariance
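Because a Gaussian is fully described by its mean and (co)variance, propagating the density reduces to updating those two numbers. The scalar sketch below follows the three effects named on the Kalman filter slide. The constant-drift model and the default noise variances are assumptions.

```python
def kalman_step(mean, var, measurement,
                drift=1.0, process_var=0.25, meas_var=1.0):
    """One predict/update cycle of a scalar Kalman filter."""
    # Deterministic drift: the density is shifted
    mean_pred = mean + drift
    # Stochastic diffusion: the density is widened
    var_pred = var + process_var
    # Effect of the measurement: the density is pulled toward z and narrowed
    gain = var_pred / (var_pred + meas_var)
    mean_post = mean_pred + gain * (measurement - mean_pred)
    var_post = (1.0 - gain) * var_pred
    return mean_post, var_post

# Prior N(0, 1), no diffusion, measurement z = 2 with unit noise variance
posterior = kalman_step(0.0, 1.0, 2.0, drift=1.0, process_var=0.0, meas_var=1.0)
```

The result is again a single Gaussian, so this filter cannot represent several competing hypotheses at once.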
34Multi modal densities
35Example Pointing Gesture
36The Condensation algorithm
Posterior at time K-1
Predicted state at time K
Posterior at time K
37Chamfer Matching
- Generates a smoother search space
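The smoothing effect can be illustrated with a distance transform of the edge image. The sketch assumes a city-block distance computed with the classic two-pass scheme, a tiny binary edge map, and a template given as a list of (row, column) points. All names are illustrative.

```python
def distance_transform(edges):
    """City-block distance from every pixel to the nearest edge pixel (two passes)."""
    h, w = len(edges), len(edges[0])
    inf = h + w  # larger than any possible city-block distance in the image
    d = [[0 if edges[r][c] else inf for c in range(w)] for r in range(h)]
    for r in range(h):                      # forward pass: from top-left
        for c in range(w):
            if r > 0:
                d[r][c] = min(d[r][c], d[r - 1][c] + 1)
            if c > 0:
                d[r][c] = min(d[r][c], d[r][c - 1] + 1)
    for r in reversed(range(h)):            # backward pass: from bottom-right
        for c in reversed(range(w)):
            if r < h - 1:
                d[r][c] = min(d[r][c], d[r + 1][c] + 1)
            if c < w - 1:
                d[r][c] = min(d[r][c], d[r][c + 1] + 1)
    return d

def chamfer_score(dist, template_points, offset):
    """Mean distance from the shifted template points to the nearest edge (lower = better)."""
    dr, dc = offset
    return sum(dist[r + dr][c + dc] for r, c in template_points) / len(template_points)

# 7x7 edge image with a vertical edge in column 4, rows 1..5
edges = [[1 if (c == 4 and 1 <= r <= 5) else 0 for c in range(7)] for r in range(7)]
dist = distance_transform(edges)
template = [(r, 0) for r in range(5)]       # vertical line of 5 points
scores = [chamfer_score(dist, template, (1, c)) for c in range(7)]
```

The score falls off gradually as the template approaches the edge, instead of being non-zero only at an exact overlap. That gives an iterative search a slope to follow.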