Title: Invariants concluded Lowe and Biederman
1Invariants (concluded) Lowe and Biederman
2Announcements
- No class Thursday. Attend Rao lecture.
- Double-check your paper assignments.
3Key Points
- Rigid rotation is 3x3 orthonormal matrix.
- 3-D Translation is 3x4 matrix.
- 3-D Translation and Rotation is 3x4 matrix.
- Scaled Orthographic Projection: remove row three and allow scaling.
- Planar Object: remove column 3.
- Projective Transformations
- Rigid Rotation of Planar Object: represented by 3x3 matrix.
- When we write in homogeneous coordinates, projection is implicit.
- When we drop rigidity, the 3x3 matrix is arbitrary.
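The matrix shapes in the key points can be checked with a small sketch. The rotation axis, angle, translation, and scale below are arbitrary illustrative choices, and the helper names are hypothetical; none of them come from the slides.

```python
import math

def rigid_3x4(angle, t):
    """3x4 matrix [R | t]: rotation about the y-axis (arbitrary choice) plus translation t."""
    c, s = math.cos(angle), math.sin(angle)
    R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
    return [R[i] + [t[i]] for i in range(3)]

def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def scaled_orthographic(M, scale):
    """Drop row three of the 3x4 matrix and scale the two remaining rows."""
    return [[scale * m for m in row] for row in M[:2]]

def planar(M):
    """For points with z = 0, column three never contributes, so drop it."""
    return [row[:2] + row[3:] for row in M]

M = rigid_3x4(math.radians(30), [1.0, 2.0, 3.0])   # 3x4
P = scaled_orthographic(M, 0.5)                    # 2x4
A = planar(P)                                      # 2x3

point = [4.0, -1.0, 0.0, 1.0]                      # planar point (z = 0), homogeneous
full = matvec(P, point)                            # image using the 2x4 matrix
reduced = matvec(A, [point[0], point[1], 1.0])     # same image using the 2x3 matrix
```

For a planar point, `full` and `reduced` agree, which is why the third column can be removed.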
4Projective
Rigid rotation and translation: notation suggests that the first two columns are orthonormal, and the transformation has 6 degrees of freedom.
Projective transformation: notation suggests that the transformation is an unconstrained linear transformation. Points in homogeneous coordinates are equivalent up to scale. The transformation has 8 degrees of freedom, because its scale is arbitrary.
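A minimal sketch of why scale is arbitrary: multiplying a 3x3 projective matrix by any nonzero constant leaves the mapped image point unchanged, so only 8 of the 9 entries matter. The matrix entries and the function name are made up for illustration.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 matrix H, then divide out the third coordinate."""
    x, y = point
    u, v, w = (row[0] * x + row[1] * y + row[2] for row in H)
    return (u / w, v / w)

H = [[2.0, 0.5, 1.0],
     [0.0, 1.5, -2.0],
     [0.1, 0.2, 1.0]]
H_scaled = [[3.0 * h for h in row] for row in H]   # same transformation, different scale

p = (3.0, 4.0)
q = apply_homography(H, p)
q_scaled = apply_homography(H_scaled, p)
```

Here `q` and `q_scaled` are the same point up to floating-point rounding.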
5Lines Parameterization
- Equation for line: ax + by + c = 0.
- Parameterize line as l = (a, b, c)^T.
- p = (x, y, 1)^T is on the line if <p, l> = 0.
6Line Intersection
- The intersection of l and l' is l x l' (where x denotes the cross product).
- This follows from the fact that the cross product is orthogonal to both lines.
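The line parameterization and the cross-product intersection can be sketched together. The two lines below are arbitrary examples, and the helper names are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def on_line(l, p):
    """p = (x, y, w) lies on l = (a, b, c) when their inner product is zero."""
    return dot(l, p) == 0

l1 = (1, -1, 0)    # x - y = 0
l2 = (1, 1, -2)    # x + y - 2 = 0

p = cross(l1, l2)              # homogeneous intersection point
xy = (p[0] / p[2], p[1] / p[2])  # divide by the third coordinate to get (x, y)
```

Because `p` is orthogonal to both `l1` and `l2`, it satisfies both line equations at once.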
7Intersection of Parallel Lines
- Suppose l and l' are parallel. We can write l = (a, b, c), l' = (a, b, c'). Then l x l' = (c' - c)(b, -a, 0). This is equivalent to (b, -a, 0).
- This point corresponds to a line through the focal point that doesn't intersect the image plane.
- We can think of the real plane as points (a, b, c) where c isn't equal to 0. When c = 0, we say these points lie on the ideal line at infinity.
- Note that a projective transformation can map this to another line, the horizon, which we see.
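A small numeric check of the parallel-line case, assuming two arbitrary parallel lines and an arbitrary projective matrix `H` whose last row is not (0, 0, 1). Both are illustrative choices, not taken from the slides.

```python
def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def apply(H, p):
    """Multiply the 3x3 matrix H by the homogeneous point p."""
    return tuple(sum(h * x for h, x in zip(row, p)) for row in H)

l1 = (1, 2, 3)     # x + 2y + 3 = 0
l2 = (1, 2, 7)     # x + 2y + 7 = 0, parallel to l1

meet = cross(l1, l2)   # (c' - c)(b, -a, 0): third coordinate is 0, an ideal point

H = ((1, 0, 0),
     (0, 1, 0),
     (1, 1, 1))        # projective, not affine
image = apply(H, meet)
vanishing_point = (image[0] / image[2], image[1] / image[2])
```

The ideal point has no finite position, but after the projective map its image does: this is a point on the horizon.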
8Invariants of Lines
- Notice that affine transformations are the subgroup of projective transformations in which the last row is (0, 0, 1).
- These map the line at infinity to itself.
- So parallel lines are affine invariants, since they continue to intersect at infinity.
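As a hedged illustration, the sketch below maps two parallel lines through an affine matrix and through a general projective matrix, then checks where the images intersect. The matrices, points, and helper names are arbitrary assumptions.

```python
def cross(u, v):
    """Cross product; gives the line through two points or the meet of two lines."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def apply(H, p):
    """Multiply the 3x3 matrix H by the homogeneous point p."""
    return tuple(sum(h * x for h, x in zip(row, p)) for row in H)

def meet_of_images(H, line_a_pts, line_b_pts):
    """Map two points per line through H, rebuild both lines, and intersect them."""
    a = cross(*(apply(H, p) for p in line_a_pts))
    b = cross(*(apply(H, p) for p in line_b_pts))
    return cross(a, b)

# Two parallel lines, each given by two points; both have direction (2, 1).
line_a = [(0, 0, 1), (2, 1, 1)]
line_b = [(0, 3, 1), (2, 4, 1)]

affine = ((2, 1, 5), (0, 3, -1), (0, 0, 1))       # last row is (0, 0, 1)
projective = ((1, 0, 0), (0, 1, 0), (1, 1, 1))    # last row is not (0, 0, 1)

affine_meet = meet_of_images(affine, line_a, line_b)
projective_meet = meet_of_images(projective, line_a, line_b)
```

Under the affine map the intersection keeps a zero third coordinate, so the lines stay parallel; under the projective map they meet at a finite point.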
9Invariance in 3D to 2D
- 3D to 2D invariance isn't captured by the mathematical definition of invariance, because 3D to 2D transformations don't form a group.
- You can't compose or invert them.
- Definition: Let f be a function on images. We say f is an invariant iff for every object O, if I1 and I2 are images of O, f(I1) = f(I2).
- This means we can define f(O) as f(I) for I any image of O. O and I match only if f(O) = f(I).
- f is a non-trivial invariant if there exist two images I1 and I2 such that f(I1) ≠ f(I2).
10Non-Invariance in 3D to 2D
- Theorem: Assume valid objects are any 3D point
sets of size k, for some k. Then there are no
non-trivial invariants of the images of these
objects under perspective projection.
11Proof Strategy
- Let f be an invariant.
- Suppose two objects, A and B, have a common image. Then f(I) = f(J) if I and J are images of either A or B.
- Given any O0, Ok, we construct a series of objects, O1, ..., O(k-1), so that Oi and O(i+1) have a common image for all i, and Ok and j have a common image.
- So for any pair of images, I, J, from any two objects, f(I) = f(J).
12Constructing O1, ..., O(k-1)
- Oi has its first i points identical to the first i points of Ok, and the remaining points identical to the remaining points of O0.
- If two objects are identical except for one point, they produce the same image when viewed along a line joining those two points.
- Along that line, those two points look the same.
- The remaining points always look the same.
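The construction can be sketched in code under simplifying assumptions: objects are ordered lists of integer 3D points, an image is modeled only by the ray from the focal point to each object point, and visibility (points in front of the camera) is ignored. All names and coordinates are hypothetical.

```python
def chain(o0, ok):
    """O_i takes its first i points from O_k and the remaining points from O_0."""
    return [ok[:i] + o0[i:] for i in range(len(o0) + 1)]

def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def common_viewpoint(obj_a, obj_b):
    """Focal point on the line through the single pair of differing points."""
    (p, q), = [(p, q) for p, q in zip(obj_a, obj_b) if p != q]
    return tuple(2 * a - b for a, b in zip(p, q))   # on the line, beyond p

def same_image_from(center, obj_a, obj_b):
    """True if corresponding points lie on the same ray from the focal point."""
    for p, q in zip(obj_a, obj_b):
        u, v = sub(p, center), sub(q, center)
        if cross(u, v) != (0, 0, 0) or dot(u, v) <= 0:
            return False
    return True

o0 = [(0, 0, 5), (1, 0, 6), (0, 1, 7)]
ok = [(2, 3, 9), (4, 1, 8), (5, 5, 5)]

objects = chain(o0, ok)   # O_0, O_1, O_2, O_3
links = [same_image_from(common_viewpoint(a, b), a, b)
         for a, b in zip(objects, objects[1:])]
```

Every entry of `links` is `True`: each consecutive pair of objects shares an image, so an invariant f must take the same value along the whole chain from `o0` to `ok`.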
13Summary
- Planar objects give rise to a rich set of invariants.
- 3-D objects have no invariants.
- We can deal with this by focusing on planar portions of objects.
- Or special restricted classes of objects.
- Or by relaxing the notion of invariants.
- However, invariants have become less popular in
computer vision due to these limitations.
14Lowe and Biederman
- Background
- Viewpoint Invariant Non-Accidental Properties.
- Lowe sees these as probabilistic.
- Biederman drops this.
- Primitive properties
- Composing them into units/geons.
- Use in Recognition.
- Speed search.
- Geons analogy to speech.
- Evidence for Value.
- Computational speed.
- Human psychology: parts, qualitative descriptions, view invariance.
15Background
- Computational
- 2D approach to recognition.
- Lowe is reacting to Marr.
- Partly due to Lowe, recognition rarely involves reconstruction now. (But 3D models are also more rare.)
- State of the art
- Little recognition of 3D objects, grouping implicit.
- Speed, robustness a big concern.
- 2D recognition through search.
- Psychology
- Much more ambitious and specific than any prior theory of recognition (I believe).
- Perceptual organization (P.O.) widely studied, rarely related to other tasks.
- Contrast.
- CS must account for low-level processing.
- Psych must account for categorization.
16Viewpoint Invariant NAPs
- Non-Accidental Property
- Happens rarely by chance
- More frequently by scene structure.
- p = property, c = chance, s = structure.
- (Equation not in transcript. Its annotations: "Jepson and Richards consider this"; "Lowe focuses on this"; "This is high due to viewpoint invariance.")
- Biederman downplays probabilistic inference.
- Not concerned with background, feature detection.
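One way to read the non-accidentalness argument is as Bayes' rule over two hypotheses, structure (s) and chance (c), given an observed property (p). The sketch below is an assumed formulation, not necessarily the equation on the original slide, and every probability in it is a made-up illustrative number.

```python
def posterior_structure(p_given_s, p_given_c, prior_s):
    """P(s | p) by Bayes' rule, assuming structure and chance are the only two causes."""
    prior_c = 1.0 - prior_s
    evidence = p_given_s * prior_s + p_given_c * prior_c   # P(p)
    return p_given_s * prior_s / evidence

# Hypothetical numbers: the property almost always appears when the scene
# structure is present (viewpoint invariance), and rarely appears by chance.
posterior = posterior_structure(p_given_s=0.9, p_given_c=0.01, prior_s=0.1)
```

With these numbers the posterior for structure is about 0.91 even though its prior is only 0.1, because the property is so unlikely to arise by chance.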
17Examples
(Copied from Lowe)
18Issues with Non-Accidental Properties
- Is it just Bayesian inference?
- Then why not model all information?
- This may fit Lowe
- Biederman relies more on certain inference.
- See also Feldman, Jepson, Richards.
19Viewpoint Invariance
- Match properties that are invariant to viewing conditions.
- Parallelism, symmetry, collinearity, cotermination, straightness.
- Lowe picks one side of property, Biederman stresses contrast. Why?
- How used?
- Lowe, correspondence of geometric features. Speed up search.
- Description of parts for indexing.
20Geons
- Biederman, description of geons. Are they still view invariant when describing a geon?
- A 3D shape's occluding contour depends on viewpoint. It may be straight from one view, curved from another.
- Metric properties not truly invariant.
- Maybe more like quasi-invariants.
21Geons for Recognition
- Analogy to speech.
- 36 different geons.
- Different relations between them.
- Millions of ways of putting a few geons together.
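A rough counting sketch shows how a small alphabet of parts reaches millions of combinations. The 36 geons come from the slide; the number of relations per pair of geons is a hypothetical placeholder, and the counting model (each added geon is attached by one relation) is a deliberate simplification.

```python
GEONS = 36                 # from the slide
RELATIONS_PER_PAIR = 50    # hypothetical value, chosen only for illustration

def arrangements(n_geons, geons=GEONS, relations=RELATIONS_PER_PAIR):
    """Count ordered n-geon objects, attaching each added geon by one relation."""
    return geons ** n_geons * relations ** (n_geons - 1)

two_geon_objects = arrangements(2)
three_geon_objects = arrangements(3)
```

Under these assumptions three geons already give over a hundred million arrangements, which is the sense of the analogy to speech: few primitives, many combinations.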
22Empirical Support for Geons
- First, separate the predictions of geon theory:
- Part structure is important in recognition.
- Perceptual grouping can be used for filling in.
- NAPs are used for indexing.
- View invariant descriptions.
- Qualitative descriptions.
- Second, what is alternative?
- View-based recognition with many examples.
23Empirical Support
- Recognition is fast. Fine metric judgments are slow.
- Does this disqualify other approaches?
- Recognition is view-invariant.
- Does this disqualify other approaches?
- Number of geon descriptions is sufficient for the number of categories we recognize.
- Argues plausibility, but no more.
24Empirical Support (2)
- 2-4 geons needed for recognition. Complex objects are no harder than simple ones.
- Line drawings vs. colored images: color gives similar speed.
25Empirical Support (3) Degraded Objects
- Deleting contours in a way that disrupts geon structure interferes more with recognition.
- Deleting components is worse than deleting midsections.
- This argues for perceptual organization for interpolation/reconstruction. But for geons?
- Should we measure information deleted rather than contour length?
28Conclusions
- Maybe helpful to separate
- Perceptual organization/completion.
- View Invariance
- Part Structure.
- All three widely used in computer vision.
- Biederman's paper probably addresses view-invariance least.
- This became the subject of much research.