Summer Work at Vidient, 2006 - PowerPoint PPT Presentation


Provided by: shan88

1
Summer Work at Vidient, 2006

Ensemble Tracking, Part-Based Tracking, and
Merging Mean-Shift with the Earth Mover's Distance

2
Ensemble Tracking: Sampling

Collect many pixels in the region of the object
Build a feature for each pixel and label each one +1 (object) or -1 (background)

3
The Feature Type

Histograms of Gradients (HoG)
Calculated over a 5x5 pixel region centered on
the pixel of interest
Concatenate the 9D HoG with the average RGB
values over the 5x5 pixel region to form a 12D
feature vector
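
The feature construction above can be sketched as follows. This is only an illustration under stated assumptions: central-difference gradients on the mean of the three channels, unsigned orientations, magnitude-weighted votes, and a normalized histogram are choices made here, not details given on the slide. The function name `hog_color_feature` is hypothetical.

```python
import math

def hog_color_feature(image, cx, cy, half=2, bins=9):
    """12-D feature for pixel (cx, cy): 9-bin HoG plus mean RGB over a 5x5 window.

    `image` is a list of rows; each pixel is an (r, g, b) tuple.
    """
    height, width = len(image), len(image[0])

    def clamp(x, y):
        # Clamp coordinates so the window is defined at the image border.
        return min(max(x, 0), width - 1), min(max(y, 0), height - 1)

    def gray(x, y):
        x, y = clamp(x, y)
        r, g, b = image[y][x]
        return (r + g + b) / 3.0

    hist = [0.0] * bins
    rgb_sum = [0.0, 0.0, 0.0]
    count = 0
    for y in range(cy - half, cy + half + 1):
        for x in range(cx - half, cx + half + 1):
            gx = gray(x + 1, y) - gray(x - 1, y)
            gy = gray(x, y + 1) - gray(x, y - 1)
            # Unsigned orientation in [0, pi): an assumption, not from the slide.
            angle = math.atan2(gy, gx) % math.pi
            k = min(int(angle / math.pi * bins), bins - 1)
            hist[k] += math.hypot(gx, gy)      # vote weighted by gradient magnitude
            px, py = clamp(x, y)
            for c in range(3):
                rgb_sum[c] += image[py][px][c]
            count += 1
    total = sum(hist)
    if total > 0:
        hist = [h / total for h in hist]       # normalize the orientation histogram
    return hist + [s / count for s in rgb_sum]
```

On a flat image the nine HoG entries are all zero and the last three entries are simply the pixel color; on a vertical edge all of the gradient energy falls into the first orientation bin.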

4
Ensemble Tracking: Training

Train a weak classifier using Linear SVM on this
collection of pixels
Weight the weak classifier using AdaBoost
Combine the weak classifiers into a strong
classifier
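
A minimal sketch of the weighting and combination steps, assuming the weak classifiers are already trained. The linear SVM training itself is left out; here a weak classifier is any callable that returns +1 or -1, and the names `adaboost_weights` and `strong_classify` are hypothetical.

```python
import math

def adaboost_weights(weak_classifiers, samples, labels):
    """Return one AdaBoost weight (alpha) per already-trained weak classifier.

    Each weak classifier is a callable mapping a feature vector to +1 or -1.
    """
    n = len(samples)
    w = [1.0 / n] * n                      # sample weights, uniform at the start
    alphas = []
    for h in weak_classifiers:
        preds = [h(x) for x in samples]
        err = sum(wi for wi, p, y in zip(w, preds, labels) if p != y)
        err = min(max(err, 1e-10), 1.0 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        alphas.append(alpha)
        # Emphasize the samples this classifier got wrong.
        w = [wi * math.exp(-alpha * y * p) for wi, p, y in zip(w, preds, labels)]
        total = sum(w)
        w = [wi / total for wi in w]
    return alphas

def strong_classify(weak_classifiers, alphas, x):
    """Weighted vote of the weak classifiers (the strong classifier margin)."""
    return sum(a * h(x) for a, h in zip(alphas, weak_classifiers))
```

A classifier that misclassifies a quarter of the (equally weighted) samples receives a weight of 0.5 ln 3, while one that is right only half of the time receives a weight of zero.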

5
Ensemble Tracking: Tracking

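Output of the strong classifier is $H(x) = \sum_{t=1}^{T} \alpha_t h_t(x)$, where T is the number
of weak classifiers $h_t$ and $\alpha_t$ are their AdaBoost weights
Convert the classifier output to a pseudo-probability
using a sigmoid function and create a confidence
map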
Track the object by applying mean-shift on the
confidence map
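
The last two steps can be sketched as below. The window-centroid form of mean-shift, the rounding to integer pixel positions, and the clamping of the margin are assumptions made for this illustration; `mean_shift` is a hypothetical name.

```python
import math

def sigmoid(margin):
    """Map a classifier margin to a pseudo-probability in (0, 1)."""
    margin = max(-60.0, min(60.0, margin))   # keep exp() well inside float range
    return 1.0 / (1.0 + math.exp(-margin))

def mean_shift(conf, cx, cy, half_w, half_h, max_iter=20):
    """Move a window to the weighted centroid of `conf` until it stops moving.

    `conf` is a 2-D list of confidences; (cx, cy) is the window center.
    """
    height, width = len(conf), len(conf[0])
    for _ in range(max_iter):
        mass = sx = sy = 0.0
        for y in range(max(cy - half_h, 0), min(cy + half_h + 1, height)):
            for x in range(max(cx - half_w, 0), min(cx + half_w + 1, width)):
                mass += conf[y][x]
                sx += conf[y][x] * x
                sy += conf[y][x] * y
        if mass == 0.0:
            break                            # nothing to follow inside the window
        nx, ny = int(round(sx / mass)), int(round(sy / mass))
        if (nx, ny) == (cx, cy):
            break                            # converged
        cx, cy = nx, ny
    return cx, cy
```

Starting a 9x9 window a few pixels away from a compact high-confidence blob, the window walks onto the blob in a couple of iterations. If the window starts too far away to overlap the blob, it stays on whatever local mass it sees.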

6
Ensemble Tracking: Update

Collect a new set of samples based on the new
object position
Remove the oldest weak classifier in the
Ensemble
Re-train remaining weak classifiers using
AdaBoost
Train a new weak classifier and add it to the
Ensemble

7
Some Minor Notes

We used an outlier rejection scheme which
sometimes threw out nearly every sample, so we limit
the number that can be removed
We give classifiers zero weight if they have an
error greater than 0.45
If during update too many classifiers have a zero
weight we train an extra classifier to ensure
there are enough good classifiers to track the
object in the next frame
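
One way to read the update step together with the zero-weight rule is sketched below. Only the 0.45 threshold comes from the slides; the callable `train_weak(samples, labels, weights)`, the parameter `min_good` (how many non-zero-weight classifiers count as "enough"), and the function names are hypothetical.

```python
import math
from collections import deque

MAX_ERROR = 0.45  # from the slide: above this error a classifier gets zero weight

def alpha_for(err):
    """AdaBoost weight, forced to zero for near-random classifiers."""
    if err > MAX_ERROR:
        return 0.0
    err = max(err, 1e-10)  # avoid log(0) for a perfect classifier
    return 0.5 * math.log((1.0 - err) / err)

def update_ensemble(ensemble, samples, labels, train_weak, min_good=2):
    """One update step on a deque of weak classifiers (oldest first).

    Returns the alphas, aligned with the updated ensemble.
    """
    n = len(samples)
    state = {"w": [1.0 / n] * n}            # sample weights, uniform at the start
    alphas = []

    def score_and_reweight(h):
        preds = [h(x) for x in samples]
        err = sum(wi for wi, p, y in zip(state["w"], preds, labels) if p != y)
        alpha = alpha_for(err)
        w = [wi * math.exp(-alpha * y * p)
             for wi, p, y in zip(state["w"], preds, labels)]
        total = sum(w)
        state["w"] = [wi / total for wi in w]
        alphas.append(alpha)

    ensemble.popleft()                      # remove the oldest weak classifier
    for h in ensemble:                      # re-weight the survivors on new samples
        score_and_reweight(h)

    good = sum(1 for a in alphas if a > 0.0)
    n_new = 1 if good >= min_good else 2    # extra classifier if too few are usable
    for _ in range(n_new):
        h = train_weak(samples, labels, state["w"])
        ensemble.append(h)
        score_and_reweight(h)
    return alphas
```

With `min_good=2` and only one surviving classifier below the error threshold, the sketch trains two new classifiers instead of one, so the ensemble temporarily grows by one.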

8
Ensemble Tracking: Results
9
Ensemble Tracking: Failings

Difficult to represent the overall object using
such tiny (5x5 pixel) regions
Typical mean-shift problem of finding a local
maximum and therefore obtaining poor tracking
results
Imperfect tracking gets worse over time, since after a drift
the new classifier is trained on background regions

10
Part Tracking: A New Approach

Try to find the parts of an object (arm, leg,
hood, wheel) and keep a list of these parts
Build an object template based on the spatial
relations between these parts
Track the object in future frames by sliding the
template and finding the best match
Update the template by removing either the oldest
or poorly performing parts and training
replacement parts

11
The Feature Types

Maintain two separate part ensembles, one of HoG
features and one of color features
Each list has the same number of parts
HoG features can be of any size and have 9
bins
Color features are 4x4x4 histograms
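
A minimal sketch of a 4x4x4 color histogram, assuming 8-bit RGB pixels and a histogram normalized to sum to 1 (the slide does not say whether the histograms are normalized). The name `color_histogram` is hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized 4x4x4 RGB histogram (64 bins) of 8-bit (r, g, b) pixels."""
    step = 256 // bins_per_channel           # 64 intensity levels per bin
    hist = [0.0] * bins_per_channel ** 3
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1.0
    n = len(pixels)
    return [h / n for h in hist] if n else hist
```

Each channel is quantized to four levels, so a pixel such as (64, 0, 0) lands in bin 16 and pure white lands in the last bin, 63.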

12
The Part Representation

Each part contains:
A feature vector
The position of the part in relation to the first
part in the list
An age
An average Euclidean distance between its feature vector
and similar-sized parts in the background (this is used for
evaluating performance and weighting)

13
Choosing the Best Part

An exhaustive search of every possible part
Calculate the average Euclidean distance between
proposed part and the background
Choose the part with the highest average distance
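
The selection rule can be sketched as follows, assuming the feature vectors of all candidate parts and of the background parts have already been extracted. The names `euclidean` and `choose_best_part` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def choose_best_part(candidates, background_features):
    """Return (index, score) of the candidate farthest, on average, from the background."""
    best_index, best_score = -1, -1.0
    for i, feat in enumerate(candidates):
        score = sum(euclidean(feat, bg) for bg in background_features) / len(background_features)
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score
```

The returned score is the same average distance that the part representation stores, so it can be kept alongside the part for later weighting.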

14
Tracking Using the Part Ensemble

Slide the parts as a whole, find the best match
based on a weighted vote of all parts
Similar to template matching, where the
template is generated on the fly
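
A rough sketch of the sliding search, under several assumptions: the slides do not say how a single part votes, so each part here contributes its weight divided by (1 + Euclidean distance); `extract(x, y)` is a hypothetical callable that returns the feature vector observed at an image position; and the search is a plain exhaustive scan of a square neighbourhood.

```python
import math

def match_score(parts, extract, x, y):
    """Weighted vote of all parts with the template anchored at (x, y).

    Each part is a dict with 'feature', 'offset' (dx, dy) and 'weight'.
    """
    score = 0.0
    for part in parts:
        dx, dy = part["offset"]
        observed = extract(x + dx, y + dy)
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(part["feature"], observed)))
        score += part["weight"] / (1.0 + dist)   # closer features cast a stronger vote
    return score

def track(parts, extract, cx, cy, radius):
    """Slide the whole template over a square neighbourhood; return the best anchor."""
    candidates = [(x, y)
                  for y in range(cy - radius, cy + radius + 1)
                  for x in range(cx - radius, cx + radius + 1)]
    return max(candidates, key=lambda p: match_score(parts, extract, p[0], p[1]))
```

Because every part is evaluated at its stored offset from the anchor, the parts move rigidly together, which is what makes the search behave like template matching.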

15
Updating the Part Ensemble

Remove the oldest part
Train a new part using the exhaustive search
method discussed previously
We also tried removing poorly performing
(low-weight) parts, but results degraded
Poorly performing parts will only last several
frames before they become the oldest and will be
removed

16
Results of the Part Tracker
17
Failure of the Part Tracker

Features are too sparse, difficult to track using
just a few weak, unstable features
Difficult to handle partial occlusions. If the
majority of parts become occluded quickly (in fewer
than N/2 frames), the tracker is unable to track the object
Drifting problem again. How do we know when it is
OK to train a new part, and when we are training
on background introduced from drifting?

18
Mean Shift Tracking

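Obtain the mean-shift vector y by maximizing the
Bhattacharyya coefficient, which is equivalent to
minimizing the distance $d(y) = \sqrt{1 - \rho[p(y), q]}$
Maximize $\rho[p(y), q] \approx \frac{1}{2} \sum_u \sqrt{p_u(y_0)\, q_u} + \frac{1}{2} \sum_u p_u(y) \sqrt{q_u / p_u(y_0)}$,
where p(y) is the candidate histogram at position y, q is the target histogram, and $y_0$ is the current position
The first term in $\rho$ is independent of y, so only
the second term is needed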

Bhattacharyya coefficient for a single bin u
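
For illustration, one mean-shift iteration in the style of Comaniciu et al. can be sketched as below. It assumes an Epanechnikov kernel profile, for which the new position reduces to a weighted mean of the pixel positions, with each pixel weighted by $\sqrt{q_u / p_u(y_0)}$ for its histogram bin u. The name `mean_shift_step` and the `(x, y, bin_index)` pixel layout are hypothetical.

```python
import math

def mean_shift_step(pixels, target_hist, candidate_hist):
    """One kernel-tracking step; returns the new window center or None.

    `pixels` is a list of (x, y, bin_index) tuples inside the current window.
    """
    sum_w = sum_x = sum_y = 0.0
    for x, y, u in pixels:
        if candidate_hist[u] > 0.0:
            # Pixels whose color is under-represented in the candidate pull harder.
            w = math.sqrt(target_hist[u] / candidate_hist[u])
            sum_w += w
            sum_x += w * x
            sum_y += w * y
    if sum_w == 0.0:
        return None                      # no pixel supports the target model
    return sum_x / sum_w, sum_y / sum_w
```

Pixels whose bin is absent from the target histogram get zero weight, so the window center moves toward pixels whose colors the target model expects.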
19
The Bhattacharyya Coefficient

Compares bin i from distribution A with bin i
from distribution B, so only corresponding bins
are matched
So the distance between two distributions is $d(A, B) = \sqrt{1 - \sum_i \sqrt{A_i B_i}}$
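
A small sketch of the bin-by-bin comparison, assuming both histograms are normalized to sum to 1 and using the distance $\sqrt{1 - \rho}$ from Comaniciu et al. The function names are hypothetical.

```python
import math

def bhattacharyya_coefficient(p, q):
    """Sum over corresponding bins of sqrt(p_i * q_i); 1.0 for identical histograms."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def bhattacharyya_distance(p, q):
    """Distance derived from the coefficient; 0.0 means identical, 1.0 means no overlap."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q)))
```

Two histograms with all of their mass in different bins are at distance 1.0 no matter how close those bins are to each other, because only corresponding bins are ever multiplied.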

20
The Earth Mover's Distance

Compares bins in distribution A with nearby bins
in distribution B
Allows for close matches, not as strict as the
Bhattacharyya coefficient
$\mathrm{EMD}(A, B) = \frac{\sum_i \sum_j c_{ij} f_{ij}}{\sum_j y_j}$
where $c_{ij}$ is a cost function (distance between
histogram bins i and j), $f_{ij}$ is the amount of flow from
bin i to bin j, and $y_j$ is the total amount of
flow to bin j (its sum is a normalization factor)
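
The general EMD requires solving a transportation problem, but a special case is easy to sketch: for two 1-D histograms of equal total mass with ground distance |i - j| between bins, the optimal flow cost equals the sum of absolute differences of the running totals. The function below covers only that special case; `emd_1d` is a hypothetical name.

```python
def emd_1d(a, b):
    """EMD between two 1-D histograms of equal length and equal total mass.

    The ground distance between bins i and j is |i - j|.
    """
    total = sum(a)
    if abs(total - sum(b)) > 1e-9:
        raise ValueError("histograms must have equal total mass")
    work = carried = 0.0
    for ai, bi in zip(a, b):
        carried += ai - bi      # mass that still has to be moved past this bin
        work += abs(carried)
    return work / total         # normalize by the total flow
```

Moving all of the mass by one bin costs 1.0 and moving it by two bins costs 2.0, so a near miss scores better than a far miss, unlike a measure that compares only corresponding bins.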

21
Combining EMD with Mean Shift

So the original equation of maximizing the
Bhattacharyya coefficient becomes a matter of
minimizing the EMD, where D(x, y) is the
Euclidean distance function

EMD for a single bin u
22
Results of the EMD-MS Tracker
23
Comparisons and Conclusions

No ground truth, so cannot make an absolute
comparison, only subjective!!!
Part-based tracker tends to get better
localization than the Ensemble Tracker, and the
length of time the object is able to be tracked
before being lost is roughly equal
Part-based tracker has fewer user-defined
parameters and is more ad hoc; the Ensemble Tracker
was developed and refined by several people
EMD-MS tracks for more frames than both the
Ensemble Tracker and the part-based tracker but
suffers from high-speed small-scale drifting (i.e.,
it jitters)
Speed: EMD-MS 25 Hz, Ensemble Tracker 10 Hz,
Part Tracker 7 Hz (?)
Tests were performed over 18 video sequences

24
References

S. Avidan. Ensemble Tracking. In Proc. IEEE
Conf. on Computer Vision and Pattern
Recognition, San Diego, CA, 2005.
N. Dalal and B. Triggs. Histograms of oriented
gradients for human detection. Conference on
Computer Vision and Pattern Recognition (CVPR),
2005.
Q. Zhu, S. Avidan, M.C. Yeh and K.T. Cheng. Fast
Human Detection Using a Cascade of Histograms of
Oriented Gradients. IEEE Computer Vision and
Pattern Recognition 2006 (CVPR 2006) June, NYC,
USA.
P. Viola and M. Jones. Rapid Object Detection
using a Boosted Cascade of Simple Features.
Conference on Computer Vision and Pattern
Recognition 2001 (CVPR 2001).
D. Comaniciu, V. Ramesh and P. Meer. Real-Time
Tracking of Non-Rigid Objects using Mean Shift.
IEEE Conf. on Computer Vision and Pattern
Recognition (CVPR), Hilton Head Island, South
Carolina, 2000.
Y. Rubner, C. Tomasi and L.J. Guibas. A Metric
for Distributions with Applications to Image
Databases. IEEE International Conference on
Computer Vision (ICCV), Bombay, India, 1998.
D. Wojtaszek, R. Laganière. Tracking and
Recognizing People in Colour using the Earth
Mover's Distance. IEEE International Workshop,
2002.