Title: Evaluation of UMD Object Tracking in Video
VACE Phase I Evaluations
- Multiple teams presented algorithms for various analysis tasks:
  - Text Detection and Tracking
  - Face Detection and Tracking
  - People Tracking
- Evaluation was handled by UMD/LAMP and PSU.
- Penn State devised metrics and ran evaluations.
- UMD generated ground truth and implemented metrics.
- ViPER was adapted for the new evaluations.
Penn State Developed Metrics
- Evaluations should provide a comprehensive, multifaceted view of the challenges of detection and tracking.
- Tracking methodologies developed:
  - Pixel-level frame analysis
  - Object-level aggregation
PSU Frame Evaluations
- Look at the results for each frame, one at a time.
- For each frame, apply a set of evaluation metrics, independent of the identity of each object (i.e., find the best match).
- These include (see the sketch after this list):
  - Object count precision and recall.
  - Pixel precision and recall over all objects in the frame.
  - Individual object pixel precision and recall measures.
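A minimal sketch of the per-frame pixel precision and recall, assuming binary masks over the frame; the function and argument names are illustrative, not taken from the PSU code:

```python
import numpy as np

def frame_pixel_precision_recall(truth_mask, result_mask):
    """Pixel precision/recall for one frame, ignoring object identity.

    truth_mask / result_mask: boolean arrays marking the pixels covered
    by any ground-truth / any result object in this frame.
    """
    matched = np.logical_and(truth_mask, result_mask).sum()
    n_result = result_mask.sum()
    n_truth = truth_mask.sum()
    precision = matched / n_result if n_result else 1.0
    recall = matched / n_truth if n_truth else 1.0
    return precision, recall
```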
PSU Frame Evaluation
PSU Object Aggregation for Tracking
- Assume object matching has already been done (first-frame correspondence).
- For the life of the object, aggregate some set of metrics (sketched below):
  - A set of distances for each frame.
  - Average over the life of the object, etc.
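A minimal sketch of the aggregation step, assuming each track is a dict mapping frame number to a box and that a per-frame `distance` callback is supplied; both are illustrative assumptions:

```python
def aggregate_over_life(truth_track, result_track, distance):
    """Average a per-frame distance over the frames where both the
    ground-truth object and its matched result object exist."""
    shared = sorted(set(truth_track) & set(result_track))
    if not shared:
        return float('inf')  # the two tracks never coexist in time
    return sum(distance(truth_track[f], result_track[f])
               for f in shared) / len(shared)
```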
But…
- Frame metrics throw away tracking information, since there is no frame-to-frame correspondence.
- Aggregated tracking metrics require a known matching.
- Frame metrics do not require unique labeling of objects to track.
- Confusion can occur with multiple objects in the scene.
- Most participants did multi-frame detection, not tracking.
- Even with the known matching, the metrics do not handle tracking adequately, including effects like confusion and occlusion.
- In both cases, the metrics simply sum over all frames.
- No unified metric across time and space exists.
UMD Maximal Optimal Matching
- Compute a score for each possible object match.
- Find the optimal correspondence (see the sketch after this list):
  - One-to-one match: assign each ground truth object at most one result object so that the total cost over all possible correspondences is minimized.
  - Multiple match: for each disjoint subset of ground truth objects, find the disjoint subset of output objects that minimizes the total cost.
- Compute the overall precision and recall. For S the size of the matching:
  - Precision = S / size(candidates)
  - Recall = S / size(targets)
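A minimal sketch of the one-to-one case, using the Hungarian algorithm via scipy's linear_sum_assignment; the `cost` callback and the `max_cost` cutoff are illustrative assumptions, not the actual UMD score:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def optimal_one_to_one(targets, candidates, cost, max_cost=1.0):
    """Optimally match ground-truth `targets` to result `candidates`,
    minimizing total cost; returns the matched pairs plus the overall
    precision and recall (S over the set sizes, as above)."""
    if not targets or not candidates:
        return [], 1.0 if not candidates else 0.0, 1.0 if not targets else 0.0
    costs = np.array([[cost(t, c) for c in candidates] for t in targets])
    rows, cols = linear_sum_assignment(costs)   # minimizes total cost
    pairs = [(r, c) for r, c in zip(rows, cols) if costs[r, c] <= max_cost]
    s = len(pairs)                              # S = size of the matching
    return pairs, s / len(candidates), s / len(targets)
```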
Maximal Optimal Matching Advantages
- Takes into account both space and time.
- Can be generalized to make no assumptions about space and time.
- Optimal 1-1 matching has many nice properties.
- Can handle many-to-many matching.
- By pruning the data to compute only on sequences that overlap in time, matching can be made tractable (sketched below).
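A minimal sketch of the pruning idea: only consider pairs of tracks whose frame spans overlap in time. The (first_frame, last_frame) span representation is an illustrative assumption:

```python
def overlapping_pairs(truth_spans, result_spans):
    """Indices (i, j) of truth/result tracks that coexist in time;
    only these pairs need a matching score computed."""
    return [(i, j)
            for i, (t0, t1) in enumerate(truth_spans)
            for j, (r0, r1) in enumerate(result_spans)
            if t0 <= r1 and r0 <= t1]  # inclusive spans overlap
```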
Object Matching
[Figure: correspondence between Truth Data and Result Data]
Experimental Results
- We reran the tracking experiments using the maximal optimal matching metrics.
- Add a description of the data.
- Add a description of the algorithms used for static and moving cameras.
- Show graphs for our metrics vs. the PSU metrics.
Example Tracking Text Frame
Example Tracking Text Tracking
Example Tracking Text Object
Example Person Tracking Frame
Example Person Tracking Object
Claims
- Metrics provide for true tracking evaluation (not just aggregated detection).
- Tolerances can still be set on various components of the distance measure.
- Provides a single point of comparison.
Fin
- Dr. David Doermann
- Dr. Rangachar Kasturi
- David Mihalcik
- Ilya Makedon
- many others
Tracking Graphs
Object Level Matching
- The most obvious solution is many-to-many matching.
- It allows matching on any data type, at a price.
Pixel-Frame-Box Metrics
- Look at each frame and ask a specific question about its contents:
  - Number of pixels correctly matched.
  - Number of boxes that have some overlap, or overlap greater than some threshold.
  - How many boxes overlap a given box? (Fragmentation; sketched below)
- Look at all frames and ask a question:
  - Number of frames correctly detected.
  - Proper number of objects counted.
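A minimal sketch of the box-overlap questions, assuming axis-aligned boxes as (x1, y1, x2, y2) tuples; the representation and the threshold are illustrative:

```python
def overlap_area(a, b):
    """Area of intersection of two axis-aligned boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

def fragmentation(truth_box, result_boxes, threshold=0.0):
    """How many result boxes overlap a given ground-truth box by more
    than `threshold` area, i.e. the fragmentation count."""
    return sum(1 for r in result_boxes
               if overlap_area(truth_box, r) > threshold)
```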
Individual Box Tracking Metrics
- Mostly useful for the retrieval problem, this approach looks at pairs consisting of a ground truth box and a result box.
- Metrics are (see the sketch below):
  - Position
  - Size
  - Orientation
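A minimal sketch of the three per-pair distances, assuming boxes carry a center, a size, and an orientation; the field names and the particular distance choices are illustrative:

```python
import math

def box_distances(truth, result):
    """Position, size, and orientation distances for one box pair.
    Boxes are dicts with cx, cy (center), w, h (size), and theta
    (orientation in radians)."""
    position = math.hypot(truth['cx'] - result['cx'],
                          truth['cy'] - result['cy'])
    size = abs(truth['w'] * truth['h'] - result['w'] * result['h'])
    orientation = abs(truth['theta'] - result['theta']) % math.pi
    return position, size, orientation
```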
Questions: Ignoring Ground Truth
- Assume the evaluation routine is given a set of objects to ignore (or rules for determining what type of object to ignore). How does this affect the output?
- For pixel measures, just don't count pixels in ignored regions. This works for the Tracking and Frame evaluations (sketched below).
- For object matches, do the complete match; when finished, ignore result data that matches ignored truth.
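A minimal sketch of pixel counting with ignored regions, assuming boolean masks where `ignore_mask` marks the pixels of ignored objects; all names are illustrative:

```python
import numpy as np

def masked_pixel_counts(truth_mask, result_mask, ignore_mask):
    """Matched/truth/result pixel counts, skipping ignored regions."""
    keep = ~ignore_mask
    matched = np.logical_and(truth_mask, result_mask)[keep].sum()
    return matched, truth_mask[keep].sum(), result_mask[keep].sum()
```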
Questions: Presenting the Results
- Some basic graphs are built in:
  - Line graphs for individual metrics
  - Bar charts showing several metrics
- For custom graphs, you have to do it yourself:
  - ROC curves
  - Scatter plots