Title: Change Detection in Shape Dynamical Models
1. Change Detection in Shape Dynamical Models: Application to Activity Recognition
- Namrata Vaswani
- Dept. of Electrical and Computer Engineering
- University of Maryland, College Park
- http://www.cfar.umd.edu/namrata
2. Acknowledgements
- Part of this work is joint work with Dr. Amit Roy Chowdhury and Prof. Rama Chellappa.
3. Overview
- The Group Activity Recognition Problem
- Slow and Drastic Change Detection
- Landmark Shape Dynamical Models
- Experiments and Results
4. The Group Activity Recognition Problem
5. Problem Formulation
- The Problem
  - Model activities performed by a group of moving and interacting objects (which can be people, vehicles, robots, or different parts of the human body). Use the models for abnormal activity detection and tracking.
- Our Approach
  - Treat objects as point objects (landmarks)
  - Changing configuration of objects → deforming shape
  - Abnormality → change from the learnt shape dynamics
- Related Approaches for Group Activity
  - Co-occurrence statistics, Dynamic Bayes Nets
6. Bayesian Approach
- Define a stochastic state-space model (a continuous-state HMM) for shape deformations in a given activity, with shape and motion forming the hidden state vector and the configuration of objects forming the observation.
- Use a particle filter to track a given observation sequence, i.e. estimate the hidden state given the observations.
- Define abnormality as a slow or drastic change in the shape dynamics with unknown change parameters. We propose statistics for slow and drastic change detection.
7. Human Action Tracking
[Figure: tracking result. Cyan = Observed, Green = Ground Truth, Red = SSA, Blue = NSSA]
8. Slow and Drastic Change Detection in Continuous-State HMMs
9. The Problem
- General Hidden Markov Model (HMM): state sequence X_t, observation sequence Y_t
- A finite-duration change in the system model causes a permanent change in the probability distribution of the state
- Change parameters unknown: Log LRT(X_t) → LL(X_t)
- Noisy observations: LL(X_t) → E[LL(X_t) | Y_{1:t}] = ELL
- Nonlinear dynamics: particle-filtered estimate of ELL
- Slow or drastic change: ELL for slow, OL/TE for drastic
10. Related Work
- Change detection in nonlinear systems using a PF
  - Known change parameters, sudden change
    - Log-LRT of the current observation given past observations
    - Multimode system: detect a change in mode
  - Unknown change parameters, sudden change
    - Generalized LRT
    - Tracking Error (TE)
    - Negative log likelihood of the current observation given the past (OL)
- Average log likelihood of i.i.d. observations is used often
- But ELL = E[LL(X_t) | Y_{1:t}] (the MMSE estimate of LL given the observations) in the context of HMMs is new
11. Particle Filtering
12. Change Detection Statistics
- Drastic change
  - Tracking Error (TE). If Gaussian noise, TE ≈ OL
  - Negative log of the current observation likelihood given the past (OL):
    OL = −log Pr(Y_t | Y_{0:t-1}, H0) = −log ⟨Q_t p_{t-1}, ψ_t⟩
- Slow change: propose the Expected Log Likelihood (ELL)
  - ELL = Kerridge Inaccuracy between p_t and p_t^0:
    ELL(Y_{1:t}) = E[−log p_t^0(X_t) | Y_{1:t}] = E_p[−log p_t^0(X_t)] = K(p_t : p_t^0)
- Detectable changes using ELL
  - E[ELL(Y_{1:t}^0)] = K(p_t^0 : p_t^0) = H(p_t^0),  E[ELL(Y_{1:t}^c)] = K(p_t^c : p_t^0)
  - Chebyshev inequality: with false alarm and miss probabilities of 1/9, ELL detects all changes s.t.
    K(p_t^c : p_t^0) − H(p_t^0) > 3 [sqrt(Var ELL(Y_{1:t}^c)) + sqrt(Var ELL(Y_{1:t}^0))]
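The ELL statistic above is just the posterior expectation of −log p_t^0, so with a particle filter it is estimated by averaging over the particles. A minimal sketch under assumed stand-ins: the particles here are drawn directly rather than filtered, and p_t^0 is a toy N(0,1) density, not a learnt shape model.

```python
import numpy as np

def ell_estimate(particles, logpdf_normal_model):
    """Particle estimate of ELL = E[-log p_t^0(X_t) | Y_{1:t}].

    particles: (N,) or (N, d) array approximating the posterior p_t.
    logpdf_normal_model: log p_t^0(x) under the learnt normal-activity
    model (a hypothetical stand-in here).
    """
    return -np.mean(logpdf_normal_model(particles))

# Toy check: a posterior matched to p_t^0 = N(0,1) gives ELL near the
# entropy H(p_t^0) = 0.5*log(2*pi*e); a shifted (changed) posterior
# gives a strictly larger ELL, which is what the detector thresholds.
rng = np.random.default_rng(0)
logp0 = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
ell_normal = ell_estimate(rng.normal(0.0, 1.0, 10000), logp0)
ell_changed = ell_estimate(rng.normal(3.0, 1.0, 10000), logp0)
```

The gap `ell_changed − ell_normal` corresponds to K(p_t^c : p_t^0) − H(p_t^0), the detectability margin in the Chebyshev bound above.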
13. OL and ELL: Slow and Drastic Change
- Problem with TE or OL: they fail to detect slow changes
  - The particle filter tracks slow changes correctly
  - Assuming the change up to t−1 was tracked correctly (error in the posterior small), OL only uses the change introduced at t
  - ELL works because it uses the total change in the posterior up to time t: since the PF tracks the posterior correctly for a slow change, ELL is approximated correctly
- Problem with ELL: it fails to detect drastic changes
  - Approximating the posterior of the changed system using a PF for the unchanged system → large error for drastic changes
  - OL relies on the error introduced by the change to detect it, so it works for drastic changes
- ELL detects the change before loss of track, OL/TE after
14. A Simulated Example
- Change introduced in the system model from t=5 to t=15
[Figures: Tracking Error (or OL); ELL]
15. Detecting Changes
- ELL, if approximated accurately (|ELL^{c,c} − ELL^{c,0,N}| small), will detect all changes as soon as they become detectable, whereas OL detects only when OL^{c,0} is significantly larger than OL^{0,0}
- Since the change parameters are unknown, we estimate p_t^{c,0,N} (the posterior for changed observations using a PF optimal for the unchanged system); this differs from the actual p_t^{c,c} and introduces errors
- If the PF is stable, ||p_t^{c,0,N} − p_t^{c,c}|| < increasing function of the rate of change
- Slow change: ||p_t^{c,0,N} − p_t^{c,c}|| is small, so ELL is approximated accurately and detects. But OL^{c,0} is close to OL^{c,c} ≈ OL^{0,0}, so OL fails. Vice versa for drastic changes
16. Practical Issues
- Defining p_t^0(x)
  - Either use the part of the state vector which has linear Gaussian dynamics, so that p_t(x) can be defined in closed form,
  - or assume a parametric family for p_t(x) and learn its parameters from training data (assume p_t(x) piecewise constant over time)
- Declare a change when either ELL or OL/TE exceeds its respective threshold
  - Set the ELL threshold to H(p_t^0) + 3 sqrt(Var ELL(Y_{1:t}^0))
  - Set the OL threshold a little above E[OL^{0,0}] = H(Y_t | Y_{1:t-1})
- Single-frame estimates of ELL or OL/TE may be noisy
  - Average the statistic, or average the number of detects, or modify CUSUM
17. Approximation Errors
- Total error ≤ Bounding error + Model error + PF error
- Bounding error: stability results hold only for bounded functions, but LL is unbounded.
  BE = |ELL_t^{c,c} − ELL_t^{c,c,M}|
- Model error: error between exact filtering with the original (changed) system model and with the unchanged model.
  ME = |ELL_t^{c,c,M} − ELL_t^{c,0,M}|
- PF error: error between exact filtering and particle filtering with the unchanged model.
  PE = |ELL_t^{c,0,M} − ELL_t^{c,0,M,N}|
18. Asymptotic Stability, Stability with t
- The error in ELL estimation, averaged over observation sequences and PF realizations, is asymptotically stable if
  - the change lasts for a finite time,
  - the unnormalized filter kernels are uniformly mixing, and
  - certain boundedness assumptions hold
- Stability (monotonic decrease of the error) if the kernels are only mixing
- The analysis generalizes to errors in the MMSE estimate of any function of the state evaluated using a PF with system model error
19. Asymptotic Stability
1. If (i) the change lasts for a finite time, (ii) the unnormalized filter kernels are uniformly mixing, (iii) the posterior state space is bounded and the increase of M_t (the bound on the expected value of LL) with t is polynomial, and (iv) a moment condition holds for all t, then [result on slide]
2. If (i), (ii), (iii') LL is unbounded but the expected value of its bounded approximation converges to the true value uniformly in t, and (iv), then [result on slide]
- Both 1 and 2 imply: the error averaged over observation sequences and PF runs is asymptotically stable
20. Stability
1. If (i), (ii') the unnormalized filter kernels are mixing, and (iii), then
   lim_{N→∞} (error averaged over observation sequences and PF runs) is stable if the mixing rate is eventually strictly decreasing
2. If (i), (ii') mixing, and (iii') bounded posterior state space, then
   lim_{N→∞} (error averaged over PF runs)/M_t is stable almost surely for all observation sequences if the mixing rate is strictly decreasing
21. Unnormalized Filter Kernel Mixing
- The unnormalized filter kernel, R_t, is the state transition kernel, Q_t, weighted by the observation likelihood given the state
- Mixing measures the rate at which the transition kernel forgets its initial condition, or equivalently how quickly the state sequence becomes ergodic [definition on slide]
- Example: the state transition X_t = X_{t-1} + n_t alone is not mixing. But if Y_t = h(X_t) + w_t, with w_t truncated noise, then R_t is mixing
22. Complementary Behavior of ELL and OL
- We have shown that e_t^{c,0} = ELL_t^{c,c} − ELL_t^{c,0,N} is upper bounded by an increasing function of OL_k^{c,0}, t_c < k ≤ t
- Implication (assume a detectable change, i.e. ELL^{c,c} large):
  - OL fails ⟹ OL_k^{c,0}, t_c < k ≤ t, all small ⟹ ELL error e_t^{c,0} small ⟹ ELL^{c,0} large ⟹ ELL detects
  - ELL fails ⟹ ELL^{c,0} small ⟹ ELL error e_t^{c,0} large ⟹ at least one of OL_k^{c,0}, t_c < k ≤ t, large ⟹ OL detects
23. Rate of Change Bound
- The total error in ELL estimation is upper bounded by increasing functions of the rate of change (or system model error per time step), with all derivatives increasing
- OL^{c,0} is upper bounded by an increasing function of the rate of change
- The metric for the rate of change (or equivalently the system model error per time step) for a given observation Y_t, D_{Q,t}, is [equation on slide]

24. The Bound
Assume: the change lasts for a finite time, the unnormalized filter kernels are mixing, and the posterior state space is bounded. [Bound on slide]
25. Implications
- If the change is slow, ELL works and OL does not
- The ELL error can blow up very quickly as the rate of change increases (its upper bound blows up)
- A small error in both the normal and changed system models introduces less total error than a perfect transition kernel for the normal system combined with a large error in the changed system
- A sequence of small changes introduces less total error than one drastic change of the same magnitude
26. More Practical Issues
- Estimates from single frames are noisy and affected by outliers
  - Average the number of detects over the past p time instants,
  - or average the statistic over the past p time instants:
    aOL(p) = (1/p) [−log Pr(Y_{t-p+1:t} | Y_{0:t-p}, H0)]
  - Either average ELL (aELL), or use the joint ELL over the past p states:
    jELL(p) = (1/p) E[−log p_{t-p:t}(X_{t-p:t}) | Y_{0:t}]
  - Or modify CUSUM for unknown change parameters, i.e. declare a change if max_{1≤p≤t} d_p > γ, with d_p = Statistic(p) − T_{p,t}
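The simplest of the remedies above, averaging the statistic over the past p frames before thresholding, can be sketched as follows. The statistic sequence, window length, and threshold here are toy assumptions, not the deck's actual ELL/OL values.

```python
import numpy as np

def averaged_detect(stats, p, threshold):
    """Declare a change when the statistic averaged over the past p
    frames exceeds the threshold (smooths single-frame noise).
    Returns the first detection time, or None if never detected."""
    for t in range(p - 1, len(stats)):
        if np.mean(stats[t - p + 1 : t + 1]) > threshold:
            return t
    return None

# Toy statistic: noise around 0, jumping to noise around 4 at t = 50.
rng = np.random.default_rng(1)
s = np.concatenate([rng.normal(0, 1, 50), rng.normal(4, 1, 50)])
t_hat = averaged_detect(s, p=5, threshold=2.0)
```

Single-frame thresholding of the same sequence would fire on noise spikes before t = 50; the window trades a few frames of delay for robustness, which is exactly the noise/delay trade-off the slide describes.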
27. We Have Shown
- Asymptotic stability of the errors in ELL estimation if the change lasts for a finite time, the unnormalized filter kernels are uniformly mixing, and some boundedness assumptions hold
- Stability for large N if the kernels are only mixing
- ELL error upper bounded by an increasing function of OL^{c,0}: ELL works when OL fails and vice versa
- ELL error upper bounded by an increasing function of the rate of change, with increasing derivatives of all orders. OL^{c,0} upper bounded by an increasing function of the rate of change
- The analysis generalizes to errors in the MMSE estimate of any function of the state evaluated using a PF with system model error
28. Applications / Possible Applications
- Surveillance: abnormal activity detection
- Medical applications: detect motion disorders by modeling normal human actions using Shape Activity models
- ELL + PSSA model for activity segmentation
- Neural signal processing: detecting changes in stimuli
- Congestion detection
- System model change detection in target tracking problems, before the tracker loses track
29. Landmark Shape Dynamical Models
30. What is Shape?
- Shape is the geometric information that remains when location, scale and rotation effects are filtered out [Kendall]
- Shape of k landmarks in 2D
  - Represent the x and y coordinates of the k points as a k-dimensional complex vector: the Configuration
  - Translation normalization: Centered Configuration
  - Scale normalization: Pre-shape
  - Rotation normalization: Shape
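The three normalizations above can be sketched directly on complex landmark vectors. This is a minimal illustration of the Kendall pipeline; the function names and the closed-form rotation (optimal angle from the complex inner product) are my own framing of the standard construction.

```python
import numpy as np

def preshape(config):
    """Translation + scale normalization of a k-point configuration.
    config: complex array of length k (x + iy landmark coordinates)."""
    centered = config - config.mean()           # centered configuration
    return centered / np.linalg.norm(centered)  # pre-shape (unit norm)

def rotation_align(w, z):
    """Rotate pre-shape w to best align with pre-shape z, removing the
    rotation effect; the aligned vector represents the shape of w."""
    theta = np.angle(np.vdot(w, z))   # optimal rotation angle
    return w * np.exp(1j * theta)

# Check: a translated, scaled, rotated copy of a configuration has the
# same shape as the original after all three normalizations.
rng = np.random.default_rng(2)
z = preshape(rng.normal(size=5) + 1j * rng.normal(size=5))
moved = 2.5 * np.exp(1j * 0.7) * (z + (1.0 + 2.0j))  # similarity transform
w = rotation_align(preshape(moved), z)
```

After the three steps, `w` and `z` coincide, which is precisely the invariance that motivates working in shape space.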
31. Activities on the Shape Sphere in C^{k-1}
32. Related Work
- Related approaches for group activity
  - Co-occurrence statistics
  - Dynamic Bayesian Networks
- Shape analysis / deformation
  - PDMs, thin-plate splines, principal and partial warps
  - Active Shape Models: affine deformation in configuration space
  - Deformotion: Euclidean motion of the average shape + deformation
  - Piecewise geodesic models for tracking on Grassmann manifolds
- Particle filters for multiple moving objects
  - JPDAF (Joint Probability Data Association Filter): difficult to define complicated interactions between objects
33. Motivation
- Obtain a generic, sensor-invariant approach for activities performed by multiple moving objects; easy to fuse sensors
- Why shape: invariant to translation, zoom and in-plane rotation of the camera
- A single global framework for modeling motion and interactions; co-occurrence statistics requires individual and joint histograms
- A new framework to track a group of interacting moving objects: we know that the group is constrained to move in a certain fashion defined by the activity
- Active Shape Models are good for approximately rigid objects (small non-rigidity introduced by camera motion)
34. The HMM
- Observation, Y_t: centered configurations
- State, X_t = [μ_t, c_t, s_t, θ_t]
  - Current shape (μ_t)
  - Shape velocity (c_t): tangent coordinates w.r.t. μ_t
  - Scale (s_t)
  - Rotation angle (θ_t)
- Use complex vector notation to simplify the equations
- Use a particle filter to approximate the optimal nonlinear filter, p_t(dx) = Pr(X_t ∈ dx | Y_{0:t}), the posterior state distribution conditioned on the observations up to time t, by an N-particle empirical estimate of p_t
35. State Dynamics
- Shape dynamics
  - Define the shape velocity at time t in the tangent space w.r.t. the current shape, μ_t
  - The tangent space is a vector space: define a linear Gauss-Markov model for the shape speed, c_t
  - Move μ_t by an amount c_t on the shape manifold to get μ_{t+1}
- Motion dynamics
  - Linear Gauss-Markov dynamics for log s_t and the unwrapped θ_t
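The "move μ_t by c_t on the manifold" step above is a geodesic step. As an illustration only, here it is on an ordinary real unit sphere (whose exponential map has the same closed form); the deck's actual model moves on the complex pre-shape/shape manifold, which this sketch does not reproduce.

```python
import numpy as np

def sphere_step(mu, c):
    """Geodesic (exponential-map) step on the unit sphere: move the
    unit-norm point mu along tangent vector c (c orthogonal to mu).
    A toy stand-in for 'move mu_t by c_t to get mu_{t+1}'."""
    nc = np.linalg.norm(c)
    if nc < 1e-12:
        return mu
    return np.cos(nc) * mu + np.sin(nc) * (c / nc)

mu = np.array([1.0, 0.0, 0.0])
c = np.array([0.0, 0.3, 0.0])      # tangent vector, orthogonal to mu
mu_next = sphere_step(mu, c)        # stays on the sphere
```

The key property, mirrored from the slide, is that the step keeps the shape on the manifold (unit norm) while the velocity itself lives in a flat tangent space where a linear Gauss-Markov model makes sense.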
36HMM Equations
Observation Model Map Shape,Motion?Centered
Config.
System Model Shape and Motion Dynamics
Shape Dynamics
Motion Dynamics
- Linear Gauss-Markov models for log st and ?t
- Can be stationary or non-stationary
37. Special Cases
- Stationary Shape Activity (SSA): μ_t = μ, constant
  - Models shape variation in a single tangent space w.r.t. the mean shape
  - Track normal behavior, detect abnormality
- Non-Stationary Shape Activity (NSSA): μ_t changes for all t
  - The tangent space changes at every time instant
  - Most flexible: detect abnormality and also track it
- Piecewise Stationary Shape Activity (PSSA): μ_t piecewise constant
  - Change times can be fixed or decided on the fly using ELL
  - PSSA + ELL → activity segmentation
38. Stationary, Non-Stationary
39. Stationary Shape Activity
- The mean shape is constant, so set μ_t = μ (the Procrustes mean) for all t; μ_t is not part of the state vector; learn the mean shape from training data
- Define a single tangent space w.r.t. μ: the shape dynamics simplifies to a linear Gauss-Markov model in the tangent space
- Since the shape space is not a vector space, the data mean may not lie in the shape space; instead evaluate the Procrustes mean, an intrinsic mean on the shape manifold
40. What is the Procrustes Mean?
- The Procrustes mean, μ, minimizes the sum of squares of the Procrustes distances of a set of pre-shapes from itself
- The Procrustes distance is the Euclidean distance between the Procrustes fit of one pre-shape onto another
- Procrustes fit: scale and rotate a pre-shape to optimally align it with another pre-shape
- Optimally = minimum Euclidean distance between the two pre-shapes after alignment
41. Learning the Procrustes Mean [Dryden and Mardia]
- Translation and scale normalization of the configuration → pre-shape
- Procrustes fit of pre-shape w onto y [equation on slide]
- Procrustes distance [equation on slide]

42. Learning the Procrustes Mean (contd.)
- Procrustes mean of a set of pre-shapes w_i [equation on slide]
- Shape z_i = Procrustes fit of w_i onto the mean, μ
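The full Procrustes mean of a set of pre-shapes can be computed as the dominant eigenvector of the sum of complex outer products, a standard result from Dryden and Mardia. A sketch under that assumption (the variable names and the small test set are mine):

```python
import numpy as np

def procrustes_mean(preshapes):
    """Full Procrustes mean of pre-shapes: the dominant eigenvector of
    S = sum_i w_i w_i^H (defined up to an arbitrary rotation/phase)."""
    W = np.asarray(preshapes)       # (n, k) complex, rows are pre-shapes
    S = W.T @ W.conj()              # S[a,b] = sum_i w_i[a] * conj(w_i[b])
    vals, vecs = np.linalg.eigh(S)  # ascending eigenvalues
    mu = vecs[:, -1]                # dominant eigenvector
    return mu / np.linalg.norm(mu)

# Rotated, lightly perturbed copies of one pre-shape should have a mean
# aligned with the original (up to phase).
rng = np.random.default_rng(3)
z = rng.normal(size=6) + 1j * rng.normal(size=6)
z = z - z.mean(); z /= np.linalg.norm(z)
ws = []
for th in (0.0, 0.5, 1.0, -0.8):
    w = np.exp(1j * th) * z + 0.01 * (rng.normal(size=6) + 1j * rng.normal(size=6))
    w = w - w.mean(); w /= np.linalg.norm(w)
    ws.append(w)
mu = procrustes_mean(ws)
```

Because the mean is only defined up to rotation, alignment is checked through the modulus of the complex inner product |⟨μ, z⟩| rather than elementwise equality.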
43. Learning Stationary Shape Dynamics
- Learn the Procrustes mean, μ
- Learn a linear Gauss-Markov (G-M) model in the tangent space
- Learnt parameters: μ, Σ_v, A, Σ_n
44. Abnormal Activity Detection
- Define an abnormal activity as a slow or drastic change in the shape statistics with the change parameters unknown
- The system is a nonlinear HMM, tracked using a PF
  - This motivated the research on slow and drastic change detection in general HMMs
- Tracking Error detects drastic changes; we proposed a statistic called ELL for slow changes
- Use a combination of ELL and Tracking Error, and declare a change if either exceeds its threshold
45. Applications / Possible Applications
- Modeling group activity to detect suspicious behavior
  - Airport example
  - Lane-change detection in traffic
- Model human actions: track a given sequence of actions, detect abnormal actions (medical application to detect motion disorders)
- Activity sequence segmentation, unsupervised training
- Sensor independent: IR / Radar / Seismic
- Robotics, Medical Image Processing
46. Apply to Gait Verification
- Model different parts of the human body (head, torso, fore and hind arms and legs) as landmarks
- Learn the landmark shape dynamics for different people's gaits
- Verification: given a test sequence and a possible match (say, from a face recognition stage), verify whether the match is correct
  - Start tracking the test sequence using the shape dynamical model of the possible match
  - If the dynamics does not match at all, the PF will lose track
  - If the dynamics is close but not correct, ELL w.r.t. the possible match will exceed its threshold
47. Gait Recognition
- System identification approach
  - Assuming the test sequence has negligible observation noise, learn the shape dynamical model parameters for the test sequence
  - Find the distance of the parameters for the test sequence from those for the different people in the database (similar idea to [Soatto], [Ashok])
- Match time series of shape velocity of probe and gallery
  - Save the shape velocity sequence for the different people in the database
  - For a test sequence, estimate the shape velocity sequence and use DTW [Kale] to match it against all people's gaits
48. Experiments and Results
49. Why a Particle Filter?
- N does not increase (much) with increasing state dimension: the posterior distribution of the state is approximated only in the high-probability regions (so a fixed N works for all t, with the effective state space at t being D_t)
- Better than the Extended KF because of asymptotic stability
- Able to track in spite of a wrong initial distribution
- Gets back on track after losing track due to an outlier observation
- Slowly changing system: able to track it and yet detect the change using ELL (explained later)
- Can handle multi-modal priors/posteriors; the EKF cannot
50. Time-Varying Number of Landmarks?
- Ill-posed problem: interpolate the curve formed by joining the landmarks and re-sample it to a fixed number of landmarks, k
- Experimented with two interpolation/re-sampling schemes
  - Uniform: re-samples independently along x and y
    - Assumes the observed landmarks are uniformly sampled from some continuous function of a dummy variable s
    - All observed landmarks get equal weight while re-sampling
    - Very sensitive to a change in the number of landmarks, but also able to detect abnormality caused by two closely spaced points
  - Arc-length: parameterizes the x and y coordinates by the length of the arc up to that landmark
    - Assumes the observed landmarks are non-uniformly sampled points from continuous functions of arc length, x(l) and y(l)
    - Smooths out the motion of closely spaced points, and thus misses abnormality caused by two closely spaced points
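The arc-length scheme above can be sketched with 1-D interpolation of x(l) and y(l) against cumulative segment length. This is my own minimal rendering of the idea, not the deck's implementation.

```python
import numpy as np

def resample_arclength(pts, k):
    """Re-sample an ordered landmark sequence to k points, with x and y
    parameterized by cumulative arc length up to each landmark."""
    pts = np.asarray(pts, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # segment lengths
    l = np.concatenate([[0.0], np.cumsum(seg)])          # arc length l_i
    l_new = np.linspace(0.0, l[-1], k)                   # equally spaced in l
    return np.column_stack([np.interp(l_new, l, pts[:, 0]),
                            np.interp(l_new, l, pts[:, 1])])

# A unit square traced by 5 landmarks, re-sampled to 9 equally spaced
# points along its perimeter (total length 4, spacing 0.5).
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]], float)
out = resample_arclength(square, 9)
```

Because spacing is uniform in arc length, two original landmarks that sit very close together contribute almost nothing to the output, which is exactly the smoothing behavior that makes this scheme miss closely-spaced-point abnormalities.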
51. Experiments
- Group activity
  - Normal activity: a group of people deplaning and walking towards the airport terminal; used the SSA model
  - Abnormality: a person walks away in an un-allowed direction, distorting the normal shape
  - Simulated walking speeds of 1, 2, 4, 16, 32 pixels per time step (slow to drastic distortion in shape)
  - Compared detection delays using TE and ELL
  - Plotted ROC curves to compare performance
- Human actions
  - Defined an NSSA model for tracking a figure skater
  - Abnormality: abnormal motion of one body part
  - Able to detect as well as track slow abnormality
52. Normal/Abnormal Activity
[Figures: Normal Activity; Abnormal Activity]
53. Abnormality
- Abnormality introduced at t = 5
- Observation noise variance = 9
- The OL plot is very similar to the TE plot (both are the same to first order)
[Figures: Tracking Error (TE); ELL]
54. ROC: ELL
- Plot of detection delay against the mean time between false alarms (MTBFA) for varying detection thresholds
- Plots for increasing observation noise
- Drastic change: ELL fails. Slow change: ELL works
55. ROC: Tracking Error (TE)
- ELL: detection delay ≈ 7 for a slow change, ≈ 60 for a drastic change
- TE: detection delay ≈ 29 for a slow change, ≈ 4 for a drastic change
- Slow change: TE fails. Drastic change: TE works
56. ROC: Combined ELL-TE
- Plots for observation noise variance = 81 (the maximum)
- Detection delay < 8 achieved for all rates of change
57. Human Action Tracking
[Figure: tracking result. Cyan = Observed, Green = Ground Truth, Red = SSA, Blue = NSSA]
58. Normal Action: SSA Better than NSSA. Abnormality: NSSA Works, SSA Fails
[Figures: Green = Observed, Magenta = SSA, Blue = NSSA]
59. NSSA Tracks and Detects Abnormality
[Figures: Tracking Error; ELL. Red = SSA, Blue = NSSA. Frame overlays: Green = Observed, Magenta = SSA, Blue = NSSA]
60. Temporal Abnormality
- Abnormality introduced at t = 5, observation noise variance = 81
- Detected using uniform re-sampling; not detected using arc-length
61. Contributions
- Slow and drastic change detection in general HMMs using particle filters. We have shown:
  - Asymptotic stability / stability of the errors in the ELL approximation
  - Complementary behavior of ELL and OL for slow and drastic changes
  - The upper bound on the ELL error is an increasing function of the rate of change, with all derivatives increasing
- Stochastic state-space models (HMMs) for simultaneously moving and deforming shapes
  - Stationary, non-stationary and piecewise stationary cases
  - Group activity and human action modeling, detecting abnormality
  - NSSA for tracking slow abnormality, ELL for detecting it
  - PSSA + ELL: apply to activity segmentation
62. Other Contributions
- A linear subspace algorithm for pattern classification motivated by PCA
  - Approximates the optimal Bayes classifier for Gaussian pdfs with unequal covariance matrices
  - Useful for "apples from oranges" type problems
  - Derived a tight upper bound on its classification error probability
  - Compared performance with Subspace LDA, both analytically and experimentally
  - Applied to object recognition, face recognition under large pose variation, and action retrieval
- Fast algorithms for infra-red image compression
63. Ongoing and Future Work
- Change detection
  - The bound on the errors is an increasing function of the rate of change: implications
  - CUSUM algorithm, applications to other problems
- Non-stationary and piecewise stationary shape activities
  - Application to sequences of different kinds of actions
  - PSSA + ELL for activity segmentation
- Time-varying number of landmarks?
  - What is the best strategy to get a fixed number k of landmarks?
  - Can we deal with a changing dimension of the shape space?
- Sequences of activities, multiple simultaneous activities
- Multi-sensor fusion, 3D shape, general shape spaces
64. Special Cases
- For an i.i.d. observation sequence, Y_t = h(X_t) + w_t:
  - ELL(Y_{0:t}) = E[−log p_t(X_t) | Y_{0:t}] = E[−log p_t(X_t) | Y_t]
  - ≈ −log p_t(h^{-1}(Y_t) − E[h^{-1}(w_t)]) = −log p_t(h^{-1}(Y_t)) = OL(Y_t) + const., if E[h^{-1}(w_t)] = 0
- For the zero (negligible) observation noise case, Y_t = h(X_t):
  - ELL(Y_{0:t}) = E[−log p_t(X_t) | Y_{0:t}] = −log p_t(h^{-1}(Y_t)) = OL(Y_t) + const.
65. Particle Filtering Algorithm
- At t = 0, generate N Monte Carlo samples from the initial state distribution, p_0
- For all t:
  - Prediction: given the posterior at t−1 as an empirical distribution, p_{t-1}^N, sample from the state transition kernel Q_t(x_{t-1}, dx_t) to generate samples from the prediction distribution p_{t|t-1}^N
  - Update/Correction:
    - Weight each sample of p_{t|t-1}^N by the probability of the observation given that sample, ψ_t(Y_t | x)
    - Use multinomial sampling to resample from these weighted particles, generating particles distributed according to p_t^N
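The prediction / weighting / multinomial-resampling loop above can be sketched on a toy model. The scalar linear-Gaussian system here (X_t = aX_{t-1} + noise, Y_t = X_t + noise) is an assumed stand-in for the shape dynamical model, chosen only so the sketch stays self-contained.

```python
import numpy as np

def bootstrap_pf(y, N, rng, a=0.9, q=1.0, r=1.0):
    """Bootstrap particle filter matching the slide's steps.
    Toy model (assumed): X_t = a*X_{t-1} + N(0,q), Y_t = X_t + N(0,r).
    Returns posterior-mean estimates of X_t for each observation."""
    x = rng.normal(0.0, 1.0, N)                      # samples from p_0
    means = []
    for yt in y:
        x = a * x + rng.normal(0.0, np.sqrt(q), N)   # prediction: sample Q_t
        logw = -0.5 * (yt - x) ** 2 / r              # observation likelihood
        w = np.exp(logw - logw.max())
        w /= w.sum()
        x = rng.choice(x, size=N, p=w)               # multinomial resampling
        means.append(x.mean())
    return np.array(means)

# Simulate a trajectory and filter it.
rng = np.random.default_rng(4)
T, a = 60, 0.9
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = a * x_true[t - 1] + rng.normal()
y = x_true + rng.normal(size=T)
est = bootstrap_pf(y, N=500, rng=rng)
```

Resampling after every weighting step, as on the slide, keeps the particle set from degenerating to a few heavy particles; the posterior-mean estimates track the hidden state noticeably better than the raw observations do.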
66. Classification and Tracking Algorithms Using Landmark Shape Analysis and their Application to Face and Gait
- Namrata Vaswani
- Dept. of Electrical and Computer Engineering
- University of Maryland, College Park
- http://www.cfar.umd.edu/namrata
67. Principal Component Null Space Analysis (PCNSA) for Face Recognition
68. Related Work
- PCA uses projection directions with maximum inter-class variance but does not minimize the intra-class variance
- LDA uses directions that maximize the ratio of inter-class variance to intra-class variance
- Subspace LDA: for high-dimensional data, use PCA for dimensionality reduction followed by LDA
- Multi-space KL (similar to PCNSA)
- Other work: ICA, Kernel PCA and LDA, neural nets
69. Motivation
- Example: PCA or SLDA are good for face recognition under small pose variation; PCNSA is proposed for larger pose variation
- PCNSA addresses such "apples from oranges" type classification problems
- PCA assumes the classes are well separated along all directions in PCA space: Σ_i = σ²I
- SLDA assumes all classes have similar directions of minimum and maximum variance: Σ_i = Σ for all i
- If the minimum-variance direction of one class is the maximum-variance direction of the other, we have a worst case for SLDA or PCA
70. PCNSA Algorithm
- Subtract the common mean μ; obtain the PCA space
- Project all training data into PCA space; evaluate each class mean and covariance in PCA space: μ_i, Σ_i
- Obtain the class Approximate Null Space (ANS) for each class: the M_i trailing eigenvectors of Σ_i
- Valid classification directions: those directions in the ANS along which the distance between class means is significant, W_i^{NSA}
- Classification: project the query Y into PCA space, X = W_PCA^T (Y − μ), and choose the most likely class, c [decision rule on slide]
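The pipeline above (PCA projection, per-class trailing eigenvectors, nearest-null-space classification) can be sketched as follows. The unweighted ANS distance and the synthetic "apples from oranges" data are my simplifications; the deck's decision rule may differ in its weighting.

```python
import numpy as np

def pcnsa_fit(X, y, n_pca, n_ans):
    """PCNSA sketch: PCA for dimension reduction, then per-class
    approximate null space (ANS) = trailing eigenvectors of the class
    covariance in PCA space."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    W_pca = Vt[:n_pca].T                        # (d, n_pca) projection
    model = {}
    for c in np.unique(y):
        Z = (X[y == c] - mu) @ W_pca            # class data in PCA space
        mi = Z.mean(axis=0)
        vals, vecs = np.linalg.eigh(np.cov(Z.T))
        model[c] = (mi, vecs[:, :n_ans])        # smallest-variance dirs = ANS
    return mu, W_pca, model

def pcnsa_classify(query, mu, W_pca, model):
    """Assign the class whose ANS-projected distance to the query is
    smallest (the query should be 'in the null space' of its class)."""
    z = (query - mu) @ W_pca
    dists = {c: np.linalg.norm(W_ans.T @ (z - mi))
             for c, (mi, W_ans) in model.items()}
    return min(dists, key=dists.get)

# Worst case for PCA/SLDA: each class's low-variance direction is the
# other class's high-variance direction.
rng = np.random.default_rng(5)
X0 = rng.normal([0.0, 0.0], [3.0, 0.1], (200, 2))   # low variance along y
X1 = rng.normal([5.0, 5.0], [0.1, 3.0], (200, 2))   # low variance along x
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)
mu, W, model = pcnsa_fit(X, y, n_pca=2, n_ans=1)
pred0 = pcnsa_classify(np.array([0.0, 0.0]), mu, W, model)
pred1 = pcnsa_classify(np.array([5.0, 5.0]), mu, W, model)
```

Each class mean projects to (nearly) zero in its own approximate null space but far from zero in the other class's, which is the geometric intuition behind the algorithm.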
71. Assumptions and Extensions
- Assumptions required: for each class,
  - (i) an approximate null space exists, and
  - (ii) valid classification directions exist
- Progressive-PCNSA
  - Defines a heuristic for choosing the dimension of the ANS when (ii) is not satisfied
  - Also defines a heuristic for new (untrained) class detection
72. Typical Data Distributions
- "Apples from apples" problem: all algorithms work well
- "Apples from oranges" problem: worst case for SLDA and PCA
73. Classification Error Probability
- Two-class problem; assumes a 1-dimensional ANS and 1 LDA direction
- Generalizes to an M-dimensional ANS and to non-Gaussian but unimodal, symmetric distributions
74. Applications
- Face recognition under large pose variation
- Face recognition under large expression variation
- Facial feature matching
75. Experimental Results
- PCNSA misclassified the least, followed by SLDA and then PCA
- The new-class detection ability of PCNSA was better
- PCNSA was the most sensitive to training data size; PCA was the most robust
76. Discussion and Ideas
- The PCNSA test approximates the LRT (the optimal Bayes solution) as the condition number of Σ_i tends to infinity
- Fusing PCNSA and LDA gives an algorithm very similar to Multispace KL
- For multiclass problems, use the error probability expressions to decide which of PCNSA or SLDA is better for a given pair of classes
- Perform facial feature matching using PCNSA; use this for face registration followed by warping to a standard geometry