Title: Change Detection in Shape Dynamical Models
1. Change Detection in Shape Dynamical Models: Application to Activity Recognition
- Namrata Vaswani
- Dept. of Electrical and Computer Engineering
- University of Maryland, College Park
- http://www.cfar.umd.edu/namrata
2. Acknowledgements
- Part of this work is joint work with Dr. Amit Roy Chowdhury and Prof. Rama Chellappa.
3. Overview
- The Group Activity Recognition Problem
- Slow and Drastic Change Detection
- Landmark Shape Dynamical Models
- Experiments and Results
4. The Group Activity Recognition Problem
5. Problem Formulation
- The Problem
  - Model activities performed by a group of moving and interacting objects (which can be people, vehicles, robots, or different parts of the human body). Use the models for abnormal activity detection and tracking.
- Our Approach
  - Treat objects as point objects (landmarks)
  - Changing configuration of objects → deforming shape
  - Abnormality → change from the learnt shape dynamics
- Related Approaches for Group Activity
  - Co-occurrence statistics, Dynamic Bayes Nets
6. Bayesian Approach
- Define a stochastic state-space model (a continuous-state HMM) for shape deformations in a given activity, with shape and motion forming the hidden state vector and the configuration of objects forming the observation.
- Use a particle filter to track a given observation sequence, i.e. estimate the hidden state given the observations.
- Define abnormality as a slow or drastic change in the shape dynamics with unknown change parameters. We propose statistics for slow and drastic change detection.
7. Human Action Tracking
[Figure: tracking result. Cyan = Observed, Green = Ground Truth, Red = SSA, Blue = NSSA]
8. Slow and Drastic Change Detection in Continuous-State HMMs
9. The Problem
- General Hidden Markov Model (HMM): state sequence X_t, observation sequence Y_t
- A finite-duration change in the system model causes a permanent change in the probability distribution of the state
- Change parameters unknown: Log LRT(X_t) → LL(X_t)
- Noisy observations: LL(X_t) → E[LL(X_t) | Y_{1:t}] = ELL
- Nonlinear dynamics: particle-filtered estimate of ELL
- Slow or drastic change: ELL for slow, OL/TE for drastic
10. Related Work
- Change detection in nonlinear systems using a PF
  - Known change parameters, sudden change
    - Log-LRT of the current observation given past observations
    - Multimode system: detect a change in mode
  - Unknown change parameters, sudden change
    - Generalized LRT
    - Tracking Error (TE)
    - Negative log likelihood of the current observation given the past (OL)
- Average log likelihood of i.i.d. observations is used often
- But ELL = E[LL(X_t) | Y_{1:t}] (the MMSE estimate of LL given the observations) in the context of HMMs is new
11. Particle Filtering
12. Change Detection Statistics
- Drastic change
  - Tracking Error (TE). If Gaussian noise, TE ≈ OL
  - Negative log of the current observation likelihood given the past (OL):
    OL = −log Pr(Y_t | Y_{0:t-1}, H0) = −log ⟨Q_t p_{t-1}, ψ_t⟩
- Slow change: propose the Expected Log Likelihood (ELL)
  - ELL = Kerridge Inaccuracy between p_t and p_t^0:
    ELL(Y_{1:t}) = E[−log p_t^0(X_t) | Y_{1:t}] = E_p[−log p_t^0(X_t)] = K(p_t : p_t^0)
- Detectable changes using ELL
  - E[ELL(Y_{1:t}^0)] = K(p_t^0 : p_t^0) = H(p_t^0),  E[ELL(Y_{1:t}^c)] = K(p_t^c : p_t^0)
  - Chebyshev inequality: with false alarm and miss probabilities of 1/9, ELL detects all changes s.t.
    K(p_t^c : p_t^0) − H(p_t^0) > 3 [sqrt(Var ELL(Y_{1:t}^c)) + sqrt(Var ELL(Y_{1:t}^0))]
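The ELL statistic above is just the posterior expectation of −log p_t^0, so with a particle filter it is estimated by averaging over the particles. A minimal sketch under assumed stand-ins: the particles here are drawn directly rather than filtered, and p_t^0 is a toy N(0,1) density, not a learnt shape model.

```python
import numpy as np

def ell_estimate(particles, logpdf_normal_model):
    """Particle estimate of ELL = E[-log p_t^0(X_t) | Y_{1:t}].

    particles: (N,) or (N, d) array approximating the posterior p_t.
    logpdf_normal_model: log p_t^0(x) under the learnt normal-activity
    model (a hypothetical stand-in here).
    """
    return -np.mean(logpdf_normal_model(particles))

# Toy check: a posterior matched to p_t^0 = N(0,1) gives ELL near the
# entropy H(p_t^0) = 0.5*log(2*pi*e); a shifted (changed) posterior
# gives a strictly larger ELL, which is what the detector thresholds.
rng = np.random.default_rng(0)
logp0 = lambda x: -0.5 * x**2 - 0.5 * np.log(2 * np.pi)
ell_normal = ell_estimate(rng.normal(0.0, 1.0, 10000), logp0)
ell_changed = ell_estimate(rng.normal(3.0, 1.0, 10000), logp0)
```

The gap `ell_changed − ell_normal` corresponds to K(p_t^c : p_t^0) − H(p_t^0), the detectability margin in the Chebyshev bound above.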
13. OL and ELL: Slow and Drastic Change
- Problem with TE or OL: they fail to detect slow changes
  - The particle filter tracks slow changes correctly
  - Assuming the change up to t−1 was tracked correctly (error in the posterior small), OL only uses the change introduced at t
  - ELL works because it uses the total change in the posterior up to time t: since the PF tracks the posterior correctly for a slow change, ELL is approximated correctly
- Problem with ELL: it fails to detect drastic changes
  - Approximating the posterior of the changed system using a PF for the unchanged system → large error for drastic changes
  - OL relies on the error introduced by the change to detect it, so it works for drastic changes
- ELL detects the change before loss of track, OL/TE after
14. A Simulated Example
- Change introduced in the system model from t=5 to t=15
[Figures: Tracking Error (or OL); ELL]
15. Detecting Changes
- ELL, if approximated accurately (|ELL^{c,c} − ELL^{c,0,N}| small), will detect all changes as soon as they become detectable, whereas OL detects only when OL^{c,0} is significantly larger than OL^{0,0}
- Since the change parameters are unknown, we estimate p_t^{c,0,N} (the posterior for changed observations using a PF optimal for the unchanged system); this differs from the actual p_t^{c,c} and introduces errors
- If the PF is stable, ||p_t^{c,0,N} − p_t^{c,c}|| < increasing function of the rate of change
- Slow change: ||p_t^{c,0,N} − p_t^{c,c}|| is small, so ELL is approximated accurately and detects. But OL^{c,0} is close to OL^{c,c} ≈ OL^{0,0}, so OL fails. Vice versa for drastic changes
16. Practical Issues
- Defining p_t^0(x)
  - Either use the part of the state vector which has linear Gaussian dynamics, so that p_t(x) can be defined in closed form,
  - or assume a parametric family for p_t(x) and learn its parameters from training data (assume p_t(x) piecewise constant over time)
- Declare a change when either ELL or OL/TE exceeds its respective threshold
  - Set the ELL threshold to H(p_t^0) + 3 sqrt(Var ELL(Y_{1:t}^0))
  - Set the OL threshold a little above E[OL^{0,0}] = H(Y_t | Y_{1:t-1})
- Single-frame estimates of ELL or OL/TE may be noisy
  - Average the statistic, or average the number of detects, or modify CUSUM
17. Approximation Errors
- Total error ≤ Bounding error + Model error + PF error
- Bounding error: stability results hold only for bounded functions, but LL is unbounded.
  BE = |ELL_t^{c,c} − ELL_t^{c,c,M}|
- Model error: error between exact filtering with the original (changed) system model and with the unchanged model.
  ME = |ELL_t^{c,c,M} − ELL_t^{c,0,M}|
- PF error: error between exact filtering and particle filtering with the unchanged model.
  PE = |ELL_t^{c,0,M} − ELL_t^{c,0,M,N}|
18. Asymptotic Stability, Stability with t
- The error in ELL estimation, averaged over observation sequences and PF realizations, is asymptotically stable if
  - the change lasts for a finite time,
  - the unnormalized filter kernels are uniformly mixing, and
  - certain boundedness assumptions hold
- Stability (monotonic decrease of the error) if the kernels are only mixing
- The analysis generalizes to errors in the MMSE estimate of any function of the state evaluated using a PF with system model error
19. Asymptotic Stability
1. If (i) the change lasts for a finite time, (ii) the unnormalized filter kernels are uniformly mixing, (iii) the posterior state space is bounded and the increase of M_t (the bound on the expected value of LL) with t is polynomial, and (iv) a moment condition holds for all t, then [result on slide]
2. If (i), (ii), (iii') LL is unbounded but the expected value of its bounded approximation converges to the true value uniformly in t, and (iv), then [result on slide]
- Both 1 and 2 imply: the error averaged over observation sequences and PF runs is asymptotically stable
20. Stability
1. If (i), (ii') the unnormalized filter kernels are mixing, and (iii), then
   lim_{N→∞} (error averaged over observation sequences and PF runs) is stable if the mixing rate is eventually strictly decreasing
2. If (i), (ii') mixing, and (iii') bounded posterior state space, then
   lim_{N→∞} (error averaged over PF runs)/M_t is stable almost surely for all observation sequences if the mixing rate is strictly decreasing
21. Unnormalized Filter Kernel Mixing
- The unnormalized filter kernel, R_t, is the state transition kernel, Q_t, weighted by the observation likelihood given the state
- Mixing measures the rate at which the transition kernel forgets its initial condition, or equivalently how quickly the state sequence becomes ergodic [definition on slide]
- Example: the state transition X_t = X_{t-1} + n_t alone is not mixing. But if Y_t = h(X_t) + w_t, with w_t truncated noise, then R_t is mixing
22. Complementary Behavior of ELL and OL
- We have shown that e_t^{c,0} = ELL_t^{c,c} − ELL_t^{c,0,N} is upper bounded by an increasing function of OL_k^{c,0}, t_c < k ≤ t
- Implication (assume a detectable change, i.e. ELL^{c,c} large):
  - OL fails ⟹ OL_k^{c,0}, t_c < k ≤ t, all small ⟹ ELL error e_t^{c,0} small ⟹ ELL^{c,0} large ⟹ ELL detects
  - ELL fails ⟹ ELL^{c,0} small ⟹ ELL error e_t^{c,0} large ⟹ at least one of OL_k^{c,0}, t_c < k ≤ t, large ⟹ OL detects
23. Rate of Change Bound
- The total error in ELL estimation is upper bounded by increasing functions of the rate of change (or system model error per time step), with all derivatives increasing
- OL^{c,0} is upper bounded by an increasing function of the rate of change
- The metric for the rate of change (or equivalently the system model error per time step) for a given observation Y_t, D_{Q,t}, is [equation on slide]

24. The Bound
Assume: the change lasts for a finite time, the unnormalized filter kernels are mixing, and the posterior state space is bounded. [Bound on slide]
25. Implications
- If the change is slow, ELL works and OL does not
- The ELL error can blow up very quickly as the rate of change increases (its upper bound blows up)
- A small error in both the normal and changed system models introduces less total error than a perfect transition kernel for the normal system combined with a large error in the changed system
- A sequence of small changes introduces less total error than one drastic change of the same magnitude
26. More Practical Issues
- Estimates from single frames are noisy and affected by outliers
  - Average the number of detects over the past p time instants,
  - or average the statistic over the past p time instants:
    aOL(p) = (1/p) [−log Pr(Y_{t-p+1:t} | Y_{0:t-p}, H0)]
  - Either average ELL (aELL), or use the joint ELL over the past p states:
    jELL(p) = (1/p) E[−log p_{t-p:t}(X_{t-p:t}) | Y_{0:t}]
  - Or modify CUSUM for unknown change parameters, i.e. declare a change if max_{1≤p≤t} d_p > γ, with d_p = Statistic(p) − T_{p,t}
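The simplest of the remedies above, averaging the statistic over the past p frames before thresholding, can be sketched as follows. The statistic sequence, window length, and threshold here are toy assumptions, not the deck's actual ELL/OL values.

```python
import numpy as np

def averaged_detect(stats, p, threshold):
    """Declare a change when the statistic averaged over the past p
    frames exceeds the threshold (smooths single-frame noise).
    Returns the first detection time, or None if never detected."""
    for t in range(p - 1, len(stats)):
        if np.mean(stats[t - p + 1 : t + 1]) > threshold:
            return t
    return None

# Toy statistic: noise around 0, jumping to noise around 4 at t = 50.
rng = np.random.default_rng(1)
s = np.concatenate([rng.normal(0, 1, 50), rng.normal(4, 1, 50)])
t_hat = averaged_detect(s, p=5, threshold=2.0)
```

Single-frame thresholding of the same sequence would fire on noise spikes before t = 50; the window trades a few frames of delay for robustness, which is exactly the noise/delay trade-off the slide describes.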
27. We Have Shown
- Asymptotic stability of the errors in ELL estimation if the change lasts for a finite time, the unnormalized filter kernels are uniformly mixing, and some boundedness assumptions hold
- Stability for large N if the kernels are only mixing
- ELL error upper bounded by an increasing function of OL^{c,0}: ELL works when OL fails and vice versa
- ELL error upper bounded by an increasing function of the rate of change, with increasing derivatives of all orders. OL^{c,0} upper bounded by an increasing function of the rate of change
- The analysis generalizes to errors in the MMSE estimate of any function of the state evaluated using a PF with system model error
28. Applications / Possible Applications
- Surveillance: abnormal activity detection
- Medical applications: detect motion disorders by modeling normal human actions using Shape Activity models
- ELL + PSSA model for activity segmentation
- Neural signal processing: detecting changes in stimuli
- Congestion detection
- System model change detection in target tracking problems, before the tracker loses track
29. Landmark Shape Dynamical Models
30. What is Shape?
- Shape is the geometric information that remains when location, scale and rotation effects are filtered out [Kendall]
- Shape of k landmarks in 2D
  - Represent the x and y coordinates of the k points as a k-dimensional complex vector: the Configuration
  - Translation normalization: Centered Configuration
  - Scale normalization: Pre-shape
  - Rotation normalization: Shape
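The three normalizations above can be sketched directly on complex landmark vectors. This is a minimal illustration of the Kendall pipeline; the function names and the closed-form rotation (optimal angle from the complex inner product) are my own framing of the standard construction.

```python
import numpy as np

def preshape(config):
    """Translation + scale normalization of a k-point configuration.
    config: complex array of length k (x + iy landmark coordinates)."""
    centered = config - config.mean()           # centered configuration
    return centered / np.linalg.norm(centered)  # pre-shape (unit norm)

def rotation_align(w, z):
    """Rotate pre-shape w to best align with pre-shape z, removing the
    rotation effect; the aligned vector represents the shape of w."""
    theta = np.angle(np.vdot(w, z))   # optimal rotation angle
    return w * np.exp(1j * theta)

# Check: a translated, scaled, rotated copy of a configuration has the
# same shape as the original after all three normalizations.
rng = np.random.default_rng(2)
z = preshape(rng.normal(size=5) + 1j * rng.normal(size=5))
moved = 2.5 * np.exp(1j * 0.7) * (z + (1.0 + 2.0j))  # similarity transform
w = rotation_align(preshape(moved), z)
```

After the three steps, `w` and `z` coincide, which is precisely the invariance that motivates working in shape space.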
31. Activities on the Shape Sphere in C^{k-1}
32. Related Work
- Related approaches for group activity
  - Co-occurrence statistics
  - Dynamic Bayesian Networks
- Shape analysis / deformation
  - PDMs, thin-plate splines, principal and partial warps
  - Active Shape Models: affine deformation in configuration space
  - Deformotion: Euclidean motion of the average shape + deformation
  - Piecewise geodesic models for tracking on Grassmann manifolds
- Particle filters for multiple moving objects
  - JPDAF (Joint Probability Data Association Filter): difficult to define complicated interactions between objects
33. Motivation
- Obtain a generic, sensor-invariant approach for activities performed by multiple moving objects; easy to fuse sensors
- Why shape: invariant to translation, zoom and in-plane rotation of the camera
- A single global framework for modeling motion and interactions; co-occurrence statistics requires individual and joint histograms
- A new framework to track a group of interacting moving objects: we know that the group is constrained to move in a certain fashion defined by the activity
- Active Shape Models are good for approximately rigid objects (small non-rigidity introduced by camera motion)
34. The HMM
- Observation, Y_t: centered configurations
- State, X_t = [μ_t, c_t, s_t, θ_t]
  - Current shape (μ_t)
  - Shape velocity (c_t): tangent coordinates w.r.t. μ_t
  - Scale (s_t)
  - Rotation angle (θ_t)
- Use complex vector notation to simplify the equations
- Use a particle filter to approximate the optimal nonlinear filter, p_t(dx) = Pr(X_t ∈ dx | Y_{0:t}), the posterior state distribution conditioned on the observations up to time t, by an N-particle empirical estimate of p_t
35. State Dynamics
- Shape dynamics
  - Define the shape velocity at time t in the tangent space w.r.t. the current shape, μ_t
  - The tangent space is a vector space: define a linear Gauss-Markov model for the shape speed, c_t
  - Move μ_t by an amount c_t on the shape manifold to get μ_{t+1}
- Motion dynamics
  - Linear Gauss-Markov dynamics for log s_t and the unwrapped θ_t
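The "move μ_t by c_t on the manifold" step above is a geodesic step. As an illustration only, here it is on an ordinary real unit sphere (whose exponential map has the same closed form); the deck's actual model moves on the complex pre-shape/shape manifold, which this sketch does not reproduce.

```python
import numpy as np

def sphere_step(mu, c):
    """Geodesic (exponential-map) step on the unit sphere: move the
    unit-norm point mu along tangent vector c (c orthogonal to mu).
    A toy stand-in for 'move mu_t by c_t to get mu_{t+1}'."""
    nc = np.linalg.norm(c)
    if nc < 1e-12:
        return mu
    return np.cos(nc) * mu + np.sin(nc) * (c / nc)

mu = np.array([1.0, 0.0, 0.0])
c = np.array([0.0, 0.3, 0.0])      # tangent vector, orthogonal to mu
mu_next = sphere_step(mu, c)        # stays on the sphere
```

The key property, mirrored from the slide, is that the step keeps the shape on the manifold (unit norm) while the velocity itself lives in a flat tangent space where a linear Gauss-Markov model makes sense.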
36HMM Equations
Observation Model Map Shape,Motion?Centered
Config.
System Model Shape and Motion Dynamics
Shape Dynamics
Motion Dynamics
- Linear Gauss-Markov models for log st and ?t
- Can be stationary or non-stationary
37. Special Cases
- Stationary Shape Activity (SSA): μ_t = μ, constant
  - Models shape variation in a single tangent space w.r.t. the mean shape
  - Track normal behavior, detect abnormality
- Non-Stationary Shape Activity (NSSA): μ_t changes for all t
  - The tangent space changes at every time instant
  - Most flexible: detect abnormality and also track it
- Piecewise Stationary Shape Activity (PSSA): μ_t piecewise constant
  - Change times can be fixed or decided on the fly using ELL
  - PSSA + ELL → activity segmentation
38. Stationary, Non-Stationary
39. Stationary Shape Activity
- The mean shape is constant, so set μ_t = μ (the Procrustes mean) for all t; μ_t is not part of the state vector; learn the mean shape from training data
- Define a single tangent space w.r.t. μ: the shape dynamics simplifies to a linear Gauss-Markov model in the tangent space
- Since the shape space is not a vector space, the data mean may not lie in the shape space; instead evaluate the Procrustes mean, an intrinsic mean on the shape manifold
40. What is the Procrustes Mean?
- The Procrustes mean, μ, minimizes the sum of squares of the Procrustes distances of a set of pre-shapes from itself
- The Procrustes distance is the Euclidean distance between the Procrustes fit of one pre-shape onto another
- Procrustes fit: scale and rotate a pre-shape to optimally align it with another pre-shape
- Optimally = minimum Euclidean distance between the two pre-shapes after alignment
41. Learning the Procrustes Mean [Dryden and Mardia]
- Translation and scale normalization of the configuration → pre-shape
- Procrustes fit of pre-shape w onto y [equation on slide]
- Procrustes distance [equation on slide]

42. Learning the Procrustes Mean (contd.)
- Procrustes mean of a set of pre-shapes w_i [equation on slide]
- Shape z_i = Procrustes fit of w_i onto the mean, μ
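The full Procrustes mean of a set of pre-shapes can be computed as the dominant eigenvector of the sum of complex outer products, a standard result from Dryden and Mardia. A sketch under that assumption (the variable names and the small test set are mine):

```python
import numpy as np

def procrustes_mean(preshapes):
    """Full Procrustes mean of pre-shapes: the dominant eigenvector of
    S = sum_i w_i w_i^H (defined up to an arbitrary rotation/phase)."""
    W = np.asarray(preshapes)       # (n, k) complex, rows are pre-shapes
    S = W.T @ W.conj()              # S[a,b] = sum_i w_i[a] * conj(w_i[b])
    vals, vecs = np.linalg.eigh(S)  # ascending eigenvalues
    mu = vecs[:, -1]                # dominant eigenvector
    return mu / np.linalg.norm(mu)

# Rotated, lightly perturbed copies of one pre-shape should have a mean
# aligned with the original (up to phase).
rng = np.random.default_rng(3)
z = rng.normal(size=6) + 1j * rng.normal(size=6)
z = z - z.mean(); z /= np.linalg.norm(z)
ws = []
for th in (0.0, 0.5, 1.0, -0.8):
    w = np.exp(1j * th) * z + 0.01 * (rng.normal(size=6) + 1j * rng.normal(size=6))
    w = w - w.mean(); w /= np.linalg.norm(w)
    ws.append(w)
mu = procrustes_mean(ws)
```

Because the mean is only defined up to rotation, alignment is checked through the modulus of the complex inner product |⟨μ, z⟩| rather than elementwise equality.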
43. Learning Stationary Shape Dynamics
- Learn the Procrustes mean, μ
- Learn a linear Gauss-Markov (G-M) model in the tangent space
- Learnt parameters: μ, Σ_v, A, Σ_n
44. Abnormal Activity Detection
- Define an abnormal activity as a slow or drastic change in the shape statistics with the change parameters unknown
- The system is a nonlinear HMM, tracked using a PF
  - This motivated the research on slow and drastic change detection in general HMMs
- Tracking Error detects drastic changes; we proposed a statistic called ELL for slow changes
- Use a combination of ELL and Tracking Error, and declare a change if either exceeds its threshold
45. Applications / Possible Applications
- Modeling group activity to detect suspicious behavior
  - Airport example
  - Lane-change detection in traffic
- Model human actions: track a given sequence of actions, detect abnormal actions (medical application to detect motion disorders)
- Activity sequence segmentation, unsupervised training
- Sensor independent: IR / Radar / Seismic
- Robotics, Medical Image Processing
46. Apply to Gait Verification
- Model different parts of the human body (head, torso, fore and hind arms and legs) as landmarks
- Learn the landmark shape dynamics for different people's gaits
- Verification: given a test sequence and a possible match (say, from a face recognition stage), verify whether the match is correct
  - Start tracking the test sequence using the shape dynamical model of the possible match
  - If the dynamics does not match at all, the PF will lose track
  - If the dynamics is close but not correct, ELL w.r.t. the possible match will exceed its threshold
47. Gait Recognition
- System identification approach
  - Assuming the test sequence has negligible observation noise, learn the shape dynamical model parameters for the test sequence
  - Find the distance of the parameters for the test sequence from those for the different people in the database (similar idea to [Soatto], [Ashok])
- Match time series of shape velocity of probe and gallery
  - Save the shape velocity sequence for the different people in the database
  - For a test sequence, estimate the shape velocity sequence and use DTW [Kale] to match it against all people's gaits
48. Experiments and Results
49. Why a Particle Filter?
- N does not increase (much) with increasing state dimension: the posterior distribution of the state is approximated only in the high-probability regions (so a fixed N works for all t, with the effective state space at t being D_t)
- Better than the Extended KF because of asymptotic stability
- Able to track in spite of a wrong initial distribution
- Gets back on track after losing track due to an outlier observation
- Slowly changing system: able to track it and yet detect the change using ELL (explained later)
- Can handle multi-modal priors/posteriors; the EKF cannot
50. Time-Varying Number of Landmarks?
- Ill-posed problem: interpolate the curve formed by joining the landmarks and re-sample it to a fixed number of landmarks, k
- Experimented with two interpolation/re-sampling schemes
  - Uniform: re-samples independently along x and y
    - Assumes the observed landmarks are uniformly sampled from some continuous function of a dummy variable s
    - All observed landmarks get equal weight while re-sampling
    - Very sensitive to a change in the number of landmarks, but also able to detect abnormality caused by two closely spaced points
  - Arc-length: parameterizes the x and y coordinates by the length of the arc up to that landmark
    - Assumes the observed landmarks are non-uniformly sampled points from continuous functions of arc length, x(l) and y(l)
    - Smooths out the motion of closely spaced points, and thus misses abnormality caused by two closely spaced points
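The arc-length scheme above can be sketched with 1-D interpolation of x(l) and y(l) against cumulative segment length. This is my own minimal rendering of the idea, not the deck's implementation.

```python
import numpy as np

def resample_arclength(pts, k):
    """Re-sample an ordered landmark sequence to k points, with x and y
    parameterized by cumulative arc length up to each landmark."""
    pts = np.asarray(pts, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # segment lengths
    l = np.concatenate([[0.0], np.cumsum(seg)])          # arc length l_i
    l_new = np.linspace(0.0, l[-1], k)                   # equally spaced in l
    return np.column_stack([np.interp(l_new, l, pts[:, 0]),
                            np.interp(l_new, l, pts[:, 1])])

# A unit square traced by 5 landmarks, re-sampled to 9 equally spaced
# points along its perimeter (total length 4, spacing 0.5).
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]], float)
out = resample_arclength(square, 9)
```

Because spacing is uniform in arc length, two original landmarks that sit very close together contribute almost nothing to the output, which is exactly the smoothing behavior that makes this scheme miss closely-spaced-point abnormalities.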
51. Experiments
- Group activity
  - Normal activity: a group of people deplaning and walking towards the airport terminal; used the SSA model
  - Abnormality: a person walks away in an un-allowed direction, distorting the normal shape
  - Simulated walking speeds of 1, 2, 4, 16, 32 pixels per time step (slow to drastic distortion in shape)
  - Compared detection delays using TE and ELL
  - Plotted ROC curves to compare performance
- Human actions
  - Defined an NSSA model for tracking a figure skater
  - Abnormality: abnormal motion of one body part
  - Able to detect as well as track slow abnormality
52. Normal/Abnormal Activity
[Figures: Normal Activity; Abnormal Activity]
53. Abnormality
- Abnormality introduced at t = 5
- Observation noise variance = 9
- The OL plot is very similar to the TE plot (both are the same to first order)
[Figures: Tracking Error (TE); ELL]
54. ROC: ELL
- Plot of detection delay against the mean time between false alarms (MTBFA) for varying detection thresholds
- Plots for increasing observation noise
- Drastic change: ELL fails. Slow change: ELL works
55. ROC: Tracking Error (TE)
- ELL: detection delay ≈ 7 for a slow change, ≈ 60 for a drastic change
- TE: detection delay ≈ 29 for a slow change, ≈ 4 for a drastic change
- Slow change: TE fails. Drastic change: TE works
56. ROC: Combined ELL-TE
- Plots for observation noise variance = 81 (the maximum)
- Detection delay < 8 achieved for all rates of change
57. Human Action Tracking
[Figure: tracking result. Cyan = Observed, Green = Ground Truth, Red = SSA, Blue = NSSA]
58. Normal Action: SSA Better than NSSA. Abnormality: NSSA Works, SSA Fails
[Figures: Green = Observed, Magenta = SSA, Blue = NSSA]
59. NSSA Tracks and Detects Abnormality
[Figures: Tracking Error; ELL. Red = SSA, Blue = NSSA. Frame overlays: Green = Observed, Magenta = SSA, Blue = NSSA]
60. Temporal Abnormality
- Abnormality introduced at t = 5, observation noise variance = 81
- Detected using uniform re-sampling; not detected using arc-length
61. Contributions
- Slow and drastic change detection in general HMMs using particle filters. We have shown:
  - Asymptotic stability / stability of the errors in the ELL approximation
  - Complementary behavior of ELL and OL for slow and drastic changes
  - The upper bound on the ELL error is an increasing function of the rate of change, with all derivatives increasing
- Stochastic state-space models (HMMs) for simultaneously moving and deforming shapes
  - Stationary, non-stationary and piecewise stationary cases
  - Group activity and human action modeling, detecting abnormality
  - NSSA for tracking slow abnormality, ELL for detecting it
  - PSSA + ELL: apply to activity segmentation
62. Other Contributions
- A linear subspace algorithm for pattern classification motivated by PCA
  - Approximates the optimal Bayes classifier for Gaussian pdfs with unequal covariance matrices
  - Useful for "apples from oranges" type problems
  - Derived a tight upper bound on its classification error probability
  - Compared performance with Subspace LDA, both analytically and experimentally
  - Applied to object recognition, face recognition under large pose variation, and action retrieval
- Fast algorithms for infra-red image compression
63. Ongoing and Future Work
- Change detection
  - The bound on the errors is an increasing function of the rate of change: implications
  - CUSUM algorithm, applications to other problems
- Non-stationary and piecewise stationary shape activities
  - Application to sequences of different kinds of actions
  - PSSA + ELL for activity segmentation
- Time-varying number of landmarks?
  - What is the best strategy to get a fixed number k of landmarks?
  - Can we deal with a changing dimension of the shape space?
- Sequences of activities, multiple simultaneous activities
- Multi-sensor fusion, 3D shape, general shape spaces
64. Special Cases
- For an i.i.d. observation sequence, Y_t = h(X_t) + w_t:
  - ELL(Y_{0:t}) = E[−log p_t(X_t) | Y_{0:t}] = E[−log p_t(X_t) | Y_t]
  - ≈ −log p_t(h^{-1}(Y_t) − E[h^{-1}(w_t)]) = −log p_t(h^{-1}(Y_t)) = OL(Y_t) + const., if E[h^{-1}(w_t)] = 0
- For the zero (negligible) observation noise case, Y_t = h(X_t):
  - ELL(Y_{0:t}) = E[−log p_t(X_t) | Y_{0:t}] = −log p_t(h^{-1}(Y_t)) = OL(Y_t) + const.
65. Particle Filtering Algorithm
- At t = 0, generate N Monte Carlo samples from the initial state distribution, p_0
- For all t:
  - Prediction: given the posterior at t−1 as an empirical distribution, p_{t-1}^N, sample from the state transition kernel Q_t(x_{t-1}, dx_t) to generate samples from the prediction distribution p_{t|t-1}^N
  - Update/Correction:
    - Weight each sample of p_{t|t-1}^N by the probability of the observation given that sample, ψ_t(Y_t | x)
    - Use multinomial sampling to resample from these weighted particles, generating particles distributed according to p_t^N
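The prediction / weighting / multinomial-resampling loop above can be sketched on a toy model. The scalar linear-Gaussian system here (X_t = aX_{t-1} + noise, Y_t = X_t + noise) is an assumed stand-in for the shape dynamical model, chosen only so the sketch stays self-contained.

```python
import numpy as np

def bootstrap_pf(y, N, rng, a=0.9, q=1.0, r=1.0):
    """Bootstrap particle filter matching the slide's steps.
    Toy model (assumed): X_t = a*X_{t-1} + N(0,q), Y_t = X_t + N(0,r).
    Returns posterior-mean estimates of X_t for each observation."""
    x = rng.normal(0.0, 1.0, N)                      # samples from p_0
    means = []
    for yt in y:
        x = a * x + rng.normal(0.0, np.sqrt(q), N)   # prediction: sample Q_t
        logw = -0.5 * (yt - x) ** 2 / r              # observation likelihood
        w = np.exp(logw - logw.max())
        w /= w.sum()
        x = rng.choice(x, size=N, p=w)               # multinomial resampling
        means.append(x.mean())
    return np.array(means)

# Simulate a trajectory and filter it.
rng = np.random.default_rng(4)
T, a = 60, 0.9
x_true = np.zeros(T)
for t in range(1, T):
    x_true[t] = a * x_true[t - 1] + rng.normal()
y = x_true + rng.normal(size=T)
est = bootstrap_pf(y, N=500, rng=rng)
```

Resampling after every weighting step, as on the slide, keeps the particle set from degenerating to a few heavy particles; the posterior-mean estimates track the hidden state noticeably better than the raw observations do.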
66. Classification and Tracking Algorithms Using Landmark Shape Analysis and their Application to Face and Gait
- Namrata Vaswani
- Dept. of Electrical and Computer Engineering
- University of Maryland, College Park
- http://www.cfar.umd.edu/namrata
67. Principal Component Null Space Analysis (PCNSA) for Face Recognition
68. Related Work
- PCA uses projection directions with maximum inter-class variance but does not minimize the intra-class variance
- LDA uses directions that maximize the ratio of inter-class variance to intra-class variance
- Subspace LDA: for high-dimensional data, use PCA for dimensionality reduction followed by LDA
- Multi-space KL (similar to PCNSA)
- Other work: ICA, Kernel PCA and LDA, neural nets
69. Motivation
- Example: PCA or SLDA are good for face recognition under small pose variation; PCNSA is proposed for larger pose variation
- PCNSA addresses such "apples from oranges" type classification problems
- PCA assumes the classes are well separated along all directions in PCA space: Σ_i = σ²I
- SLDA assumes all classes have similar directions of minimum and maximum variance: Σ_i = Σ for all i
- If the minimum-variance direction of one class is the maximum-variance direction of the other, we have a worst case for SLDA or PCA
70. PCNSA Algorithm
- Subtract the common mean μ; obtain the PCA space
- Project all training data into PCA space; evaluate each class mean and covariance in PCA space: μ_i, Σ_i
- Obtain the class Approximate Null Space (ANS) for each class: the M_i trailing eigenvectors of Σ_i
- Valid classification directions: those directions in the ANS along which the distance between class means is significant, W_i^{NSA}
- Classification: project the query Y into PCA space, X = W_PCA^T (Y − μ), and choose the most likely class, c [decision rule on slide]
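The pipeline above (PCA projection, per-class trailing eigenvectors, nearest-null-space classification) can be sketched as follows. The unweighted ANS distance and the synthetic "apples from oranges" data are my simplifications; the deck's decision rule may differ in its weighting.

```python
import numpy as np

def pcnsa_fit(X, y, n_pca, n_ans):
    """PCNSA sketch: PCA for dimension reduction, then per-class
    approximate null space (ANS) = trailing eigenvectors of the class
    covariance in PCA space."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    W_pca = Vt[:n_pca].T                        # (d, n_pca) projection
    model = {}
    for c in np.unique(y):
        Z = (X[y == c] - mu) @ W_pca            # class data in PCA space
        mi = Z.mean(axis=0)
        vals, vecs = np.linalg.eigh(np.cov(Z.T))
        model[c] = (mi, vecs[:, :n_ans])        # smallest-variance dirs = ANS
    return mu, W_pca, model

def pcnsa_classify(query, mu, W_pca, model):
    """Assign the class whose ANS-projected distance to the query is
    smallest (the query should be 'in the null space' of its class)."""
    z = (query - mu) @ W_pca
    dists = {c: np.linalg.norm(W_ans.T @ (z - mi))
             for c, (mi, W_ans) in model.items()}
    return min(dists, key=dists.get)

# Worst case for PCA/SLDA: each class's low-variance direction is the
# other class's high-variance direction.
rng = np.random.default_rng(5)
X0 = rng.normal([0.0, 0.0], [3.0, 0.1], (200, 2))   # low variance along y
X1 = rng.normal([5.0, 5.0], [0.1, 3.0], (200, 2))   # low variance along x
X = np.vstack([X0, X1])
y = np.array([0] * 200 + [1] * 200)
mu, W, model = pcnsa_fit(X, y, n_pca=2, n_ans=1)
pred0 = pcnsa_classify(np.array([0.0, 0.0]), mu, W, model)
pred1 = pcnsa_classify(np.array([5.0, 5.0]), mu, W, model)
```

Each class mean projects to (nearly) zero in its own approximate null space but far from zero in the other class's, which is the geometric intuition behind the algorithm.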
71. Assumptions and Extensions
- Assumptions required: for each class,
  - (i) an approximate null space exists, and
  - (ii) valid classification directions exist
- Progressive-PCNSA
  - Defines a heuristic for choosing the dimension of the ANS when (ii) is not satisfied
  - Also defines a heuristic for new (untrained) class detection
72. Typical Data Distributions
- "Apples from apples" problem: all algorithms work well
- "Apples from oranges" problem: worst case for SLDA and PCA
73. Classification Error Probability
- Two-class problem; assumes a 1-dimensional ANS and 1 LDA direction
- Generalizes to an M-dimensional ANS and to non-Gaussian but unimodal, symmetric distributions
74. Applications
- Face recognition under large pose variation
- Face recognition under large expression variation
- Facial feature matching
75. Experimental Results
- PCNSA misclassified the least, followed by SLDA and then PCA
- The new-class detection ability of PCNSA was better
- PCNSA was the most sensitive to training data size; PCA was the most robust
76. Discussion and Ideas
- The PCNSA test approximates the LRT (the optimal Bayes solution) as the condition number of Σ_i tends to infinity
- Fusing PCNSA and LDA gives an algorithm very similar to Multispace KL
- For multiclass problems, use the error probability expressions to decide which of PCNSA or SLDA is better for a given pair of classes
- Perform facial feature matching using PCNSA; use this for face registration followed by warping to a standard geometry