Title: PCA vs ICA vs LDA
1. PCA vs ICA vs LDA
2. How to represent images?
- Why are representation methods needed?
  - Curse of dimensionality: width x height x channels
  - Noise reduction
  - Signal analysis, visualization
- Representation methods
  - Representation in the frequency domain (linear transforms): DFT, DCT, DST, DWT
    - Used as compression methods
  - Subspace derivation: PCA, ICA, LDA
    - Linear transforms derived from training data
  - Feature extraction methods
    - Edge (line) detection: feature maps obtained by filtering
    - Gabor transform
    - Active contours (snakes)
3. What is a subspace? (1/2)
- Find a basis for a low-dimensional subspace
- Approximate vectors by projecting them onto that low-dimensional subspace, as in the sketch below
  (1) Original space representation
  (2) Lower-dimensional subspace representation
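A minimal sketch of the two representations, assuming an orthonormal basis Q for the subspace (the dimensions and random data are illustrative):

```python
# Approximate a vector x by projecting it onto a low-dimensional subspace
# spanned by the orthonormal columns of Q.
import numpy as np

rng = np.random.default_rng(0)
d, p = 100, 5                                  # original / subspace dimensions
Q, _ = np.linalg.qr(rng.normal(size=(d, p)))   # orthonormal basis (d x p)
x = rng.normal(size=d)                         # a vector in the original space

y = Q.T @ x          # (2) lower-dimensional representation: p coefficients
x_hat = Q @ y        # (1) approximation of x back in the original space
print("approximation error:", np.linalg.norm(x - x_hat))
```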
4. What is a subspace? (2/2)
5. PRINCIPAL COMPONENT ANALYSIS (PCA)
6. Why Principal Component Analysis?
- Motivation
  - Find bases along which the data have high variance
  - Encode the data with a small number of bases at low MSE
7. Derivation of PCs
- Assume the data x are zero-mean with covariance matrix C = E[x x^T]
- Find the unit vectors q that maximize the projected variance q^T C q
- The maximizers satisfy C q = λ q, so the principal components q can be obtained by eigenvector decomposition, e.g. via SVD (see the sketch below)
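A minimal NumPy sketch of this result on toy data: the eigenvectors of the sample covariance matrix are the principal components (an SVD of the centered data gives the same directions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))            # 500 samples, 10 dimensions

Xc = X - X.mean(axis=0)                   # center the data
C = Xc.T @ Xc / (len(Xc) - 1)             # sample covariance matrix

eigvals, eigvecs = np.linalg.eigh(C)      # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]         # re-sort by decreasing variance
eigvals, Q = eigvals[order], eigvecs[:, order]

# q1 = Q[:, 0] maximizes the projected variance q^T C q subject to ||q|| = 1
print("variance along q1:", eigvals[0])
```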
8. Dimensionality Reduction (1/2)
- We can ignore the components of lesser significance.
- You do lose some information, but if the eigenvalues are small, you don't lose much:
  - n dimensions in the original data
  - calculate the n eigenvectors and eigenvalues
  - choose only the first p eigenvectors, based on their eigenvalues (see the sketch below)
  - the final data set has only p dimensions
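One common heuristic for choosing p, shown on assumed toy eigenvalues (the 95% variance threshold is an illustrative choice, not prescribed here):

```python
import numpy as np

# Eigenvalues sorted in descending order, e.g. from the PCA sketch above.
eigvals = np.array([5.0, 2.5, 1.0, 0.3, 0.1, 0.05, 0.05])

explained = np.cumsum(eigvals) / eigvals.sum()
p = int(np.searchsorted(explained, 0.95) + 1)  # smallest p covering 95%
print(f"keep p = {p} of n = {len(eigvals)} dimensions")
```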
9. Dimensionality Reduction (2/2)
10. Reconstruction from PCs
Figure: the original image alongside its reconstructions from the first 1, 2, 4, 8, 16, 32, 64, and 100 principal components (q1, q2, q4, q8, q16, q32, q64, q100); a reconstruction sketch follows.
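A sketch of the reconstructions in the figure, on toy data: x_hat = mean + Q_k Q_k^T (x - mean), where Q_k holds the first k principal components:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 100))          # toy "images": 200 samples, 100 pixels
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
Q = Vt.T                                 # PCs as columns, sorted by variance

x = X[0]
for k in (1, 2, 4, 8, 16, 32, 64, 100):
    y = Q[:, :k].T @ (x - mean)          # coefficients on the first k PCs
    x_hat = mean + Q[:, :k] @ y          # reconstruction from k PCs
    print(k, np.linalg.norm(x - x_hat))  # error shrinks as k grows
```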
11. LINEAR DISCRIMINANT ANALYSIS (LDA)
12. Limitations of PCA
Are the maximal-variance dimensions the relevant dimensions for preservation?
13. Linear Discriminant Analysis (1/6)
- Performs dimensionality reduction while preserving as much of the class-discriminatory information as possible.
- Seeks to find directions along which the classes are best separated.
- Takes into consideration not only the scatter within classes but also the scatter between classes.
- In face recognition, for example, LDA is more capable of distinguishing image variation due to identity from variation due to other sources such as illumination and expression.
14. Linear Discriminant Analysis (2/6)
- Within-class scatter matrix: S_w = Σ_{i=1}^{C} Σ_{x_k ∈ class i} (x_k − μ_i)(x_k − μ_i)^T
- Between-class scatter matrix: S_b = Σ_{i=1}^{C} M_i (μ_i − μ)(μ_i − μ)^T, where M_i is the number of samples in class i
- Projection: y = W^T x, where W is the projection matrix
- LDA computes the transformation W that maximizes the between-class scatter while minimizing the within-class scatter of the projected data y:
  J(W) = |W^T S_b W| / |W^T S_w W|
  Each determinant is the product of the eigenvalues of the corresponding scatter matrix of the projected data y (a sketch follows).
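A minimal sketch of these definitions on assumed toy data: build S_w and S_b, then solve the generalized eigenproblem S_b w = λ S_w w (here with scipy.linalg.eigh):

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
# Three toy classes of 50 samples each in 3 dimensions.
X = np.vstack([rng.normal(loc=m, size=(50, 3)) for m in (0.0, 2.0, 4.0)])
labels = np.repeat([0, 1, 2], 50)

mu = X.mean(axis=0)
Sw = np.zeros((3, 3))
Sb = np.zeros((3, 3))
for c in np.unique(labels):
    Xc = X[labels == c]
    mu_c = Xc.mean(axis=0)
    Sw += (Xc - mu_c).T @ (Xc - mu_c)    # within-class scatter
    d = (mu_c - mu).reshape(-1, 1)
    Sb += len(Xc) * (d @ d.T)            # between-class scatter

# Generalized eigenproblem; eigh returns eigenvalues in ascending order,
# so the most discriminative direction is the last column.
eigvals, W = eigh(Sb, Sw)
print("leading LDA direction:", W[:, -1])
```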
15. Linear Discriminant Analysis (3/6)
- If S_w is non-singular, we can obtain a conventional eigenvalue problem by writing S_w^{-1} S_b w = λ w
- In practice, S_w is often singular, since the data are image vectors of large dimensionality N while the size of the data set M is much smaller (M << N)
- cf. since S_b has at most rank C − 1, the maximum number of eigenvectors with non-zero eigenvalues is C − 1 (i.e., the maximum dimensionality of the sub-space is C − 1)
16. Linear Discriminant Analysis (4/6)
- Does S_w^{-1} always exist? (cont.)
- To alleviate this problem, we can use PCA first:
  - PCA is first applied to the data set to reduce its dimensionality.
  - LDA is then applied to find the most discriminative directions (see the sketch below).
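A sketch of this two-step recipe, assuming scikit-learn; the toy data and the choice of 40 PCA components are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Toy stand-in for image vectors: M = 60 samples of dimensionality N = 1024,
# so S_w would be singular without the PCA step (M << N).
X = rng.normal(size=(60, 1024))
y = np.repeat([0, 1, 2], 20)
X[y == 1] += 0.5
X[y == 2] -= 0.5

model = make_pipeline(
    PCA(n_components=40),             # reduce dimensionality first
    LinearDiscriminantAnalysis(),     # then find discriminative directions
)
model.fit(X, y)
print("training accuracy:", model.score(X, y))
```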
17. Linear Discriminant Analysis (5/6)
Figure: comparison of PCA and LDA projections of the same data.
D. Swets, J. Weng, "Using Discriminant
Eigenfeatures for Image Retrieval", IEEE
Transactions on Pattern Analysis and Machine
Intelligence, vol. 18, no. 8, pp. 831-836, 1996
18. Linear Discriminant Analysis (6/6)
- Factors unrelated to classification
  - MEF (Most Expressive Feature) vectors show the tendency of PCA to capture major variations in the training set, such as lighting direction.
  - MDF (Most Discriminating Feature) vectors discount those factors unrelated to classification.
D. Swets, J. Weng, "Using Discriminant
Eigenfeatures for Image Retrieval", IEEE
Transactions on Pattern Analysis and Machine
Intelligence, vol. 18, no. 8, pp. 831-836, 1996
19. INDEPENDENT COMPONENT ANALYSIS
20. PCA vs ICA
- PCA
  - Focuses on uncorrelated, Gaussian components
  - Second-order statistics
  - Orthogonal transformation
- ICA
  - Focuses on independent, non-Gaussian components
  - Higher-order statistics
  - Non-orthogonal transformation
22. Independent Component Analysis (1/5)
- Concept of ICA
  - A given signal x is generated by linear mixing (A) of independent components s
  - ICA is a statistical analysis method that estimates those independent components z and the unmixing rule W
Diagram: independent sources s1, s2, s3, ..., sM are mixed through A (entries A_ij) to produce the observed signals x1, x2, x3, ..., xM; the unmixing weights W_ij recover the estimates z1, z2, z3, ..., zM.
- Mixing model: x = A s
- Unmixing: z = W x
- We know neither A nor s (both are unknown), so some optimization function is required (see the sketch below)!
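A sketch of the model on synthetic signals, using scikit-learn's FastICA as one standard estimator of W (the slides do not fix a particular algorithm):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)
S = np.column_stack([np.sin(2 * t),              # independent sources s
                     np.sign(np.sin(3 * t)),
                     rng.laplace(size=t.size)])
A = rng.normal(size=(3, 3))                      # unknown mixing matrix
X = S @ A.T                                      # observed signals x = A s

ica = FastICA(n_components=3, random_state=0)
Z = ica.fit_transform(X)     # estimated independent components z
W = ica.components_          # estimated unmixing matrix
# Z recovers the sources up to permutation and scaling.
```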
23. Independent Component Analysis (2/5)
24. Independent Component Analysis (3/5)
- What is an independent component?
  - If one variable cannot be estimated from the other variables, it is independent.
  - By the Central Limit Theorem, a sum of two independent random variables is more Gaussian than the original variables → the distributions of the independent components are non-Gaussian.
  - To estimate the ICs, z should have a non-Gaussian distribution, i.e. we should maximize non-Gaussianity.
25. Independent Component Analysis (4/5)
- What is non-Gaussianity?
  - Super-Gaussian: peaked, heavy-tailed density
  - Sub-Gaussian: flat density
  - Both have low entropy (the Gaussian has the maximum entropy for a given variance)
Figure: Gaussian, super-Gaussian, and sub-Gaussian density shapes.
26. Independent Component Analysis (5/5)
- Measuring non-Gaussianity by kurtosis
  - Kurtosis: the 4th-order cumulant of a random variable, kurt(z) = E[z^4] − 3 (E[z^2])^2
  - If kurt(z) is zero: Gaussian
  - If kurt(z) is positive: super-Gaussian
  - If kurt(z) is negative: sub-Gaussian
  - Maximization of |kurt(z)| by a gradient method; part of the gradient simply changes the norm of w, so w is renormalized after each step
  - Fast fixed-point algorithm (FastICA); a kurtosis sketch follows
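A minimal sketch of the kurtosis test on synthetic samples:

```python
import numpy as np

def kurt(z):
    """4th-order cumulant: E[z^4] - 3 (E[z^2])^2 of a zero-meaned variable."""
    z = z - z.mean()
    return np.mean(z**4) - 3 * np.mean(z**2) ** 2

rng = np.random.default_rng(0)
print(kurt(rng.normal(size=100_000)))   # ~0  : Gaussian
print(kurt(rng.laplace(size=100_000)))  # > 0 : super-Gaussian (peaked)
print(kurt(rng.uniform(size=100_000)))  # < 0 : sub-Gaussian (flat)
```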
27. PCA vs LDA vs ICA
- PCA: suited to dimensionality reduction
- LDA: suited to pattern classification when the number of training samples in each class is large
- ICA: suited to blind source separation, or to classification using ICs when the class labels of the training data are not available
28. References
- Simon Haykin, Neural Networks: A Comprehensive Foundation, 2nd Edition, Prentice Hall
- Marian Stewart Bartlett, Face Image Analysis by Unsupervised Learning, Kluwer Academic Publishers
- A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, John Wiley & Sons, Inc.
- D. L. Swets and J. Weng, "Using Discriminant Eigenfeatures for Image Retrieval", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 8, pp. 831-836, August 1996