Title: Principal Components
1. Principal Components
4. Principal Components (PC)
- Objective: Given a data matrix of dimensions n x p
(n elements and p variables), try to represent
these data by using r variables (r < p) with
minimum loss of information
5. We want to find a new set of p variables, Z,
which are linear combinations of the original X
variables, such that:
- r of them contain all the information
- The remaining p-r variables are noise
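The construction above can be sketched numerically. The following is a minimal illustration, assuming the new variables are obtained from the eigenvectors of the sample covariance matrix (the standard construction, though the slides have not yet stated it at this point). The function name `principal_components` and the simulated data are hypothetical, chosen only for this example.

```python
import numpy as np

def principal_components(X, r):
    """Return the first r component scores and the share of variance they retain."""
    Xc = X - X.mean(axis=0)                 # center each variable
    S = np.cov(Xc, rowvar=False)            # p x p sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)    # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # reorder from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Z = Xc @ eigvecs[:, :r]                 # n x r matrix of new variables
    explained = eigvals[:r].sum() / eigvals.sum()
    return Z, explained

rng = np.random.default_rng(0)
n = 200
# Hypothetical data: 2 latent factors spread over p = 5 variables, plus small noise.
latent = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, 5))
X = latent @ loadings + 0.05 * rng.normal(size=(n, 5))

Z, explained = principal_components(X, r=2)
print(Z.shape, round(float(explained), 3))
```

In this simulated case r = 2 new variables should retain almost all of the variance, and the remaining p - r = 3 directions carry only the added noise.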
6. First interpretation of principal components
Optimal Data Representation
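7. Projection of a point in direction a: minimizing
the squared distance r_i implies maximizing the
variance of the projection z_i (assuming zero-mean variables), because
$x_i^T x_i = r_i^T r_i + z_i^T z_i$
(Figure: a point x_i, its projection z_i on the direction a, and the residual r_i.)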
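The decomposition on this slide can be checked numerically. The sketch below assumes a unit-length direction a and simulated zero-mean data; the helper `decompose` and the particular direction are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3)) @ np.diag([3.0, 1.0, 0.5])
X = X - X.mean(axis=0)            # zero-mean variables, as the slide assumes

def decompose(X, a):
    """Split each row x_i into its projection on a and the orthogonal residual."""
    a = a / np.linalg.norm(a)     # unit-length direction
    z = X @ a                     # scalar projections z_i
    R = X - np.outer(z, a)        # residual vectors r_i
    return z, R

a = np.array([1.0, 1.0, 0.0])     # an arbitrary direction, chosen only for illustration
z, R = decompose(X, a)

total = np.sum(X ** 2)            # sum over i of x_i^T x_i
resid = np.sum(R ** 2)            # sum over i of r_i^T r_i
proj = np.sum(z ** 2)             # sum over i of z_i^2
print(np.isclose(total, resid + proj))

# The leading eigenvector of X^T X maximizes proj, hence minimizes resid.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)
z1, R1 = decompose(X, eigvecs[:, -1])
print(np.sum(R1 ** 2) <= resid)
```

Because the total sum of squares is fixed by the data, any direction that increases the projected sum of squares must decrease the residual sum of squares by the same amount.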
11. Optimal Prediction
Second interpretation of PC
Find a new variable $z_i = a^T X_i$ which is optimal to
predict the value of X_i in each element. In
general, find r variables, $z_i = A_r^T X_i$, which are
optimal to forecast all X_i under the least
squared error criterion.
It is easy to see that the solution is that $z_i = a^T X_i$
must have maximum variance.
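A small numerical check of this prediction property follows. It is a sketch under stated assumptions: the data are simulated, the prediction of X from the new variables is done by ordinary least squares, and the names `prediction_error`, `A_pc`, and `A_random` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))
X = X - X.mean(axis=0)            # zero-mean variables

def prediction_error(X, A):
    """Squared error of the best linear prediction of X from Z = X A."""
    Z = X @ A
    B, *_ = np.linalg.lstsq(Z, X, rcond=None)   # least squares coefficients
    return np.sum((X - Z @ B) ** 2)

r = 2
eigvals, eigvecs = np.linalg.eigh(X.T @ X)      # ascending eigenvalues
A_pc = eigvecs[:, ::-1][:, :r]                  # the r eigenvectors with largest eigenvalues
A_random = rng.normal(size=(4, r))              # arbitrary linear combinations, for comparison

err_pc = prediction_error(X, A_pc)
err_random = prediction_error(X, A_random)
print(err_pc <= err_random)
```

The error obtained with the maximum-variance combinations equals the sum of the discarded eigenvalues of X^T X, which is the smallest error any r linear combinations can reach.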
12. Third interpretation of PC: Find the optimal
direction to represent the data, that is, the axis of the
ellipsoid which contains the data.
The line which minimizes the orthogonal distances
provides the axes of the ellipsoid.
This is the idea of Pearson's orthogonal regression.
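To make the contrast with ordinary regression concrete, here is a two-variable sketch. It assumes simulated data and compares the line given by the leading eigenvector of the covariance matrix with the usual least squares line of the second variable on the first; the helper `orthogonal_sse` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
t = rng.normal(size=200)
X = np.column_stack([t, 0.5 * t]) + 0.3 * rng.normal(size=(200, 2))
X = X - X.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
a = eigvecs[:, -1]                        # major axis of the data ellipse
slope_orthogonal = a[1] / a[0]            # orthogonal regression line through the mean
slope_ols = (X[:, 0] @ X[:, 1]) / (X[:, 0] @ X[:, 0])   # least squares of x2 on x1

def orthogonal_sse(X, slope):
    """Sum of squared perpendicular distances from the points to a line through the origin."""
    d = np.array([1.0, slope]) / np.hypot(1.0, slope)   # unit vector along the line
    resid = X - np.outer(X @ d, d)
    return np.sum(resid ** 2)

print(orthogonal_sse(X, slope_orthogonal) <= orthogonal_sse(X, slope_ols))
```

Ordinary least squares measures distances vertically, so its line generally differs from the principal axis, which measures them perpendicular to the line.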
16. Second component
21. Properties of PC
27. Standardized PC
29. Example Inves
30. Example Inves
32. Example Medifis
35. Example Mundodes
36. Example Mundodes
37. Example for image analysis
39. The analysis has been done with 16 images.
Instead of sending 16 matrices of N^2 pixels,
PC allows us to send a 16 x 3 matrix with the values
of the components and a 3 x N^2 matrix with the values of
the new variables. We save
If instead of 16 images we have 100 images, we
save 95%.
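The storage argument can be written as a short calculation. This is a sketch under assumptions: the function `storage_saving` is hypothetical, the image side N = 256 is an illustrative value not given in the slides, and the number of components kept in the 100-image case is not stated in the slides, so it is left as a parameter r.

```python
def storage_saving(m, r, n_pixels):
    """Fraction of values saved by sending r components instead of m full images.

    m        : number of images
    r        : number of principal components kept (an assumption of this sketch)
    n_pixels : pixels per image, N^2 in the slide's notation
    """
    original = m * n_pixels                 # m matrices of N^2 pixels
    compressed = m * r + r * n_pixels       # an m x r matrix plus an r x N^2 matrix
    return 1.0 - compressed / original

N = 256                                     # hypothetical image side, for illustration only
saving_16 = storage_saving(16, 3, N * N)
saving_100 = storage_saving(100, 3, N * N)
print(saving_16 < saving_100)
```

When N^2 is much larger than r, the saving is close to 1 - r/m, so it grows as the number of images m increases for a fixed number of components.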