Title: Principal Component Analysis: Preliminary Studies
1 Principal Component Analysis: Preliminary Studies
- Émille E. O. Ishida
- IF - UFRJ
- First Rio-Saclay Meeting: Physics Beyond the Standard Model, Rio de Janeiro, Dec/2006
2 The main objective of Statistics, Physics, and Science: Simplification
Statistics is the art of extracting simple, comprehensible facts that tell us what we want to know for practical reasons.
Principal Component Analysis (PCA) is a tool for simplifying one particular class of data...
3 For example...
How are these parameters related to each other?
n objects and p things we know about them...
n = 6 objects and p = 4 things we know about each of them:
- height
- number of publications
- flier miles
- fuel consumption
4 For example...
Do people who spend most of their lives in airports publish more?
Do people with inefficient cars fly more, or do just the ones with lots of publications?
Do these correlations represent any real causal connection?
Or is it just that, once you buy a car, you stop publishing and give lots of talks in exotic foreign locations?
5 First try
Plot everything against everything else...
...as the number of parameters increases, this becomes impossibly complicated!
PCA looks for sets of parameters that always correlate together.
The first application of PCA was in social science...
Ex.: give a sample of n people a set of p exams testing their creativity, memory, math skills... and look for correlations.
Result: nearly all tests correlate with each other, indicating that one underlying variable could predict the performance in all tests:
IQ... an infamous beginning!
6 General Idea
Given a sample of n objects with p measured quantities $x_i$ ($i = 1, 2, 3, \ldots, p$),
find a new set of p orthogonal variables ($\xi_1, \ldots, \xi_p$), each a linear combination of the original ones:
$\xi_j = a_{1j} x_1 + a_{2j} x_2 + \ldots + a_{pj} x_p$
These are the Principal Components.
Determine the coefficients $a_{ij}$ such that the smallest number of new variables accounts for as much of the sample variance as possible.
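The idea of keeping the smallest number of new variables can be sketched in Python with NumPy. This is a minimal illustration, not the code used in this work: the helper names `principal_components` and `n_components_needed`, the 95% threshold, and the synthetic sample (p = 4 quantities driven by one hidden variable plus noise) are all assumptions made for the example. Here `coeffs[i, j]` plays the role of the coefficient a_ij of the j-th new variable.

```python
import numpy as np

def principal_components(X):
    """Return (variances, coeffs), sorted by decreasing variance.

    X has shape (n, p): n objects, p measured quantities.
    Column j of coeffs holds the coefficients of the j-th new variable.
    """
    Xc = X - X.mean(axis=0)                    # centre each quantity
    cov = Xc.T @ Xc / (X.shape[0] - 1)         # p x p sample covariance
    variances, coeffs = np.linalg.eigh(cov)    # ascending eigenvalues
    order = np.argsort(variances)[::-1]        # largest variance first
    return variances[order], coeffs[:, order]

def n_components_needed(variances, fraction=0.95):
    """Smallest number of components whose variances sum to `fraction` of the total."""
    cumulative = np.cumsum(variances) / np.sum(variances)
    k = int(np.searchsorted(cumulative, fraction)) + 1
    return min(k, len(variances))

# Hypothetical sample: one hidden variable drives all four measured quantities.
rng = np.random.default_rng(1)
hidden = rng.normal(size=200)
weights = np.array([1.0, 0.8, -0.6, 0.5])
X = hidden[:, None] * weights + 0.1 * rng.normal(size=(200, 4))

variances, coeffs = principal_components(X)
k = n_components_needed(variances, fraction=0.95)
print(variances / variances.sum())   # fraction of the variance per component
print(k)                             # number of components kept
```

With a sample built this way, the first component should carry nearly all of the variance, which is the situation described for the exam scores on the previous slide.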
7 Basic Statistics
Variance
Mean Value
Covariance
http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
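For reference, the three quantities can be written as follows. This is a sketch of the usual sample definitions, assuming n measurements $x_k$, $y_k$ of two quantities and the n - 1 normalization; the symbols are introduced here for illustration.

```latex
\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x_k, \qquad
\mathrm{var}(x) = \frac{1}{n-1}\sum_{k=1}^{n} (x_k - \bar{x})^2, \qquad
\mathrm{cov}(x, y) = \frac{1}{n-1}\sum_{k=1}^{n} (x_k - \bar{x})(y_k - \bar{y})
```

With these definitions, $\mathrm{cov}(x, x) = \mathrm{var}(x)$, so the variances sit on the diagonal of the covariance matrix.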
8 Covariance Matrix in 2-D
Eigenvectors → new axes (new uncorrelated variables)
Eigenvalues → variances in the direction of the Principal Components
The largest eigenvalue → First Principal Component
http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
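The 2-D statements above can be checked numerically. The sketch below uses NumPy on a made-up correlated sample (the slope 0.8 and the noise level are arbitrary choices for illustration) and verifies that the data projected onto the eigenvectors are uncorrelated, with variances equal to the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D sample: y is strongly correlated with x.
x = rng.normal(size=500)
y = 0.8 * x + 0.3 * rng.normal(size=500)
data = np.column_stack([x, y])

# 2x2 covariance matrix (rows = objects, columns = variables).
cov = np.cov(data, rowvar=False)

# eigh is meant for symmetric matrices; it returns ascending eigenvalues.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]            # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project the centred data onto the eigenvectors: the new variables.
pcs = (data - data.mean(axis=0)) @ eigvecs

# Covariance of the new variables: diagonal, with the eigenvalues on the diagonal.
cov_pcs = np.cov(pcs, rowvar=False)
print(eigvals)
print(np.round(cov_pcs, 6))
```

The first column of `eigvecs` is the direction of the First Principal Component.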
9 But... that's not our case...
We want to make inferences about a model using a sample of data...
Parameter Estimation
Consistency, Bias, Efficiency, Robustness
(http://pdg.lbl.gov/)
10 The Method of Maximum Likelihood
(http://pdg.lbl.gov/)
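As a reminder of the standard definition, in notation assumed here for illustration: for n independent measurements $x_k$ drawn from a p.d.f. $f(x; \boldsymbol{\theta})$ with parameters $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$, the likelihood function and the condition satisfied by the maximum-likelihood estimators $\hat{\boldsymbol{\theta}}$ are

```latex
L(\boldsymbol{\theta}) = \prod_{k=1}^{n} f(x_k; \boldsymbol{\theta}), \qquad
\frac{\partial \ln L}{\partial \theta_i}\bigg|_{\boldsymbol{\theta} = \hat{\boldsymbol{\theta}}} = 0,
\quad i = 1, \ldots, m
```

In practice it is usually easier to work with $\ln L$, since the product becomes a sum.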
11 For an unbiased estimator...
We can calculate the covariance between the parameters of the theory
Fisher Matrix
(http://pdg.lbl.gov/)
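A small numerical sketch of this step, under stated assumptions: Gaussian, independent measurement errors, for which the Fisher matrix reduces to F_ij = Σ_k (∂m_k/∂θ_i)(∂m_k/∂θ_j)/σ_k², and a toy straight-line model m(x) = a + b x. The function name `fisher_matrix`, the model and the numbers are hypothetical and only serve the example. The inverse of F estimates the parameter covariance, and the eigenvectors of F are the uncorrelated parameter combinations.

```python
import numpy as np

def fisher_matrix(jacobian, sigma):
    """Fisher matrix for independent Gaussian errors.

    jacobian[k, i] = d(model at point k) / d(theta_i); sigma[k] = error at point k.
    """
    weighted = jacobian / sigma[:, None]
    return weighted.T @ weighted

# Toy model m(x) = a + b*x measured at 20 points with error 0.1 (illustrative values).
x = np.linspace(0.0, 1.0, 20)
sigma = np.full_like(x, 0.1)
jacobian = np.column_stack([np.ones_like(x), x])   # derivatives w.r.t. (a, b)

F = fisher_matrix(jacobian, sigma)
cov = np.linalg.inv(F)                 # estimated covariance of (a, b)

# Eigenvectors of F: uncorrelated combinations of a and b.
eigvals, eigvecs = np.linalg.eigh(F)
pc_sigmas = 1.0 / np.sqrt(eigvals)     # uncertainty of each combination

print(cov)
print(pc_sigmas)
```

The best-determined combination is the one with the largest eigenvalue of F, that is, the smallest entry of `pc_sigmas`.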
12 What about Cosmology?
Direct evidence for an accelerated expansion
Can we get information out of SN Ia observations
without the assumption of General Relativity?
13 Definitions...
14 As proposed by Shapiro & Turner (2006)...
Gaussian probability distribution in each bin...
- Δz = 0.05
- Data from the Gold Sample (Riess et al.)
Distance modulus
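To make the link between a binned deceleration parameter and the distance modulus concrete, here is a hedged Python sketch. It assumes a flat FRW metric, q(z) constant in bins of width Δz, and the standard kinematic relations H(z) = H0 exp[∫ (1 + q)/(1 + z') dz'], d_L = c (1 + z) ∫ dz'/H(z'), μ = 5 log10(d_L / Mpc) + 25. The function name `distance_modulus`, the value H0 = 70 km/s/Mpc and the grid size are illustrative choices, not taken from the analysis presented here.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def distance_modulus(z, q_bins, dz=0.05, h0=70.0, n_grid=2000):
    """Distance modulus at redshift z > 0 for piecewise-constant q(z).

    q_bins[i] is the assumed value of q in the bin [i*dz, (i+1)*dz);
    the last bin is extended to any higher redshift.
    """
    q_bins = np.asarray(q_bins, dtype=float)
    zz = np.linspace(0.0, z, n_grid)
    idx = np.minimum((zz / dz).astype(int), len(q_bins) - 1)
    q = q_bins[idx]
    # ln(H/H0) = integral of (1 + q)/(1 + z'), cumulative trapezoid rule
    integrand = (1.0 + q) / (1.0 + zz)
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(zz)
    ln_h = np.concatenate([[0.0], np.cumsum(steps)])
    hubble = h0 * np.exp(ln_h)
    # d_L = c (1 + z) * integral of dz'/H(z'), in Mpc
    inv_h = 1.0 / hubble
    d_l = C_KM_S * (1.0 + z) * np.sum(0.5 * (inv_h[1:] + inv_h[:-1]) * np.diff(zz))
    return 5.0 * np.log10(d_l) + 25.0

# Usage: 20 bins of width 0.05 covering 0 < z < 1, all set to q = 0.5.
mu = distance_modulus(1.0, [0.5] * 20)
print(mu)
```

For constant q = 0.5 the integrals have the closed form d_L = (2c/H0)(1 + z)(1 - 1/√(1 + z)), which gives a quick check of the numerical integration.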
15 The Fisher Matrix
Observation about σ...
16 [Figure panels: PC1, PC2, PC3, PC4, PC5, PC6]
17 Reconstruction of q(z)
We need more data!
18 Next Steps...
Small corrections in the present code (optimization)
Change the observable
Get used to this procedure and be able to handle large data sets in a model-independent way
19 References
- D. Huterer and G. Starkman, Parametrization of dark energy properties: a principal-component approach, Physical Review Letters 90 (3), January 2003
- C. Shapiro and M. S. Turner, What do we really know about cosmic acceleration?, arXiv:astro-ph/0512586
- G. Cowan, Statistical Data Analysis, Clarendon Press, Oxford (1998)
- P. J. Francis and B. J. Wills, Introduction to Principal Component Analysis, arXiv:astro-ph/9905079
- W.-M. Yao et al., Journal of Physics G 33, 1 (2006), available on the PDG WWW pages (URL: http://pdg.lbl.gov/)
20 Shapiro & Turner (2006): Principal Components