Factor Analysis and Principal Components

Slides: 53

Provided by: asNi7

1
Factor Analysis and Principal Components

Factor analysis, with principal components
presented as a subset of factor analysis
techniques, which it is.

2
A Reference

The following 13 slides come from
Multivariate Data Analysis Using SPSS
By John Zhang
ARL, IUP

3
Factor Analysis-1

The main goal of factor analysis is data
reduction. A typical use of factor analysis is in
survey research, where a researcher wishes to
represent a number of questions with a smaller
number of factors.
Two questions in factor analysis:
How many factors are there, and what do they
represent (interpretation)?
Two technical aids:
Eigenvalues
Percentage of variance accounted for

4
Factor Analysis-2

Two types of factor analysis:
Exploratory: introduced here
Confirmatory: SPSS AMOS
Theoretical basis:
Correlations among variables are explained by
underlying factors
An example of a mathematical one-factor model for
two variables:
V1 = L1*F1 + E1
V2 = L2*F1 + E2

5
Factor Analysis-3

Each variable is composed of a common factor (F1)
multiplied by a loading coefficient (L1, L2: the
lambdas or factor loadings) plus a random
component.
V1 and V2 correlate because they share the common
factor, and their correlation should relate to the
factor loadings; thus, the factor loadings can be
estimated from the correlations.
A set of correlations can yield different factor
loadings (i.e. the solutions are not unique).
One should pick the simplest solution.
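
As a worked illustration (not part of the original slides), take the one-factor model V1 = L1*F1 + E1, V2 = L2*F1 + E2 and assume that V1, V2 and F1 are standardized and that E1 and E2 are uncorrelated with F1 and with each other. Under those assumptions the correlation between the two variables is the product of their loadings:

```latex
\begin{aligned}
\operatorname{corr}(V_1, V_2)
  &= \operatorname{cov}(L_1 F_1 + E_1,\; L_2 F_1 + E_2) \\
  &= L_1 L_2 \operatorname{var}(F_1) \\
  &= L_1 L_2 .
\end{aligned}
```

This also shows why the solution is not unique: a hypothetical observed correlation of 0.48 is reproduced equally well by (L1, L2) = (0.6, 0.8) and by (L1, L2) = (0.8, 0.6).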

6
Factor Analysis-4
A factor solution needs to be confirmed:
By a different factor method
By a different sample
That is, the findings should not differ by
method of analysis or by sample.
More on terminology:
Factor loading: interpreted as the Pearson
correlation between the variable and the factor
Communality: the proportion of variability for a
given variable that is explained by the factors
Extraction: the process by which the factors are
determined from a large set of variables
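
The communality definition above can be sketched in a few lines. This is a minimal illustration, assuming standardized variables and orthogonal factors, in which case the communality is the sum of a variable's squared loadings. The loading values and the names q1 to q3 are hypothetical and do not come from the slides.

```python
# Hypothetical loadings: each row is a variable, each column an orthogonal factor.
loadings = {
    "q1": [0.8, 0.1],
    "q2": [0.7, 0.3],
    "q3": [0.2, 0.6],
}


def communality(row):
    """Sum of squared loadings: share of the variable's variance explained by the factors."""
    return sum(value * value for value in row)


communalities = {name: communality(row) for name, row in loadings.items()}
# Whatever the factors do not explain is unique to the variable.
uniquenesses = {name: 1.0 - h2 for name, h2 in communalities.items()}

for name in loadings:
    print(f"{name}: communality={communalities[name]:.2f}, unique={uniquenesses[name]:.2f}")
```

In this made-up example q3 has the lowest communality, so it is the variable the two factors describe least well.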

7
Factor Analysis-5 (Principal components)

Principal components: one of the extraction
methods
A principal component is a linear combination of
observed variables that is independent of
(orthogonal to) the other components.
The first component accounts for the largest
amount of variance in the input data; the second
component accounts for the largest amount of the
remaining variance.
That the components are orthogonal means they are
uncorrelated.
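
For two standardized variables the principal components have a simple closed form, which makes the properties above easy to check by hand. The sketch below is an illustration under that two-variable assumption, using made-up data: the components are the scaled sum and difference of the z-scores, their variances are 1 + r and 1 - r, and they are uncorrelated.

```python
import math

# Hypothetical data for two positively correlated survey items (not from the slides).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
y = [1.0, 3.0, 5.0, 4.0, 6.0, 8.0]


def standardize(values):
    """Return z-scores using the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def mean_product(a, b):
    """Mean of elementwise products; for zero-mean data this is the covariance."""
    return sum(p * q for p, q in zip(a, b)) / len(a)


zx, zy = standardize(x), standardize(y)
r = mean_product(zx, zy)  # correlation between the two items

# With r > 0 the scaled sum is the first component and the scaled difference the second.
pc1 = [(a + b) / math.sqrt(2) for a, b in zip(zx, zy)]
pc2 = [(a - b) / math.sqrt(2) for a, b in zip(zx, zy)]

var_pc1 = mean_product(pc1, pc1)  # equals 1 + r
var_pc2 = mean_product(pc2, pc2)  # equals 1 - r
cov_pc = mean_product(pc1, pc2)   # equals 0: the components are uncorrelated

print(f"r={r:.3f}, var(pc1)={var_pc1:.3f}, var(pc2)={var_pc2:.3f}, cov={cov_pc:.3f}")
```

The two component variances always add up to 2, the total standardized variance of the two inputs, so nothing is lost when both components are kept.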

8
Factor Analysis-6 (Principal components)

Possible application of principal components:
E.g. in survey research, it is common to have
many questions addressing one issue (e.g.
customer service). It is likely that these
questions are highly correlated. It is
problematic to use such variables in some
statistical procedures (e.g. regression). One can
instead use factor scores, computed from the
factor loadings on each orthogonal component.

9
Factor Analysis-7 (Principal components)

Principal components vs. other extraction methods:
Principal components focuses on accounting for the
maximum amount of variance (the diagonal of a
correlation matrix).
Other extraction methods (e.g. principal axis
factoring) focus more on accounting for the
correlations between variables (the off-diagonal
correlations).
A principal component can be defined as a unique
combination of variables, but the factors from the
other methods cannot.
Principal components are used for data reduction
but are more difficult to interpret.

10
Factor Analysis-8

Number of factors:
Eigenvalues are often used to determine how many
factors to take.
Take as many factors as there are eigenvalues
greater than 1.
An eigenvalue represents the amount of
standardized variance in the variables accounted
for by a factor.
The amount of standardized variance in a variable
is 1.
The sum of the retained eigenvalues, divided by
the number of variables, is the proportion of
variance accounted for.
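
The eigenvalue rule can be sketched as follows. The eigenvalues are hypothetical values for a six-variable correlation matrix, chosen only so that they sum to 6; they are not taken from any output in this presentation.

```python
# Hypothetical eigenvalues of a 6-variable correlation matrix (they sum to 6).
eigenvalues = [2.9, 1.4, 0.7, 0.5, 0.3, 0.2]

# Each standardized variable contributes variance 1, so total variance = number of variables.
n_variables = len(eigenvalues)

# Rule from the slide: keep the factors whose eigenvalue exceeds 1.
retained = [ev for ev in eigenvalues if ev > 1.0]

# Proportion of total variance carried by each factor, and by the retained set.
proportion = [ev / n_variables for ev in eigenvalues]
explained_by_retained = sum(retained) / n_variables

print(f"factors retained: {len(retained)}")
print(f"variance explained by retained factors: {explained_by_retained:.1%}")
```

A factor with an eigenvalue below 1 explains less variance than a single original variable, which is the usual argument for the cut-off.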

11
Factor Analysis-9

Rotation
Objective: to facilitate interpretation
Orthogonal rotation: done when data reduction is
the objective and factors need to be orthogonal
Varimax: attempts to simplify interpretation by
maximizing the variance of the squared variable
loadings on each factor
Quartimax: simplifies the solution by finding a
rotation that produces high and low loadings
across factors for each variable
Oblique rotation: used when there is reason to
allow factors to be correlated
Oblimin and Promax (Promax runs fast)
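
To make the idea of an orthogonal rotation concrete, here is a rough sketch for the two-factor case. It is not how SPSS computes Varimax: it simply tries rotation angles in one-degree steps and keeps the one that maximizes the raw varimax criterion (the variance of the squared loadings, summed over factors, without Kaiser normalization). The loading matrix is hypothetical.

```python
import math

# Hypothetical unrotated loadings on two factors (rows are variables).
loadings = [
    [0.7, 0.5],
    [0.6, 0.6],
    [0.7, -0.4],
    [0.6, -0.5],
]


def rotate(matrix, theta):
    """Apply an orthogonal rotation by angle theta to a two-factor loading matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[a * c + b * s, -a * s + b * c] for a, b in matrix]


def varimax_criterion(matrix):
    """Sum over factors of the variance of the squared loadings (raw varimax)."""
    p = len(matrix)
    total = 0.0
    for j in range(len(matrix[0])):
        squared = [row[j] ** 2 for row in matrix]
        mean_sq = sum(squared) / p
        total += sum((v - mean_sq) ** 2 for v in squared) / p
    return total


# Brute-force search over 0..89 degrees; real software solves for the angle iteratively.
angles = [math.radians(d) for d in range(0, 90)]
best_theta = max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))
rotated = rotate(loadings, best_theta)

# Rotation redistributes the loadings but leaves each variable's communality unchanged.
communality_before = [a * a + b * b for a, b in loadings]
communality_after = [a * a + b * b for a, b in rotated]

for row in rotated:
    print([round(v, 2) for v in row])
```

The point of the sketch is the invariance in the last step: an orthogonal rotation changes which factor each variable loads on, not how much of the variable the factors explain in total.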

12
Factor Analysis-10

Factor scores: if you are satisfied with a factor
solution
You can request that a new set of variables be
created that represents the scores of each
observation on the factors (difficult to
interpret)
You can use the lambda coefficients to judge which
variables are highly related to the factor, then
compute the sum or the mean of these variables for
further analysis (easy to interpret)
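
The second, easier-to-interpret option can be sketched like this. The loadings, the responses, and the 0.5 cut-off for "highly related" are all assumptions made for illustration; the slides do not give a threshold.

```python
# Hypothetical loadings of four survey items on one factor.
loadings = {"q1": 0.82, "q2": 0.75, "q3": 0.15, "q4": 0.68}

# Hypothetical responses: one dict per respondent.
responses = [
    {"q1": 4, "q2": 5, "q3": 1, "q4": 4},
    {"q1": 2, "q2": 1, "q3": 5, "q4": 3},
]

THRESHOLD = 0.5  # assumed cut-off for "highly related"; choose to suit the study

# Keep only the items whose loading on the factor is large in absolute value.
selected = [item for item, lam in loadings.items() if abs(lam) >= THRESHOLD]


def scale_score(person, items):
    """Mean of the selected items for one respondent."""
    return sum(person[i] for i in items) / len(items)


scores = [scale_score(person, selected) for person in responses]
print(selected, scores)
```

Items with large negative loadings would need to be reverse-scored before averaging; the sketch assumes all selected loadings are positive.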

13
Factor Analysis-11

Sample size: the sample size should be about 10
to 15 times the number of variables (as for other
multivariate procedures)
Number of methods: there are 8 factoring methods,
including principal components
Principal axis: accounts for correlations between
the variables
Unweighted least-squares: minimizes the residual
between the observed and the reproduced
correlation matrix

14
Factor Analysis-12

Generalized least-squares: similar to unweighted
least-squares but gives more weight to the
variables with stronger correlations
Maximum likelihood: generates the solution that is
the most likely to have produced the correlation
matrix
Alpha factoring: considers the variables as a
sample, not using factor loadings
Image factoring: decomposes the variables into a
common part and a unique part, then works with the
common part

15
Factor Analysis-13

Recommendations:
Principal components and principal axis are the
most commonly used methods
When there is multicollinearity, use principal
components
Rotations are often done. Try to use Varimax
16
Reference

Factor Analysis from SPSS
Much of the wording comes from the SPSS help and
tutorial.

17
Factor Analysis

Factor Analysis is primarily used for data
reduction or structure detection.
The purpose of data reduction is to remove
redundant (highly correlated) variables from the
data file, perhaps replacing the entire data file
with a smaller number of uncorrelated variables.
The purpose of structure detection is to examine
the underlying (or latent) relationships between
the variables.

18
Factor Analysis

The Factor Analysis procedure has several
extraction methods for constructing a solution.
For Data Reduction. The principal components
method of extraction begins by finding a linear
combination of variables (a component) that
accounts for as much variation in the original
variables as possible. It then finds another
component that accounts for as much of the
remaining variation as possible and is
uncorrelated with the previous component,
continuing in this way until there are as many
components as original variables. Usually, a few
components will account for most of the
variation, and these components can be used to
replace the original variables. This method is
most often used to reduce the number of variables
in the data file.
For Structure Detection. Other Factor Analysis
extraction methods go one step further by adding
the assumption that some of the variability in
the data cannot be explained by the components
(usually called factors in other extraction
methods). As a result, the total variance
explained by the solution is smaller; however,
the addition of this structure to the factor
model makes these methods ideal for examining
relationships between the variables.
With any extraction method, the two questions
that a good solution should try to answer are
"How many components (factors) are needed to
represent the variables?" and "What do these
components represent?"

19
Factor Analysis: Data Reduction

An industry analyst would like to predict
automobile sales from a set of predictors.
However, many of the predictors are correlated,
and the analyst fears that this might adversely
affect her results.
This information is contained in the file
car_sales.sav. Use Factor Analysis with
principal components extraction to focus the
analysis on a manageable subset of the
predictors.

20
Factor Analysis: Structure Detection

A telecommunications provider wants to better
understand service usage patterns in its customer
database. If services can be clustered by usage,
the company can offer more attractive packages to
its customers.
A random sample from the customer database is
contained in telco.sav. Use Factor Analysis to
determine the underlying structure in service
usage.
Use Principal Axis Factoring.

21
Example of Factor Analysis: Structure Detection
A telecommunications provider wants to better
understand service usage patterns in its customer
database. Selecting service offerings.
22
Example of Factor Analysis: Descriptives
Click Descriptives. Recommend checking Initial
Solution (default). In addition, check
Anti-image and KMO and Bartlett's test of
sphericity.
23
Example of Factor Analysis: Extraction
Click Extraction. Select Method: Principal axis
factoring. Recommend: keep defaults but also
check Scree plot.
24
Example of Factor Analysis: Rotation
Click Rotation. Select Varimax and Loading
plot(s).
25
Understanding the Output
The Kaiser-Meyer-Olkin Measure of Sampling
Adequacy is a statistic that indicates the
proportion of variance in your variables that
might be caused by underlying factors. Factor
analysis is perhaps not usable if KMO < 0.5.
Bartlett's test of sphericity tests the
hypothesis that your correlation matrix is an
identity matrix, which would indicate that your
variables are unrelated and therefore unsuitable
for structure detection. If Sig. < 0.05, then
factor analysis may be helpful.
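
For reference, the two statistics are commonly written as below. This is a summary of the standard textbook formulas rather than something shown in the slides. Here r_ij are the observed correlations, a_ij are the partial (anti-image) correlations between variables i and j controlling for the others, R is the correlation matrix, n is the sample size, and p is the number of variables.

```latex
\mathrm{KMO} =
  \frac{\sum_{i \neq j} r_{ij}^{2}}
       {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}},
\qquad
\chi^{2} = -\left(n - 1 - \frac{2p + 5}{6}\right) \ln \lvert R \rvert,
\quad
\mathrm{df} = \frac{p(p-1)}{2}
```

KMO approaches 1 when the partial correlations are small relative to the raw correlations, that is, when the variables share common factors. Bartlett's statistic is 0 when R is the identity matrix (its determinant is 1) and grows as the variables become more strongly correlated.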
26
Understanding the Output
Extraction communalities are estimates of the
variance in each variable accounted for by the
factors in the factor solution. Small values
indicate variables that do not fit well with the
factor solution, and should possibly be dropped
from the analysis. The lower values of Multiple
lines and Calling card show that they don't fit
as well as the others.
27
Understanding the Output
Before rotation
Only three factors in the initial solution have
eigenvalues greater than 1. Together, they
account for almost 65% of the variability in the
original variables. This suggests that three
latent influences are associated with service
usage, but there remains room for a lot of
unexplained variation.
28
Understanding the Output
After rotation
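After extraction and rotation, approximately 56%
of the variation is explained, about a 10% loss in
explanation of the variation relative to the
initial solution. An orthogonal rotation does not
itself change the total variance explained; the
loss comes from the principal axis extraction,
which sets aside variance unique to each variable.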
29
Understanding the Output
Before rotation
The relationships in the unrotated factor matrix
are somewhat clear. The third factor is
associated with Long distance last month. The
second corresponds most strongly to Equipment
last month, Internet, and Electronic billing. The
first factor is associated with Toll free last
month, Wireless last month, Voice mail, Paging
service, Caller ID, Call waiting, Call
forwarding, and 3-way calling.
In general, there are a lot of services that have
correlations greater than 0.2 with multiple
factors, which muddies the picture. The rotated
factor matrix should clear this up.
30
Understanding the Output
After rotation
The first rotated factor is most highly
correlated with Toll free last month, Caller ID,
Call waiting, Call forwarding, and 3-way calling.
These variables are not particularly correlated
with the other two factors. The second factor is
most highly correlated with Equipment last month,
Internet, and Electronic billing. The third
factor is largely unaffected by the rotation.
31
Understanding the Output
Thus, there are three major groupings of
services, as defined by the services that are
most highly correlated with the three factors.
Given these groupings, you can make the following
observations about the remaining services:
Because of their moderately large correlations
with both the first and second factors, Wireless
last month, Voice mail, and Paging service bridge
the "Extras" and "Tech" groups. Calling card last
month is moderately correlated with the first and
third factors, thus it bridges the "Extras" and
"Long Distance" groups. Multiple lines is
moderately correlated with the second and third
factors, thus it bridges the "Tech" and "Long
Distance" groups. This suggests avenues for
cross-selling. For example, customers who
subscribe to extra services may be more
predisposed to accepting special offers on
wireless services than Internet services.
32
Summary What Was Learned

Using a principal axis factoring extraction, you
have uncovered three latent factors that describe
relationships between your variables. These
factors suggest various patterns of service
usage, which you can use to more efficiently
increase cross-sales.

33
Using Principal Components

Principal components can aid in clustering.
What is principal components analysis?
Principal components analysis is a statistical
technique that creates new variables that are
linear functions of the old variables. The main
goal of principal components is to reduce the
number of variables needed for analysis.

34
Principal Components Analysis (PCA)

What it is and when it should be used.

35
Introduction to PCA

What does principal components analysis do?
Takes a set of correlated variables and creates a
smaller set of uncorrelated variables.
These newly created variables are called
principal components.
There are two main objectives for using PCA:
Reduce the dimensionality of the data.
In simple English: turn p variables into fewer
than p variables.
While reducing the number of variables, we attempt
to keep as much of the information in the original
variables as possible.
Thus we try to reduce the number of variables
with as little loss of information as possible.
Identify new meaningful underlying variables.
This is often not possible.
The principal components created are linear
combinations of the original variables and often
don't lend themselves to any meaning beyond that.
There are several reasons why, and situations
where, PCA is useful.

36
Introduction to PCA

There are several reasons why PCA is useful.
PCA is helpful in discovering if abnormalities
exist in a multivariate dataset.
Clustering (which will be covered later)
PCA is helpful when it is desirable to classify
units into groups with similar attributes.
For example: in marketing you may want to
classify your customers into groups (or clusters)
with similar attributes for marketing purposes.
It can also be helpful for verifying the clusters
created when clustering.
Discriminant analysis
In some cases there may be more response
variables than independent variables. It is not
possible to use discriminant analysis in this
case.
Principal components can help reduce the number
of response variables to a number less than that
of the independent variables.
Regression
It can help address the issue of multicollinearity
in the independent variables.

37
Introduction to PCA

Formation of principal components
They are uncorrelated
The 1st principal component accounts for as much
of the variability in the data as possible.
The 2nd principal component accounts for as much
of the remaining variability as possible.
The 3rd
Etc.

38
Principal Components and Least Squares

Think of the Least Squares model

Eigenvector (mathematics): A vector which, when
acted on by a particular linear transformation,
produces a scalar multiple of the original
vector. The scalar in question is called
the eigenvalue corresponding to this eigenvector.
www.dictionary.com
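
As a small worked example of this definition (an illustration, not part of the original slide), take the correlation matrix of two variables with correlation r as the linear transformation. Multiplying it by the vectors (1, 1) and (1, -1) returns scalar multiples of those same vectors:

```latex
\begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
= (1 + r) \begin{pmatrix} 1 \\ 1 \end{pmatrix},
\qquad
\begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}
\begin{pmatrix} 1 \\ -1 \end{pmatrix}
= (1 - r) \begin{pmatrix} 1 \\ -1 \end{pmatrix}
```

So (1, 1) and (1, -1) are eigenvectors with eigenvalues 1 + r and 1 - r. In PCA the eigenvectors give the weights of each component and the eigenvalues give the variance each component accounts for.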

39
Calculation of the PCA

There are two options:
Correlation matrix.
Covariance matrix.
Using the covariance matrix will cause variables
with large variances to be more strongly
associated with components with large eigenvalues;
the opposite is true of variables with small
variances.
For this reason you should use the correlation
matrix unless the variables are comparable or have
been standardized.
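
The effect of the choice can be seen in a two-variable sketch. The helper below uses the closed-form eigen-decomposition of a symmetric 2x2 matrix, and the standard deviations (100 and 1) and correlation (0.5) are hypothetical numbers picked to exaggerate the difference in scale.

```python
import math


def eigen_2x2_symmetric(a, b, d):
    """Eigenvalues (largest first) and unit first eigenvector of [[a, b], [b, d]], b != 0."""
    mid = (a + d) / 2.0
    half_gap = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    largest, smallest = mid + half_gap, mid - half_gap
    # (b, largest - a) solves (A - largest * I) v = 0 when b is nonzero.
    vx, vy = b, largest - a
    norm = math.hypot(vx, vy)
    return largest, smallest, (vx / norm, vy / norm)


# Hypothetical variables: x measured on a large scale, y on a small one.
sd_x, sd_y, r = 100.0, 1.0, 0.5

# Covariance matrix: variances on the diagonal, r * sd_x * sd_y off the diagonal.
cov_l1, cov_l2, cov_v1 = eigen_2x2_symmetric(sd_x ** 2, r * sd_x * sd_y, sd_y ** 2)

# Correlation matrix: both variables rescaled to variance 1.
cor_l1, cor_l2, cor_v1 = eigen_2x2_symmetric(1.0, r, 1.0)

print("first component weights, covariance matrix: ", [round(w, 3) for w in cov_v1])
print("first component weights, correlation matrix:", [round(w, 3) for w in cor_v1])
```

With the covariance matrix the first component is almost entirely the large-variance variable, while with the correlation matrix both variables get equal weight.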

40
Limitations to Principal Components

PCA converts a set of correlated variables into a
smaller set of uncorrelated variables.
If the variables are already uncorrelated, then
PCA has nothing to add.
Often it is difficult or impossible to explain a
principal component. That is, principal
components often do not lend themselves to any
interpretation.

41
SAS Example of PCA

We will analyze data on crime.
CRIME RATES PER 100,000 POPULATION BY STATE.
The variables are
MURDER
RAPE
ROBBERY
ASSAULT
BURGLARY
LARCENY
AUTO
SAS CODE
PROC PRINCOMP DATA=CRIME OUT=CRIMCOMP;
RUN;

PROC PRINCOMP is the SAS command for PCA.
The dataset is CRIME and the results will be saved
to CRIMCOMP.
42
SAS Output Of Crime Example
43
More SAS Output Of Crime Example
0.09798342 = 0.22203947 - 0.12405606 (the
difference between successive eigenvalues)
The first two principal components capture
76.48% of the variation.
If you include 6 of the 7 principal components
you capture 98.23% of the variability. The 7th
component only captures 1.77%.
Proportion: the proportion of variability
explained by each principal component
individually. This value equals the
eigenvalue/(sum of the eigenvalues).
44
More SAS Output Of Crime Example
Prin1 has all positive values. This variable can
be used as a proxy for overall crime rate.
Prin2 has positive and negative values. Murder,
Rape, and Assault are all negative (Violent
Crimes). Robbery, Burglary, Larceny, and Auto are
all positive (Property). This variable can be
used for an understanding of Property vs. Violent
crime.
45
CRIME RATES PER 100,000 POPULATION BY STATE.
STATES LISTED IN ORDER OF OVERALL CRIME
RATE AS DETERMINED BY THE FIRST PRINCIPAL
COMPONENT. Lowest 10 States and Then the Top 10
States
46
CRIME RATES PER 100,000 POPULATION BY STATE.
STATES LISTED IN ORDER OF PROPERTY VS.
VIOLENT CRIME AS DETERMINED BY THE SECOND
PRINCIPAL COMPONENT. Lowest 10 States and Then
the Top 10 States
47
Correlation From SAS: First the Descriptive
Statistics (a part of the output from
Correlation)
48
Correlation Matrix
49
Correlation Matrix Just the Variables
Note that there is correlation among the crime
rates.
50
Correlation Matrix Just the Principal Components
Note that there is no correlation among the
principal components.
51
Correlation Matrix Just the Principal Components
Note the high to very high correlations with the
first few principal components; the correlations
decrease toward the last principal component.
52
What If We Told SAS to Produce Only 2 Principal
Components?
The 2 principal components produced when SAS is
asked for only 2 are exactly the same as the
first 2 produced when it computes all of them.
