Title: Exploratory factor analysis
Exploratory factor analysis
EGO GHQ-12 EFA
1) Assuming items are continuous

    Variable:
        Names are
            ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12
            f1 id;
        Missing are all (-9999);
        usevariables are
            ghq01 ghq03 ghq05 ghq07 ghq09 ghq11
            ghq02 ghq04 ghq06 ghq08 ghq10 ghq12;
        idvariable = id;
    Analysis:
        Type = EFA 1 3;

2) Assuming items are categorical

    Variable:
        Names are
            ghq01 ghq02 ghq03 ghq04 ghq05 ghq06
            ghq07 ghq08 ghq09 ghq10 ghq11 ghq12
            f1 id;
        Missing are all (-9999);
        usevariables are
            ghq01 ghq03 ghq05 ghq07 ghq09 ghq11
            ghq02 ghq04 ghq06 ghq08 ghq10 ghq12;
        categorical are
            ghq01 ghq03 ghq05 ghq07 ghq09 ghq11
            ghq02 ghq04 ghq06 ghq08 ghq10 ghq12;
        idvariable = id;
    Analysis:
        Type = EFA 1 3;

EGO GHQ-12 EFA
1) Assuming items are continuous

    EIGENVALUES FOR SAMPLE CORRELATION MATRIX

            1        2        3        4
        6.277    1.072    0.803    0.597

            5        6        7        8
        0.565    0.497    0.460    0.445

            9       10       11       12
        0.375    0.365    0.319    0.225

2) Assuming items are categorical

    EIGENVALUES FOR SAMPLE CORRELATION MATRIX

            1        2        3        4
         7.05    1.107     0.79    0.534

            5        6        7        8
        0.489     0.43    0.365    0.349

            9       10       11       12
        0.289    0.258    0.212    0.128

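As a rough aid to reading the two sets of eigenvalues, the sketch below computes the share of total variance carried by each eigenvalue and counts how many exceed 1. The "eigenvalue greater than 1" rule is an assumption brought in for illustration; the slides do not say which retention rule was used. The values are copied from the output above, and the names `continuous`, `categorical` and `summarise` are illustrative.

```python
# Eigenvalues copied from the Mplus output above (12 GHQ items).
continuous = [6.277, 1.072, 0.803, 0.597, 0.565, 0.497,
              0.460, 0.445, 0.375, 0.365, 0.319, 0.225]
categorical = [7.05, 1.107, 0.79, 0.534, 0.489, 0.43,
               0.365, 0.349, 0.289, 0.258, 0.212, 0.128]


def summarise(eigenvalues):
    """Return (share of variance per eigenvalue, number of eigenvalues > 1)."""
    # For a correlation matrix the eigenvalues sum to the number of items.
    total = sum(eigenvalues)
    shares = [ev / total for ev in eigenvalues]
    n_above_one = sum(1 for ev in eigenvalues if ev > 1.0)
    return shares, n_above_one


for label, eigs in [("continuous", continuous), ("categorical", categorical)]:
    shares, n_above_one = summarise(eigs)
    print(f"{label}: first eigenvalue share = {shares[0]:.3f}, "
          f"eigenvalues > 1: {n_above_one}")
```

Under both treatments two eigenvalues exceed 1, and the first eigenvalue takes a larger share of the variance when the items are treated as categorical.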
EGO GHQ-12 EFA
1) Assuming items are continuous

    PROMAX ROTATED LOADINGS
                    1         2
             ________  ________
    GHQ01       0.416     0.333
    GHQ03       0.727    -0.089
    GHQ05      -0.009     0.710
    GHQ07       0.369     0.348
    GHQ09      -0.013     0.871
    GHQ11       0.336     0.460
    GHQ02      -0.089     0.723
    GHQ04       0.816    -0.086
    GHQ06       0.240     0.569
    GHQ08       0.493     0.282
    GHQ10       0.229     0.627
    GHQ12       0.364     0.457

2) Assuming items are categorical

    PROMAX ROTATED LOADINGS
                    1         2
             ________  ________
    GHQ01       0.529     0.285
    GHQ03       0.787    -0.098
    GHQ05       0.045     0.718
    GHQ07       0.530     0.268
    GHQ09       0.069     0.838
    GHQ11       0.090     0.780
    GHQ02      -0.046     0.732
    GHQ04       0.859    -0.059
    GHQ06       0.230     0.625
    GHQ08       0.527     0.298
    GHQ10       0.068     0.842
    GHQ12       0.428     0.453

    PROMAX FACTOR CORRELATIONS
                    1         2
        1       1.000
        2       0.668     1.000

Item residual variances
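The content of this slide did not survive, so the following is only a sketch of how residual variances could be derived from the categorical promax solution shown earlier. It assumes the loadings are standardized, so that with two correlated factors an item's communality is l1^2 + l2^2 + 2*l1*l2*phi and its residual variance is 1 minus that. The loadings and the factor correlation of 0.668 are copied from the output; the names `loadings`, `factor_corr` and `residual_variance` are illustrative.

```python
# Promax loadings (items treated as categorical), copied from the output above.
loadings = {
    "GHQ01": (0.529, 0.285), "GHQ03": (0.787, -0.098),
    "GHQ05": (0.045, 0.718), "GHQ07": (0.530, 0.268),
    "GHQ09": (0.069, 0.838), "GHQ11": (0.090, 0.780),
    "GHQ02": (-0.046, 0.732), "GHQ04": (0.859, -0.059),
    "GHQ06": (0.230, 0.625), "GHQ08": (0.527, 0.298),
    "GHQ10": (0.068, 0.842), "GHQ12": (0.428, 0.453),
}
factor_corr = 0.668  # promax factor correlation from the same output


def residual_variance(l1, l2, phi):
    """1 - communality for a standardized two-factor oblique solution (assumed)."""
    communality = l1 ** 2 + l2 ** 2 + 2.0 * l1 * l2 * phi
    return 1.0 - communality


residuals = {item: residual_variance(l1, l2, factor_corr)
             for item, (l1, l2) in loadings.items()}

for item, value in residuals.items():
    print(f"{item}: residual variance = {value:.3f}")
```

Only the categorical solution is used because the factor correlation for the continuous solution is not shown on the slides.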
Correlation -v- regression coefficient
- Correlation coefficient: the interdependence between pairs of
  variables, i.e. the extent to which the values of the two
  variables change together
- It measures the strength and direction of the linear relationship
- For a regression line of a given gradient, a fatter ellipse means
  more scatter about the line and therefore a lower correlation
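The point about the fatter ellipse can be checked with a small simulation. The sketch below is not from the slides: it generates two made-up data sets with the same true gradient (0.5) but different amounts of scatter, then compares the fitted slope and the correlation. The seed, sample size and noise levels are arbitrary illustrative choices.

```python
import random


def slope_and_correlation(xs, ys):
    """Least-squares slope of y on x, and the Pearson correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# Same true gradient (0.5), different amounts of scatter about the line.
thin_ellipse = [0.5 * x + rng.gauss(0.0, 0.2) for x in xs]
fat_ellipse = [0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

slope_thin, r_thin = slope_and_correlation(xs, thin_ellipse)
slope_fat, r_fat = slope_and_correlation(xs, fat_ellipse)
print(f"thin ellipse: slope = {slope_thin:.2f}, r = {r_thin:.2f}")
print(f"fat ellipse:  slope = {slope_fat:.2f}, r = {r_fat:.2f}")
```

Both fitted slopes stay close to 0.5, while the correlation drops sharply for the noisier data set.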
Polychoric Correlation - Assumptions
- A binary or categorical variable is the observed (or manifest)
  part of an underlying (or latent) continuous variable
- Here we'll also assume that the latent variables are normally
  distributed
- A THRESHOLD relates the manifest variable to the latent variable
- Uebersax link:
  http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm
Thresholds
Figure from Uebersax webpage
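Under the normality assumption, a threshold is the point on the latent scale below which the observed proportion of responses falls. The sketch below shows one common way to obtain thresholds from category counts, using the inverse normal CDF of the cumulative proportions. The counts are hypothetical (they are not GHQ data), and the function name `thresholds_from_counts` is illustrative.

```python
from statistics import NormalDist


def thresholds_from_counts(counts):
    """Thresholds on a standard normal latent variable from category counts.

    k ordered categories give k - 1 thresholds. Every cumulative proportion
    must lie strictly between 0 and 1, otherwise inv_cdf raises an error.
    """
    total = sum(counts)
    cumulative = 0
    cuts = []
    for count in counts[:-1]:
        cumulative += count
        cuts.append(NormalDist().inv_cdf(cumulative / total))
    return cuts


# Hypothetical counts for a four-category item (made up for illustration).
example_counts = [400, 300, 200, 100]
example_thresholds = thresholds_from_counts(example_counts)
print([round(t, 3) for t in example_thresholds])
```

A binary item is the special case with two categories and a single threshold.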
Two binary variables

    . tab sumodd_g sumeven_g

               |       sumeven_g
      sumodd_g |         0          1 |     Total
    -----------+----------------------+----------
             0 |       896         20 |       916
             1 |        61        142 |       203
    -----------+----------------------+----------
         Total |       957        162 |     1,119

This is all we see, however ...
... this is what we assume is going on
Figure from Uebersax webpage
- What we are really interested in is the correlation (r) between
  the continuous latent variables
- A computer algorithm is used to search for the correlation r and
  the thresholds t1 and t2 which best reproduce the cell counts of
  the 2x2 table
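The search described above can be sketched for the 2x2 table of sumodd_g by sumeven_g. This is a minimal two-step illustration, not necessarily the estimator Mplus or any other package uses: the thresholds are taken from the margins, and r is then found by bisection so that the bivariate normal model reproduces the observed proportion in the (1, 1) cell. The integration routine, step count and iteration count are illustrative choices.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Cell counts from the 2x2 table: rows sumodd_g (0, 1), columns sumeven_g (0, 1).
n00, n01, n10, n11 = 896, 20, 61, 142
n = n00 + n01 + n10 + n11


def prob_both_above(t1, t2, r, steps=2000):
    """P(X > t1, Y > t2) for a standard bivariate normal with correlation r.

    Midpoint-rule integration over x of pdf(x) * P(Y > t2 | X = x).
    """
    upper = 8.0                       # the normal density is negligible beyond this
    width = (upper - t1) / steps
    cond_sd = math.sqrt(1.0 - r * r)  # standard deviation of Y given X = x
    total = 0.0
    for i in range(steps):
        x = t1 + (i + 0.5) * width
        total += STD_NORMAL.pdf(x) * (1.0 - STD_NORMAL.cdf((t2 - r * x) / cond_sd))
    return total * width


# Step 1: thresholds from the margins (proportion scoring 0 on each variable).
t1 = STD_NORMAL.inv_cdf((n00 + n01) / n)   # sumodd_g
t2 = STD_NORMAL.inv_cdf((n00 + n10) / n)   # sumeven_g

# Step 2: bisection on r; the (1, 1) probability increases with r.
target = n11 / n
low, high = -0.999, 0.999
for _ in range(40):
    mid = (low + high) / 2.0
    if prob_both_above(t1, t2, mid) < target:
        low = mid
    else:
        high = mid
tetrachoric_r = (low + high) / 2.0

# Pearson correlation of the observed 0/1 scores (phi coefficient), for comparison.
phi = (n00 * n11 - n01 * n10) / math.sqrt(
    (n00 + n01) * (n10 + n11) * (n00 + n10) * (n01 + n11))

print(f"t1 = {t1:.3f}, t2 = {t2:.3f}, "
      f"tetrachoric r = {tetrachoric_r:.3f}, phi = {phi:.3f}")
```

The phi coefficient computed directly from the 0/1 scores is about 0.74, whereas the latent correlation that reproduces the table is higher, which is the kind of attenuation that arises when binary items are treated as continuous.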
Conclusions
- EFA can be carried out in Mplus very simply
- We have demonstrated that it can be dangerous to
ignore the ordinal nature of the data when
fitting such a model (a practice followed by
many!)