Title: Proactive Monte Carlo Analysis in Structural Equation Modeling
1Proactive Monte Carlo Analysis in Structural
Equation Modeling
- James H. Steiger
- Vanderbilt University
2Some Unhappy Scenarios
- A Confirmatory Factor Analysis
- You fit a 3-factor model to 9 variables with
N = 150
- You obtain a Heywood case
- Comparing Two Correlation Matrices
- You wish to test whether two population matrices
are equivalent, using ML estimation
- You obtain an unexpected rejection
3Some Unhappy Scenarios
- Fitting a Trait-State Model
- You fit the Kenny-Zautra TSE model to 4 waves of
panel data with N = 200. You obtain a variance
estimate of zero.
- Writing a Program Manual
- You include an example analysis in your widely
distributed computer manual
- The analysis remains in your manuals for more
than a decade
- The analysis is fundamentally flawed, and gives
incorrect results
4Some Common Elements
- Models of covariance or correlation structure
- Potential problems could have been identified
before data were ever gathered, using proactive
Monte Carlo analysis
5Confirmatory Factor Analysis
Variable   Factor 1   Factor 2   Factor 3
VIS_PERC      X
CUBES         X
LOZENGES      X
PAR_COMP                 X
SEN_COMP                 X
WRD_MNG                  X
ADDITION                            X
CNT_DOT                             X
ST_CURVE                            X
6Confirmatory Factor Analysis
Variable   Factor 1   Factor 2   Factor 3   Unique Var.
VIS_PERC     0.46                             0.79
CUBES        0.65                             0.58
LOZENGES     0.25                             0.94
PAR_COMP                1.00                  0.00
SEN_COMP                0.41                  0.84
WRD_MNG                 0.22                  0.95
ADDITION                           0.38       0.85
CNT_DOT                            1.00       0.00
ST_CURVE                           0.30       0.91
7Confirmatory Factor Analysis
Variable   Factor 1   Factor 2   Factor 3   Unique Var.
VIS_PERC     0.60                             0.64
CUBES        0.60                             0.64
LOZENGES     0.60                             0.64
PAR_COMP                0.60                  0.64
SEN_COMP                0.60                  0.64
WRD_MNG                 0.60                  0.64
ADDITION                           0.60       0.64
CNT_DOT                            0.60       0.64
ST_CURVE                           0.60       0.64
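The unique variances in this table look consistent with standardized variables that each load on a single factor, in which case every unique variance is one minus the squared loading. This is a reading of the numbers, not something the slides state. The symbols \lambda_i (loading) and \psi_i (unique variance) are introduced here only for illustration.

```latex
\psi_i = 1 - \lambda_i^2, \qquad
\lambda_i = 0.60 \;\Rightarrow\; \psi_i = 1 - 0.36 = 0.64
```

The same relation matches the estimates in the preceding table: a loading of 0.46 gives 1 - 0.2116, or about 0.79, and a loading of 1.00 gives a unique variance of 0.00.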
8Proactive Monte Carlo Analysis
- Take the model you anticipate fitting
- Insert reasonable parameter values
- Generate a population covariance or correlation
matrix and fit this matrix, to assess
identification problems
- Examine Monte Carlo performance over a range of
sample sizes that you are considering
- Assess convergence problems, frequency of
improper estimates, Type I error, accuracy of fit
indices
- Preliminary investigations may take only a few
hours
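The first steps of this list can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used for the slides. The loadings (0.60) and unique variances (0.64) come from the table of hypothetical parameter values, but the factor correlation of 0.3 is a placeholder, since the slides give no value. Fitting each simulated matrix still needs SEM software and is not shown.

```python
import numpy as np

# Loadings and unique variances from the table of reasonable parameter values:
# every loading 0.60, every unique variance 0.64, three indicators per factor.
loadings = np.zeros((9, 3))
for factor in range(3):
    loadings[3 * factor:3 * factor + 3, factor] = 0.60
unique_var = np.full(9, 0.64)

# ASSUMPTION: the slides give no factor correlations; 0.3 is a placeholder.
phi = np.full((3, 3), 0.3)
np.fill_diagonal(phi, 1.0)

# Model-implied population matrix: Sigma = Lambda Phi Lambda' + Psi.
sigma = loadings @ phi @ loadings.T + np.diag(unique_var)


def sample_covariances(sigma, n, reps, seed=0):
    """Draw `reps` multivariate-normal samples of size n and
    return the sample covariance matrix of each one."""
    rng = np.random.default_rng(seed)
    p = sigma.shape[0]
    out = np.empty((reps, p, p))
    for r in range(reps):
        data = rng.multivariate_normal(np.zeros(p), sigma, size=n)
        out[r] = np.cov(data, rowvar=False)
    return out


# One batch of simulated matrices per candidate sample size.
batches = {n: sample_covariances(sigma, n, reps=100) for n in (75, 150, 300)}
```

Fitting `sigma` itself checks identification. Fitting every matrix in `batches` gives the convergence, improper-estimate, and Type I error frequencies at each sample size.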
9Confirmatory Factor Analysis
(Speed)-1.3->VIS_PERC
(Speed)-2.4->CUBES
(Speed)-3.5->LOZENGES
(Verbal)-4.6->PAR_COMP
(Verbal)-5.3->SEN_COMP
(Verbal)-6.4->WRD_MNG
(Visual)-7.5->ADDITION
(Visual)-8.6->CNT_DOT
(Visual)-9.3->ST_CURVE
10Confirmatory Factor Analysis
13Confirmatory Factor Analysis
(Speed)-1.53->VIS_PERC
(Speed)-2.54->CUBES
(Speed)-3.55->LOZENGES
(Verbal)-4.6->PAR_COMP
(Verbal)-5.3->SEN_COMP
(Verbal)-6.4->WRD_MNG
(Visual)-7.5->ADDITION
(Visual)-8.6->CNT_DOT
(Visual)-9.3->ST_CURVE
14Confirmatory Factor Analysis
17Confirmatory Factor Analysis
Variable   Factor 1   Factor 2   Factor 3   Unique Var.
VIS_PERC     0.60                             0.64
CUBES        0.60                             0.64
LOZENGES     0.60                             0.64
PAR_COMP                0.60                  0.64
SEN_COMP                0.60                  0.64
WRD_MNG                 0.60                  0.64
ADDITION                           0.60       0.64
CNT_DOT                            0.60       0.64
ST_CURVE                           0.60       0.64
18Proactive Monte Carlo Analysis
22Percentage of Heywood Cases
N     Loading .4   Loading .6   Loading .8
75        80           30            0
100       78           11            0
150       62            3            0
300       21            0            0
500       01            0            0
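A table of this kind comes from counting improper solutions across Monte Carlo replications. The sketch below shows the counting logic on a deliberately simplified case and will not reproduce the percentages above, which refer to the three-factor, nine-variable model. It assumes a single factor with three indicators, where the model is just-identified and the communalities follow directly from the sample correlations. The function name `improper_rate` and its defaults are hypothetical.

```python
import numpy as np


def improper_rate(loading, n, reps=1000, seed=0):
    """Share of replications with an improper one-factor, three-indicator solution.

    SIMPLIFICATION: with three indicators of one factor the model is
    just-identified, so each communality is a ratio of sample correlations,
    e.g. h1^2 = r12 * r13 / r23. The solution is counted as improper when
    no real solution exists (product of the correlations <= 0) or when any
    communality is >= 1, i.e. a unique variance <= 0 (a Heywood case).
    """
    rng = np.random.default_rng(seed)
    pop = np.full((3, 3), loading ** 2)  # population correlations among indicators
    np.fill_diagonal(pop, 1.0)
    improper = 0
    for _ in range(reps):
        data = rng.multivariate_normal(np.zeros(3), pop, size=n)
        r = np.corrcoef(data, rowvar=False)
        r12, r13, r23 = r[0, 1], r[0, 2], r[1, 2]
        if r12 * r13 * r23 <= 0:
            improper += 1
            continue
        h2 = np.array([r12 * r13 / r23, r12 * r23 / r13, r13 * r23 / r12])
        if h2.max() >= 1.0:
            improper += 1
    return improper / reps


# Rows: sample size; columns: population loading, as in the table layout above.
for n in (75, 150, 500):
    print(n, [improper_rate(lam, n) for lam in (0.4, 0.6, 0.8)])
```

Even in this reduced setting the pattern should resemble the table's: improper solutions are common with low loadings and small samples, and essentially vanish as loadings or N increase.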
23Standard Errors
26Distribution of Estimates
27Standard Errors (N = 300)
29Distribution of Estimates
30Correlational Pattern Hypotheses
- Pattern Hypothesis
- A statistical hypothesis that specifies that
parameters or groups of parameters are equal to
each other, and/or to specified numerical values
- Advantages of Pattern Hypotheses
- Only about equality, so they are invariant under
nonlinear monotonic transformations (e.g., the Fisher
transform).
31Correlational Pattern Hypotheses
- Caution! You cannot use the Fisher transform to
construct confidence intervals for differences of
correlations
- For an example of this error, see Glass and
Stanley (1970, pp. 311-312).
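One way to see the problem is that back-transforming a difference of Fisher z values does not return a difference of correlations. The sketch below uses two hypothetical correlations chosen only for illustration. It does not reconstruct the specific calculation in Glass and Stanley.

```python
import math


def fisher_z(r):
    """Fisher transform of a correlation: z = atanh(r)."""
    return math.atanh(r)


r1, r2 = 0.90, 0.80  # hypothetical correlations, for illustration only
z_diff = fisher_z(r1) - fisher_z(r2)

# Back-transforming the z difference does not recover r1 - r2.
# By the tanh subtraction identity it yields (r1 - r2) / (1 - r1 * r2).
back = math.tanh(z_diff)
direct = r1 - r2
identity = (r1 - r2) / (1 - r1 * r2)
print(direct, back, identity)
```

Here the true difference is 0.10, but the back-transformed value is about 0.357. The endpoints of an interval built on the z scale therefore cannot be converted into an interval for the difference of correlations. Equality hypotheses are unaffected, because the two correlations are equal exactly when their z values are equal.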
32Comparing Two Correlation Matrices in Two
Independent Samples
- Jennrich (1970)
- Method of Maximum Likelihood (ML)
- Method of Generalized Least Squares (GLS)
- Example
- Two 11x11 matrices
- Sample sizes of 40 and 89
33Comparing Two Correlation Matrices in Two
Independent Samples
- ML Approach
- Minimizes ML discrepancy function
- Can be programmed with standard SEM software
packages that have multi-sample capability
34Comparing Two Correlation Matrices in Two
Independent Samples
- Generalized Least Squares Approach
- Minimizes GLS discrepancy function
- SEM programs will iterate the solution
- Freeware (Steiger, 2005, in press) will perform
direct analytic solution
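The slides name the ML and GLS discrepancy functions without writing them out. The sketch below gives the standard covariance-structure forms as an illustration. It is not the freeware's implementation, and it omits the extra handling that correlation structures need. The function names and the weighting of each group by its share of the total sample size are assumptions made for this example.

```python
import numpy as np


def f_ml(s, sigma):
    """ML discrepancy: ln|Sigma| - ln|S| + tr(S Sigma^-1) - p."""
    p = s.shape[0]
    _, logdet_sigma = np.linalg.slogdet(sigma)
    _, logdet_s = np.linalg.slogdet(s)
    return logdet_sigma - logdet_s + np.trace(s @ np.linalg.inv(sigma)) - p


def f_gls(s, sigma):
    """GLS discrepancy: 0.5 * tr{[(S - Sigma) S^-1]^2}."""
    resid = (s - sigma) @ np.linalg.inv(s)
    return 0.5 * np.trace(resid @ resid)


def two_sample_discrepancy(f, s1, s2, n1, n2, sigma):
    """Combine two groups that share one hypothesized matrix `sigma`.

    ASSUMPTION: each group is weighted by n_g / (n1 + n2); programs differ
    in the exact weights they use.
    """
    total = n1 + n2
    return (n1 / total) * f(s1, sigma) + (n2 / total) * f(s2, sigma)
```

Both functions are zero when the sample and hypothesized matrices coincide and positive otherwise. Because the GLS form is quadratic in the residuals, its minimizer can be obtained directly in the equal-matrices problem, which is consistent with the slide's remark that a direct analytic solution is available.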
35Monte Carlo Results: Chi-Square Statistic
           Mean   S.D.
Observed   75.8   13.2
Expected   66     11.5
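The Expected row agrees with the moments of a chi-square distribution whose degrees of freedom equal its mean. The value \nu = 66 is inferred from that row and is not stated on the slide.

```latex
E\!\left(\chi^2_\nu\right) = \nu, \qquad
\mathrm{SD}\!\left(\chi^2_\nu\right) = \sqrt{2\nu}; \qquad
\nu = 66 \;\Rightarrow\; \sqrt{132} \approx 11.49
```

Against these reference values, the observed mean of 75.8 and standard deviation of 13.2 are both too large, which is the pattern expected when the statistic rejects too often.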
36Monte Carlo Results: Distribution of p-Values
37Monte Carlo Results: Distribution of Chi-Square Statistics
38Monte Carlo Results (ML): Empirical vs. Nominal
Type I Error Rate
Nominal α     .010   .050
Empirical α   .076   .208
39Monte Carlo Results (ML): Empirical vs. Nominal
Type I Error Rate, N = 250 per Group
Nominal α     .010   .050
Empirical α   .011   .068
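An empirical α of this kind is the share of Monte Carlo replications whose test statistic exceeds the nominal critical value. The sketch below shows that computation on stand-in statistics drawn from the reference distribution itself, not on the statistics behind the slides. The 66 degrees of freedom are inferred from the Expected row of the chi-square summary tables, and the function name is hypothetical.

```python
import numpy as np
from scipy.stats import chi2


def empirical_alpha(stats, df, alpha):
    """Share of Monte Carlo test statistics above the nominal critical value."""
    critical = chi2.ppf(1.0 - alpha, df)
    return float(np.mean(np.asarray(stats) > critical))


# Stand-in statistics from a well-behaved test: draws from chi-square(66).
# ASSUMPTION: df = 66, taken from the "Expected" mean reported in the slides.
rng = np.random.default_rng(0)
well_behaved = rng.chisquare(66, size=20000)

for alpha in (0.010, 0.050):
    print(alpha, empirical_alpha(well_behaved, 66, alpha))
```

For a well-behaved statistic the empirical rates stay close to the nominal ones. Replacing `well_behaved` with the chi-square values saved from an actual simulation gives the kind of comparison tabulated above.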
40Monte Carlo Results: Chi-Square Statistic,
N = 250 per Group
           Mean   S.D.
Observed   67.7   11.6
Expected   66     11.5
41Kenny-Zautra TSE Model
42Likelihood of Improper Values in the TSE Model
43Constraint Interaction
- Steiger, J. H. (2002). When constraints interact:
A caution about reference variables,
identification constraints, and scale
dependencies in structural equation modeling.
Psychological Methods, 7, 210-227.
44Constraint Interaction
48Constraint Interaction Model without ULI
Constraints (Constrained Estimation)
- (XI1)-1->X1
- (XI1)-2->X2
- (XI1)-1-(XI1)
- (DELTA1)-->X1
- (DELTA2)-->X2
- (DELTA1)-3-(DELTA1)
- (DELTA2)-4-(DELTA2)
- (ETA1)-98->Y1
- (ETA1)-5->Y2
- (ETA2)-99->Y3
- (ETA2)-6->Y4
- (EPSILON1)-->Y1
51Constraint Interaction Model With ULI
Constraints
- (XI1)-->X1
- (XI1)-2->X2
- (XI1)-1-(XI1)
- (DELTA1)-->X1
- (DELTA2)-->X2
- (DELTA1)-3-(DELTA1)
- (DELTA2)-4-(DELTA2)
- (ETA1)-->Y1
- (ETA1)-5->Y2
- (ETA2)-->Y3
- (ETA2)-6->Y4
- (EPSILON1)-->Y1
- (EPSILON2)-->Y2
53Typical Characteristics of Statistical Computing
Cycles
- Back-loaded
- Occur late in the research cycle, after data are
gathered
- Reactive
- Often occur in support of analytic activities
that are reactions to previous analysis results
54Traditional Statistical World-View
- Data come first
- Analyses come second
- Analyses are well-understood and will work
- Before the data arrive, there is nothing to
analyze and no reason to start analyzing
55Modern Statistical World View
- Planning comes first
- Power Analysis, Precision Analysis, etc.
- Planning may require some substantial computing
- Goal is to estimate required sample size
- Data analysis must wait for data
56Proactive SEM Statistical World View
- SEM involves interaction between specific
model(s) and data.
- Some models may not work with many data sets
- Planning involves
- Power Analysis
- Precision Analysis
- Confirming Identification
- Proactive Analysis of Model Performance
- Without proper proactive analysis, research can
be stopped cold with an unhappy surprise.
57Barriers
- Software
- Design
- Availability
- Education