Title: Introduction to Structural Equation Modeling with LISREL.
1 Introduction to Structural Equation Modeling with LISREL
- E. Kevin Kelloway, Ph.D.
- Professor of Management and Psychology
- Senior Research Fellow, CN Centre for Occupational Health and Safety
2 What is a theory?
- Statement of causal relations
- Implies a pattern of covariances/correlations
- Necessary (but not sufficient) condition for validity is that the observed pattern of correlations matches the implied pattern of correlations.
- Fundamental hypothesis of all SEM applications
- Σ = Σ(θ)
3 Fishbein & Ajzen's Theory of Reasoned Action
[Path diagram: Beliefs, Attitudes, Subjective Norms, Behavioral Intentions, Behavior; path labels r = .4 and r = .26]
4 THE SEM PROCESS
- Model Specification
- Identification
- Estimation
- Testing Fit
- Respecification
5 Model Specification: Σ = Σ(θ)
6 Causality
- Association
- Isolation
- Causal Direction
7 PATH DIAGRAMS
- Causal flow is from left to right (top to bottom).
- Curved arrows represent bidirectional relationships (correlations).
- Straight arrows represent causal associations
- Relationships assumed to be linear
- What's not in the model is just as important as what is in the model
- Causal Closure
8 Path Diagram
[Path diagram with variables X, Q, Z, and Y]
9 Factor Analysis: Y = t + e
[Diagram: two factors (F1, F2), six indicators (Y1-Y6), six error terms (E1-E6)]
10 Identification
X + Y = 10. Solve for X.
11 Types of Models
- Just Identified (e.g., regression or multiple regression)
- Under Identified
- Over-Identified
- The t rule: a k x k covariance matrix has k(k + 1)/2 non-redundant elements, so at most k(k + 1)/2 parameters can be estimated
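The counting behind the t rule can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something LISREL provides; the function names are hypothetical, and the t rule is a necessary rather than sufficient condition for identification.

```python
def nonredundant_elements(k: int) -> int:
    """Unique variances and covariances in a k x k covariance matrix."""
    return k * (k + 1) // 2


def classify_by_t_rule(k: int, t: int) -> str:
    """Compare t free parameters against the information in k observed variables.

    Passing this check does not guarantee identification; it is only
    the counting condition (the t rule).
    """
    available = nonredundant_elements(k)
    if t > available:
        return "under-identified"
    if t == available:
        return "just-identified"
    return "over-identified"


# A regression with 4 observed variables uses all 10 elements:
# 3 weights + 1 error variance + 3 predictor variances + 3 predictor covariances.
print(classify_by_t_rule(4, 10))
```

A model that leaves some of the 10 elements unused (fewer than 10 free parameters) has positive degrees of freedom and can be tested for fit.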
12 Identifying Restrictions
- Direction: recursive models
- Assigning value to parameters (often 0)
13 Estimation
14 Estimation
- Iterative estimation to a fitting criterion
- ML and GLS allow for a fit test: (N - 1) times the minimum of the fitting function is distributed as χ²
- Partial vs. full information techniques
15 Model Fit
16 Types of Fit
- Absolute
- Comparative
- Parsimonious
17 Absolute fit: The χ² test
- Available for ML and GLS
- Tests the null that Σ = Σ(θ)
- Distributed with 1/2(q)(q + 1) - k df, where q is the number of variables and k is the number of estimated parameters
- Power
- Logical problem of accepting the null
18 Criteria For Fit Indices (Gerbing & Anderson, 1992)
- Indicate degree of fit along a bounded continuum (normed)
- Be independent of sample size
- Have known distributional properties
- No fit indices (except possibly the RMSEA) meet these criteria
19 Indices of Absolute Fit
- RMR / Standardized RMR
- RMSEA
- GFI
- AGFI
- χ²/df
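Two of these indices are simple functions of the model χ², its df, and the sample size. The sketch below uses the common point-estimate formula for the RMSEA with N - 1 in the denominator; that choice is an assumption, since some programs use N instead, and the function names are illustrative only.

```python
import math


def chi_square_ratio(chi_square: float, df: int) -> float:
    """Chi-square divided by its degrees of freedom."""
    if df <= 0:
        raise ValueError("the ratio is undefined for a saturated model (df = 0)")
    return chi_square / df


def rmsea(chi_square: float, df: int, n: int) -> float:
    """Point estimate: sqrt(max(chi2 - df, 0) / (df * (N - 1)))."""
    if df <= 0:
        raise ValueError("RMSEA is undefined for a saturated model (df = 0)")
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Made-up values for illustration: chi-square = 30 on 10 df, N = 201.
print(chi_square_ratio(30.0, 10), rmsea(30.0, 10, 201))
```

When χ² is smaller than its df, the RMSEA is set to zero rather than allowed to go negative.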
20 Condition 9 Tests
- Tests of individual parameters
- Called t values but are interpreted as Z scores
- Problems
  - Overall fit but parameters are not significant.
  - Overall fit but parameters are in the opposite direction.
  - Lack of fit but all parameters as predicted.
21 Comparative Model Fit
- Null Model (Independence Model)
- Saturated Model
- Measures of absolute fit test the distance from the saturated model (i.e., are tests of identifying restrictions).
- Measures of comparative fit typically test the distance from the null model.
22 Comparative Fit Indices
- Normed Fit Index (NFI)
- Non-Normed Fit Index (NNFI)
- Incremental Fit Index (IFI)
- Comparative Fit Index (CFI)
- Relative Fit Index (RFI)
- Expected Cross-Validation Index (ECVI)
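Several of these indices compare the χ² of the target model with the χ² of the null (independence) model. The sketch below uses the standard textbook formulas for the NFI, NNFI, and CFI; the function names and the numbers in the example are made up for illustration and are not taken from any analysis in these slides.

```python
def nfi(chi_null: float, chi_model: float) -> float:
    """Normed Fit Index: proportional reduction in chi-square from the null model."""
    return (chi_null - chi_model) / chi_null


def nnfi(chi_null: float, df_null: int, chi_model: float, df_model: int) -> float:
    """Non-Normed Fit Index: like the NFI but based on chi-square/df ratios."""
    ratio_null = chi_null / df_null
    ratio_model = chi_model / df_model
    return (ratio_null - ratio_model) / (ratio_null - 1.0)


def cfi(chi_null: float, df_null: int, chi_model: float, df_model: int) -> float:
    """Comparative Fit Index based on noncentrality (chi-square minus df)."""
    d_model = max(chi_model - df_model, 0.0)
    d_null = max(chi_null - df_null, d_model, 0.0)
    if d_null == 0.0:
        return 1.0
    return 1.0 - d_model / d_null


# Hypothetical values: null model chi-square 500 on 10 df, target model 20 on 5 df.
print(nfi(500.0, 20.0), nnfi(500.0, 10, 20.0, 5), cfi(500.0, 10, 20.0, 5))
```

The NNFI can fall outside the 0 to 1 range, which is what "non-normed" refers to.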
23 Parsimonious Fit
- Fit (both absolute and comparative) increases with the number of parameters estimated.
- Rewards researcher for estimating trivial paths
- Parsimonious fit adjusts for the df in the model and penalizes accordingly
- Tend to reward the estimation of significant (and only significant) paths
24 Indices of Parsimonious Fit
- Parsimonious Normed Fit Index (PNFI)
- Parsimonious Goodness of Fit Index (PGFI)
- Akaike Information Criterion (AIC)
- Consistent Akaike Information Criterion (CAIC)
25 Nested Model Comparisons
- Compare two (theoretically generated) plausible models of the data
- If the models stand in nested sequence (one model is completely contained in the other), then the difference may be tested with a χ² difference test
- Subtract the two χ² values; the result is distributed as χ² with df equal to the difference in the models' df
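The χ² difference test can be sketched with the Python standard library alone. The upper-tail probability below uses the closed-form series for integer df; the function names and the example values are assumptions made for illustration, not results from the examples in these slides.

```python
import math


def chi_square_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum of x^(i-1) / (1*3*...*(2i-1))
    term, total = 1.0, 0.0
    for i in range(1, (df - 1) // 2 + 1):
        if i > 1:
            term *= x / (2 * i - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total


def chi_square_difference(chi_restricted, df_restricted, chi_general, df_general):
    """Difference test for a restricted model nested within a more general model."""
    delta_chi = chi_restricted - chi_general
    delta_df = df_restricted - df_general
    if delta_df <= 0:
        raise ValueError("the restricted model must have more df than the general model")
    return delta_chi, delta_df, chi_square_sf(delta_chi, delta_df)


# Hypothetical models: restricted chi-square 25.0 on 4 df, general 10.0 on 2 df.
print(chi_square_difference(25.0, 4, 10.0, 2))
```

A small p value indicates that the restrictions significantly worsen fit, so the more general model is preferred.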
26 Strategy for assessing fit
- Compare competing and theoretically plausible models
- Identify sources of ambiguity a priori
- Use multiple indices/definitions of fit
- Recognize that fit does not equate to truth or validity
27 Model Modification
28 Model Modification
- Theory trimming (significance tests)
- Theory building (modification indices)
- Replication: holdout samples
- Simultaneous estimation
- "What percentage of researchers would find themselves unable to think up a theoretical justification for freeing a parameter? In the absence of empirical information, I assume that the answer is near zero" (Steiger, 1990, p. 175)
29 LISREL: The beauty and the horror
30 LISREL Files
- Run in batch (with limited interactivity)
- Written in the SIMPLIS language
- Three tasks
- Specify the data
- Specify the model
- Specify the output
31 Example 1: A Regression Model
32 Example_1.spl
    Janes Safety Data (regression)
    Observed Variables: Injury Training Tfl Passive
    Covariance Matrix:
    1.13
    -.05 .096
    -.279 -.092 1.973
    .439 .067 -.807 2.406
    Sample Size: 129
    Equation: Injury = Training Tfl Passive
    End of Problem
33 The following lines were read from file C:\Documents and Settings\Kevin Kelloway\My Documents\Example_1.spj:

    Janes Safety Data (regression)
    Observed Variables: Injury Training Tfl Passive
    Covariance Matrix:
    1.13
    -.05 .096
    -.279 -.092 1.973
    .439 .067 -.807 2.406
    Sample Size: 129
    Equation: Injury = Training Tfl Passive
    End of Problem

    Sample Size = 129

    Janes Safety Data (regression)

    Covariance Matrix

                  Injury   Training        Tfl    Passive
                --------   --------   --------   --------
      Injury        1.13
    Training       -0.05       0.10
         Tfl       -0.28      -0.09       1.97
     Passive        0.44       0.07      -0.81       2.41

    Janes Safety Data (regression)

    Number of Iterations = 0

    LISREL Estimates (Maximum Likelihood)

    Structural Equations

    Injury = - 0.74*Training - 0.11*Tfl + 0.17*Passive, Errorvar.= 0.99 , R² = 0.12
              (0.29)          (0.069)     (0.062)                  (0.13)
              -2.51           -1.55        2.70                     7.91

    Goodness of Fit Statistics

    Degrees of Freedom = 0
    Minimum Fit Function Chi-Square = 0.0 (P = 1.00)
    Normal Theory Weighted Least Squares Chi-Square = 0.00 (P = 1.00)

    The Model is Saturated, the Fit is Perfect !

Reading the structural equation:
- Parameter estimates (B weights, not beta, from regression)
- Standard errors (in parentheses)
- t values (> 2.00 is significant)
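Because this model is an ordinary regression, the LISREL estimates can be cross-checked directly from the covariance matrix in Example_1.spl. The sketch below solves the normal equations in plain Python; the helper name `solve_linear_system` is illustrative, and the calculation is only a hand check, not a description of how LISREL arrives at its estimates.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for j in range(col, n + 1):
                m[row][j] -= factor * m[col][j]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


# Covariances from Example_1.spl (predictor order: Training, Tfl, Passive).
var_injury = 1.13
cov_predictors = [
    [0.096, -0.092, 0.067],
    [-0.092, 1.973, -0.807],
    [0.067, -0.807, 2.406],
]
cov_with_injury = [-0.05, -0.279, 0.439]

weights = solve_linear_system(cov_predictors, cov_with_injury)
explained = sum(w * c for w, c in zip(weights, cov_with_injury))
error_variance = var_injury - explained
r_squared = explained / var_injury

print([round(w, 2) for w in weights], round(error_variance, 2), round(r_squared, 2))
# prints [-0.74, -0.11, 0.17] 0.99 0.12
```

The rounded values agree with the weights, error variance, and R² reported in the output.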
34 Alternative ways of specifying the model
- Relationships
  - Injury = Training - Passive
- Paths
  - Training - Passive -> Injury
35 Graphical Interface
- Add the words Path Diagram just before the End of Problem statement
- Theory Trimming
- Theory Building
36 Strategy for assessing fit
- Compare competing and theoretically plausible models
- Identify sources of ambiguity a priori
- Use multiple indices/definitions of fit
- Recognize that fit does not equate to truth or validity
37 Example 2: A Path Analysis (observed variable)
38 Model Specification
[Path diagram: Leadership -> Trust and Efficacy -> Wellbeing]
- Basic hypotheses
  - (a) leadership affects wellbeing
  - (b) effects are indirect, being mediated by trust and self-efficacy
- A FULLY MEDIATED model
39 Model Identification
- t rule is met
- Null B rule (no relationships among the endogenous variables), e.g., a multiple regression equation
- Recursive rule: recursive models are identified
- Rank and order conditions: essentially allow for non-recursive models; a unique predictor is needed for one of the variables in a non-recursive relationship
40 Example_2.spl (Output, Model, Data)
    Leadership Data Fully mediated model
    Observed Variables: Wellbeing Trust Efficacy Leadership
    Means:
    22.3035294 4.9588235 3.9641765 10.4242353
    Standard Deviations:
    3.9405502 .8590221 .6941727 3.1419617
    Correlations:
    1.0000000
    -.2361636 1.0000000
    -.1746880 .1860385 1.0000000
    -.1441248 .4604753 .1907934 1.0000000
    Sample Size: 425
    Paths:
    Trust Efficacy -> Wellbeing
    Leadership -> Trust Efficacy
    Path Diagram
    End of Problem
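The data in this example are supplied as correlations plus standard deviations rather than as a covariance matrix. The sketch below shows the arithmetic that links the two forms (each covariance is the correlation multiplied by the two standard deviations). The function name is hypothetical, and the code illustrates the relationship only; it is not LISREL's own routine.

```python
def covariance_from_correlations(lower_triangle, std_devs):
    """Build a full covariance matrix from lower-triangular correlations and SDs."""
    k = len(std_devs)
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            value = lower_triangle[i][j] * std_devs[i] * std_devs[j]
            cov[i][j] = value
            cov[j][i] = value  # covariance matrices are symmetric
    return cov


# Values from Example_2.spl (order: Wellbeing, Trust, Efficacy, Leadership).
std_devs = [3.9405502, 0.8590221, 0.6941727, 3.1419617]
lower_triangle = [
    [1.0],
    [-0.2361636, 1.0],
    [-0.1746880, 0.1860385, 1.0],
    [-0.1441248, 0.4604753, 0.1907934, 1.0],
]

cov = covariance_from_correlations(lower_triangle, std_devs)
for row in cov:
    print(["%.4f" % value for value in row])
```

The diagonal of the result holds the variances (the squared standard deviations).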
41 Run Example_2.spl
- Does the model fit?
- Are the paths significant?
- Do the data suggest changing the model?
42 Generate Nested Models
[Path diagram: the partially mediated model (Leadership, Trust, Efficacy, Wellbeing)]
[Path diagram: the non-mediated model (Leadership, Trust, Efficacy, Wellbeing)]
43 NESTING SEQUENCE
[Path diagram: Leadership, Trust, Efficacy, Wellbeing]
- Both the fully mediated and the non-mediated models are nested within the partially mediated model (but are not directly comparable with each other)
- Mediation exists if (a) the fit of the fully mediated model is not significantly different from that of the partially mediated model and (b) the fit of the non-mediated model is significantly worse than that of the partially mediated model
44 RESULTS
χ² Difference
45 INTERACTIVE MODEL BUILDING
    Leadership Data Fully mediated model
    Observed Variables: Wellbeing Trust Efficacy Leadership
    Means:
    22.3035294 4.9588235 3.9641765 10.4242353
    Standard Deviations:
    3.9405502 .8590221 .6941727 3.1419617
    Correlations:
    1.0000000
    -.2361636 1.0000000
    -.1746880 .1860385 1.0000000
    -.1441248 .4604753 .1907934 1.0000000
    Sample Size: 425
    Y variables: Wellbeing Trust Efficacy
    Path Diagram
    End of Problem
46 Example 3: Confirmatory Factor Analysis
47 Model Development
- Union commitment literature identifies 3 components of union commitment (loyalty, responsibility, willingness)
- Does the same structure hold for commitment to other representative groups (student union)?
48 Alternative Models
- A one-factor model is always a reasonable alternative
- Orthogonal models are always nested within oblique models (but may be trivial)
- If one generates an alternative model by combining factors (i.e., by fixing the interfactor correlation to 1), a nested sequence is obtained
- In this case the literature suggests both a 3-factor (loyalty, responsibility, willingness) and a 2-factor (attitudes and behavior) model
- Estimate a one-factor, two-factor, and three-factor model
49 Identification in CFA
- CFA models are recursive
- t rule (estimate fewer parameters than the number of non-redundant elements in the covariance matrix)
- 3 indicator rule: 3 observed variables for each latent variable
- 2 indicator rule: 2 observed variables for each latent variable, and latent variables are allowed to correlate
- Both the 3 indicator and 2 indicator rules assume that unique factors (error terms) are uncorrelated
- Monte Carlo research supports the use of 3 indicators with sample sizes greater than 200
50 Example_3.spl (Three Factor)
    student union commitment
    observed variables: loyal1 loyal2 loyal3 loyal4 resp1-resp3 will1-will3
    Means:
    3.4095563 3.5460751 2.9044369 3.3412969 3.3378840 3.9590444 4.1945392 3.0955631 2.4744027 3.1194539
    Standard deviations:
    .8457250 .8122099 .8862037 .7716516 .9057455 .9094416 1.0066751 .9745463 .9198310 .9910889
    correlation matrix:
    1.0000000
    .5208533 1.0000000
    .5595972 .5009598 1.0000000
    .6142105 .5376375 .6488150 1.0000000
    .4580473 .3767890 .4840866 .4371304 1.0000000
    .1999868 .1462896 .0886099 .1712668 .3078848 1.0000000
    .1756045 .1628230 .0746533 .1038066 .3257948 .6222075 1.0000000
    .3304682 .2064215 .3754222 .4300969 .3706724 .2633207 .2532679 1.0000000
    .1235800 .1562927 .2700683 .2873694 .2426644 .1010893 .0775160 .5108488 1.0000000
    .2110942 .2079855 .3015793 .3360939 .3173118 .2942106 .3576399 .5980011 .5123867 1.0000000
    sample size: 293
    latent variables: Loyal Resp Will
    relationships:
    loyal1 - loyal4 = Loyal
    resp1 - resp3 = Resp
    will1 - will3 = Will
    path diagram
    end of problem
51 Example_3.spl (Two Factor)
    student union commitment
    observed variables: loyal1 loyal2 loyal3 loyal4 resp1-resp3 will1-will3
    Means:
    3.4095563 3.5460751 2.9044369 3.3412969 3.3378840 3.9590444 4.1945392 3.0955631 2.4744027 3.1194539
    Standard deviations:
    .8457250 .8122099 .8862037 .7716516 .9057455 .9094416 1.0066751 .9745463 .9198310 .9910889
    correlation matrix:
    1.0000000
    .5208533 1.0000000
    .5595972 .5009598 1.0000000
    .6142105 .5376375 .6488150 1.0000000
    .4580473 .3767890 .4840866 .4371304 1.0000000
    .1999868 .1462896 .0886099 .1712668 .3078848 1.0000000
    .1756045 .1628230 .0746533 .1038066 .3257948 .6222075 1.0000000
    .3304682 .2064215 .3754222 .4300969 .3706724 .2633207 .2532679 1.0000000
    .1235800 .1562927 .2700683 .2873694 .2426644 .1010893 .0775160 .5108488 1.0000000
    .2110942 .2079855 .3015793 .3360939 .3173118 .2942106 .3576399 .5980011 .5123867 1.0000000
    sample size: 293
    latent variables: Att Behav
    relationships:
    loyal1 - loyal4 = Att
    resp1 - will3 = Behav
    path diagram
    end of problem
52 RESULTS
χ² Difference
53 INTERACTIVE VERSION
    student union commitment
    observed variables: loyal1 loyal2 loyal3 loyal4 resp1-resp3 will1-will3
    Means:
    3.4095563 3.5460751 2.9044369 3.3412969 3.3378840 3.9590444 4.1945392 3.0955631 2.4744027 3.1194539
    Standard deviations:
    .8457250 .8122099 .8862037 .7716516 .9057455 .9094416 1.0066751 .9745463 .9198310 .9910889
    correlation matrix:
    1.0000000
    .5208533 1.0000000
    .5595972 .5009598 1.0000000
    .6142105 .5376375 .6488150 1.0000000
    .4580473 .3767890 .4840866 .4371304 1.0000000
    .1999868 .1462896 .0886099 .1712668 .3078848 1.0000000
    .1756045 .1628230 .0746533 .1038066 .3257948 .6222075 1.0000000
    .3304682 .2064215 .3754222 .4300969 .3706724 .2633207 .2532679 1.0000000
    .1235800 .1562927 .2700683 .2873694 .2426644 .1010893 .0775160 .5108488 1.0000000
    .2110942 .2079855 .3015793 .3360939 .3173118 .2942106 .3576399 .5980011 .5123867 1.0000000
    sample size: 293
    latent variables: Loyal Resp Will
    path diagram
    end of problem
54 Example 4: Latent Variable Path Analysis
55 Latent Variable Modeling
- CFA and Path Analysis at the same time
- Corrects structural parameters for measurement error: modeling with "true" as opposed to observed scores
- Increased complexity: the only real advantage is when you care about both questions of measurement and structural relations
56 Two Stage Modeling (Anderson & Gerbing, 1988)
- Lack of fit may result from (a) the measurement model, (b) the structural model, or (c) both
- Establish the fit of the measurement model (provides a baseline for the full model), then move to testing structural parameters
57 Generating Multiple Indicators
- Virtually all of the organizational literature treats gossip as a bad thing
- We hypothesize that gossip can be a good thing
  - It enhances individual control
  - It may enhance organizational citizenship behaviors
58 A Structural Model
[Path diagram: GOSSIP, CONTROL, Organizational Citizenship Behaviors]
59 A measurement model
- OCB: Ocb1, ocb2, ocb3 (item parcels, each made up by summing 2 items)
- CTRL: 3 single indicators (items)
- GOSSIP: 4 scale scores (toldsup, hearsup, toldcow, hearcow)
- Need to assign a scale for each latent variable (fix a factor loading to 1; shouldn't matter which one)
60 The Full Model
[Path diagram: GOSSIP (indicators Toldsup, HearSup, Toldcow, HearCow), CONTROL (indicators Item1, Item2, Item3), Organizational Citizenship Behaviors (indicators Ocb1, ocb2, ocb3)]
61 Estimate example_4.spl
- Does the model fit?
- Can it be fixed? (If so, how?)
- Identifying the problem, resolving the problem (hopefully)
  - Number of indicators
  - Single indicator latent variables
62 Getting Data into LISREL
- Start with a correlation matrix (Kevin's preference)
- Reading from a file
- Import SPSS data