Title: Statistical Analysis Overview I Session 1
1 Statistical Analysis Overview I: Session 1
- Peg Burchinal
- Frank Porter Graham Child Development Institute,
- University of North Carolina-Chapel Hill
2 Overview: Statistical Analysis Overview I
- Linear models
- Nesting
- Longitudinal models
- Mixed Model ANOVA
- Multivariate Repeated Measures
- Two Level Hierarchical Linear Models
- Latent Growth Curve Models
3 Overview: Linear Models
- Most commonly used statistical models
- 1. t-test, Analysis of Variance/Covariance: comparing means across groups
- 2. Correlations, Multiple Regression: estimating associations among continuous variables
4 Linear Models
- General Model
- $Y_{ij} = B_0 + B_1 X_{1ij} + B_2 X_{2ij} + e_{ij}$
- Assumptions
- One source of random variability (eij)
- Normally distributed error terms
- Homogeneity of variance
- Independence of observations
5 Linear Models
- Equivalence of models
- T-test and ANOVA (1-way with 2 groups)
- Regression and ANOVA
- t-test: $Y_{ij} = B_0 + B_1 X_{1ij} + e_{ij}$
- $X_{1ij}$ = 1 if in first group, 0 if in second group
- One-way ANOVA (2 groups)
- $Y_{ij} = B_0 + B_1 X_{1ij} + e_{ij}$
- $X_{1ij}$ = 1 if in first group, 0 if in second group
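The equivalence can be checked numerically. The sketch below is illustrative only: the scores are made up and the function names are hypothetical. It computes the classical pooled-variance t statistic and the t statistic for B1 in the dummy-coded regression, which should agree up to rounding.

```python
import math

def pooled_t(group1, group2):
    """Classical two-sample t statistic with pooled variance."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((y - m1) ** 2 for y in group1)
    ss2 = sum((y - m2) ** 2 for y in group2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

def dummy_regression_t(group1, group2):
    """t statistic for B1 in Y = B0 + B1*X1 + e, with X1 = 1 (group 1) or 0 (group 2)."""
    y = list(group1) + list(group2)
    x = [1.0] * len(group1) + [0.0] * len(group2)
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # equals the difference in group means
    b0 = my - b1 * mx
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)
    return b1 / se_b1

# Made-up scores, for illustration only
g1 = [3.1, 2.8, 3.6, 3.3, 2.9]
g2 = [2.5, 2.9, 2.4, 3.0, 2.2, 2.7]
t_classic = pooled_t(g1, g2)
t_regress = dummy_regression_t(g1, g2)
```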
6 Linear Models
- One-way ANOVA (p groups)
- $Y_{ij} = B_0 + B_1 X_{1ij} + B_2 X_{2ij} + \dots + B_{p-1} X_{(p-1)ij} + e_{ij}$
- $X_{1ij}$ = 1 if in first group, 0 otherwise,
- $X_{2ij}$ = 1 if in second group, 0 otherwise,
- etc. for p-1 groups (last group is reference cell)
- Regression (p predictors)
- $Y_{ij} = B_0 + B_1 X_{1ij} + B_2 X_{2ij} + \dots + B_p X_{pij} + e_{ij}$
- $X_{1ij}$ = first continuous predictor
- $X_{2ij}$ = second continuous predictor
- etc. for the p predictors in the model
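A minimal sketch of reference cell coding for p groups follows. The group labels and the helper name `reference_cell_codes` are hypothetical; the last label in `levels` is treated as the reference cell, matching the coding described above.

```python
def reference_cell_codes(group, levels):
    """Return the p-1 dummy variables X1..X(p-1) for one observation.

    `levels` lists the p group labels; the last one is the reference cell,
    so it is coded as all zeros.
    """
    return [1 if group == level else 0 for level in levels[:-1]]

levels = ["A", "B", "C"]  # hypothetical group labels, p = 3
design_rows = {g: reference_cell_codes(g, levels) for g in levels}
```

With this coding, B0 is the mean of the reference group and each remaining coefficient is the difference between one group's mean and the reference mean.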
7 Linear Models
- One-way ANCOVA (2-level factor and 1 covariate)
- $Y_{ij} = B_0 + B_1 X_{1ij} + B_2 X_{2ij} + e_{ij}$
- $X_{1ij}$ = 1 if in first group, 0 otherwise,
- $X_{2ij}$ = continuous predictor
- Separate slopes ANCOVA (2-level factor and 1 covariate)
- $Y_{ij} = B_0 + B_1 X_{1ij} + B_2 X_{2ij} + B_3 X_{3ij} + e_{ij}$
- $X_{1ij}$ = 1 if in first group, 0 otherwise,
- $X_{2ij}$ = continuous predictor
- $X_{3ij} = X_{1ij} X_{2ij}$ = 0 if not in first group, the value of the continuous predictor if in first group
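As a sketch of what the separate slopes coding implies, the code below builds one design row and evaluates the prediction. The coefficient values are invented purely for illustration and the function names are hypothetical. The covariate slope is B2 outside the first group and B2 + B3 inside it.

```python
def separate_slopes_row(in_first_group, covariate):
    """Design row [1, X1, X2, X3] for the separate slopes ANCOVA."""
    x1 = 1 if in_first_group else 0
    x2 = covariate
    x3 = x1 * x2          # product term: 0 outside the first group
    return [1, x1, x2, x3]

def predict(coefs, row):
    """Linear predictor B0 + B1*X1 + B2*X2 + B3*X3."""
    return sum(b * x for b, x in zip(coefs, row))

# Hypothetical coefficients [B0, B1, B2, B3], not estimates from any data
coefs = [2.0, 0.5, 0.3, -0.1]

# Change in the prediction per unit of the covariate, within each group
slope_first = predict(coefs, separate_slopes_row(True, 1.0)) - predict(coefs, separate_slopes_row(True, 0.0))
slope_other = predict(coefs, separate_slopes_row(False, 1.0)) - predict(coefs, separate_slopes_row(False, 0.0))
```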
8 Linear Models
- Two-way ANOVA (two 2-level factors and interaction)
- $Y_{ij} = B_0 + B_1 X_{1ij} + B_2 X_{2ij} + B_3 X_{3ij} + e_{ij}$
- $X_{1ij}$ = 1 if in first group on first factor, 0 otherwise,
- $X_{2ij}$ = 1 if in first group on second factor, 0 otherwise,
- $X_{3ij} = X_{1ij} X_{2ij}$ = 1 if in first level of both the first and second factor, 0 otherwise
9 Linear Models: General Issues
- Design parameterization
- Showed Reference Cell Coding
- Effect Coding often preferable (use -.5 and .5 instead of 0 and 1)
- Centering variables
- Whenever an interaction is included, you should center your data so main effects are interpretable
- Easiest: subtract sample mean from all values
- Nested data: correlated observations
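Effect coding and mean centering are both one-line transformations. The sketch below uses hypothetical helper names and a made-up list of ages.

```python
def effect_code(in_first_group):
    """Effect coding for a 2-level factor: .5 for the first group, -.5 for the second."""
    return 0.5 if in_first_group else -0.5

def center(values):
    """Subtract the sample mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

ages = [6, 15, 24, 36, 54]   # hypothetical ages in months
centered_ages = center(ages)
```

After centering, a main effect in a model with an interaction is evaluated at the sample mean of the other variable rather than at zero, which is usually the more meaningful reference point.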
10 Correlations among Observations
- Many sources of nesting
- Repeated measures over time
- Clustering of students in a classroom, therapy group, etc.
- Clustering of individuals in a family
- Consequence of nesting
- Standard errors are under-estimated when observations within cluster are positively correlated
- P-values are too small when standard errors are under-estimated
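One common way to gauge the size of this problem, not taken from the slides, is the design effect for equal-sized clusters, 1 + (m - 1) x ICC, where m is the cluster size and ICC is the intraclass correlation. The sketch below assumes that approximation applies (it is usually stated for the standard error of a mean); the function names are hypothetical.

```python
import math

def design_effect(cluster_size, icc):
    """Approximate variance inflation for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def adjusted_se(naive_se, cluster_size, icc):
    """Inflate a standard error that was computed as if observations were independent."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))

# Illustration: 20 students per classroom with a modest within-class correlation
deff = design_effect(20, 0.1)
```

Even a small positive within-cluster correlation can inflate the variance substantially when clusters are large, which is why ignoring nesting makes p-values too small.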
11 Nesting
- Longitudinal models provide the easiest nested model to understand
- Obvious that repeated assessments of individuals are not independent
- Present various approaches to modeling longitudinal data
12 Analytic methods to address nesting
- Mixed-model repeated measures
- Multivariate repeated measures
- Hierarchical linear models
- Latent growth curves
13 Overview: Additional Assumptions for Repeated Measures Analyses
- General assumptions
- An adequate model to describe
- Individual patterns of change (within cluster patterns of change)
- Individual differences in developmental patterns (between cluster patterns of change)
- Both models must include
- Important covariates and relevant interactions
- Represent correlations in nested factors
- (Type I error rate control)
14 General statistical assumptions
- Same outcome measured in the same metric over time
- Interval or ratio measurement [a]
- Normally distributed variables [a]
- Homogeneity of variance [a]
- Monotonic assessment
- Must be able to index amount of change
- Unit change must be uniform across scale and age
- Standard score not great, but can be used
- If same outcome over time
- Identical items not required
- [a] Special methods needed if assumption not met
15 Longitudinal Data
16 Traditional Growth Curve Analysis
- "Univariate" Analysis (Mixed Model)
- General model for one grouping variable and linear change related to age.
- $Y_{ijk} = b_{0k} + b_{1k}\,\mathrm{Age}_{ijk} + a_{ik}\,\mathrm{Person}_{ik} + e_{ijk}$
- for i = 1,...,n individuals,
- j = 1,...,p occasions,
- k = 1,...,r groups
- with 2 fixed effect variables: Group and Age
- 3 random variables: Y, Person, E
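One consequence of this model is worth making explicit. Under the added assumption, not stated on the slide, that the person effect and the error are independent with constant variances, every pair of occasions for the same individual has the same covariance (compound symmetry). The sketch below builds that implied covariance matrix; the variance values and names are hypothetical.

```python
def compound_symmetry(n_occasions, var_person, var_error):
    """Implied within-person covariance matrix for a random-intercept model.

    Diagonal: var_person + var_error. Off-diagonal: var_person.
    """
    return [
        [var_person + (var_error if row == col else 0.0) for col in range(n_occasions)]
        for row in range(n_occasions)
    ]

# Hypothetical variance components, for illustration only
cov = compound_symmetry(3, var_person=0.2, var_error=0.1)
icc = 0.2 / (0.2 + 0.1)   # correlation between any two occasions
```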
17 Univariate Growth Curves
18 Mixed-Model ANOVA
- Advantages
- Estimates individual intercepts
- Corrections are available to avoid inflating test statistics
- Disadvantages
- Assumes all slopes are identical
- Deletion of individuals with missing data if corrections are applied
- Cannot easily accommodate repeated measures of predictors or multiple levels of nesting
19 Profile Analysis or Multivariate Repeated Measures Analysis
- Transforms model into separate analyses of between- and within-factors
- General model for one grouping variable and linear change related to age
- $Y_{ijk} = p_{0ik} + p_{1ik}\,\mathrm{Age}_{ijk} + e_{ijk}$
- (individual growth curve)
- $E(Y_{jk}) = b_{0k} + b_{1k}\,\mathrm{Age}_{ijk}$
- (population growth curve)
- for i = 1,...,n individuals,
- j = 1,...,p occasions,
- k = 1,...,r groups
20
- $Y_{ijk} = p_{0ik} + p_{1ik}\,\mathrm{Age}_{ijk} + e_{ijk}$
- $E(Y_{jk}) = b_{0k} + b_{1k}\,\mathrm{Age}_{ijk}$
- where $Y_{ijk}$ represents the j-th assessment of the i-th individual in the k-th group,
- $p_{0ik}$ is the intercept for the i-th subject in the k-th group
- $b_{0k}$ is the intercept for the k-th group: the unweighted mean of the $p_{0ik}$ within the k-th group
- $p_{1ik}$ is the slope for the regression of Y on Age for the i-th individual in the k-th group
- $b_{1k}$ is the slope for the regression of Y on Age for the k-th group: the unweighted mean of the $p_{1ik}$ within the k-th group
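The two-step logic can be sketched directly: fit a straight line to each individual, then take the unweighted mean of the individual intercepts and slopes within a group. This is only an illustration of the definitions above, not the multivariate test itself; the data and function names are made up.

```python
def ols_line(ages, ys):
    """Least-squares intercept and slope for one individual's assessments."""
    n = len(ages)
    mean_age, mean_y = sum(ages) / n, sum(ys) / n
    sxy = sum((a - mean_age) * (y - mean_y) for a, y in zip(ages, ys))
    sxx = sum((a - mean_age) ** 2 for a in ages)
    slope = sxy / sxx
    return mean_y - slope * mean_age, slope

def group_curve(individuals):
    """Unweighted means of the individual intercepts (p0) and slopes (p1)."""
    fits = [ols_line(ages, ys) for ages, ys in individuals]
    b0 = sum(p0 for p0, _ in fits) / len(fits)
    b1 = sum(p1 for _, p1 in fits) / len(fits)
    return b0, b1

# Two hypothetical individuals in one group, each seen at ages 1, 2, 3
group = [
    ([1, 2, 3], [2.0, 3.0, 4.0]),   # intercept 1, slope 1
    ([1, 2, 3], [1.0, 3.0, 5.0]),   # intercept -1, slope 2
]
b0k, b1k = group_curve(group)
```

Because each individual needs a complete set of assessments for this approach, missing occasions lead to casewise deletion.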
21 Profile Analysis
22 Profile Analysis
- Advantages
- Estimates individual intercepts and slopes
- Standard errors are not inflated with moderate to large sample sizes
- Disadvantages
- Casewise deletion of individuals with missing data
- Forced to use categorized nesting variable
- Cannot easily accommodate repeated measures of predictors or multiple levels of nesting
23 Hierarchical Linear Model ("Mixed-Effects Linear Model")
- General model for one between-subjects categorical factor and linear change related to age.
- $Y_{ijk} = (b_{0k} + p_{0ik}) + (b_{1k} + p_{1ik})\,\mathrm{Age}_{ijk} + e_{ijk}$
- or
- $Y_{ijk} = p_{0ik} + p_{1ik}\,\mathrm{Age}_{ijk} + e_{ijk}$
- (Level 1 or individual growth curve)
- $E(Y_{jk}) = b_{0k} + b_{1k}\,\mathrm{Age}_{ijk}$
- (Level 2 or population growth curve)
- for i = 1,...,n individuals,
- j = 1,...,p occasions,
- k = 1,...,r groups
- with 1 fixed effect variable: Group
- 4 random variables: Y, Individual's mean level, Individual's change over Age, E
24
- $Y_{ijk} = (b_{0k} + p_{0ik}) + (b_{1k} + p_{1ik})\,\mathrm{Age}_{ijk} + e_{ijk}$
- where $Y_{ijk}$ represents the j-th assessment of the i-th individual in the k-th group,
- $b_{0k}$ is the intercept for the k-th group, estimated as weighted mean of $p_{0ik}$,
- $p_{0ik}$ is the increment to the intercept for the i-th individual in the k-th group
- $b_{1k}$ is the slope for the regression of Y on Age for the k-th group, estimated as weighted mean of $p_{1ik}$,
- $p_{1ik}$ is the increment to the slope for the i-th individual in the k-th group
- $e_{ijk}$ represents the random error of the j-th assessment of the i-th individual in the k-th group
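To make the roles of the terms concrete, the sketch below generates data for one individual from this model. It assumes, for illustration only, that the individual increments and the errors are independent and normally distributed; the function name, parameter names, and numeric values are all hypothetical.

```python
import random

def simulate_individual(b0k, b1k, ages, sd_p0, sd_p1, sd_e, rng):
    """Draw one individual's assessments from
    Y = (b0k + p0) + (b1k + p1) * Age + e.

    p0 and p1 are the individual's increments to the group intercept and
    slope; e is occasion-specific error. Normality is an assumption here.
    """
    p0 = rng.gauss(0.0, sd_p0)
    p1 = rng.gauss(0.0, sd_p1)
    return [(b0k + p0) + (b1k + p1) * age + rng.gauss(0.0, sd_e) for age in ages]

rng = random.Random(2024)          # fixed seed so the example is repeatable
ages = [0, 1, 2]                   # hypothetical assessment ages
child = simulate_individual(3.0, 0.5, ages, sd_p0=0.4, sd_p1=0.1, sd_e=0.2, rng=rng)

# With all variances set to zero, every individual follows the group line exactly
on_group_line = simulate_individual(3.0, 0.5, ages, 0.0, 0.0, 0.0, rng)
```

Because each individual has their own intercept and slope, the ages need not be the same for everyone, which is what allows missing or inconsistently timed assessments.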
25 Hierarchical Linear Model
26 Hierarchical Linear Model: Advantages
- 1. Accommodates multiple levels of nesting
- 2. Slopes and intercepts of individual growth curves can vary
- 3. Increased precision
- 4. Permits missing or mistimed data
- ignorably missing data
- purposefully missing data designs
- inconsistently timed data
- 5. Allows repeated measures of predictors
- 6. Flexible specification of growth patterns
- 7. Fixed-effect parameter estimates fairly robust
27 Hierarchical Linear Models: Disadvantages
- 1. Assumes that an infinite number of individuals were observed, but a "large" number is sufficient.
- Unclear what is large enough
- 2. Models can get very complicated
- 3. No direct tests of mediation
28 SECCYD Example: Maternal Sensitivity
- Goal: determine whether maternal sensitivity between 6 months and first grade varies as a function of
- maternal education,
- maternal depression,
- child gender.
29 Analysis Data
- Time-varying
  Age (months; G1 = first grade)   6      15     24     36     54     G1
  Maternal sensitivity   N         1272   1240   1172   1161   1040   1004
                         M         3.07   3.13   3.12   3.27   3.23   3.22
                         sd        .59    .55    .59    .53    .56    .58
  Maternal depression              18     17     18     16     18     14
- Time-invariant
  Maternal education     M (sd)    14.3 (2.49)
  Child gender           male      51
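30 Model
- $Y_{ij} = p_{0i} + p_{1i}\,\mathrm{Age}_{ij} + p_{2i}\,\mathrm{Age}_{ij}^2$
- $\quad + b_1\,\mathrm{Dep}_{ij} + b_2\,\mathrm{Dep}_{ij} \times \mathrm{Age}_{ij} + b_3\,\mathrm{Dep}_{ij} \times \mathrm{Age}_{ij}^2$
- $\quad + e_{ij}$
- (individual component of growth curve)
- $\quad + b_0 + b_4\,\mathrm{Age}_{ij} + b_5\,\mathrm{Age}_{ij}^2$
- $\quad + b_6\,\mathrm{Med}_i + b_7\,\mathrm{Med}_i \times \mathrm{Age}_{ij} + b_8\,\mathrm{Med}_i \times \mathrm{Age}_{ij}^2$
- $\quad + b_9\,\mathrm{Male}_i + b_{10}\,\mathrm{Male}_i \times \mathrm{Age}_{ij} + b_{11}\,\mathrm{Male}_i \times \mathrm{Age}_{ij}^2$
- (group component of growth curve).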
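A sketch of how the group component turns into a predicted trajectory is given below, restricted to the age and maternal education terms to keep it short. The coefficient names are descriptive labels chosen for this example rather than the slide's subscripts, and the values are placeholders, not estimates from the SECCYD analysis.

```python
def predicted_sensitivity(age, med, coef):
    """Group (fixed) part of a quadratic growth model with education as moderator.

    The education terms shift the intercept, the linear slope, and the
    quadratic slope of the age trajectory.
    """
    return (
        coef["intercept"] + coef["age"] * age + coef["age2"] * age ** 2
        + med * (coef["med"] + coef["med_x_age"] * age + coef["med_x_age2"] * age ** 2)
    )

# Placeholder coefficients chosen only so the arithmetic is easy to follow
coef = {"intercept": 1, "age": 1, "age2": 1, "med": 1, "med_x_age": 1, "med_x_age2": 1}
value = predicted_sensitivity(age=2, med=3, coef=coef)   # (1 + 2 + 4) + 3 * (1 + 2 + 4)
```

Evaluating such a function over a grid of ages at two or three chosen covariate values is a simple way to plot and interpret the interactions.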
31 Results
- Maternal education: Mothers with more education show more sensitivity, and show less reduction in sensitivity after children enter school
- Gender: Mothers are more sensitive with girls during early childhood, but show increasing levels of sensitivity with boys over time
- Maternal depression: Depressed mothers show less sensitivity during early childhood, but show modest gains when children enter school
32 Continuous Predictors: Mother's Sensitivity for Mothers with High School Degree versus Bachelor's Degree
33 Categorical Predictors: Mother's Sensitivity for Male versus Female Children
34 Categorical Predictors: Mother's Sensitivity for Mothers with and without Clinical Levels of Depressive Symptoms
35 Analytic issues: repeated measures
- Time-varying (within-subjects) and time-invariant (between-subjects) data
- Analysis data: one record per subject or one record per subject per assessment (software issue)
- Plotting results
- Interpreting interactions
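The two data layouts can be converted into one another. The sketch below goes from one record per subject ("wide") to one record per subject per assessment ("long"), dropping missed assessments. The field names (`id`, `m6`, `m15`, `m24`) and values are hypothetical.

```python
def wide_to_long(wide_rows, id_key, occasion_keys):
    """Convert one-record-per-subject rows to one-record-per-assessment rows."""
    long_rows = []
    for row in wide_rows:
        for occasion in occasion_keys:
            value = row.get(occasion)
            if value is not None:          # skip assessments that were missed
                long_rows.append({id_key: row[id_key], "occasion": occasion, "y": value})
    return long_rows

# Hypothetical wide data: subject 2 missed the 15-month assessment
wide = [
    {"id": 1, "m6": 3.0, "m15": 3.1, "m24": 3.2},
    {"id": 2, "m6": 2.8, "m15": None, "m24": 3.0},
]
long_rows = wide_to_long(wide, "id", ["m6", "m15", "m24"])
```

Time-invariant variables such as maternal education would simply be repeated on every long-format record for the same subject.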
36 Latent Growth Curves
- HLM Level 1 corresponds to LISREL measurement model for Y
- HLM: $Y_{ip} = p_{0p} + p_{1p}\,\mathrm{time}_i + e_{ip}$
- LGC: $Y_p = [1 \;\; t]\,p_p + e_p$
- $\quad = 0 + [1 \;\; t]\,p_p + e_p$ (endogenous variable Y)
- $\quad = \tau_Y + \Lambda_Y \eta + e_p$
- where $Y_p$ is the vector of observed values for person p
- $\eta = p_p$ is the vector of latent growth curve parameters for person p
- $e_p$ is the individual-specific vector of unknown measurement error
- and, unlike the usual practice of LISREL analysis, the $\tau_Y$ and $\Lambda_Y$ parameter matrices are constrained to contain only known values
- $\tau_Y = 0$
- $\Lambda_Y = [1 \;\; t]$: this passes the Level 1 growth curve parameters into the LISREL endogenous constraints
37 Latent Growth Curves
- HLM Level 2 corresponds to LISREL structural model
- HLM: $p = X b + r$
- LGC: $p = \mu + (0 \;\; 0)\,p + (p - \mu)$
- which has the form of a reduced LISREL structural model
- $\eta = \alpha + \mathbf{B}\,\eta + \zeta$
- $\zeta = p - \mu$
- $\alpha = \mu$, the group growth curve parameters
- $\mathbf{B} = (0 \;\; 0)$
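The fixed loading matrix is what distinguishes a latent growth curve from an ordinary factor model: its entries are known constants (a column of ones and a column of times) rather than estimated loadings. The sketch below builds that matrix for a linear curve and computes the implied mean at each occasion from the group growth parameters. The times, parameter values, and function names are hypothetical.

```python
def loading_matrix(times):
    """Known loadings for a linear growth curve: one row [1, t] per occasion."""
    return [[1.0, float(t)] for t in times]

def implied_means(times, growth_means):
    """Expected Y at each occasion: loadings times mean growth parameters.

    The measurement intercepts are fixed at zero, so no constant is added.
    """
    return [
        sum(loading * mean for loading, mean in zip(row, growth_means))
        for row in loading_matrix(times)
    ]

times = [0, 1, 2, 3]        # hypothetical, time-structured occasions
mu = [3.0, 0.5]             # hypothetical mean intercept and mean slope
means = implied_means(times, mu)
```

A quadratic curve would add a third column of squared times to each row.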
38 Latent Growth Curve Model (same as HLM individual curve)
39 Latent Growth Curves: Advantages
- Allows individual intercepts and slopes to vary.
- Allows for error in predictors
- Easily handles error heterogeneity and correlated errors
- Permits latent variables with multiple indicators
- Can examine patterns of change on more than one dimension.
- Easily estimates direct and indirect (intervening) effects
40 Latent Growth Curves
- Disadvantages
- Does not easily accommodate more than one level of nesting
- Easy-to-use software requires time-structured data (M-Plus)
- Number of estimated parameters gets large quickly
- Less power for testing interactions or moderating effects
- Equivalence: HLM and LGC can be shown to be interchangeable when data are time structured
41 Latent Growth Curves Example: SECCYD Maternal Sensitivity
- Goal: describe developmental patterns in maternal sensitivity with target child from six months to first grade
- Analysis: Structural Equation Model
- Quadratic individual growth curve
- Maternal education and gender as predictors
- AMOS with FIML, due to missing data
42 SECCYD Maternal Sensitivity
Bold indicates significance at p < 0.05
43 SECCYD-LGC Analysis of Maternal Sensitivity
- Maternal education related to higher levels of sensitivity over time (intercept).
- Mothers are more sensitive with girls in general (intercept), but show nonlinear increases in sensitivity toward boys (quadratic slope).
44 Conclusions
- Growth curve analyses can provide appropriate and powerful analytic tools for examining longitudinal or other types of nested data
- Careful selection of analytic methods and models is needed