MCMC Estimation for Random Effect Modelling - PowerPoint PPT Presentation

About This Presentation
Title:

MCMC Estimation for Random Effect Modelling

Description:

Missing data and multiple imputation. Missing data is proliferate in survey research. ... Multiple Imputation for Nonresponse in Surveys. New York: J. Wiley and Sons. ... – PowerPoint PPT presentation

Number of Views:207
Avg rating:3.0/5.0
Slides: 31
Provided by: mat135
Category:

less

Transcript and Presenter's Notes

Title: MCMC Estimation for Random Effect Modelling


1
MCMC Estimation for Random Effect Modelling The
MLwiN experience
  • Dr William J. Browne
  • School of Math Sciences
  • University of Nottingham

2
Contents
  • Random effect modelling and MLwiN.
  • Methods comparison Guatemalan child health
    example.
  • Extendibility of MCMC algorithms
  • Cross classified and multiple membership models.
  • Artificial insemination and Danish chicken
    examples.
  • Further Extensions.

3
Random effect models
  • Models that account for the underlying structure
    in the dataset.
  • Originally developed for nested structures
    (multilevel models), for example in education,
    pupils nested within schools.
  • An extension of linear modelling with the
    inclusion of random effects.
  • A typical 2-level model is
  • Here i indexes pupils and j indexes schools.

4
MLwiN
  • Software package designed specifically for
    fitting multilevel models.
  • Developed by a team led by Harvey Goldstein and
    Jon Rasbash at the Institute of Education in
    London over past 15 years or so. Earlier
    incarnations ML2, ML3, MLN.
  • Originally contained classical IGLS estimation
    methods for fitting models.
  • MLwiN launched in 1998 also included MCMC
    estimation.
  • My role in team was as developer of MCMC
    functionality in MLwiN during 4.5 years at the
    IOE.

5
Estimation Methods for Multilevel Models
  • Due to additional random effects no simple matrix
    formulae exist for finding estimates in
    multilevel models.
  • Two alternative approaches exist
  • Iterative algorithms e.g. IGLS, RIGLS, EM in HLM
    that alternate between estimating fixed and
    random effects until convergence. Can produce ML
    and REML estimates.
  • Simulation-based Bayesian methods e.g. MCMC that
    attempt to draw samples from the posterior
    distribution of the model.

6
Methods for non-normal responses
  • When the response variable is Binomial or Poisson
    then different algorithms are required.
  • IGLS/RIGLS methods give quasilikelihood estimates
    e.g. MQL, PQL.
  • MCMC algorithms including Metropolis Hastings
    sampling and Adaptive Rejection sampling are
    possible.
  • Numerical Quadrature can give ML estimates but is
    not without problems.

7
So why use MCMC?
  • Often gives better estimates for non-normal
    responses.
  • Gives full posterior distribution so interval
    estimates for derived quantities are easy to
    produce.
  • Can easily be extended to more complex problems.
  • Potential downside 1 Prior distributions
    required for all unknown parameters.
  • Potential downside 2 MCMC estimation is much
    slower than the IGLS algorithm.

8
The Guatemalan Child Health dataset.
  • This consists of a subsample of 2,449 respondents
    from the 1987 National Survey of Maternal and
    Child Helath, with a 3-level structure of births
    within mothers within communities.
  • The subsample consists of all women from the
    chosen communities who had some form of prenatal
    care during pregnancy. The response variable is
    whether this prenatal care was modern (physician
    or trained nurse) or not.
  • Rodriguez and Goldman (1995) use the structure of
    this dataset to consider how well
    quasi-likelihood methods compare with considering
    the dataset without the multilevel structure and
    fitting a standard logistic regression.
  • They perform this by constructing simulated
    datasets based on the original structure but with
    known true values for the fixed effects and
    variance parameters.
  • They consider the MQL method and show that the
    estimates of the fixed effects produced by MQL
    are worse than the estimates produced by standard
    logistic regression disregarding the multilevel
    structure!

9
The Guatemalan Child Health dataset.
  • Goldstein and Rasbash (1996) consider the same
    problem but use the PQL method. They show that
    the results produced by PQL 2nd order estimation
    are far better than for MQL but still biased.
  • The model in this situation is
  • In this formulation i,j and k index the level 1,
    2 and 3 units respectively.
  • The variables x1,x2 and x3 are composite scales
    at each level because the original model
    contained many covariates at each level.
  • Browne and Draper (2004) considered the hybrid
    Metropolis-Gibbs method in MLwiN and two possible
    variance priors (Gamma-1(e,e) and Uniform.

10
Simulation Results
  • The following gives point estimates (MCSE) for 4
    methods and 500 simulated datasets.

Parameter (True) MQL1 PQL2 Gamma Uniform
ß0 (0.65) 0.474 (0.01) 0.612 (0.01) 0.638 (0.01) 0.655 (0.01)
ß1 (1.00) 0.741 (0.01) 0.945 (0.01) 0.991 (0.01) 1.015 (0.01)
ß2 (1.00) 0.753 (0.01) 0.958 (0.01) 1.006 (0.01) 1.031 (0.01)
ß3 (1.00) 0.727 (0.01) 0.942 (0.01) 0.982 (0.01) 1.007 (0.01)
s2v (1.00) 0.550 (0.01) 0.888 (0.01) 1.023 (0.01) 1.108 (0.01)
s2u (1.00) 0.026 (0.01) 0.568 (0.01) 0.964 (0.02) 1.130 (0.02)
11
Simulation Results
  • The following gives interval coverage
    probabilities (90/95) for 4 methods and 500
    simulated datasets.

Parameter (True) MQL1 PQL2 Gamma Uniform
ß0 (0.65) 67.6/76.8 86.2/92.0 86.8/93.2 88.6/93.6
ß1 (1.00) 56.2/68.6 90.4/96.2 92.8/96.4 92.2/96.4
ß2 (1.00) 13.2/17.6 84.6/90.8 88.4/92.6 88.6/92.8
ß3 (1.00) 59.0/69.6 85.2/89.8 86.2/92.2 88.6/93.6
s2v (1.00) 0.6/2.4 70.2/77.6 89.4/94.4 87.8/92.2
s2u (1.00) 0.0/0.0 21.2/26.8 84.2/88.6 88.0/93.0
12
Summary of simulations
  • The Bayesian approach yields excellent bias and
    coverage results.
  • For the fixed effects, MQL performs badly but the
    other 3 methods all do well.
  • For the random effects, MQL and PQL both perform
    badly but MCMC with both priors is much better.
  • Note that this is an extreme scenario with small
    levels 1 in level 2 yet high level 2 variance and
    in other examples MQL/PQL will not be so bad.

13
Extension 1 Cross-classified models
For example, schools by neighbourhoods. Schools
will draw pupils from many different
neighbourhoods and the pupils of a neighbourhood
will go to several schools. No pure hierarchy can
be found and pupils are said to be contained
within a cross-classification of schools by
neighbourhoods

nbhd 1 nbhd 2 Nbhd 3
School 1 xx x
School 2 x x
School 3 xx x
School 4 x xxx
 
14
Notation
With hierarchical models we use a subscript
notation that has one subscript per level and
nesting is implied reading from the left. For
example, subscript pattern ijk denotes the ith
level 1 unit within the jth level 2 unit within
the kth level 3 unit. If models become
cross-classified we use the term classification
instead of level. With notation that has one
subscript per classification, that captures the
relationship between classifications, notation
can become very cumbersome. We propose an
alternative notation introduced in Browne et al.
(2001) that only has a single subscript no matter
how many classifications are in the model.
15
Single subscript notation
We write the model as
Where classification 2 is neighbourhood and
classification 3 is school. Classification 1
always corresponds to the classification at which
the response measurements are made, in this case
patients. For pupils 1 and 11 equation (1)
becomes
16
Classification diagrams
In the single subscript notation we lose
information about the relationship(crossed or
nested) between classifications. A useful way of
conveying this information is with the
classification diagram. Which has one node per
classification and nodes linked by arrows have a
nested relationship and unlinked nodes have a
crossed relationship.
School
Neighbourhood
Pupil
Cross-classified structure where pupils from a
school come from many neighbourhoods and pupils
from a neighbourhood attend several schools.
Nested structure where schools are contained
within neighbourhoods
17
Example Artificial insemination by donor
1901 women 279 donors 1328 donations 12100
ovulatory cycles response is whether conception
occurs in a given cycle
In terms of a unit diagram
Or a classification diagram
18
Model for artificial insemination data
We can write the model as
Results
Note cross-classified models can be fitted in
IGLS but are far easier to fit using MCMC
estimation.
19
Extension 2 Multiple membership models
  • When level 1 units are members of more than one
    higher level unit we describe a model for such
    data as a multiple membership model.
  • For example,
  •  Pupils change schools/classes and each
    school/class has an effect on pupil outcomes.
  • Patients are seen by more than one nurse during
    the course of their treatment.

 
20
Notation
Note that nurse(i) now indexes the set of nurses
that treat patient i and w(2)i,j is a weighting
factor relating patient i to nurse j. For
example, with four patients and three nurses, we
may have the following weights
21
Classification diagrams for multiple membership
relationships
Double arrows indicate a multiple membership
relationship between classifications.
We can mix multiple membership, crossed and
hierarchical structures in a single model.
22
Example involving nesting, crossing and multiple
membership Danish chickens
Production hierarchy 10,127 child flocks
725
houses 304 farms
Breeding hierarchy 10,127 child flocks 200 parent
flocks
As a unit diagram
As a classification diagram
23
Model and results
Note multiple membership models can be fitted in
IGLS and this model/dataset represents roughly
the most complex model that the method can
handle. Such models are far easier to fit using
MCMC estimation.
24
Further Extensions / Work in progress
  1. Multilevel factor models
  2. Response variables at different levels
  3. Missing data and multiple imputation
  4. Sample size calculations

25
Multilevel factor analysis modelling
  • In survey analysis often there are many responses
    for each individual.
  • Techniques like factor analysis are often used to
    identify underlying latent traits amongst the
    responses.
  • Multilevel factor analysis allows factor analysis
    modelling to identify factors at various
    classifications in the dataset.
  • Due to the nature of MCMC algorithms by adding a
    step to allow for multilevel factor models in
    MLwiN, cross-classified models can also be fitted
    without any additional programming!
  • See Goldstein and Browne (2002,2005) for more
    detail.

26
Responses at different levels
  • In a household survey some responses refer to
    individuals in a household while others refer to
    the household itself.
  • Models that combine these responses can be fitted
    using the IGLS algorithm in MLwiN and shouldnt
    pose any problems to MCMC.
  • The Centre for Multilevel modelling in Bristol
    are investigating such models.
  • See also Nicky Bests talk on combining datasets
    which is a similar issue.

27
Missing data and multiple imputation
  • Missing data is proliferate in survey research.
  • One popular approach to dealing with missing data
    is multiple imputation (Rubin 1987) where several
    imputed datasets are created and then the model
    of interest is fitted to each dataset and the
    estimates combined.
  • Using a multivariate normal response multilevel
    model to generate the imputations using MCMC in
    MLwiN is described in chapter 17 of Browne (2003)
  • James Carpenter (LSHTM) has begun work on macros
    in MLwiN that automate the multiple imputation
    procedure.

28
Sample size calculations
  • Another issue in data collection is how big a
    sample do we need to collect?
  • Such sample size calculations have simple
    formulae if we can assume that an independent
    sample can be generated.
  • If however we wish to account for the data
    structure in the calculation then things are more
    complex.
  • One possibility is a simulation-based approach
    similar to that used in the model comparisons
    described earlier where many datasets are
    simulated to look at the power for a fixed sample
    size.
  • Bonne Zijlstra will be joining me in Nottingham
    in January on an ESRC grant looking at such an
    approach.

29
Conclusions
  • In this talk we have attempted to show the
    flexibility of MCMC methods in attempting to
    match the complexity of data structures found in
    real problems.
  • We have also contrasted the methods with the IGLS
    algorithm.
  • Although we have used MLwiN, all the models
    considered here could also be fitted in WinBUGS.
  • WinBUGS offers even more flexibility in model
    choice for complex data structures however it is
    typically slower in fitting models that can also
    be fitted in MLwiN.
  • Genstat and the GLLAMM package in Stata also
    deserve mention as alternatives in the ML/REML
    world for the models considered.

30
References
  • Browne, W.J. (2003). MCMC Estimation in MLwiN.
    London Institute of Education, University of
    London.
  • Browne, W.J. and Draper, D. (2004). A Comparison
    of Bayesian and Likelihood-based methods for
    fitting multilevel models. University of
    Nottingham Research Report 04-01
  • Browne, W.J., Goldstein, H. and Rasbash, J.
    (2001). Multiple membership multiple
    classification (MMMC) models. Statistical
    Modelling 1 103-124.
  • Goldstein, H. and Browne, W. J. (2002).
    Multilevel factor analysis modelling using Markov
    Chain Monte Carlo (MCMC) estimation. In
    Marcoulides and Moustaki (Eds.), Latent Variable
    and Latent Structure Models. p 225-243. Lawrence
    Erlbaum, New Jersey.
  • Goldstein, H. and Browne, W.J. (2005). Multilevel
    Factor Analysis Models for Continuous and
    Discrete Data. In Maydeu-Olivares, A and McArdle,
    J.J. (Eds.), Contemporary psychometrics a
    festschrift for Roderick P. McDonald, p 453-475.
    Lawrence Erlbaum, New Jersey.
  • Goldstein, H. and Rasbash, J. (1996). Improved
    approximations for multilevel models with binary
    responses. Journal of the Royal Statistical
    Society, A. 159 505-13.
  • Rodriguez, G. and Goldman, N. (1995). An
    assessment of estimation procedures for
    multilevel models with binary responses. Journal
    of the Royal Statistical Society, Series A, 158,
    73-89.
  • Rubin, D.B. (1987). Multiple Imputation for
    Nonresponse in Surveys. New York J. Wiley and
    Sons.
Write a Comment
User Comments (0)
About PowerShow.com