Title: MCMC for Normal response multilevel models
1. Lecture 7
- MCMC for Normal response multilevel models
2. Lecture Contents
- Gibbs sampling recap.
- Variance components models.
- MCMC algorithm for VC model.
- MLwiN MCMC demonstration using the Reunion Island dataset.
- Choice of variance prior.
- Residuals, predictions etc.
- Chains for derived quantities.
3. MCMC Methods (Recap)
- Goal: to sample from the joint posterior distribution.
- Problem: for complex models this involves multidimensional integration.
- Solution: it may be possible to sample from the conditional posterior distributions instead.
- It can be shown that after convergence such a sampling approach generates dependent samples from the joint posterior distribution.
4. Gibbs Sampling (Recap)
- When we can sample directly from the conditional posterior distributions, the algorithm is known as Gibbs sampling.
- This proceeds as follows for the linear regression example:
- First, give all unknown parameters starting values.
- Next, loop through the following steps.
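For the linear regression y = Xβ + e, e ~ N(0, σ²I), with a uniform prior on β and an Inv-Gamma(a, b) prior on σ² (the standard conjugate setup, restated here rather than taken verbatim from the slide), the two steps are:

Step 1: β | y, σ² ~ N( (XᵀX)⁻¹Xᵀy,  σ²(XᵀX)⁻¹ )
Step 2: 1/σ² | y, β ~ Gamma( a + N/2,  b + ½ Σᵢ (yᵢ − Xᵢβ)² )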
5. Gibbs Sampling ctd.
These steps are then repeated, with the generated values from each loop replacing the starting values. The chain of values produced by this procedure is known as a Markov chain, and it is hoped that this chain converges to its equilibrium distribution, which is the joint posterior distribution.
6. Variance Components Model
- Yesterday you were introduced to multilevel models and in particular the variance components model (an example is written out below).
- This can be seen as an extension of a linear model that allows a different intercept for each higher-level unit, e.g. schools, herds, hospitals.
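In its simplest two-level form the model can be written (a restatement in standard notation) as:

y_ij = β0 + u_j + e_ij,   u_j ~ N(0, σ²u),   e_ij ~ N(0, σ²e),

where j indexes the higher-level units (e.g. herds) and i the observations within them.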
7. Random intercepts model
- A variance components model with 1 continuous
predictor is known as a random intercepts model.
8. Bayesian formulation of a variance components model
- To formulate a variance components model in a Bayesian framework we need to add prior distributions for the fixed effects and the variances.
- For the Gibbs sampling algorithm that follows we will assume (improper) uniform priors for the fixed effects β, and conjugate inverse Gamma priors for the two variances (in fact we will use inverse χ² priors, which are a special case).
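For reference, the connection is the standard identity that a scaled inverse χ² distribution with ν degrees of freedom and scale s² is the same as an Inv-Gamma(ν/2, νs²/2) distribution, so the inverse χ² priors inherit the conjugacy of the inverse Gamma family.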
9. Full Bayesian Model
- Our model is now as written below.
- We need to set starting values for all parameters to start our Gibbs sampler.
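One standard way of writing the full model with these priors (a restatement; the hyperparameters a_u, b_u, a_e, b_e are generic placeholders):

y_ij ~ N(X_ij β + u_j, σ²e)
u_j ~ N(0, σ²u)
p(β) ∝ 1,   1/σ²u ~ Gamma(a_u, b_u),   1/σ²e ~ Gamma(a_e, b_e)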
10. Gibbs Sampling for VC model
These steps are then repeated, with the generated values from each loop replacing the starting values. The chain of values produced by this procedure is known as a Markov chain. Note that β is generated as a block while each u_j is updated individually.
11. Step 1: Updating β
12. Step 2: Updating u_j
13. Step 3: Updating 1/σ²u
14. Step 4: Updating 1/σ²e
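The four conditional distributions (standard results for this model under the priors above, restated rather than copied from the slides; Z is the indicator matrix mapping observations to their level-2 units, n_j the size of unit j, J the number of units and N the total number of observations):

Step 1: β | y, u, σ²e ~ N( (XᵀX)⁻¹Xᵀ(y − Zu),  σ²e(XᵀX)⁻¹ )
Step 2: u_j | y, β, σ²u, σ²e ~ N( D_j Σᵢ (y_ij − X_ij β)/σ²e,  D_j ),  where D_j = (n_j/σ²e + 1/σ²u)⁻¹
Step 3: 1/σ²u | u ~ Gamma( a_u + J/2,  b_u + ½ Σⱼ u_j² )
Step 4: 1/σ²e | y, β, u ~ Gamma( a_e + N/2,  b_e + ½ Σᵢⱼ (y_ij − X_ij β − u_j)² )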
15. Algorithm Summary
- Repeat the following four steps:
- 1. Generate β from its (multivariate) Normal conditional distribution.
- 2. Generate each u_j from its Normal conditional distribution.
- 3. Generate 1/σ²u from its Gamma conditional distribution.
- 4. Generate 1/σ²e from its Gamma conditional distribution.
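A minimal runnable sketch of this loop in Python/NumPy, as an illustration of the algorithm rather than MLwiN's implementation (the function name gibbs_vc is a placeholder, and a single Gamma(a, b) prior is used for both precisions for brevity):

import numpy as np

def gibbs_vc(y, X, group, n_iter=5000, a=0.001, b=0.001, seed=0):
    rng = np.random.default_rng(seed)
    N, p = X.shape
    labels, group = np.unique(group, return_inverse=True)  # map groups to 0..J-1
    J = len(labels)
    beta, u = np.zeros(p), np.zeros(J)  # starting values for all parameters
    sigma2_u = sigma2_e = 1.0
    XtX_inv = np.linalg.inv(X.T @ X)
    chains = []
    for _ in range(n_iter):
        # Step 1: beta as a block from its multivariate Normal conditional
        mean_beta = XtX_inv @ (X.T @ (y - u[group]))
        beta = rng.multivariate_normal(mean_beta, sigma2_e * XtX_inv)
        # Step 2: each u_j individually from its Normal conditional
        d = y - X @ beta
        for j in range(J):
            idx = group == j
            D_j = 1.0 / (idx.sum() / sigma2_e + 1.0 / sigma2_u)
            u[j] = rng.normal(D_j * d[idx].sum() / sigma2_e, np.sqrt(D_j))
        # Step 3: 1/sigma2_u from its Gamma conditional
        sigma2_u = 1.0 / rng.gamma(a + J / 2, 1.0 / (b + 0.5 * u @ u))
        # Step 4: 1/sigma2_e from its Gamma conditional
        e = d - u[group]
        sigma2_e = 1.0 / rng.gamma(a + N / 2, 1.0 / (b + 0.5 * e @ e))
        chains.append((beta.copy(), sigma2_u, sigma2_e))
    return chains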
16. Gibbs Sampling for other models
- We have now looked at Gibbs sampling algorithms for two models.
- From these you should get the general idea of how Gibbs sampling works.
- From now on we will assume that, if Gibbs sampling is feasible for a model, the algorithm can be generated.
- When (conjugate) Gibbs sampling is not feasible (see day 5) we will describe alternative methods.
17. Variance components model for the Reunion Island dataset
- Dataset analysed in Dohoo et al. (2001).
- The response is log(calving to first service).
- 3 levels in the data: observations nested within cows nested within herds.
- 2 dichotomous predictors, heifer and artificial insemination, neither of which is significant for this response.
18. MCMC MLwiN Demo: IGLS Equations window
19. IGLS Trajectories window
20. Hierarchy Viewer
21. MCMC starting values and default priors
22. MCMC: first 50 iterations
23. MCMC: after 5000 iterations
24. Summary for β0
For details see next lecture!!!
25. Summary for σ²v
26. Summary for σ²u
So we need to run for longer; see the practical and the next lecture for details.
27. Running for 100k iterations
Little change in estimates and all diagnostics
now happy!
28. Residuals
- In MCMC the residuals are part of the model and are updated at each iteration.
- By default MLwiN stores the sum and sum of squares for each residual so that it can calculate their mean and sd.
- It is possible to store all iterations of all residuals, but this takes up lots of memory!
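As a small illustration of that bookkeeping (a sketch of the arithmetic, not MLwiN's internal code), the running sum S and sum of squares SS of a residual's chain over n stored iterations are enough to recover its mean and sd:

import numpy as np

def mean_sd_from_sums(S, SS, n):
    mean = S / n
    sd = np.sqrt((SS - n * mean ** 2) / (n - 1))  # sample sd of the chain
    return mean, sd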
29. Herd level residuals in MCMC
Herd 28 in red, Herd 35 in blue (compared to IGLS).
30. Choice of variance prior
- MLwiN offers 2 default variance prior choices:
- Gamma(ε, ε) priors for the precisions (with ε small; MLwiN's default uses ε = 0.001), or
- Uniform priors on the variances.
- Browne (1998) and Browne and Draper (2004) looked at the performance of these priors in detail and compared them with the IGLS and RIGLS methods.
31. Uniform on variance priors
The main difference is a marginal increase in the herd level variance.
32. Adding in the heifer predictor
As can be seen, heifer appears not to be significant at all. We look at model comparison in lecture 9.
33. Heifer parameter information
34. Predictions
35. Informative prior for β1
36. Informative prior for β1 (ctd.)
37. Confidence intervals for Variance Partition Coefficients
- In our three level model there are two VPCs: one for herd and one for cow.
- Note that MLwiN stores the parameters stacked in column c1090. If we split this column up we can look at chains for the VPCs.
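For reference, with σ²v the herd level variance, σ²u the cow level variance and σ²e the residual variance:

VPC(herd) = σ²v / (σ²v + σ²u + σ²e)
VPC(cow)  = σ²u / (σ²v + σ²u + σ²e)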
38. Commands to create the VPC columns
CODE 5 1 5000 C1089          (creates an indicator column)
SPLIT c1090 C1089 C21 C22 C23 C24 C25          (splits up the 5 parameter chains)
CALC c26 = c23/(c23+c24+c25)          (calculates the chain for the herd VPC)
CALC c27 = c24/(c23+c24+c25)          (calculates the chain for the cow VPC)
NAME c26 'HerdVPC' c27 'CowVPC'          (names the columns)
Note that Browne (2003) gives a more involved method for working out chains for the ranks of residuals, as you will see in the practical!
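The same calculation can be done outside MLwiN; a sketch, assuming the five chains have been exported to a NumPy array with one column per parameter in the same order as the SPLIT above (two fixed effects followed by the herd, cow and residual variances — an assumed ordering):

import numpy as np

def vpc_chains(params):
    # params: shape (n_iter, 5); columns assumed to be
    # (beta0, beta1, s2_herd, s2_cow, s2_resid), mirroring C21..C25.
    s2v, s2u, s2e = params[:, 2], params[:, 3], params[:, 4]
    total = s2v + s2u + s2e
    herd_vpc, cow_vpc = s2v / total, s2u / total
    herd_ci = np.percentile(herd_vpc, [2.5, 97.5])  # 95% credible interval
    cow_ci = np.percentile(cow_vpc, [2.5, 97.5])
    return herd_vpc, cow_vpc, herd_ci, cow_ci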
39. Herd level VPC
40. Cow level VPC
41. What have we covered
- Setting up a model and running it using MCMC in MLwiN.
- Looking at residuals, predictions and derived quantities.
- Choosing default priors and incorporating informative priors.
- Convergence and model comparison are to come, as are links with WinBUGS.
42. Introduction to Practical
- The practical is taken from the MCMC in MLwiN book (Browne 2003) with some modifications.
- The tutorial dataset is from education and includes pupils within schools.
- You will cover similar material to this demonstration, along with some information on convergence diagnostics.