Title: Multilevel Binary Response Models
Session 4
Damon Berridge
- In all of the multilevel linear models considered so far, it was assumed that the response variable has a continuous distribution and that the random coefficients and residuals are normally distributed.
- These models are appropriate where the expected value of the response variable at each level may be represented as a linear function of the explanatory variables.
- The linearity and normality assumptions can be checked using standard graphical procedures.
- There are other kinds of outcomes, however, for which these assumptions are clearly not realistic. An example is the model for which the response variable is discrete.
- Important instances of discrete response variables are binary variables (e.g., success vs. failure of whatever kind) and counts (e.g., in the study of some kind of event, the number of events happening in a predetermined time period).
For a binary variable $y_{ij}$ that has probability $\mu_{ij}$ for outcome 1 and probability $1-\mu_{ij}$ for outcome 0, the mean is
$$E(y_{ij}) = \mu_{ij}$$
and the variance is
$$\operatorname{Var}(y_{ij}) = \mu_{ij}(1-\mu_{ij}).$$
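The Two-Level Logistic Model
- We start by introducing a simple two-level model that will be used to illustrate the analysis of binary response data.
- Assume that there are $j = 1, \dots, m$ level-2 units and $i = 1, \dots, n_j$ level-1 units nested within each level-2 unit $j$. The total number of level-1 observations across level-2 units is given by
$$n = \sum_{j=1}^{m} n_j.$$
- With a single explanatory variable $x_{ij}$, the level-1 model for the latent response $y^*_{ij}$ is
$$y^*_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + \varepsilon_{ij},$$
and the level-2 model becomes
$$\beta_{0j} = \gamma_{00} + u_{0j}, \qquad \beta_{1j} = \gamma_{10}.$$
- The latent response is observed only through the binary variable $y_{ij}$, which equals 1 if $y^*_{ij} > 0$ and 0 otherwise, such that
$$\Pr(y_{ij} = 1 \mid x_{ij}, u_{0j}) = \Pr\left(\varepsilon_{ij} > -\{\gamma_{00} + \gamma_{10} x_{ij} + u_{0j}\} \mid u_{0j}\right) = 1 - F\left(-\{\gamma_{00} + \gamma_{10} x_{ij} + u_{0j}\}\right) = \mu_{ij},$$
where $F(\cdot)$ is the cumulative distribution function of $\varepsilon_{ij}$.
- The distribution of $y_{ij}$ is called a Bernoulli distribution with parameter $\mu_{ij}$, and can be written as
$$g(y_{ij} \mid x_{ij}, u_{0j}) = \mu_{ij}^{y_{ij}} (1-\mu_{ij})^{1-y_{ij}}, \qquad y_{ij} = 0, 1.$$
- The functional form for $\mu_{ij}$ in the logit model is
$$\mu_{ij} = \frac{\exp(\gamma_{00} + \gamma_{10} x_{ij} + u_{0j})}{1 + \exp(\gamma_{00} + \gamma_{10} x_{ij} + u_{0j})}.$$
- The probit model is based upon the assumption that the disturbances $\varepsilon_{ij}$ are independent standard normal variates, such that
$$\mu_{ij} = \Phi(\gamma_{00} + \gamma_{10} x_{ij} + u_{0j}),$$
where $\Phi(\cdot)$ denotes the standard normal cumulative distribution function.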
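As a rough illustration of how the logit and probit forms turn a linear predictor into a response probability, the sketch below evaluates both for a few values of a random intercept. The names `gamma00`, `gamma10`, `x_ij` and `u0j`, and all numerical values, are assumptions made for the example and do not come from the slides.

```python
import math
from statistics import NormalDist


def mu_logit(eta: float) -> float:
    """Logistic response probability exp(eta) / (1 + exp(eta))."""
    # Two branches so that exp() is only ever called on a non-positive number.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def mu_probit(eta: float) -> float:
    """Probit response probability: standard normal CDF evaluated at eta."""
    return NormalDist().cdf(eta)


# Hypothetical coefficient and covariate values, chosen only for illustration.
gamma00, gamma10, x_ij = -1.0, 0.5, 1.0

for u0j in (-1.0, 0.0, 1.0):
    eta = gamma00 + gamma10 * x_ij + u0j  # linear predictor with random intercept
    print(f"u0j={u0j:+.1f}  logit mu={mu_logit(eta):.3f}  probit mu={mu_probit(eta):.3f}")
```

Both functions return 0.5 at a linear predictor of zero. Away from zero they differ for the same predictor value because the logistic and standard normal distributions have different variances.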
Logit and Probit Transformations
- Interpretation of the parameter estimates obtained from either the logit or probit regressions is best achieved on a linear scale, such that for a logit regression, we can re-express $\mu_{ij}$ as
$$\operatorname{logit}(\mu_{ij}) = \log\left(\frac{\mu_{ij}}{1-\mu_{ij}}\right) = \gamma_{00} + \gamma_{10} x_{ij} + u_{0j}.$$
- The probit model may be rewritten as
$$\operatorname{probit}(\mu_{ij}) = \Phi^{-1}(\mu_{ij}) = \gamma_{00} + \gamma_{10} x_{ij} + u_{0j}.$$
- For both the logit and probit functions, any probability value in the range $(0, 1)$ is transformed so that the resulting values of $\operatorname{logit}(\mu_{ij})$ and $\operatorname{probit}(\mu_{ij})$ will lie between $-\infty$ and $\infty$.
- A further transformation of the probability scale that is sometimes useful in modelling binomial data is the complementary log-log transformation. This function again transforms a probability $\mu_{ij}$ in the range $(0, 1)$ to a value in $(-\infty, \infty)$, using the relationship $\log[-\log(1-\mu_{ij})]$.
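The three transformations can be compared side by side. This is a minimal sketch using only the Python standard library. The function names `logit`, `probit` and `cloglog` are chosen here for the example, and each function assumes its argument lies strictly between 0 and 1.

```python
import math
from statistics import NormalDist


def logit(p: float) -> float:
    """Log-odds: log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


def probit(p: float) -> float:
    """Inverse of the standard normal CDF."""
    return NormalDist().inv_cdf(p)


def cloglog(p: float) -> float:
    """Complementary log-log: log(-log(1 - p))."""
    return math.log(-math.log(1.0 - p))


for p in (0.05, 0.25, 0.50, 0.75, 0.95):
    print(f"p={p:.2f}  logit={logit(p):+.3f}  probit={probit(p):+.3f}  cloglog={cloglog(p):+.3f}")
```

The logit and probit transformations are symmetric about p = 0.5, where both equal zero. The complementary log-log is not symmetric: it equals zero at p = 1 - exp(-1), roughly 0.632.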
General Two-Level Logistic Models
Residual Intraclass Correlation Coefficient
For binary responses, the intraclass coefficient is often expressed in terms of the correlation between the latent responses $y^*_{ij}$. The logistic distribution for the level-1 residual, $\varepsilon_{ij}$, has a variance of $\pi^2/3 \approx 3.29$, so for a two-level logistic random intercept model with an intercept variance of $\sigma^2_{u0}$, the intraclass coefficient is
$$\rho = \frac{\sigma^2_{u0}}{\sigma^2_{u0} + \pi^2/3}.$$
For a two-level random intercept probit model, where the level-1 residual has variance 1, this type of intraclass correlation becomes
$$\rho = \frac{\sigma^2_{u0}}{\sigma^2_{u0} + 1}.$$
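The two intraclass coefficients are simple to compute once an intercept variance is available. In the sketch below, the function names and the intercept variance of 1.0 are illustrative assumptions, not values estimated from any data in these slides.

```python
import math


def icc_logit(sigma2_u0: float) -> float:
    """Latent-response ICC for a random intercept logit model."""
    return sigma2_u0 / (sigma2_u0 + math.pi ** 2 / 3.0)


def icc_probit(sigma2_u0: float) -> float:
    """Latent-response ICC for a random intercept probit model."""
    return sigma2_u0 / (sigma2_u0 + 1.0)


sigma2_u0 = 1.0  # hypothetical intercept variance
print(f"logit ICC:  {icc_logit(sigma2_u0):.4f}")
print(f"probit ICC: {icc_probit(sigma2_u0):.4f}")
```

The same intercept variance gives different coefficients under the two links because the level-1 residual variance is fixed at a different value in each model. Intercept variances from logit and probit fits are therefore not directly comparable.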
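Likelihood
$$L(\gamma, \sigma^2_{u0} \mid y, x) = \prod_{j=1}^{m} \int_{-\infty}^{\infty} \prod_{i=1}^{n_j} g(y_{ij} \mid x_{ij}, u_{0j}) \, f(u_{0j}) \, du_{0j},$$
where
$$g(y_{ij} \mid x_{ij}, u_{0j}) = \mu_{ij}^{y_{ij}} (1-\mu_{ij})^{1-y_{ij}}$$
and
$$f(u_{0j}) = \frac{1}{\sqrt{2\pi}\,\sigma_{u0}} \exp\left(-\frac{u_{0j}^2}{2\sigma^2_{u0}}\right).$$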
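To make the likelihood concrete: for a random intercept logit model, each level-2 unit contributes the product of its Bernoulli terms, integrated over a normally distributed random intercept. The sketch below approximates that integral with a plain trapezoidal rule. This is a deliberate simplification for readability and is not presented as the method any particular package uses. All names, the grid settings and the toy data are assumptions made for the example.

```python
import math


def inv_logit(eta: float) -> float:
    """Logistic response probability (adequate for moderate eta)."""
    return 1.0 / (1.0 + math.exp(-eta))


def cluster_likelihood(y, x, gamma00, gamma10, sigma_u0, n_grid=2001, width=8.0):
    """Approximate the integral over u of prod_i g(y_i | x_i, u) * N(u; 0, sigma_u0^2).

    Uses the trapezoidal rule on [-width * sigma_u0, width * sigma_u0];
    sigma_u0 must be strictly positive.
    """
    lo = -width * sigma_u0
    h = 2.0 * width * sigma_u0 / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        u = lo + k * h
        # Conditional likelihood of this unit's responses given the intercept u
        cond = 1.0
        for yi, xi in zip(y, x):
            mu = inv_logit(gamma00 + gamma10 * xi + u)
            cond *= mu if yi == 1 else 1.0 - mu
        # Normal density of the random intercept
        dens = math.exp(-u * u / (2.0 * sigma_u0 ** 2)) / (math.sqrt(2.0 * math.pi) * sigma_u0)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0  # trapezoid end-point weights
        total += weight * cond * dens * h
    return total


def log_likelihood(data, gamma00, gamma10, sigma_u0):
    """Sum of log cluster contributions; data is a list of (y, x) pairs per level-2 unit."""
    return sum(math.log(cluster_likelihood(y, x, gamma00, gamma10, sigma_u0)) for y, x in data)


# Toy data: two level-2 units with made-up responses and covariate values
toy = [([1, 0, 0], [0.2, -0.5, 1.0]), ([0, 0], [0.0, 0.7])]
print(log_likelihood(toy, gamma00=-0.5, gamma10=0.8, sigma_u0=1.2))
```

A useful sanity check on any such routine is that, for a fixed unit, the contributions summed over every possible response pattern should come to 1.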
Binary response model: Example C3
- Raudenbush and Bhumirat (1992) analysed data on children repeating a grade during their time at primary school. The data were from a national survey of primary education in Thailand in 1988; we use a subset of that data here.
Raudenbush, S.W., Bhumirat, C., 1992. The
distribution of resources for primary education
and its consequences for educational achievement
in Thailand, International Journal of Educational
Research, 17, 143-164
Number of observations (rows): 7185
Number of variables (columns): 4
The variables include the following:
- schoolid: school identifier
- sex: 1 if child is male, 0 otherwise
- pped: 1 if the child had pre-primary experience, 0 otherwise
- repeat: 1 if the child repeated a grade during primary school, 0 otherwise
- We take as the binary response variable the indicator of whether a child has ever repeated a class (repeat: 0 = no, 1 = yes).
- The level-1 explanatory variables are child gender (gender: 0 = girl, 1 = boy) and child pre-primary education (pped: 0 = no, 1 = yes).
- The probability that a child will repeat a grade during the primary years, $\mu_{ij}$, is of interest.
Estimate a model with just a constant
where
Estimate a multilevel model with
Sabre commands
As gender is a dummy variable indicating whether the pupil is a girl or a boy, it can be helpful to write the fitted model as a pair of models, one for each gender. By substituting the values 1 for boy and 0 for girl in gender, we get the boy's constant and we can write
girl
boy
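The substitution of gender = 0 and gender = 1 can be sketched numerically. Everything specific in this example is an assumption: it supposes a logit link with gender and pped as the only explanatory variables, and the coefficient values are hypothetical placeholders, not the estimates from the Thai data.

```python
import math


def inv_logit(eta: float) -> float:
    """Logistic response probability."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical fitted coefficients (placeholders, NOT estimates from the data)
gamma00_hat = -2.0       # constant
gamma_gender_hat = 0.5   # coefficient on gender (1 = boy)
gamma_pped_hat = -0.6    # coefficient on pped (1 = pre-primary education)


def linear_predictor(gender: int, pped: int, u0j: float = 0.0) -> float:
    """Fitted linear predictor for a child in a school with random intercept u0j."""
    return gamma00_hat + gamma_gender_hat * gender + gamma_pped_hat * pped + u0j


# Setting gender to 0 or 1 folds the gender effect into the constant.
girl_constant = linear_predictor(gender=0, pped=0)
boy_constant = linear_predictor(gender=1, pped=0)

print(f"girl: constant = {girl_constant:+.2f}")
print(f"boy:  constant = {boy_constant:+.2f}")

for pped in (0, 1):
    p_girl = inv_logit(linear_predictor(0, pped))
    p_boy = inv_logit(linear_predictor(1, pped))
    print(f"pped={pped}: P(repeat | girl)={p_girl:.3f}  P(repeat | boy)={p_boy:.3f}")
```

The probabilities printed are for a school with a random intercept of zero. A school with a nonzero intercept shifts both the girl and boy constants by the same amount.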