Multilevel modelling of multivariate ordered response data

Slides: 15
Provided by: harveygo
1
Multilevel modelling of multivariate ordered
response data
  • Harvey Goldstein
  • And
  • Daphne Kounali
  • University of Bristol

2
Built on ideas and comments contributed by
  • Fiona Steele
  • Bill Browne
  • Jon Rasbash
  • James Carpenter
  • Developed from latent normal modelling ideas due
    to Chib, Aitchison and others
  • Funded by ESRC

Builds on the REALCOM project. Freely available
software and training materials at
http://www.cmm.bristol.ac.uk/research/Realcom/index.shtml
3
The latent normal model
  • Goldstein, Carpenter, Kenward, and Levin (GCKL)
    propose a multilevel multivariate model with
    the following properties
  • Responses can be at any level of the data hierarchy
  • Responses can be a mixture of normal, ordered
    and unordered variates
  • Observed responses are related to an underlying
    multivariate normal distribution
  • An MCMC algorithm provides the appropriate
    transformation steps and also provides imputed
    values for missing data.
  • Applications to missing data, partially observed
    data and flexible prediction systems
  • Paper submitted for publication

4
An efficient prediction system for adult
measurements based on growth in weight
  • Consider the repeated measures model for weight
    in children where an adult measure is available

The superscripts denote the level at which each
response is measured. This represents a cubic
growth model with intercept and slope random at
the individual level (level 2) and correlated with
the individual level residual. Given an estimate of
the level 2 covariance matrix, this provides an
efficient linear prediction of the individual adult
measure from a collection of growth measures.
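The model equation itself did not survive in this transcript. The following is only a sketch of a model matching the description above, not the authors' exact formula. All symbols here are assumptions: occasion i within individual j, age t_{ij}, random effects u_{0j}, u_{1j}, u_{2j}, and a level 2 covariance matrix \Omega_u partitioned into growth and adult parts.

```latex
\begin{aligned}
y^{(1)}_{ij} &= \sum_{h=0}^{3} \beta_h t_{ij}^{h} + u_{0j} + u_{1j} t_{ij} + e_{ij},
  \qquad e_{ij} \sim N(0, \sigma^2_e), \\
y^{(2)}_{j} &= \alpha + u_{2j}, \\
\begin{pmatrix} u_{0j} \\ u_{1j} \\ u_{2j} \end{pmatrix}
  &\sim N\!\left(0,\ \Omega_u\right),
  \qquad
  \Omega_u = \begin{pmatrix} \Omega_{gg} & \Omega_{ga} \\ \Omega_{ag} & \Omega_{aa} \end{pmatrix}, \\
E\!\left(u_{2j} \mid u_{0j}, u_{1j}\right)
  &= \Omega_{ag}\, \Omega_{gg}^{-1} \begin{pmatrix} u_{0j} \\ u_{1j} \end{pmatrix}.
\end{aligned}
```

Under these assumptions the last line is the usual multivariate normal conditional mean, which is what makes a linear prediction of the adult measure possible once the level 2 covariance matrix has been estimated.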
5
Missing data
  • In a repeated measures design data may be missing
    in the sense that individuals do not attend for
    measurement
  • The standard model (as above) ignores this since
    it models directly the relationship with time
    (age)
  • Nevertheless, missing data due to attrition may
    have atypical values and standard ways to deal
    with this involve studying earlier
    characteristics of individuals with missing data.
  • In this paper we look instead to see if the
    number of occasions measured during two childhood
    periods is related to growth parameters and adult
    measures.

6
Joint modelling of growth and number of visits
  • The data are
  • 1000 subjects with 4859 repeated measurements of
    weight
  • nine occasions between birth and age 10
  • adult body mass index (BMI, measured in kg/m²)
    and plasma glucose (mmol/liter), both log
    transformed, at around age 30.
  • The model is a cubic polynomial in age from 2-10
    years (childhood) and a quadratic from 1-2 years
    (infancy), with intercept and linear term random
    at the individual level, together with 4
    individual level variables:
  • log(glucose), log(BMI), number of infancy
    occasions measured, number of childhood occasions
    measured.

7
Full model
8
The latent normal model
  • Consider the probit link proportional odds model.

where π_g is the probability that the
observation occurs in category g (g = 1, ..., p) and
φ is the pdf of the standard normal
distribution.
Note that in general the linear predictor
will contain higher level random effects and
terms that arise from conditioning on all
correlated random effects from the remaining
responses. The model reduces to the ordinary probit
for a binary response.
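As an illustration of the probit link proportional odds model, the sketch below computes category probabilities from a set of thresholds. It assumes the common parameterisation in which the cumulative probability up to category g is Φ(α_g − Xβ); the sign convention, the function name and the example values are assumptions, not taken from the slides.

```python
from statistics import NormalDist


def ordered_probit_probs(thresholds, linear_predictor):
    """Category probabilities for a probit-link proportional odds model.

    thresholds: increasing cut points alpha_1 < ... < alpha_{p-1} (assumed names)
    linear_predictor: value of X*beta, including any random effects
    """
    std_normal = NormalDist()
    # Cumulative probabilities Phi(alpha_g - X*beta); the final one is 1.
    cumulative = [std_normal.cdf(a - linear_predictor) for a in thresholds] + [1.0]
    # Category probabilities are successive differences of the cumulative ones.
    probs = []
    previous = 0.0
    for gamma in cumulative:
        probs.append(gamma - previous)
        previous = gamma
    return probs


# Three ordered categories with symmetric thresholds and zero linear predictor.
probs = ordered_probit_probs([-0.5, 0.5], 0.0)

# A single threshold at zero gives the ordinary binary probit.
binary = ordered_probit_probs([0.0], 0.3)
```

With one threshold fixed at zero, the probability of the upper category is Φ(Xβ), which is the binary probit reduction noted above.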
9
Handling count data
  • Any ordered variate, including count data, can be
    treated using the ordered probit model
  • With large numbers of categories this involves
    estimating many threshold parameters, so
    instead we propose either of the following

A) Replace the threshold parameters with a smooth
function of the count, e.g. a regression spline or
fractional polynomial. We use a simple
polynomial.
B) Fit a latent normal (one-parameter) Poisson
model to the cumulative count probabilities,
taken at a reference value. Thus, given the
Poisson parameter, the thresholds are determined
for unit i.
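To make option B concrete, here is a minimal sketch of how a single Poisson mean fixes every threshold on the latent normal scale: each threshold is the standard normal quantile of the cumulative Poisson probability. The function name, the choice of mean and the truncation at a maximum count are assumptions for illustration, and the formula on the original slide is not available to check against.

```python
import math
from statistics import NormalDist


def poisson_thresholds(mu, max_count):
    """Latent normal thresholds implied by a Poisson(mu) count.

    Returns alpha_0, ..., alpha_max_count with Phi(alpha_k) = P(Y <= k).
    """
    std_normal = NormalDist()
    thresholds = []
    cumulative = 0.0
    for k in range(max_count + 1):
        cumulative += math.exp(-mu) * mu**k / math.factorial(k)
        # Guard against rounding to exactly 1, where the quantile is infinite.
        p = min(cumulative, 1.0 - 1e-12)
        thresholds.append(std_normal.inv_cdf(p))
    return thresholds


# Thresholds for counts 0..5 when the Poisson mean is 2 (illustrative value).
alphas = poisson_thresholds(2.0, 5)
```

Only the mean has to be estimated, whereas the unrestricted ordered probit model needs one threshold parameter per category boundary.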
10
MCMC steps
  • Sample the underlying normal variates from the
    observed count data, conditioning on the
    correlated responses; this yields MVN data
  • Sample fixed effects
  • Sample level 2 residuals
  • Sample level 1 residuals
  • Sample level 1 and level 2 covariance matrices
  • Details in GCKL and REALCOM training materials
  • Default (uniform) prior distributions assumed.
  • Mixture of Gibbs (fixed effects, level 2
    residuals and level 1 variance) and
    Metropolis-Hastings (MH) sampling (level 2
    covariance matrix and threshold parameters)
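The first step in the list above, drawing the underlying normal variate given an observed ordered category, can be sketched as a draw from a truncated normal by inverting the CDF. This is an illustration under stated assumptions, not the REALCOM implementation: the function name is hypothetical, categories are indexed from 0, and `mean` and `sd` stand for the conditional mean and standard deviation after conditioning on the other responses and random effects.

```python
import random
from statistics import NormalDist


def sample_latent_normal(category, thresholds, mean, sd, rng):
    """Draw z ~ N(mean, sd^2) restricted to the interval for `category`.

    Category g (0-based) corresponds to thresholds[g-1] < z <= thresholds[g],
    with the first and last intervals open-ended.
    """
    std_normal = NormalDist()
    # Probability bounds of the interval on the standard normal scale.
    p_lo = std_normal.cdf((thresholds[category - 1] - mean) / sd) if category > 0 else 0.0
    p_hi = std_normal.cdf((thresholds[category] - mean) / sd) if category < len(thresholds) else 1.0
    # Inverse-CDF draw: a uniform within the bounds, mapped back to the normal scale.
    u = rng.uniform(p_lo, p_hi)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep the quantile finite
    return mean + sd * std_normal.inv_cdf(u)


rng = random.Random(1)
cuts = [-0.5, 0.8]
draws = [sample_latent_normal(1, cuts, 0.2, 1.0, rng) for _ in range(200)]
```

Once every ordered or count response has been replaced by such a draw, the remaining steps operate on multivariate normal data.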

11
(No Transcript)
12
Model A treats the number of measurement
occasions in infancy and childhood as ordered
categories where each threshold parameter is
estimated. Model B smooths the threshold
parameters using a second order polynomial and
model C fits a Poisson model.
13
Correlations at individual level
  • Variances on the diagonal, and correlations below
    the diagonal. Poisson model (MVN scale). Note
    unit variances for counts.

Note that correlations of counts with growth and
adult height are very small, implying that
attrition can be treated as random.
14
Further developments
  • An important special case is the zero truncated
    Poisson, e.g. number of children in a family (> 0).
  • Can be extended to cross classifications and
    further levels of nesting.
  • Covariates such as gender, height etc. can be
    incorporated as predictors or as further responses
    that will be conditioned upon for linear
    prediction of adult measures.
  • Care is needed when assuming a distributional form;
    the smoother model can be used very generally.
  • Multiple imputation is easily incorporated when
    data, e.g. on covariates, are missing
  • Other discrete distributions, e.g. the Zipf
    distribution, can be handled similarly.
  • Experimental software available. GCKL model
    software freely available at
    http://www.cmm.bristol.ac.uk/research/Realcom/index.shtml
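For the zero truncated Poisson case mentioned in the list above, the same cumulative-probability idea applies once the probabilities are renormalised to exclude zero. The sketch below is an assumption about how that could look, with hypothetical names and an illustrative mean; it is not taken from the experimental software.

```python
import math
from statistics import NormalDist


def zero_truncated_poisson_thresholds(mu, max_count):
    """Latent normal thresholds for a Poisson(mu) count conditioned on Y > 0.

    Returns alpha_1, ..., alpha_max_count with Phi(alpha_k) = P(Y <= k | Y > 0).
    """
    std_normal = NormalDist()
    normaliser = 1.0 - math.exp(-mu)  # P(Y > 0)
    thresholds = []
    cumulative = 0.0
    for k in range(1, max_count + 1):
        cumulative += math.exp(-mu) * mu**k / math.factorial(k) / normaliser
        # Guard against rounding to exactly 1, where the quantile is infinite.
        thresholds.append(std_normal.inv_cdf(min(cumulative, 1.0 - 1e-12)))
    return thresholds


# Thresholds for family sizes 1..6 with an illustrative mean of 1.8.
zt_alphas = zero_truncated_poisson_thresholds(1.8, 6)
```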