Michael Fahey - PowerPoint PPT Presentation


1

Identifying dietary patterns in a multivariate
mixture: non-normality versus correlation
  • Michael Fahey
  • Ian White
  • July, 2007

2
Outline
  • motivation
  • semi-normal mixture model
  • sensitivity to marginal non-normality
  • sensitivity to correlation among responses

3
Motivation
  • diet assessment in epidemiology
  • foods eaten together interact
  • holistic description using multiple dietary
    responses
  • latent characteristic (dietary pattern) induces
    correlation among responses

4
Notation
$f(y)$ is the multivariate probability density function
for a set of food responses $y$
$P(x)$ is the probability of membership to latent class $x$
$f(y \mid x)$ is the class-specific density, given latent class $x$
5
Mixture distribution
Define dietary patterns as sub-groups having
different food consumption probability
distributions. To find the sub-groups, decompose the
mixture distribution into a sum of K class-specific
distributions:
$$f(y) = \sum_{x=1}^{K} P(x)\, f(y \mid x)$$
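The decomposition can be illustrated with a minimal sketch. It assumes each class-specific density is a product of independent normal densities (zero covariances), and all weights, means and standard deviations below are hypothetical values, not estimates from the EPIC-Norfolk data.

```python
import math

def normal_pdf(y, mean, sd):
    """Univariate normal density."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def class_density(y, means, sds):
    """Class-specific density f(y | x), assuming independent normal responses."""
    dens = 1.0
    for yj, m, s in zip(y, means, sds):
        dens *= normal_pdf(yj, m, s)
    return dens

def mixture_density(y, weights, class_means, class_sds):
    """f(y) = sum over classes x of P(x) * f(y | x)."""
    return sum(
        w * class_density(y, m, s)
        for w, m, s in zip(weights, class_means, class_sds)
    )

# Hypothetical two-class example with two food responses.
weights = [0.6, 0.4]
class_means = [[0.0, 0.0], [2.0, 1.0]]
class_sds = [[1.0, 1.0], [1.0, 0.5]]
density = mixture_density([0.5, 0.2], weights, class_means, class_sds)
```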
6
Food consumption data
  • 8000 women in the EPIC-Norfolk cohort who
    completed a dietary questionnaire
  • eight food groups reflecting different dietary
    components and characteristics in data
  • consumption calculated in g/d, or as (g/d + 1) on the
    log, square root and Box-Cox scales

7
(No Transcript)
8
Semi-continuous data
9
Semi-normal (SMN) mixture
  • two-part density for semi-continuous responses
  • binary part (whether food was consumed)
  • continuous part (if consumed, how much)
  • the two parts are considered together: if the food is not
    consumed, the continuous part is set to missing

10
Two-part density
Define $d = 1$ if $y > 0$, and $d = 0$ if $y = 0$, and
the PDF for the two-part density is
$$f(y, d) = (1 - p)^{1-d}\,[\,p\, g(y)\,]^{d}$$
Here, $p$ is the probability of consumption, $g(y)$
is scale dependent, and in the mixture $f(y \mid x)$
is replaced by $f(y, d \mid x)$ for semi-continuous
data.
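A minimal sketch of a two-part density for one food follows. It assumes the usual two-part form, a point mass of 1 - p at zero and p times g(y) for consumers, and it takes g to be a normal density on whatever transformed scale is in use. The function name and arguments are illustrative, not taken from the presentation.

```python
import math

def two_part_density(d, t, p, mean, sd):
    """Two-part density for a semi-continuous response.

    d    : 1 if the food was consumed, 0 otherwise
    t    : amount consumed on the chosen scale (ignored when d == 0)
    p    : probability of consumption
    mean, sd : parameters of the assumed normal density g on that scale
    """
    if d == 0:
        return 1.0 - p  # binary part only; continuous part is missing
    z = (t - mean) / sd
    g = math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    return p * g

# Hypothetical values: 70% of women consume the food.
non_consumer = two_part_density(0, None, 0.7, 3.0, 1.0)
consumer = two_part_density(1, 3.0, 0.7, 3.0, 1.0)
```

In a mixture, p, mean and sd would each carry a class index, and the class-specific density would be the product of such terms over foods when covariances are constrained to zero.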
11
Sensitivity to non-normality
  • fit SMN mixture on raw, log, square root,
    Box-Cox, rankit scales
  • means and variances of responses vary by class,
    covariances constrained to zero
  • vary K from 1 to 4 and evaluate model fit using BIC and
    pseudo-class residual analysis

12
Rankit transformation
Raw scale data are ranked, $r_i$, and assigned their
expected value as a standard normal deviate.
For example, the median on the raw scale
has rankit 0 and the 95th percentile has rankit
1.65.
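A rankit transformation can be sketched as follows. The plotting position (r_i - 0.5) / N is an assumption, since the transcript does not preserve the exact formula used; tied values are given their average rank, which is also an assumption.

```python
from statistics import NormalDist

def rankits(values):
    """Replace each value by the standard normal quantile of its rank.

    Assumes plotting positions (r_i - 0.5) / N; ties get the average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2.0 + 1.0  # mean of 1-based ranks in the block
        for k in range(pos, end + 1):
            ranks[order[k]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - 0.5) / n) for r in ranks]

# Hypothetical raw-scale intakes in g/d.
scores = rankits([10.0, 30.0, 20.0])
```

The middle value of an odd-sized sample maps to 0, matching the statement that the median has rankit 0.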
13
Change in BIC by scale
$\Delta\mathrm{BIC} = \mathrm{BIC}(K-1) - \mathrm{BIC}(K)$
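The change in BIC when one more class is added can be computed as in the sketch below. It assumes the common definition BIC = -2 log L + (number of parameters) × log N, and the BIC values in the example are hypothetical, not results from the presentation.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """BIC under the assumed definition -2 log L + n_params * log(n_obs)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def delta_bic(bic_by_k):
    """Return {K: BIC(K-1) - BIC(K)} for every K whose K-1 model was fitted."""
    return {
        k: bic_by_k[k - 1] - bic_by_k[k]
        for k in sorted(bic_by_k)
        if k - 1 in bic_by_k
    }

# Hypothetical BIC values for K = 1..3 on one scale.
changes = delta_bic({1: 1000.0, 2: 900.0, 3: 905.0})
```

With this sign convention a positive change means the K-class model has the lower BIC, so it is preferred over the (K-1)-class model.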
14
Pseudo-residual analysis
  • randomly assign women to one latent class using
    $p(x \mid y, d)$ and compute residuals
  • pseudo-class residuals have good properties (Wang
    et al, JASA, 2005)
  • repeat random assignments M times for each woman
    to create an N × M data set
  • use Q-Q plots to evaluate class-specific
    multivariate normality on the log and rankit scales
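The random assignment step can be sketched as below. The function name, the seed argument and the example posteriors are hypothetical; the sketch only shows drawing M class labels per woman from her posterior probabilities, after which residuals would be computed against the parameters of each drawn class.

```python
import random

def pseudo_class_draws(posteriors, m, seed=0):
    """Draw m pseudo-class labels per woman from her posterior p(x | y, d).

    posteriors : list of N lists, each holding K class probabilities
    Returns a list of N * m (woman_index, class_index) pairs.
    """
    rng = random.Random(seed)
    classes = range(len(posteriors[0]))
    draws = []
    for i, probs in enumerate(posteriors):
        for label in rng.choices(classes, weights=probs, k=m):
            draws.append((i, label))
    return draws

# Hypothetical posteriors for three women and two classes, M = 4 draws each.
draws = pseudo_class_draws([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], m=4)
```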

15
Pseudo-residuals log scale
16
Pseudo-residuals rankit scale
17
Sensitivity to correlation
  • vary class-specific covariance matrix, $\Sigma_K$
  • diagonal: $\Sigma_K = D_K$
  • linear factor: $\Sigma_K = \lambda\lambda^T + D_K$,
    where $\lambda$ is the coefficient of a class-constant
    linear factor, $f$, such that $E(y) = \mu + \lambda f$
  • unconstrained: $\Sigma_K = C_K + D_K$,
    where $C_K$ is a matrix of covariances among
    continuous responses, and zero elsewhere
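The three covariance structures can be built as in the following sketch, which uses plain nested lists. It assumes the linear factor has unit variance, so that its contribution to the covariance is the outer product of the coefficient vector with itself; the function names and numeric values are illustrative only.

```python
def diagonal_cov(variances):
    """Diagonal structure: variances on the diagonal, zero covariances."""
    n = len(variances)
    return [[variances[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def factor_cov(loadings, variances):
    """Linear factor structure: outer product of loadings plus the diagonal.

    Assumes a single factor with unit variance.
    """
    d = diagonal_cov(variances)
    n = len(variances)
    return [[loadings[i] * loadings[j] + d[i][j] for j in range(n)] for i in range(n)]

def unconstrained_cov(covariances, variances):
    """Unconstrained structure: off-diagonal covariance matrix plus the diagonal.

    covariances must be an n x n matrix with zeros on its diagonal.
    """
    d = diagonal_cov(variances)
    n = len(variances)
    return [[covariances[i][j] + d[i][j] for j in range(n)] for i in range(n)]

# Hypothetical two-response example.
sigma_factor = factor_cov([1.0, 2.0], [1.0, 1.0])
sigma_free = unconstrained_cov([[0.0, 0.3], [0.3, 0.0]], [1.0, 2.0])
```

The factor structure adds one coefficient per response, whereas the unconstrained structure adds one parameter per pair of continuous responses, which is why the number of parameters grows quickly with the unconstrained form.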

18
Covariance matrix rankit scale
19
Pseudo-residuals rankit scale, $\Sigma_K = C_K + D_K$
20
Concluding remarks 1
  • marginal normality leads to
  • 1) lower latent dimensionality
  • 2) better fitting class-specific models
  • SMN model removes non-normality due to clumping
    at zero; identification shifts to associations
    involving binary responses

21
Concluding remarks 2
  • two latent classes identified in associations
    involving binary responses
  • correlation among continuous responses may be
    uninteresting, e.g. measurement error,
    association not caused by dietary pattern
  • shifting emphasis from non-normality to
    correlation localises identification

22
Notation
$f(y \mid z; \theta)$ is the multivariate probability density
function for a set of food variables $y$ given
covariates $z$ and parameter vector $\theta = (\theta_1, \theta_2)$
$P(x \mid z; \theta_1)$ is the probability of membership to latent class
$x$ given covariates $z$ and parameter vector $\theta_1$
$f(y \mid x, z; \theta_2)$ is the class-specific density, given latent class
$x$, covariates $z$ and parameter vector $\theta_2$
23
Norfolk dietary questionnaire
24
Model details (1)
  • means and variances of y vary by class
  • covariances among the y constrained to zero
  • latent class probabilities and the mean of the y
    depend on one covariate, z
  • z is the log of total energy intake (log kcal)

25
Model details (2)
  • Model dependence on covariates as follows

26
Sensitivity to non-normality (OLD)
  • fit SMN mixtures on raw, log, square root,
    Box-Cox, rankit scales; vary K from 1 to 4
  • evaluate model fit using BIC and pseudo-class residual
    analysis
  • examine classification agreement between any two
    candidate solutions using the Rand index

27
Rand index (RI) 1
  • compares two modal classifications by considering
    all pairs of women
  • agreement if a pair are either
  • in the same class on each classification
  • or in different classes on each classification
  • RI = (count of agreements) / (number of pairs)

28
Rand index (RI) 2
  • values range from 0 to 1
  • probability that classifications on a randomly
    chosen pair agree
  • adjusted RI takes chance agreement into account
  • invariant to permutation of class labels
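The pair-counting definition of the Rand index can be sketched directly. This is the unadjusted index only, and the label vectors in the example are hypothetical. The loop visits every pair, so for a cohort of several thousand women a contingency-table formulation would be much faster than this quadratic version.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Unadjusted Rand index between two classifications of the same people."""
    if len(labels_a) != len(labels_b):
        raise ValueError("classifications must cover the same people")
    agreements = 0
    pairs = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        # Agreement: together in both classifications, or apart in both.
        if same_a == same_b:
            agreements += 1
        pairs += 1
    return agreements / pairs

# Relabelling the classes does not change the index.
relabelled = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
crossed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```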

29
Classification agreement (RI) among scales, K = 4
30
Classification log vs others
31
Covariance matrix log scale
32
Number of parameters
Semi-normal models with $K$ and $\Sigma_K$ as given
33
MVN patterns deviation from grand mean
34
SMN deviation from grand mean
35
Posterior probabilities
Average modal posterior probabilities by class

36
Hard versus soft classification
hard: modal classification
soft: mixing proportions estimated from model
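The contrast between the two summaries can be illustrated with a short sketch. As an assumption, the soft proportions are approximated here by the average posterior probability of each class, which matches the estimated mixing proportions in a standard mixture without covariates; the posterior values are hypothetical.

```python
def hard_proportions(posteriors):
    """Class shares after assigning each woman to her modal class."""
    k = len(posteriors[0])
    counts = [0] * k
    for probs in posteriors:
        modal = max(range(k), key=lambda c: probs[c])
        counts[modal] += 1
    return [c / len(posteriors) for c in counts]

def soft_proportions(posteriors):
    """Class shares as the average posterior probability of each class."""
    k = len(posteriors[0])
    n = len(posteriors)
    return [sum(probs[c] for probs in posteriors) / n for c in range(k)]

# Hypothetical posteriors for three women and two classes.
posteriors = [[0.6, 0.4], [0.7, 0.3], [0.2, 0.8]]
hard = hard_proportions(posteriors)
soft = soft_proportions(posteriors)
```

In this example the hard shares are two thirds and one third while the soft shares are equal, which shows how modal assignment can overstate the size of a class when posteriors are not close to 0 or 1.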
37
Predicting cancer risk
Cox regression: crude effect of SMN latent class
patterns on incidence of 49 cancers in
EPIC-Norfolk