1
Model Selection Methods
  • Fish 507 Lecture 26

2
Model Selection and Model Averaging-I
  • Model Selection
  • Upon which model (of a set) should inference be
    based, i.e. which is the best model?
  • Model averaging
  • How to combine results from different models,
    taking account of their relative probabilities /
    likelihoods.
  • How to calculate the uncertainty associated with
    model choice when making predictions.

3
Model Selection and Model Averaging-II
  • Let \Delta be a quantity in which we are
    interested and let us assume we can estimate
    \Delta from each of K models. Assume the value of
    \Delta from model k is \hat{\Delta}_k and that
    the weight assigned to model k is w_k, then
    \hat{\Delta} = \sum_{k=1}^{K} w_k \hat{\Delta}_k
  • Model averaging uses all (or most) of the
    candidate models, whereas model selection selects
    the best model (the model with the highest value
    of w_k).
  • How then to choose w_k?
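
As a concrete illustration of the averaging step, a minimal Python
sketch (the estimates are the y(9) predictions that appear later in
the deck; the weights are made-up numbers):

    import numpy as np

    # Estimates of Delta from K = 4 candidate models
    delta_k = np.array([241.2, 358.2, 347.3, 330.44])
    # Weights w_k assigned to the models (must sum to one)
    w_k = np.array([0.05, 0.40, 0.35, 0.20])

    delta_avg = np.sum(w_k * delta_k)     # model averaging
    delta_best = delta_k[np.argmax(w_k)]  # model selection

Model selection is the special case in which the chosen model
receives weight one and all other models receive weight zero.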

4
Interlude - The Example Problem
  • Estimate the value of y(9) where the set of
    candidate models (k0,1,2,3) is

5
Model Selection (Common sense-I)
  • Only consider models that
  • make biological sense (i.e. are at least
    conceptually acceptable)
  • are not clearly mis-specified (e.g. check the
    residuals for runs) and
  • are consistent with the distributional
    assumptions made (e.g. errors are normal with
    constant variance when this is assumed).
  • If none of the models satisfy the above criteria,
    it is time to find a new set of models!

6
Model Selection (Common sense-II)
The constant model seems mis-specified, but let's
check that with a residual plot.
7
Model Selection (Common sense-III)
8
Model Selection (Overview)
  • Most model selection criteria are based on an
    information criterion of the form
    IC_k = -2 \ln L_k + q
    where L_k is the maximized likelihood of model k.
  • The various approaches differ in terms of how q
    is specified.
  • This definition leads to the following natural
    definition for the weights:
    w_k = \exp(-IC_k / 2) / \sum_{j=1}^{K} \exp(-IC_j / 2)

9
Model Selection (Classical methods-I)
  • The options for q are:
  • AIC: q = 2p
  • AICc: q = 2p + 2p(p+1) / (n - p - 1)
  • BIC: q = p \ln(n)
  • where p is the number of parameters and n is the
    number of data points (arguably difficult to
    count for models with various data types).
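
These penalties, and the weights they imply, are easy to compute
directly. A minimal Python sketch of the formulas above (function
and argument names are illustrative):

    import numpy as np

    def penalty(criterion, p, n):
        """Penalty term q for the named criterion."""
        if criterion == "AIC":
            return 2 * p
        if criterion == "AICc":
            return 2 * p + 2 * p * (p + 1) / (n - p - 1)
        if criterion == "BIC":
            return p * np.log(n)
        raise ValueError(criterion)

    def ic_weights(log_liks, num_params, n, criterion="AIC"):
        """w_k = exp(-IC_k/2) / sum_j exp(-IC_j/2), computed
        relative to the smallest IC for numerical stability."""
        ic = np.array([-2.0 * ll + penalty(criterion, p, n)
                       for ll, p in zip(log_liks, num_params)])
        rel = np.exp(-0.5 * (ic - ic.min()))
        return rel / rel.sum()

Subtracting the minimum IC before exponentiating leaves the weights
unchanged but avoids underflow when the IC values are large.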

10
Model Selection (Classical methods-II)
The best and second-best models do not differ
too much. Should the conclusions of the
second-best model be ignored completely?
11
Model Selection (Classical methods-III)
  • Model-specific predictions for y(9): 241.2,
    358.2, 347.3, and 330.44.
  • Model-averaged estimates:
  • AIC: 335.6
  • BIC: 337.9
  • AICc: 340.1
  • How to quantify the uncertainty associated with
    these predictions? The variance of a
    model-averaged quantity will be at least as large
    as if the estimate were based on the best model.
    Why?
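
One standard answer comes from Buckland et al. (1997; see the
references): the variance of a model-averaged quantity combines the
within-model variance with a between-model term, so it cannot fall
below the within-model part alone. Their estimator is

    var(\hat{\Delta}) = [ \sum_{k=1}^{K} w_k
        \sqrt{ var(\hat{\Delta}_k | M_k)
               + (\hat{\Delta}_k - \hat{\Delta})^2 } ]^2

The (\hat{\Delta}_k - \hat{\Delta})^2 term is the model-selection
contribution; it vanishes only when all models agree.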

12
Model Selection (Classical methods-IV)
  • Bootstrapping can be used to quantify model
    selection error. This involves generating pseudo
    data sets, fitting each model, and applying the
    model selection algorithm.
  • The results of bootstrapping tell us:
  • the frequency with which each model is selected
  • the variances of the model outputs conditioned on
    each model and
  • the variances of the model outputs accounting for
    model selection error.

13
Model Selection (Classical methods-V)
  • How to generate the bootstrap data sets (the
    first option is sketched below):
  • Residuals and model-predictions based on the
    model with the largest value of w_k.
  • Residuals and model-predictions based on the
    model that explains most of the variance in
    the data.
  • Residuals and model-predictions based on
    selecting each model based on its AIC (or BIC)
    weight.
  • Based on resampling the underlying raw data (with
    replacement).
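
A minimal Python sketch of the first option, assuming the polynomial
candidate set from the example problem; fit_poly and its Gaussian
AIC (computed up to an additive constant) are illustrative choices:

    import numpy as np

    rng = np.random.default_rng(1)

    def fit_poly(k, x, y):
        """Least-squares polynomial of order k with a Gaussian
        AIC (up to an additive constant)."""
        beta = np.polyfit(x, y, k)
        resid = y - np.polyval(beta, x)
        n, p = len(y), k + 2  # k+1 coefficients plus sigma
        return beta, n * np.log(np.mean(resid ** 2)) + 2 * p

    def bootstrap_selection(x, y, ks=(0, 1, 2, 3), n_boot=1000):
        """Residual bootstrap around the lowest-AIC model."""
        fits = [fit_poly(k, x, y) for k in ks]
        best = min(range(len(ks)), key=lambda i: fits[i][1])
        yhat = np.polyval(fits[best][0], x)
        resid = y - yhat
        picks, preds = np.zeros(len(ks)), []
        for _ in range(n_boot):
            y_star = yhat + rng.choice(resid, len(y), replace=True)
            boot = [fit_poly(k, x, y_star) for k in ks]
            j = min(range(len(ks)), key=lambda i: boot[i][1])
            picks[j] += 1
            preds.append(np.polyval(boot[j][0], 9.0))  # y(9)
        return picks / n_boot, np.var(preds)

The returned frequencies estimate how often each model is selected,
and the variance of the y(9) predictions includes model selection
error; conditioning on a fixed j instead gives per-model variances.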

14
Model Selection (Bayesian methods-I)
  • DIC (Deviance Information Criterion)
  • The value for the mean deviance (twice the
    negative log-likelihood) is evaluated from an
    MCMC chain.
  • Determining the value of the estimated deviance
    is less simple; approaches include the Maximum
    Posterior Density (MPD) estimate, the deviance of
    the mean of the posterior, and the deviance of
    the median of the posterior.
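
Following Spiegelhalter et al. (2002), DIC = mean deviance + p_D,
with p_D = mean deviance - estimated deviance. A minimal Python
sketch, assuming hypothetical names log_lik (log-likelihood
function) and theta_chain (MCMC draws); the point argument is
exactly the plug-in choice discussed above:

    import numpy as np

    def dic(theta_chain, log_lik, point=np.mean):
        """DIC from an MCMC chain: mean deviance plus the
        effective number of parameters p_D."""
        dev = np.array([-2.0 * log_lik(t) for t in theta_chain])
        dev_bar = dev.mean()                    # mean deviance
        theta_hat = point(theta_chain, axis=0)  # plug-in estimate
        dev_hat = -2.0 * log_lik(theta_hat)     # estimated deviance
        return dev_bar + (dev_bar - dev_hat)    # dev_bar + p_D

Swapping point=np.mean for np.median (or an MPD estimate) is a quick
way to see how sensitive DIC is to the estimated-deviance definition.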

15
Model Selection (Bayesian methods-II)
  • Bayes Factor (essentially the posterior odds
    ratio, assuming that the prior probability for
    each model is the same), i.e. the weight in favor
    of model k is
    w_k = p(y | M_k) / \sum_{j=1}^{K} p(y | M_j)
  • where the marginal likelihood p(y | M_k) is
    computed from an MCMC chain of length T by the
    harmonic mean of the likelihoods:
    p(y | M_k) = [ (1/T) \sum_{t=1}^{T}
                   1 / L(y | \theta_k^{(t)}) ]^{-1}
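
A minimal Python sketch of that estimator, worked in log space
(one of the more stable "alternative formulations" noted on the
next slide); log_liks, an array of per-draw log-likelihoods from
model k's chain, is an assumed input:

    import numpy as np
    from scipy.special import logsumexp

    def log_marglik_harmonic(log_liks):
        """Harmonic-mean estimate of log p(y | M_k):
        log T - logsumexp(-log_liks)."""
        log_liks = np.asarray(log_liks)
        return np.log(len(log_liks)) - logsumexp(-log_liks)

    def model_weights(log_margliks):
        """w_k from log marginal likelihoods, assuming equal
        prior probabilities across the K models."""
        lml = np.asarray(log_margliks)
        return np.exp(lml - logsumexp(lml))

Even in log space the estimate is dominated by the smallest
likelihood values in the chain, which is the instability in question.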

16
Model Selection (Bayesian methods-III)
  • Notes
  • The formula used to compute the Bayes factor can
    be numerically unstable; alternative
    formulations are available.
  • The results of using DIC may be very sensitive to
    how the estimated deviance is defined.
  • Both methods require that MCMC chains (that
    converge) are available for all models.
  • Computing the posterior for a model output is
    relatively straightforward: the posterior
    distribution for model k is sampled with
    probability w_k (see the sketch below).
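
A minimal Python sketch of that sampling step, assuming chains is a
list of 1-D arrays of posterior draws of the output, one per model:

    import numpy as np

    def mixed_posterior(chains, weights, n_draws=10000, seed=None):
        """Model-averaged posterior: pick model k with
        probability w_k, then draw one of its stored samples."""
        rng = np.random.default_rng(seed)
        ks = rng.choice(len(chains), size=n_draws, p=weights)
        return np.array([rng.choice(chains[k]) for k in ks])

The resulting sample reflects both within-model uncertainty and
uncertainty about which model is correct.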

17
Model Selection (Bayesian methods-IV)
For simplicity, uniform priors are placed on all
of the parameters.
18
Model Selection (Bayesian methods-V)
19
Model Selection (Bayesian methods-VI)
20
Additional Caveat
  • Care needs to be taken when choosing models for
    consideration. For example, if the same model is
    selected more than once (or a number of slight
    variants of one model are included in the set of
    models), that model will be (unintentionally)
    overweighted.
  • The bootstrapping approach outlined above may
    handle this problem.
  • Alternative priors can be assigned to each
    model.

21
References
  • Buckland, S.T., K.P. Burnham and N.H. Augustin.
    1997. Model selection: An integral part of
    inference. Biometrics 53: 603-618.
  • Burnham, K.P. and D.R. Anderson. 2002. Model
    selection and multi-model inference: A practical
    information-theoretic approach, 2nd ed. New York:
    Springer.
  • Hoeting, J.A., D. Madigan, A.E. Raftery and C.T.
    Volinsky. 1999. Bayesian model averaging: A
    tutorial. Statistical Science 14: 382-417.
  • Kass, R.E. and A.E. Raftery. 1995. Bayes factors.
    JASA 90: 773-795.
  • Spiegelhalter, D.J., N.G. Best, B.P. Carlin and
    A. van der Linde. 2002. Bayesian measures of
    model complexity and fit. Journal of the Royal
    Statistical Society B 64: 583-639.