Practical Model Selection and Multi-model Inference using R

1
Practical Model Selection and Multi-model
Inference using R
  • Modified from a presentation by
    Eric Stolen and Dan Hunt

2
Theory
  • This is the link with science, which is about
    understanding how the world works

3
Indigo Snake Habitat Selection
David R. Breininger, M. Rebecca Bolt, Michael L. Legare,
John H. Drese, and Eric D. Stolen. Source: Journal
of Herpetology, 45(4):484-490. 2011.
  • Animal perception
  • Evolutionary Biology
  • Population Demography

http://www.seaworld.org/animal-info/animal-bytes/spooky-safari/eastern-indigo-snake.htm
4
Hypotheses
  • To use the information-theoretic toolbox, we must
    be able to state each hypothesis as a statistical
    model (more precisely, an equation that lets us
    calculate the maximized likelihood of the
    hypothesis given the data)

http://www.seaworld.org/animal-info/animal-bytes/spooky-safari/eastern-indigo-snake.htm
5
Multiple Working Hypotheses
  • We operate with a set of multiple alternative
    hypotheses (models)
  • The many advantages include safeguarding
    objectivity and allowing rigorous inference
  • Chamberlin (1890)
  • Strong Inference - Platt (1964)
  • Karl Popper (ca. 1960) Bold Conjectures

6
Deriving the model set
  • This is the tough part (but also the creative
    part)
  • Much thought is needed, so don't rush
  • Collaborate, seek outside advice, read the
    literature, go to meetings
  • "How" and "When" hypotheses are better than "What"
    hypotheses (strive to predict rather than
    describe)

7
Models: Indigo Snake example
David R. Breininger, M. Rebecca Bolt, Michael L. Legare,
John H. Drese, and Eric D. Stolen. Source: Journal
of Herpetology, 45(4):484-490. 2011.
  • Study of indigo snake habitat use
  • Response variable: home range size, ln(ha)
  • SEX
  • Land cover: 2-3 levels (lc2)
  • weeks: effort/exposure
  • Science question: Is there a seasonal difference
    in habitat use between sexes?

8
Models Indigo Snake example
  • SEX
  • land cover type (lc2)
  • weeks
  • SEX + lc2
  • SEX + weeks
  • lc2 + weeks
  • SEX + lc2 + weeks
  • SEX + lc2 + SEX*lc2
  • SEX + lc2 + weeks + SEX*lc2
http://www.herpnation.com/hn-blog/indigo-snake-survival-demographics/?simple_nav_category=john-c-murphy
10
Modeling
  • Trade-off between precision and bias
  • The goal is to derive knowledge and advance
    learning, not merely to fit the data
  • Relationship between data (quantity and quality)
    and sophistication of the model

11
Precision-Bias Trade-off
[Figure: Bias^2 plotted against model complexity (increasing number of parameters)]
12
Precision-Bias Trade-off
[Figure: variance and Bias^2 plotted against model complexity (increasing number of parameters)]
14
Kullback-Leibler Information
  • Basic concept from Information theory
  • The information lost when a model is used to
    represent full reality
  • Can also think of it as the distance between a
    model and full reality
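
The idea of information lost can be made concrete for discrete distributions. The following is a minimal Python sketch, not taken from the presentation: the "truth" `f` and the two candidate models `g1` and `g2` are hypothetical numbers chosen only for illustration. It also shows why only relative distances matter: the difference between two models' K-L information depends only on their expected log-probabilities under `f`, so the unknown part that involves truth alone cancels.

```python
import math

def kl_information(f, g):
    """Discrete K-L information I(f, g) = sum over x of f(x) * log(f(x) / g(x))."""
    return sum(fx * math.log(fx / gx) for fx, gx in zip(f, g) if fx > 0)

def expected_log(f, g):
    """Expected log-probability of model g when outcomes follow f."""
    return sum(fx * math.log(gx) for fx, gx in zip(f, g) if fx > 0)

# Hypothetical "truth" and two hypothetical approximating models
f = [0.5, 0.3, 0.2]
g1 = [0.4, 0.4, 0.2]
g2 = [0.2, 0.2, 0.6]

i1 = kl_information(f, g1)
i2 = kl_information(f, g2)
print(f"I(f, g1) = {i1:.4f}")
print(f"I(f, g2) = {i2:.4f}")

# The difference in K-L information needs only the expected log-probabilities
# of the models; the term that depends on f alone cancels out.
diff_relative = expected_log(f, g2) - expected_log(f, g1)
print(f"I(f, g1) - I(f, g2) = {i1 - i2:.4f} (same as {diff_relative:.4f})")
```

In this toy case `g1` loses less information than `g2`, so it would be the better approximation to `f`.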

15
Kullback-Leibler Information
[Figure: truth / reality and the candidate models G1 (best model in set), G2, and G3]
18
Kullback-Leibler Information
[Figure: truth / reality and the candidate models G1 (best model in set), G2, and G3]
  • The relative difference between models is constant
19
Akaike's Contributions
  • Figured out how to estimate the relative
    Kullback-Leibler distance between models in a set
    of models
  • Figured out how to link maximum likelihood
    estimation theory with expected K-L information
  • An (Akaike's) Information Criterion
  • $\mathrm{AIC} = -2\log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K$

20
  • $\mathrm{AICc}_i = -2\log_e\big(\mathcal{L}(\text{model}_i \mid \text{data})\big) + 2K\,\dfrac{n}{n-K-1}$
  • or
  • $\mathrm{AICc}_i = \mathrm{AIC}_i + \dfrac{2K(K+1)}{n-K-1}$
  • (where $K$ = the number of parameters estimated and
    $n$ = the sample size)
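
As a rough check that the two forms of AICc agree, here is a minimal Python sketch. The function names are illustrative. The log-likelihood (-51.99) and K (4) are those reported for mod4 in the model comparison table later in the deck; the sample size n = 45 is an assumption, chosen because it reproduces the AICc of 112.98 reported there, and is not stated in the slides.

```python
def aic(log_lik, k):
    """AIC = -2 * log-likelihood + 2K."""
    return -2.0 * log_lik + 2.0 * k

def aicc(log_lik, k, n):
    """Small-sample AICc, first form: -2 * log-likelihood + 2K * n / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return -2.0 * log_lik + 2.0 * k * (n / (n - k - 1))

def aicc_from_aic(log_lik, k, n):
    """Small-sample AICc, second form: AIC + 2K(K + 1) / (n - K - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > K + 1")
    return aic(log_lik, k) + 2.0 * k * (k + 1) / (n - k - 1)

# mod4 from the comparison table; n = 45 is an assumed sample size
log_lik, k, n = -51.99, 4, 45
print(f"AIC  = {aic(log_lik, k):.2f}")
print(f"AICc = {aicc(log_lik, k, n):.2f}")
print(f"AICc = {aicc_from_aic(log_lik, k, n):.2f} (second form)")
```

The correction term shrinks as n grows relative to K, so AICc approaches AIC for large samples.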

21
  • $\mathrm{AICc}_{\min}$ = AICc for the model with the lowest AICc
    value
  • $\Delta_i = \mathrm{AICc}_i - \mathrm{AICc}_{\min}$

22
  • $w_i = \mathrm{Prob}\{g_i \mid \text{data}\}$: the model
    probabilities
  • Evidence ratio of model $i$ to model $j$ = $w_i / w_j$
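
The formula behind the model probabilities is not shown on the slide. A standard way to compute them from the AICc differences $\Delta_i$, assuming a set of $R$ candidate models, is the Akaike weight; the evidence ratio then depends only on the difference between the two models' $\Delta$ values:

```latex
w_i = \frac{\exp(-\Delta_i / 2)}{\sum_{r=1}^{R} \exp(-\Delta_r / 2)},
\qquad
\frac{w_i}{w_j} = \exp\!\left(\frac{\Delta_j - \Delta_i}{2}\right)
```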

23
  • Least Squares Regression
  • $\mathrm{AICc} = n\log_e(\hat{\sigma}^2) + 2K\,\dfrac{n}{n-K-1}$
  • where $\hat{\sigma}^2 = \mathrm{RSS} / n$

24
  • Counting Parameters
  • $K$ = number of parameters estimated
  • Least Squares Regression
  • $K$ = number of regression coefficients + 2 (for the
    intercept and $\sigma^2$)
25
  • Counting Parameters
  • $K$ = number of parameters estimated
  • Logistic Regression
  • $K$ = number of regression coefficients + 1 (for the
    intercept)

26
Comparing Models
Model selection based on AICc:

       K    AICc  Delta_AICc  AICcWt  Cum.Wt      LL
mod4   4  112.98        0.00    0.71    0.71  -51.99
mod7   5  114.89        1.91    0.27    0.98  -51.67
mod1   3  121.52        8.54    0.01    0.99  -57.47
mod5   4  122.27        9.29    0.01    1.00  -56.64
mod2   3  125.93       12.95    0.00    1.00  -59.67
mod6   4  128.34       15.36    0.00    1.00  -59.67
mod3   3  141.26       28.28    0.00    1.00  -67.34

Model 1: "ha.ln ~ SEX"
Model 2: "ha.ln ~ lc2"
Model 3: "ha.ln ~ weeks"
Model 4: "ha.ln ~ SEX + lc2"
Model 5: "ha.ln ~ SEX + weeks"
Model 6: "ha.ln ~ lc2 + weeks"
Model 7: "ha.ln ~ SEX + lc2 + weeks"
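
The Delta_AICc, AICcWt, and Cum.Wt columns can be recomputed from the AICc column alone. The following Python sketch does that arithmetic; it is not the R code that produced the table, and the function name `akaike_table` is illustrative.

```python
import math

# AICc values as reported in the table above
aicc_values = {
    "mod4": 112.98, "mod7": 114.89, "mod1": 121.52, "mod5": 122.27,
    "mod2": 125.93, "mod6": 128.34, "mod3": 141.26,
}

def akaike_table(aicc_by_model):
    """Return (model, delta, weight, cumulative weight) rows, best model first."""
    ranked = sorted(aicc_by_model.items(), key=lambda kv: kv[1])
    best = ranked[0][1]
    deltas = [(name, value - best) for name, value in ranked]
    rel_lik = [math.exp(-0.5 * d) for _, d in deltas]  # relative likelihoods
    total = sum(rel_lik)
    rows, cum = [], 0.0
    for (name, d), rl in zip(deltas, rel_lik):
        w = rl / total
        cum += w
        rows.append((name, d, w, cum))
    return rows

rows = akaike_table(aicc_values)
for name, delta, weight, cum in rows:
    print(f"{name}  {delta:6.2f}  {weight:5.2f}  {cum:5.2f}")

weights = {name: w for name, _, w, _ in rows}
evidence_ratio = weights["mod4"] / weights["mod7"]
print(f"evidence ratio mod4 / mod7 = {evidence_ratio:.2f}")
```

The evidence ratio of mod4 to mod7 comes out at roughly 2.6, so the data favor mod4 but do not rule out mod7.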
27
Model Averaging Predictions
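
The slide's own formula is not in the transcript. Under the usual definition, a model-averaged prediction is the weighted sum of each model's prediction $\hat{Y}_i$, using the model probabilities $w_i$ over the $R$ models in the set:

```latex
\hat{\bar{Y}} = \sum_{i=1}^{R} w_i \, \hat{Y}_i
```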
31
Model Averaging Parameters
32
Unconditional Variance Estimator
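
The estimator itself is not in the transcript. The sketch below assumes the form commonly given by Burnham and Anderson, in which the unconditional standard error is a weighted sum of terms combining each model's conditional variance with the squared deviation of its estimate from the model-averaged estimate. All estimates, variances, and weights here are hypothetical, and the function names are illustrative.

```python
import math

def model_average(estimates, weights):
    """Model-averaged estimate: sum of w_i * theta_i (weights must sum to 1)."""
    return sum(w * est for w, est in zip(weights, estimates))

def unconditional_se(estimates, variances, weights):
    """Unconditional SE: sum of w_i * sqrt(var_i + (theta_i - theta_bar)^2)."""
    theta_bar = model_average(estimates, weights)
    return sum(
        w * math.sqrt(var + (est - theta_bar) ** 2)
        for w, est, var in zip(weights, estimates, variances)
    )

# Hypothetical estimates of one parameter from three models, with their
# conditional variances and model weights (rescaled to sum to 1)
estimates = [1.2, 1.0, 0.8]
variances = [0.04, 0.05, 0.09]
weights = [0.6, 0.3, 0.1]

theta_bar = model_average(estimates, weights)
se = unconditional_se(estimates, variances, weights)
print(f"model-averaged estimate = {theta_bar:.3f}")
print(f"unconditional SE        = {se:.3f}")
print(f"unconditional variance  = {se ** 2:.4f}")
```

Because the squared deviations are never negative, this standard error is at least as large as the weighted average of the conditional standard errors, which reflects the added uncertainty about which model is best.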