Title: Recent Advances in Statistical Ecology using Computationally Intensive Methods
1Recent Advances in Statistical Ecology using
Computationally Intensive Methods
2Overview
- Introduction to wildlife populations and
identification of questions of interest. - Motivating example.
- Issues to be addressed
- Missing data
- Model discrimination.
- Summary.
- Future research.
3Wildlife Populations
- In recent years there has been increasing
interest in wildlife populations. - Often we may be interested in population changes
over time, e.g. if there is a declining
population. (Steve Buckland) - Alternatively, we may be interested in the
underlying dynamics of the system, in order to
obtain a better understanding of the population. - We shall concentrate on this latter problem, with
particular focus on identifying factors that
affect demographic rates.
4Data Collection
- Data are often collected via some form of
capture-recapture study. - Observers go out into the field and record all
animals that are seen at a series of capture
events. - Animals may be recorded via simply resightings or
recaptures (of live animals) and recoveries (of
dead animals). - At each capture event all unmarked animals are
uniquely marked all observed animals are
recorded and subsequently released back into the
population.
5Data
- Each animal is uniquely identifiable so our data
consist of the capture histories for each
individual observed in the study. - A typical capture history may look like
- 0 1 1 0 0 1 2
- 0/1 corresponds to the individual being
unobserved/observed at that capture time and - 2 denotes an individual is recovered dead.
- We can then explicitly write down the
corresponding likelihood as a function of
survival (?), recapture (p) and recovery (?)
rates.
6Likelihood
- The likelihood is the product over all
individuals of the probability of their
corresponding capture history, conditional on
their initial capture. - For example, for an individual with history
- 0 1 1 0 0 1 2
- the contribution to the likelihood is
- ?2 p3 ?3 (1-p4) ?4 (1-p5) ?5 p6 (1-?6) ?7.
- Then, we can use this likelihood to estimate the
parameter values (either MLEs or posterior
distributions).
7Covariates
- Covariates are often used to explain temporal
heterogeneity within the parameters. - Typically these are environmental factors, such
as resource availability, weather conditions or
human intervention. - Alternatively, heterogeneity in a population can
often be explained via different (individual)
covariates. - For example, sex, condition or breeding status.
- We shall consider the survival rates to be
possibly dependent on the different covariates.
8Case Study Soay sheep
- We consider mark-recapture-recovery (MRR) data
relating to Soay sheep on the island of Hirta. - The sheep are free from human activity and
external competition/predation. - Thus, this population is ideal for investigating
the impact of different environmental and/or
individual factors on the sheep. - We consider annual data from 1986-2000, collected
on female sheep (1079 individuals).
This is joint work with Steve Brooks and Tim
Coulson
9Covariate Information
- Individual covariates
- Coat type (1dark, 2light)
- Horn type (1polled, 2scurred, 3classical)
- Birth weight (real normalised)
- Age (in years)
- Weight (real - normalised)
- Number of lambs born to the sheep in the spring
prior to summer census (0, 1, 2) - And in the spring following the census (0, 1, 2).
- Environmental covariates
- NAO index population size March rainfall
Autumn rainfall and March temperature.
10Survival Rates - ?
- Let the set of environmental covariate values at
time t be denoted by xt. - The survival rate for animal i, of age a, at time
t, is given by, - logit ?i,a,t ?a ?aT xt ?aT yi ?aT zi,t
?a,t - Here, yi denotes the set of time-independent
covariates and zi,t the time varying covariates - ?a,t N(0,?a2) denotes a random effect.
- There arise two issues here missing covariate
values and model choice.
11Issue 1 Missing Data
- The capture histories and time-invariant
covariate values data are presented in the form - Note that there are also many missing values for
the time-dependent weight covariate.
12Problem
- Given the set of covariate values, the
corresponding survival rate can be obtained, and
hence the likelihood calculated. - However, a complexity arises when there are
unknown (i.e. missing) covariate values, removing
the simple and explicit expression for the
likelihood.
13Missing Data Classical Approaches
- Typical classical approaches to missing data
problems include - Ignoring individuals with missing covariate
values - EM algorithm (can be difficult to implement and
computationally expensive) - Imputation of missing values using some
underlying model (e.g. Gompertz curve). - Conditional approach for time-varying covariates
(Catchpole, Morgan and Tavecchia, 2006 in
submission) - We consider a Bayesian approach, where we assume
an underlying model for the missing data which
allows us to account for the corresponding
uncertainty of the missing values.
14Bayesian Approach
- Suppose that we wish to make inference on the
parameters ?, and the data observed corresponds
to capture histories, c, and covariate values,
vobs. - Then, Bayes theorem states
- ?(?c, vobs) Ç L(c, vobs ?) p(?)
- The posterior distribution is very complex and so
we use Markov chain Monte Carlo (MCMC) to obtain
estimates of the posterior statistics of
interest. - However, in our case the likelihood is
analytically intractable, due to the missing
covariate values.
15Auxiliary Variables
- We treat the missing covariate values (vmis) as
parameters or auxiliary variables (AVs). - We then form the joint posterior distribution
over the parameters, ?, and AVs, given the
capture histories c and observed covariate values
yobs - ?(?, vmis c, vobs) Ç L(c ?, vmis , vobs)
- f(vmis, vobs ?) p(?)
- We can now sample from the joint posterior
distribution ?(?, vmis c, vobs). - We can integrate out over the missing covariate
values, vmis, within the MCMC algorithm to obtain
a sample from ?(? c, vobs).
16f(vmis, vobs ?) Categorical Data
- For categorical data (coat type and horn type),
we assume the following model. - Let y1,i denote the horn type of individual i.
Then, y1,i 2 1,2,3, and we assume that, - y1,i Multinomial(1,q),
- where q q1, q2, q3.
- Thus, we have additional parameters, q, which can
be regarded as the underlying probability of each
horn type. - The qs are updated within the MCMC algorithm, as
well as the y1,is which are unknown. - We assume the analogous model for coat type.
17f(vmis, vobs ?) Continuous Data
- Let y3,i denote the birth weight of individual i.
- Then, a possible model is to assume that,
- y3,i N(?, ?2),
- where ? and ?2 are to be estimated.
- For the weight of individual i, aged a at time t,
denoted by z1,i,t we set, - z1,i,t N(z1,i,t-1 ?a ?t, ?w2),
- where the parameters ?a, ?t and ?w2 are to be
estimated. - In general, the modelling assumptions will depend
on the system under study.
18Practical Implications
- Within each step of the MCMC algorithm, we not
only need to update the parameter values ?, but
also impute the missing covariate values and
random effects (if present). - This can be computationally expensive for large
amounts of missing data. - The posterior results may depend on the
underlying model for the covariates a
sensitivity analysis can be performed using
different underlying models. - (State-space modelling can also be implemented
using similar ideas see Steves talk).
19Issue 2 Model Selection
- For the sheep data set we can now deal with the
issue of missing covariate values. - But.. how do we decide which covariates to use
and/or the age structure? often there may be a
large number of possible covariates and/or age
structures. - Discriminating between different models can often
be of particular interest, since they represent
competing biological hypotheses. - Model choice can also be important for future
predictions of the system.
20Possible Models
- We only want to consider biologically plausible
models. - We have uncertainty on the age structure of the
survival rates, and consider models of the form - ?1a ?a1b ?j
- k is unknown a priori and also the covariate
dependence for each age group. - We consider similar age-type models for p and ?
with possible arbitrary time dependence. - E.g. ?1(N,BW), ?27(W,L),?8(N,P)/p(t)/?1,?2(t)
- The number of possible models is immense!!
21Bayesian Approach
- We treat the model itself to be an unknown
parameter to be estimated. - Then, applying Bayes Theorem we obtain the
posterior distribution over both parameter and
model space - ?(?m, m data) Ç L(data ?m, m) p(?m) p(m).
- Here ?m denotes the parameters in model m.
22Reversible Jump MCMC
- The reversible jump (RJ)MCMC algorithm allows us
to construct a Markov chain with stationary
distribution equal to the posterior distribution. - It is simply an extension of the
Metropolis-Hastings algorithm that allows moves
between different dimensions. - This algorithm is needed because the number of
parameters, ?m, in model m, may differ between
models. - Note that this algorithm needs only one Markov
chain, regardless of the number of models.
23Posterior Model Probabilities
- Posterior model probabilities are able to
quantitatively compare different models. - The posterior probability of model m is defined
as, - ?(m data) s ?(?m,m data) d?m.
- These are simply estimated within the RJMCMC
algorithm as the proportion of the time the chain
is in the given model. - We are also able to obtain model-averaged
estimates of parameters, which takes into account
both parameter and model uncertainty.
24General Comments
- The RJMCMC algorithm is the most widely used
algorithm to explore and summarise a posterior
distribution defined jointly over parameter and
model space. - The posterior model probabilities can be
sensitive to the priors specified on the
parameters (p(?)). - The acceptance probabilities for reversible jump
moves are typically lower than MH updates. - Longer simulations are generally needed to
explore the posterior distribution. - Only a single Markov chain is necessary,
irrespective of the number of possible models!!
25Problem 1
- Constructing efficient RJ moves can be difficult.
- This is particularly the case when updating the
number of age groups for the survival rates. - This step involves
- Proposing a new age structure (typically
add/remove a change-point) - Proposing a covariate dependence for the new age
structure - Proposing new parameter values for this new
model. - It can be very difficult to construct the Markov
chain with (reasonably) high acceptance rates.
26Example Reversibility Problem
- One obvious (and tried!) method for adding a
change-point would be as follows. - Suppose that we propose to split age group ac.
- We propose new age groups ab and b1c.
- Consider a small perturbation for each (non-zero)
regression parameter, e.g., - ?ab ?ac ? and ?b1c ?ac - ?.
- where ? N(0,??2).
- However, to satisfy the reversibility
constraints, a change-point can only be removed
when the covariate dependence structure is the
same for two consecutive age groups
27Example High Posterior Mass
- An alternative proposal would be to set
- ?ab ?ac (i.e. keep all parameters same for
ab). - Propose a model (in terms of covariates) for
?b1c - Choose each possible model with equal probability
(reverse move always possible) - Choose model that differs by at most one
individual and one environmental covariate. - The problem now lies in proposing sensible
parameter values for the new model. - One approach is to use posterior estimates of the
parameters from a saturated model (full
covariate dependence for some age structure) as
the mean of the proposal distribtion.
28Problem 2
- Consider the missing covariates vmis.
- Then, if the covariate is present, we have,
- ?(vmis vobs, ?, data) / L(data ?, vobs,
vmis) - f(vmis, vobs ?).
- However, if the covariate is not present in the
model, we have, - ?(vmis vobs, ?, data) / f(vmis, vobs ?).
- Thus, adding (or removing) the covariate values
may be difficult, due to (potentially) fairly
different posterior conditional distributions. - One way to avoid this is to simultaneously update
the missing covariate values in the model move.
29Soay Sheep Analysis
- We now use these techniques for analysing the
(complex) Soay sheep data. - Discussion with experts identified several points
of particular interest - the age-dependence of the parameters
- identification of the covariates influencing the
survival rates - whether random effects are present.
- We focus on these issues on the results presented.
30Prior Specification
- We place vague priors on the parameters present
in each model. - Priors also need to be specified on the models.
- Placing an equal prior on each model places a
high prior mass on models with a large number of
age groups, since the number of models increases
with the number of age groups. - Thus, we specify an equal prior probability on
each marginal age structure and a flat prior over
the covariate dependence given the age structure
or on time-dependence.
31Results Survival Rates
- The marginal models for the age groups with
largest posterior support are - Note that with probability 1, lambs have a
distinct survival rate. - Often the models with most posterior support are
close neighbours of each other.
32Covariate Dependence
BF 3
BF 20
BF Bayes factor
33Influence of Covariates
Weight
Population size
Age 1
Age 10
34Marginalisations
- Presenting only marginal results can hide some of
the more intricate details. - This is most clearly seen from another MRR data
analysis relating to the UK lapwing population. - There are two covariates time and fdays a
measure of the harshness of the winter. - Without going into too many details, we had MRR
data and survey data (estimates of total
population size). - An integrated analysis was performed, jointly
analysing both data sets.
35Marginal Models
- The marginal models with most posterior support
for the demographic parameters are - (a) ?1 1st year survival (b) ?a adult
survival - Model Post. prob. Model Post. prob.
- ?1(fdays) 0.636 ?a(fdays,t) 0.522
- ?1 0.331 ?a(fdays) 0.407
- (c) ? productivity (d) l recovery rate
- Model Post. prob. Model Post. prob.
- r 0.497 l(t) 0.730
- r(t) 0.393 l(fdays,t) 0.270
This is joint work with Steve Brooks, Chiara
Mazzetta and Steve Freeman
36Warning!
- The previous (marginal) posterior estimates hides
some of the intricate details. - The marginal models of the adult survival rate
and productivity rate are highly correlated. - In particular, the joint posterior probabilities
for these parameters are - Model Post. prob.
- ?a(fdays,t), ? 0.45
- ?a(fdays), ?(t) 0.36
- Thus, there is strong evidence that either adult
survival rate OR productivity rate is time
dependent.
37Summary
- Bayesian techniques are widely used, and allow
complex data analyses. - Covariates can be used to explain both temporal
and individual heterogeneity. - However, missing values can add another layer of
complexity and the need to make additional
assumptions. - The RJMCMC algorithm can explore the possible
models and discriminate between competing
biological hypotheses. - The analysis of the Soay sheep has stimulated new
discussion with biologists, in terms of the
factors that affect their survival rates.
38Further Research
- The development of efficient and generic RJMCMC
algorithms. - Assessing the posterior sensitivity on the model
specification for the covariates. - Investigation of the missing at random assumption
for the covariates in both classical and Bayesian
frameworks. - Prediction in the presence of time-varying
covariate information.