Title: Likelihood Methods in Ecology
1. Likelihood Methods in Ecology
- November 16th to 20th, 2009
- Millbrook, NY
- Instructors
- Charles Canham and María Uriarte
- Teaching Assistant
- Liza Comita
2. Daily Schedule
- Morning
- 8:30 - 9:20 Lecture
- 9:20 - 10:10 Case Study or Discussion
- 10:30 - 12:00 Lab
- Lunch 12:00 - 1:30 (in this room)
- Afternoon
- 1:30 - 2:20 Lecture
- 2:20 - 3:10 Lab
- 3:30 - 5:00 Lab
3. Course Outline: Statistical Inference Using Likelihood
- Principles and practice of maximum likelihood estimation
- Know your data: choosing appropriate likelihood functions
- Formulate statistical models as alternate hypotheses
- Find the ML estimates of the parameters of your models
- Compare alternate models and choose the most parsimonious
- Evaluate individual models
- Advanced topics
Likelihood is much more than a statistical
method... (it can completely change the way you
ask and answer questions)
4. Lecture 1: An Introduction to Likelihood Estimation
- Probability and probability density functions
- Maximum likelihood estimates (versus traditional method of moments estimates)
- Statistical inference
- Classical frequentist statistics: limitations and mental gyrations...
- The likelihood alternative: basic principles and definitions
- Model comparison as a generalization of hypothesis testing
5. A simple definition of probability for discrete events...
...the ratio of the number of events of type A to the total number of all possible events (outcomes). The enumeration of all possible outcomes is called the sample space (S). If there are n possible outcomes in a sample space, S, and m of those are favorable for event A, then the probability of event A is given as
P(A) = m/n
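As a concrete illustration (a fair six-sided die, not an example from the course itself), the same calculation in R:
S <- 1:6                  # sample space: n = 6 possible outcomes
A <- c(2, 4, 6)           # event A: an even roll, m = 3 favorable outcomes
length(A) / length(S)     # P(A) = m/n = 0.5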
6. Probability defined more generally...
- Consider an outcome X from some process that has a set of possible outcomes S
- If X and S are discrete, then P(X) = X/S
- If X is continuous, then the probability has to be defined in the limit
Where g(x) is a probability density function (PDF)
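In the usual textbook form, the probability of a continuous outcome falling in a small interval is the area under the density over that interval:
P(x \le X \le x + \Delta x) = \int_{x}^{x + \Delta x} g(u)\, du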
7. The Normal Probability Density Function (PDF)
μ = mean, σ² = variance
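For reference, the normal density is:
g(x \mid \mu, \sigma^2) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x - \mu)^2}{2\sigma^2} \right)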
- Properties of a PDF
- (1) g(x) ≥ 0 (and, for discrete distributions, g(x) ≤ 1)
- (2) ∫ g(x) dx = 1
8. Common PDFs...
- For continuous data
- Normal
- Lognormal
- Gamma
- For discrete data
- Poisson
- Binomial
- Multinomial
- Negative Binomial
See McLaughlin (1993), "A compendium of common probability distributions", in the reading list.
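Each of these has a built-in density (or probability mass) function in base R; a quick reference sketch, with arbitrary example values for the arguments:
dnorm(1, mean = 0, sd = 1)                        # Normal
dlnorm(1, meanlog = 0, sdlog = 1)                 # Lognormal
dgamma(1, shape = 2, rate = 1)                    # Gamma
dpois(3, lambda = 2)                              # Poisson
dbinom(3, size = 10, prob = 0.3)                  # Binomial
dmultinom(c(2, 3, 5), prob = c(0.2, 0.3, 0.5))    # Multinomial
dnbinom(3, size = 2, mu = 4)                      # Negative binomial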
9. Why are PDFs important?
Answer: because they are used to calculate likelihood (and in that case, they are called likelihood functions).
10. Statistical Estimators
A statistical estimator is a function, applied to a sample of data, that is used to estimate an unknown population parameter (and an estimate is just the result of applying an estimator to a sample).
11. Properties of Estimators
- Some desirable properties of point estimators (functions to estimate a fixed parameter):
- Bias: if the average error is zero, the estimate is unbiased
- Efficiency: an estimate with the minimum variance is the most efficient (note: the most efficient estimator is often biased)
- Consistency: as sample size increases, the probability of the estimate being close to the parameter increases
- Asymptotically normal: a consistent estimator whose distribution around the true parameter θ approaches a normal distribution, with standard deviation shrinking in proportion to 1/√n as the sample size n grows
12. Maximum likelihood (ML) estimates versus method of moments (MOM) estimates
Bottom line: MOM was born in the time before computers, and was OK; ML needs computing power, but has more desirable properties.
13. Doing it MOM's way: Central Moments
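For reference, the k-th sample central moment is taken around the sample mean:
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k
so m_2 is the (uncorrected) sample variance.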
14. What's wrong with MOM's way?
- Nothing, if all you are interested in is calculating properties of your sample
- But MOM's formulas are generally not the best way [1] to infer estimates of the statistical properties of the population from which the sample was drawn
- For example: population variance (because the second central moment is a biased underestimate of the population variance)
- [1] in the formal terms of bias, efficiency, consistency, and asymptotic normality
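A quick numerical check of that bias correction (the sample below is hypothetical, purely for illustration):
x <- c(2, 4, 4, 4, 5, 5, 7, 9)              # hypothetical sample
n <- length(x)
sum((x - mean(x))^2) / n                    # MOM: second central moment, divides by n (4.0)
var(x)                                      # R's var() divides by n - 1 (about 4.57)
(sum((x - mean(x))^2) / n) * n / (n - 1)    # the n/(n - 1) correction recovers var(x)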
15. The Maximum Likelihood alternative
Going back to PDFs: in plain language, a PDF allows you to calculate the probability that an observation will take on a value (x), given the underlying (true?) parameters of the population.
16. But there's a problem
- The PDF defines the probability of observing an outcome (x), given that you already know the true population parameter (θ)
- But we want to generate an estimate of θ, given our data (x)
- And, unfortunately, the two are not identical
17. Fisher and the concept of Likelihood...
The Likelihood Principle
In plain English: the likelihood (L) of the parameter estimates (θ), given a sample (x), is proportional to the probability of observing the data, given the parameters... and this probability is something we can calculate, using the appropriate underlying probability model (i.e. a PDF).
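In symbols:
L(\theta \mid x) \propto P(x \mid \theta)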
18. R. A. Fisher (1890-1962)
http://www.economics.soton.ac.uk/staff/aldrich/fisherguide/problik.htm
"Likelihood and Probability in R. A. Fisher's Statistical Methods for Research Workers" (John Aldrich): a good summary of the evolution of Fisher's ideas on probability, likelihood, and inference. Contains links to PDFs of Fisher's early papers. A second page shows the evolution of his ideas through changes in successive editions of Fisher's books.
(Photo: Fisher at age 22)
19. Calculating Likelihood and Log-Likelihood for Datasets
From basic probability theory: if two events (A and B) are independent, then P(A,B) = P(A)P(B).
More generally, for i = 1..n independent observations and a vector X of observations (x_i), the likelihood of the dataset is the product of the likelihoods of the individual observations.
But logarithms are easier to work with, so we sum log-likelihoods instead...
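In the usual notation:
L(\theta \mid X) = \prod_{i=1}^{n} g(x_i \mid \theta), \qquad \ln L(\theta \mid X) = \sum_{i=1}^{n} \ln g(x_i \mid \theta)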
20. Likelihood Surfaces
The variation in likelihood for any given set of parameter values defines a likelihood surface...
For a model with just one parameter, the surface is simply a curve (a.k.a. a likelihood profile).
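A minimal R sketch of such a surface for a two-parameter model (a normal mean and sd); the data vector and parameter grids here are made up purely for illustration:
y     <- c(4.1, 5.3, 4.8, 6.0, 5.2, 4.6)    # hypothetical observations
mu    <- seq(3, 7, 0.05)                    # candidate means
sigma <- seq(0.2, 2, 0.05)                  # candidate standard deviations
loglik <- outer(mu, sigma,
                Vectorize(function(m, s) sum(dnorm(y, mean = m, sd = s, log = TRUE))))
contour(mu, sigma, loglik, xlab = "mean", ylab = "sd")   # the log-likelihood surface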
21. Support and Support Limits
Log-likelihood = "Support" (Edwards 1992)
22. A (somewhat trivial) example
- MOM vs ML estimates of the probability of survival for a population
- Data: a quadrat in which 16 of 20 seedlings survived during a census interval. (Note that in this case, the quadrat is the unit of observation, so sample size = 1)
i.e. Given N = 20, x = 16, what is p?
x <- seq(0, 1, 0.005)      # candidate values of p
y <- dbinom(16, 20, x)     # likelihood of each p, given 16 of 20 survived
plot(x, y)
x[which.max(y)]            # the value of p that maximizes the likelihood
23. A more realistic example
# Create some data (5 quadrats)
N <- c(11, 14, 8, 22, 50)
x <- c(8, 7, 5, 17, 35)
# Calculate the log-likelihood for each probability of survival
p <- seq(0, 1, 0.005)
log_likelihood <- rep(0, length(p))
for (i in 1:length(p)) {
  log_likelihood[i] <- sum(log(dbinom(x, N, p[i])))
}
# Plot the likelihood profile
plot(p, log_likelihood)
# What probability of survival maximizes the log-likelihood?
p[which.max(log_likelihood)]   # 0.685
# How does this compare to the average across the 5 quadrats?
mean(x/N)                      # 0.665
24. Focus in on the MLE
What is the log-likelihood of the MLE?
max(log_likelihood)   # [1] -9.46812
- Things to note about log-likelihoods:
- They should always be negative! (if not, you have a problem with your likelihood function)
- The absolute magnitude of the log-likelihood increases as sample size increases
25. An example with continuous data
The normal PDF: x = observed value, μ = mean, σ² = variance
In R: dnorm(x, mean = 0, sd = 1, log = FALSE)
> dnorm(2, 2.5, 1)
[1] 0.3520653
> dnorm(2, 2.5, 1, log = TRUE)
[1] -1.043939
Problem: now there are TWO unknowns needed to calculate likelihood (the mean and the variance)!
Solution: treat the variance just like another parameter in the model, and find the ML estimate of the variance just like you would any other parameter (this is exactly what you'll do in the lab this morning).
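A minimal sketch of that idea, using optim() to maximize the log-likelihood over both parameters at once (the data vector and starting values here are hypothetical, not the lab dataset):
y <- c(2.1, 3.4, 2.8, 3.9, 3.1, 2.5)                        # hypothetical observations
negLL <- function(par) -sum(dnorm(y, mean = par[1], sd = par[2], log = TRUE))
fit <- optim(c(mean(y), sd(y)), negLL)                      # numerical search over (mean, sd)
fit$par                                                     # ML estimates of the mean and sd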