Title: Statistical Inference
1. Statistical Inference
2. Parametric perspective on inference
[Diagram: Data, Scientific Model, and Probability Model combine to yield Inference]
- Scientific Model: a hypothesis test, often with linear models
- Probability Model: typically normal
3. Likelihood perspective on inference
[Diagram: Data, Probability Model, and Scientific Model (the hypothesis) combine to yield Inference]
4. An example...
The Data: x_i = measurements of DBH on 50 trees; y_i = measurements of crown radius on those trees.
The Scientific Model: y_i = a + b x_i + e (a linear relationship, with two parameters (a, b) and an error term e, the residuals).
The Probability Model: e is normally distributed, with E[e] = 0 and variance estimated from the observed variance of the residuals.
5. The triangle of statistical inference: Models
- Models clarify our understanding of nature.
- They help us understand the importance (or unimportance) of individual processes and mechanisms.
- Since they are not hypotheses, they can never be correct.
- We don't reject models; we assess their validity.
- We establish what's true by establishing which model the data support.
6. The triangle of statistical inference: Probability distributions
- Data are never clean.
- Most models are deterministic: they describe the average behavior of a system, but not the noise or variability. To compare models with data, we need a statistical model that describes the variability.
- We must understand the processes giving rise to variability in order to select the correct probability density function (error structure) for that variability or noise.
7. An example: Can we predict crown radius using tree diameter?
The Data: x_i = measurements of DBH on 50 trees; y_i = measurements of crown radius on those trees.
The Scientific Model: y_i = a + b DBH_i + e
The Probability Model: e is normally distributed.
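As a minimal sketch of this example in Python (the data here are simulated and the parameter values are illustrative, not from the slides), we can encode the scientific model and the probability model, recover a and b by ordinary least squares, and estimate the error variance from the residuals:

```python
import numpy as np

rng = np.random.default_rng(42)

# Scientific model: crown_radius = a + b * DBH + e
# Probability model: e ~ Normal(0, sigma^2)
a_true, b_true, sigma = 1.0, 0.08, 0.5   # illustrative parameter values
dbh = rng.uniform(10, 80, size=50)        # DBH for 50 trees
crown_radius = a_true + b_true * dbh + rng.normal(0, sigma, size=50)

# Fit the linear (scientific) model by ordinary least squares
X = np.column_stack([np.ones_like(dbh), dbh])
(a_hat, b_hat), *_ = np.linalg.lstsq(X, crown_radius, rcond=None)

# Estimate the error variance from the residuals (the probability model)
residuals = crown_radius - (a_hat + b_hat * dbh)
sigma2_hat = residuals.var(ddof=2)        # ddof=2: two fitted parameters

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, residual variance = {sigma2_hat:.3f}")
```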
8. Why do we care about probability?
- It is the foundation of the theory of statistics.
- It describes uncertainty (error):
  - Measurement error
  - Process error
- It is needed to understand likelihood theory, which is required for:
  - Estimating model parameters.
  - Model selection (which hypothesis do the data support?).
9. Error (noise, variability) is your friend!
- Classical statistics are built around the assumption that the variability is normally distributed.
- But normality is in fact rare in ecology.
- Non-normality is an opportunity to:
  - Represent variability in a more realistic way.
  - Gain insights into the process of interest.
10. The likelihood framework
1. Ask a biological question.
2. Collect data.
3. Ecological model: model the signal.
4. Probability model: model the noise.
5. Model selection.
6. Estimate parameters.
7. Estimate support regions.
8. Answer the question.
(Bolker, notes)
11. Probability Concepts
- An experiment is an operation with an uncertain outcome.
- A sample space is the set of all possible outcomes of an experiment.
- An event is a particular outcome of an experiment, a subset of the sample space.
12. Random Variables
- A random variable is a function that assigns a numeric value to every outcome (event) or sample of an experiment. For instance:
Tree growth = f(DBH, light, soil)
13. Functions and probability density functions
A function is a formula expressing a relationship between two variables. All PDFs are functions, BUT NOT all functions are PDFs (we will return to this later).
Functions correspond to the scientific model; PDFs correspond to the probability model.
14. Probability Density Functions: properties
- A function that assigns probabilities to ALL the possible values of a random variable (x).
[Figure: probability density f(x) plotted against x]
15. Probability Density Functions: Expectations
- The expectation of a random variable x is the weighted average of the possible values that x can take, each value weighted by the probability that x assumes it.
- Analogous to the center of gravity; it is the first moment of the distribution.
Example: x takes the values -1, 0, 1, and 2 with p(-1) = 0.10, p(0) = 0.25, p(1) = 0.30, and p(2) = 0.35.
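Working through the example above, the expectation is the probability-weighted sum of the values:
E[x] = (-1)(0.10) + (0)(0.25) + (1)(0.30) + (2)(0.35) = 0.90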
16. Probability Density Functions: Variance
- The variance of a random variable reflects the spread of X values around the expected value.
- It is the second (central) moment of the distribution.
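Concretely, Var[x] = E[(x - E[x])^2] = E[x^2] - (E[x])^2. Continuing the example from the previous slide:
E[x^2] = (1)(0.10) + (0)(0.25) + (1)(0.30) + (4)(0.35) = 1.80
Var[x] = 1.80 - 0.90^2 = 0.99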
17. Probability Distributions
- A function that assigns probabilities to the possible values of a random variable (X).
- They come in two flavors:
  - DISCRETE: outcomes are a set of discrete possibilities, such as integers (e.g., counts).
  - CONTINUOUS: a probability distribution over a continuous range (the real numbers or the non-negative real numbers).
18. Probability Mass Functions
For a discrete random variable X, the probability that X takes on a value x is given by a discrete density function f(x), also known as the probability mass function (or distribution function).
19. Probability Density Functions: Continuous variables
A probability density function f(x) gives the probability that a random variable X takes on values within a range:
P(a < X < b) = ∫_a^b f(x) dx
f(x) ≥ 0 for all x
∫_{-∞}^{+∞} f(x) dx = 1
[Figure: area under f(x) between a and b]
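As an illustrative sketch (the distributions and bounds below are arbitrary choices, not taken from the slides), scipy can evaluate a probability mass function at a point, and can verify numerically that a density is non-negative, integrates to 1, and yields P(a < X < b) as an area:

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Discrete example: P(X = 3) for a binomial random variable
print(stats.binom.pmf(3, n=20, p=0.5))          # probability mass at x = 3

# Continuous example: a normal PDF with mean 0 and sd 1
f = stats.norm(loc=0, scale=1).pdf

# Property 1: f(x) >= 0 everywhere (spot-check on a grid)
xs = np.linspace(-10, 10, 1001)
assert np.all(f(xs) >= 0)

# Property 2: the PDF integrates to 1 over the whole real line
total, _ = quad(f, -np.inf, np.inf)
print(total)                                     # ~1.0

# P(a < X < b) is the area under f between a and b
a, b = -1, 1
area, _ = quad(f, a, b)
print(area)                                      # ~0.68 for the standard normal
```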
20. Some rules of probability (assuming independence)
[Venn diagram: events A and B]
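For two independent events A and B, the standard rules (presumably those shown in the diagram) are:
P(A and B) = P(A) P(B)
P(A or B) = P(A) + P(B) - P(A) P(B)
P(not A) = 1 - P(A)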
21. Real data: Histograms
22. Histograms and PDFs
Probability density functions approximate the
distribution of finite data sets.
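A minimal sketch of this idea (the data are simulated and the sample size is arbitrary): overlay a normal PDF, with parameters estimated from the sample, on a density-scaled histogram.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(loc=5, scale=2, size=200)      # simulated measurements

# Density-scaled histogram of the finite sample
plt.hist(data, bins=20, density=True, alpha=0.5, label="histogram")

# Normal PDF with parameters estimated from the sample
mu, sd = data.mean(), data.std(ddof=1)
xs = np.linspace(data.min(), data.max(), 200)
plt.plot(xs, stats.norm.pdf(xs, mu, sd), label="fitted normal PDF")

plt.legend()
plt.show()
```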
23. Uses of Frequency Distributions
- Empirical (frequentist):
  - Make predictions about the frequency of a particular event.
  - Judge whether an observation belongs to a population.
- Theoretical:
  - Make predictions about the distribution of the data based on some basic assumptions about the nature of the forces acting on a particular biological system.
  - Describe the randomness in the data.
24. Some useful distributions
- Discrete:
  - Binomial: two possible outcomes.
  - Poisson: counts.
  - Negative binomial: counts.
  - Multinomial: multiple categorical outcomes.
- Continuous:
  - Normal
  - Lognormal
  - Exponential
  - Gamma
  - Beta
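All of these distributions are available in scipy.stats; a quick sketch of drawing random values from each (the parameter values are arbitrary):

```python
from scipy import stats

# Discrete
stats.binom(n=20, p=0.5).rvs(10)        # binomial: two possible outcomes per trial
stats.poisson(mu=3).rvs(10)             # Poisson: counts
stats.nbinom(n=2, p=0.4).rvs(10)        # negative binomial: overdispersed counts
stats.multinomial(n=10, p=[0.2, 0.3, 0.5]).rvs(5)   # multinomial: categorical outcomes

# Continuous
stats.norm(loc=0, scale=1).rvs(10)      # normal
stats.lognorm(s=0.5, scale=1).rvs(10)   # lognormal
stats.expon(scale=2).rvs(10)            # exponential
stats.gamma(a=2, scale=1).rvs(10)       # gamma
stats.beta(a=2, b=5).rvs(10)            # beta
```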
25. An example: Seed predation
x = number of seeds taken between time t1 and time t2, out of N available (so x can range from 0 to N).
Assume each seed has an equal probability p of being taken. Then:
P(x) = (N choose x) p^x (1 - p)^(N - x)
where the binomial coefficient (N choose x) is the normalization constant.
26. Zero-inflated binomial
27. Binomial distribution: Discrete events that can take one of two values
[Figure: binomial distribution with n = 20, p = 0.5]
E[x] = np; Variance = np(1 - p)
n = number of sites; p = probability of survival
Example: probability of survival derived from population data
28. Binomial distribution
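A quick numerical check of the mean and variance formulas from the previous slide, using its n = 20 and p = 0.5 (a sketch, not part of the original slides):

```python
from scipy import stats

n, p = 20, 0.5
dist = stats.binom(n, p)
print(dist.mean(), n * p)            # both 10.0
print(dist.var(), n * p * (1 - p))   # both 5.0
print(dist.pmf(10))                  # probability of exactly 10 "successes" out of 20
```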
29. Poisson Distribution: Counts (or getting hit in the head by a horse)
k = number of seedlings; λ = arrival rate
Alternative parameterization: λ = rt
[Figure: histogram of number of seedlings per quadrat (0-7), with count and proportion per bar]
30. Poisson distribution
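For reference, the Poisson probability mass function with rate λ is:
P(X = k) = e^(-λ) λ^k / k!,  k = 0, 1, 2, ...
with E[X] = Var[X] = λ. Under the alternative parameterization λ = rt, r is usually read as a rate per unit time (or area) and t as the amount of time (or area) sampled.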
31. Example: Number of seedlings in census quadrats
Alchornea latifolia
[Figure: histogram of number of seedlings per trap (0-100), with count and proportion per bar]
(Data from LFDP, Puerto Rico)
32. Clustering in space or time
33. Negative binomial (Tables 4.2-4.3 in H&M): Bycatch data
E[X] = 0.279; Var[X] = 1.56
The variance far exceeds the mean, which suggests temporal or spatial aggregation in the data.
34. Negative Binomial: Counts
[Figure: negative binomial histogram of number of seeds (0-50), with count and proportion per bar]
35. Negative Binomial: Counts
[Figure: another negative binomial histogram of number of seeds (0-50), with count and proportion per bar]
36. Negative binomial
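In the (μ, k) parameterization commonly used for ecological count data, the negative binomial probability mass function is:
P(X = x) = Γ(k + x) / (Γ(k) x!) · (k / (k + μ))^k (μ / (k + μ))^x
with E[X] = μ and Var[X] = μ + μ²/k; smaller k means stronger aggregation, and as k → ∞ the distribution approaches the Poisson.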
37. Negative Binomial: Count data
Prestoea acuminata
[Figure: histogram of number of seedlings per quadrat (0-100), with count and proportion per bar]
(Data from LFDP, Puerto Rico)
38. Normal Distribution
E[x] = μ; Variance = σ²
[Figure: normal PDF with mean 0]
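For reference, the normal probability density with mean μ and variance σ² is:
f(x) = 1 / (σ √(2π)) · exp(-(x - μ)² / (2σ²))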
39. Normal distribution with increasing variance
40. Lognormal: One tail and no negative values
[Figure: lognormal PDF, f(x) versus x]
41. Lognormal: Radial growth data
(Data from Date Creek, British Columbia)
42. Exponential
[Figure: exponential distribution, count versus variable]
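For reference, the exponential density with rate λ is f(x) = λ e^(-λx) for x ≥ 0, with mean 1/λ and variance 1/λ².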
43. Exponential: Growth data (negative values set to 0)
(Data from BCI, Panama)
44. Gamma: One tail and flexibility
45. Gamma: Raw growth data
[Figure: raw growth (mm/yr) for Alseis blackiana and Cordia bicolor]
(Data from BCI, Panama)
46. Beta distribution
47. Beta: Light interception by crown trees
(Data from Luquillo, PR)
48. Mixture models
- What do you do when your data don't fit any known distribution?
  - Add covariates.
  - Use mixture models:
    - Discrete
    - Continuous
49. Discrete mixtures
50. Discrete mixture: Zero-inflated binomial
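A minimal sketch of the zero-inflated binomial (the parameter names and values are illustrative): with probability q the outcome is a structural zero, and with probability 1 - q it is an ordinary binomial draw.

```python
from scipy import stats

def zib_pmf(x, q, n, p):
    """Zero-inflated binomial: mixture of a point mass at 0 (weight q)
    and a Binomial(n, p) (weight 1 - q)."""
    pmf = (1 - q) * stats.binom.pmf(x, n, p)
    if x == 0:
        pmf += q
    return pmf

# Example: 30% structural zeros on top of Binomial(n=10, p=0.4)
print(zib_pmf(0, q=0.3, n=10, p=0.4))
print(sum(zib_pmf(x, 0.3, 10, 0.4) for x in range(11)))  # sums to 1
```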
51. Continuous (compounded) mixtures
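A sketch of one classic compounded mixture (the parameter values are illustrative): Poisson counts whose rate is itself gamma-distributed across observations; marginally, such counts follow a negative binomial.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Compounded mixture: Poisson counts whose rate varies among observations.
k, mu = 0.5, 2.0                                       # illustrative shape and mean
lam = rng.gamma(shape=k, scale=mu / k, size=100_000)   # heterogeneous rates
counts = rng.poisson(lam)                              # one Poisson draw per rate

# The marginal distribution of such counts is the negative binomial.
p = k / (k + mu)
print(counts.mean(), mu)                               # both ~2.0
print((counts == 0).mean(), stats.nbinom.pmf(0, k, p)) # similar zero fractions
```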
52. The Method of Moments
- Match the sample moments of the data to the theoretical moments of the distribution and solve for the parameters.
- Recall the definitions of the first moment (expectation) and second moment (variance) above.
- The MOM is a good way to get a first (but biased) estimate of the parameters of a distribution; ML estimators are more reliable.
53. MOM: Negative binomial
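A sketch of the method of moments for the negative binomial in the (μ, k) parameterization, using the sample moments from the bycatch slide: set μ̂ equal to the sample mean and solve Var[X] = μ + μ²/k for k, giving k̂ = x̄² / (s² - x̄).

```python
# Method-of-moments estimates for the negative binomial (mu, k parameterization)
mean, var = 0.279, 1.56        # sample moments from the bycatch data

mu_hat = mean                  # match the first moment: E[X] = mu
k_hat = mean**2 / (var - mean) # match the second: Var[X] = mu + mu^2 / k

print(mu_hat, k_hat)           # k_hat ~ 0.061, i.e. strongly aggregated
```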