Statistical Inference - PowerPoint PPT Presentation

Slides: 54
Provided by: mariau
Learn more at: http://www.sortie-nd.org
1
Statistical Inference
2
Parametric perspective on inference
Data
Inference
Scientific Model (Hypothesis test), often with
linear models
Probability Model (typically Normal)
3
Likelihood perspective on inference
Data
Inference
Probability Model
Scientific Model (hypothesis)
4
An example...
The Data: $x_i$ = measurements of DBH (diameter at breast height) on 50 trees; $y_i$ = measurements of crown radius on those trees
The Scientific Model: $y_i = a + b x_i + e$ (linear relationship, with 2 parameters (a, b) and an error term (e) (the residuals))
The Probability Model: e is normally distributed, with $E[e] = 0$ and variance estimated from the observed variance of the residuals...
5
The triangle of statistical inference: Model
  • Models clarify our understanding of nature.
  • They help us understand the importance (or
    unimportance) of individual processes and
    mechanisms.
  • Since they are not hypotheses, they can never be
    correct.
  • We don't reject models; we assess their
    validity.
  • We establish what's true by establishing which
    model the data support.

6
The triangle of statistical inference: Probability
distributions
  • Data are never clean.
  • Most models are deterministic: they describe the
    average behavior of a system but not the noise or
    variability. To compare models with data, we need
    a statistical model that describes the
    variability.
  • We must understand the processes giving rise
    to variability in order to select the correct
    probability density function (error structure)
    for that variability or noise.

7
An example: Can we predict crown radius using tree diameter?
The Data: $x_i$ = measurements of DBH on 50 trees; $y_i$ = measurements of crown radius on those trees
The Scientific Model: $y_i = a + b \cdot \mathrm{DBH}_i + e$
The Probability Model: e is normally distributed.
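A minimal sketch of how the crown radius example could be fitted, assuming ordinary least squares for the scientific model and a normal error term. The data are simulated stand-ins, not the 50 trees from the slide; `TRUE_A`, `TRUE_B`, `TRUE_SD`, and the helper names are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares estimates of intercept a and slope b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def residual_variance(x, y, a, b):
    """Variance of the residuals, dividing by n - 2 for the two fitted parameters."""
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return sum(r ** 2 for r in residuals) / (len(x) - 2)


# Simulated stand-in for the 50 trees (all numbers below are made up).
rng = random.Random(1)
TRUE_A, TRUE_B, TRUE_SD = 0.5, 0.08, 0.4
dbh = [rng.uniform(5, 60) for _ in range(50)]  # DBH in cm
radius = [TRUE_A + TRUE_B * d + rng.gauss(0, TRUE_SD) for d in dbh]  # crown radius in m

a_hat, b_hat = fit_line(dbh, radius)
s2 = residual_variance(dbh, radius, a_hat, b_hat)
print(f"a = {a_hat:.3f}, b = {b_hat:.3f}, residual variance = {s2:.3f}")
```

The residual variance is the estimate of the error variance that the probability model refers to.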
8
Why do we care about probability?
  • Foundation of the theory of statistics.
  • Description of uncertainty (error):
      • Measurement error
      • Process error
  • Needed to understand likelihood theory, which is
    required for:
      • Estimating model parameters.
      • Model selection (What hypothesis do the data
        support?).

9
Error (noise, variability) is your friend!
  • Classical statistics are built around the
    assumption that the variability is normally
    distributed.
  • But normality is in fact rare in ecology.
  • Non-normality is an opportunity to:
      • Represent variability in a more realistic way.
      • Gain insights into the process of interest.

10
The likelihood framework
Ask biological question
Collect data
Ecological Model: model the signal
Probability Model: model the noise
Model selection
Estimate parameters
Estimate support regions
Answer questions
Bolker, Notes
11
Probability Concepts
  • An experiment is an operation with uncertain
    outcome.
  • A sample space is the set of all possible outcomes
    of an experiment.
  • An event is a particular outcome, or set of
    outcomes, of an experiment: a subset of the
    sample space.

12
Random Variables
  • A random variable is a function that assigns a
    numeric value to every outcome of an experiment
    (event) or sample. For instance:

Tree Growth = f(DBH, light, soil)
13
Functions and probability density functions
Function: a formula expressing a relationship
between two variables.
All PDFs are functions, BUT NOT all functions are PDFs.
WE WILL TALK ABOUT THIS LATER.
Functions Scientific Model
pdfs
14
Probability Density Functions: properties
  • A function that assigns probabilities to ALL the
    possible values of a random variable (x).

[Figure: probability density f(x) plotted against x]
15
Probability Density Functions: Expectations
  • The expectation of a random variable x is the
    weighted average of the possible values that x can
    take, each value weighted by the probability that
    x assumes it.
  • Analogous to center of gravity. First moment.

Example: x takes the values -1, 0, 1, 2 with p(-1) = 0.10, p(0) = 0.25, p(1) = 0.30, p(2) = 0.35
16
Probability Density Functions: Variance
  • The variance of a random variable reflects the
    spread of X values around the expected value.
  • Second central moment of a distribution.
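As a small worked sketch, the expectation and variance can be computed directly for the four-point example from the expectation slide (values -1, 0, 1, 2 with probabilities 0.10, 0.25, 0.3, 0.35). The function names are illustrative only.

```python
values = [-1, 0, 1, 2]
probs = [0.10, 0.25, 0.30, 0.35]


def expectation(values, probs):
    """Probability-weighted average of the values (first moment)."""
    return sum(v * p for v, p in zip(values, probs))


def variance(values, probs):
    """Probability-weighted squared deviation from the mean (second central moment)."""
    mu = expectation(values, probs)
    return sum(p * (v - mu) ** 2 for v, p in zip(values, probs))


mean = expectation(values, probs)
var = variance(values, probs)
print(f"E[x] = {mean:.2f}, Var[x] = {var:.2f}")  # E[x] = 0.90, Var[x] = 0.99
```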

17
Probability Distributions
  • A function that assigns probabilities to the
    possible values of a random variable (X).
  • They come in two flavors:
      • DISCRETE: outcomes are a set of discrete
        possibilities such as integers (e.g., counts).
      • CONTINUOUS: a probability distribution over a
        continuous range (real numbers or the
        non-negative real numbers).

18
Probability Mass Functions
For a discrete random variable, X, the
probability that X takes on a value x is given by a
discrete density function, f(x), also known as a
probability mass function or distribution function.
19
Probability Density Functions: Continuous
variables
A probability density function f(x) gives, through
its integral, the probability that a random variable
X takes on values within a range:
$P(a < X < b) = \int_a^b f(x)\,dx$
$f(x) \ge 0$
$\int_{-\infty}^{\infty} f(x)\,dx = 1$
[Figure: f(x) with the interval from a to b marked on the x-axis]
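A rough numerical check of these properties, assuming a standard normal density as the example f(x) and a simple trapezoid rule for the integrals. The interval of -10 to 10 stands in for the whole real line, and the function names are illustrative.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, lo, hi, steps=10_000):
    """Trapezoid-rule approximation of the integral of f from lo to hi."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


total_area = integrate(normal_pdf, -10.0, 10.0)  # approximates the integral over all x
p_ab = integrate(normal_pdf, -1.0, 1.0)          # P(-1 < X < 1)
print(f"total area = {total_area:.6f}, P(-1 < X < 1) = {p_ab:.4f}")
```

The density itself can exceed 1 for small variances; only the area under it is a probability.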
20
Some rules of probability (assuming independence)
[Figure: events A and B]
Real data: Histograms
22
Histograms and PDFs
Probability density functions approximate the
distribution of finite data sets.
23
Uses of Frequency Distributions
  • Empirical (frequentist):
      • Make predictions about the frequency of a
        particular event.
      • Judge whether an observation belongs to a
        population.
  • Theoretical:
      • Predictions about the distribution of the data
        based on some basic assumptions about the nature
        of the forces acting on a particular biological
        system.
      • Describe the randomness in the data.

24
Some useful distributions
  • Discrete
      • Binomial: Two possible outcomes.
      • Poisson: Counts.
      • Negative binomial: Counts.
      • Multinomial: Multiple categorical outcomes.
  • Continuous
      • Normal
      • Lognormal
      • Exponential
      • Gamma
      • Beta

25
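An example: Seed predation
x = no. of seeds taken between t1 and t2 (0 to N)
Assume each seed has equal probability (p) of
being taken, independently of the others. Then
$P(x) = \binom{N}{x} p^x (1 - p)^{N - x}$
where $\binom{N}{x}$ is the normalization constant.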
26
Zero-inflated binomial
27
Binomial distribution: Discrete events that can
take one of two values
n = 20, p = 0.5
$E[x] = np$, Variance $= np(1 - p)$
n = number of sites, p = prob. of survival
Example: Probability of survival derived from
population data
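A short sketch that checks the stated mean np and variance np(1-p) for n = 20 and p = 0.5, using the standard binomial probability mass function. The function name is illustrative.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 20, 0.5
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * px for x, px in enumerate(pmf))
var = sum((x - mean) ** 2 * px for x, px in enumerate(pmf))
print(f"sum = {sum(pmf):.6f}, mean = {mean:.3f}, variance = {var:.3f}")
```

Here `math.comb(n, x)` plays the role of the normalization constant: it counts the ways of choosing which x of the n trials succeed.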
28
Binomial distribution
29
Poisson Distribution: Counts (or getting hit in
the head by a horse)
k = number of seedlings, λ = arrival rate
Alternative parameterization: λ = rt
[Figure: histogram of a Poisson sample (POISSON); x-axis: Number of Seedlings/quadrat (0 to 7); y-axes: Count (0 to 400) and Proportion per Bar (0.0 to 0.4)]
30
Poisson distribution
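A minimal sketch of the standard Poisson probability mass function, $P(k) = \lambda^k e^{-\lambda} / k!$, with the rate built as a rate r times an interval t. The values of `r` and `t` are made up for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the expected count is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


r, t = 0.5, 3.0  # hypothetical arrival rate per unit time, and length of the interval
lam = r * t      # expected number of seedlings per quadrat

pmf = [poisson_pmf(k, lam) for k in range(60)]  # the tail beyond 59 is negligible here
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
print(f"mean = {mean:.4f}, variance = {var:.4f}")
```

The mean and variance both equal the rate, which is why a variance much larger than the mean points away from the Poisson.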
31
Example: Number of seedlings in census quadrats
[Figure: histogram for Alchornea latifolia; x-axis: Number of seedlings/trap (0 to 100); y-axes: Count (0 to 60) and Proportion per Bar (0.0 to 0.4)]
(Data from LFDP, Puerto Rico)
32
Clustering in space or time
33
Negative binomial: Tables 4.2 and 4.3 in H&M, Bycatch
Data
E[X] = 0.279, Variance[X] = 1.56
The variance is far larger than the mean, which
suggests temporal or spatial aggregation in the
data
34
Negative Binomial: Counts
[Figure: histogram of a negative binomial sample (NEGBIN); x-axis: Number of Seeds (0 to 50); y-axes: Count (0 to 100) and Proportion per Bar (0.0 to 0.2)]
35
Negative Binomial: Counts
[Figure: histogram of a negative binomial sample (NEGBIN); x-axis: Number of Seeds (0 to 50); y-axes: Count (0 to 100) and Proportion per Bar (0.0 to 0.2)]
36
Negative binomial
37
Negative Binomial: Count data
[Figure: histogram for Prestoea acuminata; x-axis: No. seedlings/quad. (0 to 100); y-axes: Count (0 to 30) and Proportion per Bar (0.0 to 0.2)]
(Data from LFDP, Puerto Rico)
38
Normal Distribution
$E[x] = \mu$, Variance $= \sigma^2$
Normal PDF with mean 0
39
Normal Distribution with increasing variance
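A small sketch of what increasing the variance does to a normal density with mean 0: the peak drops and the tails thicken. The variances chosen here are arbitrary.

```python
import math


def normal_pdf(x, mu, sigma2):
    """Normal density with mean mu and variance sigma2."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


variances = [0.5, 1.0, 2.0, 4.0]  # arbitrary illustrative values
peaks = [normal_pdf(0.0, 0.0, s2) for s2 in variances]
tails = [normal_pdf(3.0, 0.0, s2) for s2 in variances]

for s2, peak, tail in zip(variances, peaks, tails):
    print(f"variance = {s2:>3}: f(0) = {peak:.4f}, f(3) = {tail:.6f}")
```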
40
Lognormal: One tail and no negative values
[Figure: lognormal PDF; x-axis: x (0 to 70); y-axis: f(x) (0 to 0.8)]
41
Lognormal: Radial growth data
(Data from Date Creek, British Columbia)
42
Exponential
[Figure: histogram; x-axis: Variable; y-axis: Count]
43
Exponential: Growth data (negatives assumed 0)
(Data from BCI, Panama)
44
Gamma: One tail and flexibility
45
Gamma: raw growth data
[Figure: panels for Alseis blackiana and Cordia bicolor; x-axis: Growth (mm/yr)]
(Data from BCI, Panama)
46
Beta distribution
47
Beta: Light interception by crown trees
(Data from Luquillo, PR)
48
Mixture models
  • What do you do when your data don't fit any known
    distribution?
      • Add covariates
      • Mixture models
          • Discrete
          • Continuous

49
Discrete mixtures
50
Discrete mixture: Zero-inflated binomial
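A sketch of one common way to write a zero-inflated binomial, assuming a mixture in which a fraction `p_zero` of observations are structural zeros and the rest follow an ordinary binomial. This formulation and the parameter names are assumptions; the slide's own formula is not in the transcript.

```python
import math


def binomial_pmf(x, n, p):
    """Ordinary binomial probability of x successes in n trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def zib_pmf(x, n, p, p_zero):
    """Zero-inflated binomial: extra mass p_zero at zero, binomial otherwise."""
    base = (1 - p_zero) * binomial_pmf(x, n, p)
    return p_zero + base if x == 0 else base


n, p, p_zero = 10, 0.3, 0.2  # hypothetical values
pmf = [zib_pmf(x, n, p, p_zero) for x in range(n + 1)]
mean = sum(x * px for x, px in enumerate(pmf))
print(f"P(0) = {pmf[0]:.4f} vs binomial P(0) = {binomial_pmf(0, n, p):.4f}, mean = {mean:.3f}")
```

Under this formulation the mean shrinks to (1 - p_zero) * n * p, while the probability of a zero rises above the plain binomial value.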
51
Continuous (compounded) mixtures
52
The Method of Moments
  • You can calculate the sample values of the moments
    of a distribution and match them up with the
    theoretical moments.
  • Recall that
  • The MOM is a good way to get a first (but biased)
    estimate of the parameters of a distribution. ML
    estimators are more reliable.

53
MOM: Negative binomial
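A hedged sketch of a method-of-moments fit for the negative binomial, assuming the mean and overdispersion parameterization common in ecology, where the variance is $\mu + \mu^2 / k$. Solving for k gives $k = \mu^2 / (\sigma^2 - \mu)$. The inputs are the bycatch summary values quoted earlier in the deck (E[X] = 0.279, Variance = 1.56); the function name is illustrative and the slide's own derivation is not in the transcript.

```python
def negbin_mom(sample_mean, sample_var):
    """Method-of-moments estimates (mu, k) for a negative binomial.

    Assumes variance = mu + mu**2 / k, so k = mu**2 / (variance - mu).
    """
    if sample_var <= sample_mean:
        raise ValueError("variance must exceed the mean for a negative binomial fit")
    mu = sample_mean
    k = mu ** 2 / (sample_var - mu)
    return mu, k


mu_hat, k_hat = negbin_mom(0.279, 1.56)
print(f"mu = {mu_hat:.3f}, k = {k_hat:.4f}")
```

A small k indicates strong aggregation. These estimates are best treated as starting values for a maximum likelihood fit.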