Opinionated Lessons in Statistics, by Bill Press. #24: Goodness of Fit.
(Transcript of the PowerPoint presentation; provided by utexas.edu)

Transcript and Presenter's Notes
1
Opinionated Lessons in Statistics
by Bill Press
#24: Goodness of Fit
2
A good time now to review the universal rule-of-thumb (meta-theorem): measurement precision improves with the amount of data N as N^(−1/2). ("Measurement precision" here means the accuracy of a fitted parameter.)

[Figure: a simple example and a generic example, plotting χ² versus a parameter b, with minimum χ²_min at b₀. Twice the data implies about twice the χ² at any b, so a fixed Δχ² implies √2 better precision.]
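The N^(−1/2) rule-of-thumb is easy to check by simulation. Below is a minimal sketch, not from the slides: hypothetical Gaussian data with known σ = 2, fitting the simplest possible parameter (the mean), and measuring how the standard error of the estimate shrinks as N grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def stderr_of_mean(N, trials=2000, sigma=2.0):
    """Empirical standard error of the sample mean for sample size N."""
    estimates = [rng.normal(loc=5.0, scale=sigma, size=N).mean()
                 for _ in range(trials)]
    return np.std(estimates)

# Rule-of-thumb: precision improves as N**(-1/2), so quadrupling
# the data should roughly halve the standard error.
se_100 = stderr_of_mean(100)    # theory: 2.0/sqrt(100)  = 0.200
se_400 = stderr_of_mean(400)    # theory: 2.0/sqrt(400)  = 0.100
se_1600 = stderr_of_mean(1600)  # theory: 2.0/sqrt(1600) = 0.050
print(se_100, se_400, se_1600)
```

Each factor of 4 in N buys a factor of 2 in precision, exactly the N^(−1/2) scaling the slide states.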
3
Let's discuss Goodness of Fit (at last!)

Assuming the model, as we have so far, is a very Bayesian thing to do, since Bayesians start with an EME (exhaustive, mutually exclusive) set of hypotheses. But it also makes it difficult for Bayesians to deal with the notion of a model's goodness of fit. So we must now again become frequentists for a while!

Suppose we measure N data points y_i, each with known measurement error σ_i, and standardize the residuals as t_i ≡ (y_i − y(x_i | b)) / σ_i. Then the statistic

    χ² ≡ Σ_i t_i² = Σ_i [ (y_i − y(x_i | b)) / σ_i ]²

is the sum of N t²-values (not quite, as we will see).

So, if we imagine repeated experiments (which Bayesians refuse to do), the statistic should be distributed as Chisquare(N). If our experiment is very unlikely to be from this distribution, we consider the model to be disproved. In other words, it is a p-value test.
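A minimal sketch of this logic, assuming a hypothetical linear model with known Gaussian errors (the model and all numbers here are invented for illustration): simulate many repeated experiments under the true model, confirm χ² follows Chisquare(N), and compute a p-value for one experiment.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical setup: N data points y_i = model(x_i) + Gaussian noise
# with known per-point errors sigma_i.
N = 20
x = np.linspace(0.0, 1.0, N)
true_y = 2.0 + 3.0 * x            # assumed true model
sigma = 0.5 * np.ones(N)

chisq = []
for _ in range(5000):             # "imagine repeated experiments"
    y = true_y + rng.normal(0.0, sigma)
    t = (y - true_y) / sigma      # standardized residuals t_i
    chisq.append(np.sum(t**2))    # the chi-square statistic
chisq = np.array(chisq)

# With no fitted parameters, chi-square should follow Chisquare(N):
print(chisq.mean())               # close to N = 20
print(chisq.var())                # close to 2N = 40

# p-value test for one experiment: right-tail probability under Chisquare(N)
p = stats.chi2.sf(chisq[0], df=N)
```

A very small p would mean the observed χ² is implausibly large under the model, i.e., the model is disproved in the frequentist sense.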
4
How does our fit do by this test?

In our example, the value of χ²_min is a bit unlikely in Chisquare(20), with (left-tail) p ≈ 0.0569.

In fact, if you had many repetitions of the experiment, you would find that their χ² is not distributed as Chisquare(20), but rather as Chisquare(15)! Why? The magic words are "degrees of freedom", or DOF.

[Figure: the Chisquare distribution, with our experiment's χ² value marked.]
5
Degrees of Freedom: Why is χ² with N data points not quite the sum of N t²-values? Because DOFs are reduced by constraints.

First consider a hypothetical situation where the data has linear constraints.

The joint distribution on all the t's, if they are independent, is

    p(t) ∝ exp(−½ Σ_i t_i²) = exp(−½ |t|²)

A linear constraint, e.g. Σ_i a_i t_i = 0, is a hyperplane through the origin in t-space!
6
The constraint is a plane cut through the origin. Any cut through the origin of a sphere is a circle. So the distribution of distance from the origin is the same as for a multivariate normal ball in the lower number of dimensions. Thus, each linear constraint reduces ν by exactly 1.

We don't have explicit constraints on the y_i's. But as we let the y_i's wiggle around (within the distribution of each), we want to keep the MLE estimate b₀ (the parameters) fixed, so as to see how χ² is distributed for this MLE, not for all possible b's. (20 wiggling y_i's, 5 b_i's kept fixed.)
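The "each linear constraint removes exactly one DOF" argument can be demonstrated directly (a sketch with invented numbers, not from the slides): draw independent standard-normal t's, impose the single constraint Σ t_i = 0 by projecting onto that hyperplane, and compare the sums of squares.

```python
import numpy as np

rng = np.random.default_rng(3)

# N independent standard-normal t's; one linear constraint should turn
# Chisquare(N) into Chisquare(N - 1).
N = 10
t = rng.standard_normal((50000, N))

# Project onto the hyperplane sum(t_i) = 0 (subtract each row's mean
# component, i.e. the component along the constraint's normal vector).
t_constrained = t - t.mean(axis=1, keepdims=True)

ss_free = np.sum(t**2, axis=1)                      # ~ Chisquare(N)
ss_constrained = np.sum(t_constrained**2, axis=1)   # ~ Chisquare(N - 1)

print(ss_free.mean())         # close to N = 10
print(ss_constrained.mean())  # close to N - 1 = 9
```

The constrained points live on an (N − 1)-dimensional cut through the origin, so their squared distance from the origin behaves exactly like a multivariate normal ball one dimension down, matching the slide's geometric argument.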
7
Review:

1. Fit for parameters by minimizing χ².
2. (Co)variances of parameters, or confidence regions, by the change in χ² (i.e., Δχ²) from its minimum value χ²_min.
3. Goodness-of-fit (accept or reject the model) by the p-value of χ²_min, using the correct number of DOF.
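The three review steps fit in one short script. This is a sketch under assumptions of my own (a hypothetical straight-line model y = a + b·x with known per-point errors, and a crude Δχ² = 1 scan for the parameter uncertainty rather than the usual covariance matrix):

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(4)

# Hypothetical data: straight line y = a + b*x with known errors sigma.
N = 20
x = np.linspace(0.0, 10.0, N)
sigma = np.full(N, 1.0)
y = 1.0 + 0.5 * x + rng.normal(0.0, sigma)   # synthetic data, true b = 0.5

def chisq(params):
    a, b = params
    return np.sum(((y - (a + b * x)) / sigma) ** 2)

# Step 1: fit parameters by minimizing chi-square.
res = optimize.minimize(chisq, x0=[0.0, 0.0])
chisq_min = res.fun
M = 2                                         # number of fitted parameters

# Step 2: 1-sigma interval on b from Delta-chi2 <= 1, re-minimizing over a
# at each trial b (profile chi-square).
def profile_chisq(b):
    return optimize.minimize_scalar(lambda a: chisq([a, b])).fun

bs = np.linspace(res.x[1] - 0.2, res.x[1] + 0.2, 401)
inside = bs[np.array([profile_chisq(b) for b in bs]) <= chisq_min + 1.0]
b_err = (inside.max() - inside.min()) / 2.0   # half-width of the interval

# Step 3: goodness of fit via the p-value of chisq_min with N - M DOF.
p_value = stats.chi2.sf(chisq_min, df=N - M)
print(res.x, b_err, p_value)
```

Note the DOF bookkeeping in step 3: the p-value uses N − M = 18, not N = 20, for exactly the reason developed on the previous slides.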
8
Don't confuse typical values of χ² with typical values of Δχ²!

Goodness-of-fit, with ν = N − M degrees of freedom: χ²_min is an RV over the population of different data sets (a frequentist concept, allowing a p-value). We expect χ²_min ≈ ν ± √(2ν).

Confidence intervals for parameters b: Δχ² is an RV over the population of possible model parameters for a single data set, a concept shared by Bayesians and frequentists. We expect Δχ² ≈ 1 (or a few).

A natural question: doesn't the ±√(2ν) spread in χ²_min contaminate the parameter confidence intervals? Answer: once you have a particular data set, there is no uncertainty about what its χ²_min is.

Let's see how this works out in scaling with N: χ² increases linearly with ν = N − M. Δχ² increases as N (the number of terms in the sum), but also decreases as (N^(−1/2))², since b becomes more accurate with increasing N, so the Δχ² for a given confidence level stays of order unity. (This is the universal rule of thumb again; the dependence is quadratic because, at the minimum, the first derivative of χ² with respect to b vanishes.)
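The contrast between the two "typical values" is easy to see numerically. A sketch using ν = 15 to match the running example (20 points, 5 parameters):

```python
import numpy as np
from scipy import stats

# Typical chi-square_min over many data sets: mean nu, spread sqrt(2*nu).
nu = 15
samples = stats.chi2.rvs(df=nu, size=100000, random_state=5)
print(samples.mean())        # about nu = 15
print(samples.std())         # about sqrt(2*nu), roughly 5.5

# By contrast, a 1-sigma confidence interval on one parameter is set by
# Delta-chi2 = 1: a change of order unity, much smaller than sqrt(2*nu).
delta_chi2_1sigma = stats.chi2.ppf(0.6827, df=1)
print(delta_chi2_1sigma)     # about 1.0
```

So χ²_min itself fluctuates by ~5.5 between data sets, while the Δχ² that sets a 1σ parameter interval is ~1: two different scales that must not be confused.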