Title: Producing data: Toward statistical inference
1 Producing data - Toward statistical inference
2 Parameter: a number that describes the
population. It is a fixed number, although we
generally do not know its value.
Statistic: a number that describes the sample.
Its value is known once we take a sample, but it
varies from sample to sample.
Statistics are often used to estimate parameters.
3 Notation
                      Population     Sample
                      parameters     statistics
  Mean                µ              x̄
  Standard deviation  σ              s
  Proportion          p              p̂
4 The sampling distribution of a statistic is the
distribution of values taken by the statistic in
all possible samples of the same size from the
same population.
5 Bias concerns the center of the sampling
distribution. A statistic used to estimate a
parameter is unbiased if the mean of its sampling
distribution is equal to the true value of the
parameter being estimated.
6 An unbiased statistic is one that will neither
consistently overestimate nor consistently
underestimate the value of the parameter (the
truth about the population).
7 In repeated sampling, the value of the statistic
will vary. Suppose that 45% of the population
is male. In one sample we might have only 42%
males, while in another sample we could have 58%
males.
45% is a _______________.
42% and 58% are _________________.
8 Random samples eliminate bias from the act of
choosing a sample, but they can still be wrong
due to the variability that occurs when we choose
at random. If the variation when we take
repeated samples from the same population is too
great, we cannot trust the results of any one
sample.
9 If we take many random samples of the same size
from the same population, the variation from
sample to sample will follow a predictable
pattern. All of statistical inference is based
on one idea: to see how trustworthy a
procedure is, we ask, "What would happen if we
repeated the procedure many times?"
10 To simulate the sampling distribution of the
sample proportion p̂, we
  Take a large number of samples from the same
  population.
  Calculate the sample proportion p̂ for each
  sample.
  Make a histogram of the values of p̂.
  Examine the distribution for shape, center and
  spread, as well as outliers or other deviations.
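The simulation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population proportion 0.6, the sample size 100, the 500 repetitions, and the helper names simulate_p_hats and text_histogram are all hypothetical choices, not taken from the slides. Each individual is drawn independently, which approximates a simple random sample from a very large population.

```python
import random
import statistics

def simulate_p_hats(p, n, num_samples, seed=0):
    """Return num_samples simulated sample proportions, each from a sample of size n."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        # Assumption: the population is large, so each draw is a success with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

def text_histogram(values):
    """Print a crude text histogram using bins that are 2 percentage points wide."""
    counts = {}
    for v in values:
        percent = int(round(v * 100))
        bin_start = percent // 2 * 2  # lower edge of the bin, in percent
        counts[bin_start] = counts.get(bin_start, 0) + 1
    for bin_start in sorted(counts):
        print(f"{bin_start:3d}% {'*' * counts[bin_start]}")

# Assumed values for illustration: p = 0.6, n = 100, 500 samples.
p_hats = simulate_p_hats(p=0.6, n=100, num_samples=500)
print("center (mean):", round(statistics.mean(p_hats), 4))
print("spread (stdev):", round(statistics.stdev(p_hats), 4))
text_histogram(p_hats)
```

Rerunning with a larger n (for example n=500) should give a histogram with the same center but a visibly narrower spread.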
12 Bias concerns the center of the sampling
distribution. The variability of a statistic is
described by the spread of its sampling
distribution. This spread is determined by the
sampling design and the sample size n.
13 Descriptive Statistics: p-hat100
  Variable    N     Mean      Median    StDev
  p-hat100    500   0.59720   0.60000   0.04761
14 Descriptive Statistics: p-hat500
  Variable    N     Mean      Median    StDev
  p-hat500    500   0.60110   0.60200   0.02137
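As a rough check on the two StDev values above, they can be compared with the standard large-population formula for the standard deviation of a sample proportion, sqrt(p(1 - p)/n). The slides do not state the population proportion. The value 0.6 used below is an assumption suggested by the two means, and the function name p_hat_sd is hypothetical.

```python
import math

def p_hat_sd(p, n):
    """Standard deviation of the sample proportion for sample size n (large population)."""
    return math.sqrt(p * (1 - p) / n)

assumed_p = 0.6  # assumption: not stated in the slides, suggested by the observed means

# StDev values reported in the two descriptive-statistics tables.
observed_sd = {100: 0.04761, 500: 0.02137}

for n, observed in observed_sd.items():
    theoretical = p_hat_sd(assumed_p, n)
    print(f"n = {n}: formula {theoretical:.5f}, simulation {observed:.5f}")
```

Under this assumption the formula gives values close to the simulated ones, and multiplying the sample size by 5 shrinks the spread by a factor of about the square root of 5.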
17 3.62
18 Note that the amount of variability depends
(as long as N ≥ 100n) on the sample size n, not
on the population size N.
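The claim that the spread depends on the sample size rather than the population size can be explored with a small simulation. The sketch below draws simple random samples without replacement from two populations of very different sizes. The population sizes, the proportion 0.6, and the function name p_hat_spread are illustrative assumptions, chosen so that each population is at least 100 times larger than the largest sample.

```python
import random
import statistics

def p_hat_spread(pop_size, n, p=0.6, num_samples=500, seed=1):
    """Estimate the standard deviation of p-hat for SRSs of size n from a finite population."""
    rng = random.Random(seed)
    # Individuals numbered 0 .. num_success - 1 are treated as the "successes".
    num_success = round(p * pop_size)
    p_hats = []
    for _ in range(num_samples):
        sample = rng.sample(range(pop_size), n)  # sampling without replacement
        p_hats.append(sum(1 for i in sample if i < num_success) / n)
    return statistics.stdev(p_hats)

# Assumed population sizes for illustration; both satisfy N >= 100n for n up to 500.
for pop_size in (50_000, 5_000_000):
    for n in (100, 500):
        spread = p_hat_spread(pop_size, n)
        print(f"N = {pop_size:>9,}  n = {n:>3}  stdev of p-hat = {spread:.4f}")
```

For a fixed n the two population sizes should give nearly the same spread, while moving from n = 100 to n = 500 should reduce it noticeably.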