Title: Module 3 Central Limit Theorem and Estimation
1. Module 3: Central Limit Theorem and Estimation
2. Give a low and high value (range) for each of the following
- What was Martin Luther King, Jr.'s age at death?
- What is the length of the Nile River in miles?
- How many countries belong to OPEC?
- How many books are there in the Old Testament?
- What is the diameter of the moon in miles?
- What is the weight of an empty Boeing 747 in pounds?
- In what year was Mozart born?
- What is the gestation period of an Asian elephant in days?
- What is the air distance from London to Tokyo, in miles?
- What is the deepest known point in the ocean in feet?
3. Who will win the nomination? Rewind to 2008
NY Times Poll, Feb. 28, 2008
4. Chapter 6 - Normal Distributions
Central Limit Theorem
Take all possible combinations of size n from the population (population size N). Calculate the mean of each combination. Plot the frequency distribution of the means.
Relationship between the two curves (the population distribution and the distribution of the sample means):
- The mean of the distribution of sample means is the population mean.
- The standard deviation of the distribution of sample means (also called the standard error) is the standard deviation of the population divided by the square root of n.
- For n > 30, the distribution of the sample means is approximately normal.
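The procedure above can be tried out with a short simulation. This is only a sketch and not part of the original slides: the population is a hypothetical skewed one (exponential, mean about 10), and random samples stand in for "all possible combinations", which would be far too many to enumerate. The names `population`, `n`, and `num_samples` are illustrative.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical, deliberately skewed population (exponential, mean about 10).
population = [random.expovariate(1 / 10) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 36               # sample size, chosen to be larger than 30
num_samples = 4_000  # random samples standing in for all combinations

# Mean of each random sample of size n.
sample_means = [
    statistics.fmean(random.sample(population, n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)
predicted_se = sigma / math.sqrt(n)  # standard error from the theorem

print(f"population mean       = {mu:.3f}")
print(f"mean of sample means  = {mean_of_means:.3f}")
print(f"observed std. error   = {observed_se:.3f}")
print(f"sigma / sqrt(n)       = {predicted_se:.3f}")
```

If the theorem holds for this setup, the first two printed values should be close to each other, and so should the last two.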
5. Chapter 7 - Estimating a population mean
It may be necessary to use a sample to figure out what the population mean is. If you use the mean of your sample as your estimate of the population mean, you are making a point estimate. Point estimates aren't very reliable because we know from the Central Limit Theorem that a sample mean can fall anywhere on the distribution of sample means. It is more reliable to use an interval estimate based on what we know of the distribution of sample means.
The Central Limit Theorem allows you to use the sample means curve to specify a measuring stick to capture the mean of the population based on the percent of the distribution you want.
[Figure: distribution of sample means with the central 80% and 95% marked, shown with the population distribution]
6. Confidence interval based on the percent of the curve you want to capture
[Figure: sample means curve with the central 80% marked]
7. Increasing the confidence level captures more of the curve but doesn't increase accuracy: the interval just gets wider
[Figure: sample means curve with the central 95% marked]
8. Increase sample size (n) to increase accuracy
The bigger the sample, the smaller the standard error, and therefore the tighter the curve.
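A small sketch of how the standard error shrinks as n grows. The population standard deviation of 15 is a made-up value for illustration, and `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma divided by sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 15.0  # hypothetical population standard deviation

for n in (25, 100, 400):
    # n = 25 -> 3.00, n = 100 -> 1.50, n = 400 -> 0.75
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.2f}")
```

Note that quadrupling the sample size only halves the standard error, because n sits under a square root.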
9. Central Limit Theorem allows us to estimate where the true mean is
The sample mean is the point estimate. We need to hedge by attaching a confidence interval to it.
[Figure: 80% interval]
10. Chapter 7 - Estimating a population mean
- Because of the relationship between the sample means curve and the population curve, it is possible to use the sample mean and create an interval about it to estimate where the true population mean lies.
- The confidence level determines what percent of the sample means curve you want to capture.
11. Chapter 7 - Estimating a population mean
12. Chapter 7 - Estimating a population mean: large sample, n > 30
For samples where n > 30, use z.
For a 95% confidence level, each half of the curve holds 0.95 / 2 = 0.475. Looking up a probability of 0.475 in the z-table gives z = 1.96.
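The z lookup and the resulting large-sample interval can be sketched in code. This assumes the usual form of the interval, sample mean plus or minus z times the standard error, and uses the standard library's `NormalDist` in place of a printed z-table. The sample figures (mean 50, standard deviation 12, n = 64) are invented for illustration, and `z_critical` and `z_interval` are hypothetical helper names.

```python
import math
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """z value such that the central `confidence` area lies within +/- z."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def z_interval(xbar: float, sd: float, n: int, confidence: float = 0.95):
    """Large-sample interval for the mean: xbar +/- z * sd / sqrt(n)."""
    margin = z_critical(confidence) * sd / math.sqrt(n)
    return xbar - margin, xbar + margin


print(round(z_critical(0.95), 2))  # 1.96, matching the table lookup

# Hypothetical sample: mean 50, standard deviation 12, n = 64.
low, high = z_interval(xbar=50.0, sd=12.0, n=64, confidence=0.95)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

Here `sd` is the population standard deviation if it is known; with a large sample, the sample standard deviation is commonly used as a stand-in.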
13. Chapter 7 - Estimating a population mean
14. Chapter 7 - Estimating the population mean: small sample, n < 30
For samples where n < 30, use t. The t distribution is a family of distributions that depend on the size of the sample (degrees of freedom = n - 1). The larger the sample, the closer the t distribution is to the z distribution.
15. t distribution for samples smaller than 30
The t distribution depends on the sample size: the smaller the sample, the more spread out the curve is.
[Figure: t curve for smaller n compared with the z curve]
16. Chapter 7 - The t-table
The t-table is used for sample sizes smaller than 30.
- The table is indexed by the probability in the tail. For a given confidence level, this is alpha/2.
- Degrees of freedom is the number in the sample minus 1. There is a different curve for each sample size. (Example: n = 20, df = 19)
- The t value is in the body of the table, unlike the z-table. (Example: t = 2.5395)
- The larger the sample, the more closely the t curve approximates the z curve.
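A sketch of a small-sample interval built from a t value read off the table. Python's standard library has no t distribution, so the critical value is passed in by hand. The value 2.5395 with df = 19 is the example from the slide; in standard t-tables it appears to correspond to a tail probability of 0.01, that is, a 98% confidence level. The sample mean and standard deviation are invented, and `t_interval` is a hypothetical helper name.

```python
import math


def t_interval(xbar: float, s: float, n: int, t_crit: float):
    """Small-sample interval for the mean: xbar +/- t * s / sqrt(n).

    t_crit must be looked up in a t-table using df = n - 1 and the
    tail probability alpha/2 for the chosen confidence level.
    """
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


n = 20
df = n - 1       # 19 degrees of freedom
t_crit = 2.5395  # table value for df = 19 from the slide's example

# Hypothetical sample: mean 50, sample standard deviation 12.
low, high = t_interval(xbar=50.0, s=12.0, n=n, t_crit=t_crit)
print(f"df = {df}, interval: {low:.2f} to {high:.2f}")
```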
17. Comparing the t-table to the z-table, n > 30
Z-table:
- 80%: z = 1.28
- 90%: z = 1.65
- 95%: z = 1.96
18. Practice looking up the t-table
19. Practice looking up the t-table
20. Chapter 7 - Estimating a population proportion
21. Chapter 7 - Estimating the population proportion
Using the normal distribution as an approximation of the binomial (see p. 269 for proof), we can also make interval estimates of the population proportion.
This can only be done where np > 5 and n(1 - p) > 5.
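A sketch of an interval estimate for a proportion, including the np > 5 and n(1 - p) > 5 check. It assumes the usual normal-approximation form, sample proportion plus or minus z times sqrt(p(1 - p)/n), with the sample proportion standing in for the unknown p. The poll figures (540 of 1000) are invented, and `proportion_interval` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def proportion_interval(successes: int, n: int, confidence: float = 0.95):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n  # sample proportion, used in place of p

    # The normal approximation is only used when both counts exceed 5.
    if n * p_hat <= 5 or n * (1 - p_hat) <= 5:
        raise ValueError("need np > 5 and n(1 - p) > 5")

    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical poll: 540 of 1000 respondents favor a candidate.
low, high = proportion_interval(successes=540, n=1000)
print(f"95% interval: {low:.3f} to {high:.3f}")
```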
22. What if I want a certain level of accuracy?
23. Chapter 7 - Sample size
The half-width of the confidence interval is the sampling error, or margin of error. A researcher may have a required maximum sampling error. Solving the margin-of-error formula for n gives the sample size necessary for that maximum error. Increasing the sample size increases the accuracy of the estimate.
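A sketch of solving for n. It assumes the margin of error for a mean is E = z * sigma / sqrt(n), which rearranges to n = (z * sigma / E)^2, rounded up to a whole number. The values of sigma and the maximum error are invented, and `sample_size_for_mean` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, max_error: float,
                         confidence: float = 0.95) -> int:
    """Smallest n with z * sigma / sqrt(n) <= max_error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil((z * sigma / max_error) ** 2)


# Hypothetical requirement: sigma = 12, margin of error at most 2.
n = sample_size_for_mean(sigma=12.0, max_error=2.0)
print(n)  # 139
```

Halving the allowed error roughly quadruples the required sample size, since the error appears squared in the denominator.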
24. Chapter 7 - Sample size
The same can be done for the sampling error of a population proportion.
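The proportion version can be sketched the same way, assuming the margin of error E = z * sqrt(p(1 - p)/n), which gives n = z^2 * p(1 - p) / E^2. The default p = 0.5 is an assumption not taken from the slides: it is a common conservative choice when no prior estimate of p exists, because p(1 - p) is largest there. `sample_size_for_proportion` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(max_error: float, p: float = 0.5,
                               confidence: float = 0.95) -> int:
    """Smallest n with z * sqrt(p * (1 - p) / n) <= max_error."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / max_error ** 2)


# Hypothetical poll requirement: margin of error of 3 percentage points.
n_poll = sample_size_for_proportion(max_error=0.03)
print(n_poll)  # 1068
```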
25. Chapter 7 Summary
- Using a sample to estimate the population mean, based on what percent (confidence level) of the curve you want to capture
- Three cases
  - Large sample mean
  - Small sample mean
  - Large sample proportion
- Sample size will depend on the margin of error you're willing to live with
26. Estimation versus reality
Results versus poll for Mason Dixon
http://www.usaelectionpolls.com/2008/articles/how-has-mason-dixon-performed-in-the-polls-022208001.html