Title: AP STATISTICS LESSON 12 - 1
1 AP STATISTICS LESSON 12 - 1
- INFERENCE FOR A POPULATION PROPORTION
2 ESSENTIAL QUESTION: What are the procedures
for creating significance tests and confidence
intervals for population proportion problems?
- Objectives
- To create confidence intervals for population
proportions.
- To carry out significance tests for population
proportions.
3 Introduction
- We often want to answer questions about the
proportion of some outcome in a population, or to
compare proportions across several populations.
4 Population Proportion Problems (Page 685)
- Example 12.1 Risky Behavior in the Age of AIDS
(estimating a single population proportion)
- Example 12.2 Does Preschool Make a Difference?
(comparing two population proportions)
- Example 12.3 Extracurriculars and Grades
(comparing more than two population proportions)
5 Inference for a Population Proportion
- We are interested in the unknown proportion p
of a population that has some outcome.
- For convenience, call the outcome we are
looking for a "success."
6 Sample Proportion
- $\hat{p} = \dfrac{\text{count of successes in the sample}}{\text{count of observations in the sample}}$
- Read the sample proportion $\hat{p}$ as "p-hat."
7 Conditions for Inference
- As always, inference is based on the sampling
distribution of a statistic.
- The mean of the sampling distribution of $\hat{p}$ is p. That is, the sample proportion $\hat{p}$
is an unbiased estimator of the population
proportion p.
- The standard deviation of $\hat{p}$ is
$\sqrt{p(1-p)/n}$, provided that the population is at
least 10 times as large as the sample.
- If the sample size is large enough that both $np$ and $n(1-p)$
are at least 10, the distribution of $\hat{p}$ is
approximately normal.
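The facts above about the sampling distribution can be checked with a small simulation. This is only a sketch: the values p = 0.3 and n = 100, the number of repetitions, and the helper name `simulate_sample_proportions` are assumptions chosen for illustration, not part of the lesson. Each draw is independent, which corresponds to a population much larger than the sample.

```python
import math
import random

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` samples of size n and return each sample's proportion of successes."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

# Hypothetical values chosen for illustration; they do not come from the lesson.
p, n = 0.3, 100
p_hats = simulate_sample_proportions(p, n, reps=5000)

# Mean and standard deviation of the simulated sample proportions
sim_mean = sum(p_hats) / len(p_hats)
sim_sd = math.sqrt(sum((x - sim_mean) ** 2 for x in p_hats) / (len(p_hats) - 1))

# Standard deviation predicted by the formula sqrt(p(1-p)/n)
theory_sd = math.sqrt(p * (1 - p) / n)

print(f"mean of p-hat: {sim_mean:.4f} (theory: {p})")
print(f"sd of p-hat:   {sim_sd:.4f} (theory: {theory_sd:.4f})")
```

The simulated mean should land close to p and the simulated standard deviation close to the formula value, with small differences due to chance.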
8 z Statistic
- $z = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
- The statistic z has approximately the standard
normal distribution N(0,1) if the sample is not
too small and the sample is not a large part of
the population.
9 Working Without p
- To test the null hypothesis $H_0: p = p_0$ that the
unknown p has a specific value $p_0$, just replace p
by $p_0$ in the z statistic and in checking the
values of $np$ and $n(1-p)$.
- In a confidence interval for p, we have no
specific value to substitute. In large samples,
$\hat{p}$ will be close to p. So we replace p by $\hat{p}$ in
determining the values of $np$ and $n(1-p)$. We
also replace the standard deviation by the
standard error of $\hat{p}$,
$SE = \sqrt{\hat{p}(1-\hat{p})/n}$, to get a confidence
interval of the form estimate $\pm\, z^* SE$.
10 Conditions for Inference About a Proportion
- The data are an SRS from the population of
interest.
- The population is at least 10 times as large as
the sample.
- For a test of $H_0: p = p_0$, the sample size n is so
large that both $np_0$ and $n(1-p_0)$ are 10 or more.
- For a confidence
interval, n is so large that both the count of
successes $n\hat{p}$ and the count of failures $n(1-\hat{p})$
are 10 or more.
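The numerical conditions in this list can be sketched as two small checks. The function names and the optional `population_size` argument are hypothetical, introduced only for illustration. Whether the data are an SRS depends on how the sample was collected, so code cannot verify that condition.

```python
def conditions_for_test(n, p0, population_size=None):
    """Check the count and population-size conditions for a test of H0: p = p0."""
    counts_ok = n * p0 >= 10 and n * (1 - p0) >= 10
    # Skip the population check when the population size is not supplied.
    population_ok = population_size is None or population_size >= 10 * n
    return counts_ok and population_ok

def conditions_for_interval(successes, n, population_size=None):
    """Check the conditions for a confidence interval, using observed counts."""
    failures = n - successes
    counts_ok = successes >= 10 and failures >= 10
    population_ok = population_size is None or population_size >= 10 * n
    return counts_ok and population_ok

# Hypothetical inputs: 20 observations with p0 = 0.1 give only 2 expected successes.
print(conditions_for_test(20, 0.1))
print(conditions_for_test(200, 0.1))
print(conditions_for_interval(5, 50))
```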
11 Example 12.4 (Page 688): Are the Conditions Met?
- The sampling design was in fact a complex
stratified sample, and the survey used inference
procedures for that design. The overall effect
is close to an SRS, however.
- The number of adult heterosexuals
(the population) is much larger than 10 times
the sample size, n = 2673.
- The counts of successes (170) and failures
(2673 - 170 = 2503) are both at least 10.
12 Inference for a Population Proportion
- Draw an SRS of size n from a large population
with unknown proportion p of successes. An
approximate level C confidence interval for p is
$\hat{p} \pm z^* \sqrt{\hat{p}(1-\hat{p})/n}$
- where $z^*$ is the upper (1-C)/2 standard normal
critical value.
- To test the hypothesis $H_0: p = p_0$, compute the z statistic
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$
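The two formulas on this slide can be sketched in Python using only the standard library. The function names and the data (60 successes in 100 observations) are hypothetical. The critical value comes from `statistics.NormalDist` rather than Table C, so it carries more decimal places than the table shows.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Approximate level-C interval: p-hat +/- z* sqrt(p-hat (1 - p-hat) / n)."""
    p_hat = successes / n
    # z* is the upper (1 - C)/2 critical value of the standard normal distribution.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * standard_error, p_hat + z_star * standard_error

def z_statistic(successes, n, p0):
    """z = (p-hat - p0) / sqrt(p0 (1 - p0) / n); the null value p0 sets the spread."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Hypothetical data for illustration: 60 successes in 100 observations.
low, high = proportion_confidence_interval(60, 100, 0.95)
z = z_statistic(60, 100, 0.5)
print(f"95% interval: ({low:.4f}, {high:.4f}), z = {z:.2f}")
```

Note the asymmetry: the interval uses the standard error based on $\hat{p}$, while the test uses the standard deviation based on $p_0$.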
13 Inference for a Population Proportion (continued)
- In terms of a variable Z having the standard
normal distribution, the approximate P-value for
a test of $H_0$ against
- $H_a: p > p_0$ is $P(Z \ge z)$
- $H_a: p < p_0$ is $P(Z \le z)$
- $H_a: p \ne p_0$ is $2P(Z \ge |z|)$
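The three P-value rules can be sketched as one helper. The function name `p_value` and the labels "greater", "less", and "two-sided" are assumptions made for this illustration.

```python
from statistics import NormalDist

def p_value(z, alternative):
    """Approximate P-value for a z statistic under the standard normal distribution."""
    Z = NormalDist()  # mean 0, standard deviation 1
    if alternative == "greater":      # Ha: p > p0, P(Z >= z)
        return Z.cdf(-z)              # by symmetry, P(Z >= z) = P(Z <= -z)
    if alternative == "less":         # Ha: p < p0, P(Z <= z)
        return Z.cdf(z)
    if alternative == "two-sided":    # Ha: p != p0, 2 P(Z >= |z|)
        return 2 * Z.cdf(-abs(z))
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Hypothetical z value for illustration.
for alt in ("greater", "less", "two-sided"):
    print(alt, round(p_value(1.5, alt), 4))
```

For any z, the two one-sided P-values add to 1, and the two-sided P-value is twice the smaller of them.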
14 Example 12.5 (Page 690): Estimating Risky Behavior
- The National AIDS Behavioral Surveys found
that 170 of a sample of 2673 adult heterosexuals
had multiple partners. That is, $\hat{p} = 0.0636$.
- A 99% confidence interval for the proportion
p of all adult heterosexuals with multiple
partners uses the standard normal critical value
$z^* = 2.576$ (use the bottom row of Table C for
standard normal critical values).
- We are 99% confident that the percent of
adult heterosexuals who had more than one sexual
partner in the past year lies between about 5.1%
and 7.6%.
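The arithmetic behind this interval, worked from the numbers on the slide with intermediate values rounded, goes roughly as follows:

```latex
\hat{p} = \frac{170}{2673} \approx 0.0636
\qquad
SE = \sqrt{\frac{(0.0636)(0.9364)}{2673}} \approx 0.00472

\hat{p} \pm z^* \, SE = 0.0636 \pm (2.576)(0.00472)
                      = 0.0636 \pm 0.0122
                      = (0.0514,\ 0.0758)
```

Expressed as percents, the endpoints are about 5.1% and 7.6%.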
15 Example 12.6 (Page 691): Binge Drinking in
College
- Binge drinking: for men, 5 or more drinks
(for women, 4 or more drinks) on at least one
occasion within two weeks.
- In a representative sample of 140 colleges
and 17,592 students (SRS), 7741 students
identified themselves as binge drinkers.
- Does this constitute strong evidence that
more than 40% of all college students engage in
binge drinking?
- Answer:
- The P-value tells us that, if $H_0: p = 0.40$ were true, there would be virtually
no chance of obtaining a sample proportion as far
away from 0.40 as $\hat{p} = 0.44$. We reject $H_0$ and
conclude that more than 40% of U.S. college
students have engaged in binge drinking.
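The z statistic and P-value behind this answer can be computed from the counts on the slide. This is a sketch: the variable names are hypothetical, and the upper tail is computed with `math.erfc` because a z this large makes the tail area too small for a table to show.

```python
import math

# Counts from the slide: 7741 binge drinkers among 17,592 students.
successes, n, p0 = 7741, 17592, 0.40

p_hat = successes / n

# One-sided test of H0: p = 0.40 against Ha: p > 0.40.
# The standard deviation uses p0 because the test assumes H0 is true.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Upper-tail area P(Z >= z), written with erfc so a tiny value is not rounded to 0.
p_value = 0.5 * math.erfc(z / math.sqrt(2))

print(f"p-hat = {p_hat:.4f}, z = {z:.2f}, P-value = {p_value:.3g}")
```

A z statistic near 10.8 lies far beyond the range of Table C, which is why the P-value is described as essentially zero.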
16 Example 12.7 (Page 692): Is That Coin Fair?
- A coin that is balanced should come up heads half
the time in the long run. The French naturalist
Count Buffon tossed a coin 4040 times and got
2048 heads ($\hat{p} = 0.5069$).
- Is this evidence that Buffon's coin was not
balanced? (Hint: use the P-value for the
two-sided test.)
- Answer: We failed to find good evidence against
$H_0: p = 0.5$. We cannot conclude that $H_0$ is true,
that is, that the coin is perfectly balanced.
- NOTE: The test of significance only shows that
the results of Buffon's 4040 tosses can't
distinguish this coin from one that is perfectly
balanced. To see what values of p are consistent
with the sample results, use a confidence interval.
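The slide gives the conclusion without the numbers. Worked from Buffon's counts with rounded intermediate values, the two-sided test goes roughly as follows:

```latex
H_0: p = 0.5 \qquad H_a: p \ne 0.5

z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}
  = \frac{0.5069 - 0.5}{\sqrt{(0.5)(0.5)/4040}}
  \approx \frac{0.0069}{0.00787}
  \approx 0.88

P\text{-value} = 2P(Z \ge 0.88) \approx 2(0.1894) \approx 0.38
```

A result at least this far from 0.5 would occur in roughly 38% of all samples from a balanced coin, so the data give no good evidence against $H_0$.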
17 Example 12.8 (Page 693): Confidence Interval for p
- We are 95% confident that the probability of a
head is between 0.4915 and 0.5223.
- The confidence interval is more informative
than the test in Example 12.7.
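The interval can be reproduced from Buffon's counts. This sketch assumes the table value $z^* = 1.960$ for 95% confidence; the variable names are hypothetical.

```python
import math

# Buffon's data from Example 12.7: 2048 heads in 4040 tosses.
heads, tosses = 2048, 4040
z_star = 1.960  # 95% critical value, assumed from the standard normal table

p_hat = heads / tosses
standard_error = math.sqrt(p_hat * (1 - p_hat) / tosses)
margin = z_star * standard_error
low, high = p_hat - margin, p_hat + margin

print(f"95% interval for p: ({low:.4f}, {high:.4f})")
# The interval lists every value of p consistent with the data, not just 0.5.
print("contains 0.5:", low < 0.5 < high)
```

Because 0.5 falls inside the interval, the interval agrees with the two-sided test while also showing how far from 0.5 the true probability of a head could plausibly be.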
18 Choosing the Sample Size
- In planning a study, we may want to choose a
sample size that will allow us to estimate the
parameter within a given margin of error:
$m = z^* \sqrt{\hat{p}(1-\hat{p})/n}$
- Here $z^*$ is the standard normal critical value
for the level of confidence we want.
- Because the margin of error involves the
sample proportion of successes $\hat{p}$, we need to guess
this value when choosing n.
- Call our guess $p^*$. Here are two ways to get
$p^*$.
19 Ways to Get $p^*$
- Use a guess $p^*$ based on a pilot study or on
past experience with similar studies. You should
do several calculations that cover the range of
$\hat{p}$ values you might get.
- Use $p^* = 0.5$ as the guess. The margin of error m
is largest when $\hat{p} = 0.5$, so this guess is conservative in
the sense that if we get any other $\hat{p}$ when we do our
study, we will get a margin of error smaller than
planned.
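One way to see why 0.5 gives the largest margin of error is to complete the square in the product that appears under the square root:

```latex
\hat{p}(1-\hat{p}) = \tfrac{1}{4} - \left(\hat{p} - \tfrac{1}{2}\right)^2 \le \tfrac{1}{4},
\quad \text{with equality only when } \hat{p} = \tfrac{1}{2}

\text{so} \quad m = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le \frac{z^*}{2\sqrt{n}}
```

Any sample proportion other than 0.5 makes the product, and therefore the margin of error, smaller.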
20 Sample Size for Desired Margin of Error
- To determine the sample size n that will yield
a level C confidence interval for a population
proportion p with a specified margin of error m,
set the following expression for the margin of
error to be less than or equal to m, and solve
for n:
$z^* \sqrt{p^*(1-p^*)/n} \le m$
- where $p^*$ is a guessed value for the sample
proportion. The margin of error will be less
than or equal to m if you take the guess $p^*$ to be
0.5.
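Solving the inequality for n takes two algebra steps (square both sides, then isolate n), sketched here:

```latex
z^* \sqrt{\frac{p^*(1-p^*)}{n}} \le m
\;\Longrightarrow\;
(z^*)^2 \, \frac{p^*(1-p^*)}{n} \le m^2
\;\Longrightarrow\;
n \ge \left(\frac{z^*}{m}\right)^2 p^*(1-p^*)

\text{With } p^* = 0.5: \quad n \ge \left(\frac{z^*}{2m}\right)^2
```

Since n must be a whole number, round the result up to the next integer.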
21 Choosing $p^*$
- The method for finding the guess $p^*$ does not
matter that much in most cases. The n you get
doesn't change much when you change $p^*$ as long as
$p^*$ is not too far from 0.5. So use the
conservative guess $p^* = 0.5$ if you expect the
true $\hat{p}$ to be roughly between 0.3 and 0.7. If the
true $\hat{p}$ is close to 0 or 1, using $p^* = 0.5$ as your guess
will give a sample much larger than you need. So
try to use a better guess from a pilot study when
you suspect that $\hat{p}$ will be less than 0.3 or
greater than 0.7.
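How much the guess matters can be seen by computing n for several guesses. This is a sketch under assumed settings: a 3% margin of error and 95% confidence are hypothetical choices, and `sample_size` is an illustrative helper that solves the margin-of-error inequality for n and rounds up.

```python
import math
from statistics import NormalDist

def sample_size(margin, confidence, p_guess=0.5):
    """Smallest n with z* sqrt(p*(1 - p*)/n) <= margin, for the guess p* = p_guess."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

# Hypothetical planning targets: 3% margin of error at 95% confidence.
for p_guess in (0.1, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9):
    print(f"p* = {p_guess:.1f}  ->  n = {sample_size(0.03, 0.95, p_guess)}")
```

Under these settings the required n changes by well under 20% for guesses between 0.3 and 0.7, but drops to roughly a third of the conservative value when the guess is 0.1 or 0.9.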
22 Example 12.9 (Page 696): Determining Sample Size
for Election Polling
- Find the sample size for a 2.5% margin of error (sample
size n = 1537)
- and for a 2% margin of error (n = 2401).
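The slide does not state the confidence level or the guess used. Assuming 95% confidence ($z^* = 1.96$) and the conservative guess $p^* = 0.5$, which reproduces the stated n = 1537, the arithmetic would be:

```latex
m = 0.025: \quad n \ge \left(\frac{1.96}{2(0.025)}\right)^2 = (39.2)^2 = 1536.64
\;\Rightarrow\; n = 1537

m = 0.02: \quad n \ge \left(\frac{1.96}{2(0.02)}\right)^2 = (49)^2 = 2401
\;\Rightarrow\; n = 2401
```

Under these assumptions, shrinking the margin of error from 2.5% to 2% raises the required sample size by more than half.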