Title: Confidence Intervals for Proportions
1Confidence Intervals for Proportions
2Overview and Objectives
- This chapter presents the beginning of inferential statistics.
- The two major applications of inferential statistics involve the use of sample data to:
1.) Estimate the value of a population parameter.
Example: Your local newspaper polls a random sample of 330 voters, finding 144 who say they will vote yes on the upcoming school budget. Create a confidence interval to estimate the actual sentiment of all voters (i.e., the proportion of the population that supports the school budget).
3Overview and Objectives (cont.)
Major applications of inferential statistics (cont.):
2.) Test some claim (or hypothesis) about a population parameter.
Example: In a recent year, of the 109,857 arrests for Federal offenses, 29.1% were for drug offenses. Test the claim that the drug offense rate is equal to 30%.
4Overview and Objectives (cont.)
- In Chapter 19, our goal is to estimate a population proportion. Our specific objectives will be to:
- Define a proportion.
- Identify the assumptions and conditions necessary for estimating a population proportion.
- Construct a confidence interval which serves as our estimate of the population proportion.
- Calculate the specific sample size needed for our desired level of precision and confidence.
5What Is a Proportion?
A proportion is the fraction, ratio, or percent indicating the part of the sample or the population having a particular trait of interest.
Example: A recent survey indicated that 137 out of 1000 students with credit cards reported debts in excess of $500. The sample proportion is 137/1000, or 0.137, or 13.7 percent.
If we let $\hat{p}$ represent the sample proportion, X the number of successes, and n the number of items sampled, we can determine a sample proportion as follows:
$\hat{p} = X / n$
6A Few Definitions
- A point estimate is a single value (or point) used to approximate a population parameter.
- The sample proportion $\hat{p}$ is the best point estimate of the population proportion p.
- A confidence interval is a range (or an interval) of values used to estimate the true value of a population parameter. A confidence interval is sometimes abbreviated as CI.
7A Confidence Interval
Example: 1998 General Social Survey.
Question: Do you agree or disagree with the following statement: "It is more important for a wife to help her husband's career than to have one herself."
Response: 19% of 1823 respondents agreed.
Create a C.I. to estimate the percentage of Americans who would agree with this statement.
Recall that the sampling distribution model of $\hat{p}$ is centered at p, with standard deviation $SD(\hat{p}) = \sqrt{pq/n}$, where $q = 1 - p$. Since we don't know p, we can't find the true standard deviation of the sampling distribution model, so we need to find the standard error:
$SE(\hat{p}) = \sqrt{\hat{p}\hat{q}/n}$, where $\hat{q} = 1 - \hat{p}$.
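As a quick check of the standard error for the survey above, here is a minimal Python sketch. The function name `standard_error` is only an illustrative choice, not something from the textbook.

```python
import math

def standard_error(p_hat: float, n: int) -> float:
    """Standard error of a sample proportion: sqrt(p_hat * q_hat / n)."""
    q_hat = 1.0 - p_hat
    return math.sqrt(p_hat * q_hat / n)

# General Social Survey example: 19% of 1823 respondents agreed.
se_gss = standard_error(0.19, 1823)
print(round(se_gss, 4))  # 0.0092
```

Rounded to three decimal places this is the 0.009 used in the step-by-step example later in the chapter.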
8A Confidence Interval (cont.)
- By the 68-95-99.7 Rule, we know:
- about 68% of all samples will have a $\hat{p}$ within 1 SE of p
- about 95% of all samples will have a $\hat{p}$ within 2 SEs of p
- about 99.7% of all samples will have a $\hat{p}$ within 3 SEs of p
- We can look at this from $\hat{p}$'s point of view.
9A Confidence Interval (cont.)
- Consider the 95% level:
- There is a 95% chance that p is no more than 2 SEs away from $\hat{p}$.
- So, if we reach out 2 SEs, we are 95% sure that p will be in that interval. In other words, if we reach out 2 SEs in either direction of $\hat{p}$, we can be 95% confident that this interval contains the true proportion.
- This is called a 95% confidence interval.
10What Does 95% Confidence Really Mean?
- Each confidence interval uses a sample statistic to estimate a population parameter.
- But, since samples vary, the statistics we use, and thus the confidence intervals we construct, vary as well.
- The figure on the next slide shows that some of our confidence intervals capture the true proportion (the green horizontal line), while others do not.
11What Does 95% Confidence Really Mean? (cont.)
- Confidence intervals from ten different samples. Assume the true population proportion, p, is 0.20.
[Figure: confidence intervals from ten different samples plotted against the line p = 0.20; one interval is annotated as not containing the true proportion.]
- Thus, we expect 95% of all 95% confidence intervals to contain the true parameter that they are estimating.
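The idea that about 95% of all 95% confidence intervals capture the true proportion can be explored with a small simulation. The sketch below is illustrative only: the sample size (500), the number of repeated samples (2000), and the random seed are assumptions chosen for the demonstration, and only p = 0.20 comes from the slide.

```python
import math
import random

def coverage_rate(p=0.20, n=500, trials=2000, z_star=1.96, seed=1):
    """Fraction of simulated 95% intervals that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and count the successes.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)
        # Does this sample's interval capture the true proportion?
        if p_hat - z_star * se <= p <= p_hat + z_star * se:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"{rate:.3f}")  # should be close to 0.95; the exact value depends on the seed
```

Each simulated interval either captures p or it does not; the 95% describes the long-run share of intervals that do.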
12Margin of Error
We can claim, with 95% confidence, that the interval $\hat{p} \pm 2\,SE(\hat{p})$ contains the true population proportion. The extent of the interval on either side of $\hat{p}$ is called the margin of error (E). In general, confidence intervals have the form estimate ± E. The more confident we want to be, the larger our margin of error needs to be.
13Critical Value
- The 2 in $\hat{p} \pm 2\,SE(\hat{p})$ (our 95% confidence interval) came from the 68-95-99.7 Rule.
- Using the z-scores table, we find that a more exact value for our 95% confidence interval is 1.96 instead of 2.
- We call 1.96 the critical value and denote it $z^*$ (also written $z_{\alpha/2}$).
- For any confidence level, we can find the corresponding critical value.
14Critical Values (cont.)
- Example: For a 90% confidence interval, the critical value is 1.645.
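Instead of reading a z-scores table, the critical value can be computed from the standard normal model. The sketch below uses `NormalDist` from Python's standard `statistics` module; the helper name `critical_value` is an illustrative choice.

```python
from statistics import NormalDist

def critical_value(confidence: float) -> float:
    """z-score that leaves (1 - confidence)/2 in each tail of the normal model."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    # Roughly 1.645, 1.960 and 2.576 respectively.
    print(f"{level:.0%} confidence: z = {critical_value(level):.3f}")
```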
15Assumptions and Conditions
Here are the assumptions and the corresponding conditions you must check before creating a confidence interval for a proportion:
Independence Assumption: The data values are assumed to be independent from each other. We check three conditions to decide whether independence is reasonable.
Plausible Independence Condition: Is there any reason to believe that the data values somehow affect each other? This condition depends on your knowledge of the situation; you can't check it with data.
16Assumptions and Conditions (cont.)
- Independence Assumption (cont.)
- Randomization Condition: Were the data sampled at random or generated from a properly randomized experiment? Proper randomization can help ensure independence.
- 10% Condition: Is the sample size no more than 10% of the population?
- Sample Size Assumption: The sample needs to be large enough for us to be able to use the CLT.
- Success/Failure Condition: We must expect at least 10 successes and at least 10 failures.
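The two numeric conditions (10% and Success/Failure) can be checked mechanically, as in the sketch below. Plausible independence and randomization depend on how the data were collected and cannot be checked this way. The function name and the population size of 50,000 voters are hypothetical, chosen only to illustrate the newspaper poll from the overview.

```python
def check_conditions(successes: int, n: int, population_size: float) -> dict:
    """Check the 10% Condition and the Success/Failure Condition."""
    failures = n - successes
    return {
        # Sample should be no more than 10% of the population.
        "ten_percent": n <= 0.10 * population_size,
        # At least 10 successes and at least 10 failures.
        "success_failure": successes >= 10 and failures >= 10,
    }

# Newspaper poll: 144 of 330 voters said yes.
# population_size=50_000 is an assumed figure, not given in the example.
print(check_conditions(144, 330, population_size=50_000))
```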
17Estimating a Population Proportion
Confidence interval for the population proportion (p):
$\hat{p} \pm z^* \sqrt{\hat{p}\hat{q}/n}$
The confidence interval is often expressed in the following equivalent formats:
$\hat{p} \pm E$ or $(\hat{p} - E,\ \hat{p} + E)$
18Creating a Confidence Interval for a Proportion - Step-by-Step!
- Based on the General Social Survey results that indicated 19% of 1823 respondents agreed with the following statement: "It is more important for a wife to help her husband's career than to have one herself," estimate the proportion of Americans who would support this statement.
- Step 1: Think
- Identify the parameter you wish to estimate.
- Identify the population about which you wish to make statements.
- Choose and state a confidence level.
- We want to find an interval that is likely, with 90% confidence, to contain the true proportion, p, of adults who believe a woman should sacrifice her career to support her husband.
19Creating a Confidence Interval for a Proportion - Step-by-Step!
- Step 2: Check the conditions
- Plausible Independence Condition: It is very unlikely that any of the respondents influenced each other.
- Randomization Condition: The General Social Survey used a random sample of adults.
- 10% Condition: Although sampling was necessarily without replacement, there are many more U.S. adults than were sampled. The sample is certainly less than 10% of the population.
- Success/Failure Condition: $n\hat{p} = 1823(0.19) \approx 346$ and $n\hat{q} = 1823(0.81) \approx 1477$, both at least 10.
20Creating a Confidence Interval for a Proportion - Step-by-Step!
Step 3: Construct the confidence interval
A.) Find the critical value for a 90% confidence level: $z^* = 1.645$
B.) Find the standard error: $SE(\hat{p}) = \sqrt{(0.19)(0.81)/1823} \approx 0.009$
C.) Find the margin of error: $E = z^* \times SE(\hat{p}) = 1.645 \times 0.009 = 0.0148$
D.) Write the confidence interval: 0.19 ± 0.015 or (0.175, 0.205)
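Steps A through D can be sketched in a few lines of Python. This is one possible way to organize the calculation, with `proportion_ci` as an illustrative name. It carries the unrounded standard error through, so the margin of error comes out near 0.0151 rather than the 0.0148 obtained from the rounded SE of 0.009; both round to 0.015.

```python
import math
from statistics import NormalDist

def proportion_ci(p_hat: float, n: int, confidence: float):
    """One-proportion z-interval: p_hat +/- z* x SE(p_hat)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # A.) critical value
    se = math.sqrt(p_hat * (1 - p_hat) / n)                  # B.) standard error
    margin = z_star * se                                     # C.) margin of error
    return p_hat - margin, p_hat + margin                    # D.) the interval

low, high = proportion_ci(0.19, 1823, 0.90)
print(round(low, 3), round(high, 3))  # 0.175 0.205
```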
21Creating a Confidence Interval for a Proportion - Step-by-Step!
Step 4: Conclusion
Interpret the confidence interval in the proper context. We are 90% confident that our interval captured the true proportion. We are 90% confident that between 17.5% and 20.5% of all U.S. adults believe women should sacrifice their own careers to support the careers of their husbands.
22Interpretation of the Confidence Interval
- Don't Misstate What the Interval Means
- Don't suggest that the parameter (p) varies.
- Don't claim that other samples will agree with yours. (If you use a different $\hat{p}$ to create the interval, the result will be different.)
- Don't be certain about the parameter. (Be sure to include a statement about the level of confidence in your interpretation.)
23Effects of a Higher Confidence Level
If we wanted to be 95% confident, how would our confidence interval be affected? To be 95% confident, we need to change our critical value to 1.96, which would result in a new margin of error:
$E = z^* \times SE(\hat{p}) = 1.96 \times 0.009 = 0.018$
Therefore our new confidence interval is 0.19 ± 0.018 or (0.172, 0.208). Our interval has become wider.
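To see how the margin of error grows with the confidence level, the sketch below recomputes it for the same survey at three levels. The 99% level is added here purely for comparison; it is not part of the original example.

```python
import math
from statistics import NormalDist

p_hat, n = 0.19, 1823  # General Social Survey example
se = math.sqrt(p_hat * (1 - p_hat) / n)

margins = {}
for confidence in (0.90, 0.95, 0.99):
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margins[confidence] = z_star * se
    print(f"{confidence:.0%}: margin of error = {margins[confidence]:.3f}")
```

The sample and its standard error stay the same; only the critical value changes, so a higher confidence level always gives a wider interval.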
24Margin of Error: Certainty vs. Precision
Our margin of error was 1.8%. If we wanted to reduce it to 1.5%, would our level of confidence be higher or lower? In general, confidence intervals have the form estimate ± E. The more confident we want to be, the larger our margin of error needs to be. To be more confident, we wind up being less precise. We need more values in our confidence interval to be more certain. Because of this, every confidence interval is a balance between certainty and precision.
25Effects of a Larger Sample Size
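If the General Social Survey had polled more people, say 2000, would the interval's margin of error have been larger or smaller?
New margin of error: $E = z^* \sqrt{(0.19)(0.81)/2000}$, where the standard error is now about 0.0088 instead of about 0.0092.
Increasing the sample size reduces the variability in the sampling distribution of the sample proportion, which results in a smaller standard error and therefore a smaller margin of error.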
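The effect of the larger sample can be checked numerically. The sketch below assumes the 95% critical value of 1.96 (the slide does not say which confidence level to use) and keeps the sample proportion at 0.19; `margin_of_error` is an illustrative helper name.

```python
import math

def margin_of_error(p_hat: float, n: int, z_star: float) -> float:
    """Margin of error E = z* x sqrt(p_hat * q_hat / n)."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)

# z_star = 1.96 is an assumption (95% confidence).
e_1823 = margin_of_error(0.19, 1823, 1.96)
e_2000 = margin_of_error(0.19, 2000, 1.96)
print(f"n = 1823: E = {e_1823:.4f}")
print(f"n = 2000: E = {e_2000:.4f}")
```

Because n sits under a square root, modest increases in sample size shrink the margin of error only slightly.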
26Choosing Your Sample Size
In general, the sample size needed to produce a confidence interval with a given margin of error at a given confidence level is
$n = \dfrac{z_{\alpha/2}^{2}\,\hat{p}\hat{q}}{E^{2}}$
where $z_{\alpha/2}$ (the same quantity as $z^*$) is the critical value for your confidence level. To be safe, round up the sample size you obtain.
27Choosing Your Sample Size
What sample size does it take to estimate the outcome of this survey with 95% confidence and a margin of error of 3%? We will do what polling organizations usually do and choose the most cautious proportion, 50%.
$n = \dfrac{(1.96)^2 (0.5)(0.5)}{(0.03)^2} \approx 1067.1$
We need at least 1,068 respondents to keep the margin of error as small as 3% with a confidence level of 95%.
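The sample-size calculation, including the rounding up, can be sketched as follows. The function name `required_sample_size` and its default arguments (95% critical value, cautious proportion of 0.5) are illustrative choices.

```python
import math

def required_sample_size(margin: float, z_star: float = 1.96, p: float = 0.5) -> int:
    """Smallest n with margin of error at most `margin`, rounded up."""
    return math.ceil(z_star**2 * p * (1 - p) / margin**2)

print(required_sample_size(0.03))  # 1068
```

Using p = 0.5 maximizes p(1 - p), so the resulting n is large enough whatever the true proportion turns out to be.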
28Example: Legalizing Marijuana
In August 2000, the Gallup Poll asked 507 randomly sampled adults the question "Do you think the possession of small amounts of marijuana should be treated as a criminal offense?" Of these, 51% said Yes, 47% responded No, and 2% said they didn't know.
A.) Create a 95% confidence interval for the percentage of adults who support the legalization of marijuana (i.e., those who would respond No to this question). Interpret your findings.
B.) Would a referendum proposing the legalization of marijuana likely pass? Explain.
29Example: Network TV
The Fox TV network is considering replacing one of its prime-time crime investigation shows with a new family-oriented comedy show. Before a final decision is made, network executives commission a random sample of 400 viewers. After viewing the comedy, 250 indicated they would watch the new show and suggested it replace the crime investigation show.
A.) Estimate the value of the population proportion.
B.) Compute the standard error of the proportion.
C.) Develop a 99 percent confidence interval for the population proportion.
D.) Interpret your findings.
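One way to check your arithmetic for parts A through C is a short script like the sketch below. It starts from the raw counts rather than a percentage and uses the 99% critical value from the normal model. The variable names are illustrative, and the interpretation in part D is still up to you.

```python
import math
from statistics import NormalDist

viewers_yes, n = 250, 400  # counts from the Network TV example

p_hat = viewers_yes / n                            # A.) point estimate
se = math.sqrt(p_hat * (1 - p_hat) / n)            # B.) standard error
z_star = NormalDist().inv_cdf(1 - (1 - 0.99) / 2)  # 99% critical value
margin = z_star * se
low, high = p_hat - margin, p_hat + margin         # C.) 99% interval

print(f"p_hat = {p_hat:.3f}, SE = {se:.4f}, 99% CI = ({low:.3f}, {high:.3f})")
```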
30Example: Environmental Protection
- In 2000, the GSS asked participants whether they would be willing to pay much higher prices in order to protect the environment. Of n = 1154 respondents, 518 indicated a willingness to do so.
- A.) Find a 95% confidence interval for the population proportion of adult Americans willing to do so at the time of that survey.
- B.) Interpret that interval.
31Assignment
- Read Chapter 20: Testing Hypotheses About Proportions
- Try the following exercises from Ch. 19:
- 1, 3, 5, 7, 9, 13, 23, 31, 33, 37