Title: Proportions
1 Proportions
How do polls work and what do they tell you?
2 Objectives
- Create confidence intervals for estimating a true population proportion.
- Learn how to use a CI for the difference of two proportions to test for independence of two categorical variables.
3 Statistical Inference for Proportions
Population
X = a binary variable. p = the proportion in the population having the trait.
Sample
4 X = a count of the number of successes in n trials.
The Binomial Distribution involved counts.
Now change the count to the proportion of successes:
$\hat{p} = X / n$, the proportion of successes.
Example: a batting average.
5 For the population of all possible sample proportions
- the mean is $p$,
- the standard deviation is $\sqrt{p(1-p)/n}$,
- and the distribution is approximately Normal.
6 Sampling Distribution of $\hat{p}$
The sampling distribution of $\hat{p}$ is approximately Normal if $np > 5$ and $n(1-p) > 5$. This is a refinement of the $n \ge 30$ rule.
The Central Limit Theorem applies because $\hat{p}$ is a sample average of n Bernoulli values!
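A minimal Python sketch of the rule of thumb on this slide. The function name `sampling_distribution` and the inputs p = 0.5, n = 100 are illustrative assumptions, not values from the slides.

```python
import math

def sampling_distribution(p, n):
    """Return (mean, standard deviation, normal_ok) for the sample proportion."""
    mean = p
    std_dev = math.sqrt(p * (1 - p) / n)
    # Rule of thumb from the slide: np > 5 and n(1 - p) > 5.
    normal_ok = n * p > 5 and n * (1 - p) > 5
    return mean, std_dev, normal_ok

# Hypothetical population proportion and sample size, for illustration only.
mean, std_dev, normal_ok = sampling_distribution(0.5, 100)
print(mean, round(std_dev, 4), normal_ok)
```

With a rare trait (say p = 0.01 and n = 100) the same check fails because np is only 1, so the Normal approximation should not be trusted there.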
7 Margin of Error in using $\hat{p}$ to estimate p at $(1-\alpha)100\%$ confidence
m.o.e. $= z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
(if $np > 5$ and $n(1-p) > 5$)
8 $(1-\alpha)100\%$ Confidence Interval for p
$\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
if $np > 5$ and $n(1-p) > 5$.
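The interval can be sketched in Python as follows. This is one possible implementation, assuming the usual large-sample formula (sample proportion plus or minus z times the standard error); the name `proportion_ci` and the counts 30 of 120 are hypothetical. The critical value comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    alpha = 1 - confidence
    # z_{alpha/2}: upper alpha/2 critical value of the standard Normal.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, moe, (p_hat - moe, p_hat + moe)

# Hypothetical counts, for illustration only.
p_hat, moe, (lower, upper) = proportion_ci(30, 120)
print(f"p_hat = {p_hat:.3f}, m.o.e. = {moe:.3f}, CI = {lower:.3f} to {upper:.3f}")
```

The sketch does not enforce the np > 5 and n(1 - p) > 5 condition; a caller would need to check it separately.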
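9 Estimation of Parameters
A $(1-\alpha)100\%$ confidence interval estimate of a parameter is
point estimate $\pm$ m.o.e.
Population Parameter | Point Estimator | Margin of Error at $(1-\alpha)100\%$ confidence
Mean, $\mu$, if $\sigma$ is known | $\bar{x}$ | $z_{\alpha/2}\,\sigma/\sqrt{n}$
Mean, $\mu$, if $\sigma$ is unknown | $\bar{x}$ | $t_{\alpha/2}\,s/\sqrt{n}$
Proportion, $p$ | $\hat{p}$ | $z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$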
10 Example 2
- The governor will spend more on promotion of a new program he wants passed if fewer than 50% of registered voters support it.
- In a telephone survey of 200 randomly selected registered voters, 82 say they support the proposed program.
- Construct a 95% confidence interval for the true proportion of ALL voters who support the proposed program.
11 Example 2.
sample proportion: $\hat{p} = 82/200 = .41$
95% confidence interval for p:
$.41 \pm 1.96\sqrt{(.41)(.59)/200} = .41 \pm .068$, or .342 to .478
12 What can be concluded from this telephone survey?
Example 2.
- The value of concern is 50%. Why?
- The CI is .342 to .478.
- .50 is NOT in this CI; therefore,
- .50 is not a plausible value.
- Fewer than 50% of the registered voters support the proposed program; therefore, spend more on promotion.
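The arithmetic behind Example 2 can be checked with a short Python sketch. It assumes the critical value 1.96 for 95% confidence; the variable names and the encoding of the governor's rule as "upper limit below .50" are illustrative choices.

```python
import math

Z_95 = 1.96  # critical value for 95% confidence

successes, n = 82, 200  # survey counts from Example 2
p_hat = successes / n
moe = Z_95 * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - moe, p_hat + moe

# Assumed reading of the governor's rule: spend more when the whole
# interval lies below 50% support.
spend_more = upper < 0.50
print(f"{lower:.3f} to {upper:.3f}; spend more: {spend_more}")
# prints: 0.342 to 0.478; spend more: True
```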
13 Example 3.
Election night in Birmingham: two candidates for mayor.
Random exit poll results: Sue Ellen received 462 votes of 900.
Can we declare Sue Ellen the winner at the .05 level of significance?
Hypothesized value is p = .50 (no favorite).
$\hat{p} = 462/900 = .51333$
m.o.e. $= 1.96\sqrt{(.51333)(.48667)/900} = .03266$
14 Construct 95% CI
Example 3.
$.51333 \pm .03266$
The 95% CI is .48067 to .54599.
Statement in L.O.P.
I am 95% confident that the true proportion of votes cast for Sue Ellen in the Birmingham mayoral election falls between .4807 and .5460.
15 Decision
Example 3.
Does the hypothesized value fall in the CI?
Yes!
Therefore, .50 is a plausible value, so the election is too close to call at the .05 level of significance.
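The decision rule used in Example 3 (is the hypothesized value inside the interval?) can be sketched in Python. The helper name `ci_contains` is hypothetical, and 1.96 is assumed as the 95% critical value.

```python
import math

def ci_contains(value, successes, n, z=1.96):
    """Check whether a hypothesized proportion lies inside the CI."""
    p_hat = successes / n
    moe = z * math.sqrt(p_hat * (1 - p_hat) / n)
    inside = (p_hat - moe) <= value <= (p_hat + moe)
    return inside, p_hat, moe

# Exit poll from Example 3: 462 of 900 votes, hypothesized p = .50.
plausible, p_hat, moe = ci_contains(0.50, 462, 900)
print(f"p_hat = {p_hat:.5f}, m.o.e. = {moe:.5f}, .50 plausible: {plausible}")
```

Because .50 falls inside the interval, the sketch reports it as plausible, which matches the "too close to call" decision.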
16 Example 4.
In a survey about banking services, responses were categorized by age and opinion of services. Of the 104 respondents who were 30 years old or younger, 93 stated that the services were excellent or good. Of the 46 who were over 30, 36 stated that the services were excellent or good. Is there a dependence between age and opinion of services?
17 Example 4.
Service opinion by age
Age        | Excellent or Good | Acceptable or Poor | Total
30 or less | 93                | 11                 | 104
Over 30    | 36                | 10                 | 46
Total      | 129               | 21                 | 150
18 Conditional probabilities
$\hat{p}_1$ = P(Excel or Good | 30 or less) = 93/104 = .894
$\hat{p}_2$ = P(Excel or Good | over 30) = 36/46 = .783
Are these conditional probabilities far enough apart to call the true population proportions different?
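As a rough illustration, the two conditional sample proportions can be computed from the Example 4 counts in Python. The dictionary layout and the helper name `conditional_proportion` are assumptions made for this sketch.

```python
# Counts from the Example 4 table, keyed by age group.
counts = {
    "30 or less": {"excellent_or_good": 93, "acceptable_or_poor": 11},
    "over 30": {"excellent_or_good": 36, "acceptable_or_poor": 10},
}

def conditional_proportion(row):
    """P(excellent or good | age group), estimated from one table row."""
    total = sum(row.values())
    return row["excellent_or_good"] / total

p1 = conditional_proportion(counts["30 or less"])
p2 = conditional_proportion(counts["over 30"])
print(f"p1 = {p1:.3f}, p2 = {p2:.3f}")
```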
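19 Estimation of Parameters
A $(1-\alpha)100\%$ confidence interval estimate of a parameter is
point estimate $\pm$ m.o.e.
Population Parameter | Point Estimator | Margin of Error at $(1-\alpha)100\%$ confidence
Mean, $\mu$, if $\sigma$ is known | $\bar{x}$ | $z_{\alpha/2}\,\sigma/\sqrt{n}$
Mean, $\mu$, if $\sigma$ is unknown | $\bar{x}$ | $t_{\alpha/2}\,s/\sqrt{n}$
Proportion, $p$ | $\hat{p}$ | $z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$
Diff. of two means, $\mu_1 - \mu_2$ (for large sample sizes only) | $\bar{x}_1 - \bar{x}_2$ | $z_{\alpha/2}\sqrt{s_1^2/n_1 + s_2^2/n_2}$
Diff. of two proportions, $p_1 - p_2$ | $\hat{p}_1 - \hat{p}_2$ | $z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$
Slope of regression line, $\beta$ | $b$ | $t_{\alpha/2} \times$ (standard error of $b$)
Mean from a regression when $X = x$ | $\hat{y}$ | $t_{\alpha/2} \times$ (standard error of $\hat{y}$)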
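20 Example 4.
Margin of Error for $p_1 - p_2$
m.o.e. $= z_{\alpha/2}\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$
For 95% confidence:
m.o.e. $= 1.96\sqrt{(.894)(.106)/104 + (.783)(.217)/46} \approx .133$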
21 Example 4.
95% Confidence Interval for the difference of two proportions
$(.894 - .783) \pm .133 = .111 \pm .133$, or $-.022$ to $.244$
22 Example 4.
Does zero fall inside this confidence interval?
Yes!
Then zero is a plausible value for the difference of the two proportions. Therefore, the evidence is not strong enough to say a dependence exists.
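The check on zero can be reproduced with a short Python sketch. It assumes the unpooled large-sample margin of error for a difference of two proportions and 1.96 as the 95% critical value; the function name `diff_proportions_ci` is hypothetical.

```python
import math

def diff_proportions_ci(x1, n1, x2, n2, z=1.96):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    moe = z * se
    diff = p1 - p2
    return diff, moe, (diff - moe, diff + moe)

# Example 4 counts: 93 of 104 (30 or less) and 36 of 46 (over 30).
diff, moe, (lower, upper) = diff_proportions_ci(93, 104, 36, 46)
zero_inside = lower <= 0 <= upper
print(f"diff = {diff:.3f}, m.o.e. = {moe:.3f}, zero inside CI: {zero_inside}")
```

When zero is inside the interval, the two population proportions may be equal, so the data do not establish a dependence between age and opinion.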
23 Example 4.
Conclusion
Age and opinion of service may be independent, at the 95% confidence level, or at the 5% level of significance.
24 Example 4.
The two SAMPLE proportions,
P(Excel or Good | 30 or less) = .894
P(Excel or Good | over 30) = .783
are too close together to conclude that the corresponding POPULATION proportions are different.
25 Sample Size for Estimating $\mu$
Problem: What sample size is needed to have a margin of error less than E at $(1-\alpha)100\%$ confidence?
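26 What sample size is needed to estimate the mean actual mpg with an m.o.e. of 0.2 mpg with 90% confidence for Honda Accords if the pop. std. dev. is 0.88 mpg?
Recall m.o.e. $= z_{\alpha/2}\,\sigma/\sqrt{n}$, so $n = (z_{\alpha/2}\,\sigma/E)^2$
$n = (1.645 \times 0.88 / 0.2)^2 = 52.39$, so round up to n = 53.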
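A small Python sketch of this sample-size calculation, assuming the formula n = (z σ / E)² rounded up to the next whole observation. The function name `sample_size_for_mean` is hypothetical, and the critical value is taken from `statistics.NormalDist` rather than a table, so the unrounded result differs slightly from 52.39.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n with margin of error at most max_error (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up: a fractional observation cannot be sampled.
    return math.ceil((z * sigma / max_error) ** 2)

# Honda Accord example: sigma = 0.88 mpg, E = 0.2 mpg, 90% confidence.
n_required = sample_size_for_mean(0.88, 0.2, 0.90)
print(n_required)
```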
27 What if $\sigma$ is unknown?
- Use a conservative guess (high).
- Use s from a pilot study.
- Use a very rough guess of $\sigma$ such as
????????????????
28 Sample Size for Estimating Proportions
What sample size is needed to have a margin of error for estimating p less than E at $(1-\alpha)100\%$ confidence?
m.o.e. $= z_{\alpha/2}\sqrt{p(1-p)/n} = E$, so $n = z_{\alpha/2}^2\,p(1-p)/E^2$
29 But we don't know p before we take the sample!
- Use a conservative guess (one that results in a larger n).
- p = .5 is the most conservative.
- Values close to .5 are more conservative than those near 0 or 1.
- If you know that the true p should be between .20 and .30, then use .30.
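The reason values near .5 are conservative is that the required n is proportional to p(1 - p), which peaks at p = .5. A minimal Python sketch of the guidelines above; the helper name `most_conservative_p` is hypothetical.

```python
def most_conservative_p(low, high):
    """Value in [low, high] that maximizes p * (1 - p), i.e. the largest n."""
    if low <= 0.5 <= high:
        return 0.5
    # Otherwise pick the endpoint closest to 0.5.
    return high if high < 0.5 else low

for p in (0.20, 0.30, 0.50):
    print(p, round(p * (1 - p), 4))  # p(1 - p) grows as p approaches 0.5

best_guess = most_conservative_p(0.20, 0.30)
print(best_guess)
```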
30 Example 5: What is the smallest sample size necessary to estimate the proportion of defective parts to within .02 with 95% confidence if p is known to not exceed 4%?
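One way to work Example 5 in Python, assuming the formula n = z² p(1 - p) / E² with z = 1.96 and the conservative guess p = .04 (the value in the allowed range closest to .5). The function name `sample_size_for_proportion` is hypothetical, and the slides do not show the final answer, so the result below is this sketch's own calculation.

```python
import math

def sample_size_for_proportion(p_guess, max_error, z=1.96):
    """Smallest n with margin of error at most max_error for a proportion."""
    n_exact = z ** 2 * p_guess * (1 - p_guess) / max_error ** 2
    # Round up to the next whole part.
    return math.ceil(n_exact)

# p is known not to exceed 4%, so 0.04 is the conservative guess.
n_required = sample_size_for_proportion(0.04, 0.02)
print(n_required)  # 1.96^2 * 0.04 * 0.96 / 0.02^2 = 368.79..., rounded up to 369
```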