Title: Inference for Proportions One Sample
1. Inference for Proportions: One Sample
2. Confidence Intervals
3. Rate your confidence (0 - 100)
- Name my age within 10 years?
- within 5 years?
- within 1 year?
- Shooting a basketball at a wading pool, will make basket?
- Shooting the ball at a large trash can, will make basket?
- Shooting the ball at a carnival, will make basket?
4. What happens to your confidence as the interval gets smaller?
The larger your confidence, the wider the
interval.
5. Point Estimate
- Use a single statistic based on sample data to estimate a population parameter
- Simplest approach
- But not always very precise due to variation in the sampling distribution
6. Confidence intervals
- Are used to estimate the unknown population parameter
- Formula: estimate ± margin of error
7. Margin of error
- Shows how accurate we believe our estimate is
- The smaller the margin of error, the more precise our estimate of the true parameter
- Formula: margin of error = critical value × standard deviation of the statistic
8. Assumptions
- SRS
- Normal distribution
- n(p-hat) > 10 and n(1 - p-hat) > 10
- Population is at least 10n
9. Formula for Confidence interval
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Note: For confidence intervals, we DO NOT know p, so we MUST substitute p-hat for p both in the SD and when checking assumptions.
10. Critical value (z*)
- Found from the confidence level
- The upper z-score with probability p lying to its right under the standard normal curve

Confidence level   Tail area   z*
90%                .05         1.645
95%                .025        1.96
99%                .005        2.576
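The table values can be reproduced with a short sketch. It assumes Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is made up for illustration.

```python
from statistics import NormalDist

def critical_value(confidence_level: float) -> float:
    """Return z* whose upper tail area is (1 - confidence_level) / 2."""
    tail_area = (1 - confidence_level) / 2
    # inv_cdf gives the z-score with the requested area to its LEFT
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    tail = (1 - level) / 2
    print(f"{level:.0%} confidence: tail area = {tail:.3f}, z* = {critical_value(level):.3f}")
```

The printed z* values should agree with the table to three decimal places.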
11. Confidence level
- Is the success rate of the method used to construct the interval
- Using this method, ____% of the time the intervals constructed will contain the true population parameter
12. What does it mean to be 95% confident?
- 95% chance that p is contained in the confidence interval
- The probability that the interval contains p is 95%
- The method used to construct the interval will produce intervals that contain p 95% of the time.
13. A May 2000 Gallup Poll found that 38% of a random sample of 1012 adults said that they believe in ghosts. Find a 95% confidence interval for the true proportion of adults who believe in ghosts.
14. Step 1: check assumptions!
- Have an SRS of adults
- n(p-hat) = 1012(.38) = 384.56 and n(1 - p-hat) = 1012(.62) = 627.44. Since both are greater than 10, the distribution can be approximated by a normal curve
- Population of adults is at least 10(1012) = 10,120.
Step 2: make calculations
$$.38 \pm 1.96\sqrt{\frac{.38(.62)}{1012}} = .38 \pm .030 = (.350, .410)$$
Step 3: conclusion in context
We are 95% confident that the true proportion of adults who believe in ghosts is between 35% and 41%.
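A minimal sketch of Step 2 for the ghost poll, assuming the large-sample z-interval for one proportion. The function name `one_proportion_ci` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_ci(p_hat, n, confidence_level=0.95):
    """Large-sample z-interval: p_hat +/- z* * sqrt(p_hat(1 - p_hat)/n)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * standard_error
    return p_hat - margin, p_hat + margin

# Gallup ghost poll: p-hat = .38, n = 1012
low, high = one_proportion_ci(0.38, 1012)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```

Rounded to whole percents, the endpoints match the 35% and 41% in the conclusion.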
15. To find sample size, set the margin of error m and solve for n:
$$m = z^* \sqrt{\frac{p(1-p)}{n}}$$
However, since we have not yet taken a sample, we do not know a p-hat (or p) to use!
Another Gallup Poll is taken in order to measure the proportion of adults who approve of attempts to clone humans. What sample size is necessary to be within 0.04 of the true proportion of adults who approve of attempts to clone humans with a 95% confidence interval?
16. What p-hat (p) do you use when trying to find the sample size for a given margin of error?
- .1(.9) = .09
- .2(.8) = .16
- .3(.7) = .21
- .4(.6) = .24
- .5(.5) = .25
- By using .5 for p-hat, we are using the worst-case scenario and using the largest SD in our calculations.
17. Another Gallup Poll is taken in order to measure the proportion of adults who approve of attempts to clone humans. What sample size is necessary to be within 0.04 of the true proportion of adults who approve of attempts to clone humans with a 95% confidence interval?
Use p-hat = .5
$$.04 = 1.96\sqrt{\frac{.5(.5)}{n}}$$
Divide by 1.96
Square both sides
Round up on sample size: n = 600.25, so use n = 601
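The algebra above (divide by z*, square both sides, round up) can be sketched as follows. The function name `sample_size_for_proportion` is hypothetical, and the default `p_guess=0.5` is the worst-case choice described on the previous slide.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, confidence_level=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1 - p)/n) <= margin, solved for n and rounded up."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z_star / margin) ** 2 * p_guess * (1 - p_guess))

# Cloning poll: within 0.04 of the true proportion at 95% confidence
n_needed = sample_size_for_proportion(0.04)
print(n_needed)
```

Always round up, never to the nearest integer, because rounding down would give a margin of error slightly larger than requested.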
18. Hypothesis Tests: One Sample Proportions
19. Example 1: Julie and Megan wonder if heads and tails are equally likely if a penny is spun. They spin pennies 40 times and get 17 heads. Should they reject the standard that pennies land heads 50% of the time?
How can I tell if pennies really land heads 50% of the time?
A hypothesis test will help me decide!
But how do I know if this is one that I expect to happen or is it one that is unlikely to happen?
What is their sample proportion?
20. What are hypothesis tests?
- Calculations that tell us if a value occurs by random chance or not, that is, if it is statistically significant
- Is it . . .
  - a random occurrence due to variation?
  - a biased occurrence due to some other reason?
21. Nature of hypothesis tests
How does a murder trial work?
- First begin by supposing the effect is NOT present
- Next, see if data provides evidence against the supposition
- Example: murder trial
First, assume that the person is innocent. Then there must be sufficient evidence to prove guilt.
22. Steps
Notice the steps are the same, except we add hypothesis statements, which you will learn today
- Assumptions
- Hypothesis statements and define parameters
- Calculations
- Conclusion, in context
23. Assumptions for z-test
- Have an SRS from a binomial distribution
- Distribution is (approximately) normal
YES! These are the same assumptions as confidence intervals!!
Use the hypothesized parameter in the null hypothesis to check assumptions!
24. Example 1: Julie and Megan wonder if heads and tails are equally likely if a penny is spun. They spin pennies 40 times and get 17 heads. Should they reject the standard that pennies land heads 50% of the time?
Are the assumptions met?
- Binomial random sample
- 40(.5) > 10 and 40(1 - .5) > 10
- Infinite amount of spins > 10(40)
25. Writing Hypothesis statements
- Null hypothesis (H0) is the statement being tested; this is a statement of no effect or no difference
- Alternative hypothesis (Ha) is the statement that we suspect is true
26. The form
- Null hypothesis
  - H0: parameter = hypothesized value
- Alternative hypothesis
  - Ha: parameter ≠ hypothesized value
  - Ha: parameter > hypothesized value
  - Ha: parameter < hypothesized value
27. Example 1 Contd. Julie and Megan wonder if heads and tails are equally likely if a penny is spun. They spin pennies 40 times and get 17 heads. Should they reject the standard that pennies land heads 50% of the time? State the hypotheses.
H0: p = .5
Ha: p ≠ .5
Where p is the true proportion of heads
28. Example 2: A company is willing to renew its advertising contract with a local radio station only if the station can prove that more than 20% of the residents of the city have heard the ad and recognize the company's product. The radio station conducts a random sample of 400 people and finds that 90 have heard the ad and recognize the product. Is this sufficient evidence for the company to renew its contract? State the hypotheses.
H0: p = .2
Ha: p > .2
Where p is the true proportion that heard the ad.
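29. Formula for hypothesis test
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$$
where $p_0$ is the hypothesized value of p in the null hypothesis.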
30. Example 1 Contd. Test Statistic for Julie and Megan's Data
$$z = \frac{.425 - .5}{\sqrt{\frac{.5(.5)}{40}}} \approx -0.95$$
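A minimal sketch of the test statistic and two-sided P-value for the penny data (17 heads in 40 spins, hypothesized value .5). The function name `one_proportion_z_test` is hypothetical; the standard deviation uses the hypothesized value, not p-hat.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided one-proportion z-test; returns (p_hat, z, p_value)."""
    p_hat = successes / n
    # SD is computed from the hypothesized value p0, since H0 is assumed true
    standard_deviation = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_deviation
    # Two-tail test: double the area beyond |z|
    p_value = 2 * NormalDist().cdf(-abs(z))
    return p_hat, z, p_value

p_hat, z, p_value = one_proportion_z_test(17, 40, 0.5)
print(f"p-hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.3f}")
```

The unrounded P-value is about .343; the .342 quoted for this example comes from rounding z to -0.95 before looking up the tail area.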
31. P-values
- The probability that the test statistic would have a value as extreme or more extreme than what is actually observed, assuming the null hypothesis is true
32. Level of significance
- Is the amount of evidence necessary before we begin to doubt that the null hypothesis is true
- Is the probability that we will reject the null hypothesis, assuming that it is true
- Denoted by α
- Can be any value
- Usual values: 0.1, 0.05, 0.01
- Most common is 0.05
33. Statistically significant
- The p-value is as small or smaller than the level of significance (α)
- If p-value > α, fail to reject the null hypothesis at the α level.
- If p-value < α, reject the null hypothesis at the α level.
34. Facts about p-values
- ALWAYS make decision about the null hypothesis!
- Large p-values show support for the null hypothesis, but never that it is true!
- Small p-values show support that the null is not true.
- Double the p-value for two-tail (≠) tests
- Never accept the null hypothesis!
35. Never accept the null hypothesis!
Never accept the null hypothesis!
Never accept the null hypothesis!
36. At an α level of .05, would you reject or fail to reject H0 for the given p-values?
Reject
Fail to reject
Fail to reject
Reject
37. Writing Conclusions
- A statement of the decision being made (reject or fail to reject H0) and why (linkage)
AND
- A statement of the results in context (state in terms of Ha)
38. Since the p-value < (>) α, I reject (fail to reject) the H0. I do (do not) have statistically significant evidence to suggest that Ha.
Be sure to write Ha in context (words)!
39. Example 1 Contd. The Decision
P-value = .342
Compare the P-value to the alpha level: .342 > .05
Since the P-value is greater than the alpha level, I fail to reject that spinning a penny lands heads 50% of the time. I do not have statistically significant evidence to suggest that spinning a penny is anything other than fair.
40. What? You and Jeff spun your pennies and got 10 heads out of 40 spins? Well, that's not what Meg and I got. So what now?
41. You Decide: Joe and Jeff decide to test the same hypothesis but gather their own evidence. They spin pennies 40 times and get 10 heads. Should they reject the standard that pennies land heads 50% of the time?
42. But we DID reject!
We DID NOT reject!
BOTH OF THEM!!!
Who is Correct?
Conclusions are based on your data. It is important, however, to discuss possible ERRORS that could have been made.
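To see how the same hypotheses can lead to opposite decisions, here is a small sketch that runs both samples (17 heads and 10 heads, each out of 40 spins) through the same two-sided test at α = .05. The helper name `two_sided_p_value` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(heads, spins, p0=0.5):
    """P-value for H0: p = p0 versus Ha: p != p0, using the normal approximation."""
    z = (heads / spins - p0) / sqrt(p0 * (1 - p0) / spins)
    return 2 * NormalDist().cdf(-abs(z))

alpha = 0.05
for names, heads in (("Julie and Megan", 17), ("Joe and Jeff", 10)):
    p_value = two_sided_p_value(heads, 40)
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"{names}: p-value = {p_value:.4f} -> {decision}")
```

Each decision follows correctly from its own data, which is why the possibility of an error has to be discussed for both.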
43. Errors in Hypothesis Tests
Every time you make a decision there is a possibility that an error occurred.
44. ERRORS

                  H0 is True      H0 is False
Reject            Type I Error    Correct
Fail to Reject    Correct         Type II Error

Murder Trial Revisited

                       Actually Innocent   Actually Guilty
Decision: Guilty       Type I Error        Correct
Decision: Not Guilty   Correct             Type II Error
45. Type I Error
When you reject a null hypothesis when it is actually true.
Its probability is denoted by alpha (α), the level of significance of a test
46. Type II Error
When you fail to reject the null hypothesis when it is false
Its probability is denoted by beta (β)
47. Example 2 Revisited: A company is willing to renew its advertising contract with a local radio station only if the station can prove that more than 20% of the residents of the city have heard the ad and recognize the company's product. The radio station conducts a random sample of 400 people and finds that 90 have heard the ad and recognize the product. Is this sufficient evidence for the company to renew its contract?
48. Assumptions
- Have an SRS of people
- np = 400(.2) = 80 and n(1 - p) = 400(.8) = 320. Since both are greater than 10, this distribution is approximately normal.
- Population of people is at least 4000.
Use the parameter in the null hypothesis to check assumptions!
H0: p = .2
Ha: p > .2
where p is the true proportion of people who heard the ad
Use the parameter in the null hypothesis to calculate standard deviation!
$$z = \frac{.225 - .2}{\sqrt{\frac{.2(.8)}{400}}} = 1.25, \qquad \text{p-value} \approx .1056$$
Since the p-value > α, I fail to reject the null hypothesis. There is not sufficient evidence to suggest that the true proportion of people who heard the ad is greater than .2.
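A sketch of the calculation behind this conclusion, assuming the upper-tail one-proportion z-test. The slide does not state α, so the value 0.05 below is an assumption, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, heard_ad, p0 = 400, 90, 0.20
alpha = 0.05  # ASSUMPTION: the slide does not state the significance level

p_hat = heard_ad / n                          # sample proportion, 90/400
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)    # SD uses the null value p0
p_value = 1 - NormalDist().cdf(z)             # upper tail, because Ha: p > .2

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"p-hat = {p_hat:.3f}, z = {z:.2f}, p-value = {p_value:.4f} -> {decision}")
```

Because the alternative is one-sided, the P-value is a single tail area and is not doubled.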
49. What type of error could the radio station have made?
Type I OR Type II
50. Two-Sample Proportions Inference
51. Sampling Distributions for the difference in proportions
- When tossing pennies, the probability of the coin landing on heads is 0.5. However, when spinning the coin, the probability of the coin landing on heads is 0.4. Let's investigate.
- Looking at the sampling distribution of the difference in sample proportions:
- What is the mean of the difference in sample proportions (flip - spin)?
$$\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 = .5 - .4 = .1$$
- What is the standard deviation of the difference in sample proportions (flip - spin)?
$$\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}} = \sqrt{\frac{.5(.5)}{25} + \frac{.4(.6)}{25}} = .14$$
(taking n1 = n2 = 25, as implied by n1p1 = 12.5 and n2p2 = 10 below)
- Can the sampling distribution of difference in sample proportions (flip - spin) be approximated by a normal distribution?
Yes, since n1p1 = 12.5, n1(1 - p1) = 12.5, n2p2 = 10, n2(1 - p2) = 15, so all are at least 5.
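A small sketch of the mean and standard deviation of the difference (flip - spin). The sample sizes of 25 are not stated on the slide; they are inferred from n1p1 = 12.5 and n2p2 = 10, so treat them as an assumption. The function name is hypothetical.

```python
from math import sqrt

def diff_sampling_distribution(p1, n1, p2, n2):
    """Mean and SD of p1-hat minus p2-hat for two independent samples."""
    mean = p1 - p2
    # Variances of independent statistics add, even though we subtract the statistics
    sd = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return mean, sd

# flip: p1 = .5, spin: p2 = .4; n1 = n2 = 25 is an inferred assumption
mean, sd = diff_sampling_distribution(0.5, 25, 0.4, 25)
print(f"mean = {mean:.2f}, sd = {sd:.2f}")
```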
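52. Assumptions
- Two, independent SRSs from populations
- Populations at least 10n
- Normal approximation for both
53. Formula for confidence interval
$$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
The square root is the standard error! z* times the standard error is the margin of error!
Note: use p-hat when p is not known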
54. Example 1: At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What are the shape and standard error of the sampling distribution of the difference in the proportions of people with no visible scars between the two groups?
Since n1p1 = 259, n1(1 - p1) = 57, n2p2 = 94, n2(1 - p2) = 325 and all > 5, the distribution of the difference in proportions is approximately normal.
Using p-hat1 = 259/316 and p-hat2 = 94/419, the standard error is
$$\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}} \approx .0297$$
55. Example 1: At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is a 95% confidence interval for the difference in the proportion of people who had no visible scars between the plasma compress treatment and control groups?
56. Assumptions
- Have 2 independent SRS of burn patients
- Both distributions are approximately normal since n1p1 = 259, n1(1 - p1) = 57, n2p2 = 94, n2(1 - p2) = 325 and all > 5
- Population of burn patients is at least 7350.
Since these are all burn patients, we can add 316 + 419 = 735. If the populations are not the same, you MUST list them separately.
We are 95% confident that the true difference in the proportion of people who had no visible scars between the plasma compress treatment and control groups is between 53.7% and 65.4%.
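The interval above can be checked with a short sketch of the two-proportion z-interval. The function name `two_proportion_ci` is hypothetical, and the critical value is taken from `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence_level=0.95):
    """Return (standard_error, (low, high)) for p1 - p2 using unpooled p-hats."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z_star * standard_error
    diff = p1_hat - p2_hat
    return standard_error, (diff - margin, diff + margin)

# Burn center: 259 of 316 treated, 94 of 419 untreated had no visible scars
se, (low, high) = two_proportion_ci(259, 316, 94, 419)
print(f"SE = {se:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

For a confidence interval the sample proportions are not pooled, since nothing assumes the two population proportions are equal.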
57. Example 2: Suppose that researchers want to estimate the difference in proportions of people who are against the death penalty in Texas and in California. If the two sample sizes are the same, what size sample is needed to be within 2% of the true difference at 90% confidence?
$$.02 = 1.645\sqrt{\frac{.5(.5)}{n} + \frac{.5(.5)}{n}}$$
Since both n's are the same size, you have common denominators, so add!
n = 3383
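A sketch of the sample-size calculation for a difference in proportions with equal group sizes. It assumes the worst-case value .5 for both proportions and uses the table value z* = 1.645 so that the result matches the slide; the function name is hypothetical.

```python
from math import ceil

def sample_size_for_difference(margin, z_star, p_guess=0.5):
    """Per-group n so that z* * sqrt(2 * p(1 - p) / n) <= margin, rounded up."""
    return ceil((z_star / margin) ** 2 * 2 * p_guess * (1 - p_guess))

# Within .02 of the true difference at 90% confidence (table z* = 1.645)
n_per_group = sample_size_for_difference(0.02, 1.645)
print(n_per_group)
```

The factor of 2 comes from adding the two variance terms over the common denominator n.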
58. Example 3: Researchers comparing the effectiveness of two pain medications randomly selected a group of patients who had been complaining of a certain kind of joint pain. They randomly divided these people into two groups, and then administered the painkillers. Of the 112 people in the group who received medication A, 84 said this pain reliever was effective. Of the 108 people in the other group, 66 reported that pain reliever B was effective. (BVD, p. 435)
a) Construct separate 95% confidence intervals for the proportion of people who reported that the pain reliever was effective. Based on these intervals, how do the proportions of people who reported pain relief with medication A or medication B compare?
b) Construct a 95% confidence interval for the difference in the proportions of people who may find these medications effective.
CI for A = (.67, .83), CI for B = (.52, .70). Since the intervals overlap, it appears that there is no difference in the proportion of people who reported pain relief between the two medicines.
CI for the difference = (0.017, 0.261). Since zero is not in the interval, there is a difference in the proportion of people who reported pain relief between the two medicines.
SO which is correct?
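The two approaches can be put side by side in a short sketch, using the table value z* = 1.96. The function names are hypothetical.

```python
from math import sqrt

Z_STAR = 1.96  # table value for 95% confidence

def proportion_ci(x, n):
    """One-sample z-interval for a single proportion."""
    p_hat = x / n
    margin = Z_STAR * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def difference_ci(x1, n1, x2, n2):
    """Two-sample z-interval for p1 - p2 (unpooled)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    margin = Z_STAR * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

ci_a = proportion_ci(84, 112)               # medication A
ci_b = proportion_ci(66, 108)               # medication B
ci_diff = difference_ci(84, 112, 66, 108)   # A minus B

intervals_overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
zero_in_diff = ci_diff[0] <= 0 <= ci_diff[1]
print(f"A: ({ci_a[0]:.2f}, {ci_a[1]:.2f})  B: ({ci_b[0]:.2f}, {ci_b[1]:.2f})  overlap: {intervals_overlap}")
print(f"A - B: ({ci_diff[0]:.3f}, {ci_diff[1]:.3f})  contains zero: {zero_in_diff}")
```

The results disagree because checking for overlap effectively adds the two margins of error, while the standard error of a difference adds variances, which gives a smaller margin. The interval for the difference is the one designed for this comparison.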
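59. Hypothesis statements
- H0: p1 = p2
- Ha: p1 > p2
- Ha: p1 < p2
- Ha: p1 ≠ p2
Be sure to define both p1 and p2!
60. Since we assume that the population proportions are equal in the null hypothesis, the variances are equal. Therefore, we pool the variances!
$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
where $x_1$ and $x_2$ are the numbers of successes in the two samples.
61. Formula for Hypothesis test
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
p1 = p2, so . . . p1 - p2 = 0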
62. Example 4: A forest in Oregon has an infestation
of spruce moths. In an effort to control the
moth, one area has been regularly sprayed from
airplanes. In this area, a random sample of 495
spruce trees showed that 81 had been killed by
moths. A second nearby area receives no
treatment. In this area, a random sample of 518
spruce trees showed that 92 had been killed by
the moth. Do these data indicate that the
proportion of spruce trees killed by the moth is
different for these areas?
63. Assumptions
- Have 2 independent SRS of spruce trees
- Both distributions are approximately normal since n1p1 = 81, n1(1 - p1) = 414, n2p2 = 92, n2(1 - p2) = 426 and all > 5
- Population of spruce trees is at least 10,130.
H0: p1 = p2
Ha: p1 ≠ p2
where p1 is the true proportion of trees killed by moths in the treated area and p2 is the true proportion of trees killed by moths in the untreated area
P-value = 0.5547, α = 0.05
Since p-value > α, I fail to reject H0. There is not sufficient evidence to suggest that the proportion of spruce trees killed by the moth is different for these areas.
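A minimal sketch of the pooled two-proportion z-test behind this P-value. The function name `two_proportion_z_test` is hypothetical; the tail area comes from `statistics.NormalDist`, so the last digit may differ slightly from a table or calculator value.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test of H0: p1 = p2; returns (z, p_value)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pool the samples because H0 assumes one common proportion
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error   # hypothesized difference is 0
    p_value = 2 * NormalDist().cdf(-abs(z))  # double for the two-tail test
    return z, p_value

# Spruce moths: 81 of 495 killed in the treated area, 92 of 518 in the untreated area
z, p_value = two_proportion_z_test(81, 495, 92, 518)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```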