Title: Section 8.2
1Section 8.2
- Significance Tests About
- Proportions
2Example Are Astrologers Predictions Better
Than Guessing?
- Scientific test of astrology experiment
- For each of 116 adult volunteers, an astrologer
prepared a horoscope based on the positions of
the planets and the moon at the moment of the
person's birth
- Each adult subject also filled out a California
Personality Index Survey
3Example Are Astrologers Predictions Better
Than Guessing?
- For a given adult, his or her birth data and
horoscope were shown to an astrologer together
with the results of the personality survey for
that adult and for two other adults randomly
selected from the group
- The astrologer was asked which personality chart
of the 3 subjects was the correct one for that
adult, based on his or her horoscope
4Example Are Astrologers Predictions Better
Than Guessing?
- 28 astrologers were randomly chosen to take part
in the experiment
- The National Council for Geocosmic Research
claimed that the probability of a correct guess
on any given trial in the experiment was larger
than 1/3, the value for random guessing
5Example Are Astrologers Predictions Better
Than Guessing?
- Put this investigation in the context of a
significance test by stating null and alternative
hypotheses
6Example Are Astrologers Predictions Better
Than Guessing?
- With random guessing, p = 1/3
- The astrologers' claim: p > 1/3
- The hypotheses for this test:
- H0: p = 1/3
- Ha: p > 1/3
7What Are the Steps of a Significance Test about a
Population Proportion?
- Step 1 Assumptions
- The variable is categorical
- The data are obtained using randomization
- The sample size is sufficiently large that the
sampling distribution of the sample proportion is
approximately normal
- np ≥ 15 and n(1-p) ≥ 15, with p set to the null hypothesis value
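The sample size condition in Step 1 can be checked mechanically. The following is a minimal sketch; the function name `large_sample_ok` and the second call (n = 30) are illustrative assumptions, not part of the slides.

```python
def large_sample_ok(n, p0, minimum=15):
    """Check that expected successes and failures under H0 are both >= minimum."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return expected_successes >= minimum and expected_failures >= minimum


# Astrology experiment: n = 116 trials, null value p0 = 1/3
print(large_sample_ok(116, 1 / 3))

# Hypothetical smaller study: only 30 * (1/3) = 10 expected successes
print(large_sample_ok(30, 1 / 3))
```

The first call satisfies the condition and the second does not, because 10 expected successes falls short of 15.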
8What Are the Steps of a Significance Test about a
Population Proportion?
- Step 2 Hypotheses
- The null hypothesis has the form
- H0: p = p0
- The alternative hypothesis has the form
- Ha: p > p0 (one-sided test) or
- Ha: p < p0 (one-sided test) or
- Ha: p ≠ p0 (two-sided test)
9What Are the Steps of a Significance Test about a
Population Proportion?
- Step 3 Test Statistic
- The test statistic measures how far the sample
proportion falls from the null hypothesis value,
p0, relative to what we'd expect if H0 were true
- The test statistic is
z = (p̂ - p0) / se0, where se0 = sqrt(p0(1 - p0)/n)
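As a rough illustration of Step 3, the z statistic divides the distance between the sample proportion and the null value by the standard error computed under H0. The function name and the example numbers (60 successes in 100 trials, p0 = 0.50) are hypothetical.

```python
import math


def z_statistic(p_hat, p0, n):
    """z = (p_hat - p0) / se0, with se0 the standard error when H0 is true."""
    se0 = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se0


# Hypothetical data: sample proportion 0.60 from n = 100, null value 0.50
z = z_statistic(0.60, 0.50, 100)
print(round(z, 2))
```

Here se0 = 0.05, so the sample proportion falls about 2 standard errors above the null value.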
10What Are the Steps of a Significance Test about a
Population Proportion?
- Step 4 P-value
- The P-value summarizes the evidence
- It describes how unusual the data would be if H0
were true
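For a one-sided alternative, the P-value is a single tail probability under the standard normal curve. The sketch below uses only Python's standard library (`math.erfc`); the function names are assumptions made for illustration.

```python
import math


def right_tail(z):
    """P(Z >= z) for a standard normal Z: the P-value for Ha: p > p0."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def left_tail(z):
    """P(Z <= z) for a standard normal Z: the P-value for Ha: p < p0."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


# A z statistic of 0 is exactly what H0 predicts, so each tail is 0.5
print(right_tail(0.0), left_tail(0.0))

# A z statistic well above 0 gives a small right-tail P-value
print(round(right_tail(2.0), 4))
```

The further z falls in the direction Ha predicts, the smaller the P-value and the more unusual the data would be if H0 were true.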
11What Are the Steps of a Significance Test about a
Population Proportion?
- Step 5 Conclusion
- We summarize the test by reporting and
interpreting the P-value
12Example Are Astrologers Predictions Better
Than Guessing?
- Step 1 Assumptions
- The data are categorical: each prediction falls
in the category "correct" or "incorrect"
prediction
- Each subject was identified by a random number.
Subjects were randomly selected for each
experiment.
- np = 116(1/3) > 15
- n(1-p) = 116(2/3) > 15
13Example Are Astrologers Predictions Better
Than Guessing?
- Step 2 Hypotheses
- H0: p = 1/3
- Ha: p > 1/3
14Example Are Astrologers Predictions Better
Than Guessing?
- Step 3 Test Statistic
- In the actual experiment, the astrologers were
correct with 40 of their 116 predictions (a
success rate of 0.345)
15Example Are Astrologers Predictions Better Than
Guessing?
- Step 4 P-value
- The P-value is 0.40
16Example Are Astrologers Predictions Better Than
Guessing?
- Step 5 Conclusion
- The P-value of 0.40 is not especially small
- It does not provide strong evidence against H0: p = 1/3
- There is not strong evidence that astrologers
have special predictive powers
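The five steps for the astrology example can be retraced numerically. This sketch takes the counts reported in the slides (40 correct out of 116, null value 1/3) and recomputes the test statistic and the one-sided P-value with the standard library; the variable names are illustrative.

```python
import math

n = 116          # number of predictions
correct = 40     # correct predictions
p0 = 1 / 3       # probability of a correct guess under random guessing

p_hat = correct / n
se0 = math.sqrt(p0 * (1 - p0) / n)            # standard error under H0
z = (p_hat - p0) / se0
p_value = 0.5 * math.erfc(z / math.sqrt(2))   # right tail, since Ha: p > 1/3

print(round(p_hat, 3), round(z, 2), round(p_value, 2))
```

The sample proportion is only about a quarter of a standard error above 1/3, which is why the P-value comes out near 0.40.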
17How Do We Interpret the P-value?
- A significance test analyzes the strength of the
evidence against the null hypothesis
- We start by presuming that H0 is true
- The burden of proof is on Ha
18How Do We Interpret the P-value?
- The approach used in hypothesis testing is called
a proof by contradiction
- To convince ourselves that Ha is true, we must
show that the data contradict H0
- If the P-value is small, the data contradict H0
and support Ha
19Two-Sided Significance Tests
- A two-sided alternative hypothesis has the form
Ha: p ≠ p0
- The P-value is the two-tail probability under the
standard normal curve
- We calculate this by finding the tail probability
in a single tail and then doubling it
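The doubling rule can be written in a few lines. This is a minimal sketch using the standard library; the function name `two_sided_p_value` is an assumption for illustration.

```python
import math


def two_sided_p_value(z):
    """Find the probability in one tail beyond |z|, then double it."""
    one_tail = 0.5 * math.erfc(abs(z) / math.sqrt(2))
    return 2 * one_tail


# The sign of z does not matter for a two-sided test
print(round(two_sided_p_value(1.96), 3))
print(round(two_sided_p_value(-1.96), 3))
```

Taking the absolute value of z first means the same P-value results whether the sample proportion falls above or below p0.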
20Example Dr Dog Can Dogs Detect Cancer by
Smell?
- Study: investigate whether dogs can be trained
to distinguish a patient with bladder cancer by
smelling compounds released in the patient's urine
21Example Dr Dog Can Dogs Detect Cancer by
Smell?
- Experiment
- Each of 6 dogs was tested with 9 trials
- In each trial, one urine sample from a bladder
cancer patient was randomly placed among 6 control
urine samples
22Example Dr Dog Can Dogs Detect Cancer by
Smell?
- Results
- In a total of 54 trials with the six dogs, the
dogs made the correct selection 22 times (a
success rate of 0.407)
23Example Dr Dog Can Dogs Detect Cancer by
Smell?
- Does this study provide strong evidence that the
dogs' predictions were better or worse than with
random guessing?
24Example Dr Dog Can Dogs Detect Cancer by
Smell?
- Step 1 Check the sample size requirement
- Is the sample size sufficiently large to use the
hypothesis test for a population proportion?
- Is np0 ≥ 15 and n(1-p0) ≥ 15?
- 54(1/7) = 7.7 and 54(6/7) = 46.3
- The first, np0, is not large enough
- We will see that the two-sided test is robust
when this assumption is not satisfied
25Example Dr Dog Can Dogs Detect Cancer by
Smell?
- Step 2 Hypotheses
- H0: p = 1/7
- Ha: p ≠ 1/7
26Example Dr Dog Can Dogs Detect Cancer by
Smell?
27Example Dr Dog Can Dogs Detect Cancer by
Smell?
28Example Dr Dog Can Dogs Detect Cancer by
Smell?
- Step 5 Conclusion
- Since the P-value is very small and the sample
proportion is greater than 1/7, the evidence
strongly suggests that the dogs' selections are
better than random guessing
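The test statistic and P-value behind this conclusion can be recomputed from the reported counts (22 correct selections in 54 trials, null value 1/7). The numbers produced below are calculated here from those counts and are not quoted from the slides; the variable names are illustrative.

```python
import math

n = 54        # total trials across the six dogs
correct = 22  # correct selections
p0 = 1 / 7    # one cancer sample among seven under random guessing

p_hat = correct / n
se0 = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
z = (p_hat - p0) / se0

# Two-sided alternative: double the single-tail probability beyond |z|
p_value = 2 * 0.5 * math.erfc(abs(z) / math.sqrt(2))

print(round(p_hat, 3), round(z, 2), p_value < 0.001)
```

The sample proportion falls more than 5 standard errors above 1/7, so the two-sided P-value is far below 0.001.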
29Summary of P-values for Different Alternative
Hypotheses
30The Significance Level Tells Us How Strong the
Evidence Must Be
- Sometimes we need to make a decision about
whether the data provide sufficient evidence to
reject H0
- Before seeing the data, we decide how small the
P-value would need to be to reject H0
- This cutoff point is called the significance
level
31The Significance Level Tells Us How Strong the
Evidence Must Be
32Significance Level
- The significance level is a number such that we
reject H0 if the P-value is less than or equal to
that number
- In practice, the most common significance level
is 0.05
- When we reject H0 we say the results are
statistically significant
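The decision rule is a single comparison. A minimal sketch, with a hypothetical function name `decide` and the common default level of 0.05:

```python
def decide(p_value, alpha=0.05):
    """Reject H0 when the P-value is less than or equal to the significance level."""
    if p_value <= alpha:
        return "reject H0"
    return "do not reject H0"


print(decide(0.40))   # a P-value of 0.40 is not small enough
print(decide(0.036))  # a P-value of 0.036 is statistically significant at 0.05
```

Note that the wording for a large P-value is "do not reject H0", not "accept H0".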
33Possible Decisions in a Test with Significance
Level 0.05
34Report the P-value
- Learning the actual P-value is more informative
than learning only whether the test is
statistically significant at the 0.05 level
- The P-values of 0.01 and 0.049 are both
statistically significant in this sense, but the
first P-value provides much stronger evidence
against H0 than the second
35Do Not Reject H0 Is Not the Same as Saying
Accept H0
- Analogy: Legal trial
- Null Hypothesis: Defendant is innocent
- Alternative Hypothesis: Defendant is guilty
- If the jury acquits the defendant, this does not
mean that it accepts the defendant's claim of
innocence
- Innocence is plausible, because guilt has not
been established beyond a reasonable doubt
36One-Sided vs Two-Sided Tests
- Things to consider in deciding on the alternative
hypothesis:
- The context of the real problem
- In most research articles, significance tests use
two-sided P-values
- Confidence intervals are two-sided
37The Binomial Test for Small Samples
- The test about a proportion assumes normal
sampling distributions for p̂ and the z test
statistic.
- It is a large-sample test that requires that the
expected numbers of successes and failures be at
least 15. In practice, the large-sample z test
still performs quite well for two-sided
alternatives even for small samples.
- Warning: For one-sided tests, when p0 differs
from 0.50, the large-sample test does not work
well for small samples
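For small samples, a one-sided P-value can be computed exactly from the binomial distribution instead of the normal approximation. The sketch below sums binomial probabilities with the standard library; the function name and the data (8 successes in 10 trials, p0 = 0.5) are hypothetical.

```python
from math import comb


def binomial_upper_tail(x, n, p0):
    """Exact P(X >= x) for X ~ Binomial(n, p0): the P-value for Ha: p > p0."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x, n + 1))


# Hypothetical small study: 8 successes in 10 trials, H0: p = 0.5
p_value = binomial_upper_tail(8, 10, 0.5)
print(round(p_value, 4))
```

With n = 10 the expected counts under H0 are only 5 and 5, so this exact calculation is the safer choice for a one-sided test.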
38 Section 8.3
- Significance Tests about Means
39What Are the Steps of a Significance Test about a
Population Mean?
- Step 1 Assumptions
- The variable is quantitative
- The data are obtained using randomization
- The population distribution is approximately
normal. This is most crucial when n is small and
Ha is one-sided.
40What Are the Steps of a Significance Test about a
Population Mean?
- Step 2 Hypotheses
- The null hypothesis has the form
- H0: µ = µ0
- The alternative hypothesis has the form
- Ha: µ > µ0 (one-sided test) or
- Ha: µ < µ0 (one-sided test) or
- Ha: µ ≠ µ0 (two-sided test)
41What Are the Steps of a Significance Test about a
Population Mean?
- Step 3 Test Statistic
- The test statistic measures how far the sample
mean falls from the null hypothesis value µ0
relative to what we'd expect if H0 were true
- The test statistic is
t = (x̄ - µ0) / se, where se = s / sqrt(n)
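As a rough illustration of Step 3 for a mean, the t statistic divides the distance between the sample mean and µ0 by the estimated standard error s / sqrt(n). The function name and the example numbers are hypothetical.

```python
import math


def t_statistic(sample_mean, mu0, s, n):
    """t = (sample_mean - mu0) / se, where se = s / sqrt(n)."""
    se = s / math.sqrt(n)
    return (sample_mean - mu0) / se


# Hypothetical data: mean 52, standard deviation 10, n = 25, null value 50
t = t_statistic(52.0, 50.0, 10.0, 25)
print(t, "with df =", 25 - 1)
```

The statistic is referred to a t distribution with n - 1 degrees of freedom to obtain the P-value.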
42What Are the Steps of a Significance Test about a
Population Mean?
- Step 4 P-value
- The P-value summarizes the evidence
- It describes how unusual the data would be if H0
were true
43What Are the Steps of a Significance Test about a
Population Mean?
- Step 5 Conclusion
- We summarize the test by reporting and
interpreting the P-value
44Summary of P-values for Different Alternative
Hypotheses
45Example Mean Weight Change in Anorexic Girls
- A study compared different psychological
therapies for teenage girls suffering from
anorexia
- The variable of interest was each girl's weight
change = weight at the end of the study minus
weight at the beginning of the study
46Example Mean Weight Change in Anorexic Girls
- One of the therapies was cognitive therapy
- In this study, 29 girls received the therapeutic
treatment
- The weight changes for the 29 girls had a sample
mean of 3.00 pounds and standard deviation of
7.32 pounds
47Example Mean Weight Change in Anorexic Girls
48Example Mean Weight Change in Anorexic Girls
- How can we frame this investigation in the
context of a significance test that can detect a
positive or negative effect of the therapy?
- Null hypothesis: no effect
- Alternative hypothesis: therapy has some
effect
49Example Mean Weight Change in Anorexic Girls
- Step 1 Assumptions
- The variable (weight change) is quantitative
- The subjects were a convenience sample, rather
than a random sample. The question is whether
these girls are a good representation of all
girls with anorexia.
- The population distribution is approximately
normal
50Example Mean Weight Change in Anorexic Girls
- Step 2 Hypotheses
- H0: µ = 0
- Ha: µ ≠ 0
51Example Mean Weight Change in Anorexic Girls
52Example Mean Weight Change in Anorexic Girls
- Step 4 P-value
- Minitab Output
Test of mu = 0 vs not = 0
Variable   N    Mean   StDev   SE Mean   95% CI               T     P
wt_chg     29   3.000  7.3204  1.3594    (0.21546, 5.78454)   2.21  0.036
53Example Mean Weight Change in Anorexic Girls
- Step 5 Conclusion
- The small P-value of 0.036 provides considerable
evidence against the null hypothesis (the
hypothesis that the therapy had no effect)
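The t statistic and two-sided P-value for this example can be approximated from the summary statistics alone (n = 29, mean 3.00, standard deviation 7.32). Python's standard library has no t distribution, so this sketch integrates the t density numerically with Simpson's rule; that numerical method and the function names are assumptions made for illustration, not how Minitab computes its output.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 minus the area between -|t| and |t|."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_b = total * h / 3
    return 1 - 2 * area_0_to_b


n, mean, s, mu0 = 29, 3.00, 7.32, 0.0
se = s / math.sqrt(n)
t = (mean - mu0) / se
p_value = t_two_sided_p(t, df=n - 1)

print(round(se, 3), round(t, 2), round(p_value, 3))
```

The results should agree with the reported values of t = 2.21 and P = 0.036 up to rounding.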
54Example Mean Weight Change in Anorexic Girls
- The therapy had a statistically significant
positive effect on weight (mean change = 3
pounds, n = 29, t = 2.21, P-value = 0.04)
- The effect, however, may be small in practical
terms
- 95% CI for µ: (0.2, 5.8) pounds
55Results of Two-Sided Tests and Results of
Confidence Intervals Agree
- Conclusions about means using two-sided
significance tests are consistent with
conclusions using confidence intervals
- If P-value ≤ 0.05 in a two-sided test, a 95%
confidence interval does not contain the H0 value
- If P-value > 0.05 in a two-sided test, a 95%
confidence interval does contain the H0 value
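The agreement between the two-sided test and the confidence interval can be checked on the weight-change summary statistics (n = 29, mean 3.00, standard deviation 7.32). The sketch assumes the critical value 2.048 for df = 28, taken from a standard t table; the variable names are illustrative.

```python
import math

n, mean, s = 29, 3.00, 7.32
mu0 = 0.0
t_crit = 2.048  # assumed t-table value t_.025 for df = 28

se = s / math.sqrt(n)
t = (mean - mu0) / se

# 95% confidence interval for the population mean
lower = mean - t_crit * se
upper = mean + t_crit * se

ci_excludes_mu0 = not (lower <= mu0 <= upper)
rejects_at_05 = abs(t) > t_crit  # same as two-sided P-value below 0.05

print(round(lower, 2), round(upper, 2), ci_excludes_mu0, rejects_at_05)
```

Both checks compare the distance from the sample mean to µ0 against the same margin, t_crit times se, which is why they give the same answer.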
56What If the Population Does Not Satisfy the
Normality Assumption
- For large samples (roughly 30 or more) this
assumption is usually not important
- The sampling distribution of x̄ is approximately
normal regardless of the population distribution
57What If the Population Does Not Satisfy the
Normality Assumption
- In the case of small samples, we cannot assume
that the sampling distribution of x̄ is
approximately normal
- Two-sided inferences using the t distribution are
robust against violations of the normal
population assumption
- They still usually work well if the actual
population distribution is not normal
58Regardless of Robustness, Look at the Data
- Whether n is small or large, you should look at
the data to check for severe skew or for severe
outliers
- In these cases, the sample mean could be a
misleading measure