Title: Chapter 21: More About Tests
1 Chapter 21: More About Tests
- "The wise man proportions his belief to the evidence." - David Hume, 1748
2 The Null Hypothesis
- The null must be a statement about the value of a parameter from a model
- The value for the parameter in the null hypothesis is found within the context of the problem
- Use this value to compute the probability that the observed sample statistic would occur
- The appropriate null arises from the context of the problem
- Think about the WHY of the situation
3 Another One-Proportion z-Test
- Parameter: the proportion of successful identifications
- Null: the therapeutic touch practitioners are just guessing, so they'll succeed about half the time (H0: p = 0.5)
- A one-sided test seems appropriate (HA: p > 0.5)
4 Another One-Proportion z-Test
- Check the conditions
- Independence: the hand choice was randomly selected, so the trials should be independent
- Randomization: the experiment was randomized by flipping a coin
- 10% condition: the experiment observes some of what could be an infinite number of trials
- Success/failure: the expected numbers of successes and failures under the null should both be at least 10
5 Another One-Proportion z-Test
- Because the conditions are satisfied, it is appropriate to model the sampling distribution of the proportion with a Normal model
- We can perform a one-proportion z-test
- State the null model
- Name the test
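6 Another One-Proportion z-Test
- Find the standard deviation of the sampling model using the hypothesized proportion, $p_0 = 0.5$: $SD(\hat{p}) = \sqrt{p_0 q_0 / n}$, where $q_0 = 1 - p_0$ and $n$ is the number of trials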
7 Another One-Proportion z-Test
- Sketch of Normal model
- Find the z-score
- Find the P-value
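
The steps on these slides (standard deviation from the hypothesized proportion, z-score, P-value) can be sketched in Python. This is a minimal illustration, not the deck's own calculation. The function name `one_proportion_z_test` is hypothetical, and the counts (70 successes in 150 trials) are an assumption chosen to be consistent with the 46.7% success rate and the P-value of .7929 quoted in this deck; the slides themselves do not state the sample size.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="greater"):
    """Return (p_hat, sd, z, p_value) for a one-proportion z-test.

    The standard deviation uses the hypothesized proportion p0,
    because the test assumes the null model is true.
    """
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd
    lower_tail = NormalDist().cdf(z)  # P(Z <= z) under the standard Normal
    if alternative == "greater":
        p_value = 1 - lower_tail
    elif alternative == "less":
        p_value = lower_tail
    elif alternative == "two-sided":
        p_value = 2 * min(lower_tail, 1 - lower_tail)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return p_hat, sd, z, p_value


# ASSUMED counts (not stated in the slides): 70 successes in 150 trials.
p_hat, sd, z, p_value = one_proportion_z_test(70, 150, p0=0.5)
print(f"p_hat = {p_hat:.3f}, SD = {sd:.4f}, z = {z:.3f}, P-value = {p_value:.4f}")
```

Under these assumed counts the z-score is negative (the observed proportion is below 50%), so the upper-tail P-value is large, close to 0.79.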
8 Another One-Proportion z-Test
- Conclusion
- Link the P-value to your decision about the null hypothesis
- State your conclusion in context
- If possible, state a course of action
- If the true proportion of successful detections of a human energy field is 50%, then an observed proportion of 46.7% successes or more would occur at random about 80% of the time.
- That is not a rare event, so we do not reject the null hypothesis
- There is insufficient evidence to conclude that the practitioners are performing better than they would have by guessing.
9 P-values
- A P-value is a conditional probability
- A P-value is the probability of getting a statistic at least as extreme as the one observed, given that the null hypothesis is true
- The P-value is not the probability that the null hypothesis is true
- A small P-value tells us that our data are rare given the null hypothesis
10 Alpha Levels
- Alpha level
- An arbitrarily set threshold for our P-value
- Also called the significance level
- Must be selected prior to looking at the data
- If our P-value falls below that point, we'll reject the null hypothesis
- The result is called statistically significant
- When we reject the null hypothesis, we say that the test is significant at that level
- Common alpha levels: .10, .05, .01
11 Therapeutic Touch Revisited
- The P-value was .7929
- This is well above any reasonable alpha level
- Therefore, we cannot reject the null hypothesis.
- Conclusion: we fail to reject the null hypothesis. There is insufficient evidence to conclude that the practitioners are performing better than if they were just guessing
12 Absolutes: Are You Uncomfortable?
- The reject/fail-to-reject decision when we use an alpha level is absolute
- If your P-value falls just slightly above the alpha level, you do not reject the null hypothesis. However, if your P-value falls just slightly below, you do reject the null hypothesis
- Perhaps it is better to report the P-value as an indicator of the strength of the evidence when making a decision
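
A short sketch of how absolute the fixed-alpha rule is. The helper `decide` is a hypothetical name. The value .7929 is the therapeutic touch P-value reported in this deck, while 0.049 and 0.051 are made-up borderline values used only to show how nearly identical evidence lands on opposite sides of the 0.05 threshold.

```python
def decide(p_value, alpha):
    """Fixed-alpha rule: reject H0 only when the P-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# P-value reported for the therapeutic touch test in this deck.
tt_p_value = 0.7929
for alpha in (0.10, 0.05, 0.01):
    print(f"P = {tt_p_value}, alpha = {alpha:.2f}: {decide(tt_p_value, alpha)}")

# HYPOTHETICAL borderline P-values: almost the same evidence, opposite decisions.
for p_value in (0.049, 0.051):
    print(f"P = {p_value}, alpha = 0.05: {decide(p_value, 0.05)}")
```

Reporting the P-value alongside the decision lets the reader see how close to the threshold the result actually was.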
13 Statistically Significant
- We mean that the test statistic has a P-value lower than our alpha level
- For large samples, even small deviations from the null hypothesis can be statistically significant
- When the sample is not large enough, even very large differences may not be statistically significant
- Report the magnitude of the difference between the statistic and the null hypothesis value when reporting the P-value
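
To illustrate how sample size drives statistical significance, the sketch below holds the observed difference fixed and varies only the sample size. All numbers here are hypothetical (an observed proportion of 0.52 against a null value of 0.50), and `two_sided_p_value` is an illustrative helper, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(p_hat, p0, n):
    """Two-sided P-value for a one-proportion z-test (Normal model)."""
    sd = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / sd
    return 2 * (1 - NormalDist().cdf(abs(z)))


# HYPOTHETICAL data: the same 2-point difference at three sample sizes.
for n in (100, 1_000, 10_000):
    p = two_sided_p_value(0.52, 0.50, n)
    print(f"n = {n:>6}: difference = 0.02, P-value = {p:.5f}")
```

The difference of 0.02 is the same in every row; only the P-value changes, which is why the size of the difference should be reported alongside it.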
14 Critical Values Again
- Critical values can be used as a shortcut for hypothesis tests
- Check your z-score against the critical values
- Any z-score larger in magnitude than a particular critical value has to be less likely, so it will have a P-value smaller than the corresponding probability
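
The critical values for the common alpha levels can be looked up from the standard Normal model. A minimal sketch using Python's standard library follows; `critical_value` is a hypothetical helper name.

```python
from statistics import NormalDist


def critical_value(alpha, two_sided=False):
    """z* such that the tail area beyond it (one or both tails) equals alpha."""
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


print("alpha   one-sided z*   two-sided z*")
for alpha in (0.10, 0.05, 0.01):
    one = critical_value(alpha)
    two = critical_value(alpha, two_sided=True)
    print(f"{alpha:.2f}    {one:.3f}          {two:.3f}")
```

For an upper-tailed test, a z-score above the one-sided z* has a P-value below the corresponding alpha; for a two-sided test, compare the magnitude of z with the two-sided z*.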
15 TT Revisited Again
- A 90% confidence interval would give $\hat{p} \pm 1.645\,SE(\hat{p})$
- We could not reject the null hypothesis because 50% is a plausible value for the practitioners' true success rate
- Any value outside the confidence interval would make a null hypothesis that we would reject; we'd feel more strongly about values far outside the interval
16 Confidence Intervals and Hypothesis Tests
- Confidence intervals and hypothesis tests have the same assumptions and conditions
- Because confidence intervals are naturally two-sided, they correspond to two-sided tests
- A confidence interval with a confidence level of C% corresponds to a two-sided hypothesis test with an α level of (100 - C)%
- A confidence interval with a confidence level of C% corresponds to a one-sided hypothesis test with an α level of ½(100 - C)%
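
As a worked instance of the correspondence between a confidence level C and the alpha level of the matching test, here it is for the three most common confidence levels:

```latex
\begin{align*}
\text{two-sided test: } & \alpha = 100\% - C\% \\
\text{one-sided test: } & \alpha = \tfrac{1}{2}\,(100\% - C\%) \\[4pt]
C = 90: \quad & \alpha_{\text{two-sided}} = 10\%, \quad \alpha_{\text{one-sided}} = 5\% \\
C = 95: \quad & \alpha_{\text{two-sided}} = 5\%,  \quad \alpha_{\text{one-sided}} = 2.5\% \\
C = 99: \quad & \alpha_{\text{two-sided}} = 1\%,  \quad \alpha_{\text{one-sided}} = 0.5\%
\end{align*}
```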
17 Click It or Ticket
- If there is evidence that fewer than 80% of drivers are buckling up, the campaign will continue
- Check conditions
- Independence: drivers are not likely to influence each other's seatbelt habits
- Randomization: we can assume that the drivers are representative of the driving public
- 10% condition: police stopped fewer than 10% of drivers
- Success/failure: there were 101 successes and 33 failures; both are greater than 10. The sample is large enough
- Use a one-proportion z-interval
18 Click It or Ticket
- To test the one-tailed hypothesis at the 5% level of significance, construct a 90% confidence interval
- Determine the standard error of the sample proportion and the margin of error
19 Click It or Ticket: Conclusion
- We can be 90% confident that between 69% and 81% of all drivers wear their seatbelts.
- Because the hypothesized rate of 80% is within this interval, we cannot reject the null hypothesis.
- There is insufficient evidence to conclude that fewer than 80% of all drivers are wearing seatbelts.
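
The interval quoted above can be reproduced from the counts given in the deck (101 belted drivers and 33 unbelted, so 134 stops). The following is a sketch of a one-proportion z-interval using Python's standard library; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Counts taken from the slides: 101 successes and 33 failures.
successes, failures = 101, 33
n = successes + failures
p_hat = successes / n

# Standard error uses the SAMPLE proportion (no hypothesized value is assumed).
se = sqrt(p_hat * (1 - p_hat) / n)

# 90% confidence leaves 5% in each tail, so z* is the 95th percentile.
z_star = NormalDist().inv_cdf(0.95)
margin = z_star * se
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.3f}, SE = {se:.4f}, ME = {margin:.4f}")
print(f"90% CI: ({lower:.3f}, {upper:.3f})")
print("0.80 inside interval:", lower <= 0.80 <= upper)
```

Because 0.80 lies inside the interval, the one-sided test at the 5% level does not reject the hypothesized 80% rate.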
20 Making Errors
- When we perform a hypothesis test, we can make mistakes in two ways
- The null hypothesis is true, but we reject it.
- The null hypothesis is false, but we fail to reject it.
21 Type I Errors
- Type I errors occur when the null hypothesis is true but we've had the bad luck to draw an unusual sample.
- To reject H0, the P-value must fall below α.
- When you choose level α, you're setting the probability of a Type I error.
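
One way to see that alpha controls the Type I error rate is to compute, under a true null, the exact chance that a z-test rejects. The sketch below assumes a hypothetical upper-tailed test with n = 150 trials and a null proportion of 0.5; neither the helper name `type_i_error_rate` nor the sample size comes from the slides. Because counts are discrete, the exact rate lands near alpha rather than exactly on it.

```python
from math import comb, sqrt
from statistics import NormalDist


def type_i_error_rate(n, p0, alpha):
    """Exact probability that an upper-tailed one-proportion z-test
    rejects H0 when H0 is true (success count is Binomial(n, p0))."""
    z_star = NormalDist().inv_cdf(1 - alpha)
    sd = sqrt(p0 * (1 - p0) / n)
    rate = 0.0
    for k in range(n + 1):
        z = (k / n - p0) / sd
        if z > z_star:  # this count would lead us to reject H0
            rate += comb(n, k) * p0**k * (1 - p0) ** (n - k)
    return rate


# HYPOTHETICAL setup: n = 150 trials, null proportion 0.5.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: exact Type I error rate = "
          f"{type_i_error_rate(150, 0.5, alpha):.4f}")
```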
22 Type II Errors
- When H0 is false and we fail to reject it, we have made a Type II error. The probability of this error is β.
- There is no single value for β. We can compute the probability β for any parameter value in HA.
- Think about effect size: how big a difference would matter?
23 Type I vs. Type II
- We can reduce β for all values in the alternative by increasing α.
- If we make it easier to reject the null hypothesis, we're more likely to reject it whether it's true or not
- However, we would make more Type I errors
- The only way to reduce both types of errors is to collect more data (a larger sample size).
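
The trade-off can be made concrete with a rough calculation of the Type II error probability under a Normal approximation. Everything specific here is an assumption for illustration: an upper-tailed test of a null proportion of 0.5, a true proportion of 0.6, sample sizes of 150 and 600, and the helper name `type_ii_error`.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(p0, p_true, n, alpha):
    """Approximate beta for an upper-tailed one-proportion z-test:
    the chance the sample proportion falls at or below the rejection
    cutoff when the true proportion is p_true."""
    std_normal = NormalDist()
    z_star = std_normal.inv_cdf(1 - alpha)
    cutoff = p0 + z_star * sqrt(p0 * (1 - p0) / n)  # reject when p_hat > cutoff
    sd_true = sqrt(p_true * (1 - p_true) / n)       # spread of p_hat under the truth
    return std_normal.cdf((cutoff - p_true) / sd_true)


# HYPOTHETICAL scenario: H0: p = 0.5, truth p = 0.6.
for alpha in (0.01, 0.05, 0.10):
    print(f"n = 150, alpha = {alpha:.2f}: beta = {type_ii_error(0.5, 0.6, 150, alpha):.3f}")

# Same alpha, four times the data: beta drops while alpha stays fixed.
print(f"n = 600, alpha = 0.05: beta = {type_ii_error(0.5, 0.6, 600, 0.05):.4f}")
```

Raising alpha lowers beta at a fixed sample size, while a larger sample lowers beta without giving up anything on alpha.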
24 Power
- Our ability to detect a false hypothesis is called the power of a test.
- When the null hypothesis is actually false, we want to know the likelihood that our test is strong enough to reject it.
- The power of a test is the probability that it correctly rejects a false null hypothesis.
25 Power
- β is the probability that a test fails to reject a false hypothesis, so the power of a test is 1 - β
- The value of power depends on how far the truth lies from the null hypothesis value.
- The distance between the null hypothesis value, p0, and the truth, p, is the effect size
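
For an upper-tailed one-proportion z-test, the dependence of power on the effect size can be written out under a Normal approximation. This is a sketch under that assumption rather than a formula taken from the slides; $z^{*}$ is the critical value for the chosen alpha level, $\Phi$ is the standard Normal cumulative distribution function, $q_0 = 1 - p_0$, and $q = 1 - p$.

```latex
\begin{align*}
\hat{p}^{*} &= p_0 + z^{*}\sqrt{\frac{p_0 q_0}{n}}
  && \text{(reject } H_0 \text{ when } \hat{p} > \hat{p}^{*}\text{)} \\
\text{power}(p) &= 1 - \beta(p)
  = P\left(\hat{p} > \hat{p}^{*} \mid p\right)
  \approx 1 - \Phi\!\left(\frac{\hat{p}^{*} - p}{\sqrt{p\,q/n}}\right)
\end{align*}
```

As the true $p$ moves further above $p_0$ (a larger effect size), the argument of $\Phi$ becomes more negative and the power rises toward 1.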
26 What Can Go Wrong???
- Don't change the null hypothesis after you look at the data.
- Don't base your alternative hypothesis on the data.
- Don't make what you want to show into your null hypothesis
- Don't interpret the P-value as the probability that H0 is true
27 What Can Go Wrong???
- Don't believe too strongly in arbitrary alpha values
- Don't confuse practical and statistical significance
- Despite all precautions, errors (Type I or II) may occur
- Always check the conditions