Title: Hypothesis Testing Using a Single Sample
1Chapter 10
- Hypothesis Testing Using a Single Sample
2Chapter 10 Hypothesis Testing
- An Example: A report released by the National
Association of Colleges and Employers stated that
the average salary for students graduating in
2006 with a degree in accounting was $45,656.
- Suppose that you are interested in
investigating whether the mean starting salary
for students graduating with an accounting degree
from your university this year is greater than
$45,656. A sample of n = 40 accounting graduates
from the university was selected, and it was found
that the sample mean starting salary is $45,958
with a standard deviation of $1,214.
- Is µ > 45,656 a reasonable conclusion?
310.1 Hypotheses and Test Procedures
- A test of hypotheses or test procedure is a
method for using sample data to decide between
two competing claims (hypotheses) about a
population characteristic.
- One hypothesis might be µ = 45,656, and the
other µ ≠ 45,656.
- Or one hypothesis might be µ = 45,656, and the
other µ > 45,656.
- We initially assume that a particular hypothesis
(the null hypothesis) is the correct one. Then we
consider the sample data, and if there is
convincing evidence against the null hypothesis,
we reject the null hypothesis in favor of the
competing hypothesis.
4Null and Alternative Hypotheses
- The null hypothesis, denoted by H0, is a claim
about a population characteristic that is
initially assumed to be true.
- The alternative hypothesis, denoted by Ha, is the
competing claim.
- The two possible conclusions are
- reject H0 if sample evidence strongly suggests
that H0 is false, or
- fail to reject H0 if the sample does not contain
such evidence.
- This is similar to the US judicial system:
- H0: The defendant is innocent versus
- Ha: The defendant is guilty
- Rejecting H0 means that the jury
finds the defendant guilty, while failing to
reject H0 means that the jury finds the defendant
not guilty.
5- The form of a null hypothesis is
- H0: population characteristic = hypothesized
value,
where the hypothesized value is a specific
number determined by the problem context.
- The alternative hypothesis has one of the
following three forms:
- Ha: population characteristic > hypothesized
value
- Ha: population characteristic < hypothesized
value
- Ha: population characteristic ≠ hypothesized
value
6- Example: Tennis Ball Diameter
- Because of variation in the manufacturing
process, tennis balls produced by a particular
machine do not have identical diameters. Suppose
the machine was originally calibrated to achieve
the design specification of a 3-in. diameter.
However, the manufacturer is now concerned that
the diameters no longer conform to this
specification. What is the sensible choice of
hypotheses?
- Let µ denote the true mean diameter of tennis
balls produced by the machine.
- H0: µ = 3 (the specification is being met, so
recalibration is unnecessary)
- Ha: µ ≠ 3 (the specification is not being met,
so recalibration is necessary)
7- Example: Light Bulb Lifetimes
- Kmart brand 60-W light bulbs state on the
package: Ave. Life 1000 Hr. People who purchase
this brand would be unhappy if the actual mean
lifetime is less than the advertised 1000 hours.
Suppose a sample of Kmart light bulbs is selected
and the lifetime for each bulb in the sample is
recorded. Choose the null and alternative
hypotheses for the test.
- Let µ denote the true mean lifetime of the
bulbs. People who purchase this brand would be
unhappy if µ is actually less than 1000 hours. The
sample results can then be used to test the
hypothesis µ = 1000 hours against the hypothesis
µ < 1000.
- H0: µ = 1000 versus Ha: µ < 1000
- H0 will be rejected only if sample
evidence strongly suggests that µ = 1000 is not
plausible.
810.2 Errors in Hypothesis Testing
- Just as a jury may reach the wrong
verdict in a trial, there is some chance that
using a test procedure with sample data may lead
us to a wrong conclusion about the population
characteristic.
- Two types of error might be made using a
hypothesis test:
- Type I error: rejecting H0 when H0 is true.
- Type II error: failing to reject H0 when H0 is
false.
9- Example: On-Time Arrival. The US Department of
Transportation reported that during a recent
period, 78.6% of all domestic passenger flights
arrived on time. Suppose that an airline decides
to offer its employees a bonus, in an upcoming
month, if the airline's proportion of on-time
flights exceeds the overall industry rate of
0.786. Let p be the true proportion of the
airline's flights that are on time during the
month. Set up a hypothesis test for p and discuss
the types of error.
- Solution: H0: p = .786 versus Ha: p > .786
- Type I error (reject a true H0): The airline
rewards its employees when in fact its true
proportion of on-time flights did not exceed
78.6%.
- Type II error (fail to reject a false H0): The
airline's employees do not receive a reward that
in fact they deserved.
10- Example: Slowing the Growth of Tumors
- A pharmaceutical company issued a press release
announcing that it had filed an application with
the Food and Drug Administration to begin
clinical trials of an experimental drug that had
been found to reduce the growth rate of
pancreatic and colon cancer tumors in animal
studies.
- Let µ denote the true mean growth rate of
tumors for patients receiving the experimental
drug. Set up a hypothesis test for µ.
- Solution: H0: µ = mean growth rate of tumors
for patients not taking the experimental drug
(the drug is not effective) versus
- Ha: µ < mean growth rate of tumors
for patients not taking the experimental drug
(the growth rate is reduced and the drug is
effective).
- Type I error (reject a true H0): Incorrectly
conclude that the drug is effective in slowing
the growth rate of tumors.
- Type II error (fail to reject a false H0):
Conclude that the experimental drug is
ineffective when in fact it does reduce the mean
growth rate of tumors.
11Definition: Level of Significance
- The probability of a Type I error is denoted by α
and is called the level of significance of the
test.
- For example, a test with α = .01 is said to have
a level of significance of .01.
- The probability of a Type II error is denoted by
β.
- After assessing the consequences of Type I and
Type II errors, identify the largest α that is
tolerable for the problem.
- Don't make α smaller than it needs to be.
- Other things being equal, a smaller α increases β.
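The trade-off between α and β can be illustrated numerically. The sketch below uses a simplified setting that is not taken from the slides: an upper-tailed z test with a known population standard deviation. All of the numbers (hypothesized mean 100, true mean 103, σ = 10, n = 25) and the function name type_ii_error are hypothetical, chosen only for illustration.

```python
import math
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """beta for an upper-tailed z test of H0: mu = mu0 when the true mean is mu_true."""
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)          # H0 is rejected when z > z_crit
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))
    return z.cdf(z_crit - shift)           # P(fail to reject H0 | mu = mu_true)


# Hypothetical values, for illustration only.
betas = {
    alpha: type_ii_error(alpha, mu0=100, mu_true=103, sigma=10, n=25)
    for alpha in (0.10, 0.05, 0.01)
}
for alpha, beta in betas.items():
    print(f"alpha = {alpha:.2f}  ->  beta = {beta:.3f}")
```

With the sample size and the true mean held fixed, each reduction in α raises the rejection cutoff, so β grows.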
12Example: Blood Test for Ovarian Cancer
- A new blood test has been developed that
appears to be able to identify ovarian cancer at
its earliest stage. In a report issued by the NCI
and the FDA, the following information is given:
- The test was given to 50 women known to have
ovarian cancer, and it correctly identified all
of them as having cancer.
- The test was given to 66 women known not to
have ovarian cancer, and it correctly identified
63 of them as being cancer free.
- Give an estimate of α and β.
Solution on next slide
13Solution to the Example: Blood Test for Ovarian
Cancer
- H0: woman has ovarian cancer versus Ha: woman
does not have ovarian cancer
- Type I error (reject H0 while it is in fact
true): Believing that a woman with ovarian cancer
is cancer free.
- Type II error (fail to reject H0 while it is
actually false): Believing that a woman who is
actually cancer free has ovarian cancer.
- Based on the preliminary evaluation of the
blood test, the estimated probability of a Type I
error is α = 0/50 = 0. (The 50 women with ovarian
cancer were all correctly identified, so there
were no errors.)
- The estimated probability of a Type II error is
β = 3/66 ≈ .045. (Among the 66 women known not to
have ovarian cancer, 3 were wrongly identified as
having ovarian cancer.)
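The two estimates are simple sample proportions. As a quick check of the arithmetic, the sketch below recomputes them from the counts reported in the example; the variable names are illustrative.

```python
# Counts reported in the example.
cancer_total, cancer_missed = 50, 0       # women with cancer, none wrongly called cancer free
healthy_total, healthy_flagged = 66, 3    # women without cancer, 66 - 63 = 3 wrongly flagged

# With H0: "woman has ovarian cancer", a Type I error is missing a real cancer
# and a Type II error is flagging a cancer-free woman.
alpha_hat = cancer_missed / cancer_total
beta_hat = healthy_flagged / healthy_total

print(f"estimated alpha = {alpha_hat:.3f}, estimated beta = {beta_hat:.3f}")
```

Because these estimates come from only 50 and 66 women, they are rough; an estimate of 0 for α does not mean a Type I error is impossible.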
14- Example: Lead in Tap Water
- Drinking water is considered unsafe if the mean
concentration of lead is 15 ppb (parts per
billion) or greater. The Environmental Protection
Agency (EPA) has indicated that 6% of public
water systems contained too much lead, and it
requires those communities to take corrective
action. One of the cited communities wants to
conduct a hypothesis test for the mean
concentration of lead in its tap water.
- Let µ = the mean concentration of lead in the
community's water system.
- H0: µ = 15 (mean lead concentration excessive
by EPA standard)
- versus Ha: µ < 15 (mean lead concentration at
an acceptable level)
- Type I error (reject a true H0): To conclude that
the water source meets EPA standards for lead
when in fact it does not.
- Type II error (fail to reject a false H0): To
conclude that the water does not meet EPA
standards when in fact it does.
- A Type I error results in a potentially serious
public health risk, so a small value of α such as
α = 0.01 could be selected. Of course, a smaller α
increases the risk of a Type II error, which also
has serious consequences for the community
(losing its water source).
1510.4 Hypothesis Tests for a Population Mean
- A test statistic is the function of sample data
on which a conclusion to reject or fail to reject
H0 is based.
- The P-value (observed significance level) is a
measure of inconsistency between the hypothesized
value for a population characteristic and the
observed sample. It is the probability, assuming
that H0 is true, of obtaining a test statistic
value at least as inconsistent with H0 as what
actually resulted.
- H0 should be rejected if P-value ≤ α.
- H0 should not be rejected if P-value > α.
16Finding P-Values for a t-Test (One-Tail)
17Finding P-Values for a t-Test (Two-Tailed)
- Find the P-value for each of the following t tests.
- Problem 1: Upper-tailed test, df = 25, t = 2.0.
- Problem 2: Lower-tailed test, n = 28, t = -4.2.
- Problem 3: Two-tailed test, df = 40, t = 1.7.
Solutions on next slide
18Find P-Values for a t-Test Using Appendix Table
4: Tail Areas for t-Curves (pages 709 to 711)
- The next slide contains a sample page of Appendix
Table 4 (the last page). Use the table to find
the P-values.
- Solutions to Problems 1, 2, and 3 on the
preceding slide:
- Problem 1: P-value (an upper-tailed t test) = the
area under the 25-df t curve to the right of 2.0
= the value in the t = 2.0 row under the 25-df
column in Appendix Table 4 = .028.
- Problem 2: P-value (a lower-tailed t test) = the
area under the 27-df t curve to the left of -4.2
= the area under the 27-df t curve to the right
of 4.2 = the value in the t = 4.2 row under the
27-df column in Appendix Table 4 ≈ 0. (4.2 > 4.0,
the largest t value in the table.)
- Problem 3: P-value (a two-tailed t test) = 2 ×
(the area under the 40-df t curve to the right of
1.7) = 2 × (the value in the t = 1.7 row under the
40-df column in Appendix Table 4) = 2 × .048 = .096.
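The same tail areas can be computed without the table. The sketch below is one way to do it with only the Python standard library, integrating the t density numerically with Simpson's rule; the helper names t_pdf and t_upper_tail are illustrative and not part of the slides. Results should agree with Appendix Table 4 up to the table's rounding.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Area under the df-degree t curve to the right of t (Simpson's rule)."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    h = t / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3             # half the curve lies right of 0


p1 = t_upper_tail(2.0, 25)        # Problem 1: upper-tailed, df = 25
p2 = t_upper_tail(4.2, 27)        # Problem 2: lower tail at -4.2 equals upper tail at 4.2
p3 = 2 * t_upper_tail(1.7, 40)    # Problem 3: two-tailed, df = 40

print(f"Problem 1: {p1:.3f}")
print(f"Problem 2: {p2:.5f}")
print(f"Problem 3: {p3:.3f}")
```

Problem 2 uses the symmetry of the t curve, the same step the table solution relies on.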
19Appendix Table 4: Tail Areas for t-Curves (sample page, table not transcribed)
20t-Test for a Population Mean
- Assumption: The sample size is large (generally
n ≥ 30), or the population distribution is at
least approximately normal.
- Null hypothesis: H0: µ = hypothesized value.
- Test statistic: t = (x̄ - hypothesized value) / (s / √n),
with df = n - 1, where x̄ is the sample mean and s
is the sample standard deviation.
- Alternative hypothesis and P-value:
- Ha: µ > hypothesized value: area in the upper tail
- Ha: µ < hypothesized value: area in the lower tail
- Ha: µ ≠ hypothesized value: sum of the areas in
the two tails
- Conclusion:
- Reject H0 if P-value ≤ significance level α, or
if P-value ≈ 0 (usually < .01) even if no
significance level is given.
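As an illustration of the procedure, the sketch below computes the test statistic for the accounting salary example from the start of the chapter (n = 40, sample mean 45,958, standard deviation 1,214, hypothesized value 45,656). The function name one_sample_t is illustrative. It assumes the standard one-sample t statistic, the sample mean minus the hypothesized value, divided by s / √n, with n - 1 degrees of freedom.

```python
import math


def one_sample_t(xbar, s, n, mu0):
    """Return (t, df) for a one-sample t test computed from summary statistics."""
    standard_error = s / math.sqrt(n)      # estimated standard deviation of the sample mean
    return (xbar - mu0) / standard_error, n - 1


t, df = one_sample_t(xbar=45958, s=1214, n=40, mu0=45656)
print(f"t = {t:.2f}, df = {df}")           # prints "t = 1.57, df = 39"
```

For Ha: µ > 45,656 the P-value is the area to the right of this t value under the 39-df t curve, which can then be compared with the chosen α.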
21Example: Time Stands Still (or So It Seems)
- A study conducted by researchers at PSU
investigated whether time perception, a simple
indication of a person's ability to concentrate,
is impaired during nicotine withdrawal. After
24 hours of smoking abstinence, 20 smokers were
asked to estimate how much time had passed during
a 45-sec period. The resulting data on perceived
elapsed time (in seconds) were shown on the
right.
- The researchers wanted to determine
whether smoking abstinence had a negative impact
on time perception, causing elapsed time to be
overestimated.
- Solution: Assumptions: Because the sample size is
only 20, we must be willing to assume that the
population distribution of perceived times is at
least approximately normal. The boxplot is not
too skewed and there are no outliers, so this
assumption is reasonable.
Continued on next slide
22Solution to the Example: Time Stands Still
- Let µ = mean perceived elapsed time for
smokers who have abstained from smoking for 24
hours.
- The researchers want to know if smoking
abstinence caused elapsed time to be
overestimated, so we use an upper-tailed test
with significance level α = .05.
- Null and alternative hypotheses: H0: µ = 45
versus Ha: µ > 45
- Test statistic: From the data, we can find the
sample mean and standard deviation, which give
t = 6.50.
- P-value: df = 20 - 1 = 19. Use the 19-df column
of Appendix Table 4 to find P-value = area to
the right of 6.50 ≈ 0 (because 6.50 > 4.0, the
largest tabulated value, and the area to the
right of 4.0 is already approximately 0).
- Conclusion: Reject H0 because P-value < α. There
is convincing evidence that the mean perceived
elapsed time is greater than 45 seconds, that is,
that elapsed time is overestimated.
23Example: Goofing Off at Work
- A growing concern of employers is time spent in
activities like surfing the Internet and emailing
friends during work hours. Suppose the CEO of a
large corporation wants to determine if her
employees spend less than 120 minutes a day on
personal use of company technology. A random
sample of 10 employees was contacted and the
resulting data (in minutes per day) are listed
below.
- Do the data provide evidence that the
mean wasted time for the company is less than 120
min? Carry out a hypothesis test with α = 0.05.
- Solution: Assumptions: Because the sample size is
only 10, we must be willing to assume that the
population distribution of wasted times is at
least approximately normal. Although the boxplot
reveals some skewness in the sample, there are no
outliers.
Continued on next slide
24Solution to the Example: Goofing Off at Work
- Let µ = mean daily wasted time for
employees of the company.
- The CEO wants to determine whether the mean
wasted time for the company is less than 120
minutes, so we use a lower-tailed test with
α = .05.
- Null and alternative hypotheses: H0: µ = 120
versus Ha: µ < 120
- Test statistic: The summary quantities of the
data give t = -1.1, with df = 10 - 1 = 9.
- P-value: From the 9-df column of Appendix
Table 4, we find P-value = area to the left of
-1.1 = area to the right of 1.1 = 0.150.
- Conclusion: Fail to reject H0 because P-value >
α. There is not sufficient evidence to conclude
that the mean wasted time at the company is less
than 120 minutes.
25Example: pH Level of Water
- A chemical plant claims that the mean pH level
of the water in a nearby river is 6.8. You
randomly select 19 water samples and measure the
pH value of each. The sample mean is 6.7 and the
sample standard deviation is 0.24. Is there enough
evidence to reject the chemical plant's claim at
α = 0.05? Assume the population is normally
distributed.
- Solution: Because the sample size is only 19,
the assumption that the population distribution
is normal is necessary.
- To test the claim that the mean pH level is
6.8, we use a two-tailed test with significance
level α = 0.05.
26Solution to the Example: pH Level of Water
- Let µ = mean pH level in the nearby
river. The sample mean and standard deviation are
already computed: x̄ = 6.7 and s = 0.24, with
n = 19.
- Null and alternative hypotheses: H0: µ = 6.8
versus Ha: µ ≠ 6.8
- Test statistic: t = (6.7 - 6.8) / (0.24 / √19)
≈ -1.8
- P-value: From the 18-df column (df = 19 - 1 = 18)
of Appendix Table 4, we find P-value = 2 × (area
to the right of 1.8) = 2 × 0.044 = 0.088 > α.
- Conclusion: Fail to reject H0 because P-value >
α. There is not sufficient evidence at the 5%
level of significance to reject the claim that
the mean pH level in the river is 6.8.
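The whole pH test can be reproduced from the summary statistics. The sketch below is a minimal version using only the Python standard library; it assumes the standard one-sample t statistic with n - 1 degrees of freedom, and the helper t_upper_tail (Simpson's rule on the t density) is an illustrative stand-in for Appendix Table 4. Because it uses the unrounded t value rather than 1.8, its P-value differs slightly from the table-based 0.088.

```python
import math


def t_upper_tail(t, df, steps=2000):
    """Area to the right of t >= 0 under the df-degree t curve (Simpson's rule)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps                          # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3


# Summary statistics from the pH example.
xbar, s, n, mu0, alpha = 6.7, 0.24, 19, 6.8, 0.05

t = (xbar - mu0) / (s / math.sqrt(n))      # test statistic
df = n - 1
p_value = 2 * t_upper_tail(abs(t), df)     # two-tailed: both tails count
reject = p_value <= alpha

print(f"t = {t:.2f}, df = {df}, P-value = {p_value:.3f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Taking abs(t) before doubling the tail area is what makes this a two-tailed test; for a one-tailed alternative only the single relevant tail would be used.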
27Exercise: Cricket Love
- The usual chirp rate for male field crickets
varied around a mean of 60 chirps per second. To
investigate if chirp rate was related to
nutritional status, investigators fed 32 male
crickets a high-protein diet for 8 days and
found that the mean chirp rate for these crickets
was 109 chirps per second with a standard
deviation s = 40. Is this convincing evidence
that the mean chirp rate for crickets on a
high-protein diet is greater than 60 chirps per
second? Carry out a hypothesis test with a
significance level α = .01.
Answer: There is convincing evidence that the
mean chirp rate is higher for male crickets that
eat a high-protein diet.
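The answer is stated without the supporting computation. A sketch of the arithmetic, assuming the usual one-sample t statistic with n - 1 degrees of freedom and an upper-tailed alternative (H0: µ = 60 versus Ha: µ > 60), is:

```python
import math

# Summary statistics from the exercise.
xbar, s, n, mu0 = 109, 40, 32, 60

t = (xbar - mu0) / (s / math.sqrt(n))      # (109 - 60) / (40 / sqrt(32))
df = n - 1

print(f"t = {t:.2f}, df = {df}")           # prints "t = 6.93, df = 31"
```

This t value is far beyond 4.0, the largest value tabulated in Appendix Table 4, so the upper-tail P-value is approximately 0 and H0 is rejected at α = .01, which agrees with the stated answer. Since n = 32 is at least 30, the large-sample assumption of the t test is met.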