Provided by: unt89
Learn more at: http://www.math.unt.edu
1
Math 3680 Lecture 7 The Sign Test and
the Binomial Exact Test
2
The Sign Test
3
  • Example: For this data set, there are 14 pairs in
    which there is a difference in the two measured
    amounts. Let K = the number of pairs in which
    the first method returned a higher amount. We
    choose α = 0.05. We observe that there are 12
    pairs where the first method returns a higher
    value. This gives the value of the test statistic
    k_s = 12.
  • Data, as (first method, second method) pairs:
  • (79.2, 74.0) (96.8, 95.8)
  • (105.8, 97.8) (76.0, 75.0) (99.2, 98.0)
  • (99.5, 96.2) (69.5, 67.5) (99.2, 99.0)
  • (100.0, 101.8) (23.5, 21.2) (91.0, 100.2)
  • (93.8, 88.0) (95.2, 94.8) (72.0, 67.5)
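
The test statistic in this example can be recounted from the data. The sketch below assumes the numbers in the table are read as adjacent (first method, second method) pairs, which is an interpretation of the slide layout rather than something the transcript states; the variable names are illustrative.

```python
# Paired measurements, read as adjacent (first method, second method) pairs.
# ASSUMPTION: this pairing is inferred from the slide's flattened table.
pairs = [
    (79.2, 74.0), (96.8, 95.8),
    (105.8, 97.8), (76.0, 75.0), (99.2, 98.0),
    (99.5, 96.2), (69.5, 67.5), (99.2, 99.0),
    (100.0, 101.8), (23.5, 21.2), (91.0, 100.2),
    (93.8, 88.0), (95.2, 94.8), (72.0, 67.5),
]

# The sign test only uses pairs in which the two amounts differ.
differing = [(a, b) for a, b in pairs if a != b]
n = len(differing)

# k_s = number of pairs where the first method returned the higher amount.
k_s = sum(1 for a, b in differing if a > b)

print(n, k_s)  # 14 12
```

Under this reading the count agrees with the slide: 12 of the 14 pairs favor the first method.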

4
  • Let's play devil's advocate and assume the null
    hypothesis is correct. That is, let's assume
    that p = 0.5, and let's work through the logical
    ramifications of this assumption.
  • If the null hypothesis is correct, what's the
    probability of obtaining a value of K at least
    as extreme as the observed test statistic? This
    probability is called the P-value, or the
    observed level of significance.

P-value = P(K ≥ 12) = 1 - P(K ≤ 11)
Excel: =1 - BINOMDIST(11,14,0.5,1)
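
For readers without Excel, the same calculation can be sketched in Python using only the standard library. The helper name `binom_cdf` is made up for this illustration; it is meant to play the role of the cumulative form of BINOMDIST.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(K <= k) for K ~ Binomial(n, p), like BINOMDIST(k, n, p, 1) in Excel."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# One-sided P-value for the sign test: P(K >= 12) with n = 14 and p = 0.5.
p_value = 1 - binom_cdf(11, 14, 0.5)

print(round(p_value, 8))  # 0.00646973
```

The result is 106/16384, which is a little under 1 chance in 150.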
5
  • In other words, if we assume the null hypothesis,
    we also have to accept the fact that there's
    less than 1 chance in 150 of obtaining a test
    statistic this large or larger.
  • We then ask the question: which is more
    plausible? In this case, the P-value is less than
    the stipulated significance level of α = 0.05.
    It's more plausible to reject the null hypothesis
    in favor of the alternative hypothesis.
  • Conclusion (written in plain English):
  • We reject the null hypothesis. There is good
    reason to believe that the first method returns a
    higher amount than the second method.

6
  • Summary of algorithm for hypothesis testing:
  • H0: The first method does not return higher
    values than the second method (p = 0.5)
  • Ha: The first method returns higher values than
    the second method (p > 0.5) (one-sided test)
  • We choose α = 0.05
  • Test statistic: k_s = 12 for a sample size of n =
    14
  • P-value = 0.00646973
  • Conclusion: We reject the null hypothesis. There
    is good reason to believe that the first method
    returns a higher amount than the second method.

7
  • Notes.
  • Note 1. Notice we have not proven beyond a
    shadow of a doubt that the first method returns a
    higher value than the second method. Is it
    possible for 14 fair coins to land so that 12 or
    more are heads? Yes. In other words, we may
    simply have had a run of luck.
  • However, we can reasonably justify our rejection
    of the null hypothesis.

8
  • Note 2. The alternative hypothesis is p > 0.5.
    It is not that p = 12/14. Good practice is to
    state the null and alternative hypotheses (and
    select α) before looking at the data.
  • Note 3. Small P-values are evidence against the
    null hypothesis; they indicate that something
    besides chance is at work.

9
  • Note 4. If P < 5%, the result is often called
    statistically significant.
  • If P < 1%, the result is called highly
    statistically significant.
  • These phrases are often used in media reports on
    scientific progress, especially breakthroughs in
    medical research.

10
After dropping for years, teen smoking in the U.S. has leveled off
Monday, June 12, 2006; Posted 10:14 a.m. EDT (14:14 GMT)
ATLANTA, Georgia (AP) -- The long, steady decline in teen
smoking in the United States since the late
1990s appears to have come to a standstill,
health officials said Friday. A survey released
this week showed that smoking among high school
students held steady at around one in four
teenagers between 2003 and 2005. Two other
surveys in the past year or so found that teen
smoking has apparently plateaued since 2002. "We
were making good progress, and now it looks like
we're not," said Dr. Corinne Husten, acting
director of the Office on Smoking and Health at
the Centers for Disease Control and
Prevention. The trend was outlined in the CDC's
National Youth Risk Behavior Survey, which is
conducted every other year and involves about
14,000 high school students across the country.
The results of the latest survey were released
last week. The survey had been showing a steady
and pronounced decline in youth smoking since
1997, when more than 36 percent of students said
they had smoked in the previous 30 days. The
percentages dropped to about 35 in 1999, 28.5 in
2001 and 22 in 2003. But when students were
asked the question last spring, 23 percent said
they had smoked. The increase from the 2003
survey was not considered statistically
significant, but it was disturbing news, health
advocates said.
11
Study shows fliers are out of breath
Wednesday, April 27, 2005; Posted 7:23 AM EDT (11:23 GMT)
LONDON, England -- Airline
passengers are putting up with "significant"
drops in the supply of oxygen while flying at
high altitude, according to researchers. Just
over half of all fliers analyzed had oxygen
levels 6 percent lower than usual when the
airplane was at maximum altitude -- a level at
which doctors normally administer extra oxygen
for hospital patients. "We believe that these
falling oxygen levels, together with factors such
as dehydration, immobility and low humidity,
could contribute to illness during and after
flights," said Susan Humphreys of the Royal
Group of Hospitals in Belfast, whose group
conducted the research. "This has become a
greater problem in recent years as modern
airplanes are able to cruise at much higher
altitudes." A drop in oxygen levels can be a
contributing factor to deep vein thrombosis
(DVT), a potentially fatal blood clot which is
also called "economy class syndrome." Low oxygen
levels also can lead to headaches, fatigue and
impaired mental performance.
12
"We should be giving people with ill health more
advice about things they can do, such as
drinking more water when they fly, to avoid
problems," researcher Rachel Deyermond told the
UK's Daily Telegraph newspaper. The researchers
from Belfast, Northern Ireland published their
results in the May issue of Anaesthesia, a
British medical journal. They recorded the blood
oxygen levels and the pulse rate of 84
passengers, aged 1 to 78, at both ground level
and at peak altitude during a flight. The
research shows a "statistically significant"
reduction in oxygen levels in all passengers
traveling on both long- and short-haul
flights. On average, oxygen levels in passengers
dropped by 4 percent by the time the plane had
reached cruising altitude. A total of 54 percent
of passengers had oxygen levels below this
level. Of the 84 passengers who were analyzed,
55 were on flights lasting more than two hours,
while the rest were on short-haul journeys.
Similar results were obtained from both groups.
None of them had severe cardio-respiratory
problems or required permission from their
doctor to fly.
14
  • Note 5. We are NOT saying that there is 1 chance
    in 150 for the null hypothesis to be correct.
  • Instead, the P-value is used as a tool to
    determine whether or not to reject the null
    hypothesis.
  • Note 6. The significance level α should be
    chosen before inspecting the data. Seeing the
    evidence before deciding on the value of α is
    called data snooping, which may bias our decision.

15
  • Note 7. When computing the P-value, we found
    P(K ≥ 12) and not P(K = 12). The idea is
    that, assuming the null hypothesis is true, we
    want to compute the probability of getting an
    observed value either this extreme or even more
    extreme.
  • Why does this make sense? Suppose a fair coin is
    flipped 1000 times and lands heads 501 times. We
    should retain the null hypothesis, and the chance
    of getting 501 or more heads is quite large
    (48.7%). However, the chance of getting exactly
    501 heads is very small (2.5%); using the latter
    figure would have led us to incorrectly reject
    the null hypothesis.
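
The two coin-flip probabilities in Note 7 can be checked with exact integer arithmetic. This is only a verification sketch; the variable names are illustrative.

```python
from math import comb

n = 1000
total = 2 ** n  # number of equally likely head/tail sequences for a fair coin

# P(K >= 501): chance of a result at least as extreme as the one observed.
p_tail = sum(comb(n, k) for k in range(501, n + 1)) / total

# P(K = 501): chance of exactly the observed count.
p_exact = comb(n, 501) / total

print(f"P(K >= 501) = {p_tail:.3f}")   # 0.487
print(f"P(K  = 501) = {p_exact:.3f}")  # 0.025
```

The tail probability is far above any usual significance level, while the probability of the exact count is small for every possible outcome, which is why it is not a useful measure of evidence.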

16
  • Example: Ten children (ages 8 to 14) with a
    history of severe learning and behavioral
    disorders were recruited for a six-week study.
    For three weeks, each child was given a placebo;
    for the other three weeks, each child was given
    ethosuximide, widely prescribed for epilepsy.
    Five of the children received the placebo first;
    the other five received the placebo last.
  • After each three-week period, each child was
    given an IQ test. The table (P/E) shows the two
    verbal IQ scores for each child. Was the
    medication effective for increasing IQ scores?
  • Scores, as (placebo, ethosuximide) pairs:
  • (97, 113) (102, 111) (104, 106)
  • (106, 113) (111, 122) (90, 110)
  • (106, 101) (115, 121) (96, 126)
  • (95, 119)

17
  • Solution.
  • H0: The IQ scores after ethosuximide were the
    same as the scores after placebo (p = 0.5)
  • Ha: The IQ scores after ethosuximide were
    different from the scores after placebo (p ≠ 0.5)
    (two-sided test)
  • We choose α = 0.05
  • Before continuing, why isn't Ha written as p >
    0.5?

18
  • Test statistic: k_s = 9 for a sample size of n =
    10.
  • P-value. Assuming H0, we must find the chance of
    obtaining a test statistic at least this extreme.
    For this problem, that means (why?)
    P-value = P(K ≤ 1) + P(K ≥ 9)

In Excel: =BINOMDIST(1,10,0.5,1) + 1 -
BINOMDIST(8,10,0.5,1)
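
A possible Python version of this two-sided calculation is sketched below. It assumes the IQ scores are read as adjacent (placebo, ethosuximide) pairs, which is an interpretation of the flattened table, and it counts every outcome at least as far from n/2 as the observed one, in either direction.

```python
from math import comb

# ASSUMPTION: scores read as adjacent (placebo, ethosuximide) pairs.
pairs = [
    (97, 113), (102, 111), (104, 106),
    (106, 113), (111, 122), (90, 110),
    (106, 101), (115, 121), (96, 126),
    (95, 119),
]

n = sum(1 for p, e in pairs if p != e)   # pairs with a difference
k_s = sum(1 for p, e in pairs if e > p)  # higher score after ethosuximide

# Two-sided P-value under H0 (p = 0.5): add up the probabilities of all
# counts that lie at least as far from n/2 as the observed count does.
distance = abs(k_s - n / 2)
p_value = sum(comb(n, k) for k in range(n + 1)
              if abs(k - n / 2) >= distance) / 2 ** n

print(n, k_s, p_value)  # 10 9 0.021484375
```

Here the qualifying counts are 0, 1, 9 and 10, giving 22/1024, which is the same quantity the Excel formula computes.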
19
  • Conclusion: We reject the null hypothesis. There
    is good reason to believe that the ethosuximide
    does affect verbal IQ scores.

20
  • Notes
  • Note 8. The form of the alternative hypothesis,
    which is based on the context of the problem,
    determines how the P-value is computed.

21
  • Secondhand smoke is classified as a known
    carcinogen by the Environmental Protection Agency
    (EPA). This classification is based on many
    scientific studies which investigated the
    question of whether secondhand smoke was
    associated with a higher incidence of cancer.
  • The EPA conducted its study using a 5%
    significance level and a one-tailed test. A
    one-tailed test was used because it was already
    independently determined that first-hand smoke
    caused cancer and the preliminary studies
    indicated that second-hand smoke was a probable
    cause of cancer. However, the tobacco industry
    argued that a one-tailed test was inappropriate
    and that a two-tailed test should be used. They
    claimed that by using a one-tailed test at the 5%
    significance level, the EPA was essentially using
    a two-tailed test at the 10% significance level,
    since each tail would then have area of 5%. The
    tobacco industry argued that this doubled the
    probability of a type I error.
  • Nevertheless, since there was good reason to
    think that secondhand smoke was a carcinogen, the
    EPA followed the usual scientific convention of
    using a one-tailed test. Reference: "Secondhand
    Smoke: Is it a Hazard?", Consumer Reports,
    January 1995.

22
Testing a Population Median
23
  • The sign test may also be used as a test for the
    value of a population median. Recall the
    definition of a median: half the data should lie
    below the median, while the other half lies
    above.

24
  • Example: A bank will open a new branch in a
    community only if it can be established that the
    median family income in the community is greater
    than $50,000. To obtain information, a random
    sample of 75 families is chosen. Of these, 44 had
    incomes over $50,000, while the other 31 had
    incomes below $50,000.
  • Is this information statistically significant
    enough to establish that the median family income
    is more than $50,000?

25
  • Solution.
  • H0: The median income is $50,000 (or less) (m ≤
    $50,000)
  • Ha: The median income is more than $50,000 (m >
    $50,000)
  • Alternatively, let p be the probability that a
    randomly selected family has an income of less
    than $50,000. Then we may write (why?)
  • H0: p ≥ 0.5
  • Ha: p < 0.5

26
  • We choose α = 0.05.
  • Test statistic: k_s = 31 for a sample size of n =
    75.
  • P-value. Assuming H0, we must find the chance of
    obtaining a test statistic at least this extreme.
    For this problem, that means (why?)
    P-value = P(K ≤ 31)

In Excel: =BINOMDIST(31, 75, 0.5, 1)
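
As a cross-check on the Excel formula, here is a short Python sketch of the same lower-tail probability. It treats K as the number of sampled families with income below the hypothesized median, evaluated at the boundary value p = 0.5; the variable names are illustrative.

```python
from math import comb

n = 75       # families sampled
k_s = 31     # families with income below the hypothesized median
alpha = 0.05

# Lower-tail P-value at p = 0.5: P(K <= 31).
p_value = sum(comb(n, k) for k in range(k_s + 1)) / 2 ** n

print(f"P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The P-value comes out above 0.05, so the decision printed by the sketch is to fail to reject the null hypothesis.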
27
  • Conclusion: We fail to reject the null
    hypothesis. There is not enough evidence to think
    that the median family income is more than
    $50,000.
  • Notice why the phrase "fail to reject" is
    important. With a larger sample, it's conceivable
    that the null hypothesis would then be rejected.

28
Conceptual Questions
  • 1) True or False
  • a) The observed significance level of 8% depends
    on the data (i.e. sample)
  • b) There are 92 chances out of 100 for the
    alternative hypothesis to be correct.

29
Conceptual Questions
  • 2) True or False
  •  
  • a) A highly statistically significant result
    cannot possibly be due to chance.
  • b) If a sample difference is highly
    statistically significant, there is less than a
    1% chance for the null hypothesis to be correct.

30
Conceptual Questions
  • 3) True or False
  •  
  • a) If P = 43%, then the null hypothesis looks
    plausible.
  • b) If P = 0.43%, then the null hypothesis looks
    implausible.

31
Binomial Exact Test
32
  • Example: A die is rolled 180 times; it lands six
    45 times. Is this evidence statistically
    significant enough to conclude that the die is
    not fairly balanced?
  • Solution.
  • H0:
  • Ha:
  • We choose α = 0.05.
  • The test statistic is k_s = 45 for a sample
    size of n = 180.

33
  • P-value. Assuming H0, we must find the chance of
    obtaining a test statistic at least this extreme.
    For this problem, that means that (why?)
    P-value = P(K ≤ 15) + P(K ≥ 45)
  • Excel:
  • =BINOMDIST(15,180,1/6,1) + 1 - BINOMDIST(44,180,1/6,1)
  • Conclusion
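
The Excel formula on this slide can be mirrored in Python as follows. This is a sketch under the assumption, taken from the formula itself, that the null hypothesis is a fair die with p = 1/6 for a six; the helper name `binom_cdf` is made up for the illustration.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(K <= k) for K ~ Binomial(n, p), like BINOMDIST(k, n, p, 1) in Excel."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p0 = 180, 1 / 6  # expected number of sixes under H0 is n * p0 = 30

# Observed 45 sixes is 15 above the expected 30; the matching lower
# tail is 15 below it, i.e. 15 or fewer sixes.
lower_tail = binom_cdf(15, n, p0)      # P(K <= 15)
upper_tail = 1 - binom_cdf(44, n, p0)  # P(K >= 45)
p_value = lower_tail + upper_tail

print(f"P-value = {p_value:.5f}")
```

Comparing the printed P-value with α = 0.05 gives the decision for the conclusion line.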

34
  • Example: There is a social theory that states
    that people tend to postpone their deaths until
    after some meaningful event: birthdays,
    anniversaries, the World Series.
  • In 1978, social scientists investigated
    obituaries that appeared in a Salt Lake City
    newspaper. Among the 747 obituaries examined, 60
    of the deaths occurred in the three-month period
    preceding their birth month. However, if the day
    of death is independent of birthday, we would
    expect that 25% of these deaths (about 187) would
    occur in this three-month period.
  • Does this study provide statistically significant
    evidence to support this theory?
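
One way to set this example up as a binomial exact test is sketched below. The hypotheses are not written out on the slide, so the one-sided alternative used here (fewer deaths than expected in the three months before the birth month, p < 0.25) is an assumption, as are the variable names.

```python
from math import comb

n = 747      # obituaries examined
k_s = 60     # deaths in the three months preceding the birth month
p0 = 0.25    # proportion expected if death date is independent of birthday

# ASSUMED one-sided alternative p < 0.25, so the P-value is the lower
# tail P(K <= 60) under H0.
p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(k_s + 1))

print(f"expected count under H0 = {n * p0:.1f}")
print(f"P-value = {p_value:.3g}")
```

With 60 observed against roughly 187 expected, the lower-tail probability is extremely small.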

36
  • Example: The following table summarizes the
    findings of a 1971 observational study of 5466
    women who gave birth, categorized by both smoking
    preference and low birthweight:
  •              Low birthweight   Normal   Total
  • Smokers            185          1891     2076
  • Nonsmokers         193          3197     3390
  • Total              378          5088     5466
  • Does this show that smoking is associated with
    low birthweight? (Notice we don't say "causes,"
    since this is not a randomized, controlled,
    double-blind experiment.)

37
  • Method of Attack: Suppose that smoking and low
    birthweight are not associated. Then we would
    expect the proportion of smoking mothers among
    the 378 low birthweight babies to be roughly the
    same as the proportion of smoking mothers of all
    5466.

38
  • Solution.
  • H0:
  • Ha:
  • We choose α = 0.05.
  • The test statistic is k_s = 185 for a sample
    size of n = 378 (roughly 49%).

39
  • P-value. Assuming H0, we must find the chance of
    obtaining a test statistic at least this extreme.
    For this problem, that means
  • Conclusion
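
The method of attack described for this example can be sketched as a binomial exact test in Python. The slide leaves H0 and Ha blank, so two things here are assumptions: the null proportion is taken to be the overall share of smokers (2076 out of 5466), and the alternative is taken to be one-sided (smokers over-represented among low-birthweight births).

```python
from math import comb

n = 378           # low-birthweight births
k_s = 185         # of those, births to smokers (about 49%)
p0 = 2076 / 5466  # ASSUMED null proportion: share of smokers among all mothers

# ASSUMED one-sided alternative p > p0, so the P-value is the upper
# tail P(K >= 185) under H0, summed directly.
p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(k_s, n + 1))

print(f"null proportion p0 = {p0:.4f}")
print(f"expected smokers among low-birthweight births = {n * p0:.1f}")
print(f"P-value = {p_value:.3g}")
```

Under these assumptions the observed 185 is well above the expected count of about 144, and the resulting P-value is far below α = 0.05.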

40
Data Call Into Question HIV Study Results
By Gautam Naik and Mark Schoofs, The Wall Street
Journal, October 10, 2009
Researchers from
the U.S. Army and Thailand announced last month
they had found the first vaccine that provided
some protection against HIV. But a second
analysis of the $105 million study suggests the
results may have been a fluke. The second
analysis, which is considered a vital component
of any vaccine study, shows the results weren't
statistically significant, these scientists
said. In other words, it indicates that the
results could have been due to chance and that
the vaccine may not be effective. The
incomplete disclosure raises the question of
whether the Army, the Thai government and the
U.S. National Institutes of Health -- which
helped fund the study -- rushed to give a
positive spin to what may turn out to be another
inconclusive AIDS-vaccine effort.
41
The first analysis announced last month
was based on a "modified-intent-to-treat
analysis," which includes virtually everyone who
enrolled in the study, regardless of whether they
ended up getting the full course of the vaccine.
It is a good stand-in for the real world, where
people don't always follow instructions properly.
By this measure, the vaccine tested in Thailand
reduced by 31% the chance of infection with HIV.

42
But the result was derived from a small number of
actual HIV cases: New infections occurred in 51
of the 8,197 people who got the vaccine, compared
with 74 of the 8,198 volunteers who got placebo
shots. NB: 51 is about a 31% reduction from
74. Statistical calculations showed there was
a 3.9% probability that chance accounted for the
difference. In drug and vaccine trials, anything
above a 5% probability of a chance result is
deemed statistically insignificant.
43
             Infections   No infections    Total
Treatment        51           8,146         8,197
Control          74           8,124         8,198
Total           125          16,270        16,395
45
The second analysis is called "per protocol"
and adheres strictly to how the trial was
designed by only including the study participants
who got the full regimen of vaccine shots at the
right time. Because it excludes study
participants who didn't get the full vaccine
regimen, it usually provides corroboration to the
looser "intent to treat" findings. Two AIDS
scientists, who have seen the "per protocol"
analysis, said it indicates there is a 16%
chance the study results were a fluke -- a far
greater probability than is considered
statistically acceptable. This analysis included
86 people who received either the vaccine or a
placebo and were infected. The "per protocol"
analysis also showed that the supposed
effectiveness was lower, at 26.2%. Dr. Kim, of
the U.S. Army, declined to comment on the data.
It isn't clear why the vaccine was seemingly
ineffective among participants who followed the
guidelines to the letter.