Title: f
Nonparametric Methods
Week 13a
9. Wilcoxon test
- The Wilcoxon signed-rank test (W test) is used in the same circumstances as the sign test (one sample or two related samples).
- One of the main advantages of the sign test is that the assumptions for its use are minimal. However, the sign test considers only whether each observation is above (+) or below (-) the median. It does not keep track of how far above or below the median each data value is. Consequently, a large amount of pertinent information is potentially ignored.
- In addition to the sign, the W test takes into consideration the magnitude of the difference. This feature results in a more powerful test (see Table 1).
- However, there is an assumption for the W test that is not required for the sign test: the population being sampled has a symmetric distribution.
Example 3
- Assuming that the populations are approximately symmetric but not normal, test whether there is a significant difference between the number of cigarettes smoked before and after pregnancy.
- Table 2: Number of Cigarettes Usually Smoked per Day, Before (B) and After (A) Pregnancy
W = 7
- Null and alternative hypotheses are the same as for the sign test:
- H0: MA = MB
- Ha: MA ≠ MB
- The test statistic is calculated as follows:
- 1. Delete any difference equal to 0 before you start.
- 2. Arrange the differences from lowest to highest, disregarding the signs (take absolute values).
- 3. Rank the absolute differences in order from lowest (1) to highest (in our example 10).
- 4. Assign the sign of the difference to the ranks.
- 5. Test statistic W is the sum of the ranks of the positive differences.
- The idea behind the W test is that if the null
hypothesis is true then the sum of ranks of the
positive differences should be about the same as
the sum of the ranks of the negative differences
(48 in our example).
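The five steps for computing W can be sketched in a few lines of Python. This is a minimal illustration, not the software's own routine: the function names `average_ranks` and `signed_rank_sums` are made up for this sketch, and the paired data are hypothetical because the values of Table 2 are not reproduced here.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j so that order[i..j] covers one run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_sums(before, after):
    """Return (sum of positive ranks, sum of negative ranks) for paired data."""
    diffs = [b - a for b, a in zip(before, after)]
    diffs = [d for d in diffs if d != 0]              # step 1: drop zeros
    ranks = average_ranks([abs(d) for d in diffs])    # steps 2-3: rank |d|
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)   # steps 4-5
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical paired data (NOT the values from Table 2).
before = [13, 9, 10, 15, 8, 14]
after = [10, 10, 10, 10, 10, 10]
w_plus, w_minus = signed_rank_sums(before, after)
print(w_plus, w_minus)  # 12.0 3.0
```

With n nonzero differences the two sums always add up to n(n+1)/2, so under the null hypothesis each is expected to be near n(n+1)/4.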
Sign Test for Median
Sign test of median = 0.00000 versus not = 0.00000

             N  Below  Equal  Above       P  Median
Difference  11      3      1      7  0.3437   5.000
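The sign test P value can be checked by hand from the Below and Above counts, since under the null hypothesis the number of differences below the median follows a binomial distribution with p = 0.5. The sketch below assumes the usual two-sided exact calculation with zero differences excluded; the function name `sign_test_p_value` is illustrative only.

```python
from math import comb


def sign_test_p_value(n_below, n_above):
    """Two-sided exact sign test p-value; differences equal to 0 are excluded."""
    n = n_below + n_above
    k = min(n_below, n_above)
    # Probability of k or fewer "successes" out of n fair coin flips.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


p = sign_test_p_value(3, 7)
print(p)  # 0.34375, shown as 0.3437 in the output above
```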
Paired T-Test and Confidence Interval
Paired T for Before - After

             N   Mean  StDev  SE Mean
Before      11  14.64   6.27     1.89
After       11   9.36   7.57     2.28
Difference  11   5.27   7.54     2.27

95% CI for mean difference: (0.21, 10.34)
T-Test of mean difference = 0 (vs not = 0): T-Value = 2.32  P-Value = 0.043
Wilcoxon Signed Rank Test
Test of median = 0.000000 versus median not = 0.000000

             N  N for Test  Wilcoxon Statistic      P  Estimated Median
Difference  11          10                   7  0.041             4.500
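One common way to obtain a P value for the signed-rank statistic is a normal approximation with a continuity correction. The sketch below assumes that approach with no adjustment for ties; whether the software used exactly this calculation is an assumption, and the function name `signed_rank_p_value` is illustrative.

```python
from math import erfc, sqrt


def signed_rank_p_value(w, n):
    """Two-sided normal-approximation p-value for a signed-rank sum w
    computed from n nonzero differences (continuity-corrected, no tie
    adjustment)."""
    mean = n * (n + 1) / 4                      # expected rank sum under H0
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)   # standard deviation under H0
    z = max(0.0, abs(w - mean) - 0.5) / sd
    return erfc(z / sqrt(2))                    # equals 2 * P(Z > z)


p = signed_rank_p_value(7, 10)
print(round(p, 3))  # close to the 0.041 reported above
```

For n = 10 the expected rank sum under the null hypothesis is 27.5, so W = 7 lies about two standard deviations below it.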
- This result indicates that the difference between the observed and expected rank sums is significant (p < 0.05).
- Thus, it leads us to reject H0.
- Can we conclude that there is a significant reduction in the smoking habit consequent to pregnancy?
- RECOMMENDATION: If you are sure that the population is normal, use the t test. However, if you believe that the underlying population is symmetric (but not normal), use the W test.
10. The Mann-Whitney-Wilcoxon test (MWW)
- The MWW test is a nonparametric alternative to the unrelated (unpaired, pooled) t test for analysing data from a different-subjects design with two groups.
- Assumptions: Unlike the parametric t test, this nonparametric test makes no assumptions about the form of the sampled distribution. The only assumption is that the population has a continuous distribution.
- We will state the hypotheses in terms of equality versus inequality of two population medians:
- H0: M1 = M2
- Ha: M1 ≠ M2 (two-sided alternative)
- Conceptual Basis
- A real difference between two treatments should make observations in one sample generally larger than those in the other.
- If the null hypothesis is false, then when we combine the two samples and rank all the combined observations, the data from one sample should be concentrated at one end of the scale, and the other sample's data should be at the other end.
- However, if the null hypothesis is true, then data from both samples should be randomly scattered throughout the ranking of the pooled data.
Example 4
- The data in the following table depict sulfate concentrations in rainwater that were measured at two different locations.
- Location A: 2.4 11.6 11.9 12 12 12.1 12.2 12.2 12.4 14 14.7 14.8
- Location B: 10.1 10.4 10.5 10.6 10.8 10.9 11 11.1 11.4 11.5 13.9 25.1
- Using the provided normal plots, test the hypothesis that there is no significant difference between the two locations.
Normal plots (Locations A and B): p value = 0 in both cases
STEP 1: State the null and the alternative hypotheses and nominate the significance level, α.
- H0: M1 = M2 (there is no difference between the 2 locations)
- Ha: M1 ≠ M2
- The test statistic is calculated by:
- Joining the two samples into one and arranging them in increasing order. For our data we have
  2.4 10.1 10.4 10.5 10.6 10.8 10.9 11 11.1 11.4 11.5 11.6 11.9 12 12 12.1 12.2 12.2 12.4 13.9 14 14.7 14.8 25.1
- Assigning ranks to all the observations in the combined sample (giving average ranks for ties). In our example
  1 2 3 4 5 6 7 8 9 10 11 12 13 14.5 14.5 16 17.5 17.5 19 20 21 22 23 24
- If you saw only these ranks, what would you guess about the null hypothesis?
- The MWW test simply quantifies this pattern. The test statistic is the sum of ranks of the data in the first sample (191).
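The pooling and ranking can be reproduced with a short Python sketch using the Example 4 data. The helper `average_rank` is a made-up name for this illustration; it gives each value the number of smaller pooled values plus the midpoint of its tie group, which is one simple way to get average ranks for ties.

```python
loc_a = [2.4, 11.6, 11.9, 12, 12, 12.1, 12.2, 12.2, 12.4, 14, 14.7, 14.8]
loc_b = [10.1, 10.4, 10.5, 10.6, 10.8, 10.9, 11, 11.1, 11.4, 11.5, 13.9, 25.1]


def average_rank(x, pooled):
    """Rank of x in the pooled sample, averaging ranks over tied values."""
    less = sum(1 for v in pooled if v < x)
    equal = sum(1 for v in pooled if v == x)
    return less + (equal + 1) / 2


pooled = loc_a + loc_b
ranks_a = [average_rank(x, pooled) for x in loc_a]
w = sum(ranks_a)
print(ranks_a)  # ranks of the Location A values in the pooled sample
print(w)        # 191.0
```

Apart from the single low value 2.4 (rank 1), every Location A value falls in the upper half of the pooled ranking, which is the pattern the statistic captures.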
Two Sample T-Test and Confidence Interval
Two sample T for A vs B

    N   Mean  StDev  SE Mean
A  12  11.86   3.18     0.92
B  12  12.27   4.15     1.2

95% CI for mu A - mu B: (-3.57, 2.7)
T-Test mu A = mu B (vs not =): T = -0.28  P = 0.79  DF = 20
Mann-Whitney Confidence Interval and Test

A  N = 12  Median = 12.150
B  N = 12  Median = 10.950
Point estimate for ETA1-ETA2 is 1.200
95.4 Percent CI for ETA1-ETA2 is (0.499, 1.901)
W = 191.0 (test statistic)
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0194 (P-value)
The test is significant at 0.0193 (adjusted for ties)
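The two significance levels in this output can be approximated from W, the sample sizes, and the tie groups alone. The sketch below assumes a continuity-corrected normal approximation for the rank sum, with the standard variance correction for ties; that this matches the software's internal calculation is an assumption, and `rank_sum_p_value` is an illustrative name. The tie sizes (2, 2) come from the two pairs of tied ranks, 14.5 and 17.5.

```python
from math import erfc, sqrt


def rank_sum_p_value(w, n1, n2, tie_sizes=()):
    """Two-sided normal-approximation p-value for the rank sum w of the
    first sample (continuity-corrected). tie_sizes lists the size of each
    group of tied observations in the pooled sample."""
    n = n1 + n2
    mean = n1 * (n + 1) / 2                     # expected rank sum under H0
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    sd = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = max(0.0, abs(w - mean) - 0.5) / sd
    return erfc(z / sqrt(2))                    # equals 2 * P(Z > z)


p_plain = rank_sum_p_value(191, 12, 12)
p_ties = rank_sum_p_value(191, 12, 12, tie_sizes=(2, 2))
print(round(p_plain, 4), round(p_ties, 4))  # roughly 0.0194 and 0.0193
```

Adjusting for ties slightly reduces the variance of the rank sum, which is why the tie-adjusted significance level is marginally smaller.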
- The terms ETA1 and ETA2 represent the medians of the two groups. What is the conclusion?
- RECOMMENDATION: If you are sure that the population is normal, use the t test. However, if you believe that the underlying population is not normal, particularly if you have outliers, use the MWW test. It is much less sensitive to outliers.