Using Statistics To Make Inferences 6 - PowerPoint PPT Presentation


Slides: 67

Provided by: micha558

1
Using Statistics To Make Inferences 6

Summary
Non-parametric tests.
Wilcoxon Signed Ranks Test.
Wilcoxon Matched Pairs Signed Ranks Test.
Wilcoxon Rank Sum Test /
Mann-Whitney Test.

1
2
Goals

Perform and interpret Wilcoxon Signed Ranks
Test.
Perform and interpret Wilcoxon Matched Pairs
Signed Ranks Test.
Perform and interpret Wilcoxon Rank Sum Test /
Mann-Whitney Test.
Know when each test is appropriate.

2
3
Practical

Perform a series of Mann-Whitney tests and
compare the results to those obtained from
t-tests.

3
4
Recall

Which tests might be appropriate when testing the
mean of normal data?
What criteria would you employ to select the
appropriate test?

z or t
For z we need σ (the known population standard deviation); for t we calculate s from the sample.
4
5
Today

What if we cannot assume normality?
We shall develop a more powerful test than the
sign test for the median (see lecture 2).

5
6
Non-parametric Tests

A single sample test
Wilcoxon Signed Ranks Test

6
7
Wilcoxon Signed Ranks Test Procedure

Take the difference between each observation and
the hypothesised median (η, eta).

Rank the absolute differences from 1 to n,
allowing for ties (1 smallest, n largest).

Sum the rank values for those observations above
η; let this be W+.

Sum the rank values for those observations below
η; let this be W−.

Use the smaller of W+ and W− as the test
statistic. Call this Wcalc (or simply W).
Critical values of the test statistic are given
in tables for various significance levels.

7
8
Wilcoxon Signed Ranks Test Notes

If two or more differences are equal (tied) they
are assigned the average of the ranks.

If a difference is zero, it is omitted.
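
The two notes above (average the ranks of ties, omit zero differences) can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's own code: the helper name `average_ranks` and the small list of differences are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + 1 + end + 1) / 2  # average of positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical differences from a hypothesised median (illustration only).
differences = [3, -1, 0, 2, -2, 5]
nonzero = [d for d in differences if d != 0]      # zero differences are omitted
ranks = average_ranks([abs(d) for d in nonzero])  # rank the absolute differences
print(list(zip(nonzero, ranks)))
```

Here the differences 2 and −2 have the same absolute value, so they share the average of ranks 2 and 3.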
8
9
Example

It has been established that an individual's
median reaction time is 0.250 seconds. Twelve
trials are conducted after the individual has
consumed alcohol. The measured times are

0.235 0.252 0.312 0.264 0.323 0.241 0.284 0.306
0.248 0.284 0.298 0.320
Test whether the data are consistent with the
median value.
9
10
Step 1

The first step is to subtract the median (0.25)
from every value.

10
11
Step 1
Median = 0.250; for example, 0.235 − 0.250 = −0.015
11
12
Step 2

Now calculate and rank on the absolute
differences

12
13
Step 2
Now rank the data
13
14
Step 2
Is the ranking unambiguous?
14
15
Step 2
Tied values, so average the ranks
15
16
Step 2
16
17
Step 3/4

Now separate the contributions for positive
(negative) differences

17
18
Step 3/4
Now sum the contributions for positive and
negative differences
18
19
Step 3/4
19
20
Step 5
W+ = 68.5, W− = 9.5
As a cross-check, note that W+ + W− = ½ n(n+1). Here
n = 12 and ½ n(n+1) = ½ × 12 × (12+1) = 78, as
expected.
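
The whole calculation for the reaction-time example can be reproduced with a short script. This is a sketch under stated assumptions: the helper `average_rank` is a hypothetical name, and the differences are converted to whole milliseconds so that tied values compare exactly.

```python
median = 0.250
times = [0.235, 0.252, 0.312, 0.264, 0.323, 0.241,
         0.284, 0.306, 0.248, 0.284, 0.298, 0.320]

# Work in whole milliseconds to avoid floating-point noise in the ties.
diffs = [round((t - median) * 1000) for t in times]
diffs = [d for d in diffs if d != 0]  # zero differences would be omitted


def average_rank(x, values):
    """Rank of x among values, averaging over ties."""
    below = sum(v < x for v in values)
    equal = sum(v == x for v in values)
    return below + (equal + 1) / 2


abs_diffs = [abs(d) for d in diffs]
ranks = [average_rank(a, abs_diffs) for a in abs_diffs]

w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
n = len(diffs)

print(w_plus, w_minus)                 # rank sums for positive and negative differences
print(w_plus + w_minus, n * (n + 1) / 2)  # cross-check: both should equal 78
```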
20
21
Step 5
W+ = 68.5, W− = 9.5
Therefore the result is significant at the 5%
level (14 > 9.5), so the null hypothesis can be
rejected. The data are apparently not consistent
with a median of 0.250 seconds.
21
22
Normal approximation

W = 9.5, n = 12
Employing the normal approximation,

z = (W + ½ − n(n+1)/4) / √(n(n+1)(2n+1)/24)

In this case the continuity correction (the ½) is added
since a lower tail is being considered.
22
23
Normal approximation

z = −2.27

The p-value is 2 × 0.0115 ≈ 0.02, remarkably
close to the exact value.
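
The z value can be checked numerically. The sketch below assumes the standard normal approximation for the signed-rank statistic, with mean n(n+1)/4 and variance n(n+1)(2n+1)/24; that form is an assumption here, chosen because it reproduces the quoted z = −2.27. The helper `normal_cdf` is a hypothetical name.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


w, n = 9.5, 12
mean_w = n * (n + 1) / 4                          # expected value of W under H0
sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)  # standard deviation of W under H0

z = (w + 0.5 - mean_w) / sd_w   # continuity correction added for the lower tail
p_two_sided = 2 * normal_cdf(z)

print(round(z, 2), round(p_two_sided, 3))
```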
23
24
Wilcoxon Matched Pairs Signed Ranks Test -
Example

Certain mental tasks are performed before and
after exercise. The scores were recorded.

Is there any evidence of a significant difference
in the levels of performance under the two
conditions?
Still effectively a single sample since we seek a
change.
24
25
Find Differences

Find differences
25
26
Find Absolute Differences

Find absolute differences
26
27
Find Absolute Differences

Now rank the absolute differences
27
28
Rank the Absolute Differences
Now deal with ties
28
29
Rank the Absolute Differences
Now find the true ranks.
29
30
Rank the Absolute Differences
Now separate positive and negative values
30
31
Separate the Contributions for Positive
(Negative) Differences
Now form the totals
31
32
Separate the Contributions for Positive
(Negative) Differences
32
33
Conclusion
W+ = 50, W− = 5
As a cross-check, note that W+ + W− = ½ n(n+1). Here
n = 10 and ½ n(n+1) = ½ × 10 × (10+1) = 55, as
expected.
33
34
Conclusion
W+ = 50, W− = 5
Therefore the result is significant at the 5%
level (8 > 5), so the null hypothesis can be
rejected. The scores appear to differ.
34
35
Aside

It is claimed that it takes 15 minutes to mark
an examination question.
Test this claim using the following timings (x)
for marking 10 questions.
12 15 14 16 12 15 14 16 11 15
You might find the following sums useful
Σx = 140 and Σx² = 1988

35
36
Aside

What are we testing?
Which test is appropriate?

µ, the population mean.
t, since we don't know σ.
36
37
Aside
n = 10, Σx = 140, Σx² = 1988
mean: x̄ = Σx / n = 140 / 10 = 14
variance: s² = (Σx² − (Σx)²/n) / (n − 1) = (1988 − 1960) / 9 = 3.11
37
38
Aside
n = 10, Σx = 140, Σx² = 1988
38
39
Aside
n = 10, µ = 15
ν = n − 1 = 9
Highlighting the values for 95% and 99%
confidence for a two-tailed test.
39
40
Aside

Since |tcalc| < tcrit (1.79 < 2.262), the
experiment is consistent with a population mean
of 15 at 95% confidence.
This is supported by software, which gives a 95%
confidence interval that includes 15 and a
p-value of 0.107 (> 0.05).

40
41
Aside
Software output: t value, p-value, confidence interval
Confidence interval 14 ± 1.26 =
(12.74, 15.26)
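
The arithmetic of this aside can be checked from the quoted sums alone. This is a minimal sketch assuming the usual one-sample t formulas; the critical value 2.262 (9 degrees of freedom, two-tailed 95%) is the one quoted in the slides, and the variable names are illustrative.

```python
import math

n, sum_x, sum_x2 = 10, 140, 1988
mu0 = 15        # claimed mean marking time in minutes
t_crit = 2.262  # tabulated value for 9 degrees of freedom, 95% two-tailed

mean = sum_x / n
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)  # sample variance from the sums
std_error = math.sqrt(variance / n)

t_calc = (mean - mu0) / std_error
ci = (mean - t_crit * std_error, mean + t_crit * std_error)

print(round(t_calc, 2))                  # compare |t_calc| with t_crit
print(round(ci[0], 2), round(ci[1], 2))  # 95% confidence interval for the mean
```

Because |t_calc| is below the critical value and the interval contains 15, both views agree that the claim is not contradicted.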
41
42
Now for Two Samples

For normal data we have the two sample t test of
lecture 4.
What if data is not normal?

42
43
Wilcoxon Rank Sum Test / Mann-Whitney Test

A two sample test

Combine the observations from the two samples
(sizes n1 and n2).

Rank the sorted data from 1 to (n1 + n2).

Calculate R1 as the sum of the ranks of the
first sample, and R2 for the second.

Form the Mann-Whitney test statistic.

43
44
Normal Approximation

If n1 and n2 are greater than 8 a normal
approximation may be employed where

z = (U + ½ − ½ n1 n2) / √(n1 n2 (n1 + n2 + 1) / 12)

In this case the continuity correction (the ½) is added
since a lower tail is being considered. The
approximation is particularly relevant if the
tables are not extensive enough.
44
45
Wilcoxon Rank Sum Test Example

A study of 22 patients suffering from Parkinson's
disease was conducted. An operation was performed
on 8 of them. While it improved their general
condition, it might adversely affect their speech.
In the data, a higher value indicates greater
difficulty in speaking.

45
46
Wilcoxon Rank Sum Test Data
Operated: 2.6 2.0 1.7 2.7 2.5 2.6 2.5 3.0
Others: 1.2 1.8 1.9 2.3 1.3 3.0 2.2 1.3 1.5 1.6 1.3 1.5 2.7 2.0
46
47
Ranked Data
Speech  Source    Rank
1.2     Others    1
1.3     Others    2
1.3     Others    3
1.3     Others    4
1.5     Others    5
1.5     Others    6
1.6     Others    7
1.7     Operated  8
1.8     Others    9
1.9     Others    10
2.0     Operated  11
2.0     Others    12
2.2     Others    13
2.3     Others    14
2.5     Operated  15
2.5     Operated  16
2.6     Operated  17
2.6     Operated  18
2.7     Operated  19
2.7     Others    20
3.0     Operated  21
3.0     Others    22

Note the ties
47
48
Ranked Data
Now find the true ranks
48
49
Ranked Data
Speech  Source    Rank  True Rank
1.2     Others    1     1
1.3     Others    2     3
1.3     Others    3     3
1.3     Others    4     3
1.5     Others    5     5.5
1.5     Others    6     5.5
1.6     Others    7     7
1.7     Operated  8     8
1.8     Others    9     9
1.9     Others    10    10
2.0     Operated  11    11.5
2.0     Others    12    11.5
2.2     Others    13    13
2.3     Others    14    14
2.5     Operated  15    15.5
2.5     Operated  16    15.5
2.6     Operated  17    17.5
2.6     Operated  18    17.5
2.7     Operated  19    19.5
2.7     Others    20    19.5
3.0     Operated  21    21.5
3.0     Others    22    21.5
Now separate the contributions for each source.
49
50
Ranked Data
Speech  Source    Rank  True Rank  Others  Operated
1.2     Others    1     1          1
1.3     Others    2     3          3
1.3     Others    3     3          3
1.3     Others    4     3          3
1.5     Others    5     5.5        5.5
1.5     Others    6     5.5        5.5
1.6     Others    7     7          7
1.7     Operated  8     8                  8
1.8     Others    9     9          9
1.9     Others    10    10         10
2.0     Operated  11    11.5               11.5
2.0     Others    12    11.5       11.5
2.2     Others    13    13         13
2.3     Others    14    14         14
2.5     Operated  15    15.5               15.5
2.5     Operated  16    15.5               15.5
2.6     Operated  17    17.5               17.5
2.6     Operated  18    17.5               17.5
2.7     Operated  19    19.5               19.5
2.7     Others    20    19.5       19.5
3.0     Operated  21    21.5               21.5
3.0     Others    22    21.5       21.5
Now sum the contributions for each source
50
51
Ranked Data
Totals of the true ranks: Others 126.5, Operated 126.5
It is pure chance that the two sums are equal
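
The rank sums can be reproduced directly from the raw speech scores. The sketch below pools the two samples, assigns tie-averaged ranks and sums them per group; the helper `average_rank` is a hypothetical name, not something from the lecture.

```python
operated = [2.6, 2.0, 1.7, 2.7, 2.5, 2.6, 2.5, 3.0]
others = [1.2, 1.8, 1.9, 2.3, 1.3, 3.0, 2.2, 1.3, 1.5, 1.6, 1.3, 1.5, 2.7, 2.0]
combined = operated + others


def average_rank(x, values):
    """Rank of x within values, giving tied values the mean of their ranks."""
    below = sum(v < x for v in values)
    equal = sum(v == x for v in values)
    return below + (equal + 1) / 2


r1 = sum(average_rank(x, combined) for x in operated)  # rank sum, operated group
r2 = sum(average_rank(x, combined) for x in others)    # rank sum, other patients

n_total = len(combined)
print(r1, r2)
print(r1 + r2, n_total * (n_total + 1) / 2)  # cross-check: ranks 1..22 sum to 253
```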
51
52
Calculations
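R1 = 126.5, R2 = 126.5, n1 = 8, n2 = 14
U1 = n1 n2 + ½ n1(n1+1) − R1 = 112 + 36 − 126.5 = 21.5
U2 = n1 n2 + ½ n2(n2+1) − R2 = 112 + 105 − 126.5 = 90.5
Note, only need to evaluate one.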
52
53
Conclusion
U1 = 21.5, U2 = 90.5
(mid-point ½ n1 n2 = 56, so only need to calculate
one)
For n1 = 8, n2 = 14, the critical value from the
tables for p = 0.05 is 26. The result is
significant at the 5% level (26 > 21.5); the two
samples appear to differ.
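
The step from rank sums to U and the comparison with the table can be sketched as follows. The form U = n1 n2 + ½ n(n+1) − R is an assumption, chosen because it reproduces the quoted U1 and U2; some textbooks use R − ½ n(n+1), which simply swaps the two labels. The critical value 26 is the one quoted from the tables.

```python
n1, n2 = 8, 14
r1, r2 = 126.5, 126.5  # rank sums for the operated and other patients

u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2

# The two statistics always sum to n1 * n2, so only one needs evaluating.
assert u1 + u2 == n1 * n2

u = min(u1, u2)
critical_value = 26  # tabulated value for n1 = 8, n2 = 14, p = 0.05
significant = u <= critical_value

print(u1, u2, significant)
```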
53
54
Normal Approximation

Employing the normal approximation for U1 = 21.5,

z = (21.5 + ½ − 56) / √(8 × 14 × 23 / 12) = −2.32

The p-value is 2 × 0.0102 ≈ 0.02, remarkably
close to the exact value.
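
The quoted tail probability can be checked numerically. This sketch assumes the standard normal approximation for U, with mean ½ n1 n2 and variance n1 n2 (n1 + n2 + 1)/12, which is consistent with the one-tail probability of 0.0102 quoted above; `normal_cdf` is a hypothetical helper name.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


u, n1, n2 = 21.5, 8, 14
mean_u = n1 * n2 / 2                           # mid-point of the U distribution
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

z = (u + 0.5 - mean_u) / sd_u   # continuity correction added for the lower tail
one_tail = normal_cdf(z)
p_two_sided = 2 * one_tail

print(round(z, 2), round(one_tail, 4), round(p_two_sided, 2))
```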
54
55
SPSS Verification

Note that the groups must have numerical
identifiers:
1 = operated, 2 = others.
Analyze > Nonparametric tests > Legacy Dialogs >
2 independent samples

55
56
SPSS Verification
Note the need to define group variables (1/2)
56
57
SPSS Verification
The previous rank sums are reproduced.
57
58
SPSS Verification
The previous U and Z values are reproduced.
The p value is less than .05
58
59
Parametric vs Non-Parametric Tests

Parametric Tests
They are robust with respect to violations of
their assumptions.
They are more powerful - more likely to detect
an effect when one is present.
They are more versatile: there are tests for
every experimental design.

59
60
Parametric vs Non-Parametric Tests

Non-Parametric Tests
They make fewer assumptions.
They are ideal for ordinal data, which is common
in Psychology, whereas parametric tests require
interval or ratio data.

60
61
Light Background Reading
Last year, the BBC ran a six-part primer by
Michael Blastland on understanding statistics in
the news. Blastland takes on the media's handling
of surveys/polls, counting, percentages,
averages, causation and doubt. "Wouldn't it be
good," Blastland said, "to have the mental
agility to separate the wheat from the chaff?" He
then proceeds, in six weekly articles, to point
out the obvious vs. the correct ways to interpret
the data. Topics covered are: Surveys, Counting,
Percentages, Averages, Causation, Doubt.
A Statistical Primer from the BBC
61
62
Read

Read Howitt and Cramer pages 168-177
Read Howitt and Cramer (e-text) pages 154-164,
167-173
Read Russo (e-text) pages 168-175
Read Davis and Smith pages 448-459

62
63
General Reading

The Mann-Whitney U: A Test for Assessing Whether
Two Independent Samples Come
from the Same Distribution
N. Nachar
Tutorials in Quantitative Methods for Psychology,
2008, vol. 4(1), pp. 13-20.

63
64
Practical 6

This material is available from the module web
page.
http://www.staff.ncl.ac.uk/mike.cox

Module Web Page
64
65
Practical 6

This material for the practical is available.

Instructions for the practical Practical 6
Material for the practical Practical 6
65
66
Whoops!
Be under no illusion the £139.50 for colour
licence (£47 for black and white) will be
infinitely more affordable than the maximum
£1,000 fine for avoidance. The Guardian, 4
November 2008
66
