Nonparametric Inference - PowerPoint PPT Presentation

Slides: 55
Provided by: wsu45
Transcript and Presenter's Notes


1
Nonparametric Inference
2
Why Nonparametric Tests?
  • We have been primarily discussing parametric
    tests, i.e. tests that rest on certain
    assumptions about when they are valid. For
    example, t-tests and ANOVA both make assumptions
    about the shape of the distribution (normality)
    and about the groups having similar spread
    (homogeneity of variance).
  • When these assumptions hold we can use standard
    sampling distributions (e.g. t-distribution,
    F-distribution) to find p-values.

3
Why Nonparametric Tests?
  • When these assumptions are violated it is
    necessary to turn to tests that do not have such
    stringent assumptions: nonparametric or
    "distribution-free" tests.
  • Specifically, there are three cases which
    necessitate the use of nonparametric tests:

1) The data for the response is not at least
interval scale, i.e. measurements. For
example the response might be ordinal.
2) The distribution of the data for the response
is not normal. Recall that a
relatively normal distribution is assumed
for parametric tests.
3) There exist severely unequal variances
between groups, i.e. there is an obvious
violation of the homogeneity of variance
assumption required for parametric tests.
In the last two cases, we have interval level
data, but it violates our parametric assumptions.
Therefore, we no longer treat this data as
interval, but as ordinal. In a sense, we demote
it because it fails to meet specific assumptions.
4
Table of Parametric and Nonparametric Tests
Parametric Test | Nonparametric Test | Purpose of Test
Two-Sample t-Test (either case) | Mann-Whitney/Wilcoxon Rank Sum Test | Compare two independent samples
Paired t-Test | Sign Test or Wilcoxon Signed-Rank Test | Compare dependent samples
One-way ANOVA | Kruskal-Wallis Test | Compare k independent samples
5
Independent Samples
  • For two populations we use
  • Mann-Whitney/Wilcoxon Rank Sum Test
  • For three or more populations we use
  • Kruskal-Wallis Test (at the end)

6
Mann-Whitney/Wilcoxon Rank Sum Test
  • Alternative to two-sample t-Test
  • Use when
  • - populations being sampled are not normally
    distributed.
  • - sample sizes are small so assessing
    normality is not possible (ni < 20).
  • - response is ordinal.

7
Mann-Whitney/Wilcoxon Rank Sum Test
  • General Hypotheses
  • Ho: the distributions of pop. A and pop. B are
    the same, i.e. A = B
  • HA: the distributions of pop. A and pop. B are
    NOT the same, i.e. A ≠ B
  • HA: the distribution of pop. A is shifted to the
    right of pop. B, i.e. A > B
  • HA: the distribution of pop. A is shifted to the
    left of pop. B, i.e. A < B

8
Mann-Whitney/Wilcoxon Rank Sum Test
  • Ho: A = B vs.
    HA: A > B

Q: Is there evidence that the values in
population A are generally larger than those in
population B?
9
Mann-Whitney/Wilcoxon Rank Sum Test (Test
Procedure)
  • Rank all N = nA + nB observations in the
    combined sample from both populations in
    ascending order. Assign the average rank to
    tied observations.
  • Sum the ranks of the observations from
    populations A and B separately and denote the
    sums wA and wB.
  • For HA: A < B reject Ho if wA is small or wB is
    big. For HA: A > B reject Ho if wA is big or
    wB is small.
  • Use tables to determine how big or small the
    rank sums must be in order to reject Ho, or use
    software to conduct the test.

10
Mann-Whitney/Wilcoxon Rank Sum Test (Critical
Value Table)
This table contains the value the smaller rank
sum must be less than in order to reject Ho
in a one-tailed test at two significance
levels (α = .05 and α = .01). Tables exist
for two-tailed tests as well.
n is the sample size of the group with the
smaller rank sum.
11
Example: Huntington's Disease and
Fasting Glucose Levels
  • Davidson et al. studied the responses to oral
    glucose in patients with Huntington's disease and
    in a group of control subjects. The five-hour
    responses are shown below. Is there evidence to
    suggest the five-hour glucose (mg percent) is
    greater for patients with Huntington's disease?

Ho: Control = Huntington's, i.e. C = H
HA: Control < Huntington's, i.e. C < H
12
Example: Observations and Ranks
Control Group (nA = 10) | Rank | Huntington's Disease (nB = 11) | Rank
83 | 9 | 85 | 10.5
73 | 3 | 89 | 15
65 | 1.5 | 86 | 13
65 | 1.5 | 91 | 17
90 | 16 | 77 | 5.5
77 | 5.5 | 93 | 19
78 | 7 | 100 | 21
97 | 20 | 82 | 8
85 | 10.5 | 92 | 18
75 | 4 | 86 | 13
 | | 86 | 13
Rank sums: wA = 78, wB = 153
13
Example Critical Value Table
Here nC = 10 (control) and nH = 11 (Huntington's).
We will reject Ho: C = H in favor of HA: C < H if
the rank sum for the control group is less than
86 at the α = .05 level and less than 77 at the
α = .01 level.
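The ranking and rank sums for this example can be checked with a short sketch. It uses only the Python standard library; the helper name average_ranks is an illustrative choice, not part of the slides, and the cutoffs 86 and 77 are simply the table values quoted above.

```python
def average_ranks(values):
    """Rank values in ascending order, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


control = [83, 73, 65, 65, 90, 77, 78, 97, 85, 75]
huntington = [85, 89, 86, 91, 77, 93, 100, 82, 92, 86, 86]

# Rank the combined sample, then split the ranks back into the two groups.
ranks = average_ranks(control + huntington)
w_control = sum(ranks[:len(control)])
w_huntington = sum(ranks[len(control):])

print(w_control, w_huntington)           # 78.0 153.0
print(w_control < 86, w_control < 77)    # True False
```

The control rank sum falls below the .05 cutoff but not the .01 cutoff, which is why the conclusion is stated as p < .05. Statistical packages such as SciPy (scipy.stats.mannwhitneyu) provide a packaged version of this test.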
14
Example: Decision/Conclusion
  • Using the Wilcoxon Rank Sum Test we have
    evidence to suggest that the five-hour glucose
    level for individuals with Huntington's disease
    is greater than that for healthy controls
    (p < .05).
  • Note: p < .05 because the observed rank sum for
    the control group (78) is less than 86, which is
    the critical value for α = .05.

15
Rank Sum Test in JMP
The p-values reported are based upon large-sample
approximations, which generally should not be used
when sample sizes are small. Here the conclusion
reached is the same, but in general we should use
tables if they are available.
16
Rank Sum Test in SPSS
Exact one-tailed p-value = .024/2 = .012
17
Dependent Samples
  • Sign Test
  • Wilcoxon Signed-Rank Test

18
Sign Test
  • The sign test can be used in place of the paired
    t-test when we have evidence that the paired
    differences are NOT normally distributed.
  • It can be used when the response is ordinal.
  • Best used when the response is difficult to
    quantify and only improvement can be measured,
    i.e. subject got better, got worse, or no change.
  • Magnitude of the paired difference is lost when
    using this test.

19
Sign Test
  • The sign test looks at the number of (+) and (-)
    differences amongst the nonzero paired
    differences.
  • A preponderance of +s or -s can indicate that
    some type of change has occurred.
  • If the null hypothesis of no change is true we
    expect +s and -s to be equally likely to occur,
    i.e. P(+) = P(-) = .50, and the number of each
    observed follows a binomial distribution.

20
Example Sign Test
  • A study evaluated hepatic arterial infusion of
    floxuridine and cisplatin for the treatment of
    liver metastases of colorectal cancer.
  • Performance scores for 29 patients were recorded
    before and after infusion. Is there evidence
    that patients had a better performance score
    after infusion?

21
Example Sign Test
Patient  Before (B)  After (A)  Difference (A - B)  Patient  Before (B)  After (A)  Difference (A - B)
1 2 1 -1 16 0 0 0
2 0 0 0 17 0 3 3
3 0 0 0 18 2 3 1
4 1 0 -1 19 2 3 1
5 3 3 0 20 3 2 -1
6 1 0 -1 21 0 4 4
7 1 3 2 22 0 3 3
8 0 0 0 23 1 2 1
9 0 0 0 24 0 3 3
10 0 0 0 25 0 2 2
11 1 0 -1 26 1 1 0
12 1 1 0 27 3 3 0
13 2 1 -1 28 1 2 1
14 3 1 -2 29 0 2 2
15 0 0 0
22
Example Sign Test
  • Ho: No change in performance score following
    infusion, or more specifically, median
    change in performance score = 0.
  • HA: Performance scores improve following
    infusion, or more specifically, median
    change in performance score > 0.
  • Intuitively we will reject Ho if there is a
    large number of +s.

23
Example Sign Test
17 nonzero differences: 11 +s and 6 -s
(The data table from the previous slide is repeated
here with the sign of each nonzero difference marked.)
24
Example Sign Test
  • If Ho is true, X = the number of +s has a
    binomial dist. with n = 17 and p = P(+) = .50.
  • Therefore the p-value is simply
    P(X ≥ 11 | n = 17, p = .50) = .166 > α.
  • We fail to reject Ho; there is insufficient
    evidence to conclude the performance score
    improves following infusion (p = .166).
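The binomial tail probability above can be reproduced directly. This is a minimal sketch using the counts stated on the slide (11 plus signs among 17 nonzero differences); the function name sign_test_upper_p is an illustrative choice.

```python
from math import comb


def sign_test_upper_p(n_plus, n_nonzero):
    """P(X >= n_plus) for X ~ Binomial(n_nonzero, 0.5)."""
    tail = sum(comb(n_nonzero, k) for k in range(n_plus, n_nonzero + 1))
    return tail / 2 ** n_nonzero


p_value = sign_test_upper_p(11, 17)
print(round(p_value, 3))  # 0.166
```

Because only the signs enter the calculation, the same function applies to any paired data once the zero differences have been discarded.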

25
Wilcoxon Signed-Rank Test
  • The problem with the sign test is that the
    magnitude or size of the paired differences is
    lost.
  • The Wilcoxon Signed-Rank Test uses ranks of the
    paired differences to retain some sense of their
    size.
  • Use when the distribution of the paired
    differences is NOT normal or when the sample
    size is small.
  • Can be used with an ordinal response.

26
Wilcoxon Signed Rank Test (Test Procedure)
  • Exclude any differences which are zero.
  • Put the remaining differences in ascending order
    of absolute value, i.e. ignoring their signs.
  • Assign them ranks.
  • If any absolute differences are equal, average
    their ranks.

27
Example Wilcoxon Signed Rank Test
  • Resting Energy Expenditure (REE) for Patients
    with Cystic Fibrosis
  • A researcher believes that patients with cystic
    fibrosis (CF) expend greater energy during
    resting than those without CF. To obtain a fair
    comparison she matches 13 patients with CF to 13
    patients without CF on the basis of age, sex,
    height, and weight.

28
Example Wilcoxon Signed Rank Test
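Pair | CF (C) | Healthy (H) | Difference d = C - H | Sign of Difference | Abs. Diff. | Rank of Abs. Diff. | Signed Rank
1 | 1153 | 996 | 157 | + | 157 | 6 | 6
2 | 1132 | 1080 | 52 | + | 52 | 3 | 3
3 | 1165 | 1182 | -17 | - | 17 | 2 | -2
4 | 1460 | 1452 | 8 | + | 8 | 1 | 1
5 | 1634 | 1162 | 472 | + | 472 | 13 | 13
6 | 1493 | 1619 | -126 | - | 126 | 5 | -5
7 | 1358 | 1140 | 218 | + | 218 | 8 | 8
8 | 1453 | 1123 | 330 | + | 330 | 11 | 11
9 | 1185 | 1113 | 72 | + | 72 | 4 | 4
10 | 1824 | 1463 | 361 | + | 361 | 12 | 12
11 | 1793 | 1632 | 161 | + | 161 | 7 | 7
12 | 1930 | 1614 | 316 | + | 316 | 10 | 10
13 | 2075 | 1836 | 239 | + | 239 | 9 | 9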
29
Example Wilcoxon Signed Rank Test
Pair | CF (C) | Healthy (H) | Difference d = C - H | Signed Rank
1 | 1153 | 996 | 157 | 6
2 | 1132 | 1080 | 52 | 3
3 | 1165 | 1182 | -17 | -2
4 | 1460 | 1452 | 8 | 1
5 | 1634 | 1162 | 472 | 13
6 | 1493 | 1619 | -126 | -5
7 | 1358 | 1140 | 218 | 8
8 | 1453 | 1123 | 330 | 11
9 | 1185 | 1113 | 72 | 4
10 | 1824 | 1463 | 361 | 12
11 | 1793 | 1632 | 161 | 7
12 | 1930 | 1614 | 316 | 10
13 | 2075 | 1836 | 239 | 9
We then calculate the sum of the positive ranks
(T+) and the sum of the negative ranks (T-).
Here we have T+ = 6 + 3 + 1 + 13 + 8 + 11 + 4 +
12 + 7 + 10 + 9 = 84 and T- = 2 + 5 = 7.
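The two rank sums can be recomputed from the raw REE values with a short sketch. It assumes there are no tied absolute differences (true for these 13 pairs, and checked by an assertion); the variable names are illustrative.

```python
cf = [1153, 1132, 1165, 1460, 1634, 1493, 1358, 1453, 1185, 1824, 1793, 1930, 2075]
healthy = [996, 1080, 1182, 1452, 1162, 1619, 1140, 1123, 1113, 1463, 1632, 1614, 1836]

# Paired differences d = C - H, dropping any zeros.
diffs = [c - h for c, h in zip(cf, healthy)]
nonzero = [d for d in diffs if d != 0]

# This simple ranking is only valid when no two |d| values are tied.
assert len({abs(d) for d in nonzero}) == len(nonzero)

# Rank by absolute value, then attach the sign of each difference.
by_size = sorted(nonzero, key=abs)
rank_of = {d: r for r, d in enumerate(by_size, start=1)}
signed_ranks = [rank_of[d] if d > 0 else -rank_of[d] for d in nonzero]

t_plus = sum(r for r in signed_ranks if r > 0)
t_minus = -sum(r for r in signed_ranks if r < 0)
print(t_plus, t_minus, min(t_plus, t_minus))  # 84 7 7
```

With tied absolute differences the tied ranks would have to be averaged, as described in the test procedure.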
30
Wilcoxon Signed Rank Test (Test Statistic)
  • Intuitively we will reject Ho, which states
    that there is no difference between the
    populations, if either one of these rank sums is
    large and the other is small.
  • The Wilcoxon Signed Rank Test uses the smaller
    rank sum, T = min(T+, T-), as the test
    statistic.

31
Example Wilcoxon Signed Rank Test
  • For the cystic fibrosis example we have the
    following hypotheses:
  • Ho: there is no difference in the resting energy
    expenditure of individuals with CF and healthy
    controls who are the same gender, age, height,
    and weight (median paired difference = 0).
  • HA: the resting energy expenditure of
    individuals with CF is greater than that of
    healthy individuals who are the same gender, age,
    height, and weight (median paired difference > 0).

32
Example Wilcoxon Signed Rank Test
  • HA: the resting energy expenditure of
    individuals with CF is greater than that of
    healthy individuals who are the same gender, age,
    height, and weight.
  • The alternative is clearly supported if T+ is
    large or T- is small.
  • The test statistic is T = min(T+, T-) = 7.
  • Is T = 7 considered small, i.e. what is the
    corresponding p-value?
  • To answer this question we need a Wilcoxon Signed
    Rank Test table or statistical software.

33
Example Wilcoxon Signed Rank Test
This table gives the value of T = min(T+, T-)
that our observed value must be less than in
order to reject Ho for both two- and
one-tailed tests. Here we have n = 13 and T = 7.
We can see that our test statistic is less than
21 (α = .05) and 12 (α = .01), so we will reject
Ho, and we also estimate that our p-value < .01.
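When no table is at hand, the exact one-tailed p-value can be obtained by enumeration. The sketch below assumes n untied nonzero differences, so that under Ho every subset of the ranks 1..n is equally likely to be the set of negative ranks; the function name signed_rank_lower_p is illustrative.

```python
from itertools import combinations


def signed_rank_lower_p(t_obs, n):
    """Exact P(T- <= t_obs) under Ho for n untied nonzero differences."""
    ranks = range(1, n + 1)
    hits = 0
    # Count the subsets of ranks whose sum is no larger than the observed T-.
    for size in range(n + 1):
        for subset in combinations(ranks, size):
            if sum(subset) <= t_obs:
                hits += 1
    return hits / 2 ** n


p_value = signed_rank_lower_p(7, 13)
print(round(p_value, 4))  # 0.0023
```

Enumerating all 2^13 = 8192 subsets is instant at this sample size; for much larger n a recursive count or the normal approximation would be the practical choice.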
34
Example Wilcoxon Signed Rank Test
  • We conclude that individuals with cystic fibrosis
    (CF) have a larger resting energy expenditure when
    compared to healthy individuals who are the same
    gender, age, height, and weight (p < .01).

35
Analysis in JMP
The test statistic is reported as (T+ - T-)/2 =
(84 - 7)/2 = 38.50, but we only need the p-value
= .0023.
36
Analysis in SPSS
Click on CF first and then Healthy to specify
that the paired difference will be defined as
CF - Healthy, then specify which tests to conduct.
Note: the Difference column is not actually used
in the SPSS analysis.
37
Analysis in SPSS
For the one-tailed Wilcoxon Signed Rank Test our
p-value = .007/2 = .0035 (not exact!). For
the Sign Test we have a one-tailed p-value
= .022/2 = .011.
38
Independent Samples
  • If we have three or more populations to compare
    we use
  • Kruskal-Wallis Test

39
Kruskal-Wallis Test
  • One-way ANOVA for a completely randomized design
    is based on the assumption of normality and
    equality of variance.
  • The nonparametric alternative not relying on
    these assumptions is called the Kruskal-Wallis
    Test.
  • Like the Mann-Whitney/Wilcoxon Rank Sum Test we
    use the sum of the ranks assigned to each group
    when considering the combined sample as the basis
    for our test statistic.

40
Kruskal-Wallis Test
  • Basic Idea
  • 1) Looking at all observations together, rank
    them.
  • 2) Let R1, R2, ..., Rk be the sums of the ranks
    of each group.
  • 3) If some Ri's are much larger than others,
    it indicates the response values in different
    groups come from different populations.

41
Kruskal-Wallis Test
  • The test statistic is
    $H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
  • where
  • N = total sample size = n1 + n2 + ... + nk

42
Kruskal-Wallis Test
  • The test statistic is
    $H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
  • Under the null hypothesis, this has an
    approximate chi-square distribution with
    df = k - 1, i.e. $H \sim \chi^2_{k-1}$.
  • The approximation is OK when each group contains
    at least 5 observations.
  • N = total sample size = n1 + n2 + ... + nk

43
Chi-squared Distribution and p-value
Area = p-value
44
Example Kruskal-Wallis Test
  • A clinical trial evaluating the fever reducing
    effects of aspirin, ibuprofen, and acetaminophen
    was conducted. Study subjects were adults seen
    in an ER with diagnoses of flu with body
    temperatures between 100°F and 100.9°F.
    Subjects were randomly assigned to treatment.
    Changes in body temperature were recorded 2 hrs.
    after administration of treatments.

45
Example Kruskal-Wallis Test
  • Resulting Data: Temperature Decrease (deg. F)

Aspirin | Rank | Ibuprofen | Rank | Acetaminophen | Rank
.95 | 8 | .39 | 5 | .19 | 4
1.48 | 14 | .44 | 6 | 1.02 | 9
1.33 | 12 | 1.31 | 11 | .07 | 3
1.28 | 10 | 2.48 | 15 | .01 | 2
 | | 1.39 | 13 | .62 | 7
 | | | | -.39 (i.e. temp increase) | 1
N = 15, R1 = 44, R2 = 50, R3 = 26
n1 = 4, n2 = 5, n3 = 6
46
Example Kruskal-Wallis Test
N = 15, R1 = 44, R2 = 50, R3 = 26
n1 = 4, n2 = 5, n3 = 6
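The rank sums, the statistic H, and its p-value for this example can be reproduced with a short sketch. It assumes the usual Kruskal-Wallis form H = 12/(N(N+1)) · Σ Ri²/ni - 3(N+1), which matches the H = 6.833 reported by the software, and it relies on there being no tied observations. For df = 2 the chi-square upper tail has the closed form exp(-H/2), so no statistics library is needed.

```python
from math import exp

groups = {
    "aspirin": [0.95, 1.48, 1.33, 1.28],
    "ibuprofen": [0.39, 0.44, 1.31, 2.48, 1.39],
    "acetaminophen": [0.19, 1.02, 0.07, 0.01, 0.62, -0.39],
}

# Rank the pooled observations (valid as written because there are no ties).
pooled = sorted(x for xs in groups.values() for x in xs)
assert len(set(pooled)) == len(pooled)
rank_of = {x: r for r, x in enumerate(pooled, start=1)}
n_total = len(pooled)

# Rank sum per group, then the H statistic.
rank_sums = {name: sum(rank_of[x] for x in xs) for name, xs in groups.items()}
h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(xs) for name, xs in groups.items()
) - 3 * (n_total + 1)

# Chi-square upper tail for df = k - 1 = 2.
p_value = exp(-h / 2)

print(rank_sums)
print(round(h, 3), round(p_value, 3))  # 6.833 0.033
```

For other numbers of groups the closed form no longer applies and a chi-square table or a statistics library would be needed for the tail area.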
47
Chi-squared Distribution and p-value
Area = .033
48
Kruskal-Wallis in JMP (Demo)
Analyze > Fit Y by X
RESULTS: R1 = 44, n1 = 4, R2 = 50, n2 = 5, R3 = 26,
n3 = 6, H = 6.833, df = 2, p = .033
49
Kruskal-Wallis in SPSS (Demo)
RESULTS: R1/n1 = 11.00, R2/n2 = 10.00, R3/n3 =
4.33, H = 6.833, df = 2, p = .033
50
Decision/Conclusion
  • Using the Kruskal-Wallis test we have evidence to
    suggest that the temperature changes after taking
    the different drugs are not the same (p = .033).
  • Now we might like to know which drugs
    significantly differ from one another.

51
Multiple Comparisons for Kruskal-Wallis Test
  • If we decide at least two populations differ in
    terms of what is typical of their values we can
    use multiple comparisons to determine which
    populations differ.
  • To do this we calculate an approximate p-value
    for each pair-wise comparison and then compare
    that p-value to a Bonferroni corrected
    significance level (α).

52
Multiple Comparisons for Kruskal-Wallis Test

To determine if group i significantly differs
from group j we compute a test statistic
[formula not recoverable from the transcript],
then compute its approximate p-value
and compare it to α/(2m), where m is the number of
possible pair-wise comparisons, m = k(k - 1)/2.
53
Multiple Comparisons for Kruskal-Wallis Test
  • Comparing Aspirin to Acetaminophen

N = 15
Aspirin: R1 = 44, n1 = 4
Acetaminophen: R3 = 26, n3 = 6
Computing the Bonferroni corrected significance
level we have α/(2m) = .05/(2 · 3) = .00833.
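The transcript does not preserve the comparison formula, so the sketch below is an assumption rather than the slide's own method: it uses the common Dunn-type statistic, the difference in mean ranks divided by sqrt(N(N+1)/12 · (1/ni + 1/nj)), with a one-sided normal tail. The function names pairwise_z and upper_tail are illustrative.

```python
from math import erfc, sqrt


def pairwise_z(r_i, n_i, r_j, n_j, n_total):
    """Assumed Dunn-type statistic: mean-rank difference over its standard error."""
    se = sqrt(n_total * (n_total + 1) / 12 * (1 / n_i + 1 / n_j))
    return (r_i / n_i - r_j / n_j) / se


def upper_tail(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2))


z = pairwise_z(44, 4, 26, 6, 15)   # aspirin vs. acetaminophen
p_value = upper_tail(abs(z))
cutoff = 0.05 / (2 * 3)            # alpha / (2m) with m = 3 comparisons

print(round(z, 2), round(p_value, 3), p_value < cutoff)  # 2.31 0.01 False
```

Under this assumed formula the p-value of about .0105 exceeds the corrected level .00833, so the comparison is not declared significant.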
54
Multiple Comparisons for Kruskal-Wallis Test
  • This comparison is not significant, so no other
    comparison will be either. How can this be, given
    the significant Kruskal-Wallis result?
  • The problem is that the Bonferroni correction is
    too conservative, and the approximate normality
    of the multiple comparison statistic is valid
    only when sample sizes are large; the sample
    sizes here are quite small.
  • Thus the comparison shown is fine for a
    demonstration of the procedure but the results
    cannot be trusted.

55
Nonparametric Multiple Comparisons in JMP
56
Nonparametric Multiple Comparisons in JMP
57
Nonparametric Tests in R