Title: Test to See if Samples Come From Same Population
1. Lesson 15-7
- Test to See if Samples Come From Same Population
2. Objectives
- Test a claim using the Kruskal-Wallis test
3. Vocabulary
- Kruskal-Wallis Test: nonparametric procedure used to test the claim that k (3 or more) independent samples come from populations with the same distribution.
4. Test of Means of 3 or More Groups
- Parametric test of the means of three or more groups (ANOVA)
  - Compared the variation among the sample means with the variation within the samples
  - Performed a test of whether all of the population means are equal
- Nonparametric case for three or more groups
  - Combine all of the samples and rank this combined set of data
  - Compare the rankings for the different groups
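The "combine and rank" step can be sketched in a few lines of Python. This is a minimal illustration, not part of the lesson: the function name `midranks` and the three small groups are hypothetical, and tied values are given the mean of the ranks they occupy.

```python
def midranks(samples):
    """Rank the pooled observations; tied values share the mean of their ranks."""
    pooled = sorted(value for sample in samples for value in sample)
    rank_of = {}
    i = 0
    while i < len(pooled):
        # Find the run of equal values that starts at position i.
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # Positions i..j-1 occupy ranks i+1..j; assign their mean.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    # Return the ranks in the same layout as the input samples.
    return [[rank_of[value] for value in sample] for sample in samples]


# Hypothetical data, chosen only to show one tie (the two 15s).
groups = [[12, 15], [15, 20], [9, 30]]
ranks = midranks(groups)
rank_sums = [sum(r) for r in ranks]
print(ranks)
print(rank_sums)
```

The rank sums always add up to N(N + 1)/2, where N is the total number of observations, which is a quick check on the ranking.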
5. Kruskal-Wallis Test
- Assumptions
  - Samples are simple random samples from three or more populations
  - Data can be ranked
- If the populations have the same distribution, we would expect the values of the samples, when combined into one large dataset, to be interspersed with each other
- Thus we expect the average ranks of the samples to be about the same
6. Test Statistic for Kruskal-Wallis Test
A computational formula for the test statistic is

$$H = \frac{12}{N(N+1)} \left[ \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k} \right] - 3(N+1)$$

where
- $R_i$ is the sum of the ranks of the ith sample
- $R_1^2$ is the square of the sum of the ranks for the first sample, $R_2^2$ is the square of the sum of the ranks for the second sample, and so on
- $n_1$ is the number of observations in the first sample, $n_2$ is the number of observations in the second sample, and so on
- $N$ is the total number of observations ($N = n_1 + n_2 + \cdots + n_k$)
- $k$ is the number of populations being compared
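As a rough sketch, the computational formula translates directly into code. The function name `kruskal_wallis_h` and the rank sums used here are hypothetical; the function applies the formula as stated, with no adjustment for ties (some statistical software applies a tie correction, so its output can differ slightly).

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)."""
    n_total = sum(sizes)  # N, the total number of observations
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical rank sums for three samples of two observations each.
h = kruskal_wallis_h([5.5, 8.5, 7.0], [2, 2, 2])
print(round(h, 4))
```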
7. Test Statistic (cont)
- Large values of the test statistic H indicate that the rank sums $R_i$ are different than expected
- If H is too large, then we reject the null hypothesis that the distributions are the same
- This is always a right-tailed test
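One way to see why only large values of H count against the null hypothesis is to rewrite the statistic in terms of mean ranks. The notation $\bar{R}_i = R_i / n_i$ (the mean rank of sample i) is introduced here for illustration and does not appear in the lesson. If the distributions are the same, each mean rank should be close to the overall mean rank $(N+1)/2$, and H measures the squared departures from it:

```latex
\begin{align*}
\frac{12}{N(N+1)} \sum_{i=1}^{k} n_i \left( \bar{R}_i - \frac{N+1}{2} \right)^2
  &= \frac{12}{N(N+1)} \left[ \sum_{i=1}^{k} \frac{R_i^2}{n_i}
     - (N+1) \sum_{i=1}^{k} R_i + \frac{N(N+1)^2}{4} \right] \\
  &= \frac{12}{N(N+1)} \left[ \sum_{i=1}^{k} \frac{R_i^2}{n_i}
     - \frac{N(N+1)^2}{4} \right]
     \qquad \text{since } \sum_{i=1}^{k} R_i = \frac{N(N+1)}{2} \\
  &= \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1) = H
\end{align*}
```

Because H is a sum of squares, it is never negative, and it grows as the mean ranks move away from $(N+1)/2$ in either direction. That is why the rejection region lies in the right tail only.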
8. Critical Value for Kruskal-Wallis Test
Small-Sample Case: When three populations are being compared and the sample size from each population is 5 or less, the critical value is obtained from Table XIV in Appendix A.
Large-Sample Case: When four or more populations are being compared, or the sample size from one population is more than 5, the critical value is $\chi^2_\alpha$ with $k - 1$ degrees of freedom, where k is the number of populations and $\alpha$ is the level of significance.
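For the large-sample case, the critical value can also be computed instead of read from a table. The sketch below uses only the Python standard library: it evaluates the chi-square right-tail probability for a whole number of degrees of freedom and finds the critical value by bisection. The names `chi2_sf` and `chi2_critical` are hypothetical, and this is an illustration under those assumptions rather than a replacement for the textbook tables. It does not cover the small-sample case, which still requires Table XIV.

```python
import math


def chi2_sf(x, df):
    """Right-tail probability P(X > x) for chi-square with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{j=0}^{df/2 - 1} y^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= y / j
            total += term
        return math.exp(-y) * total
    # Odd df: start from the df = 1 tail, erfc(sqrt(y)), and add one
    # term y^(j - 1/2) * exp(-y) / Gamma(j + 1/2) for each extra 2 df.
    total = math.erfc(math.sqrt(y))
    term = math.sqrt(y) * math.exp(-y) / math.gamma(1.5)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= y / (j + 0.5)
    return total


def chi2_critical(alpha, df):
    """Value c with P(X > c) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # widen until the tail drops below alpha
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Three populations at the 0.05 level: k - 1 = 2 degrees of freedom.
cv = chi2_critical(0.05, 2)
print(round(cv, 3))
```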
9. Hypothesis Tests Using Kruskal-Wallis Test
Step 0: Requirements
1. The samples are independent random samples.
2. The data can be ranked.
Step 1: Box Plots. Draw side-by-side boxplots to compare the sample data from the populations. Doing so helps to visualize the differences, if any, between the medians.
Step 2: Hypotheses (claim is made regarding the distributions of three or more populations)
H0: the distributions of the populations are the same
H1: the distributions of the populations are not the same
Step 3: Ranks. Rank all sample observations from smallest to largest. Handle ties by finding the mean of the ranks for tied values. Find the sum of the ranks for each sample.
Step 4: Level of Significance (the level of significance determines the critical value)
- The critical value is found from Table XIV for small samples.
- The critical value is $\chi^2_\alpha$ with $k - 1$ degrees of freedom (found in Table VI) for large samples.
Step 5: Compute Test Statistic

$$H = \frac{12}{N(N+1)} \left[ \frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \cdots + \frac{R_k^2}{n_k} \right] - 3(N+1)$$

Step 6: Critical Value Comparison. We reject the null hypothesis if the test statistic is greater than the critical value.
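Steps 3, 5, and 6 can be strung together in a short function. This is a sketch under stated assumptions: the name `kruskal_wallis_test` and the three samples are hypothetical, the critical value is passed in as if it had been looked up in Step 4, and no tie correction is applied.

```python
def kruskal_wallis_test(samples, critical_value):
    """Return (H, reject) for a list of samples and a given critical value."""
    # Step 3: rank the pooled data, giving tied values the mean of their ranks.
    pooled = sorted(value for sample in samples for value in sample)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    rank_sums = [sum(rank_of[value] for value in sample) for sample in samples]

    # Step 5: compute the test statistic H from the rank sums.
    n_total = len(pooled)
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Step 6: right-tailed comparison with the critical value.
    return h, h > critical_value


# Hypothetical samples of six observations each (large-sample case, k = 3).
sample_a = [10, 12, 13, 15, 16, 18]
sample_b = [20, 21, 23, 24, 26, 27]
sample_c = [30, 31, 33, 35, 36, 38]

# 5.991 is the chi-square critical value for 2 degrees of freedom at 0.05.
h, reject = kruskal_wallis_test([sample_a, sample_b, sample_c], 5.991)
print(round(h, 3), reject)
```

The hypothetical samples do not overlap at all, so the rank sums are as far apart as they can be and H lands well above the critical value.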
10. Kruskal-Wallis Test Hypothesis
- In this test, the hypotheses are
  - H0: The distributions of all of the populations are the same
  - H1: The distributions of all of the populations are not the same
- This is a stronger hypothesis than in ANOVA, where only the means (and not the entire distributions) are compared
11. Example 1 from 15.7
Each entry is the observed value followed by its rank in the combined data in parentheses.

S               20-29         40-49         60-69
1               54 (29)       61 (31.5)     44 (18)
2               43 (16)       41 (14)       65 (34.5)
3               38 (11.5)     44 (18)       62 (33)
4               30 (2)        47 (21)       53 (27.5)
5               61 (31.5)     33 (3)        51 (26)
6               53 (27.5)     29 (1)        49 (22.5)
7               35 (7.5)      59 (30)       49 (22.5)
8               34 (4.5)      35 (7.5)      42 (15)
9               39 (13)       34 (4.5)      35 (7.5)
10              46 (20)       74 (36)       44 (18)
11              50 (24.5)     50 (24.5)     37 (10)
12              35 (7.5)      65 (34.5)     38 (11.5)
Medians (Sums)  41 (194.5)    45.5 (225.5)  46.5 (246)
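The ranks, rank sums, and medians in the table can be checked with a short script. The observed values are copied from the table; the variable names are illustrative only.

```python
from statistics import median

# Observed values copied from the Example 1 table, one list per column.
samples = {
    "20-29": [54, 43, 38, 30, 61, 53, 35, 34, 39, 46, 50, 35],
    "40-49": [61, 41, 44, 47, 33, 29, 59, 35, 34, 74, 50, 65],
    "60-69": [44, 65, 62, 53, 51, 49, 49, 42, 35, 44, 37, 38],
}

# Rank the combined data; tied values share the mean of the ranks they occupy.
pooled = sorted(value for column in samples.values() for value in column)
rank_of = {}
i = 0
while i < len(pooled):
    j = i
    while j < len(pooled) and pooled[j] == pooled[i]:
        j += 1
    rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
    i = j

rank_sums = {name: sum(rank_of[v] for v in column) for name, column in samples.items()}
medians = {name: median(column) for name, column in samples.items()}
print(rank_sums)
print(medians)
```

The three rank sums add up to 666 = 36(37)/2, as they must for N = 36 observations.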
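12. Example 1 (cont)
Test Statistic: H ≈ 1.009, computed from the rank sums 194.5, 225.5, and 246 with 12 observations in each sample (N = 36)
Critical Value (Large-Sample Case): $\chi^2_\alpha$ with 2 (= 3 - 1) degrees of freedom, where 3 is the number of populations and 0.05 is the level of significance; CV = 5.991
Conclusion: Since H < CV, we fail to reject H0 (there is not sufficient evidence to conclude that the distributions are different)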
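The arithmetic behind this comparison can be reproduced from the rank sums in the Example 1 table. The sketch assumes the computational formula for H without a tie correction, and it uses the fact that for 2 degrees of freedom the chi-square right-tail probability is exp(-x/2), so the critical value is -2 ln(α). Variable names are illustrative.

```python
import math

rank_sums = [194.5, 225.5, 246.0]  # from the Example 1 table
sizes = [12, 12, 12]               # observations per sample
alpha = 0.05

n_total = sum(sizes)  # N = 36
weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# With k - 1 = 2 degrees of freedom, P(X > x) = exp(-x / 2),
# so the critical value solves exp(-cv / 2) = alpha.
cv = -2.0 * math.log(alpha)

reject = h > cv
print(round(h, 3), round(cv, 3), reject)
```

H comes out near 1.009 against a critical value near 5.991, so the null hypothesis is not rejected.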
13. Summary and Homework
- Summary
  - The Kruskal-Wallis test is a nonparametric test for comparing the distributions of three or more populations
  - This test is a comparison of the rank sums of the populations
  - Critical values for small samples are given in tables
  - The critical values for large samples can be approximated by a calculation with the chi-square distribution
- Homework
  - Problems 3, 6, 7, 10 from the CD