Title: NonParametric Tests
1Chapter_13
- Non-Parametric Tests
- Field_2005
2What are non-parametric tests?
- They do not make parametric assumptions about the data, such as normality, homogeneity of variances, etc.
- They are therefore also called 'assumption-free' tests.
- They work on the principle of ranking data. The lowest score receives rank 1, the next highest score rank 2, etc., without implying that the intervals between the ranks are equal. Low scores are represented by low ranks, high scores by high ranks. The analysis is then carried out on the ranks and not on the original scores.
- 4 tests will be considered here:
- - Wilcoxon rank-sum test (Mann-Whitney test)
- - Wilcoxon signed-rank test
- - Kruskal-Wallis test
- - Friedman's test
3Outlook Terminology
4 Wilcoxon rank-sum test and Mann-Whitney Test
- With these two tests you can compare 2 independent conditions.
- They are the non-parametric equivalent of the independent t-test.
- Example: The effect of Ecstasy vs. Alcohol is measured using the Beck Depression Inventory (BDI).
5The data: effect of Ecstasy vs. Alcohol
Depression scores were obtained one day after taking the drug (Sunday) and three days later (Wednesday) to find out whether depression develops over time.
6The theory
- The scores are translated into ranks.
- The lowest score gets the lowest rank, the next higher score the next higher rank, up to the highest rank.
- If there is no difference in the depression level between Ecstasy and Alcohol, a similar number of low and high ranks should be found in each group. If we add up the ranks, the summed total of ranks in each group should be about the same.
- If there is a difference between the two groups, e.g., Ecstasy produces higher levels of depression, one would find higher ranks in the Ecstasy group and lower ranks in the Alcohol group.
7Ranking of the data (Sunday and Wednesday)
Identical scores share 'tied ranks'. The actual value of a tied rank is the average of the ranks that constitute it. E.g., the actual rank for the tied ranks 3 and 4, for the 2 occurrences of the score 6 in the Wednesday data, is 3.5. The ranks are summed up for each group and day (Alcohol Wednesday = 59, Ecstasy Wednesday = 151, Alcohol Sunday = 90.5, Ecstasy Sunday = 119.5). The lower of the two sums serves as the test statistic. For Wednesday this is $W_s = 59$. For Sunday, it is $W_s = 90.5$.
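The tied-rank rule can be sketched in a few lines of plain Python. This is only an illustration: the function name `average_ranks` and the five scores are hypothetical and are not the lecture data.

```python
def average_ranks(scores):
    """Rank scores (1 = lowest); tied scores share the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of rank positions pos+1 .. end+1
        for i in order[pos:end + 1]:
            ranks[i] = shared
        pos = end + 1
    return ranks


# Hypothetical scores: the two 6s occupy ranks 3 and 4, so each gets 3.5.
scores = [3, 5, 6, 6, 7]
print(average_ranks(scores))  # [1.0, 2.0, 3.5, 3.5, 5.0]
```

A quick sanity check for any ranking: the ranks of N scores must always add up to N(N+1)/2, with or without ties.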
8The test statistics, mean and SE (Wilcoxon rank-sum test)
- Lower sum of ranks for Wednesday: $W_s = 59$ ($W_s$ = Wilcoxon sum)
- Lower sum of ranks for Sunday: $W_s = 90.5$
- Mean of the test statistic (mean of the Wilcoxon sum, $\bar{W}_s$):
- $\bar{W}_s = \frac{n_1(n_1+n_2+1)}{2} = \frac{10(10+10+1)}{2} = 105$
- SE of the test statistic (SE of $\bar{W}_s$):
- $SE_{\bar{W}_s} = \sqrt{\frac{n_1 n_2 (n_1+n_2+1)}{12}} = \sqrt{\frac{(10 \times 10)(10+10+1)}{12}} = 13.23$
Why 12?
9Test statistic as z-score, significance
- $z = \frac{X - \bar{X}}{s} = \frac{W_s - \bar{W}_s}{SE_{\bar{W}_s}}$
- $z_{Sunday} = \frac{90.5 - 105}{13.23} = -1.10$ (ns)
- $z_{Wednesday} = \frac{59 - 105}{13.23} = -3.48$
- If the absolute value of a z-score is greater than 1.96 (irrespective of + or -), then the test is significant at the .05 level.
- → The group difference for Sunday is n.s., whereas
- → The group difference for Wednesday is significant
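The z-score calculation for the rank-sum test can be retraced with a short sketch. The function name `rank_sum_z` is made up for this illustration; the rank sums and group sizes are the ones given on the slides.

```python
import math


def rank_sum_z(w_s, n1, n2):
    """Return (mean, SE, z) for a Wilcoxon rank sum w_s with group sizes n1, n2."""
    mean_w = n1 * (n1 + n2 + 1) / 2
    se_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return mean_w, se_w, (w_s - mean_w) / se_w


for day, w_s in [("Sunday", 90.5), ("Wednesday", 59.0)]:
    mean_w, se_w, z = rank_sum_z(w_s, 10, 10)
    significant = abs(z) > 1.96  # two-tailed test at the .05 level
    print(day, mean_w, round(se_w, 2), round(z, 2), significant)
# Sunday 105.0 13.23 -1.1 False
# Wednesday 105.0 13.23 -3.48 True
```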
10Mann-Whitney (U) test
- The Mann-Whitney test is similar to the Wilcoxon rank-sum test but uses the U test statistic.
- $U = N_1 N_2 + \frac{N_1(N_1+1)}{2} - R_1$
- $U_{Sunday} = (10 \times 10) + \frac{10(11)}{2} - 119.5 = 35.50$
- $U_{Wednesday} = (10 \times 10) + \frac{10(11)}{2} - 151.0 = 4.00$
- SPSS produces both statistics. Since they are related, they always lead to the same conclusion. Choose for yourself!
$R_1$ = sum of ranks of Group 1, here Ecstasy: 119.5 (Sunday), 151 (Wednesday)
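As a sketch of how U follows from the rank sums, the formula can be evaluated directly. The function name `mann_whitney_u` is hypothetical; the rank sums are those from the slides. The last two lines illustrate, for these data, how U relates to the lower Wilcoxon rank sum (here the Alcohol group with n = 10).

```python
def mann_whitney_u(r1, n1, n2):
    """U = n1*n2 + n1*(n1+1)/2 - R1, where R1 is the rank sum of group 1."""
    return n1 * n2 + n1 * (n1 + 1) / 2 - r1


u_sunday = mann_whitney_u(119.5, 10, 10)     # R1 = Ecstasy rank sum on Sunday
u_wednesday = mann_whitney_u(151.0, 10, 10)  # R1 = Ecstasy rank sum on Wednesday
print(u_sunday, u_wednesday)  # 35.5 4.0

# For these data, U equals the lower rank sum W_s minus n(n+1)/2 of that group.
print(90.5 - 10 * 11 / 2, 59.0 - 10 * 11 / 2)  # 35.5 4.0
```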
11Data input (Ecstasy_Alc) and provisional analysis
- For a between-subjects test, we need a coding variable (as in a between-subjects t-test), e.g.
- 'drug': 1 = ecstasy, 2 = alcohol
- We then have a column for the dependent variable BDI on Sunday (sunbdi) and one for BDI on Wednesday (wedbdi).
12Before running the analysis: run a test of normality (Analyze → Descriptive Statistics → Explore)
- sunbdi and wedbdi go to the dependent list
- 'drug administered' goes to the factor list
- In 'Plots', tick 'Normality plots with tests' to request the tests of normality
13Test of Normality
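[SPSS output: Tests of Normality. Annotations: Ecstasy-sunbdi non-normal; Alcohol-sunbdi normal (ns); Ecstasy-wedbdi normal (ns); Alcohol-wedbdi non-normal]
- Both the K-S test and the Shapiro-Wilk test tell us that the distributions for Ecstasy-sunbdi and Alcohol-wedbdi are not normal.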
14Decision for a non-parametric test
- As we have seen, some of the distributions are non-normal. What can you do?
- Transform the data (z-, logarithmic, etc.)
- Choose a non-parametric test
15Homogeneity of variances
You can request the homogeneity test in the 'Options' of a simple One-way ANOVA. It is also produced automatically if you run a t-test for independent samples.
- Levene's test is n.s. → the variances of the two groups can be assumed equal, for both the Sunday and the Wednesday data
16Further Descriptives (Analyze → Descriptive Statistics → Frequencies or → Descriptives)
- Request basic descriptive statistics such as the mean, median, SD, variance.
- Note that for a non-parametric test, the median is a better indicator of the central tendency than the mean.
17Running the analysis (using your own Ecstasy_Alc.sav)
- Analyze → Nonparametric Tests → 2 Independent Samples
Transfer sunbdi and wedbdi to the 'Test Variable List' and 'drug' to the 'Grouping Variable' window.
18Specifying the dialog boxes
Define the two levels of the grouping variable
Request 'Descriptives' in the 'Options' box
If you have installed 'Exact Tests', you can
request such an 'Exact Test'
19Exact Test
- You may or may not have 'Exact...' in your main dialog box (I haven't). 'Exact Tests' is an extra module of SPSS which needs to be installed.
- It enables an exact test of the significance of the non-parametric test statistic, which is a good thing to have for small samples. However, it is a very time-demanding procedure (can take really long...).
- Instead of an 'Exact Test', a less intensive test can be requested, based on the 'Monte Carlo' method.
- In the Monte Carlo method, a distribution similar to the sample is found and then many samples (up to 10,000) are created, for which the mean significance value and confidence intervals are computed.
20Other options for the Mann-Whitney test
Do not confuse the K-S test for normality with the K-S Z-test!!!
- Kolmogorov-Smirnov Z: The K-S Z-test tests whether two samples have been drawn from the same population. In this respect, it does the same as the M-W test. The K-S Z-test has even better power for small samples.
- Moses Extreme Reactions: This test compares the variability of scores in the two groups, hence it is like a non-parametric Levene test.
- Wald-Wolfowitz runs: This is a variant of the M-W test which looks for 'runs' of scores in a row from the same group, e.g.:
- AAAAAAAAAAAAEEEEAEEEEEEEE
- If the groups are different, runs or ranks for each group should cluster at different ends of the distribution.
21Output from Mann-Whitney Test
Mean Rank = Sum of Ranks / n
Sum of Ranks = all ranks of a group summed up
22Test statistics of the M-W test
Wilcoxon W is the lower $W_s$ for sunbdi and wedbdi.
For the Sunday BDI, there is no significant difference between the two groups, Ecstasy vs. Alcohol. For the Wednesday BDI, there is a significant difference: the average rank is higher in the Ecstasy users (15.1) than in the alcohol users (5.9).
23Comparison with t-test for independent samples
Levene's tests for homogeneity of variances OK
T-test statistics
- → The Wilcoxon rank-sum test and the t-test lead to the same conclusions
24Calculating the effect sizes
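- The effect size r can easily be calculated from the z-scores, where N is the total number of observations (here 20):
- $r = \frac{z}{\sqrt{N}}$
- $r_{Sunday} = \frac{-1.11}{\sqrt{20}} = -.25$ (medium effect)
- $r_{Wednesday} = \frac{-3.48}{\sqrt{20}} = -.78$ (huge effect)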
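A minimal sketch of the effect-size conversion, assuming N = 20 observations (10 per group) as in this example. The helper name `effect_size_r` is hypothetical; the z-scores are the ones quoted on the slide.

```python
import math


def effect_size_r(z, n_obs):
    """Convert a z-score into the effect size r = z / sqrt(N)."""
    return z / math.sqrt(n_obs)


print(round(effect_size_r(-1.11, 20), 2))  # -0.25
print(round(effect_size_r(-3.48, 20), 2))  # -0.78
```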
25Reporting the results (Field_2005_532)
- Ecstasy users (Mdn = 17.5) didn't seem to differ in depression levels from alcohol users (Mdn = 16) the day after the drugs were taken, U = 35.5, ns, r = -.25. However, by Wednesday, ecstasy users (Mdn = 33.5) were significantly more depressed than alcohol users (Mdn = 7.5), U = 4, p ...
Mdn = Median
26Non-parametric tests and statistical power
- With a non-parametric test we avoid the assumptions of a parametric test, esp. normality. However, by ranking the scores rather than analysing the scores directly, we lose information about the magnitude of the difference between the scores (remember, two ranks no longer tell you how far apart, numerically, the two original scores were). Therefore, we may lose statistical power, i.e., we may not detect an effect which is genuinely there.
- However, non-parametric tests are only less powerful if parametric assumptions are met. Thus, if you run a parametric and a non-parametric test over normally-distributed data, then the non-parametric test may be weaker.
27Non-parametric tests and statistical power
- The problem
- For normally-distributed data, the Type I error rate is 5%.
- For non-normally-distributed data, we would not know where the extreme 5% of the distribution lie. It depends on the shape of the distribution.
28Terminology
29Comparing two related conditions: The Wilcoxon signed-rank test
- The Wilcoxon signed-rank test is used when you want to compare 2 conditions within the same subjects.
- It is the non-parametric equivalent of the dependent t-test.
- Example: Measuring the differences between the depression scores on Sunday and Wednesday, from the previous example.
- Note: before, we had only tested the difference between the two groups of Ecstasy and Alcohol users (between-subjects design).
30The theory
- First, the differences between the scores in the two conditions are obtained; then the differences are ranked by their absolute size.
- Additionally, the sign (positive/negative) of the difference is assigned to the rank.
31Ranking the data
The test statistic is the smaller of the two summed ranks (positive or negative): T = 0 for Ecstasy and T = 8 for Alcohol
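The ranking procedure for the signed-rank test can be sketched as follows. The function name `signed_rank_sums` and the six pairs of scores are hypothetical (not the lecture data); zero differences are dropped and tied absolute differences share an average rank.

```python
def signed_rank_sums(before, after):
    """Return (sum of positive ranks, sum of negative ranks) for after - before."""
    diffs = [a - b for b, a in zip(before, after) if a != b]  # drop zero differences
    ordered = sorted(diffs, key=abs)  # rank by absolute size
    pos_sum = neg_sum = 0.0
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        rank = (i + 1 + j + 1) / 2  # average rank for tied absolute differences
        for d in ordered[i:j + 1]:
            if d > 0:
                pos_sum += rank
            else:
                neg_sum += rank
        i = j + 1
    return pos_sum, neg_sum


# Hypothetical scores for 6 subjects; one subject does not change at all.
before = [10, 12, 9, 14, 11, 8]
after = [13, 11, 9, 18, 13, 10]
pos_sum, neg_sum = signed_rank_sums(before, after)
t = min(pos_sum, neg_sum)  # test statistic T = the smaller rank sum
print(pos_sum, neg_sum, t)  # 14.0 1.0 1.0
```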
32Calculating significance
- General formulas:
- $\bar{T} = \frac{n(n+1)}{4}$ (mean of the test statistic)
- $SE_{\bar{T}} = \sqrt{\frac{n(n+1)(2n+1)}{24}}$ (SE of the test statistic)
33Calculating significance
- $\bar{T}_{Ecstasy} = \frac{8(8+1)}{4} = 18$
- $SE_{\bar{T},Ecstasy} = \sqrt{\frac{8(8+1)(16+1)}{24}} = 7.14$
- $\bar{T}_{Alcohol} = \frac{10(10+1)}{4} = 27.5$
- $SE_{\bar{T},Alcohol} = \sqrt{\frac{10(10+1)(20+1)}{24}} = 9.81$
Note that n = 8 in the Ecstasy group now, because the 2 ties (zero differences) are excluded.
34z-scores
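$T_{Ecstasy} = 0$, $\bar{T}_{Ecstasy} = 18$, $SE_{\bar{T},Ecstasy} = 7.14$; $T_{Alcohol} = 8$, $\bar{T}_{Alcohol} = 27.5$, $SE_{\bar{T},Alcohol} = 9.81$
- $z = \frac{X - \bar{X}}{s} = \frac{T - \bar{T}}{SE_{\bar{T}}}$
- $z_{Ecstasy} = \frac{0 - 18}{7.14} = -2.52$
- $z_{Alcohol} = \frac{8 - 27.5}{9.81} = -1.99$
- → Both values are larger in magnitude than 1.96, so there is a significant difference in depression scores between Sunday and Wednesday for both drugs, Ecstasy and Alcohol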
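The mean, SE and z-score of T can be recomputed with a short sketch. The function name `signed_rank_z` is hypothetical; T and n are taken from the slides (n = 8 for Ecstasy after excluding ties, n = 10 for Alcohol).

```python
import math


def signed_rank_z(t, n):
    """Return (mean, SE, z) for a signed-rank statistic t based on n non-zero differences."""
    mean_t = n * (n + 1) / 4
    se_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return mean_t, se_t, (t - mean_t) / se_t


for drug, t, n in [("Ecstasy", 0, 8), ("Alcohol", 8, 10)]:
    mean_t, se_t, z = signed_rank_z(t, n)
    print(drug, mean_t, round(se_t, 2), round(z, 2))
# Ecstasy 18.0 7.14 -2.52
# Alcohol 27.5 9.81 -1.99
```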
35Before running the analysis (using your own Ecstasy_Alc.sav)
- Before running the analysis, you have to split the file for the Ecstasy and the Alcohol group
- Data → Split File → Organize output by groups
36Running Wilcoxon signed-rank test (using your own Ecstasy_Alc.sav)
- Analyze → Nonparametric Tests → 2 Related Samples
If you have 'Exact', choose 'Asymptotic only'
37Alternatives for Wilcoxon signed-rank test
- In the main dialog window, there are 3 alternative tests which you may choose instead of Wilcoxon:
- 1. Sign: It only considers the direction of the differences (pos or neg), irrespective of the magnitude of change. Therefore, it loses power.
- 2. McNemar: Good for nominal (not ordinal) data, i.e., two related dichotomous variables.
- 3. Marginal Homogeneity: Extension of the McNemar test for ordinal data. Equivalent to Wilcoxon.
- (My version of SPSS does not have this option)
38Aside: Request Descriptive Statistics (Analyze → Descriptive Statistics → Frequencies). Before, split the file according to 'kind of drug'!
Later, when you report the results, you will need esp. the median (Mdn), which is a better measure of central tendency than the mean for non-parametric data.
These are the outputs for the split data (Ecstasy and Alcohol separately).
39You can also request the medians from the descriptive statistics of the signed-rank test by clicking on 'Quartiles' in the 'Options'
[SPSS output: descriptive statistics for Ecstasy and for Alcohol]
40Output for Ecstacy
- SPSS first gives the results for the Ecstasy group
There were no negative differences (no case with Wed < Sun). There were 8 positive differences (Wed > Sun). There were 2 ties (same values), which are excluded.
→ All included differences (8 out of 10, since the 2 ties were excluded) were positive, i.e., depression scores were always higher on Wednesday than they were on Sunday, for Ecstasy.
The difference between Sunday and Wednesday (z-score) is significant! The z-score is based on the negative ranks since their sum is the smaller one.
41Output for Alcohol
- SPSS then gives the results for the Alcohol group
There were 9 negative differences (Wed < Sun). There was 1 positive difference (Wed > Sun). There were 0 ties (same values).
→ 9 out of 10 differences were negative and 1 was positive, i.e., depression scores were mostly lower on Wednesday than they were on Sunday, for Alcohol.
The difference between Sunday and Wednesday (z-score) is significant! The z-score is based on the positive ranks since their sum is the smaller one.
42Effects of Ecstasy vs. Alcohol
- For Ecstasy, depression increases from Sunday to Wednesday.
- For Alcohol, depression decreases from Sunday to Wednesday.
- This reverse effect is an interaction!
43Calculating the effect sizes
- Effect sizes for the Wilcoxon signed-rank test can be calculated from the z-scores:
- $r = \frac{z}{\sqrt{N}}$
- $r_{Ecstasy} = \frac{-2.53}{\sqrt{20}} = -.57$
- $r_{Alcohol} = \frac{-1.99}{\sqrt{20}} = -.44$
Note: although the 2 ties in the Ecstasy group were excluded from the test, all 10 subjects are included in the calculation of the effect size (N = 20 observations, since each subject was measured twice).
Medium to large effects
44Reporting the results (Field_2005_541)
- For Ecstasy users, depression levels were significantly higher on Wednesday (Mdn = 33.5) than on Sunday (Mdn = 17.50), T = 0, p ...
- For Alcohol users, the opposite was true: depression levels were significantly lower on Wednesday (Mdn = 7.5) than on Sunday (Mdn = 16), T = 8, p ...
45Terminology
46Differences between several independent groups: The Kruskal-Wallis Test
- The Kruskal-Wallis test is the non-parametric equivalent of a simple one-way independent ANOVA.
- Example:
- Background: It has been claimed that the chemical 'genistein', which occurs naturally in soya products, lowers sperm counts in males.
- Research question: Do groups of male subjects who eat different numbers of soya meals per week have different sperm counts after one year?
47The variables
- Independent variable: number of soya meals
- (1) no soya meals (control condition): 0 per year
- (2) 1 soya meal per week: 52 per year
- (3) 4 soya meals per week: 208 per year
- (4) 7 soya meals per week: 364 per year
- Each group consisted of 20 different male individuals.
- Dependent variable: sperm count
48The Theory of the Kruskal-Wallis Test
- Like the other non-parametric tests, the K-W test is based on ranked data.
- First, the scores are ranked, irrespective of group membership.
- Then, for each group, the ranks are added. The sum of ranks for each group is $R_i$.
49Ranked data for the soya experiment
50The Test Statistic H
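- $H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
- $H = \frac{12}{80(81)} \left( \frac{927^2}{20} + \frac{883^2}{20} + \frac{883^2}{20} + \frac{547^2}{20} \right) - 3(81)$
- $= \frac{12}{6480} (42{,}966.45 + 38{,}984.45 + 38{,}984.45 + 14{,}960.45) - 243$
- $= 0.0019 \, (135{,}895.8) - 243$
- $= 251.66 - 243 = 8.659$
H has a $\chi^2$ distribution. Its df = k - 1, where k = # of groups, hence df = 4 - 1 = 3.
$H > H_{critical}(3) = 7.81$, so p < .05 ($\chi^2$ distribution)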
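The calculation of H can be checked with a short sketch. The function name `kruskal_wallis_h` is hypothetical and the formula is the uncorrected one from the slide (no tie correction); the rank sums are the ones quoted there, and 7.81 is the critical value for df = 3 given above.

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    """H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), without tie correction."""
    n_total = sum(group_sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, group_sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


rank_sums = [927, 883, 883, 547]  # R_i for 0, 1, 4 and 7 soya meals per week
h = kruskal_wallis_h(rank_sums, [20, 20, 20, 20])
print(round(h, 3))  # 8.659
print(h > 7.81)     # True: H exceeds the critical value for df = 3
```

As a plausibility check, the rank sums must add up to N(N+1)/2 = 80 × 81 / 2 = 3240, which they do.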
51Data input
- As for a one-way ANOVA, we code the different groups with a dummy coding variable 'Soya' in the 1st column:
- (1) no soya
- (2) 1 soya meal
- (3) 4 soya meals
- (4) 7 soya meals
- The dependent variable 'sperm' goes in the 2nd column
52The data in SPSS (Soya.sav)
[SPSS data view: dummy coding variable (1, 2, 3, 4), dependent variable 'sperm', and ranks]
53Exploratory analyses
- Analyze → Descriptive Statistics → Explore
- tick 'Normality plots with tests' in 'Plots'
Most groups show non-normal data distributions
Analyze → General Linear Model → Univariate, tick 'Homogeneity tests' in 'Options'
Levene's test is significant → heterogeneous variances
The data violate two parametric assumptions: normal distribution of data and homogeneous variances. Therefore, a non-parametric test is advised.
54Running the Kruskal-Wallis test
- Analyze → Nonparametric Tests → K Independent Samples...
Define the range of the independent variable: 1-4 (levels of soya meals)
If you have 'Exact...', choose 'Monte Carlo'
If you have it, select 'Jonckheere-Terpstra'. This tests for a linear trend in the data
55Output Kruskal-Wallis Test
Mean Ranks for all levels of '# of soya meals'
Main test statistic: H (here called Chi-Square). H is significant. If you have requested 'Monte Carlo', the result will also be displayed here.
- → Number of soya meals has a significant effect on sperm count, overall.
- However, we do not know where exactly the difference is located.
56Boxplots for the 4 groups (Graphs → Boxplots)
- Visual inspection:
- The medians for groups 1-3 seem rather similar; however, the median for group 4 seems somewhat lower
How can we know which particular difference(s) brought about the overall difference?
57 1. Posthoc Tests for Kruskal-Wallis
- 1. Posthoc tests in nonparametric tests can be done with the Mann-Whitney test (for pairs of unrelated samples).
- If we want to do posthoc tests, we risk inflating the Type I error.
- In order to correct for family-wise error inflation, we may use the Bonferroni correction. However, then we lose power.
- 2. Posthoc tests in nonparametric tests can be done by hand
58 1. Posthoc Tests for Kruskal-Wallis
- → Compromise: do only a few promising comparisons, e.g., each level against the control condition (as in 'simple' contrasts)
- Test 1: no soya vs. 1 soya meal
- Test 2: no soya vs. 4 soya meals
- Test 3: no soya vs. 7 soya meals
- With 3 tests, we have to divide our α-level by 3:
- .05/3 = .0167
- So we are doing our posthoc tests at this more rigorous level.
59 1. Single Mann-Whitney tests for the three comparisons
Analyze → Nonparametric Tests → 2 Independent Samples; define the groups in the grouping variable window: group 1 vs 2, 1 vs 3, 1 vs 4. (Here, the contrast 1 vs 4 is requested.) Each Mann-Whitney test is carried out independently.
60 1. Output of the Single Mann-Whitney tests for the three comparisons
Group 1 vs 2 (no vs 1 soya meal): n.s.
Group 1 vs 3 (no vs 4 soya meals): n.s.
Group 1 vs 4 (no vs 7 soya meals): significant
→ Eating only 1 to 4 soya meals a week does not affect sperm count compared to eating no soya meals at all. However, eating 7 soya meals a week significantly lowers sperm count.
61 2. Posthoc Tests in nonparametric tests (for nerds)
- You can also calculate the differences for all pairs of contrasts by hand.
- You take the difference between the mean ranks of the different groups and compare it to a value based on the value of z (corrected for the number of comparisons you make) and a constant based on the total sample size and the sample sizes in the 2 groups being compared.
- $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N+1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$
k = number of groups (4); N = total sample size (80); $n_u$ = number of subjects in 1st group (20); $n_v$ = number of subjects in 2nd group (20)
- $|\bar{R}_u - \bar{R}_v|$ is the difference between the mean ranks of the 2 groups, ignoring the +/- sign: only the absolute value is considered
62Determining the critical difference for z
- $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N+1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$
- In order to know the value of $z_{\alpha/k(k-1)}$, we need to determine the α-level. Normally, it is .05. This level needs to be divided by k(k-1), where k is the number of groups, that is, 4 × 3 = 12. The corrected α-level therefore is .05/12 = .00417. Now, $z_{\alpha/k(k-1)}$ means 'the value of z for which only a proportion of .00417 of the other values of z are bigger'.
- Looking up in Appendix A.1 (normal z-distribution) the smaller portion closest to .00417 (actually, .00415), we find the value z = 2.64. This is the critical value.
63Determining the critical difference for z
- $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N+1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$
- crit. Diff = $2.64 \sqrt{\frac{80(80+1)}{12} \left( \frac{1}{20} + \frac{1}{20} \right)}$
- crit. Diff = $2.64 \sqrt{540 \, (0.1)}$
- crit. Diff = $2.64 \sqrt{54}$
- crit. Diff = 19.4
- Since sample sizes for all groups are identical, this value holds for all comparisons.
- We now can test the actual differences in mean ranks for all comparisons against this critical difference. If a value is bigger, then the comparison is significant.
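Instead of looking up z in a table, the critical difference can be computed directly. This is a sketch using the normal quantile function from Python's standard library; the function name `kw_critical_difference` is hypothetical, and the inputs (α = .05, k = 4, N = 80, 20 subjects per group) are those of the soya example.

```python
import math
from statistics import NormalDist


def kw_critical_difference(alpha, k, n_total, n_u, n_v):
    """Return (z_crit, critical difference in mean ranks) for one pairwise comparison."""
    # z for which only a proportion alpha / (k(k-1)) of values is larger
    z_crit = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
    constant = math.sqrt(n_total * (n_total + 1) / 12 * (1 / n_u + 1 / n_v))
    return z_crit, z_crit * constant


z_crit, crit_diff = kw_critical_difference(0.05, 4, 80, 20, 20)
print(round(z_crit, 2), round(crit_diff, 1))  # 2.64 19.4
```

Any absolute difference between two mean ranks that exceeds `crit_diff` would count as significant under this procedure.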
64Testing individual differences in mean rank against the critical difference (19.4)
- According to this calculation, none of the differences is significant! However, in the previous Mann-Whitney tests the 'no meals vs. 7 meals' comparison had been significant. How come?
65Significant or ns comparisons?
- In the earlier Mann-Whitney tests, we conducted only 3 comparisons, so we divided our overall α-level into 3 portions, which yields a corrected α of .05/3 = .0167. The p-value of the 'no vs 7 meals' comparison had been .009, which is smaller than .0167.
- In the new calculation, however, we have an α of .05/6 (for all 6 comparisons) = .0083. Now .009 > .0083, hence the comparison is n.s.
- → Better carry out only a few reasonable comparisons
66Testing for trends the Jonckheere-Terpstra test
- This test looks at the differences between the medians of the groups, just as the Kruskal-Wallis test does.
- Additionally, it includes information about whether the medians are ordered.
- In our example, we do indeed predict an order for the sperm counts in the 4 groups:
- no meal > 1 meal > 4 meals > 7 meals
- In the coding variable, we have already encoded the order which we expect (1 > 2 > 3 > 4)
67Output of the J-T test
If you have J-T in your version of SPSS, the output would look like this.
z-score = (912 - 1200)/116.33 = -2.476
The J-T test should always be 1-tailed (since we have a directed hypothesis!). We compare the absolute value of -2.47 against 1.65, which is the z-value for an α-level of 5% in a 1-tailed test. Since 2.47 > 1.65, the result is significant. The negative sign means that the medians are in descending order (a positive sign would have meant ascending order).
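The z-score of the J-T statistic and the one-tailed decision can be retraced in a few lines. The variable names are made up for this sketch; the three numbers (observed J-T statistic 912, its mean 1200, its standard deviation 116.33) are the ones from the output above.

```python
from statistics import NormalDist

j_observed, j_mean, j_sd = 912, 1200, 116.33  # values read off the SPSS output
z = (j_observed - j_mean) / j_sd

# One-tailed critical value for alpha = .05 (about 1.65)
z_crit_one_tailed = NormalDist().inv_cdf(0.95)

print(round(z, 3))                 # -2.476
print(abs(z) > z_crit_one_tailed)  # True: significant descending trend
```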
68Calculating effect sizes
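- Calculate effect sizes only for single focused comparisons:
- $r = \frac{z}{\sqrt{2n}}$ (n = 20 subjects per group; for the J-T test, all N = 80 subjects are used)
- $r_{\text{NoSoya vs. 1 meal}} = \frac{-0.243}{\sqrt{40}} = -.04$ (negligible effect)
- $r_{\text{NoSoya vs. 4 meals}} = \frac{-0.325}{\sqrt{40}} = -.05$ (negligible effect)
- $r_{\text{NoSoya vs. 7 meals}} = \frac{-2.597}{\sqrt{40}} = -.41$ (medium effect)
- $r_{\text{Jonckheere}} = \frac{-2.47}{\sqrt{80}} = -.28$ (medium effect)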
69Reporting the results of the Kruskal-Wallis Test (Field_2005_556)
- Sperm counts were significantly affected by eating soya meals (H(3) = 8.66, p ...). Mann-Whitney tests were used to follow up this finding. A Bonferroni correction was applied and so all effects are reported at a .0167 level of significance. It appeared that sperm counts were no different when one soya meal (U = 191, r = -.04) or four soya meals (U = 188, r = -.05) were eaten per week compared to none. However, when seven soya meals were eaten per week, sperm counts were significantly lower than when no soya was eaten (U = 104, r = -.41).
70Terminology
71Differences between several related groups: Friedman's ANOVA
- Friedman's ANOVA is the non-parametric analogue of a repeated-measures ANOVA (see chapter 11), where the same subjects have been subjected to various conditions.
- Example here: Testing the effect of a new diet called 'Andikins diet' on n = 10 women. Their weight (in kg) was measured 3 times:
- Start
- Month 1
- Month 2
- Would they lose weight in the course of the diet?
http://goc-frankfurt.de/images/dualit/personenwaage_450.jpg
72Theory of Friedman's ANOVA
- Each subject's weight on each of the 3 dates is listed in a separate column. Then the ranks for the 3 dates are determined and listed in separate columns.
- Then, the ranks are summed up for each condition ($R_i$)
For each subject, the 3 scores are compared: the smallest one gets rank 1, the next rank 2, and the biggest one rank 3.
Diet data with ranks
73The Test statistic Fr
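- From the sum of ranks for each group, the test statistic $F_r$ is derived:
- $F_r = \left[ \frac{12}{Nk(k+1)} \sum_{i=1}^{k} R_i^2 \right] - 3N(k+1)$
- $= \left[ \frac{12}{(10 \times 3)(3+1)} (19^2 + 20^2 + 21^2) \right] - (3 \times 10)(3+1)$
- $= \frac{12}{120} (361 + 400 + 441) - 120$
- $= 0.1 \, (1202) - 120$
- $= 120.2 - 120 = 0.2$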
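The same arithmetic as a short sketch. The function name `friedman_fr` is hypothetical; the rank sums (19, 20, 21) and N = 10 subjects are the values from the diet example.

```python
def friedman_fr(rank_sums, n_subjects):
    """F_r = 12 / (N k (k+1)) * sum(R_i^2) - 3 N (k+1), with k conditions."""
    k = len(rank_sums)
    sum_sq = sum(r ** 2 for r in rank_sums)
    return 12 / (n_subjects * k * (k + 1)) * sum_sq - 3 * n_subjects * (k + 1)


fr = friedman_fr([19, 20, 21], 10)  # rank sums for Start, Month 1, Month 2
print(round(fr, 2))  # 0.2
```

A useful check before computing: with N subjects and k conditions, the rank sums must add up to N × k(k+1)/2, here 10 × 6 = 60.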
74 Data input and provisional analysis (using diet.sav)
- First, test for normality:
- Analyze → Descriptive Statistics → Explore, tick 'Normality plots with tests' in the 'Plots' window
Data sheet
In the Shapiro-Wilk test (which is more accurate than the K-S test), two groups (Start, 1 month) show non-normal distributions. This violation of a parametric assumption justifies the choice of a non-parametric test.
75Running Friedman's ANOVA
- Analyze → Nonparametric Tests → K Related Samples...
If you have 'Exact', tick 'Exact' and limit the calculation time to 5 minutes.
Other options
Request everything there is - it is not much...
76Other options
- Kendall's W: Similar to Friedman's ANOVA, but looks specifically at agreement between raters. For example: to what extent (from 0 to 1) do women agree in rating Justin Timberlake, David Beckham, or Tony Blair on their attractiveness? This is like a correlation coefficient.
- Cochran's Q: This is an extension of McNemar's test. It is like a Friedman's test for dichotomous data. For example, if women had to judge whether they would like to kiss Justin Timberlake, David Beckham, or Tony Blair and they could only answer Yes or No.
77Output from Friedman's ANOVA
The $F_r$ statistic is called Chi-Square here. It has df = 2 (k - 1, where k is the # of groups). The statistic is n.s.
78Posthoc tests for Friedman's ANOVA: Wilcoxon signed-rank tests, but correcting for the number of tests we do; here α = .05/3 = .0167.
Analyze → Nonparametric Tests → 2 Related Samples, tick 'Wilcoxon', specify the 3 pairs of groups
Mean ranks and sum of ranks for all 3 comparisons
So, actually, we do not have to calculate any further...
All comparisons are ns, as expected from the overall ns effect.
79Posthoc tests for Friedman's ANOVA: calculation by hand
- We take the difference between the mean ranks of the different groups and compare it to a value based on the value of z (corrected for the # of comparisons) and a constant based on the total sample size (N = 10) and the # of conditions (k = 3):
- $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{k(k+1)}{6N}}$
- $\alpha/k(k-1) = .05/(3(3-1)) = .05/6 = .00833$, since k(k-1) = 3(3-1) = 6
- For a difference to be significant, it must exceed the critical difference, which is based on the value of z for which only a proportion of .00833 of the other values of z are bigger. As before, we look in Appendix A.1 under the column 'Smaller Portion'. The z-value corresponding to .00833 is the critical value; it lies between 2.39 and 2.40.
80Calculating the critical differences
- Critical difference = $z_{\alpha/k(k-1)} \sqrt{\frac{k(k+1)}{6N}}$
- crit. Diff = $2.4 \sqrt{\frac{3(3+1)}{6 \times 10}}$
- crit. Diff = $2.4 \sqrt{12/60}$
- crit. Diff = $2.4 \sqrt{0.2}$
- crit. Diff = 1.07
- → If a difference between mean ranks is ≥ the critical difference of 1.07, then that difference is significant.
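The table lookup and the critical difference for the Friedman follow-up can also be computed, as in the following sketch. The function name `friedman_critical_difference` is hypothetical; α = .05, k = 3 conditions and N = 10 subjects come from the diet example.

```python
import math
from statistics import NormalDist


def friedman_critical_difference(alpha, k, n_subjects):
    """Return (z_crit, critical difference in mean ranks) for k related conditions."""
    # z for which only a proportion alpha / (k(k-1)) of values is larger
    z_crit = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
    return z_crit, z_crit * math.sqrt(k * (k + 1) / (6 * n_subjects))


z_crit, crit_diff = friedman_critical_difference(0.05, 3, 10)
print(2.39 < z_crit < 2.40)  # True: matches the table lookup
print(round(crit_diff, 2))   # 1.07
```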
81Calculating the differences between mean ranks
for diet data
- → None of the differences is ≥ the critical difference of 1.07, hence none of the comparisons is significant.
82Calculating the effect size
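- Again, we will only calculate the effect sizes for single comparisons:
- $r = \frac{z}{\sqrt{2n}}$
- $r_{\text{Start vs. 1 month}} = \frac{-0.051}{\sqrt{20}} = -.01$ (tiny effect)
- $r_{\text{Start vs. 2 months}} = \frac{-0.255}{\sqrt{20}} = -.06$ (tiny effect)
- $r_{\text{1 month vs. 2 months}} = \frac{-0.153}{\sqrt{20}} = -.03$ (tiny effect)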
83Reporting the results of Friedman's ANOVA (Field_2005_566)
- The weight of participants did not significantly change over the 2 months of the diet ($\chi^2(2)$ = 0.20, p > .05). Wilcoxon tests were used to follow up on this finding. A Bonferroni correction was applied and so all effects are reported at a .0167 level of significance. It appeared that weight didn't significantly change from the start of the diet to 1 month, T = 27, r = -.01, from the start of the diet to 2 months, T = 25, r = -.06, or from 1 month to 2 months, T = 26, r = -.03. We can conclude that the Andikins diet (...) is a complete failure.
84Summary Terminology