Title: Multiple Comparisons: Example
1. Multiple Comparisons Example
Study Objective: Test the response of six varieties of wheat to a particular race of stem rust.
Treatment: Wheat variety.
Levels: A (i = 1), B (i = 2), C (i = 3), D (i = 4), E (i = 5), F (i = 6).
Experimental Unit: Pot of well-mixed potting soil.
Replication: Four (4) pots per treatment, four (4) plants per pot.
Randomization: Varieties randomized to 24 pots (completely randomized design, CRD).
Response: Yield (Yij), in grams, of wheat variety i at maturity in pot j.
Implementation Notes: Six seeds of a variety are planted in a pot. Once plants emerge, the four most vigorous are retained and inoculated with stem rust.
2. Statistics and AOV Table
Rank  Variety  Mean Yield
5     A        50.3
4     B        69.0
6     C        24.0
2     D        94.0
3     E        75.0
1     F        95.3
n1 = n2 = n3 = n4 = n5 = n6 = 4
ANOVA Table
Source    df   Mean Square   F
Variety    5   2976.44       24.80
Error     18    120.00
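As a quick arithmetic check on the table, the variety mean square can be rebuilt from the six reported means, assuming the balanced design with n = 4 pots per variety. This is only a sketch; the variable names are illustrative and not from the slides.

```python
# Variety means and error mean square as reported on the slide.
means = {"A": 50.3, "B": 69.0, "C": 24.0, "D": 94.0, "E": 75.0, "F": 95.3}
n = 4          # pots per variety (balanced CRD)
mse = 120.00   # error mean square, 18 df

t = len(means)
grand_mean = sum(means.values()) / t

# Between-variety mean square for a balanced one-way layout:
# n * sum of squared deviations of the means, divided by (t - 1).
ms_variety = n * sum((m - grand_mean) ** 2 for m in means.values()) / (t - 1)
f_stat = ms_variety / mse

print(f"MS(variety) = {ms_variety:.2f}, F = {f_stat:.2f}")
```

The result should agree with the reported 2976.44 and F = 24.80 up to rounding of the means.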
3. The overall F-test indicates that we reject H0 and accept HA
Which means differ from which other means?
Consider all possible comparisons between varieties.
First, sort the treatment levels so that the level with the smallest sample mean comes first and the level with the largest sample mean comes last.
Then, in a table (matrix) format, compute the differences for all of the t(t-1)/2 possible pairs of level means.
4. Differences for all of the t(t-1)/2 = 15 possible pairs of level means
Largest difference: F - C = 95.3 - 24.0 = 71.3
Smallest difference: F - D = 95.3 - 94.0 = 1.3
Question: How big does a difference have to be before we consider it significant?
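The sort-then-difference step can be sketched in a few lines. The means are the ones reported on the Statistics slide; the dictionary layout and names are illustrative choices, not part of the original material.

```python
from itertools import combinations

means = {"A": 50.3, "B": 69.0, "C": 24.0, "D": 94.0, "E": 75.0, "F": 95.3}

# Rank the varieties from smallest to largest sample mean.
ranked = sorted(means, key=means.get)

# One entry per unordered pair: (lower-ranked, higher-ranked) -> difference.
diffs = {
    (lo, hi): round(means[hi] - means[lo], 1)
    for lo, hi in combinations(ranked, 2)
}

largest = max(diffs, key=diffs.get)
smallest = min(diffs, key=diffs.get)
print(len(diffs), "pairs")
print("largest:", largest, diffs[largest])
print("smallest:", smallest, diffs[smallest])
```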
5. Fisher's Protected LSD
F = 24.8 > F(5, 18, 0.05) = 2.77 => F is significant
A difference larger than the LSD implies that the two treatment level means are statistically different at the α = 0.05 level.
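Grouping of ranked means (varieties sharing a letter are not significantly different):
C  24.0  a
A  50.3  b
B  69.0  c
E  75.0  c
D  94.0  d
F  95.3  d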
Alternate ways to indicate grouping of means.
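A minimal sketch of the LSD calculation for this balanced design, assuming the usual form LSD = t(α/2, error df) × sqrt(2 MSE / n). The t value 2.101 is the standard two-sided 5% table value for 18 df and is supplied by hand here; it does not appear in the slide text.

```python
from itertools import combinations
from math import sqrt

means = {"A": 50.3, "B": 69.0, "C": 24.0, "D": 94.0, "E": 75.0, "F": 95.3}
n, mse = 4, 120.0
t_crit = 2.101   # assumed: two-sided 5% t table value for 18 error df

# Least significant difference for any pair of means (equal replication).
lsd = t_crit * sqrt(2 * mse / n)

ranked = sorted(means, key=means.get)
not_different = [
    (a, b) for a, b in combinations(ranked, 2)
    if abs(means[a] - means[b]) <= lsd
]
print(f"LSD = {lsd:.2f}")
print("pairs not declared different:", not_different)
```

Only the pairs B, E and D, F fall below the LSD, which is consistent with four letter groups.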
6. Tukey's W (Honestly Significant Difference)
Not protected, hence no preliminary F test required.
Critical value from Table 10 (studentized range).
A difference larger than W implies that the two treatment level means are statistically different at the α = 0.05 level.
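Grouping of ranked means (varieties sharing a letter are not significantly different):
C  24.0  a
A  50.3  b
B  69.0  bc
E  75.0  cd
D  94.0  d
F  95.3  d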
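Tukey's critical difference can be sketched the same way, assuming the usual form W = q(α; t, error df) × sqrt(MSE / n). The value q = 4.49 is the standard 5% studentized range table entry for 6 means and 18 df; it is an assumption supplied by hand and should be checked against Table 10.

```python
from itertools import combinations
from math import sqrt

means = {"A": 50.3, "B": 69.0, "C": 24.0, "D": 94.0, "E": 75.0, "F": 95.3}
n, mse = 4, 120.0
q_crit = 4.49   # assumed: studentized range, alpha = 0.05, t = 6, 18 error df

# A single critical difference is used for every pair.
w = q_crit * sqrt(mse / n)

ranked = sorted(means, key=means.get)
not_different = [
    (a, b) for a, b in combinations(ranked, 2)
    if abs(means[a] - means[b]) <= w
]
print(f"W = {w:.2f}")
print("pairs not declared different:", not_different)
```

Note that A versus E (difference 24.7) only just exceeds W, so that call is sensitive to rounding in the table value.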
7. Student-Newman-Keuls Procedure (SNK)
Not protected, hence no preliminary F test required.
Table 10: row = error df = 18, α = 0.05, column = r
Neighbors (r = 2)
One between (r = 3)
Two between (r = 4)
8. SNK
A difference larger than the critical value for its r implies that the two treatment level means are statistically different at the α = 0.05 level.
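Grouping of ranked means (varieties sharing a letter are not significantly different):
C  24.0  a
A  50.3  b
B  69.0  c
E  75.0  c
D  94.0  d
F  95.3  d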
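The step-down logic of SNK can be sketched as follows: the critical difference depends on r, the number of ranked means spanned by the pair, and any range lying inside a range already found non-significant is not tested. The q values are the standard 5% studentized range table entries for 18 df, entered by hand as assumptions; the names are illustrative.

```python
from math import sqrt

means = {"A": 50.3, "B": 69.0, "C": 24.0, "D": 94.0, "E": 75.0, "F": 95.3}
n, mse = 4, 120.0
# Assumed: 5% studentized range table values for 18 error df, keyed by r.
q_table = {2: 2.97, 3: 3.61, 4: 4.00, 5: 4.28, 6: 4.49}

ranked = sorted(means, key=means.get)
t = len(ranked)
se = sqrt(mse / n)

significant = set()
homogeneous = []  # (i, j) index ranges declared not different
for r in range(t, 1, -1):            # widest span first
    for i in range(t - r + 1):
        j = i + r - 1
        # A range nested inside a non-significant range is not tested.
        if any(a <= i and j <= b for a, b in homogeneous):
            continue
        if means[ranked[j]] - means[ranked[i]] > q_table[r] * se:
            significant.add((ranked[i], ranked[j]))
        else:
            homogeneous.append((i, j))

not_different = [(ranked[a], ranked[b]) for a, b in homogeneous]
print("significant pairs:", len(significant))
print("not declared different:", not_different)
```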
9. Duncan's New Multiple Range Test (Passé)
Not protected, hence no preliminary F test required.
Table 11 (next pages): row = error df = 18, α = 0.05, column = r
Neighbors (r = 2)
One between (r = 3)
Two between (r = 4)
10. Duncan's Test: Critical Values
12. Duncan's MRT
A difference larger than the critical value for its r implies that the two treatment level means are statistically different at the α = 0.05 level.
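Grouping of ranked means (varieties sharing a letter are not significantly different):
C  24.0  a
A  50.3  b
B  69.0  c
E  75.0  c
D  94.0  d
F  95.3  d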
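A sketch of Duncan's critical ranges for this example. The multipliers below are recalled from a standard 5% Duncan table for 18 error df and are an assumption; they should be checked against Table 11 before being relied on. The function name `duncan_ranges_not_different` is hypothetical.

```python
from math import sqrt

means = {"A": 50.3, "B": 69.0, "C": 24.0, "D": 94.0, "E": 75.0, "F": 95.3}
n, mse = 4, 120.0
# Assumed: 5% Duncan table values for 18 error df, keyed by r (verify in Table 11).
duncan_q = {2: 2.97, 3: 3.12, 4: 3.21, 5: 3.27, 6: 3.32}

ranked = sorted(means, key=means.get)
se = sqrt(mse / n)
critical_range = {r: q * se for r, q in duncan_q.items()}

def duncan_ranges_not_different(ranked, means, critical_range):
    """Return index ranges (i, j) of ranked means declared not different."""
    t, kept = len(ranked), []
    for r in range(t, 1, -1):        # widest span first
        for i in range(t - r + 1):
            j = i + r - 1
            if any(a <= i and j <= b for a, b in kept):
                continue             # already inside a non-significant range
            if means[ranked[j]] - means[ranked[i]] <= critical_range[r]:
                kept.append((i, j))
    return kept

groups = [
    (ranked[i], ranked[j])
    for i, j in duncan_ranges_not_different(ranked, means, critical_range)
]
for r in sorted(critical_range):
    print(f"r = {r}: critical range = {critical_range[r]:.2f}")
print("not declared different:", groups)
```

Because Duncan's multipliers grow more slowly with r than the studentized range values, its critical ranges for r > 2 are smaller than the SNK ones, which is why it is the more liberal of the two.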
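13. Scheffé's S Method
F = 24.8 > F(5, 18, 0.05) = 2.77 => F is significant
For comparing a pair of treatment means (the contrast l = μi - μj):
Reject H0: l = 0 at α = 0.05 if the estimated contrast exceeds S in absolute value.
Since each treatment is replicated the same number of times, S will be the same for comparing any pair of treatment means.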
14. Scheffé's S Method
Any difference larger than S = 28.82 is significant.
This implies that the two treatment level means are statistically different at the α = 0.05 level.
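Grouping of ranked means (varieties sharing a letter are not significantly different):
C  24.0  a
A  50.3  ab
B  69.0  bc
E  75.0  bc
D  94.0  c
F  95.3  c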
Very conservative => experimentwise-error driven.
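The value of S can be reproduced approximately, assuming the usual Scheffé form for a pairwise contrast: S = sqrt((t - 1) × F(α; t - 1, error df)) × sqrt(MSE × (1/ni + 1/nj)). The F value 2.77 is the one quoted on the slide; everything else comes from the ANOVA table. Names are illustrative.

```python
from itertools import combinations
from math import sqrt

means = {"A": 50.3, "B": 69.0, "C": 24.0, "D": 94.0, "E": 75.0, "F": 95.3}
n, mse = 4, 120.0
t = len(means)
f_crit = 2.77   # F(0.05; 5, 18) as quoted on the slide

# Scheffe critical difference for a pairwise contrast mu_i - mu_j.
s = sqrt((t - 1) * f_crit) * sqrt(mse * (1 / n + 1 / n))

ranked = sorted(means, key=means.get)
not_different = [
    (a, b) for a, b in combinations(ranked, 2)
    if abs(means[a] - means[b]) <= s
]
print(f"S = {s:.2f}")
print("pairs not declared different:", len(not_different))
```

The computed S comes out close to the slide's 28.82; small differences are due to rounding of intermediate values. Nine of the fifteen pairs fall below S, which illustrates how conservative the method is for pairwise comparisons.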
15. Grouping of Ranked Means
LSD
SNK
Duncan's
Tukey's HSD
Scheffé's S
Which grouping will you use?
1) What is your risk level?
2) Comparisonwise versus experimentwise error concerns.
16. So, which MC method should you use?
- There is a famous story of a statistician and his two clients.
  - Client 1 arrives daily with his hypothesis test and asks for assistance. The statistician helps him using α = 0.05. After 1 year they have done 365 tests. If all nulls tested were indeed true, they would have made approximately (365)(0.05) ≈ 18 erroneous rejections, but they are satisfied with the progress of the research.
  - Client 2 saves all his statistical analysis for the end of the year, and approaches the statistician for help. The statistician responds: "My! You have a terrible multiple comparisons problem!"
- In cases where the researcher is just searching the data (does not have an interest in every comparison made), some form of error rate control beyond the simple Fisher's LSD may be appropriate. On the other hand, if you definitely have an interest in every comparison, it may be better to use LSD (and accept the comparison-wise error rate).
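The arithmetic behind the story can be sketched as below. The expected count of false rejections is simply m × α. The familywise figure additionally assumes the m tests are independent, which is an assumption made here for illustration only; pairwise comparisons among the same treatment means are not independent. Function names are hypothetical.

```python
alpha = 0.05

def expected_false_rejections(m, alpha):
    """Expected number of erroneous rejections when all m nulls are true."""
    return m * alpha

def familywise_error_rate(m, alpha):
    """P(at least one false rejection), assuming m independent tests."""
    return 1 - (1 - alpha) ** m

year = expected_false_rejections(365, alpha)
fwer_15 = familywise_error_rate(15, alpha)   # 15 pairwise comparisons of 6 means

print(f"expected false rejections in 365 tests: {year:.2f}")
print(f"chance of at least one false rejection in 15 tests: {fwer_15:.3f}")
```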
17. Which method to use? Some practical advice
- If comparisons were decided upon before examining the data (best):
  - Just one comparison: use the standard (two-sample) t-test. (In this case use the pooled estimate of the common variance, MSE, and its corresponding error df. This is just Fisher's LSD.)
  - Few comparisons: use the Bonferroni adjustment to the t-test. With m comparisons, use α/m in place of α for the critical value.
  - Many comparisons: Bonferroni becomes increasingly conservative as m increases. At some point it is better to use Tukey (for pairwise comparisons) or Scheffé (for contrasts).
- If comparisons were decided upon after examining the data:
  - Just want pairwise comparisons: use Tukey.
  - All contrasts (linear combinations of treatment means): use Scheffé.
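The Bonferroni rule for a few planned comparisons can be sketched as a per-comparison threshold of α/m. The p-values below are hypothetical and chosen only to illustrate the rule; they are not results from the wheat experiment, and the function name is illustrative.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Flag each comparison whose p-value is at or below alpha / m."""
    m = len(p_values)
    threshold = alpha / m
    return {name: p <= threshold for name, p in p_values.items()}

# Hypothetical p-values for three planned comparisons (not from the slides).
planned = {"A vs B": 0.030, "C vs D": 0.0004, "E vs F": 0.200}
decisions = bonferroni_reject(planned)

for name, reject in decisions.items():
    print(name, "reject H0" if reject else "do not reject H0")
```

With m = 3 the threshold is 0.05/3, so a comparison with p = 0.030 that would pass an unadjusted test is no longer declared significant.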