Title: ANOVA
1 ANOVA
- Determining Which Means Differ in Single Factor Models
2 Single Factor Models: Review of Assumptions
- Recall that the problem solved by ANOVA is to determine whether at least one of the true mean values of several different treatments differs from the others.
- For ANOVA we assumed:
- The distribution of the population for each treatment is normal.
- The standard deviations of the populations, although unknown, are equal.
- Sampling is random and independent.
3 Determining Which Means Differ: Basic Concept
- Suppose the result of performing a single factor ANOVA test is a low p-value, which indicates that at least one population mean does, in fact, differ from the others.
- The natural question is: which means differ?
- The answer is that we conclude that two population means differ if their two sample means differ by "a lot".
- The statistical question is: what is "a lot"?
4 Example
- The length of battery life for notebook computers is of concern to computer manufacturers.
- Toshiba is considering 5 different battery models (A, B, C, D, E) that have different costs.
- The question is: is there enough evidence to show that average battery life differs among battery types?
5 Data
              A       B       C       D       E
            130      90     100     140     160
            115      80      95     150     150
            130      95     110     150     155
            125      98     100     125     145
            120      92     105     145     165
            110      85      90     130     125
Mean x̄   121.67      90     100     140     150
Grand mean = 120.33
7 OUTPUT
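The MSE and DFE values quoted later from the Excel output can be recomputed directly from the battery data. The following is a minimal sketch in plain Python, not the Excel procedure the slides used; the function name `one_way_anova` and the dictionary layout are illustrative assumptions.

```python
# Battery-life observations for the five battery types, as listed on the Data slide.
data = {
    "A": [130, 115, 130, 125, 120, 110],
    "B": [90, 80, 95, 98, 92, 85],
    "C": [100, 95, 110, 100, 105, 90],
    "D": [140, 150, 150, 125, 145, 130],
    "E": [160, 150, 155, 145, 165, 125],
}


def one_way_anova(groups):
    """Return the basic single factor ANOVA quantities (hypothetical helper)."""
    k = len(groups)
    n_total = sum(len(values) for values in groups.values())
    means = {name: sum(values) / len(values) for name, values in groups.items()}
    grand_mean = sum(sum(values) for values in groups.values()) / n_total

    # Between-treatment and within-treatment (error) sums of squares.
    sstr = sum(len(v) * (means[name] - grand_mean) ** 2 for name, v in groups.items())
    sse = sum((x - means[name]) ** 2 for name, v in groups.items() for x in v)

    dftr, dfe = k - 1, n_total - k
    mstr, mse = sstr / dftr, sse / dfe
    return {
        "means": means,
        "grand_mean": grand_mean,
        "dfe": dfe,
        "mse": mse,
        "f": mstr / mse,
    }


result = one_way_anova(data)
print("sample means:", {name: round(m, 2) for name, m in result["means"].items()})
print("grand mean:", round(result["grand_mean"], 2))
print("DFE:", result["dfe"], "MSE:", round(result["mse"], 5), "F:", round(result["f"], 2))
```

The p-value is left out because the standard library has no F distribution; a statistics package would be needed for that step.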
8 Motivation for the Fisher Procedure
- Fisher's procedure is a natural extension of the comparison of two population means when the unknown variances are assumed to be equal.
- Recall that this is an assumption in single factor ANOVA.
- Testing for the difference of two population means (with equal but unknown σ's) has the form:
- H0: µ1 - µ2 = 0
- HA: µ1 - µ2 ≠ 0
9 Best Estimate for σ² and the Appropriate Degrees of Freedom
- Recall that when there were only 2 populations, the best estimate for σ² is sp² and the degrees of freedom is (n1 - 1) + (n2 - 1), or n1 + n2 - 2.
- For ANOVA, using all the information from the k populations, the best estimate for σ² is MSE and the degrees of freedom is DFE.
                        Two populations with equal variances    ANOVA
Best estimate for σ²    sp²                                     MSE
Degrees of freedom      n1 + n2 - 2                             DFE
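To see why MSE takes over the role of sp², it may help to write both as weighted averages of sample variances. The symbols s_j² (sample variance of treatment j), n_j (its sample size) and n_T (total number of observations) are notation introduced here for illustration and do not appear in the slides.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
\mathrm{MSE} = \frac{\mathrm{SSE}}{\mathrm{DFE}}
             = \frac{\sum_{j=1}^{k} (n_j - 1)\,s_j^2}{n_T - k}
```

With k = 2 the right-hand expression reduces to the left-hand one, so MSE is the pooled variance extended to k treatments.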
10 Two Types of Tests
- There are two types of tests that can be applied:
- A test or a confidence interval for the difference in two particular means, e.g. µE and µB.
- A set of tests which determine differences among all means. This is called a set of experimentwise (EW) tests.
- The approach is the same.
- We will illustrate an approach called the Fisher LSD approach.
- Only the value used for α will be different.
11 Determining if µi Differs From µj: Fisher's LSD Approach
LSD stands for Least Significant Difference
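12 When Do We Conclude Two Treatment Means (µi and µj) Differ?
- We conclude that two means differ if their sample means, x̄i and x̄j, differ by "a lot".
- "A lot" is the LSD, given by
$$\mathrm{LSD} = t_{\alpha/2,\,\mathrm{DFE}} \sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$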
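13 Confidence Intervals for the Difference in Two Population Means
- A confidence interval for µi - µj is found by
$$(\bar{x}_i - \bar{x}_j) \pm t_{\alpha/2,\,\mathrm{DFE}} \sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$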
14 Equal vs. Unequal Sample Sizes
- If the sample sizes drawn from the various populations differ, then the denominator of the t-statistic will be different for each pairwise comparison.
- But if the sample sizes are equal (n1 = n2 = n3 = ...), we can designate the equal sample size by N.
- Then the t-test becomes
$$t = \frac{\bar{x}_i - \bar{x}_j}{\sqrt{2\,\mathrm{MSE}/N}}$$
15 LSD for Equal Sample Sizes
$$\mathrm{LSD} = t_{\alpha/2,\,\mathrm{DFE}} \sqrt{\frac{2\,\mathrm{MSE}}{N}}$$
16 What Do We Use for α?
- Recall that α is:
- In hypothesis tests: the probability of concluding that there is a difference when there is not.
- In confidence intervals: the probability that the interval will not contain the true difference in mean values.
- If doing a single comparison test or constructing a confidence interval, select α as usual.
- For an experimentwise comparison of all means, use αEW.
- We will actually be conducting 10 t-tests:
- (1) µE - µD, (2) µE - µC, (3) µE - µB, (4) µE - µA, (5) µD - µC,
- (6) µD - µB, (7) µD - µA, (8) µC - µB, (9) µC - µA, (10) µB - µA
17 αEW: The Probability of Making at Least One Type I Error
- Suppose each test has a probability α of concluding that there is a difference when there is not (making a Type I error).
- Thus for each test, the probability of not making a Type I error is 1 - α.
- So, treating the tests as independent, the probability of not making any Type I errors on the 10 tests is (1 - α)^10.
- For α = .05, this is (.95)^10 = .5987.
- The probability of making at least one Type I error in this experiment is denoted by αEW.
- Here, αEW = 1 - .5987 = .4013. That is, the probability we make at least one mistake is now over 40%!
- To have a lower αEW, α for each test must be significantly reduced.
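The arithmetic on this slide can be checked with a short sketch. The function names are illustrative, and the calculation assumes, as the slide does, that the individual tests behave independently.

```python
def num_pairwise_tests(k):
    """Number of pairwise comparisons among k treatment means."""
    return k * (k - 1) // 2


def experimentwise_alpha(alpha, num_tests):
    """Probability of at least one Type I error across num_tests tests,
    assuming each test has level alpha and the tests are independent."""
    return 1 - (1 - alpha) ** num_tests


c = num_pairwise_tests(5)                    # 5 battery types
no_error = (1 - 0.05) ** c                   # probability of no Type I error
alpha_ew = experimentwise_alpha(0.05, c)

print(c, round(no_error, 4), round(alpha_ew, 4))  # 10 0.5987 0.4013
```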
18 The Bonferroni Adjustment for α
- To make αEW reasonable, say .05, α for each test must be reduced.
- The Bonferroni adjustment is as follows: use α = αEW/c for each test, where c = k(k - 1)/2 is the number of pairwise tests among k treatments.
- NOTE: decreasing α increases β, the probability of not concluding that there is a difference between two means when there really is. Thus, some researchers are reluctant to make α too small because this can result in very high β values.
19 What Should α for Each Test Be?
The required α values for the individual t-tests for αEW = .05 and αEW = .10 are:
For αEW = .05
Number of Treatments, k    α for each test
 3                         0.01667
 4                         0.00833
 5                         0.00500
 6                         0.00333
 7                         0.00238
 8                         0.00179
 9                         0.00139
10                         0.00111
For αEW = .10
Number of Treatments, k    α for each test
 3                         0.03333
 4                         0.01667
 5                         0.01000
 6                         0.00667
 7                         0.00476
 8                         0.00357
 9                         0.00278
10                         0.00222
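The two tables follow from dividing αEW by the number of pairwise tests, c = k(k - 1)/2. A minimal sketch that regenerates them is shown here; `bonferroni_alpha` is a hypothetical helper name.

```python
def bonferroni_alpha(alpha_ew, k):
    """Per-test alpha so that all k(k-1)/2 pairwise tests share alpha_ew."""
    num_tests = k * (k - 1) // 2
    return alpha_ew / num_tests


for alpha_ew in (0.05, 0.10):
    print(f"For alpha_EW = {alpha_ew:.2f}")
    for k in range(3, 11):
        print(f"  k = {k:2d}   alpha for each test = {bonferroni_alpha(alpha_ew, k):.5f}")
```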
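20 LSDEW for Multiple Comparison Tests
- When doing the series of multiple comparison tests to determine which means differ, the test would be to conclude that µi differs from µj if |x̄i - x̄j| > LSDEW.
- Where LSDEW is given by
$$\mathrm{LSD}_{EW} = t_{\alpha/2,\,\mathrm{DFE}} \sqrt{\mathrm{MSE}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}, \qquad \alpha = \frac{\alpha_{EW}}{c}$$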
21 Procedure for Testing Differences Among All Means
- We begin by calculating LSDEW, which we have shown will not change from test to test if the sample sizes are the same for each sample. That is the situation in the battery example that we illustrate here.
- A different LSD would have to be calculated for each comparison if the sample sizes are different.
22 Procedure (continued)
- Then construct a matrix as follows.
23 Procedure (continued)
- Fill in the mean of each treatment across the top row and down the left-most column (in our example, x̄A = 121.67, x̄B = 90, x̄C = 100, x̄D = 140, x̄E = 150).
24 Procedure (continued)
- For each cell below the main diagonal, compute the absolute value of the difference of the means in the corresponding column and row.
25 Procedure (continued)
- Compare each difference with LSDEW (17.235 in our case). If the difference between x̄i and x̄j is greater than LSDEW, we can conclude that there is a difference in µi and µj.
26 Tests for the Battery Example
- For the battery example:
- Which average battery lives can we conclude differ? (Use LSDEW: multiple comparisons.)
- Give a 95% confidence interval for the difference in average battery lives between the following pairs. (Use LSD: individual comparisons.)
- C batteries and B batteries
- E batteries and B batteries
27 Battery Example Calculations
- Experimentwise error rate of αEW = .05
- For k = 5 populations, α = αEW/10 = .05/10 = .005
- From the Excel output:
- x̄E = 150, x̄D = 140, x̄A = 121.67, x̄C = 100, x̄B = 90
- MSE = 94.05333, DFE = 25, N = 6 from each population
- Use TINV(.005,25) to generate t.0025,25 = 3.078203
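Putting these numbers into the equal-sample-size LSD formula gives the experimentwise threshold. The sketch below takes the critical value 3.078203 from the slide rather than recomputing it, since the Python standard library has no t distribution; `lsd` is an illustrative helper name.

```python
import math


def lsd(t_crit, mse, n_i, n_j):
    """Least significant difference for comparing treatments i and j."""
    return t_crit * math.sqrt(mse * (1 / n_i + 1 / n_j))


T_CRIT = 3.078203   # t_{.0025,25}, as reported on the slide via TINV(.005,25)
MSE = 94.05333      # from the Excel output quoted on the slide
N = 6               # observations per battery type

lsd_ew = lsd(T_CRIT, MSE, N, N)
print(round(lsd_ew, 4))  # 17.2355
```

Because every battery type has N = 6 observations, this single value serves for all 10 comparisons.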
28 Analysis of Which Means Differ
- We conclude that two population means differ if their sample means differ by more than LSDEW = 17.2355.
- Construct a matrix of differences.
- Compare with LSDEW.
29 Conclusion of Comparisons
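The comparison step can be sketched as follows, using the sample means and the LSDEW value stated in the slides. The function `compare_all` and its return layout are assumptions made for illustration.

```python
from itertools import combinations

# Sample means and threshold as given in the slides.
means = {"A": 121.67, "B": 90, "C": 100, "D": 140, "E": 150}
LSD_EW = 17.2355


def compare_all(sample_means, threshold):
    """Map each pair of treatments to (absolute difference, exceeds threshold?)."""
    results = {}
    for i, j in combinations(sorted(sample_means), 2):
        diff = abs(sample_means[i] - sample_means[j])
        results[(i, j)] = (diff, diff > threshold)
    return results


results = compare_all(means, LSD_EW)
for (i, j), (diff, differs) in results.items():
    verdict = "means differ" if differs else "cannot conclude a difference"
    print(f"|mean_{i} - mean_{j}| = {diff:6.2f}  ->  {verdict}")
```

With these inputs, only the pairs (B, C) and (D, E), each with a difference of 10, fall below the threshold; the other eight differences exceed 17.2355.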
30 LSD for Confidence Intervals
- Confidence intervals for the difference between two mean values, µi and µj, are of the form:
- (Point Estimate) ± tα/2,DFE (Standard Error)
31 LSD for Battery Example
32 The Confidence Intervals
- 95% confidence interval for the difference in mean battery lives between batteries of type C and batteries of type B.
- 95% confidence interval for the difference in mean battery lives between batteries of type E and batteries of type B.
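A sketch of how these two intervals could be computed from the point-estimate ± LSD form is given here. The critical value t.025,25 ≈ 2.0595 is a standard t-table value assumed for this example; it does not appear in the slides. The helper name `lsd_interval` is also illustrative.

```python
import math

MSE = 94.05333      # from the Excel output quoted in the slides
N = 6               # observations per battery type
T_025_25 = 2.0595   # assumed t-table value for alpha = .05 with 25 degrees of freedom
means = {"B": 90, "C": 100, "E": 150}


def lsd_interval(mean_i, mean_j, t_crit, mse, n_i, n_j):
    """Confidence interval for mu_i - mu_j: point estimate plus or minus the LSD."""
    half_width = t_crit * math.sqrt(mse * (1 / n_i + 1 / n_j))
    diff = mean_i - mean_j
    return diff - half_width, diff + half_width


ci_cb = lsd_interval(means["C"], means["B"], T_025_25, MSE, N, N)
ci_eb = lsd_interval(means["E"], means["B"], T_025_25, MSE, N, N)
print("C - B:", tuple(round(x, 2) for x in ci_cb))
print("E - B:", tuple(round(x, 2) for x in ci_eb))
```

Under these assumptions the C - B interval includes 0 while the E - B interval lies entirely above 0, so only the second comparison indicates a difference at the individual 5% level.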
33 REVIEW
- The Fisher LSD Test
- What to use for:
- Best estimate of σ²: MSE
- Degrees of freedom: DFE
- Calculation of LSD
- Bonferroni Modification
- Modify α so that αEW is reasonable
- α = αEW/c, where the number of tests is c = k(k - 1)/2
- Calculation of LSDEW
- Excel Calculations