Title: Statistics 303
1. Statistics 303
2. ANOVA: Comparing Several Means
- The statistical methodology for comparing several
means is called analysis of variance, or ANOVA.
- In this case one variable is categorical.
- This variable forms the groups to be compared.
- The response variable is numeric.
- This methodology extends the comparison of
two means to more than two groups.
3. ANOVA: Comparing Several Means
- Example
- An experimenter is interested in the effect of
sleep deprivation on manual dexterity.
Thirty-two subjects are selected and randomly
divided into four groups of size 8.
- After differing amounts of sleep deprivation, all
subjects are given a series of tasks to perform,
each of which requires a high degree of manual
dexterity. A score from 0 to 10 is obtained for
each subject. Test at the α = 0.05 level the
hypothesis that the degree of sleep deprivation
has no effect on manual dexterity.
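The random division of subjects into groups described in the example could be sketched as follows. This is only an illustration, not the experimenter's actual procedure: the subject IDs, the function name `assign_groups`, and the fixed seed are assumptions made here.

```python
import random

def assign_groups(subject_ids, n_groups, seed=None):
    """Randomly split subjects into n_groups groups of equal size."""
    subjects = list(subject_ids)
    if len(subjects) % n_groups != 0:
        raise ValueError("subjects cannot be divided evenly among the groups")
    rng = random.Random(seed)   # a seed makes the assignment reproducible
    rng.shuffle(subjects)
    size = len(subjects) // n_groups
    return [subjects[i * size:(i + 1) * size] for i in range(n_groups)]

# 32 hypothetical subject IDs, split into 4 groups of 8 as in the example
groups = assign_groups(range(1, 33), n_groups=4, seed=303)
for level, members in enumerate(groups, start=1):
    print(f"group {level}: {sorted(members)}")
```

Shuffling once and slicing gives every subject the same chance of landing in each group, which is what the random-sample assumption of ANOVA relies on.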
4. ANOVA: Comparing Several Means
Sample size N = 32
If H0 is true and there is no difference among the
group means, the variation between groups and the
variation within groups will be almost equal.
This is the idea of ANOVA.
5. ANOVA: Comparing Several Means
Variation Within Groups
Average Within Group Variation (MSE)
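A minimal sketch of how the average within-group variation could be computed, assuming the standard definitions SSE = sum over groups of (n_i - 1) * s_i^2 and MSE = SSE / (N - I). The function name and the scores are hypothetical, not the data from the example.

```python
from statistics import variance

def mean_square_error(groups):
    """Pooled within-group variance: MSE = SSE / (N - I)."""
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)
    # (n_i - 1) * s_i^2 is the sum of squared deviations from group i's own mean
    sse = sum((len(g) - 1) * variance(g) for g in groups)
    return sse / (n_total - n_groups)

# Hypothetical scores for three small groups
scores = [[1, 2, 3], [2, 4, 6], [5, 6, 7]]
mse = mean_square_error(scores)
print(f"MSE = {mse}")
```

Here the sample variances are 1, 4 and 1, so SSE = 2 + 8 + 2 = 12 and MSE = 12 / 6 = 2.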
6. ANOVA: Comparing Several Means
Variation Between Groups
Average Between Group Variation (MSG)
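A minimal sketch of the average between-group variation, assuming the standard definitions SSG = sum over groups of n_i * (group mean - grand mean)^2 and MSG = SSG / (I - 1). The function name and the scores are hypothetical.

```python
from statistics import mean

def mean_square_groups(groups):
    """Between-group mean square: MSG = SSG / (I - 1)."""
    grand_mean = mean(x for g in groups for x in g)
    # Each group contributes its size times the squared distance
    # of its mean from the grand mean
    ssg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ssg / (len(groups) - 1)

# Hypothetical scores for three small groups
scores = [[1, 2, 3], [2, 4, 6], [5, 6, 7]]
msg = mean_square_groups(scores)
print(f"MSG = {msg}")
```

The group means are 2, 4 and 6 around a grand mean of 4, so SSG = 12 + 0 + 12 = 24 and MSG = 24 / 2 = 12.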
7. ANOVA: Comparing Several Means
- The F-statistic
- F = MSG / MSE
- where MSG = mean square for groups and MSE = mean
square for error.
- This compares the variation between groups to
the variation within groups. This is what gives
it the name Analysis of Variance.
- The degrees of freedom for the F test are
- df1 = I - 1 (number of groups minus 1)
- df2 = N - I (total sample size minus number of
groups).
- Q: Under the null hypothesis, what should F be
approximately equal to?
- Under the alternative?
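One way to explore the question above is by simulation. The sketch below is an illustration added here, not part of the original analysis. It repeatedly draws I = 4 groups of size 8 from the same normal population, so H0 is true, and averages the resulting F-statistics. The population mean and standard deviation, the seed, and the number of repetitions are arbitrary assumptions.

```python
import random
from statistics import mean, variance

def f_statistic(groups):
    """One-way ANOVA F = MSG / MSE for a list of samples."""
    n_total = sum(len(g) for g in groups)
    n_groups = len(groups)
    grand_mean = mean(x for g in groups for x in g)
    msg = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (n_groups - 1)
    mse = sum((len(g) - 1) * variance(g) for g in groups) / (n_total - n_groups)
    return msg / mse

rng = random.Random(303)  # fixed seed so the run is reproducible
simulated = []
for _ in range(1000):
    # All four groups come from one population, so the null hypothesis holds
    groups = [[rng.gauss(5.0, 1.0) for _ in range(8)] for _ in range(4)]
    simulated.append(f_statistic(groups))

average_f = mean(simulated)
print(f"average F under H0: {average_f:.2f}")
```

With these settings the average comes out close to 1. Shifting the mean of one group away from the others inflates MSG but not MSE, so typical F values become larger.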
8. ANOVA Table
R² = (Sum of Squares Between Groups) / (Sum of Squares Total)
is the proportion of the total variation
explained by the difference in means.
We can get this table from SPSS.
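For readers without SPSS, the entries of a one-way ANOVA table can be sketched in a few lines. This is an illustrative implementation under the standard definitions, not SPSS's own procedure. The function name `anova_table`, the row labels, and the scores are assumptions.

```python
from statistics import mean

def anova_table(groups):
    """Return the rows of a one-way ANOVA table as a dict of dicts."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    n_total, n_groups = len(all_values), len(groups)

    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    df_between, df_within = n_groups - 1, n_total - n_groups
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "Between Groups": {"SS": ss_between, "df": df_between,
                           "MS": ms_between, "F": ms_between / ms_within},
        "Within Groups": {"SS": ss_within, "df": df_within, "MS": ms_within},
        "Total": {"SS": ss_total, "df": n_total - 1},
    }

# Hypothetical scores for three small groups
table = anova_table([[1, 2, 3], [2, 4, 6], [5, 6, 7]])
r_squared = table["Between Groups"]["SS"] / table["Total"]["SS"]
for row, cells in table.items():
    print(row, cells)
print(f"proportion of variation explained: {r_squared:.3f}")
```

For these scores the between and within sums of squares (24 and 12) add up to the total (36), and two thirds of the total variation is explained by the difference in group means.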
9. Assumptions for ANOVA
- Suppose we have I populations.
- 1. Each of the I population or group
distributions is normal.
- Check with a normal quantile (Q-Q) plot of
each group.
- 2. These distributions have identical
variances.
- Check whether the largest std. dev. is more than
2 times the smallest std. dev.; if it is, this
assumption is in doubt.
- 3. Each of the I samples is a random sample.
- 4. Each of the I samples is selected
independently of the others.
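The standard-deviation check for assumption 2 is simple enough to automate. A minimal sketch, with hypothetical function names and scores:

```python
from statistics import stdev

def sd_ratio(groups):
    """Ratio of the largest to the smallest sample standard deviation."""
    sds = [stdev(g) for g in groups]
    return max(sds) / min(sds)

def equal_variances_plausible(groups, max_ratio=2.0):
    """Rule of thumb: the assumption is in doubt when the ratio exceeds 2."""
    return sd_ratio(groups) <= max_ratio

# Hypothetical scores for three small groups
scores = [[1, 2, 3], [2, 3, 5], [5, 6, 7]]
print(f"largest sd / smallest sd = {sd_ratio(scores):.2f}")
print("equal variances plausible:", equal_variances_plausible(scores))
```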
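10. ANOVA: Comparing Several Means
- Step 1: The null hypothesis for comparing several
means is
H0: μ1 = μ2 = ... = μI
where I is the number of populations to be
compared.
- Step 2: The alternative hypothesis is
Ha: not all of the μi are equal (at least one
population mean is different from another).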
11. ANOVA: Comparing Several Means
- Step 3: State the significance level α.
- Step 4: Calculate the F-statistic.
- Step 5: Find the P-value.
- The P-value for an ANOVA F-test is always
one-sided.
- The P-value is P(F ≥ observed F-statistic),
- where F follows the F-distribution with
df1 = I - 1 and df2 = N - I.
Figure: F-distribution
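In practice the P-value is read from software output. As a sketch of what that software computes, the upper-tail area of the F-distribution can be written in terms of the regularized incomplete beta function. The version below uses only the Python standard library and a standard continued-fraction evaluation. All function names are assumptions, and this is an illustration rather than a replacement for a tested statistics package.

```python
import math

def _nonzero(value, tiny=1e-300):
    """Guard against division by zero inside the continued fraction."""
    return value if abs(value) > tiny else tiny

def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    c = 1.0
    d = 1.0 / _nonzero(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for m in range(1, max_iter + 1):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for term in (even, odd):
            d = 1.0 / _nonzero(1.0 + term * d)
            c = _nonzero(1.0 + term / c)
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def regularized_incomplete_beta(a, b, x):
    """I_x(a, b), the CDF of a Beta(a, b) distribution at x."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_(1-x)(b, a) where it converges faster
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b

def f_upper_tail(f_value, df1, df2):
    """P(F >= f_value) for an F-distribution with (df1, df2) degrees of freedom."""
    if f_value <= 0:
        return 1.0
    x = df2 / (df2 + df1 * f_value)
    return regularized_incomplete_beta(df2 / 2.0, df1 / 2.0, x)

# F-statistic and degrees of freedom reported for the sleep-deprivation example
p_value = f_upper_tail(35.730, df1=3, df2=28)
print(f"P-value = {p_value:.3g}")
```

For the example's values (F = 35.730 with 3 and 28 degrees of freedom) the result is far below 0.05.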
12. ANOVA: Comparing Several Means
- Step 6: Reject or fail to reject H0 based on the
P-value.
- If the P-value is less than or equal to α, reject
H0.
- If the P-value is greater than α, fail to reject
H0.
- Step 7: State your conclusion.
- If H0 is rejected: "There is significant
statistical evidence that at least one of the
population means is different from another."
- If H0 is not rejected: "There is not significant
statistical evidence that at least one of the
population means is different from another."
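Steps 6 and 7 amount to a single comparison, which can be sketched as a small helper. The function name and return format are hypothetical.

```python
def anova_decision(p_value, alpha=0.05):
    """Apply Steps 6 and 7: compare the P-value with alpha and word the conclusion."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return ("reject H0",
                "There is significant statistical evidence that at least one "
                "of the population means is different from another.")
    return ("fail to reject H0",
            "There is not significant statistical evidence that at least one "
            "of the population means is different from another.")

decision, conclusion = anova_decision(0.012, alpha=0.05)  # illustrative P-value
print(decision)
print(conclusion)
```

Note that a P-value exactly equal to α leads to rejection, matching the "less than or equal to" wording of Step 6.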
13. ANOVA: Comparing Several Means
- Go back to the example.
- Categorical variable: sleep deprivation (4 levels).
- Numeric variable: performance in dexterity.
- Number of groups: 4 (I = 4)
- Total sample size: 32 (N = 32)
- 8 subjects in each group (ni = 8)
- Test at the α = 0.05 level the hypothesis that
the degree of sleep deprivation has no effect on
manual dexterity.
14. Side-by-Side Boxplots
15. Check assumptions
1. Normality: normal quantile plots
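The points of a normal quantile plot can be computed without any plotting library. The sketch below pairs each sorted observation with a theoretical standard normal quantile. The plotting positions (i - 0.5) / n are one common convention and an assumption here, as are the function name and the scores.

```python
from statistics import NormalDist

def normal_quantile_points(sample):
    """Pair each sorted observation with its theoretical normal quantile."""
    n = len(sample)
    ordered = sorted(sample)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical scores for one group of 8 subjects
group = [3.1, 4.0, 4.4, 4.9, 5.2, 5.6, 6.1, 7.0]
points = normal_quantile_points(group)
for z, score in points:
    print(f"{z:6.2f}  {score}")
```

If the group is roughly normal, plotting the scores against the theoretical quantiles gives points that fall close to a straight line.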
16. Check assumptions
2. Equal variances
3. Each of the I samples is a random sample.
4. Each of the I samples is selected
independently of one another.
17. ANOVA: Comparing Several Means
- Step 1: The null hypothesis is
H0: μ1 = μ2 = μ3 = μ4
- Step 2: The alternative hypothesis is
Ha: not all four population means are equal
- Step 3: The significance level is α = 0.05
18. ANOVA: Comparing Several Means
- Step 4: Calculate the F-statistic
F = MSG / MSE = 23.976 / 0.671 = 35.730
MSG and MSE are found in the ANOVA table when the
analysis is run on the computer.
19. ANOVA: Comparing Several Means
- Step 5: Find the P-value
- The P-value is P(F ≥ 35.730),
where df1 = I - 1 (number of groups minus 1)
= 4 - 1 = 3 and df2 = N - I (total sample size
minus I) = 32 - 4 = 28.
20. ANOVA: Comparing Several Means
- Step 6: Reject or fail to reject H0 based on the
P-value.
- Because the P-value is less than α = 0.05, reject
H0.
- Step 7: State your conclusion.
- There is significant statistical evidence that
at least one of the population means is different
from another.
An additional test will tell us which means are
different from the others.
21. ANOVA
- Notice that
- (Sum of Squares Between Groups) + (Sum of
Squares Within Groups) = (Sum of Squares Total)
- 71.928 + 18.789 = 90.717
- Also notice that the Mean Square column is
calculated by dividing the Sum of Squares by the
associated degrees of freedom (df).
- Ex. 71.928 / 3 = 23.976 for Between Groups
- F = (MS Between Groups) / (MS Within Groups)
= MSG / MSE
- 23.976 / 0.671 = 35.730
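The arithmetic on this slide can be checked directly from the two sums of squares and the group counts given in the example. The variable names below are chosen for illustration.

```python
# Sums of squares reported in the ANOVA table for the example
ss_between = 71.928
ss_within = 18.789
n_groups, n_total = 4, 32

df_between = n_groups - 1        # I - 1 = 3
df_within = n_total - n_groups   # N - I = 28

ss_total = ss_between + ss_within
ms_between = ss_between / df_between   # MSG
ms_within = ss_within / df_within      # MSE
f_stat = ms_between / ms_within

print(f"SS total   = {ss_total:.3f}")
print(f"MS between = {ms_between:.3f}")
print(f"MS within  = {ms_within:.3f}")
print(f"F          = {f_stat:.3f}")
```

Dividing the rounded values 23.976 / 0.671 by hand gives about 35.732. The table's 35.730 comes from using the unrounded mean square within groups, 18.789 / 28.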