Title: Statistical decision-making with two or more means
1 Statistical decision-making with two or more means
2 Hypothesis testing with two or more means
- Why a new statistical test?
- Analysis of variance theory
- F ratio
- Breakdown of sums of squares and degrees of freedom
- Hypothesis testing with ANOVA
- Independent samples ANOVA table
- Dependent samples ANOVA table
- Which means are different from which? Tukey's HSD.
3 Testing two or more means: why a new test?
- Why test more than two means?
- Efficiently test multiple levels of one IV (dose effects, etc.)
- Examine the effects of more than one IV
- Why not use lots of t-tests?
- With every single t-test, p(Type I error) < .05
- With two t-tests, you risk a Type I error on one or the other.
- Multiple t-tests cause the overall p(Type I error) to go over .05!
- Need a test that can compare all means in a study while holding p(Type I error) to < .05: ANOVA and the F test!
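To see how quickly the overall Type I error rate grows, here is a rough sketch. It assumes the t-tests are independent, which pairwise tests on shared groups are not, so the numbers are only an approximation. The function names are illustrative and not part of the lecture.

```python
def n_pairwise_tests(k: int) -> int:
    """Number of distinct pairs among k group means."""
    return k * (k - 1) // 2


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """P(at least one Type I error) over n_tests tests, each run at alpha.

    Assumption: the tests are independent. Pairwise t-tests that share
    groups are not, so treat the result as a rough guide only.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


# Overall error rate when every pair of k means gets its own t-test.
for k in (2, 3, 4, 5):
    m = n_pairwise_tests(k)
    rate = familywise_error_rate(0.05, m)
    print(f"{k} means -> {m} t-tests -> overall p(Type I error) ~ {rate:.3f}")
```

With three means (three t-tests) the approximate overall rate is already about .14, well over .05.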
4 A preliminary look at ANOVA and the F ratio
- ANOVA: Analysis of variance.
- Used for testing the differences between 2 or more (up to k) means.
- H0: μ1 = μ2 = μ3 = ... = μk
- HA: not all of μ1, μ2, μ3, ..., μk are equal
- Statistic is the F ratio, made up of two separate estimates of σ²
- Variance between the sample means: numerator (between groups variance).
- Mean of the sample variances: denominator (within groups variance).
- When H0 is true, F ratio is close to 1 (because BGvar ≈ WGvar)
- When HA is true, F ratio gets bigger than 1 (because BGvar > WGvar)
6 ANOVA for independent samples: example and theory
- Consider the following example: You have a medication that you think adds points to people's IQ scores. You ask 9 people to participate in an experiment where 3 will get placebo, 3 will get one dose of the medication, and 3 will get a double dose of the medication. You reason that, if the medication has no effect on IQ (H0 is true), the means of the three groups' IQ scores should be equal (μ1 = μ2 = μ3), but if the medication does have an effect on IQ scores (HA is true), the means of the three groups' IQ scores should not all be equal.
- Let's look at some possible data that you might expect under the two hypotheses, H0 and HA.
- Please note: This example uses a small sample size for pedagogical purposes. The sample size described above is far too low to be realistic (i.e., it is an underpowered experiment as described)!
7 Sample data if H0 is true.
8 Sample data if H0 is true.
Variance of means = ?
Mean of variances = ?
F = (Variance of means × n) / (Mean of variances) = Between group variance / Within group variance
9 Sample data if H0 is true.
Variance of means = 0
Mean of variances = (1 + 1 + 1)/3 = 1
F = (Variance of means × n) / (Mean of variances) = Between group variance / Within group variance = 0/1 = 0
10 Small values of F are a dime a dozen, big values
are rare!
Rejection Region (actual critical value based on
degrees of freedom)
11 Sample data if HA is true.
12 Sample data if HA is true.
Variance of means = ?
Mean of variances = ?
F = (Variance of means × n) / (Mean of variances) = Between group variance / Within group variance
13 Sample data if HA is true.
Variance of means = 25
Mean of variances = (1 + 1 + 1)/3 = 1
F = (Variance of means × n) / (Mean of variances) = Between group variance / Within group variance = 75/1 = 75
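The same arithmetic can be checked with a short sketch. It uses the nine HA scores that appear in the SST calculation later in the deck (104 to 116). Which three scores belong to which dose group is an assumption here, inferred from the group means of 105, 110, and 115.

```python
from statistics import mean, variance

# Assumed grouping of the nine HA scores (labels are illustrative).
groups = {
    "placebo": [104, 105, 106],
    "one dose": [109, 110, 111],
    "double dose": [114, 115, 116],
}
n = 3  # subjects per group

group_means = [mean(scores) for scores in groups.values()]
group_variances = [variance(scores) for scores in groups.values()]

# Numerator: variance of the sample means, scaled by n.
between_group_variance = variance(group_means) * n
# Denominator: mean of the sample variances.
within_group_variance = mean(group_variances)

f_ratio = between_group_variance / within_group_variance
print(between_group_variance, within_group_variance, f_ratio)
```

`statistics.variance` is the sample variance (divides by N − 1), which matches the SS/df definition used in this lecture.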
14 Small values of F are a dime a dozen, big values
are rare!
Rejection Region (actual critical value based on
degrees of freedom)
15 So . . .
- Statistic is the F ratio, made up of two separate estimates of σ²
- Variance between the sample means: numerator (between groups variance).
- Mean of the sample variances: denominator (within groups variance).
- When H0 is true, F ratio tends to be small, near 1 or below (because BGvar ≈ WGvar)
- When HA is true, F ratio gets bigger than 1 (because BGvar > WGvar)
16 Hypothesis testing with two or more means
- Why a new statistical test? To maintain p(Type I error) < .05 when testing more than two means.
- Analysis of variance theory
- F ratio = BGvar/WGvar; when HA is true, numerator goes up, denominator stays the same.
- Breakdown of sums of squares and degrees of freedom
- Hypothesis testing with ANOVA
- Independent samples ANOVA table
- Dependent samples ANOVA table
- Which means are different from which? Tukey's HSD.
17 Breakdown of SS and df for independent ANOVA
- Quick review: variance of any group of scores is SS/df, or Σ(X − X̄)²/(N − 1)
- In ANOVA, there are multiple variance estimates, so there are multiple SS and df
- There is the total SS (SST), when all scores are treated as one big group of numbers, and a corresponding total df (N − 1)
- There is the between groups SS (SSB), concerned only with the means of each group; it also has a df (k − 1)
- There is the within groups SS (SSW), concerned with data within each group; it also has a df (N − k).
18 Sample data if HA is true.
SST = (104 − 110)² + (105 − 110)² + (106 − 110)²
    + (109 − 110)² + (110 − 110)² + (111 − 110)²
    + (114 − 110)² + (115 − 110)² + (116 − 110)²
    = 156
SST = the sum of the squared deviations of each score from the grand mean
dfT = N − 1 = 8
Total variance is then SST/dfT = 156/8 = 19.5
19 Breakdown of SS and df for independent ANOVA
Generally:
SST = SSB + SSW
dfT = dfB + dfW, where dfT = N − 1, dfB = k − 1, dfW = N − k
Observed F ratio = BGvar/WGvar = (SSB/dfB)/(SSW/dfW)
Different subjects in different groups.
20 Breakdown of SS and df for independent ANOVA
Generally:
SST = SSB + SSW
dfT = dfB + dfW
Observed F ratio = BGvar/WGvar = (SSB/dfB)/(SSW/dfW)
For our example:
SST = 156, dfT = 8
SSB = 150, dfB = 2
SSW = 6, dfW = 6
F ratio = BGvar/WGvar = (150/2)/(6/6) = 75/1 = 75.0
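As a check on the breakdown, the following sketch computes SST, SSB, and SSW directly from the nine example scores and confirms that the two parts add up to the total. The helper name `one_way_ss` and the grouping of scores into three groups of three are assumptions made for illustration.

```python
def one_way_ss(groups):
    """Return SS and df components for a one-factor, independent groups design."""
    scores = [x for g in groups for x in g]
    n_total = len(scores)
    k = len(groups)
    grand_mean = sum(scores) / n_total

    # Total: every score against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    # Between: each group mean against the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within: every score against its own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

    return {
        "SST": ss_total, "dfT": n_total - 1,
        "SSB": ss_between, "dfB": k - 1,
        "SSW": ss_within, "dfW": n_total - k,
    }


example = one_way_ss([[104, 105, 106], [109, 110, 111], [114, 115, 116]])
f_ratio = (example["SSB"] / example["dfB"]) / (example["SSW"] / example["dfW"])
print(example, f_ratio)
```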
21 All that information can be put into a table
Typical one-factor, independent groups ANOVA table
Source           SS      df   s² (aka Mean Square or MS)   Fobt
Between groups   150.0   2    75.0                         75.0
Within groups    6.0     6    1.0
Total            156.0   8
Is the probability of obtaining an F of 75 < .05? As with the t test, need to use degrees of freedom and a table (Table F).
22 All that information can be put into a table
Typical one-factor, independent groups ANOVA table
Source           SS      df   s²     Fobt   p
Between groups   150.0   2    75.0   75.0   < .01
Within groups    6.0     6    1.0
Total            156.0   8
Is the probability of obtaining an F of 75 < .05? Yes: Fobt(2,6) = 75, which is greater than Fcrit(2,6) = 5.14 (p < .05) or 10.92 (p < .01). Therefore, reject H0!
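The critical values read from Table F can be reproduced without a table in one special case. When the numerator df is exactly 2, the upper tail of the F distribution has a simple closed form, P(F > f) = (1 + 2f/df2)^(−df2/2). The sketch below relies on that special case only. It does not work for other numerator df, and the function names are illustrative.

```python
def f_upper_tail_df1_is_2(f_value: float, df2: int) -> float:
    """P(F > f_value) for an F distribution with (2, df2) degrees of freedom."""
    return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)


def f_critical_df1_is_2(alpha: float, df2: int) -> float:
    """Value of F(2, df2) cut off by an upper-tail area of alpha."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


# Independent groups example: Fobt(2, 6) = 75.
crit_05 = f_critical_df1_is_2(0.05, 6)
crit_01 = f_critical_df1_is_2(0.01, 6)
p_value = f_upper_tail_df1_is_2(75.0, 6)
print(f"Fcrit(2,6): {crit_05:.2f} at .05, {crit_01:.2f} at .01; p = {p_value:.6f}")
```

Both critical values agree with the table (5.14 and 10.92), and the p-value for F = 75 is far below .01.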
23 Sample test problem: independent ANOVA
- An educational psychologist was interested in the spelling performance of children at different grade levels in her school district. She randomly selects a sample of 6 children from each of grades 2, 4, 6, and 8 in her school district. She gives the 24 children a 20-word spelling test and counts the number of spelling errors made. Using the information below, help her determine if spelling ability is or is not equal among the grade levels.
Source           SS       df   s²   Fobt   p
Between groups   82.83
Within groups
Total            157.83
X̄2 = 12.17, X̄4 = 10.00, X̄6 = 7.33, X̄8 = 8.17
29 Decision-making steps
Completed ANOVA table for the spelling problem:
Source           SS       df   s²      Fobt   p
Between groups   82.83    3    27.61   7.36   < .01
Within groups    75.00    20   3.75
Total            157.83   23
30 Decision-making steps
- 1. Define problem: Is spelling ability related to grade level?
- 2. Define hypotheses
- H0: μ2 = μ4 = μ6 = μ8
- HA: not all of μ2, μ4, μ6, μ8 are equal
- 3. Define experiment: 24 kids, 6 in each of 4 grades, given spelling test.
- 4. Define statistic: F test; one-factor, independent samples ANOVA
- 5. Define acceptable probability of Type I error: α < .05.
- 6. Define value of statistic upon which decision hinges: Fcrit(3,20) = 3.10 (p < .05), 4.94 (p < .01)
- 7. Collect data: X̄2 = 12.17, X̄4 = 10.00, X̄6 = 7.33, X̄8 = 8.17
- 8. Compare observed to critical value: Fobs(3,20) = 7.36 > Fcrit at p < .05 and p < .01
- 9. Decide: Reject H0
- 10. Conclusion: Spelling is related to grade level in this school district.
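Filling in the spelling ANOVA table is mechanical once SSB, SST, the number of groups, and the total number of subjects are known. A minimal sketch of that bookkeeping follows. The function name and dictionary keys are illustrative choices, not part of the lecture.

```python
def complete_independent_anova_table(ss_between, ss_total, k, n_total):
    """Fill in a one-factor, independent groups ANOVA table from SSB and SST."""
    ss_within = ss_total - ss_between      # SST = SSB + SSW
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between   # s2 (mean square) between groups
    ms_within = ss_within / df_within      # s2 (mean square) within groups
    return {
        "SSB": ss_between, "dfB": df_between, "MSB": ms_between,
        "SSW": ss_within, "dfW": df_within, "MSW": ms_within,
        "SST": ss_total, "dfT": n_total - 1,
        "F": ms_between / ms_within,
    }


# Spelling problem: 4 grades, 6 children per grade, SSB = 82.83, SST = 157.83.
table = complete_independent_anova_table(82.83, 157.83, k=4, n_total=24)
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```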
33 Breakdown of SS and df for dependent ANOVA
Generally:
SST = SSBS + SSWS, with dfT = kn − 1
SSBS (between subjects): dfBS = n − 1
SSWS (within subjects): dfWS = n(k − 1)
SSWS = SSTreat + SSerror
SSTreat: dfTreat = k − 1
SSerror: dferror = (n − 1)(k − 1)
Observed F ratio = TREATvar/ERRORvar = (SSTreat/dfTreat)/(SSerror/dferror)
Same subjects in different conditions.
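The degrees of freedom for the dependent design nest in two levels, and a quick sketch makes the additivity easy to verify. The function name is illustrative. The example call uses k = 3 conditions and n = 10 subjects, the values of the observer problem in this lecture.

```python
def dependent_anova_df(k: int, n: int) -> dict:
    """Degrees of freedom for a one-factor, dependent samples ANOVA.

    k = number of conditions, n = number of subjects (each measured k times).
    """
    return {
        "total": k * n - 1,
        "between_subjects": n - 1,
        "within_subjects": n * (k - 1),
        "treatment": k - 1,
        "error": (n - 1) * (k - 1),
    }


df = dependent_anova_df(k=3, n=10)
# Total splits into between and within subjects;
# within subjects splits into treatment and error.
assert df["between_subjects"] + df["within_subjects"] == df["total"]
assert df["treatment"] + df["error"] == df["within_subjects"]
print(df)
```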
34 Sample test problem: dependent ANOVA
- A psychologist is interested in the effects that observers have on an individual's problem solving ability. He recruits 10 subjects and asks them to solve some problems under three different conditions: alone, with one observer, and with 10 observers. The problems to be solved were of equal difficulty and the order of conditions was randomized across subjects. The DV was the percent of correctly solved problems. Help this psychologist determine if the number of observers influences problem solving ability.
Source             SS        df   s²        Fobt    p
Between subjects   1036.67   9
Within subjects    4250.00   20
  Treatment        2381.67   2    1190.83   11.47
  Error            1868.33   18   103.80
Total              5286.67   29
X̄alone = 89, X̄one = 81.5, X̄ten = 67.5
37 Decision-making steps
- 1. Define problem: Does the number of observers influence problem solving?
- 2. Define hypotheses
- H0: μalone = μone = μten
- HA: not all of μalone, μone, μten are equal
- 3. Define experiment: 10 subjects solve problems under 3 conditions: alone, with one observer, and with ten observers.
- 4. Define statistic: F test; one-factor, dependent samples ANOVA.
- 5. Define acceptable probability of Type I error: α < .05.
- 6. Define value of statistic upon which decision hinges: Fcrit(2,18) = 3.55 (p < .05), 6.01 (p < .01)
- 7. Perform experiment/collect data: X̄alone = 89, X̄one = 81.5, X̄ten = 67.5
- 8. Compare observed to critical value: Fobs(2,18) = 11.47 > Fcrit at p < .05 and p < .01
- 9. Decide: Reject H0
- 10. Conclusion: The number of observers does influence problem solving.
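The treatment sum of squares in this problem can be recovered from the three condition means alone, because every condition has the same 10 subjects: SSTreat = n × Σ(condition mean − grand mean)². The sketch below does that and then forms the F ratio. SSerror cannot be recovered from the means, so it is taken as given from the ANOVA table (1868.33). Variable names are illustrative.

```python
condition_means = {"alone": 89.0, "one observer": 81.5, "ten observers": 67.5}
n_subjects = 10
ss_error = 1868.33  # taken from the ANOVA table; needs the raw scores to compute

k = len(condition_means)
grand_mean = sum(condition_means.values()) / k

# Treatment SS: spread of the condition means around the grand mean.
ss_treat = n_subjects * sum((m - grand_mean) ** 2 for m in condition_means.values())

df_treat = k - 1
df_error = (n_subjects - 1) * (k - 1)

f_obt = (ss_treat / df_treat) / (ss_error / df_error)
print(f"SSTreat = {ss_treat:.2f}, F({df_treat},{df_error}) = {f_obt:.2f}")
```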
39 Which means are different from which?
- A significant ANOVA (reject H0) tells you that at least one mean differs from at least one other mean.
- Does NOT tell you which mean is different from which.
- Tukey's HSD: A method of determining where the differences are while maintaining p(Type I) < .05
- Tukey's HSD requires:
- Denominator of F ratio (either sW² or s²error, aka MSW or MSerror)
- n (number of subjects in each group/condition)
- Qobt for each pair of means:
- Qobt = (X̄large − X̄small) / √(sW²/n) or (X̄large − X̄small) / √(s²error/n)
- Qcrit from Table G using dfW (or dferror), α, and k (# of groups or conditions)
40 Procedure for problem solving/observer example.
- Denominator of F ratio (s²error, aka MSerror) = 103.80
- n (number of subjects in each group/condition) = 10
- Qobt for each pair of means: X̄alone = 89, X̄one = 81.5, X̄ten = 67.5
- Qobt = (X̄large − X̄small) / √(s²error/n)
- Alone vs. one observer: (89 − 81.5) / √(103.80/10) = 2.33
- Alone vs. ten observers: (89 − 67.5) / √(103.80/10) = 6.67
- One vs. ten observers: (81.5 − 67.5) / √(103.80/10) = 4.35
- Qcrit from Table G using dferror (18), α (.05), and k (3)
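The three Qobt values can be reproduced with a few lines. This is a sketch of the calculation only. The helper name `tukey_q` is illustrative, and the comparison against Qcrit is left to the table lookup.

```python
from itertools import combinations
from math import sqrt


def tukey_q(mean_a: float, mean_b: float, ms_error: float, n: int) -> float:
    """Observed Q: absolute mean difference over sqrt(MS error / n)."""
    return abs(mean_a - mean_b) / sqrt(ms_error / n)


means = {"alone": 89.0, "one": 81.5, "ten": 67.5}
ms_error = 103.80  # denominator of the F ratio
n = 10             # subjects per condition

# One Q for every pair of conditions.
q_obt = {
    (a, b): tukey_q(means[a], means[b], ms_error, n)
    for a, b in combinations(means, 2)
}
for pair, q in q_obt.items():
    print(pair, round(q, 2))
```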
41 Procedure for problem solving/observer example.
- Qobt for each pair of means: X̄alone = 89, X̄one = 81.5, X̄ten = 67.5
- Alone vs. one observer: (89 − 81.5) / √(103.80/10) = 2.33
- Alone vs. ten observers: (89 − 67.5) / √(103.80/10) = 6.67
- One vs. ten observers: (81.5 − 67.5) / √(103.80/10) = 4.35
- Qcrit from Table G (dferror = 18, α = .05, k = 3) = 3.61
- Qobt must be greater than Qcrit
- Thus the condition where there were 10 observers differed from both the condition where subjects were alone and the condition where there was only one observer, but the condition where subjects were alone did not differ from that where there was only one observer.
42 Procedure for spelling/grade example.
- Denominator of F ratio (sW², aka MSW) = 3.75
- n (number of subjects in each group/condition) = 6
- Qobt for each pair of means:
- X̄2 = 12.17, X̄4 = 10.00, X̄6 = 7.33, X̄8 = 8.17
- Qobt = (X̄large − X̄small) / √(sW²/n)
- Grade 2 vs. 4: (12.17 − 10.00) / √(3.75/6)
- Grade 2 vs. 6: (12.17 − 7.33) / √(3.75/6)
- Grade 2 vs. 8: (12.17 − 8.17) / √(3.75/6)
- Grade 4 vs. 6: (10.00 − 7.33) / √(3.75/6)
- Grade 4 vs. 8: (10.00 − 8.17) / √(3.75/6)
- Grade 8 vs. 6: (8.17 − 7.33) / √(3.75/6)
- Qcrit from Table G using dfW (20), α (.05), and k (4) = 3.96
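An equivalent way to run these comparisons is to convert Qcrit into a minimum mean difference, HSD = Qcrit × √(MSW/n), and compare each pair's raw difference against it. The sketch below takes that route for the spelling example, using Qcrit = 3.96 as read from the table. The variable names are illustrative, and the printed decisions are computed by this sketch rather than taken from the original slides.

```python
from itertools import combinations
from math import sqrt

means = {2: 12.17, 4: 10.00, 6: 7.33, 8: 8.17}  # mean errors by grade
ms_within = 3.75  # denominator of the F ratio
n = 6             # children per grade
q_crit = 3.96     # from the studentized range table: df = 20, alpha = .05, k = 4

# Smallest difference between two means that counts as significant.
hsd = q_crit * sqrt(ms_within / n)

significant_pairs = set()
for a, b in combinations(sorted(means), 2):
    difference = abs(means[a] - means[b])
    q_obt = difference / sqrt(ms_within / n)
    is_significant = difference > hsd
    if is_significant:
        significant_pairs.add((a, b))
    print(f"grade {a} vs {b}: diff = {difference:.2f}, Qobt = {q_obt:.2f}, "
          f"significant = {is_significant}")

print(f"HSD = {hsd:.2f}")
```

Comparing a raw difference with HSD gives the same decision as comparing Qobt with Qcrit, since both sides are just scaled by √(MSW/n).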