Title: Experimental Statistics week 5
1Experimental Statistics - week 5
Chapter 9: Multiple Comparisons
Chapter 15: Randomized Complete Block Design (15.3)
2PC SAS on Campus
Library BIC Student Center
SAS Learning Edition 125
http://support.sas.com/rnd/le/index.html
31-Factor ANOVA Model
$y_{ij} = \mu_i + \varepsilon_{ij}$
or
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
where
$y_{ij}$: observed data
$\mu_i$: mean for ith treatment
$\varepsilon_{ij}$: unexplained part
9were rewritten as
10In words
TSS (total SS) = total sample variability among the $y_{ij}$ values
SSB (SS between) = variability explained by differences in group means
SSW (SS within) = unexplained variability (within groups)
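In symbols, the verbal description above corresponds to the usual sum-of-squares decomposition. The notation here is an assumption, since the slide formulas did not survive: $\bar{y}_{i\cdot}$ is the mean of group $i$, $\bar{y}_{\cdot\cdot}$ is the overall mean, and $n_i$ is the size of group $i$.

```latex
\underbrace{\sum_{i}\sum_{j}\left(y_{ij}-\bar{y}_{\cdot\cdot}\right)^2}_{\text{TSS}}
\;=\;
\underbrace{\sum_{i} n_i\left(\bar{y}_{i\cdot}-\bar{y}_{\cdot\cdot}\right)^2}_{\text{SSB}}
\;+\;
\underbrace{\sum_{i}\sum_{j}\left(y_{ij}-\bar{y}_{i\cdot}\right)^2}_{\text{SSW}}
```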
11Analysis of Variance Table
Note: unequal sample sizes are allowed
12CAR DATA Example
For this analysis, 5 gasoline types (A - E) were to be tested. Twenty cars were selected for testing and were assigned randomly to the groups (i.e., the gasoline types). Thus, in the analysis, each gasoline type was tested on 4 cars. A performance-based octane reading was obtained for each car, and the question is whether the gasolines differ with respect to this octane reading.

A   91.7   91.2   90.9   90.6
B   91.7   91.9   90.9   90.9
C   92.4   91.2   91.6   91.0
D   91.8   92.2   92.0   91.4
E   93.1   92.9   92.4   92.4
13Problem 1. Descriptive Statistics for CAR Data

The MEANS Procedure

Analysis Variable: octane

Mean         Std Dev     Minimum      Maximum
91.7100000   0.7062876   90.6000000   93.1000000
14Problem 3. Descriptive Statistics by Gasoline

The MEANS Procedure

Analysis Variable: octane

gas   Mean         Std Dev     Minimum      Maximum
A     91.1000000   0.4690416   90.6000000   91.7000000
B     91.3500000   0.5259911   90.9000000   91.9000000
C     91.5500000   0.6191392   91.0000000   92.4000000
D     91.8500000   0.3415650   91.4000000   92.2000000
E     92.7000000   0.3559026   92.4000000   93.1000000
15Gasoline Example - Completely Randomized Design -- All 5 Gasolines

The GLM Procedure

Dependent Variable: octane

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              4       6.10800000    1.52700000      6.80   0.0025
Error             15       3.37000000    0.22466667
Corrected Total   19       9.47800000

R-Square   Coeff Var   Root MSE   octane Mean
0.644440   0.516836    0.473990   91.71000

Source   DF   Type I SS    Mean Square   F Value   Pr > F
gas       4   6.10800000   1.52700000       6.80   0.0025
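The sums of squares in this table can be reproduced directly from the 20 octane readings. The following is a minimal Python sketch rather than the SAS computation itself; the helper name `one_way_anova` is hypothetical, and the p-value is left out because it would need an F distribution routine.

```python
from statistics import mean

# Octane readings from the CAR DATA slide, keyed by gasoline type.
octane = {
    "A": [91.7, 91.2, 90.9, 90.6],
    "B": [91.7, 91.9, 90.9, 90.9],
    "C": [92.4, 91.2, 91.6, 91.0],
    "D": [91.8, 92.2, 92.0, 91.4],
    "E": [93.1, 92.9, 92.4, 92.4],
}


def one_way_anova(groups):
    """Return the main entries of a 1-factor ANOVA table (hypothetical helper)."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    # Between-group SS: group size times squared deviation of each group mean.
    ssb = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
    # Within-group SS: squared deviations of observations from their own group mean.
    ssw = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    # Total SS computed directly, to check that TSS = SSB + SSW.
    tss = sum((y - grand_mean) ** 2 for y in all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ssb / df_between) / (ssw / df_within)
    return {"SSB": ssb, "SSW": ssw, "TSS": tss,
            "df_between": df_between, "df_within": df_within, "F": f_value}


table = one_way_anova(octane)
for name, value in table.items():
    print(f"{name:>10}: {value:.4f}")
```

The printed SSB, SSW, and TSS should agree with the Model, Error, and Corrected Total rows of the SAS output (6.108, 3.370, 9.478), and F with the reported 6.80.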
16Problem 6. 1-factor ANOVA for first 3 GAS Types

The GLM Procedure

Dependent Variable: octane

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              2       0.40666667    0.20333333      0.69   0.5248
Error              9       2.64000000    0.29333333
Corrected Total   11       3.04666667

R-Square   Coeff Var   Root MSE   octane Mean
0.133479   0.592996    0.541603   91.33333

Source   DF   Type I SS    Mean Square   F Value   Pr > F
gas       2   0.40666667   0.20333333       0.69   0.5248
19Question 1: Which gasolines are different?
Question 2: Why didn't we just do t-tests to compare all combinations of gasolines?
i.e., compare A vs B, A vs C, . . ., D vs E
20Simulation
i.e., using a computer to generate data under certain known conditions and observing the outcomes
21Setting
Normal population with $\mu = 20$ and $\sigma = 5$
Simulation Experiment
Generate 2 samples of size $n = 10$ from this population and run a t-test to compare the sample means,
i.e., test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \ne \mu_2$
Question
What do we expect to happen?
22t-test procedure ($\alpha = .05$): Reject $H_0$ if $|t| > 2.101$
Simulation Results
Sample   Mean   Std Dev
1        21.6   4.0
2        21.1   5.4
$t = .235$, so we do not reject $H_0$
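The reported t statistic can be checked from the two summary lines. This sketch assumes the two numbers listed for each sample are its mean and standard deviation, and that the test is the pooled-variance two-sample t-test (which gives the 18 degrees of freedom behind the 2.101 cutoff). The function name `pooled_t` is hypothetical.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance two-sample t statistic from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err


# Samples 1 and 2 from the simulation results, n = 10 each.
t = pooled_t(21.6, 4.0, 10, 21.1, 5.4, 10)
reject = abs(t) > 2.101  # two-sided cutoff quoted on the slide
print(round(t, 3), reject)  # prints: 0.235 False
```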
23Now suppose we obtain 10 samples and test whether their means differ
Simulation results
Sample   Mean   Std Dev
1        21.6   4.0
2        21.1   5.4
3        20.9   6.2
4        18.3   3.2
5        23.1   6.7
6        18.6   4.8
7        22.2   5.8
8        19.1   5.9
9        20.3   2.5
10       19.3   3.2
Note: Comparing means 4 vs 5 we get $t = 2.33$
What does this mean?
24Suppose we run all possible t-tests at significance level $\alpha = .05$ to compare 10 sample means, each based on a sample of size $n = 10$ from this population
- it can be shown that there is a 63% chance that at least one pair of means will be declared significantly different from each other
The F-test in ANOVA controls the overall significance level.
25Probability of finding at least 2 of k means significantly different using multiple t-tests at the $\alpha = .05$ level when all means are actually equal.

k     Prob.
2     .05
3     .13
4     .21
5     .29
10    .63
20    .92
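The simulation experiment described above can be imitated in a few lines. This is a rough Monte Carlo sketch under stated assumptions, not the calculation behind the table: every pair is compared with its own pooled two-sample t-test, the 2.101 cutoff is the one quoted earlier for two samples of size 10, and the seed and number of repetitions are arbitrary. The estimates should land in the neighborhood of the tabulated probabilities without matching them exactly.

```python
import random
from itertools import combinations
from math import sqrt

T_CRIT = 2.101  # two-sided 5% cutoff for 18 df; only valid for two samples of n = 10


def mean_and_var(sample):
    """Sample mean and unbiased sample variance."""
    n = len(sample)
    m = sum(sample) / n
    return m, sum((y - m) ** 2 for y in sample) / (n - 1)


def any_pair_rejected(k, n, rng, mu=20.0, sigma=5.0):
    """Draw k samples from ONE Normal population and run all pairwise t-tests."""
    stats = [mean_and_var([rng.gauss(mu, sigma) for _ in range(n)]) for _ in range(k)]
    for (m1, v1), (m2, v2) in combinations(stats, 2):
        pooled_var = (v1 + v2) / 2  # equal sample sizes, so a simple average
        t = (m1 - m2) / sqrt(pooled_var * 2 / n)
        if abs(t) > T_CRIT:
            return True  # at least one pair declared different
    return False


def false_alarm_rate(k, n=10, reps=2000, seed=1):
    """Estimated chance that at least one pair is declared different."""
    rng = random.Random(seed)
    return sum(any_pair_rejected(k, n, rng) for _ in range(reps)) / reps


for k in (2, 3, 4, 5, 10):
    print(k, round(false_alarm_rate(k), 3))
```

Because all k samples come from the same population, every rejection counted here is a false alarm, which is why the rate grows with the number of pairs tested.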
26Fisher's Least Significant Difference (LSD)
Protected LSD: Preceded by an F-test for overall significance.
Only use the LSD if F is significant.
Unprotected LSD: Not preceded by an F-test (like individual t-tests).
31PROC GLM;  /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / LSD;
RUN;
32Gasoline Example - Completely Randomized Design

The GLM Procedure

t Tests (LSD) for octane

NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                          0.05
Error Degrees of Freedom       15
Error Mean Square              0.224667
Critical Value of t            2.13145
Least Significant Difference   0.7144

Means with the same letter are not significantly different.

t Grouping   Mean      N   gas
     A       92.7000   4   E
     B       91.8500   4   D
     B
  C  B       91.5500   4   C
  C  B
  C  B       91.3500   4   B
  C
  C          91.1000   4   A
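The Least Significant Difference in this output can be recomputed from the other reported quantities. The sketch below assumes equal group sizes, in which case the LSD is the critical t value times $\sqrt{2\,\mathrm{MSE}/n}$; the critical value itself is copied from the SAS output rather than computed.

```python
from itertools import combinations
from math import sqrt

# Values read off the SAS output above.
T_CRIT = 2.13145   # critical value of t (15 error df, alpha = 0.05)
MSE = 0.224667     # error mean square
N_PER_GROUP = 4    # cars per gasoline

# Gasoline means from the descriptive statistics.
means = {"A": 91.10, "B": 91.35, "C": 91.55, "D": 91.85, "E": 92.70}

# LSD for equal group sizes: t * sqrt(2 * MSE / n).
lsd = T_CRIT * sqrt(2 * MSE / N_PER_GROUP)

# Two gasolines are declared different when their means differ by more than the LSD.
different = [
    (g1, g2)
    for g1, g2 in combinations(sorted(means), 2)
    if abs(means[g1] - means[g2]) > lsd
]

print(f"LSD = {lsd:.4f}")
print("Significantly different pairs:", different)
```

The resulting pairs match the letter grouping: E differs from every other gasoline, and D differs from A.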
34Bonferroni Multiple Comparisons (BSD)
Number of Pairwise Comparisons
37PROC GLM;  /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / BON;
RUN;
38Gasoline Example - Completely Randomized Design

The GLM Procedure

Bonferroni (Dunn) t Tests for octane

NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha                            0.05
Error Degrees of Freedom         15
Error Mean Square                0.224667
Critical Value of t              3.28604
Minimum Significant Difference   1.1014

Means with the same letter are not significantly different.

Bon Grouping   Mean      N   gas
     A         92.7000   4   E
     A
  B  A         91.8500   4   D
  B
  B            91.5500   4   C
  B
  B            91.3500   4   B
  B
  B            91.1000   4   A
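The Bonferroni idea is to split the overall $\alpha$ across all pairwise comparisons. The sketch below counts the comparisons for the 5 gasolines, computes the per-comparison level, and recomputes the Minimum Significant Difference. It assumes equal group sizes, and the critical value 3.28604 is copied from the SAS output rather than computed.

```python
from itertools import combinations
from math import comb, sqrt

ALPHA = 0.05
MSE = 0.224667        # error mean square from the SAS output
N_PER_GROUP = 4       # cars per gasoline
T_CRIT_BON = 3.28604  # Bonferroni critical value of t reported by SAS (15 error df)

means = {"A": 91.10, "B": 91.35, "C": 91.55, "D": 91.85, "E": 92.70}

m = comb(len(means), 2)   # number of pairwise comparisons, k(k-1)/2
alpha_each = ALPHA / m    # level used for each individual comparison

# Same form as the LSD, but with the larger Bonferroni critical value.
msd = T_CRIT_BON * sqrt(2 * MSE / N_PER_GROUP)

different = [
    (g1, g2)
    for g1, g2 in combinations(sorted(means), 2)
    if abs(means[g1] - means[g2]) > msd
]

print(f"{m} comparisons, each at level {alpha_each:.3f}")
print(f"MSD = {msd:.4f}")
print("Significantly different pairs:", different)
```

With the larger cutoff, only E vs A, B, and C remain significant, so the Bonferroni grouping is more conservative than the LSD grouping for the same data.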
40Extracted from Ex. 8.2, pages 390-391
3 Methods for Reducing Hostility
12 students displaying similar hostility were randomly assigned to 3 treatment methods. Scores (HLT) at the end of the study were recorded.

Method 1   96   79   91   85
Method 2   77   76   74   73
Method 3   66   73   69   66

Test whether the mean scores for the 3 methods differ.
41ANOVA Table Output - hostility data
- calculations done in class

Source            SS       df   MS       F       p-value
Between samples   767.17    2   383.58   16.78   <.001
Within samples    205.75    9    22.86
Totals            972.92   11
45Hostility Data - Completely Randomized Design

The GLM Procedure

t Tests (LSD) for score

NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                          0.05
Error Degrees of Freedom       9
Error Mean Square              22.86111
Critical Value of t            2.26216
Least Significant Difference   7.6482

Means with the same letter are not significantly different.

t Grouping   Mean     N   method
     A       87.750   4   M1
     B       75.000   4   M2
     B
     B       68.500   4   M3
46Hostility Data - Completely Randomized Design

The GLM Procedure

Bonferroni (Dunn) t Tests for score

NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.

Alpha                            0.05
Error Degrees of Freedom         9
Error Mean Square                22.86111
Critical Value of t              2.93332
Minimum Significant Difference   9.9173

Means with the same letter are not significantly different.

Bon Grouping   Mean     N   method
     A         87.750   4   M1
     B         75.000   4   M2
     B
     B         68.500   4   M3
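As a by-hand check of the two hostility outputs, both cutoffs can be worked out from the reported error mean square. This assumes the equal-sample-size form of each cutoff, with $n = 4$ students per method; the critical values are the ones printed by SAS.

```latex
\text{LSD} = 2.26216\,\sqrt{\frac{2\,(22.86111)}{4}} \approx 7.648,
\qquad
\text{Bonferroni MSD} = 2.93332\,\sqrt{\frac{2\,(22.86111)}{4}} \approx 9.917

\bar{y}_{1\cdot}-\bar{y}_{2\cdot} = 12.75, \qquad
\bar{y}_{1\cdot}-\bar{y}_{3\cdot} = 19.25, \qquad
\bar{y}_{2\cdot}-\bar{y}_{3\cdot} = 6.50
```

The first two differences exceed both cutoffs and the third exceeds neither, so both procedures separate Method 1 from Methods 2 and 3 and do not separate Methods 2 and 3.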
47Begin Thursday, February 10 Lecture
48Some Multiple Comparison Techniques in SAS
FISHER'S LSD (LSD)
BONFERRONI (BON)
STUDENT-NEWMAN-KEULS (SNK)
DUNCAN
DUNNETT
RYAN-EINOT-GABRIEL-WELCH (REGWQ)
SCHEFFE
TUKEY
49Balloon Data
Col. 1-2 - observation number
Col. 3 - color (1=pink, 2=yellow, 3=orange, 4=blue)
Col. 4-7 - inflation time in seconds

1122.4 2324.6 3120.3 4419.8 5324.3 6222.2 7228.5 8225.7
9320.2 10119.6 11228.8 12424.0 13417.1 14419.3 15324.2 16115.8
17218.3 18117.5 19418.7 20322.9 21116.3 22414.0 23416.6 24218.1
25218.9 26416.0 27220.1 28322.5 29316.0 30119.3 31115.9 32320.3
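The fixed-column layout can be decoded mechanically. The sketch below assumes that records for observations 1 to 9 have lost their leading blank (so they are right-justified back to 7 characters before slicing) and that records broken across lines in the listing belong together. The names `parse_record` and `COLOR_NAMES` are hypothetical.

```python
from collections import defaultdict

COLOR_NAMES = {1: "pink", 2: "yellow", 3: "orange", 4: "blue"}

RAW = """
1122.4 2324.6 3120.3 4419.8 5324.3 6222.2 7228.5 8225.7
9320.2 10119.6 11228.8 12424.0 13417.1 14419.3 15324.2 16115.8
17218.3 18117.5 19418.7 20322.9 21116.3 22414.0 23416.6 24218.1
25218.9 26416.0 27220.1 28322.5 29316.0 30119.3 31115.9 32320.3
"""


def parse_record(token):
    """Split one fixed-column record into (observation, color code, time)."""
    field = token.rjust(7)    # restore the leading blank for observations 1-9
    obs = int(field[0:2])     # Col. 1-2: observation number
    color = int(field[2])     # Col. 3: color code
    time = float(field[3:7])  # Col. 4-7: inflation time in seconds
    return obs, color, time


records = [parse_record(token) for token in RAW.split()]

times_by_color = defaultdict(list)
for _, color, time in records:
    times_by_color[COLOR_NAMES[color]].append(time)

for name, times in times_by_color.items():
    print(f"{name:>6}: n = {len(times)}, mean = {sum(times) / len(times):.4f}")

grand_mean = sum(time for _, _, time in records) / len(records)
print(f"overall mean = {grand_mean:.5f}")
```

Decoded this way there are 8 balloons per color, and the overall mean agrees with the TIME Mean of 20.25625 in the SAS ANOVA output for these data.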
50ANOVA --- Balloon Data

General Linear Models Procedure

Dependent Variable: TIME

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3     126.15125000   42.05041667      3.85   0.0200
Error             28     305.64750000   10.91598214
Corrected Total   31     431.79875000

R-Square   C.V.       Root MSE    TIME Mean
0.292153   16.31069   3.3039343   20.256250

Source   DF   Type I SS      Mean Square   F Value   Pr > F
Color     3   126.15125000   42.05041667      3.85   0.0200
51Experimental Design Concepts and Terminology
Designed Experiment
- an investigation in which a specified framework
is used to compare groups or treatments
Factors
- any feature of the experiment that can be
varied from trial to trial
- up to this point we've only looked at experiments with a single factor
52Treatments
- conditions constructed from the factors
(levels of the factor considered, etc.)
Experimental Units
- subjects, material, etc. to which treatment
factors are randomly assigned
- there is inherent variability among these
units irrespective of the treatment imposed
Replication
- we usually assign each treatment to several
experimental units
- these are called replicates
53Examples
                 treatments   experimental units   replicates
Car Data
Hostility Data
Balloon Data
54 Scatterplot Using GPLOT
55Plot of time*id. Legend: A = 1 obs, B = 2 obs, etc.
[Scatterplot Using PLOT: time (vertical axis, about 14 to 30) against id (horizontal axis, 0 to 35)]
56RECALL 1-Factor ANOVA Model
- random errors follow a Normal (N)
distribution, are independently distributed (ID),
and have zero mean and constant variance
-- i.e. variability does not change from
group to group
57Model Assumptions
- equal variances
- normality
Checking Validity of Assumptions
Equal Variances
1. F-test similar to 2-sample case
   - Hartley's test (p.366 text)
   - not recommended
2. Graphical
   - side-by-side box plots
58Graphical Assessment of Equal Variance Assumption
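59Assessing Normality of Errors
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
so
$\varepsilon_{ij} = y_{ij} - (\mu + \alpha_i) = y_{ij} - \mu_i$
$\varepsilon_{ij}$ is estimated by the residual $e_{ij} = y_{ij} - \bar{y}_{i\cdot}$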
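A residual is simply an observation minus its own group mean. The sketch below computes the residuals for the balloon inflation times, grouped by color as decoded from the fixed-column records (the grouping is an assumption about how those columns split, although it reproduces the SAS totals).

```python
from statistics import mean

# Balloon inflation times (seconds) grouped by color.
times = {
    "pink":   [22.4, 20.3, 19.6, 15.8, 17.5, 16.3, 19.3, 15.9],
    "yellow": [22.2, 28.5, 25.7, 28.8, 18.3, 18.1, 18.9, 20.1],
    "orange": [24.6, 24.3, 20.2, 24.2, 22.9, 22.5, 16.0, 20.3],
    "blue":   [19.8, 24.0, 17.1, 19.3, 18.7, 14.0, 16.6, 16.0],
}

# Residual e_ij = y_ij - (mean of group i).
residuals = {
    color: [y - mean(ys) for y in ys]
    for color, ys in times.items()
}

all_residuals = [e for es in residuals.values() for e in es]
sse = sum(e ** 2 for e in all_residuals)  # error sum of squares

print(f"smallest residual = {min(all_residuals):.3f}")
print(f"largest residual  = {max(all_residuals):.3f}")
print(f"sum of squared residuals = {sse:.4f}")
```

The sum of squared residuals should reproduce the Error sum of squares in the balloon ANOVA table (305.6475), and these residuals are the values whose normality is being assessed.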
60proc glm;
  class color;
  model time = color;
  title 'ANOVA --- Balloon Data';
  output out=new r=resid;  /* save residuals in data set new */
  means color / lsd;
run;
proc univariate normal plot;
  var resid;
  title 'Normal Probability Plot for Residuals - Balloon Data';
run;
61Normal Probability Plot
[Plot of residuals (vertical axis, about -5.5 to 6.5) against normal quantiles (horizontal axis, -2 to 2)]
62Homework Problem using Balloon Data
- Run ANOVA using SAS
  -- Do not use the 4-step procedure. Instead, describe your findings based on the P-value.
- Run multiple comparisons (both Fisher's LSD and Bonferroni)
  -- by hand
  -- using SAS for Balloon Data
- Give graphical assessment of the normality and equal variance assumptions and discuss your results
63Model for Gasoline Data
$y_{ij} = \mu_i + \varepsilon_{ij}$
or
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
$y_{ij}$: observed octane
$\mu_i$: mean for ith gasoline
$\varepsilon_{ij}$: unexplained part
-- car-to-car differences
-- temperature
-- etc.
64Gasoline Data
Question: What if car differences are obscuring gasoline differences?
Similar to the diet t-test example. Recall: person-to-person differences obscured the effect of diet.
65Possible Alternative Design
Test all 5 gasolines on the same car
- in essence we test the gasoline effect
directly and remove effect of car-to-car
variation
Question
How would you randomize an experiment with 4 cars?
66Blocking an Experiment
- dividing the observations into groups (called blocks) where the observations in each block are collected under relatively similar conditions
- comparisons can many times be made more precisely this way
67Terminology is based on Agricultural Experiments
Consider the problem of testing fertilizers on a crop
- t fertilizers
- n observations on each
68Completely Randomized Design
t = 3 fertilizers, n = 5 replications
Randomly selected 15 plots, with the fertilizers (A, B, C) assigned completely at random:
B A C A B B A C A C C B C A B
69Randomized Complete Block Strategy
t = 3 fertilizers
A C B
B A C
C B A
A B C
C A B
- randomly select 5 blocks
- randomly assign the 3 treatments to each block
Note: The 3 plots within each block are similar (similar soil type, sun, water, etc.)
70Randomized Complete Block Design
Randomly assign each treatment once to every block
Car Example:
  Car 1: randomly assign each gas to this car
  Car 2: .... etc.
Agricultural Example: Randomly assign each fertilizer to one of the 3 plots within each block
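One way to carry out this randomization is to shuffle the treatment order separately within each block. The sketch below does this for 5 gasolines and 4 cars. The function name `randomize_rcbd`, the car labels, and the seed are all hypothetical, and treating the shuffled order as the order in which each car receives the gasolines is one possible reading of the design.

```python
import random


def randomize_rcbd(treatments, blocks, seed=None):
    """Give every block its own independent random ordering of all treatments."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)  # each treatment appears exactly once per block
        rng.shuffle(order)        # randomize separately within each block
        plan[block] = order
    return plan


gasolines = ["A", "B", "C", "D", "E"]
cars = ["Car 1", "Car 2", "Car 3", "Car 4"]

plan = randomize_rcbd(gasolines, cars, seed=2005)
for car, order in plan.items():
    print(car, "->", " ".join(order))
```

In contrast to the completely randomized design, the randomization here is restricted: every car (block) still receives every gasoline exactly once.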
71Model For Randomized Complete Block Design
$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$
$\alpha_i$: effect of ith treatment (gasoline)
$\beta_j$: effect of jth block (car)
$\varepsilon_{ij}$: unexplained error