Experimental Statistics week 5 - PowerPoint PPT Presentation

Provided by: gatew231
Learn more at: https://faculty.smu.edu
1
Experimental Statistics - week 5
Chapter 9: Multiple Comparisons
Chapter 15: Randomized Complete Block Design (15.3)
2
PC SAS on Campus
Library, BIC, Student Center
SAS Learning Edition 125
http://support.sas.com/rnd/le/index.html
3
1-Factor ANOVA Model
$y_{ij} = \mu_i + \varepsilon_{ij}$
or
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
- $y_{ij}$: observed data
- $\mu_i = \mu + \alpha_i$: mean for ith treatment
- $\varepsilon_{ij}$: unexplained part
9
were rewritten as
10
In words:
TSS (total SS) = total sample variability among the $y_{ij}$ values
SSB (SS between) = variability explained by differences in group means
SSW (SS within) = unexplained variability (within groups)
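The verbal description corresponds to the standard sum-of-squares identity. As a sketch, assuming the usual notation (not shown in the transcript) of $k$ groups with $n_i$ observations in group $i$, group means $\bar{y}_{i\cdot}$ and grand mean $\bar{y}_{\cdot\cdot}$:

```latex
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_{\cdot\cdot}\right)^2}_{\text{TSS}}
\;=\;
\underbrace{\sum_{i=1}^{k} n_i \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2}_{\text{SSB}}
\;+\;
\underbrace{\sum_{i=1}^{k}\sum_{j=1}^{n_i} \left(y_{ij} - \bar{y}_{i\cdot}\right)^2}_{\text{SSW}}
```

The F statistic then compares the two mean squares, $F = \dfrac{\text{SSB}/(k-1)}{\text{SSW}/(N-k)}$ with $N = \sum_i n_i$.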
11
Analysis of Variance Table
Note: unequal sample sizes allowed
12
CAR DATA Example
For this analysis, 5 gasoline types (A - E) were to be tested. Twenty
cars were selected for testing and were assigned randomly to the groups
(i.e. the gasoline types). Thus, in the analysis, each gasoline type was
tested on 4 cars. A performance-based octane reading was obtained for
each car, and the question is whether the gasolines differ with respect
to this octane reading.

Gasoline  Octane readings
A         91.7  91.2  90.9  90.6
B         91.7  91.9  90.9  90.9
C         92.4  91.2  91.6  91.0
D         91.8  92.2  92.0  91.4
E         93.1  92.9  92.4  92.4
13
Problem 1. Descriptive Statistics for CAR Data
The MEANS Procedure
Analysis Variable: octane

Mean        Std Dev    Minimum     Maximum
91.7100000  0.7062876  90.6000000  93.1000000

14
Problem 3. Descriptive Statistics by Gasoline
The MEANS Procedure
Analysis Variable: octane

gas  Mean        Std Dev    Minimum     Maximum
A    91.1000000  0.4690416  90.6000000  91.7000000
B    91.3500000  0.5259911  90.9000000  91.9000000
C    91.5500000  0.6191392  91.0000000  92.4000000
D    91.8500000  0.3415650  91.4000000  92.2000000
E    92.7000000  0.3559026  92.4000000  93.1000000
15
Gasoline Example - Completely Randomized Design -- All 5 Gasolines
The GLM Procedure
Dependent Variable: octane

Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             4      6.10800000   1.52700000     6.80  0.0025
Error            15      3.37000000   0.22466667
Corrected Total  19      9.47800000

R-Square  Coeff Var  Root MSE  octane Mean
0.644440  0.516836   0.473990  91.71000

Source  DF  Type I SS   Mean Square  F Value  Pr > F
gas      4  6.10800000  1.52700000      6.80  0.0025
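The sums of squares and F value in this table can be checked directly from the 20 octane readings. The following is a minimal sketch in Python rather than SAS (the function name `one_way_anova` is hypothetical, not part of the course materials); it does not compute the p-value, since the standard library has no F distribution.

```python
from statistics import mean

# Octane readings from the CAR DATA slide, 4 cars per gasoline
octane = {
    "A": [91.7, 91.2, 90.9, 90.6],
    "B": [91.7, 91.9, 90.9, 90.9],
    "C": [92.4, 91.2, 91.6, 91.0],
    "D": [91.8, 92.2, 92.0, 91.4],
    "E": [93.1, 92.9, 92.4, 92.4],
}

def one_way_anova(groups):
    """Return (SSB, SSW, df_between, df_within, F) for a dict of samples."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    # Between-group SS: squared distance of each group mean from the grand mean
    ssb = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
    # Within-group SS: squared distance of each reading from its own group mean
    ssw = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, df_between, df_within, f_value

ssb, ssw, df_b, df_w, f_value = one_way_anova(octane)
print(f"SSB = {ssb:.3f} (df {df_b}), SSW = {ssw:.3f} (df {df_w}), F = {f_value:.2f}")
```

The printed values should agree with the Model and Error rows of the SAS output (6.108, 3.370 and F = 6.80).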
16
Problem 6. 1-factor ANOVA for first 3 GAS Types
The GLM Procedure
Dependent Variable: octane

Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             2      0.40666667   0.20333333     0.69  0.5248
Error             9      2.64000000   0.29333333
Corrected Total  11      3.04666667

R-Square  Coeff Var  Root MSE  octane Mean
0.133479  0.592996   0.541603  91.33333

Source  DF  Type I SS   Mean Square  F Value  Pr > F
gas      2  0.40666667  0.20333333      0.69  0.5248
19
Question 1: Which gasolines are different?
Question 2: Why didn't we just do t-tests to compare all combinations of gasolines?
i.e. compare A vs B, A vs C, . . . , D vs E
20
Simulation
i.e. using a computer to generate data under
certain known conditions and observing the
outcomes
21
Setting
Normal population with $\mu = 20$ and $\sigma = 5$
Simulation Experiment
Generate 2 samples of size n = 10 from this
population and run a t-test to compare the sample means,
i.e. test $H_0: \mu_1 = \mu_2$ vs. $H_a: \mu_1 \neq \mu_2$
Question
What do we expect to happen?
22
t-test procedure ($\alpha = .05$): Reject $H_0$ if $|t| > 2.101$
Simulation Results
Sample  Mean  Std Dev
1       21.6  4.0
2       21.1  5.4
t = .235, so we do not reject $H_0$
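The t value quoted for samples 1 and 2 can be reproduced with the pooled two-sample formula. This is a sketch under the assumption that the two numbers listed for each sample are its mean and standard deviation and that each sample has n = 10; the helper name `pooled_t` is hypothetical.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample pooled-variance t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df

# Sample 1: mean 21.6, sd 4.0; sample 2: mean 21.1, sd 5.4 (assumed reading of the slide)
t_stat, df = pooled_t(21.6, 4.0, 10, 21.1, 5.4, 10)
T_CRIT = 2.101  # two-sided 5% critical value for df = 18, as quoted on the slide
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {abs(t_stat) > T_CRIT}")
```

With these inputs the statistic is about 0.235 on 18 degrees of freedom, well inside the non-rejection region.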
23
Now - suppose we obtain 10 samples and test
Simulation results
Sample  Mean  Std Dev
1       21.6  4.0
2       21.1  5.4
3       20.9  6.2
4       18.3  3.2
5       23.1  6.7
6       18.6  4.8
7       22.2  5.8
8       19.1  5.9
9       20.3  2.5
10      19.3  3.2
Note: Comparing means 4 vs 5 we get t = 2.33
What does this mean?
24
Suppose we run all possible t-tests at
significance level $\alpha = .05$ to compare 10 sample
means, each based on a sample of size n = 10 from this population
- it can be shown that there is a 63% chance
that at least one pair of means will be
declared significantly different from each
other
The F-test in ANOVA controls the overall significance
level.
25
Probability of finding at least 2 of k means
significantly different using multiple t-tests
at the $\alpha = .05$ level when all means are actually
equal.

k    Prob.
2    .05
3    .13
4    .21
5    .29
10   .63
20   .92
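The simulation idea from the earlier slides can be used to see this inflation directly. Below is a rough sketch, assuming the same setting (normal population with mean 20 and standard deviation 5, samples of size 10, pooled t-tests with critical value 2.101); the function names, trial count and seed are illustrative choices. The tabled probabilities are likely theoretical values, so the simulated rates should fall in the same neighborhood without matching exactly.

```python
import random
from itertools import combinations
from math import sqrt

MU, SIGMA, N = 20.0, 5.0, 10   # population and sample size from the simulation slides
T_CRIT = 2.101                 # two-sided 5% critical value for df = 18

def sample_stats(rng):
    """Draw one sample of size N and return its (mean, variance)."""
    xs = [rng.gauss(MU, SIGMA) for _ in range(N)]
    m = sum(xs) / N
    v = sum((x - m) ** 2 for x in xs) / (N - 1)
    return m, v

def any_pair_significant(k, rng):
    """True if at least one of the k(k-1)/2 pairwise t-tests rejects."""
    stats = [sample_stats(rng) for _ in range(k)]
    for (m1, v1), (m2, v2) in combinations(stats, 2):
        # Equal sample sizes, so the pooled variance is the average of the two
        std_err = sqrt((v1 + v2) / 2 * (2 / N))
        if abs(m1 - m2) / std_err > T_CRIT:
            return True
    return False

def false_alarm_rate(k, trials=2000, seed=1):
    """Fraction of simulated experiments with at least one false rejection."""
    rng = random.Random(seed)
    return sum(any_pair_significant(k, rng) for _ in range(trials)) / trials

for k in (2, 3, 5, 10):
    print(k, round(false_alarm_rate(k), 3))
```

All k population means are equal by construction, so every rejection counted here is a Type I error.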
26
Fisher's Least Significant Difference (LSD)
Protected LSD: Preceded by an F-test for overall significance.
Only use the LSD if F is significant.
Unprotected LSD: Not preceded by an F-test (like individual t-tests).
31
PROC GLM;  /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / lsd;
RUN;
32
Gasoline Example - Completely Randomized Design
The GLM Procedure
t Tests (LSD) for octane

NOTE: This test controls the Type I comparisonwise error rate, not the
experimentwise error rate.

Alpha                         0.05
Error Degrees of Freedom      15
Error Mean Square             0.224667
Critical Value of t           2.13145
Least Significant Difference  0.7144

Means with the same letter are not significantly different.

t Grouping  Mean     N  gas
A           92.7000  4  E
B           91.8500  4  D
B C         91.5500  4  C
B C         91.3500  4  B
C           91.1000  4  A
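The Least Significant Difference in this output follows from the critical t value and the error mean square. A minimal sketch, assuming the standard equal-sample-size formula LSD = t * sqrt(2 * MSE / n) and taking the critical value from the SAS listing (the standard library cannot compute t quantiles):

```python
from itertools import combinations
from math import sqrt

# Group means from the listing, 4 cars per gasoline
means = {"E": 92.70, "D": 91.85, "C": 91.55, "B": 91.35, "A": 91.10}
N_PER_GROUP = 4
MSE = 0.224667      # Error Mean Square from the output
T_CRIT = 2.13145    # Critical Value of t from the output (alpha = .05, df = 15)

lsd = T_CRIT * sqrt(2 * MSE / N_PER_GROUP)

# A pair is declared different when its mean difference exceeds the LSD
different = [(a, b) for a, b in combinations(means, 2)
             if abs(means[a] - means[b]) > lsd]

print(f"LSD = {lsd:.4f}")
print("significantly different pairs:", different)
```

The pairs flagged here (E against every other gasoline, and D against A) are exactly the pairs that share no letter in the t Grouping column.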
34
Bonferroni Multiple Comparisons (BSD)
Number of Pairwise Comparisons
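As a sketch of the standard Bonferroni setup (the symbols $k$ for the number of treatments and $m$ for the number of comparisons are notation assumed here), the count of pairwise comparisons and the per-comparison significance level are:

```latex
m = \binom{k}{2} = \frac{k(k-1)}{2},
\qquad
\alpha_{\text{per comparison}} = \frac{\alpha}{m}
```

For the $k = 5$ gasolines this gives $m = \frac{5 \cdot 4}{2} = 10$ comparisons, each tested at level $.05 / 10 = .005$, so the two-sided critical value is the upper $.0025$ point of the t distribution with the error degrees of freedom.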
37
PROC GLM;  /* or PROC ANOVA */
  CLASS gas;
  MODEL octane = gas;
  TITLE 'Gasoline Example - Completely Randomized Design';
  MEANS gas / bon;
RUN;
38
Gasoline Example - Completely Randomized Design
The GLM Procedure
Bonferroni (Dunn) t Tests for octane

NOTE: This test controls the Type I experimentwise error rate,
but it generally has a higher Type II error rate than REGWQ.

Alpha                           0.05
Error Degrees of Freedom        15
Error Mean Square               0.224667
Critical Value of t             3.28604
Minimum Significant Difference  1.1014

Means with the same letter are not significantly different.

Bon Grouping  Mean     N  gas
A             92.7000  4  E
A B           91.8500  4  D
B             91.5500  4  C
B             91.3500  4  B
B             91.1000  4  A
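The Minimum Significant Difference uses the same formula as the LSD but with a larger critical value, because the .05 level is split across all pairwise comparisons. A minimal sketch, again taking the critical value from the SAS listing rather than computing it:

```python
from itertools import combinations
from math import comb, sqrt

means = {"E": 92.70, "D": 91.85, "C": 91.55, "B": 91.35, "A": 91.10}
N_PER_GROUP = 4
MSE = 0.224667          # Error Mean Square from the output
T_CRIT_BON = 3.28604    # Critical Value of t from the output (df = 15)

m = comb(len(means), 2)     # number of pairwise comparisons among 5 gasolines
alpha_each = 0.05 / m       # significance level used for each comparison
msd = T_CRIT_BON * sqrt(2 * MSE / N_PER_GROUP)

different = [(a, b) for a, b in combinations(means, 2)
             if abs(means[a] - means[b]) > msd]

print(f"{m} comparisons, each at alpha = {alpha_each}")
print(f"MSD = {msd:.4f}")
print("significantly different pairs:", different)
```

Only E versus C, B and A exceed the larger threshold, which is why D now shares a letter with E: the price of controlling the experimentwise error rate is fewer detected differences.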
40
Extracted from Ex. 8.2, pages 390-391
3 Methods for Reducing Hostility
12 students displaying similar hostility were randomly
assigned to 3 treatment methods. Scores (HLT) at
end of study recorded.

Method 1  96  79  91  85
Method 2  77  76  74  73
Method 3  66  73  69  66

Test $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: not all means are equal
41
ANOVA Table Output - hostility data
- calculations done in class

Source           SS      df  MS      F     p-value
Between samples  767.17   2  383.58  16.7  <.001
Within samples   205.74   9   22.86
Totals           972.91
45
Hostility Data - Completely Randomized Design
The GLM Procedure
t Tests (LSD) for score

NOTE: This test controls the Type I comparisonwise error rate, not the
experimentwise error rate.

Alpha                         0.05
Error Degrees of Freedom      9
Error Mean Square             22.86111
Critical Value of t           2.26216
Least Significant Difference  7.6482

Means with the same letter are not significantly different.

t Grouping  Mean    N  method
A           87.750  4  M1
B           75.000  4  M2
B           68.500  4  M3
46
Hostility Data - Completely Randomized Design
The GLM Procedure
Bonferroni (Dunn) t Tests for score

NOTE: This test controls the Type I experimentwise error rate,
but it generally has a higher Type II error rate than REGWQ.

Alpha                           0.05
Error Degrees of Freedom        9
Error Mean Square               22.86111
Critical Value of t             2.93332
Minimum Significant Difference  9.9173

Means with the same letter are not significantly different.

Bon Grouping  Mean    N  method
A             87.750  4  M1
B             75.000  4  M2
B             68.500  4  M3
47
Begin Thursday, February 10 Lecture
48
Some Multiple Comparison Techniques in SAS
FISHER'S LSD (LSD)
BONFERRONI (BON)
STUDENT-NEWMAN-KEULS (SNK)
DUNCAN
DUNNETT
RYAN-EINOT-GABRIEL-WELSCH (REGWQ)
SCHEFFE
TUKEY
49
Balloon Data
Col. 1-2 - observation number
Col. 3 - color (1=pink, 2=yellow, 3=orange, 4=blue)
Col. 4-7 - inflation time in seconds

obs  color  time
 1   1      22.4
 2   3      24.6
 3   1      20.3
 4   4      19.8
 5   3      24.3
 6   2      22.2
 7   2      28.5
 8   2      25.7
 9   3      20.2
10   1      19.6
11   2      28.8
12   4      24.0
13   4      17.1
14   4      19.3
15   3      24.2
16   1      15.8
17   2      18.3
18   1      17.5
19   4      18.7
20   3      22.9
21   1      16.3
22   4      14.0
23   4      16.6
24   2      18.1
25   2      18.9
26   4      16.0
27   2      20.1
28   3      22.5
29   3      16.0
30   1      19.3
31   1      15.9
32   3      20.3
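The column layout above describes fixed-width records, where each field is identified by its position rather than by a separator. As a sketch of how such a record could be split (the function name `parse_record` is hypothetical, and it assumes that single-digit observation numbers originally carried a leading blank in column 1):

```python
COLOR_NAMES = {1: "pink", 2: "yellow", 3: "orange", 4: "blue"}

def parse_record(record):
    """Split one 7-character balloon record into (obs, color, time)."""
    record = record.rjust(7)      # restore the leading blank for obs 1-9 (assumption)
    obs = int(record[0:2])        # Col. 1-2: observation number
    color = int(record[2])        # Col. 3: color code
    time = float(record[3:7])     # Col. 4-7: inflation time in seconds
    return obs, color, time

for raw in ["1122.4", "2324.6", "10119.6"]:
    obs, color, time = parse_record(raw)
    print(obs, COLOR_NAMES[color], time)
```

For example, the record `10119.6` is read as observation 10, color 1 (pink), 19.6 seconds.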
50
ANOVA --- Balloon Data
General Linear Models Procedure
Dependent Variable: TIME

Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             3    126.15125000  42.05041667     3.85  0.0200
Error            28    305.64750000  10.91598214
Corrected Total  31    431.79875000

R-Square  C.V.      Root MSE   TIME Mean
0.292153  16.31069  3.3039343  20.256250

Source  DF  Type I SS     Mean Square  F Value  Pr > F
Color    3  126.15125000  42.05041667     3.85  0.0200
51
Experimental Design Concepts and Terminology
Designed Experiment
- an investigation in which a specified framework
is used to compare groups or treatments
Factors
- any feature of the experiment that can be
varied from trial to trial
- up to this point we've only looked at
experiments with a single factor
52
Treatments
- conditions constructed from the factors
(levels of the factor considered, etc.)
Experimental Units
- subjects, material, etc. to which treatment
factors are randomly assigned
- there is inherent variability among these
units irrespective of the treatment imposed
Replication
- we usually assign each treatment to several
experimental units
- these are called replicates
53
Examples        treatments   experimental units  replicates
Car Data        5 gasolines  20 cars             4
Hostility Data  3 methods    12 students         4
Balloon Data    4 colors     32 balloons         8
54
Scatterplot Using GPLOT
55
Scatterplot Using PLOT
(Line-printer plot not reproduced: Plot of time*id. Legend: A = 1 obs, B = 2 obs, etc. Vertical axis: time, 14 to 30; horizontal axis: id, 0 to 35.)
56
RECALL 1-Factor ANOVA Model
- random errors follow a Normal (N)
distribution, are independently distributed (ID),
and have zero mean and constant variance
-- i.e. variability does not change from
group to group
57
Model Assumptions
- equal variances
- normality
Checking Validity of Assumptions
Equal Variances
1. F-test similar to 2-sample case
   - Hartley's test (p.366 text)
   - not recommended
2. Graphical
   - side-by-side box plots
58
Graphical Assessment of Equal Variance Assumption
59
Assessing Normality of Errors
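$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
so
$\varepsilon_{ij} = y_{ij} - (\mu + \alpha_i) = y_{ij} - \mu_i$
$\varepsilon_{ij}$ is estimated by the residual $e_{ij} = y_{ij} - \bar{y}_{i\cdot}$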
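In practice each error is estimated by a residual: the observation minus its own group mean. The sketch below computes the residuals for the balloon data in Python (an illustration alongside the SAS approach, not a replacement for it) and also prints the group standard deviations as a rough numeric companion to the side-by-side box plots.

```python
from statistics import mean, stdev

# Balloon inflation times (seconds) grouped by color, from the Balloon Data slide
times = {
    "pink":   [22.4, 20.3, 19.6, 15.8, 17.5, 16.3, 19.3, 15.9],
    "yellow": [22.2, 28.5, 25.7, 28.8, 18.3, 18.1, 18.9, 20.1],
    "orange": [24.6, 24.3, 20.2, 24.2, 22.9, 22.5, 16.0, 20.3],
    "blue":   [19.8, 24.0, 17.1, 19.3, 18.7, 14.0, 16.6, 16.0],
}

# Residual = observation minus the mean of its own color group
residuals = {color: [y - mean(ys) for y in ys] for color, ys in times.items()}

for color, ys in times.items():
    print(f"{color:7s} mean = {mean(ys):.4f}  sd = {stdev(ys):.3f}")

all_resid = [r for rs in residuals.values() for r in rs]
sse = sum(r * r for r in all_resid)   # should match the Error SS in the ANOVA table
print(f"residual range: {min(all_resid):.3f} to {max(all_resid):.3f}")
print(f"sum of squared residuals = {sse:.4f}")
```

The sum of squared residuals reproduces the Error sum of squares from the balloon ANOVA (305.6475), and the residuals are what a normal probability plot would display.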
60
proc glm;
  class color;
  model time = color;
  title 'ANOVA --- Balloon Data';
  output out=new r=resid;
  means color / lsd;
run;
proc univariate normal plot;
  var resid;
  title 'Normal Probability Plot for Residuals - Balloon Data';
run;
61
Normal Probability Plot
(Plot of residuals not reproduced: vertical axis runs from -5.5 to 6.5; horizontal axis from -2 to +2.)
62
Homework Problem using Balloon Data
- Run ANOVA using SAS
  -- Do not use the 4-step procedure. Instead, describe
     your findings based on the P-value.
- Run multiple comparisons (both Fisher's LSD and Bonferroni)
  -- by hand
  -- using SAS for Balloon Data
- Give graphical assessment of the normality and equal
  variance assumptions and discuss your results
63
Model for Gasoline Data
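$y_{ij} = \mu_i + \varepsilon_{ij}$
or
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
- $y_{ij}$: observed octane
- $\mu_i$: mean for ith gasoline
- $\varepsilon_{ij}$: unexplained part
  -- car-to-car differences
  -- temperature
  -- etc.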
64
Gasoline Data
Question: What if car differences are obscuring
gasoline differences?
Similar to diet t-test example
Recall: person-to-person differences
obscured effect of diet
65
Possible Alternative Design
Test all 5 gasolines on the same car
- in essence we test the gasoline effect
directly and remove effect of car-to-car
variation
Question
How would you randomize an experiment with 4 cars?
66
Blocking an Experiment
- dividing the observations into groups (called
blocks) where the observations in each block
are collected under relatively similar
conditions
- comparisons can many times be made
more precisely this way
67
Terminology is based on Agricultural Experiments
Consider the problem of testing fertilizers on a crop
- t fertilizers
- n observations on each
68
Completely Randomized Design
t = 3 fertilizers, n = 5 replications
Randomly selected 15 plots
Fertilizer assigned to each plot (in transcript order):
B A C A B B A C A C C B C A B
69
Randomized Complete Block Strategy
t = 3 fertilizers
- randomly select 5 blocks
- randomly assign the 3 treatments to each block
Layout (one row per block):
A C B
B A C
C B A
A B C
C A B
Note: The 3 plots within each block are
similar - similar soil type, sun, water, etc.
70
Randomized Complete Block Design
Randomly assign each treatment once to every block
Car Example
Car 1: randomly assign each gas to this car
Car 2: .... etc.
Agricultural Example
Randomly assign each fertilizer to one of the 3 plots within each block
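The randomization for a randomized complete block design amounts to shuffling the full list of treatments separately within each block. A minimal sketch, assuming 5 gasolines tested in random order on each of 4 cars (the function name `randomize_rcbd` and the seed are illustrative, not part of the course materials):

```python
import random

def randomize_rcbd(treatments, blocks, seed=None):
    """Return {block: treatments in an independently shuffled order}."""
    rng = random.Random(seed)
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)    # fresh random order within each block
        plan[block] = order
    return plan

gasolines = ["A", "B", "C", "D", "E"]
cars = ["Car 1", "Car 2", "Car 3", "Car 4"]

for car, order in randomize_rcbd(gasolines, cars, seed=2024).items():
    print(car, "->", " ".join(order))
```

Every car receives every gasoline exactly once, so car-to-car differences affect all gasolines equally; only the order of testing within a car is left to chance.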
71
Model For Randomized Complete Block Design
$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$
- $\alpha_i$: effect of ith treatment (gasoline)
- $\beta_j$: effect of jth block (car)
- $\varepsilon_{ij}$: unexplained error