Biostatistics and Computer Applications

Slides: 41

Provided by: dafen

1
Biostatistics and Computer Applications

ANOVA of hierarchical data
Experimental design
ANOVA of common designs
SAS programming
1/6/2003

2
Recap (Analysis of Variance)

Analysis of variance
One-way ANOVA
Two-way ANOVA
Multiple comparisons

3
Recap (Data for One-Way ANOVA)
k independent samples, n observations each, kn observations in total
4
Recap (One-way ANOVA)
5
Recap (Two-way ANOVA data)
6
Recap (two-way ANOVA)
7
Recap (multiple comparisons)

PLSD (LSD, t test) method.
Confidence interval (1-alpha)
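
For reference, the LSD comparison named above is usually written as follows. This is the standard textbook form for equal group sizes n; the symbols MS_e (error mean square) and df_e (error degrees of freedom) are assumed here, not taken from the slide.

```latex
\mathrm{LSD}_{\alpha} = t_{\alpha/2,\;df_e}\,\sqrt{\frac{2\,MS_e}{n}},
\qquad
(\bar{y}_i - \bar{y}_j) \pm \mathrm{LSD}_{\alpha}
```

Two means are declared different at level alpha when their absolute difference exceeds the LSD; the interval on the right is the matching (1-alpha) confidence interval for the difference.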

8
Hierarchical (Nested classification) data

If the experimental data have l groups, each
group has u subgroups, each subgroup has v
sub-subgroups, and so on, and each of the
lowest-level subgroups has n observations, we call
the data hierarchical data (nested classification).
The simplest case is two-level hierarchical data:
l groups, each with m subgroups, each subgroup
with n observations. The total number of
observations is lmn.
Example: college -> year -> major -> student

9
Hierarchical data table (i = 1, 2, ..., l; j = 1, 2, ..., m; k = 1, 2, ..., n)
10
Linear mathematical model for hierarchical data
11
Linear mathematical model for hierarchical ANOVA
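
The usual linear model for balanced two-level nested data can be sketched as below. The symbols tau (group effect) and delta (subgroup-within-group effect) are assumptions chosen to match the SSt and SSd labels used in the ANOVA table; the slide's own notation was not preserved.

```latex
y_{ijk} = \mu + \tau_i + \delta_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,l;\quad j = 1,\dots,m;\quad k = 1,\dots,n
```

Here mu is the overall mean and epsilon_ijk is the random error of the k-th observation in subgroup j of group i.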
12
ANOVA Total Variation Partitioning
Total variation, SS(Total), is split into:
Variation due to group (SSt)
Variation due to subgroup within group (SSd)
Variation due to random sampling (SSe)
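
Written out, the partition takes the standard form below for a balanced design. This is a textbook reconstruction, not the slide's own equations; bars with dots denote means over the dotted subscripts.

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{l}\sum_{j=1}^{m}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{\cdot\cdot\cdot})^2 \\
SS_t &= mn \sum_{i=1}^{l} (\bar{y}_{i\cdot\cdot} - \bar{y}_{\cdot\cdot\cdot})^2 \\
SS_d &= n \sum_{i=1}^{l}\sum_{j=1}^{m} (\bar{y}_{ij\cdot} - \bar{y}_{i\cdot\cdot})^2 \\
SS_e &= \sum_{i=1}^{l}\sum_{j=1}^{m}\sum_{k=1}^{n} (y_{ijk} - \bar{y}_{ij\cdot})^2 \\
SS_T &= SS_t + SS_d + SS_e, \qquad lmn - 1 = (l-1) + l(m-1) + lm(n-1)
\end{aligned}
```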

13
ANOVA Summary Table
Source of Variation      Degrees of Freedom   Sum of Squares   Mean Square   F
Group                    l - 1                SSt              MSt           MSt/MSd
Subgroup within group    l(m-1)               SSd              MSd           MSd/MSe
Error                    lm(n-1)              SSe              MSe
Total                    lmn - 1              SST
14
Expected mean square of hierarchical ANOVA
Source of Variation      Degrees of Freedom   Mean Square   Expected Mean Square
Group                    l - 1                MSt
Subgroup within group    l(m-1)               MSd
Error                    lm(n-1)              MSe
Total                    lmn - 1
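
For a balanced design with subgroups treated as random, the expected mean squares are commonly given as below. This is the standard textbook result, offered as an assumption about what the slide showed; the symbols sigma^2 (error variance), sigma_d^2 (subgroup variance) and kappa_t^2 (group effect term) are assumed.

```latex
\begin{aligned}
E(MS_t) &= \sigma^2 + n\,\sigma_d^2 + mn\,\kappa_t^2,
  \qquad \kappa_t^2 = \frac{\sum_{i=1}^{l} \tau_i^2}{l-1} \text{ (fixed groups)} \\
E(MS_d) &= \sigma^2 + n\,\sigma_d^2 \\
E(MS_e) &= \sigma^2
\end{aligned}
```

Under these expectations, MSt/MSd isolates the group term and MSd/MSe isolates the subgroup variance, which is why each F ratio uses the mean square one level below it.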
15
ANOVA Null Hypotheses

1. No difference in means due to group
H01: μ1 = μ2 = ... = μl
2. No difference in means due to subgroup within
group
H02: μi1 = μi2 = ... = μim for each group i
If H02 is accepted, test H01,

16
Example of hierarchical ANOVA
Lead concentrations were measured in 4 vegetables
after the soil was treated with a pesticide.
Each vegetable was planted in 3 pots of
contaminated soil, with 5 plants per pot.
17
Result of ANOVA Table
Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square   F
Plant                 3                    76.74            25.58         405.80
Pot within plant      8                    0.63             0.078         1.31
Error                 48                   2.90             0.060
Total                 59                   80.27
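
The sums of squares in this table can be checked by hand from the 60 concentrations listed in the SAS program for this example. The sketch below is plain Python rather than SAS, and the function and key names are hypothetical. It also computes the plant F ratio two ways, because the tabulated 405.80 appears to correspond to a pooled error (pot plus error, 56 df) rather than to the pot mean square alone.

```python
# Two-level nested ANOVA by hand: 4 vegetables x 3 pots x 5 plants.
data = {
    "A": [[0.7, 0.6, 0.9, 0.5, 0.6], [0.9, 0.9, 0.7, 1.1, 0.7], [0.8, 0.6, 0.9, 1.0, 0.8]],
    "B": [[1.2, 1.4, 1.6, 1.2, 1.5], [1.1, 0.9, 1.3, 1.2, 1.0], [1.5, 1.4, 0.9, 1.3, 1.6]],
    "C": [[0.6, 0.6, 0.8, 0.9, 0.7], [0.5, 0.8, 0.9, 1.0, 0.6], [0.6, 1.2, 0.8, 0.9, 1.0]],
    "D": [[4.2, 3.7, 2.9, 3.5, 3.6], [2.9, 3.5, 3.8, 3.1, 3.5], [3.6, 3.5, 4.0, 3.3, 3.7]],
}


def mean(values):
    return sum(values) / len(values)


def nested_anova(groups):
    """Return sums of squares, degrees of freedom and F ratios (balanced data)."""
    l = len(groups)                    # number of groups
    first = next(iter(groups.values()))
    m, n = len(first), len(first[0])   # subgroups per group, observations per subgroup
    grand = mean([y for pots in groups.values() for pot in pots for y in pot])

    ss_group = ss_sub = ss_err = 0.0
    for pots in groups.values():
        group_mean = mean([y for pot in pots for y in pot])
        ss_group += m * n * (group_mean - grand) ** 2
        for pot in pots:
            pot_mean = mean(pot)
            ss_sub += n * (pot_mean - group_mean) ** 2
            ss_err += sum((y - pot_mean) ** 2 for y in pot)

    df_group, df_sub, df_err = l - 1, l * (m - 1), l * m * (n - 1)
    ms_group, ms_sub, ms_err = ss_group / df_group, ss_sub / df_sub, ss_err / df_err
    ms_pooled = (ss_sub + ss_err) / (df_sub + df_err)
    return {
        "ss_group": ss_group, "ss_sub": ss_sub, "ss_err": ss_err,
        "df": (df_group, df_sub, df_err),
        "f_sub": ms_sub / ms_err,               # pot within plant vs. error
        "f_group_vs_sub": ms_group / ms_sub,    # plant vs. pot within plant
        "f_group_pooled": ms_group / ms_pooled  # plant vs. pooled error (56 df)
    }


result = nested_anova(data)
for key, value in result.items():
    print(key, value)
```

The unrounded pooled F comes out slightly above the tabulated 405.80, which is what rounding the mean squares before dividing would produce.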
18
Multiple comparison

Tukey fixed range method
k = 4, df = 60 (56), q0.05 = 3.74 (table)
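
A minimal sketch of the fixed-range comparison, assuming the 56-df pooled error implied by "df = 60 (56)" and the rounded sums of squares from the ANOVA table. The vegetable totals are summed from the 60 data values in the SAS program; variable names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Totals over the 15 plants of each vegetable, summed from the data lines.
totals = {"A": 11.7, "B": 19.1, "C": 11.9, "D": 52.8}
n_per_mean = 15

# Assumption: pooled error mean square, (SSd + SSe) / (8 + 48) df.
ms_pooled = (0.63 + 2.90) / (8 + 48)
q_crit = 3.74  # q(0.05; k = 4, df = 60), the table value quoted on the slide

# Tukey's honestly significant difference for equal group sizes.
hsd = q_crit * sqrt(ms_pooled / n_per_mean)
means = {veg: total / n_per_mean for veg, total in totals.items()}

significant = {
    (a, b): abs(means[a] - means[b]) > hsd
    for a, b in combinations(sorted(means), 2)
}

print("HSD =", round(hsd, 4))
for pair, flag in significant.items():
    print(pair, "different" if flag else "not different")
```

Any pair of vegetable means further apart than the HSD is declared different at the 0.05 level.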

19
SAS program

DATA Hierarchic;
  /* Read vegetable and pot, then hold the line with a trailing @ */
  INPUT vegetable $ pot @;
  DO k = 1 TO 5;
    /* Read one concentration at a time from the held line */
    INPUT concentration @;
    OUTPUT;
  END;
  DATALINES;
A 1 0.7 0.6 0.9 0.5 0.6
A 2 0.9 0.9 0.7 1.1 0.7
A 3 0.8 0.6 0.9 1.0 0.8
B 1 1.2 1.4 1.6 1.2 1.5
B 2 1.1 0.9 1.3 1.2 1.0
B 3 1.5 1.4 0.9 1.3 1.6
C 1 0.6 0.6 0.8 0.9 0.7
C 2 0.5 0.8 0.9 1.0 0.6
C 3 0.6 1.2 0.8 0.9 1.0
D 1 4.2 3.7 2.9 3.5 3.6
D 2 2.9 3.5 3.8 3.1 3.5
D 3 3.6 3.5 4.0 3.3 3.7
;

PROC ANOVA;
  CLASS vegetable pot;
  MODEL concentration = vegetable pot(vegetable);
  /* Test vegetable against pot within vegetable */
  TEST H=vegetable E=pot(vegetable);
  MEANS vegetable / TUKEY E=pot(vegetable);
RUN;

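We test the vegetable effect against the mean
square of pot within vegetable.
We use two INPUT statements: the first reads
vegetable and pot and holds the line, and the
second reads the five concentrations one by one.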
20
Experimental design

Experimental design is a planned interference by
the researcher in the natural order of events.
Why design?
To make inferences about what produced,
contributed to, or caused events.
To gain such information without ambiguity.

21
Experimental design

Terminology
Experimental (and environmental) factor: a
variable of specific experimental interest, for
example fertilizer type or amount of nutrient.
Treatment: a level of an experimental factor, or
a combination of levels when there are multiple
factors. Level refers to the degree or intensity
of a factor.
Random: the property of pure chance events that
are not predictable. Random assignment eliminates
systematic influence.
Control: a group not exposed to the treatment.
Block: a category of subjects within a treatment
group. Within a block, environmental factors are
homogeneous.

22
Experimental design

Principles of experimental design
1. Randomization.
Assign treatments to each unit (plot) at random
(with equal probability).
Provides an unbiased estimate of error (normal
distribution).
2. Replication.
Estimates the random error.
Increases the precision of the estimates.
3. Block control.
A block is one set of experimental units with
similar environmental conditions.
Further decreases the standard error by separating
out the block effect.

Example treatments: A, B, C, D, E
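
The randomization and block-control principles can be sketched as a small layout generator: every treatment appears once in each block, in an independently shuffled order. This is plain Python with a hypothetical function name, and the block count and seed are arbitrary choices for illustration.

```python
import random


def randomized_blocks(treatments, n_blocks, seed=None):
    """Return one random ordering of all treatments for each block."""
    rng = random.Random(seed)  # a seed makes the layout reproducible
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)     # every ordering is equally likely
        layout.append(order)
    return layout


layout = randomized_blocks("ABCDE", n_blocks=3, seed=2003)
for block_number, order in enumerate(layout, start=1):
    print("Block", block_number, " ".join(order))
```

Each printed row is the plot order within one block; replication comes from repeating the full treatment set across blocks.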
23
Experimental design

According to number of factors
Single-factor experiment
Detects the simple effect of one experimental factor
Easy to apply and analyze
Multiple-factor experiment
2, 3, or more factors
Detects both main effects and interactions
Lower standard error, so smaller true effects are
easier to detect
Data are more difficult to analyze

24
Experimental effects

Experimental effects
Simple effect: change in response produced by a
change in the level of a factor
Main effect: mean of the simple effects
Interaction: change in response caused by the
interaction of experimental factors
Example
Test the effects of N and P on wheat yield, with
two levels of N (n1, n2) and two levels of P (p1, p2).
Yields (kg/plot) are shown in the table.

25
Experimental effects

No interaction!
Simple effect Main effect Interaction
26
Experimental effects

Positive interaction!
Simple effect Main effect Interaction
27
Experimental effects

Negative interaction!
Simple effect Main effect Interaction
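
The three quantities can be worked through for a 2 x 2 layout as follows. The yields are hypothetical (the slide tables were not preserved), and the interaction is taken as half the difference between the two simple effects, one common convention that is assumed here.

```python
# Hypothetical yields (kg/plot) for two N levels and two P levels.
yields = {
    ("n1", "p1"): 10.0, ("n2", "p1"): 14.0,
    ("n1", "p2"): 12.0, ("n2", "p2"): 20.0,
}

# Simple effects of N: change from n1 to n2 at each level of P.
simple_n_at_p1 = yields[("n2", "p1")] - yields[("n1", "p1")]
simple_n_at_p2 = yields[("n2", "p2")] - yields[("n1", "p2")]

# Simple effects of P: change from p1 to p2 at each level of N.
simple_p_at_n1 = yields[("n1", "p2")] - yields[("n1", "p1")]
simple_p_at_n2 = yields[("n2", "p2")] - yields[("n2", "p1")]

# Main effect = mean of the simple effects of that factor.
main_n = (simple_n_at_p1 + simple_n_at_p2) / 2
main_p = (simple_p_at_n1 + simple_p_at_n2) / 2

# Interaction (assumed convention): half the difference of the simple effects.
interaction = (simple_n_at_p2 - simple_n_at_p1) / 2

print("Simple effects of N:", simple_n_at_p1, simple_n_at_p2)
print("Simple effects of P:", simple_p_at_n1, simple_p_at_n2)
print("Main effects:", main_n, main_p)
print("Interaction:", interaction)
```

An interaction of zero means the simple effects of one factor are the same at every level of the other; a positive or negative value means they grow or shrink.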
28
Experimental design

According to unit arrangement
Completely randomized experiment
Randomization and replication
One or multiple factors
Easy to apply and analyze
Randomized block experiment
Randomization, replication, and block control
One or multiple factors
Most commonly used
Latin square experiment
Randomization, replication, and block control on
both rows and columns
One or more factors, but the number of treatments
k is limited to about 5 to 10
Split plot experiment
Special requirements for different factors
Different precision for different factors
Multiple factors only

29
Completely randomized experiment

One-factor experiment
This is exactly the same as one-way ANOVA.
Multiple-factor experiment
Analyzed like the randomized block experiment
shown next, but without the block effect.

30
Randomized block experiment (one factor)

Example: We want to compare the yields of 7 barley
varieties in a randomized block design with 3
replicates. Plots and yields per plot are shown below.

31
Randomized block experiment (one factor)

DATA rbe1;
  /* block and variety are character values */
  INPUT block $ variety $ yield;
  DATALINES;
I F 20
I A 24
I E 22
I D 18
I C 21
I G 20
I B 20
II A 20
II D 16
II C 19
II F 21
II B 19
II E 20
II G 19
III F 21
III B 21
;

PROC ANOVA;
  CLASS block variety;
  MODEL yield = variety block;
  MEANS variety / LSD ALPHA=0.05;
RUN;

Here we are interested in the effect of variety,
not block. Blocking is used to decrease the
standard error. If the block effect is not
significant, the environmental factors do not
differ much among blocks. If the block effect is
significant, separating it from the model error
has paid off. We do not do multiple comparisons
for block.
32
Randomized block experiment (two factors)

Example: We want to test the effects of N and P on
plant yield, with three levels of N (0, 5, 10 kg)
and five levels of P (0, 2, 4, 6, 8 kg), giving 15
treatment combinations. Randomized block design
with two replicates.

We use column input to read in N and P.
33
Randomized block experiment (two factors)

DATA rbe2;
  /* Column input: positions match the data lines as laid out here */
  INPUT block N $ 3-4 P $ 5-6 yield;
  DATALINES;
1 A2B2 5.0
1 A2B4 4.9
1 A1B1 4.3
1 A3B2 4.4
1 A1B5 4.7
1 A2B1 5.2
1 A3B4 3.4
1 A1B4 4.8
1 A3B5 3.7
2 A2B3 3.4
2 A3B1 4.7
2 A3B3 3.4
2 A2B2 5.2
2 A1B4 4.0
2 A3B5 4.2
;

PROC ANOVA;
  CLASS block n p;
  MODEL yield = n p n*p block;
  MEANS n p n*p / T;
RUN;

If the interaction (n*p) is not significant, the
best combination is the N level with the highest
mean yield together with the P level with the
highest mean yield. Otherwise, you need to compare
the means of the N*P combinations.
34
Latin square experiment

Test the effect of five N treatments (0 kg, 10 kg,
15 kg, 20 kg, 25 kg) on wheat yield using a Latin
square design. (Treatment codes: 1 = 0 kg,
2 = 10 kg, 3 = 15 kg, 4 = 20 kg, 5 = 25 kg.)

35
Latin square design

DATA latin;
  /* Rows and columns come from the loops; @@ holds the line across iterations */
  DO row = 1 TO 5;
    DO column = 1 TO 5;
      INPUT treatment yield @@;
      OUTPUT;
    END;
  END;
  DATALINES;
3 10.1 1 7.9 2 9.8 5 7.1 4 9.6
1 7.0 4 10.0 5 7.0 3 9.7 2 9.1
5 7.6 3 9.7 4 10.0 2 9.3 1 6.8
4 10.5 2 9.6 3 9.8 1 6.6 5 7.9
2 8.9 5 8.9 1 8.6 4 10.6 3 10.1
;

PROC ANOVA;
  CLASS row column treatment;
  MODEL yield = treatment row column;
  MEANS treatment / T ALPHA=0.05;
  MEANS treatment / T ALPHA=0.01;
RUN;

We focus on treatment effect only.
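
As a cross-check on the design and the partition PROC ANOVA performs, the sketch below (plain Python, hypothetical names) verifies that each treatment code occurs once in every row and column of the data lines, then splits the total sum of squares into row, column, treatment, and error parts.

```python
# (treatment code, yield) for each cell, row by row, as in the data lines.
square = [
    [(3, 10.1), (1, 7.9), (2, 9.8), (5, 7.1), (4, 9.6)],
    [(1, 7.0), (4, 10.0), (5, 7.0), (3, 9.7), (2, 9.1)],
    [(5, 7.6), (3, 9.7), (4, 10.0), (2, 9.3), (1, 6.8)],
    [(4, 10.5), (2, 9.6), (3, 9.8), (1, 6.6), (5, 7.9)],
    [(2, 8.9), (5, 8.9), (1, 8.6), (4, 10.6), (3, 10.1)],
]
n = len(square)


def is_latin(sq):
    """True when every treatment code appears once per row and per column."""
    codes = set(range(1, len(sq) + 1))
    rows_ok = all({t for t, _ in row} == codes for row in sq)
    cols_ok = all({row[c][0] for row in sq} == codes for c in range(len(sq)))
    return rows_ok and cols_ok


def ss_from_totals(totals, per_total, correction):
    """Sum of squares for a factor from its level totals."""
    return sum(t * t for t in totals) / per_total - correction


ys = [y for row in square for _, y in row]
correction = sum(ys) ** 2 / (n * n)

row_totals = [sum(y for _, y in row) for row in square]
col_totals = [sum(row[c][1] for row in square) for c in range(n)]
trt_totals = {t: 0.0 for t in range(1, n + 1)}
for row in square:
    for t, y in row:
        trt_totals[t] += y

ss_total = sum(y * y for y in ys) - correction
ss_row = ss_from_totals(row_totals, n, correction)
ss_col = ss_from_totals(col_totals, n, correction)
ss_trt = ss_from_totals(trt_totals.values(), n, correction)
ss_err = ss_total - ss_row - ss_col - ss_trt

df_err = (n - 1) * (n - 2)
f_trt = (ss_trt / (n - 1)) / (ss_err / df_err)
trt_means = {t: total / n for t, total in trt_totals.items()}

print("Latin square:", is_latin(square))
print("Treatment means:", trt_means)
print("F for treatment:", f_trt)
```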
36
Split plot experiment

To test the warming effect on plant growth, we
set four levels of increased temperature (A1: 3°,
A2: 2°, A3: 1°, and A4: 0°, the control). We also
test the effect of clipping. Within each warming
plot, we set 3 levels of clipping (B1: clipping
twice, in summer and winter; B2: clipping once, in
winter; B3: no clipping). Test the warming and
clipping effects.

37
Split plot experiment

DATA splitplot;
  /* Column input: block in column 1, warming in 2-3, clipping in 5-6 */
  INPUT block 1 warming $ 2-3 clipping $ 5-6 yield;
  DATALINES;
1A3 B2 20
1A3 B1 18
1A3 B3 18
1A2 B3 20
1A2 B1 24
3A3 B3 18
3A3 B2 18
3A2 B3 23
3A2 B2 22
3A2 B1 25
;

PROC ANOVA;
  CLASS block warming clipping;
  MODEL yield = block warming block*warming clipping
        warming*clipping;
  /* Main-plot effects are tested against the block*warming error */
  TEST H=warming block E=block*warming;
  MEANS warming / LSD E=block*warming;
  MEANS clipping / LSD CLDIFF;
RUN;

We use different error terms for warming and
clipping, both in the F tests and in the multiple
comparisons.
38
What if data are missing? (PROC GLM)

The GLM procedure uses the method of least
squares to fit general linear models. It is a
powerful procedure: you can perform regression,
analysis of variance, analysis of covariance,
multivariate analysis of variance, and partial
correlation with PROC GLM.
PROC GLM handles models relating one or several
continuous dependent variables to one or several
independent variables. The independent variables
may be either classification variables, which
divide the observations into discrete groups, or
continuous variables.
For balanced data, you may use PROC ANOVA.
For unbalanced data, you should use PROC GLM.

39
Deal with missing data

PROC GLM < options >;
CLASS variables;
MODEL dependents = independents < / options >;
TEST < H=effects > E=effect < / options >;
MEANS effects < / options >;
LSMEANS effects < / options >;
OUTPUT < OUT=SAS-data-set >
keyword=names < ... keyword=names > < / option >;
RANDOM effects < / options >;

40
Deal with missing data

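DATA unbalanced;
  DO variety = 1 TO 2;
    DO fertilizer = 1 TO 3;
      DO block = 1 TO 3;
        /* A period in the data is a missing yield */
        INPUT yield @@;
        OUTPUT;
      END;
    END;
  END;
  DATALINES;
7 6 8
. 9 10
5 4 3
6 6 7
8 . 9
7 6 5
;

PROC GLM;
  CLASS variety fertilizer block;
  /* Bar notation assumed: main effects plus their interaction */
  MODEL yield = block variety|fertilizer;
  MEANS variety fertilizer / LSD LINES;
  /* TDIFF prints t tests for differences between least-squares means */
  LSMEANS variety fertilizer / TDIFF;
RUN;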
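
Why least-squares means matter for unbalanced data can be seen with simple arithmetic on these 18 values. The sketch below (plain Python, hypothetical names) compares the raw mean of each variety with the unweighted average of its fertilizer cell means. It only illustrates the idea; it does not reproduce the LSMEANS output of PROC GLM, which also adjusts for block.

```python
# Yields in DO-loop order: variety (outer), fertilizer, block (inner).
# None marks a missing value, written as "." in the data lines.
values = [7, 6, 8, None, 9, 10, 5, 4, 3,
          6, 6, 7, 8, None, 9, 7, 6, 5]

stream = iter(values)
cells = {}  # (variety, fertilizer) -> list of observed yields
for variety in (1, 2):
    for fertilizer in (1, 2, 3):
        observed = [next(stream) for _block in (1, 2, 3)]
        cells[(variety, fertilizer)] = [y for y in observed if y is not None]


def mean(xs):
    return sum(xs) / len(xs)


raw_means = {}         # plain average of all observed yields per variety
cell_mean_avgs = {}    # unweighted average of the three cell means
for variety in (1, 2):
    per_cell = [cells[(variety, f)] for f in (1, 2, 3)]
    raw_means[variety] = mean([y for cell in per_cell for y in cell])
    cell_mean_avgs[variety] = mean([mean(cell) for cell in per_cell])

for variety in (1, 2):
    print("variety", variety,
          "raw mean:", raw_means[variety],
          "average of cell means:", cell_mean_avgs[variety])
```

The two summaries disagree because the cell with a missing value contributes fewer observations to the raw mean, which is the kind of distortion the MEANS statement is exposed to and LSMEANS is meant to correct.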
