A Bestiary of Experimental and Sampling Designs

Provided by: Biol58

Transcript and Presenter's Notes

1
A Bestiary of Experimental and Sampling Designs
2
REMINDERS
  • The goal of experimental design is to minimize
    the potential sources of confusion (Hurlbert
    1984):
  • Temporal (and spatial) variability
  • Procedural effects
  • Experimenter bias
  • Experimenter-generated variability (random
    error)
  • Inherent variability among experimental units
  • Non-demonic intrusion
  • "It is the elementary principles of experimental
    design, not advanced or esoteric ones, which are
    most frequently and severely violated by
    ecologists..."

3
The design of an experiment
  • The details of:
  • Replication
  • Randomization
  • Independence
  • Are these always obvious in biological
    research? Are they system-dependent?

4
We cannot draw blood from a stone
  • Even the most sophisticated analysis CANNOT
    rescue a poor design!!

5
Categorical variables
  • They are classified into two or more unique
    categories
  • Sex (male, female)
  • Trophic status (producer, herbivore, carnivore)
  • Habitat type (shade, sun)
  • Species

6
Continuous variables
  • They are measured on a continuous numerical scale
    (real or integer values)
  • Size
  • Species richness
  • Habitat coverage
  • Population density
  • NOTE: Discrete random variables, such as counts,
    are still considered continuous variables because
    they represent a numerical scale and not a
    category

7
Dependent and independent variables
  • The assignment of dependent and independent
    variables implies a hypothesis of cause and
    effect that you are trying to test.
  • The dependent variable is the response variable
  • The independent variable is the predictor
    variable

8
Ordinate (vertical y-axis)
Abscissa (horizontal x-axis)
By convention, independent variables are plotted
on the x-axis and dependent variables on the
y-axis. In this example we are implying that
lambda (population growth) depends on, or is
affected directly by, time since fire.
9
Four classes of experimental design
                                 Independent (predictor) variable
Dependent (response) variable    Continuous             Categorical
Continuous                       Regression             ANOVA
Categorical                      Logistic regression    Tabular
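
The table above can be read as a simple lookup from the two variable types to a design class. A minimal sketch, assuming hypothetical names (`DESIGN_CLASSES`, `design_class`) that are not part of the presentation:

```python
# Hypothetical lookup mirroring the four-classes table:
# key = (type of dependent variable, type of independent variable)
DESIGN_CLASSES = {
    ("continuous", "continuous"): "Regression",
    ("continuous", "categorical"): "ANOVA",
    ("categorical", "continuous"): "Logistic regression",
    ("categorical", "categorical"): "Tabular",
}


def design_class(dependent: str, independent: str) -> str:
    """Return the design class for a pair of variable types."""
    return DESIGN_CLASSES[(dependent.lower(), independent.lower())]


# Continuous response (e.g., number of flowers), categorical predictor (watered or not)
print(design_class("continuous", "categorical"))
```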
10
The Analysis of Covariance (ANCOVA)
  • It is used when there are two independent
    variables, one of which is categorical and one of
    which is continuous (the covariate)

11
Four classes of experimental design
                      Independent variable
Dependent variable    Continuous             Categorical
Continuous            Regression             ANOVA
Categorical           Logistic regression    Tabular
12
Regression designs
  • Single-factor regression
  • Multiple regression

13
Single-factor regression
  • Collect data on a set of independent replicates.
  • For each replicate, measure both the predictor
    and the response variables.
  • e.g., Hypothesis: seed density (the predictor
    variable) is responsible for rodent density (the
    response variable).

14
Plot    Seeds    Rodents/m2
1       50       3.2
2       12       11.7
.       .        .
n       300      5.3
(Rows are plots; columns are variables.)
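
With one row per plot and one column per variable, a single-factor regression can be fitted by ordinary least squares. A minimal sketch, assuming made-up seed and rodent values (not the data from the slide) and a hypothetical helper `fit_line`:

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical values for illustration only: one entry per independent plot
seeds = [10, 20, 30, 40, 50]           # predictor
rodents = [2.0, 4.1, 5.9, 8.2, 9.8]    # response (rodents/m2)

intercept, slope = fit_line(seeds, rodents)
print(f"rodents = {intercept:.3f} + {slope:.3f} * seeds")
```

The fit treats the seed values as known exactly, which is the Model I assumption.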
15
Single-factor regression
  • You assume that the predictor variable is a
    causal variable: changes in the value of the
    predictor would cause a change in the value of
    the response.
  • This is very different from a study in which you
    would examine the correlation (statistical
    covariation) between two variables.

16
In regression (Model I)
  • You are assuming that the value of the
    independent variable is known exactly and is not
    subject to measurement error

17
Assumptions and caveats
  • Adequate replication.
  • Independence of the data.
  • Ensure that the range of values sampled for the
    predictor variable is large enough to capture the
    full range of responses by the response variable.
  • Ensure that the distribution of predictor values
    is approximately uniform within the sample range.
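
When the predictor can be set by the experimenter (or plots can be chosen by their predictor value), one way to get an approximately uniform spread is to place the levels at equal intervals across the intended range. A minimal sketch; the helper name `evenly_spaced_levels` and the range 0 to 300 are assumptions for illustration:

```python
def evenly_spaced_levels(low, high, n_levels):
    """Return n_levels predictor values at equal intervals from low to high."""
    if n_levels < 2:
        raise ValueError("need at least two levels to span a range")
    step = (high - low) / (n_levels - 1)
    return [low + i * step for i in range(n_levels)]


# Hypothetical range of seed densities, sampled at 7 evenly spaced levels
levels = evenly_spaced_levels(0, 300, 7)
print(levels)
```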

18
A
[Figure: two designs, labeled A and B]
What is different between these two designs?
Would the conclusions be different?
19
A
[Figure: two designs, labeled A and B]
What is different between these two designs?
Would the conclusions be different?
20
Multiple regression
  • Two or more continuous predictor variables are
    measured for each replicate, along with the
    single response variable

21
Assumptions and caveats
  • Adequate replication.
  • Independence of the data.
  • Ensure that the range of values sampled for the
    predictor variables is large enough to capture
    the full range of responses by the response
    variable.
  • Ensure that the distribution of predictor values
    is approximately uniform within the sample range.

These are the same assumptions as for the
single-factor regression, BUT additionally:
22
Multiple regression
  • Ideally, the different predictor variables
    should be independent of one another; in
    reality, however, many predictor variables are
    correlated (e.g., height and weight).
  • This collinearity makes it difficult to estimate
    regression parameters accurately and to tease
    apart how much variation in the response variable
    is associated with each of the predictor
    variables.
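
A quick check for collinearity between two predictors is their correlation coefficient; with exactly two predictors, the variance inflation factor is 1 / (1 - r^2). A minimal sketch, assuming made-up height and weight values and hypothetical helper names:

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r):
    """Variance inflation factor when there are exactly two predictors."""
    return 1.0 / (1.0 - r ** 2)


# Hypothetical measurements on the same five replicates
height = [150, 160, 170, 180, 190]
weight = [52, 58, 69, 77, 84]

r = pearson_r(height, weight)
print(f"r = {r:.3f}, VIF = {vif_two_predictors(r):.1f}")
```

A VIF near 1 means the predictors are close to independent; large values mean the regression parameters will be estimated imprecisely.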

23
Multiple regression
  • As always, replication becomes important as we
    add more predictor variables to the analysis.
  • In many cases it is easier to collect additional
    predictor variables on the same replicates than
    to obtain additional independent replicates.
  • Avoid the temptation to measure everything that
    you can just because it is possible.
  • Think about measuring variables that are
    meaningful for your study system!

24
Multiple regression
  • It is a mistake to think that a model selection
    algorithm can reliably identify the correct set
    of predictor variables...

25
Four classes of experimental design
                      Independent variable
Dependent variable    Continuous             Categorical
Continuous            Regression             ANOVA
Categorical           Logistic regression    Tabular
26
ANOVA designs
  • Analysis of Variance
  • Treatments: the different categories of the
    predictor variables.
  • Replicates: each of the observations made.

27
ANOVA designs
  • Single-factor designs
  • Randomized block designs
  • Nested designs
  • Multifactor designs
  • Split-plot designs
  • Repeated measurements designs
  • BACI designs (before-after-control-impact)

28
Single-factor designs
  • It is one of the simplest, but most powerful,
    experimental designs.
  • Can readily accommodate studies in which the
    number of replicates per treatment is not
    identical (unequal sample size).

29
Single-factor designs
  • In a single-factor design, each of the treatments
    represents variation in a single predictor
    variable or factor
  • Each value of the factor that represents a
    particular treatment is called a treatment level

30
Id    Treatment      Replicate    Number of flowers
1     Watered        1            9
2     Not watered    1            4
.     .              .            .
11    Watered        6            10
12    Not watered    6            2
31
Good news, bad news
  • This design does not explicitly accommodate
    environmental heterogeneity, so we need to sample
    the entire array of background conditions.
  • This means the results can potentially be
    generalized across all environments, BUT
  • If the background noise is much stronger than the
    signal of the treatments, the experiment may have
    low power, and therefore the analysis may not
    reveal treatment differences unless there are
    many replicates.

32
Randomized block designs
  • An effective way to incorporate environmental
    heterogeneity into a design.
  • A block is a delineated area or time period
    within which environmental conditions are
    relatively homogeneous.
  • Blocks can be placed randomly or systematically
    in the study area, but should be arranged so that
    the environmental conditions are more similar
    within blocks than between them.

33
Randomized block designs
Valid blocking
Invalid blocking
34
Randomized block designs
  • Once blocks are established, replicates will
    still be assigned randomly to treatments, but a
    single replicate from each of the treatments is
    assigned to each block.
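
The randomization within blocks can be sketched as follows: every block receives one replicate of each treatment, and the order (position) of the treatments is shuffled independently in each block. The helper name and the seed are assumptions:

```python
import random


def randomized_block_layout(n_blocks, treatments, seed=None):
    """Return {block: treatments in random order}, one replicate of each per block."""
    rng = random.Random(seed)
    layout = {}
    for block in range(1, n_blocks + 1):
        order = list(treatments)
        rng.shuffle(order)  # independent randomization inside each block
        layout[block] = order
    return layout


blocks = randomized_block_layout(6, ["Watered", "Not watered"], seed=42)
for block, order in blocks.items():
    print(block, order)
```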

35
Id    Treatment      Block    Number of flowers
1     Watered        1        9
2     Not watered    1        4
.     .              .        .
11    Watered        6        10
12    Not watered    6        2
36
Caveats
  • Blocks should have enough room to accommodate a
    single replicate of each of the treatments, and
    enough spacing between replicates to ensure their
    independence.
  • The blocks themselves also have to be far enough
    apart from each other to ensure independence of
    replicates among blocks.

37
Advantages
  • It can be used to control for environmental
    gradients and patchy habitats.
  • It is useful when your replication is constrained
    by space or time.
  • Can be adapted for a matched pair lay-out.

38
Disadvantages
  • If the sample size is small and the block effect
    weak, the randomized block design is less
    powerful than the simple one-way layout.
  • If blocks are too small, you may introduce
    non-independence by physically crowding the
    treatments together (e.g., nectar-removal and
    control plots on p. 152 of Gotelli & Ellison).
  • If any of the replicates are lost, the data from
    the block cannot be used unless the missing
    values can be estimated indirectly.

39
Disadvantages
  • It assumes that there is no interaction between
    the blocks and the treatments.
  • BUT, replication within blocks will indeed tease
    apart main effects, block effects, and the
    interaction between blocks and treatments. It
    will also address the problem of missing data
    from within a block.

40
Nested designs
  • It is any design in which there is subsampling
    within each of the replicates.
  • In this design the subsamples are not independent
    of one another (if we analyze them assuming
    independence, it is an example of
    pseudoreplication).
  • The rationale of this design is to increase the
    precision with which we estimate the response of
    each replicate.

41
Id    Treatment      Subsample    Replicate    Number of flowers
1     Watered        1            1            9
2     Watered        2            1            4
3     Watered        3            1            7
.     .              .            .            .
19    Not watered    1            7            16
20    Not watered    2            7            10
21    Not watered    3            7            2
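
One simple way to avoid treating subsamples as independent is to average them within each replicate and analyze the replicate means. A minimal sketch using only the six rows shown in the table above; the helper name `replicate_means` is an assumption:

```python
from collections import defaultdict
from statistics import mean

# (treatment, replicate, number of flowers) for the rows shown in the table
rows = [
    ("Watered", 1, 9), ("Watered", 1, 4), ("Watered", 1, 7),
    ("Not watered", 7, 16), ("Not watered", 7, 10), ("Not watered", 7, 2),
]


def replicate_means(rows):
    """Collapse subsamples to one mean per (treatment, replicate)."""
    by_replicate = defaultdict(list)
    for treatment, replicate, flowers in rows:
        by_replicate[(treatment, replicate)].append(flowers)
    return {key: mean(values) for key, values in by_replicate.items()}


means = replicate_means(rows)
for (treatment, replicate), value in means.items():
    print(treatment, replicate, round(value, 2))
```

The sample size for the treatment comparison is then the number of replicates, not the number of subsamples.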
42
Advantages
  • Subsampling increases the precision of the
    estimate for each replicate in the design.
  • Allows us to test two hypotheses:
  • First: Is there variation among treatments?
  • Second: Is there variation among replicates
    within treatments?
  • Can be extended to a hierarchical sampling
    design.

43
Disadvantages
  • They are often analyzed incorrectly!
  • It is difficult or even impossible to analyze
    properly if the sample sizes are not equal.
  • It often represents a case of misplaced sampling
    effort.
  • Subsampling is not a solution to inadequate
    replication

44
Randomized block designs
  • Strictly speaking, the randomized block and the
    nested ANOVA are two-factor designs, but the
    second factor (i.e., the blocks or subsamples) is
    included only to control for sampling variation
    and is not of primary interest.

45
Multifactor designs
  • In a multifactor design, the treatments cover two
    (or more) different factors, and each factor is
    applied in combination in different treatments.
  • In a multifactor design, there are different
    levels of the treatment for each factor.

46
Multifactor designs
  • Why not just run two separate experiments?
  • Efficiency. It is often more cost effective to
    run a single experiment than to run two separate
    experiments.
  • A multifactor design allows you to test for both
    main effects and for interaction effects.

47
Multifactor designs
  • The main effects are the additive effects of each
    level of one treatment averaged over all levels
    of the other treatment.
  • The interaction effects represent unique
    responses to particular treatment combinations
    that cannot be predicted simply from knowing the
    main effects.
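
For a 2 x 2 design, these definitions reduce to simple arithmetic on the four cell means. A minimal sketch with made-up cell means and hypothetical factor levels; the interaction is expressed here as a difference of differences:

```python
def two_by_two_effects(means, a_levels, b_levels):
    """Main effects and interaction contrast from 2 x 2 cell means keyed by (a, b)."""
    a1, a2 = a_levels
    b1, b2 = b_levels
    effect_a_at_b1 = means[(a2, b1)] - means[(a1, b1)]
    effect_a_at_b2 = means[(a2, b2)] - means[(a1, b2)]
    effect_b_at_a1 = means[(a1, b2)] - means[(a1, b1)]
    effect_b_at_a2 = means[(a2, b2)] - means[(a2, b1)]
    return {
        # effect of A averaged over both levels of B, and vice versa
        "main_a": (effect_a_at_b1 + effect_a_at_b2) / 2,
        "main_b": (effect_b_at_a1 + effect_b_at_a2) / 2,
        # zero when the effect of A is the same at both levels of B
        "interaction": effect_a_at_b2 - effect_a_at_b1,
    }


# Hypothetical cell means for factor A (low, high) and factor B (absent, present)
cell_means = {
    ("low", "absent"): 10.0, ("high", "absent"): 14.0,
    ("low", "present"): 12.0, ("high", "present"): 22.0,
}
effects = two_by_two_effects(cell_means, ("low", "high"), ("absent", "present"))
print(effects)
```

A nonzero interaction means the response to one factor depends on the level of the other, so the main effects alone do not predict the cell means.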

48
Interactions
[Figure: graphs of a response (y-axis, 0 to 60) for West and North across four quarters (1st Qtr to 4th Qtr)]
Which of these graphs show interactions between
direction (west or north) and quarter (1st to
4th)?
49
Orthogonal
  • The key element of a proper multifactorial design
    is that the treatments are fully crossed, or
    orthogonal: every treatment level of the first
    factor must be represented with every treatment
    level of the second factor, and so on.
  • If some of the treatment combinations are missing,
    we end up with a confounded design.

50
Two-factor design
                      Substrate treatment
Predator treatment    Granite    Slate    Cement
Unmanipulated
Cage Control
Predator exclusion
Predator intrusion
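
Whether a planned layout is fully crossed can be checked by listing every combination of factor levels and looking for missing ones. A minimal sketch using the factor levels from the two-factor table above; the helper name `missing_combinations` is an assumption:

```python
from itertools import product

substrates = ["Granite", "Slate", "Cement"]
predators = ["Unmanipulated", "Cage Control",
             "Predator exclusion", "Predator intrusion"]


def missing_combinations(levels_a, levels_b, observed):
    """Return the treatment combinations absent from the observed layout."""
    return sorted(set(product(levels_a, levels_b)) - set(observed))


# A fully crossed (orthogonal) layout has all 3 x 4 = 12 combinations
full = list(product(substrates, predators))
print(len(full), missing_combinations(substrates, predators, full))

# Dropping one combination leaves the design confounded
incomplete = [c for c in full if c != ("Slate", "Cage Control")]
print(missing_combinations(substrates, predators, incomplete))
```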
51
Advantages
  • The key advantage is the ability to tease apart
    main effects and interactions between factors.
    The interaction measures the extent to which
    different treatment combinations act additively,
    synergistically, or antagonistically.

52
Disadvantages
  • The number of treatment combinations can quickly
    become too large for adequate replication!
  • It does not account for spatial heterogeneity.
    This can be handled by a simple randomized block
    design, in which each block contains exactly one
    replicate of each of the treatment combinations.
  • It may not be possible to establish all
    orthogonal treatment combinations.

53
Split-plot designs
  • It is an extension of the randomized block design
    to two treatments.
  • What distinguishes a split-plot design from a
    randomized block design is that a second
    treatment factor is also applied, this time at
    the level of the entire plot.

54
Split plot design
                                              Substrate treatment (the subplot factor)
Predator treatment (the whole-plot factor)    Granite    Slate    Cement
Unmanipulated
Control
Predator exclusion
Predator intrusion
55
Advantages
  • The chief advantage is the efficient use of
    blocks for the application of two treatments.
  • This is a simple layout that controls for
    environmental heterogeneity.

56
Disadvantages
  • As with nested designs, a very common mistake is
    for investigators to analyze a split-plot design
    as a two-factor ANOVA.

57
Repeated measurements designs
  • It is used whenever multiple observations on the
    same replicate are collected at different times
    (it can be thought of as a split-plot in which a
    single replicate serves as a block, and the
    subplot factor is time).

58
Repeated measurements designs
  • The between-subjects factor corresponds to the
    whole-plot factor.
  • The within-subjects factor corresponds to the
    different times.
  • The multiple observations on a single individual
    are not independent of one another. Why do you
    think this is?

59
Advantages
  • Efficiency.
  • It allows each replicate to serve as its own
    block or control.
  • It allows us to test for interactions between
    treatments and time.

60
Circularity
  • Both the randomized block and the repeated
    measures designs make a special assumption of
    circularity for the within-subjects factor.
  • It means that the variance of the difference
    between any two treatment levels in the subplots
    is always the same, i.e., the variance of the
    difference between t1 and t2 is the same as
    between t2 and t3, etc.

61
For a repeated measures design, it means that the
variance of the difference of observations
between any pair of times is the same.
This assumption is unlikely to be met in
biological systems because of their temporal
memory!
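
An informal look at circularity is to compute the variance of the differences for every pair of times and see whether they are similar. A minimal sketch with made-up measurements on four individuals; the helper name is an assumption, and this is a descriptive check, not a formal test:

```python
from itertools import combinations
from statistics import variance


def difference_variances(data):
    """Variance of within-individual differences for each pair of times.

    data maps a time label to a list of observations, with individuals
    in the same order at every time.
    """
    return {
        (t1, t2): variance([b - a for a, b in zip(data[t1], data[t2])])
        for t1, t2 in combinations(data, 2)
    }


# Hypothetical heights of the same four individuals at three times
heights = {
    "t1": [10.0, 12.0, 11.0, 13.0],
    "t2": [12.0, 15.0, 13.0, 16.0],
    "t3": [15.0, 19.0, 18.0, 22.0],
}
diff_vars = difference_variances(heights)
for pair, value in diff_vars.items():
    print(pair, round(value, 3))
```

In this made-up example the variances differ among pairs, which is the pattern that violates circularity.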
62
Disadvantages
  • In many cases the assumption of circularity is
    unlikely to be met for repeated measures.
  • The best way to meet the circularity assumption
    is to use evenly spaced sampling times along with
    knowledge of the natural history of your
    organisms to select the appropriate sampling
    interval.

63
Alternatives
  1. Set up enough replicates so that a different set
    is sampled at each time period. With this design,
    time can be treated as a simple factor in a
    two-factor analysis of variance.
  2. Use the repeated measures layout but collapse the
    correlated repeated measures into a single
    response variable for each individual, and then
    use a simple one-factor analysis of variance,
    e.g., instead of height at age 0 and height at
    age 1, use growth.
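
The second alternative can be sketched as follows: the two correlated heights are collapsed into one growth value per individual, which then serves as the single response in a one-factor analysis. The records and the helper name `growth_by_treatment` are made up for illustration:

```python
from statistics import mean

# Hypothetical records: (individual, treatment, height at age 0, height at age 1)
records = [
    ("a", "Watered", 5.0, 9.0),
    ("b", "Watered", 6.0, 11.0),
    ("c", "Not watered", 5.5, 7.5),
    ("d", "Not watered", 6.5, 8.0),
]


def growth_by_treatment(rows):
    """Collapse the two heights into one growth value per individual."""
    groups = {}
    for _individual, treatment, height_0, height_1 in rows:
        groups.setdefault(treatment, []).append(height_1 - height_0)
    return groups


growth = growth_by_treatment(records)
for treatment, values in growth.items():
    print(treatment, values, mean(values))
```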

64
Think outside the ANOVA Box
  • Many ecological experiments test a continuous
    predictor at only a few values so they can be
    shoehorned into an ANOVA design.
  • One alternative: an experimental regression
    design!

65
Four classes of experimental design
                      Independent variable
Dependent variable    Continuous             Categorical
Continuous            Regression             ANOVA
Categorical           Logistic regression    Tabular
66
Tabular designs
  • The measurements of these designs are counts.
  • A contingency table analysis is used to test
    hypotheses.
  • We will cover this later on.
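
The slides do not say which contingency-table analysis is intended. One common choice is Pearson's chi-square statistic, sketched here with made-up counts and a hypothetical helper name:

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical counts: rows = habitat type (shade, sun), columns = sex (male, female)
counts = [[10, 20],
          [20, 10]]
statistic, dof = chi_square_statistic(counts)
print(round(statistic, 3), dof)
```

The statistic would then be compared against a chi-square distribution with `dof` degrees of freedom.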
