Problems with the Design and Implementation of Randomized Experiments

Provided by: ies7
Learn more at: https://ies.ed.gov
1
Problems with the Design and Implementation of
Randomized Experiments
  • By Larry V. Hedges, Northwestern University

Presented at the 2009 IES Research Conference
2
Hard Answers to Easy Questions
3
Easy Question
  • Isn't it OK if I just match (schools) on some
    variable before randomizing?
  • (You know lots of people do it)
  • This is a simple question, but giving it an
    answer requires serious thinking about design and
    analysis

4
What Does this Question Mean?
  • Generally adding matching or blocking variables
    means adding another (blocking) factor to the
    design
  • The exact consequences depend on the design you
    started with
  • Individually randomized (completely randomized
    design)
  • Cluster randomized (hierarchical design)
  • Multicenter or matched (randomized blocks design)

5
Individually Randomized (Completely Randomized)
Design
  • In this case you are adding a blocking factor
    crossed with treatment (p blocks)
  • In other words, the design becomes a
    (generalized) randomized block design

       Blocks
       1      2      ...    p
T
C
6
Individually Randomized (Completely Randomized)
Design
  • How does this impact the analysis?
  • Think about a balanced design with 2n students
    per block and p blocks and the ANOVA partitioning
    of sums of squares and degrees of freedom
  • Original partitioning
  • $SS_{Total} = SS_T + SS_{WT}$
  • $df_{Total} = df_T + df_{WT}$
  • $2pn - 1 = 1 + (2pn - 2)$
  • Original test statistic
  • $F = SS_T / (SS_{WT} / df_{WT})$

7
Individually Randomized (Completely Randomized)
Design
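  • New partitioning
  • $SS_{Total} = SS_T + SS_B + SS_{B \times T} + SS_{WC}$
  • $df_{Total} = df_T + df_B + df_{B \times T} + df_{WC}$
  • $2pn - 1 = 1 + (p - 1) + (p - 1) + 2p(n - 1)$
  • New test statistic?
  • $F = SS_T / (SS_{WC} / df_{WC})$
  • Or
  • $F = SS_T / (SS_{B \times T} / df_{B \times T})$
  • It depends on the inference model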
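
As a rough illustration of the new partitioning, the sketch below computes the four sums of squares for a balanced blocked design in plain Python and forms both candidate F ratios. The function name `blocked_anova`, the data layout (`data[t][b]` holds the n outcomes for treatment arm t in block b), and the numbers are all assumptions made for this example; none of them come from the presentation.

```python
from statistics import mean

def blocked_anova(data):
    """Partition sums of squares for a balanced treatment-by-block design.

    data[t][b] is a list of n outcomes for arm t (0 = T, 1 = C) in block b.
    This layout is a hypothetical choice for the example.
    """
    arms, p, n = len(data), len(data[0]), len(data[0][0])
    grand = mean(y for arm in data for cell in arm for y in cell)
    arm_means = [mean(y for cell in arm for y in cell) for arm in data]
    block_means = [mean(y for arm in data for y in arm[b]) for b in range(p)]
    cell_means = [[mean(cell) for cell in arm] for arm in data]

    ss_t = p * n * sum((a - grand) ** 2 for a in arm_means)
    ss_b = arms * n * sum((b - grand) ** 2 for b in block_means)
    ss_bxt = n * sum(
        (cell_means[t][b] - arm_means[t] - block_means[b] + grand) ** 2
        for t in range(arms) for b in range(p)
    )
    ss_wc = sum(
        (y - cell_means[t][b]) ** 2
        for t in range(arms) for b in range(p) for y in data[t][b]
    )
    ss = {"T": ss_t, "B": ss_b, "BxT": ss_bxt, "WC": ss_wc}
    df = {"T": arms - 1, "B": p - 1, "BxT": (arms - 1) * (p - 1),
          "WC": arms * p * (n - 1)}
    f_fixed = ss_t / (ss_wc / df["WC"])      # within-cell error term
    f_random = ss_t / (ss_bxt / df["BxT"])   # block-by-treatment error term
    return ss, df, f_fixed, f_random

# Made-up data: 2 arms, p = 3 blocks, n = 2 students per cell.
example = [
    [[12.0, 14.0], [15.0, 17.0], [11.0, 12.0]],  # treatment
    [[10.0, 11.0], [13.0, 12.0], [9.0, 11.0]],   # control
]
ss, df, f_fixed, f_random = blocked_anova(example)
print(ss, df, f_fixed, f_random)
```

The same treatment sum of squares appears in both ratios; only the error term and its degrees of freedom change, which is why the choice has to come from the inference model rather than from the data.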

8
Individually Randomized (Completely Randomized)
Design

      Original Design              Blocked Design
SS    $SS_T + SS_{WT}$             $SS_T + (SS_B + SS_{B \times T} + SS_{WC})$
df    $df_T + df_{WT}$             $df_T + (df_B + df_{B \times T} + df_{WC})$
      $2pn - 1 = 1 + (2pn - 2)$    $2pn - 1 = 1 + [(p - 1) + (p - 1) + 2p(n - 1)]$
9
Inference Models
  • I will mention two inference models
  • Conditional inference model
  • Unconditional inference model
  • These inference models determine the type of
    inference (generalization) you wish to make
  • Inference model chosen has implications for the
    statistical analysis procedure chosen
  • The inference model determines the natural random
    effects

10
Inference Models
  • Conditional Inference Model
  • Generalization is to the blocks actually in the
    experiment (or those just like them)
  • Blocks in the experiment are the universe
    (population)
  • Generalization to other blocks depends on
    extra-statistical considerations (which blocks
    are just like them? How do you know?)
  • Generalization obviously cannot be model free

11
Inference Models
  • Unconditional Inference model
  • Generalization is to a universe (of blocks)
    including blocks not in the experiment
  • Blocks in the experiment are a sample of blocks
    in the universe (population)
  • If blocks in the experiment can be considered a
    representative sample, inference to the
    population of blocks is by sampling theory
  • If blocks are not a probability sample,
    generalization gets tricky (what is the universe?
    How do you know?)

12
Inference Models
  • You can think of the inference model as linked to
    the sampling model for blocks
  • If the blocks observed are a (random) sample of
    blocks, then they are a source of random
    variation
  • If blocks observed are the entire universe of
    relevant blocks, then they are not a source of
    random variation
  • The statistical analysis can be chosen
    independently of the inference model, but if it
    doesn't include all sources of random variation,
    inferences will be compromised

13
Inference Models and Statistical Analyses:
Individually Randomized Design
  • Blocks are fixed effects under the conditional
    inference model
  • In this case the correct test statistic is
  • $F_C = SS_T / (SS_{WC} / df_{WC})$
  • and the F-distribution has 1 and $2p(n - 1)$ df
  • Block effects are random under the unconditional
    inference model
  • In this case the correct test statistic is
  • $F_U = SS_T / (SS_{B \times T} / df_{B \times T})$
  • and the F-distribution has 1 and $(p - 1)$ df

14
Inference Models and Statistical Analyses:
Individually Randomized Design
  • You can see that the error term in the test has
    (a lot) more df under the fixed effects model: $2p(n - 1)$
    versus $(p - 1)$
  • What you can't see is that (if there is a
    treatment effect) the average value of the
    F-statistic is typically also larger under the
    fixed effects model
  • It is bigger by a factor proportional to [expression lost in transcription]
  • where $\omega = \sigma_{B \times T}^2 / \sigma_B^2$ is a treatment heterogeneity
    parameter and $\rho$ is the intraclass correlation

15
Possible Statistical Analyses: Individually
Randomized Design
  • Possible statistical analyses
  • Ignore the blocking
  • Include blocks as fixed effects
  • Include blocks as random effects
  • Consequences depend on whether you want to make a
    conditional or unconditional inference

16
Making Unconditional Inferences: Individually
Randomized Design
  • Possible statistical analyses
  • Ignore the blocking: bad idea. Will inflate
    significance levels of tests for treatment
    effects substantially
  • Include blocks as fixed effects: bad idea. Will
    inflate significance levels of tests for
    treatment effects substantially
  • Include blocks as random effects: correct
    significance levels (but less power than
    conditional analysis)
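
To see why a fixed-blocks analysis can mislead when the inference is unconditional, the following sketch simulates experiments with no treatment effect but with cell-specific random shifts, which induce a block-by-treatment interaction. It then averages the two F ratios. Every name and parameter value here (`simulate_f`, p = 10, n = 20, unit standard deviations, 300 replications, the seed) is an assumption chosen for illustration.

```python
import random
from statistics import mean

def simulate_f(p, n, sd_bxt, sd_within, rng):
    """One experiment with no treatment effect and random cell shifts.

    Returns (F with within-cell error, F with block-by-treatment error).
    """
    data = []
    for _ in range(2):                      # two arms: T and C
        arm = []
        for _ in range(p):                  # p blocks
            shift = rng.gauss(0.0, sd_bxt)  # random shift for this cell
            arm.append([shift + rng.gauss(0.0, sd_within) for _ in range(n)])
        data.append(arm)
    grand = mean(y for arm in data for cell in arm for y in cell)
    arm_means = [mean(y for cell in arm for y in cell) for arm in data]
    block_means = [mean(data[0][b] + data[1][b]) for b in range(p)]
    cell_means = [[mean(cell) for cell in arm] for arm in data]
    ss_t = p * n * sum((a - grand) ** 2 for a in arm_means)
    ss_bxt = n * sum(
        (cell_means[t][b] - arm_means[t] - block_means[b] + grand) ** 2
        for t in range(2) for b in range(p)
    )
    ss_wc = sum(
        (y - cell_means[t][b]) ** 2
        for t in range(2) for b in range(p) for y in data[t][b]
    )
    f_fixed = ss_t / (ss_wc / (2 * p * (n - 1)))   # blocks treated as fixed
    f_random = ss_t / (ss_bxt / (p - 1))           # blocks treated as random
    return f_fixed, f_random

rng = random.Random(2009)  # fixed seed so the run is repeatable
results = [simulate_f(10, 20, 1.0, 1.0, rng) for _ in range(300)]
mean_f_fixed = mean(f for f, _ in results)
mean_f_random = mean(f for _, f in results)
print(f"mean F, blocks fixed:  {mean_f_fixed:.1f}")
print(f"mean F, blocks random: {mean_f_random:.1f}")
```

With no true effect, a well-calibrated F ratio should average close to 1. Under these assumed settings the fixed-blocks ratio averages far above that, so referring it to the F-distribution would reject much too often, while the random-blocks ratio stays near its null expectation.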

17
Making Conditional Inferences: Individually
Randomized Design
  • Possible statistical analyses
  • Ignore the blocking: bad idea. May deflate actual
    significance levels of tests for treatment
    effects substantially (unless $\rho = 0$)
  • Include blocks as fixed effects: correct
    significance levels and more powerful test than
    for unconditional analysis
  • Include blocks as random effects: bad idea. May
    deflate significance levels and reduce power

18
Cluster Randomized (Hierarchical) Design
  • The issues about blocking in the cluster
    randomized design are the same as in the
    individually randomized design
  • The inference model will determine the most
    appropriate statistical analysis
  • Examining the properties of the statistical
    analysis may also reveal the weakness of the
    design for a given inference purpose
  • For example, a small number of blocks may provide
    only very uncertain inference to a universe of
    blocks based on sampling arguments

19
Cluster Randomized (Hierarchical) Design
  • In this case you are adding a blocking factor
    crossed with treatment (p blocks), but clusters
    are still nested within treatments; here Cij is
    the jth cluster in the ith block
  • Note that there are m clusters in each treatment
    per block

     Block 1                                  ...   Block p
     C11, ..., C1m    C1(m+1), ..., C1(2m)          Cp1, ..., Cpm    Cp(m+1), ..., Cp(2m)
T                     ---                                            ---
C    ---                                            ---
20
Cluster Randomized (Hierarchical) Design
  • How does this impact the analysis?
  • Think about a balanced design with 2mn students
    per block and p blocks and the ANOVA partitioning
    of sums of squares and degrees of freedom
  • Original partitioning
  • $SS_{Total} = SS_T + SS_C + SS_{WCT}$
  • $df_{Total} = df_T + df_C + df_{WCT}$
  • $2mn - 1 = 1 + 2(m - 1) + 2m(n - 1)$
  • Original test statistic
  • $F = SS_T / (SS_C / df_C)$

21
Cluster Randomized (Hierarchical) Design
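  • New partitioning
  • $SS_{Total} = SS_T + SS_B + SS_{B \times T} + SS_{C:B \times T} + SS_{WC}$
  • $df_{Total} = df_T + df_B + df_{B \times T} + df_{C:B \times T} + df_{WC}$
  • $2mpn - 1 = 1 + (p - 1) + (p - 1) + 2p(m - 1) + 2pm(n - 1)$
  • New test statistic?
  • $F = SS_T / (SS_{WC} / df_{WC})$
  • $F = SS_T / (SS_{C:B \times T} / df_{C:B \times T})$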

22
Inference Models and Statistical Analyses: Cluster
Randomized Design
  • Blocks are fixed under the conditional inference
    model, but clusters are typically random
  • In this case the correct test statistic is
  • $F_C = SS_T / (SS_{C:B \times T} / df_{C:B \times T})$
  • and the F-distribution has 1 and $2p(m - 1)$ df
  • Blocks are random under the unconditional
    inference model, and clusters are typically
    random as well
  • In this case there is no exact ANOVA test if
    there are block-by-treatment interactions, but a
    conservative test uses the test statistic
  • $F_U = SS_T / (SS_B / df_B)$
  • and the F-distribution has 1 and $(p - 1)$ df (large
    sample tests, e.g., based on HLM, are available)
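
A quick arithmetic check can make the loss of error degrees of freedom concrete. The sketch below lists the degrees of freedom for the blocked cluster randomized design, confirms they add up to 2mpn - 1, and reads off the error df for the two tests. The function name `cluster_blocked_df`, the dictionary labels, and the values p = 6, m = 4, n = 25 are hypothetical.

```python
def cluster_blocked_df(p, m, n):
    """Degrees of freedom for the blocked cluster randomized design.

    p blocks, m clusters per treatment per block, n students per cluster.
    The dictionary keys are labels chosen for this example.
    """
    df = {
        "T": 1,
        "B": p - 1,
        "BxT": p - 1,
        "C:BxT": 2 * p * (m - 1),    # clusters within block-by-treatment cells
        "WC": 2 * p * m * (n - 1),   # students within clusters
    }
    if sum(df.values()) != 2 * m * p * n - 1:
        raise ValueError("components do not add up to the total df")
    return df

df = cluster_blocked_df(p=6, m=4, n=25)
error_df_conditional = df["C:BxT"]   # blocks fixed, clusters random
error_df_unconditional = df["B"]     # conservative test with blocks random
print(error_df_conditional, error_df_unconditional)
```

With these assumed sizes the conditional test has 36 error df and the unconditional test only 5, which is one way to see how weak inference to a universe of blocks can be when few blocks are sampled.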

23
Inference Models and Statistical Analyses: Cluster
Randomized Design
  • You can see that the error term has more df under
    the fixed effects model
  • If there is a treatment effect the average value
    of the F-statistic is also larger under the fixed
    effects model
  • It is bigger by a factor proportional to [expression lost in transcription]
  • where $\omega_B = \sigma_{B \times T}^2 / \sigma_B^2$ is a treatment heterogeneity
    parameter and $\rho_B$ and $\rho_C$ are the block and cluster
    level intraclass correlations, respectively

24
Possible Statistical Analyses: Cluster Randomized
Design
  • Possible statistical analyses
  • Ignore the blocking
  • Include blocks as fixed effects
  • Include blocks as random effects
  • Consequences depend on whether you want to make a
    conditional or unconditional inference

25
Making Unconditional Inferences: Cluster
Randomized Design
  • Possible statistical analyses
  • Ignore the blocking: bad idea. Will inflate
    significance levels of tests for treatment
    effects substantially
  • Include blocks as fixed effects: bad idea. Will
    inflate significance levels of tests for
    treatment effects substantially
  • Include blocks as random effects: correct
    significance levels but less power than
    conditional analysis

26
Making Conditional Inferences: Cluster Randomized
Design
  • Possible statistical analyses
  • Ignore the blocking: bad idea. May deflate actual
    significance levels of tests for treatment
    effects substantially
  • Include blocks as fixed effects: correct
    significance levels and more powerful test than
    for unconditional analysis
  • Include blocks as random effects: not such a bad
    idea. Significance levels unaffected

27
Multi-center (Randomized Blocks) Design
  • The issues about blocking in the multicenter
    (randomized blocks) design are the same as in the
    cluster randomized design
  • The inference model will determine the most
    appropriate statistical analysis
  • Examining the properties of the statistical
    analysis may also reveal the weakness of the
    design for a given inference purpose
  • For example, a small number of blocks may provide
    only very uncertain inference to a universe of
    blocks based on sampling arguments

28
Multi-center (Randomized Blocks) Design
  • In this case you are adding a blocking factor
    crossed with treatment (p blocks) and clusters,
    but clusters are still nested within blocks; here
    Cij is the jth cluster in the ith block
  • Note that there are m clusters in each treatment
    per block and n individuals in each treatment in
    each cluster

     Block 1               ...   Block p
     C11    ...    C1m           Cp1    ...    Cpm
T
C
29
Multi-center (Randomized Blocks) Design
  • How does this impact the analysis?
  • Think about a balanced design with 2mn students
    per block, p blocks, and n individuals per cell,
    and the ANOVA partitioning of sums of squares and
    degrees of freedom
  • Original partitioning
  • $SS_{Total} = SS_T + SS_C + SS_{T \times C} + SS_{WC}$
  • $df_{Total} = df_T + df_C + df_{T \times C} + df_{WC}$
  • $2pmn - 1 = 1 + (pm - 1) + (pm - 1) + 2pm(n - 1)$
  • Original test statistic
  • $F = SS_T / (SS_{T \times C} / df_{T \times C})$

30
Multi-center (Randomized Blocks) Design
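  • New partitioning
  • $SS_{Total} = SS_T + SS_B + SS_{C:B} + SS_{B \times T} + SS_{C:B \times T} + SS_{WC}$
  • $df_{Total} = df_T + df_B + df_{C:B} + df_{B \times T} + df_{C:B \times T} + df_{WC}$
  • $2mpn - 1 = 1 + (p - 1) + p(m - 1) + (p - 1) + p(m - 1) + 2pm(n - 1)$
  • New test statistic?
  • $F = SS_T / (SS_{WC} / df_{WC})$
  • $F = SS_T / (SS_{B \times T} / df_{B \times T})$
  • $F = SS_T / (SS_{C:B \times T} / df_{C:B \times T})$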

31
Inference Models and Statistical Analyses:
Randomized Blocks Design
  • Blocks are fixed under the conditional inference
    model, but clusters are typically random
  • In this case the correct test statistic is
  • $F_C = SS_T / (SS_{C:B \times T} / df_{C:B \times T})$
  • and the F-distribution has 1 and $p(m - 1)$ df
  • Blocks are random under the unconditional
    inference model, and clusters are typically
    random as well
  • In this case the correct test statistic is
  • $F_U = SS_T / (SS_{B \times T} / df_{B \times T})$
  • and the F-distribution has 1 and $(p - 1)$ df
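
The degrees of freedom for the multicenter design can be checked in the same way. The sketch below takes the cluster-by-treatment term within blocks to have p(m - 1) df, the value quoted for the conditional test above, and confirms that with this value the components add up to 2pmn - 1. The function name `multicenter_df`, the labels, and the sizes p = 6, m = 4, n = 25 are assumptions for illustration.

```python
def multicenter_df(p, m, n):
    """Degrees of freedom for the blocked multicenter design.

    p blocks, m clusters per block, n individuals per treatment per cluster.
    The dictionary keys are labels chosen for this example.
    """
    df = {
        "T": 1,
        "B": p - 1,
        "C:B": p * (m - 1),          # clusters within blocks
        "BxT": p - 1,
        "TxC:B": p * (m - 1),        # treatment by cluster, within blocks
        "WC": 2 * p * m * (n - 1),   # individuals within cells
    }
    if sum(df.values()) != 2 * p * m * n - 1:
        raise ValueError("components do not add up to the total df")
    return df

df = multicenter_df(p=6, m=4, n=25)
print(df["TxC:B"], df["BxT"])  # error df: conditional vs. unconditional test
```

Under these assumed sizes the conditional test keeps 18 error df while the unconditional test has 5.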

32
Inference Models and Statistical Analyses:
Randomized Blocks Design
  • You can see that the error term has more df under
    the fixed effects model
  • If there is a treatment effect the average value
    of the F-statistic is also larger under the fixed
    effects model
  • It is bigger by a factor proportional to [expression lost in transcription]
  • where $\omega_B = \sigma_{B \times T}^2 / \sigma_B^2$ and $\omega_C = \sigma_{C \times T}^2 / \sigma_C^2$ are
    treatment heterogeneity parameters and $\rho_B$ and $\rho_C$
    are the block and cluster level intraclass
    correlations, respectively

33
Possible Statistical Analyses: Randomized Blocks
Design
  • Possible statistical analyses
  • Ignore the blocking
  • Include blocks as fixed effects
  • Include blocks as random effects
  • Consequences depend on whether you want to make a
    conditional or unconditional inference

34
Making Unconditional Inferences: Randomized
Blocks Design
  • Possible statistical analyses
  • Ignore the blocking: bad idea. Will inflate
    significance levels of tests for treatment
    effects substantially
  • Include blocks as fixed effects: bad idea. Will
    inflate significance levels of tests for
    treatment effects substantially
  • Include blocks as random effects: correct
    significance levels but less power than
    conditional analysis

35
Making Conditional Inferences: Randomized Blocks
Design
  • Possible statistical analyses
  • Ignore the blocking: bad idea. May deflate actual
    significance levels of tests for treatment
    effects substantially
  • Include blocks as fixed effects: correct
    significance levels and more powerful test than
    for unconditional analysis
  • Include blocks as random effects: bad idea. May
    deflate significance levels and reduce power

36
Another Easy Question
  • There was some attrition from my study after
    assignment. Does that cause a serious problem?
  • This is another simple question, but the answer
    is far from simple. One answer can be framed
    using concepts of experimental design

37
Post Assignment Attrition
  • A different question has a simple answer
  • Does that (attrition) cause a problem in
    principle?
  • The simple answer to that question is YES!
  • Randomized experiments with attrition no longer
    give model free, unbiased estimates of the causal
    effect of treatment
  • Whether the bias is serious or not depends (on
    the model that generates the missing data)

38
Post Assignment Attrition
  • The design is changed by adding a crossed factor
    corresponding to missingness, like this
  • Now we can see a problem with estimating the
    treatment effect from only the observed part of
    the design. The observed treatment effect is only
    part of the total treatment effect

     Observed    Missing
T
C
39
Post Assignment Attrition
  • Suppose that the means are given by the $\mu$s and
    the proportions are given by the $p$s

     Observed                    Missing
     Proportion    Mean          Proportion    Mean
T    $1 - p_T$     $\mu_{TO}$    $p_T$         $\mu_{TM}$
C    $1 - p_C$     $\mu_{CO}$    $p_C$         $\mu_{CM}$
40
Post Assignment Attrition
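  • The treatment effect on all individuals
    randomized is
  • $\delta = [(1 - p_T)\mu_{TO} + p_T\mu_{TM}] - [(1 - p_C)\mu_{CO} + p_C\mu_{CM}]$
  • When the proportion of dropouts is equal in T and
    C so that
  • $p_T = p_C = p$
  • The mean of the treatment effect on all
    individuals randomized is
  • $\delta = (1 - p)(\mu_{TO} - \mu_{CO}) + p(\mu_{TM} - \mu_{CM})$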

41
Post Assignment Attrition
  • Rewriting this we see that the average treatment
    effect for individuals assigned to treatment is
  • $\delta = (1 - p)\delta_O + p\delta_M$
  • where $\delta_O$ is the treatment effect among the
    individuals that are observed, $\delta_M$ is the
    treatment effect among the individuals that are
    not observed, and $\delta$ is the treatment effect among
    all individuals assigned
  • Thus bounds on $\delta_M$ imply bounds on $\delta$
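
When both arms lose the same proportion p of participants, the overall effect is a weighted average of the effect among observed and missing individuals, (1 - p) times the observed effect plus p times the missing effect, which follows from the definitions of the means and proportions. The sketch below turns assumed limits on the effect among the missing into limits on the overall effect. The function name `effect_bounds_equal_attrition` and all of the numbers are hypothetical.

```python
def effect_bounds_equal_attrition(delta_obs, p_missing,
                                  delta_missing_low, delta_missing_high):
    """Bounds on the overall effect when both arms lose the same share.

    Applies overall = (1 - p) * observed + p * missing at the assumed
    lower and upper values for the effect among missing individuals.
    """
    lower = (1 - p_missing) * delta_obs + p_missing * delta_missing_low
    upper = (1 - p_missing) * delta_obs + p_missing * delta_missing_high
    return lower, upper

# Hypothetical numbers: observed effect 0.30, 20% attrition in each arm,
# and an assumed effect among dropouts between -0.10 and 0.30.
lower, upper = effect_bounds_equal_attrition(0.30, 0.20, -0.10, 0.30)
print(f"overall effect between {lower:.2f} and {upper:.2f}")
```

The width of the interval is p times the width of the assumed range for the missing individuals, so low attrition keeps the bounds tight even when little is known about dropouts.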

42
Post Assignment Attrition
  • No estimate of the treatment effect is possible
    without an estimate of the treatment effect among
    the missing individuals
  • One possibility is to model (assume) that we know
    something about the treatment effect in the
    missing individuals
  • We can assume a range of values to get bounds on
    the possible treatment effect

43
Post Assignment Attrition
  • When the attrition rate is not the same in the
    treatment groups ($p_T \neq p_C$) the analysis is
    trickier
  • One idea is to convince ourselves that the
    treatment effect for those who drop out is the
    same as for those who do not

       Observed    Missing
       Mean        Mean
T      90          33
C      67          10
T-C    23          23
44
Post Assignment Attrition
  • This does not assure that attrition has not
    altered the treatment effect

       Observed    Missing
       Mean        Mean
T      90          33
C      67          10
T-C    23          23
45
Post Assignment Attrition
  • This does not assure that attrition has not
    altered the treatment effect
  • We have to know both $\mu_{TM}$ and $\mu_{CM}$ to identify the
    treatment effect; knowing $\delta_M = (\mu_{TM} - \mu_{CM})$ is not
    enough

       Observed        Missing         Total
       n      Mean     n      Mean     n      Mean
T      10     90       90     33       100    39
C      90     67       10     10       100    61
T-C           23              23              -23
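
The arithmetic behind this table can be reproduced directly: each arm's total mean is the average of its observed and missing means, weighted by the number of individuals in each group. The helper name `overall_mean` is made up for this sketch; the inputs are the values in the table.

```python
def overall_mean(n_obs, mean_obs, n_missing, mean_missing):
    """Mean over everyone randomized to one arm, observed or not."""
    total = n_obs * mean_obs + n_missing * mean_missing
    return total / (n_obs + n_missing)

treatment_total = overall_mean(10, 90, 90, 33)
control_total = overall_mean(90, 67, 10, 10)

effect_observed = 90 - 67          # effect among observed individuals
effect_missing = 33 - 10           # effect among missing individuals
effect_total = treatment_total - control_total

print(round(treatment_total, 1), round(control_total, 1), round(effect_total, 1))
```

The unrounded totals are 38.7 and 61.3, a difference of -22.6, which the table shows rounded to whole numbers. The effect is 23 in both the observed and the missing groups, yet the overall effect is negative because the two arms lose very different shares of their participants.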
46
Post Assignment Attrition
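  • Suppose that
  • $B^L_{TM}$ and $B^L_{CM}$ are lower bounds on the means for
    missing individuals in the treatment and control
    groups, and
  • $B^U_{TM}$ and $B^U_{CM}$ are the upper bounds
  • Then the upper and lower bounds on the treatment
    effect are
  • Lower: $[(1 - p_T)\mu_{TO} + p_T B^L_{TM}] - [(1 - p_C)\mu_{CO} + p_C B^U_{CM}]$
  • Upper: $[(1 - p_T)\mu_{TO} + p_T B^U_{TM}] - [(1 - p_C)\mu_{CO} + p_C B^L_{CM}]$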
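
A minimal sketch of this bounding idea follows, assuming the lower bound pairs the lowest plausible treatment mean with the highest plausible control mean, and the upper bound does the reverse. The function name `effect_bounds` is hypothetical, the observed values are borrowed from the earlier numerical example, and the assumption that missing means lie between 0 and 100 is invented for illustration.

```python
def effect_bounds(p_t, mean_t_obs, p_c, mean_c_obs,
                  t_low, t_high, c_low, c_high):
    """Bounds on the treatment effect for everyone randomized.

    p_t and p_c are the missing proportions in each arm; t_low, t_high,
    c_low, c_high are assumed bounds on the means of missing individuals.
    """
    def arm_mean(p_missing, mean_obs, mean_missing):
        return (1 - p_missing) * mean_obs + p_missing * mean_missing

    lower = arm_mean(p_t, mean_t_obs, t_low) - arm_mean(p_c, mean_c_obs, c_high)
    upper = arm_mean(p_t, mean_t_obs, t_high) - arm_mean(p_c, mean_c_obs, c_low)
    return lower, upper

# 90% missing in T (observed mean 90), 10% missing in C (observed mean 67),
# with missing means assumed to lie anywhere between 0 and 100.
lower, upper = effect_bounds(0.9, 90, 0.1, 67, 0, 100, 0, 100)
print(f"effect between {lower:.1f} and {upper:.1f}")
```

With attrition this heavy and bounds this loose, the interval spans both large negative and large positive effects; tighter assumptions about the missing means are what narrow it.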

47
Post Assignment Attrition
  • Note that none of the results on attrition
    involve sampling or estimation error
  • Results get more complex if we take this into
    account, but the basic ideas are those here

48
Conclusions
  • Many simple questions arise in connection with
    field experiments
  • The answers to these questions often require
    thinking through complex aspects of
  • the design
  • the inference model
  • assumptions about missing data
  • No correct answers are possible without
    recognizing these complexities