Sample Sizes for IE - PowerPoint PPT Presentation

Provided by: Lori2180
Learn more at: http://cega.berkeley.edu
Transcript and Presenter's Notes

Title: Sample Sizes for IE


1
Sample Sizes for IE
  • Power Calculations

2
Overview
  • General question: How large does the sample need
    to be to credibly detect a given effect size?
  • What does "credibly" mean here?
  • We can be reasonably sure that the difference
    between the treatment group and the comparison
    group is due to the program
  • Randomization removes bias, but it does not
    remove noise. To reduce noise, we need a large
    sample size. But how large is large?

3
Measuring Impact
  • At the end of an experiment, we will compare the
    outcome of interest in the treatment and the
    comparison groups.
  • We are interested in the difference:
  • Mean in treatment - Mean in control = Effect
    size
  • For example: mean malaria prevalence in
    villages with ITN distribution vs. mean
    malaria prevalence in villages with no ITNs
  • To draw conclusions from that effect size, we
    need it to be estimated with precision, since
    there is always variability in the data
  • If there are many other unobserved factors
    affecting outcomes, it is harder to say whether
    the treatment had an effect

4
Precise outcomes

5
Some noise

6
Very noisy

7
Confidence Intervals
  • We only work with data that are a sample of the
    population. In order to assess whether a result
    is valid for the entire population, we need a
    measure of reliability
  • A 95% confidence interval for an effect size is
    constructed so that, for 95% of the samples we
    could have drawn from the same population, the
    interval would contain the true effect.
  • The standard error (se) of the estimate in the
    sample captures both the size of the sample and
    the variability of the outcome
  • It is larger with a small sample and with a
    variable outcome
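
As a rough illustration of the standard error and confidence interval described on this slide, the sketch below computes both for a difference in means. The function name, the normal approximation, and all of the numbers are assumptions made for illustration; none of them come from the presentation.

```python
import math
from statistics import NormalDist


def diff_in_means_ci(mean_t, mean_c, sd_t, sd_c, n_t, n_c, level=0.95):
    """Return (effect, se, (low, high)) for a difference in means.

    Uses the unpooled standard error and a normal approximation,
    which is reasonable only when both groups are fairly large.
    """
    effect = mean_t - mean_c
    # The se shrinks as n grows and grows with the variability (sd).
    se = math.sqrt(sd_t ** 2 / n_t + sd_c ** 2 / n_c)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return effect, se, (effect - z * se, effect + z * se)


# Hypothetical example: malaria prevalence of 0.20 in ITN villages
# versus 0.30 in comparison villages, 400 people in each group.
effect, se, (low, high) = diff_in_means_ci(0.20, 0.30, 0.40, 0.46, 400, 400)
print(f"effect={effect:.3f}, se={se:.4f}, 95% CI=({low:.3f}, {high:.3f})")
```

Rerunning the example with a smaller n or a larger standard deviation widens the interval, which is the point made in the last bullet.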

8
Two Types of Errors
  • First type of error: concluding that there is an
    effect when in fact there is no effect.
  • The level of your test is the probability that
    you will falsely conclude that the program has
    an effect, when in fact it does not.
  • So with a level of 5%, you can be 95% confident
    in the validity of your conclusion that the
    program had an effect.
  • Common levels are 5%, 10%, and 1%
  • Rule of thumb: if the effect size is more
    than twice the standard error, you can conclude
    with more than 95% certainty that the program had
    an effect
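
The rule of thumb above can be sketched as a two-sided z-test. The helper name `is_significant` and the example numbers are hypothetical, and the normal approximation is an assumption that suits large samples.

```python
from statistics import NormalDist


def is_significant(effect, se, level=0.05):
    """Reject 'no effect' when |effect / se| exceeds the critical value.

    For level = 0.05 the critical value is about 1.96, which is
    where the "twice the standard error" rule of thumb comes from.
    """
    critical = NormalDist().inv_cdf(1 - level / 2)
    return abs(effect / se) > critical


# Hypothetical estimates: same effect, two different standard errors.
print(is_significant(0.10, 0.04))  # ratio of 2.5
print(is_significant(0.10, 0.06))  # ratio of about 1.67
```

A stricter level such as 1% raises the critical value, so the same estimate can be significant at 5% but not at 1%.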

9
Two Types of Errors
  • Second type of error: you fail to reject that the
    program had no effect, when in fact it does have
    an effect.
  • The power of a test is the probability of finding
    a significant effect in the RCT when the program
    truly has an effect
  • Only with a significant effect can you cleanly
    influence policy
  • Power calculations are a tool to see how likely
    we are to find a significant effect for a given
    sample size
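
A minimal sketch of such a calculation, assuming individual-level randomization, two arms of equal size, a common standard deviation, and a normal approximation. The function name and the inputs are illustrative, not part of the original slides.

```python
import math
from statistics import NormalDist


def power_two_arm(effect, sd, n_per_arm, level=0.05):
    """Approximate power of a two-sided test of a difference in means."""
    norm = NormalDist()
    se = sd * math.sqrt(2 / n_per_arm)       # se of the difference
    z_crit = norm.inv_cdf(1 - level / 2)     # about 1.96 at the 5% level
    shift = abs(effect) / se                 # true effect in se units
    # Probability that the estimate lands beyond either critical value.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical: true effect of 0.2 standard deviations.
for n in (100, 200, 400):
    print(n, round(power_two_arm(0.2, 1.0, n), 2))
```

When the true effect is zero, the function returns the significance level itself, which ties the two types of error together.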

10
What you Need for a Power Calculation
  • Significance level
    - This is often conventionally set at 5%.
    - At lower levels (less likely to find a false
      positive), we need a larger sample to detect
      the effect
  • Power level
    - A power level of 80% says that 80% of the time,
      if there is a true effect, you will be able to
      detect it in a given sample
    - Larger sample = more power
  • The mean and the variability of the outcome in
    the comparison group
    - From previous surveys conducted in similar
      settings
    - The larger the variability, the larger the
      sample needed for a given power
  • The effect size that we want to detect
    - What is the smallest effect that should prompt
      a policy response?
    - The smaller the expected effect size, the
      larger the sample size needed
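
The four ingredients above can be combined in the usual textbook approximation for two equal arms, n per arm = 2 (z_level + z_power)^2 sd^2 / effect^2. This is a sketch under a normal approximation with individual-level randomization; the function name and example values are assumptions, not the presenters' own calculation.

```python
import math
from statistics import NormalDist


def n_per_arm(effect, sd, level=0.05, power=0.80):
    """Approximate sample size per arm for a two-sided test."""
    norm = NormalDist()
    z_level = norm.inv_cdf(1 - level / 2)   # significance level
    z_power = norm.inv_cdf(power)           # desired power
    # More variability (sd) or a smaller effect both raise n.
    return math.ceil(2 * ((z_level + z_power) * sd / effect) ** 2)


# Hypothetical: halving the effect roughly quadruples the sample.
print(n_per_arm(effect=0.2, sd=1.0))
print(n_per_arm(effect=0.1, sd=1.0))
```
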
11
How to Determine Effect Size
  • What is the smallest effect that would justify
    adopting the program (in terms of
    cost-benefit)?
  • This sets the minimum effect size we would want
    to be able to test for
  • Common danger: using an effect size that is too
    optimistic leads to too small a sample size
  • How large an effect you can detect with a given
    sample depends on how variable the outcome is.
  • Example: If all children have very similar
    diarrhea prevalence without a program, a very
    small impact will be easy to detect
  • The standardized effect size is the effect size
    divided by the standard deviation of the outcome
  • Common effect sizes are 0.20 (small), 0.40
    (medium), 0.50 (large)
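
To make the link between sample size and detectable effect concrete, the sketch below computes a standardized effect size and the smallest standardized effect detectable with a given sample. It assumes two equal arms, individual-level randomization, and a normal approximation; the function names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def standardized_effect(effect, sd):
    """Effect size divided by the standard deviation of the outcome."""
    return effect / sd


def mde_standardized(n_per_arm, level=0.05, power=0.80):
    """Smallest standardized effect detectable with n_per_arm per group."""
    norm = NormalDist()
    multiplier = norm.inv_cdf(1 - level / 2) + norm.inv_cdf(power)
    return multiplier * math.sqrt(2 / n_per_arm)


# Hypothetical: a 5-point drop in prevalence when the sd is 25 points.
print(standardized_effect(0.05, 0.25))
for n in (100, 400, 1600):
    print(n, round(mde_standardized(n), 3))
```

Each fourfold increase in the sample halves the detectable effect, so planning around an overly optimistic effect size quickly leaves a study underpowered.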

12
Design Factors to Take into Account
  • Availability of a Baseline
  • A baseline can help reduce the needed sample size
    since it:
  • Removes some variability in the data, increasing
    precision
  • Can be used to stratify and create subgroups
  • The level of randomization
  • Whenever treatment occurs at a group level, this
    reduces power relative to randomization at the
    individual level
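
One common way to quantify the gain from a baseline: if baseline covariates explain a share R-squared of the outcome's variance, the residual variance, and so the required sample, shrinks by roughly a factor of (1 - R-squared). The sketch below applies that approximation; the function name and the R-squared values are assumptions for illustration.

```python
import math


def n_with_baseline(n_without_baseline, r_squared):
    """Approximate sample size once a baseline covariate is controlled for.

    r_squared is the share of outcome variance explained by the
    baseline measure (0 means the baseline does not help at all).
    """
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    return math.ceil(n_without_baseline * (1 - r_squared))


# Hypothetical: 393 per arm is needed without a baseline.
for r2 in (0.0, 0.25, 0.5):
    print(r2, n_with_baseline(393, r2))
```
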

13
Cluster (Group) Randomization
  • Rural Water Project, Water Guard: individual
  • Rural Water Project, Spring Improvement: village
  • Community-based Monitoring in Uganda: village
  • HIV/AIDS Education: school

14
Implications from Group Design
  • The outcomes for all the individuals within a
    unit may be correlated
  • All villagers are affected by spring improvements
    at the same time
  • All students at a school with trained teachers
    may have benefited from the information
  • The sample size needs to be adjusted for this
    correlation
  • The more correlation within the group, the more
    we need to adjust the standard errors
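
The adjustment is often summarized by the design effect, 1 + (m - 1) * rho, where m is the number of individuals per group and rho is the intra-cluster correlation. The sketch below assumes equal-sized groups; the function names and the example values of rho are hypothetical.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from randomizing groups of cluster_size."""
    return 1 + (cluster_size - 1) * icc


def se_inflation(cluster_size, icc):
    """Factor by which standard errors grow relative to no clustering."""
    return math.sqrt(design_effect(cluster_size, icc))


def clustered_sample_size(n_individual, cluster_size, icc):
    """Sample per arm needed once within-group correlation is allowed for."""
    return math.ceil(n_individual * design_effect(cluster_size, icc))


# Hypothetical: 393 per arm under individual randomization,
# villages of 20 people, and increasing correlation within villages.
for rho in (0.0, 0.05, 0.20):
    print(rho, clustered_sample_size(393, 20, rho))
```
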

15
Implications
  • It is extremely important to randomize an
    adequate number of groups.
  • Typically the number of individuals within groups
    matters less than the number of groups
  • Big increases in power usually only happen when
    the number of groups that are randomized increases
  • If you randomize at the level of the district,
    with one treated district and one control
    district, you have 2 observations!
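
The comparison between adding individuals and adding groups can be sketched with the minimum detectable effect under clustering. The code assumes equal-sized groups, two equal arms, and a normal approximation, which is optimistic when there are only a handful of groups; the function name and inputs are illustrative.

```python
import math
from statistics import NormalDist


def mde_clustered(clusters_per_arm, cluster_size, icc, level=0.05, power=0.80):
    """Minimum detectable standardized effect with group randomization."""
    norm = NormalDist()
    multiplier = norm.inv_cdf(1 - level / 2) + norm.inv_cdf(power)
    deff = 1 + (cluster_size - 1) * icc          # design effect
    n = clusters_per_arm * cluster_size          # individuals per arm
    return multiplier * math.sqrt(2 * deff / n)


# Hypothetical starting point: 20 villages per arm, 20 people each.
base = mde_clustered(20, 20, icc=0.10)
more_people = mde_clustered(20, 40, icc=0.10)    # double people per village
more_groups = mde_clustered(40, 20, icc=0.10)    # double the villages
print(round(base, 3), round(more_people, 3), round(more_groups, 3))
```

Both options double the number of individuals, but doubling the number of villages lowers the detectable effect far more than doubling the people surveyed in each village.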

16
Conclusions
  • Power calculations involve some guesswork
  • Sometimes we do not have the right information to
    conduct them properly
  • However, it is important to do them in order to:
  • Avoid launching studies that will have no power
    at all (a waste of time and money)
  • Devote the appropriate resources to the
    studies that you decide to conduct (and not too
    much)
  • Determine, if you have a fixed budget, whether
    the project is feasible at all
  • Software: http://sitemaker.umich.edu/group-based/optimal_design_software