Module 3: Impact Evaluation for TTLs


Slides: 80

Provided by: WB1673


Transcript and Presenter's Notes

1
Module 3: Impact Evaluation for TTLs

Paul J. Gertler
Chief Economist, HDN
Sebastian Martinez
Impact Evaluation Cluster, AFTRL
HD Learning Week
Washington DC
November 2006

Slides by Paul Gertler and Sebastian Martinez
2
Measuring Impact

What makes a good impact evaluation?

3
Motivation

Traditional M&E
Is the program being implemented as designed?
Could the operations be more efficient?
Are the benefits getting to those intended?
Monitoring trends
Are indicators moving in the right direction?
→ NO inherent causality
Impact Evaluation
What was the effect of the program on outcomes?
Because of the program, are people better off?
What would happen if we changed the program?
→ Causality

4
Motivation

Objective in evaluation is to estimate the CAUSAL
effect of intervention X on outcome Y
What is the effect of a cash transfer on
household consumption?
For causal inference we must understand the data
generation process
For impact evaluation, this means understanding
the behavioral process that generates the data
how benefits are assigned

5
Causation versus Correlation

Recall: correlation is NOT causation
Necessary but not sufficient condition
Correlation: X and Y are related
A change in X is related to a change in Y
And
A change in Y is related to a change in X
Causation: if we change X, how much does Y change?
A change in X is related to a change in Y
Not necessarily the other way around

6
Causation versus Correlation

Three criteria for causation
Independent variable precedes the dependent
variable.
Independent variable is related to the dependent
variable.
There are no third variables that could explain
why the independent variable is related to the
dependent variable
External validity
Generalizability: the ability of the causal inference
to generalize outside the sample population or setting

7
Motivation

The word cause is not in the vocabulary of
standard probability theory.
Probability theory: two events are mutually
correlated, or dependent → if we find one, we can
expect to encounter the other.
Example: age and income
For impact evaluation, we supplement the language
of probability with a vocabulary for causality.

8
Statistical Analysis & Impact Evaluation

Statistical analysis: typically involves
inferring the causal relationship between X and Y
from observational data
Many challenges, complex statistics
Impact Evaluation
Retrospectively:
same challenges as statistical analysis
Prospectively:
we generate the data ourselves through the
program's design → evaluation design
makes things much easier!

9
How to assess impact

What is the effect of a cash transfer on
household consumption?
Formally, program impact is
$\alpha = (Y \mid P=1) - (Y \mid P=0)$
Compare the same individual with and without the program
at the same point in time
So what's the problem?

10
Solving the evaluation problem

Problem: we never observe the same individual
with and without the program at the same point in time
Need to estimate what would have happened to the
beneficiary if he or she had not received
benefits
Counterfactual: what would have
happened without the program
Difference between treated observation and
counterfactual is the estimated impact

11
Finding a good counterfactual

The treated observation and the counterfactual
have identical factors/characteristics, except
for benefiting from the intervention
No other explanations for differences in outcomes
between the treated observation and
counterfactual
The only reason for the difference in
outcomes is due to the intervention

12
Measuring Impact

Tool belt of Impact Evaluation Design Options
Randomized Experiments
Quasi-experiments
Regression Discontinuity
Difference in difference (panel data)
Other (using Instrumental Variables, matching,
etc.)
In all cases, these will involve knowing the rule
for assigning treatment

13
Choosing your design

For impact evaluation, we will identify the
best possible design given the operational
context
Best possible design is the one that has the
fewest risks for contamination
Omitted Variables (biased estimates)
Selection (results not generalizable)

14
Case Study

Effect of cash transfers on consumption
Estimate impact of cash transfer on consumption
per capita
Make sure
Cash transfer comes before change in consumption
Cash transfer is correlated with consumption
Cash transfer is the only thing changing
consumption
Example based on Oportunidades

15
Oportunidades

National anti-poverty program in Mexico (1997)
Cash transfers and in-kind benefits conditional
on school attendance and health care visits.
Transfer given preferably to mother of
beneficiary children.
Large program with large transfers
5 million beneficiary households in 2004
Large transfers, capped at
95 USD for HH with children through junior high
159 USD for HH with children in high school

16
Oportunidades Evaluation

Phasing in of intervention
50,000 eligible rural communities
Random sample of 506 eligible communities in 7
states (the evaluation sample)
Random assignment of benefits by community
320 treatment communities (14,446 households)
First transfers distributed April 1998
186 control communities (9,630 households)
First transfers November 1999

17
Oportunidades Example
18
Counterfeit Counterfactual Number 1

Before and after
Assume we have data on
Treatment households before the cash transfer
Treatment households after the cash transfer
Estimate impact of cash transfer on household
consumption
Compare consumption per capita before the
intervention to consumption per capita after the
intervention
Difference in consumption per capita between the
two periods is attributed to the treatment

19
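Case 1: Before and After

Compare Y before and after intervention
$\alpha_i = (CPC_{i,t} \mid T=1) - (CPC_{i,t-1} \mid T=0)$
Estimate of counterfactual
$(CPC_{i,t} \mid T=0) = (CPC_{i,t-1} \mid T=0)$
Impact = A - B

[Figure: CPC (vertical axis) over Time (horizontal axis); point B is "Before" at t-1, point A is "After" at t]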
20
Case 1: Before and After
21
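Case 1: Before and After

Compare Y before and after intervention
$\alpha_i = (CPC_{i,t} \mid T=1) - (CPC_{i,t-1} \mid T=0)$
Estimate of counterfactual
$(CPC_{i,t} \mid T=0) = (CPC_{i,t-1} \mid T=0)$
Impact = A - B
Does not control for time-varying factors
Recession: Impact = A - C
Boom: Impact = A - D

[Figure: CPC (vertical axis) over Time (horizontal axis); point B is "Before" at t-1, point A is "After" at t; C? and D? mark possible counterfactual values at t]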
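
A minimal numeric sketch of why A - B can mislead. All numbers are hypothetical (they are not Oportunidades data): we assume a true impact of 30 and shift the unobserved counterfactual at time t down (recession, point C) or up (boom, point D).

```python
# Hypothetical numbers, chosen only to illustrate before-after bias.
cpc_before = 200.0     # B: consumption per capita at t-1, no transfer
true_impact = 30.0     # assumed true effect of the transfer
time_shocks = {"no change": 0.0, "recession": -20.0, "boom": 25.0}


def before_after_estimate(shock):
    """Return (naive A - B estimate, true impact) for a given time shock."""
    counterfactual_at_t = cpc_before + shock        # CPC at t without the program (unobserved)
    cpc_after = counterfactual_at_t + true_impact   # A: observed CPC at t with the program
    naive = cpc_after - cpc_before                  # A - B
    truth = cpc_after - counterfactual_at_t         # A - C (recession) or A - D (boom)
    return naive, truth


for name, shock in time_shocks.items():
    naive, truth = before_after_estimate(shock)
    print(f"{name:10s} naive A-B = {naive:5.1f}   true impact = {truth:5.1f}")
```

In this sketch the naive estimate equals the true impact only when nothing else changes over time; it understates the impact in the recession scenario and overstates it in the boom scenario.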
22
Counterfeit Counterfactual Number 2

Enrolled/Not Enrolled
Voluntary enrollment in the program
Assume we have a cross-section of
post-intervention data on
Households that did not enroll
Households that enrolled
Estimate impact of cash transfer on household
consumption
Compare consumption per capita of those who did
not enroll to consumption per capita of those who
enrolled
Difference in consumption per capita between the
two groups is attributed to the treatment

23
Case 2: Enrolled/Not Enrolled
24
Those who did not enroll

Impact estimate: $\alpha_i = (Y_{i,t} \mid P=1) - (Y_{j,t} \mid P=0)$
Counterfactual: $(Y_{j,t} \mid P=0) \neq (Y_{i,t} \mid P=0)$
Examples
Those who choose not to enroll in program
Those who were not offered the program
Conditional Cash Transfer
Job Training program
Cannot control for all the reasons why some chose to
sign up and others didn't
Reasons could be correlated with outcomes
We can control for observables...
But we are still left with the unobservables
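
A small simulation sketch of the unobservables problem, under made-up assumptions: an unobserved trait (called motivation here, a hypothetical name) raises both the chance of enrolling and the outcome itself. The true impact is set to 10 by construction, yet the enrolled/not enrolled comparison comes out much larger.

```python
import random
from statistics import mean

rng = random.Random(0)
TRUE_IMPACT = 10.0   # assumed program effect, fixed by construction

enrolled, not_enrolled = [], []
for _ in range(20000):
    motivation = rng.gauss(0, 1)                        # unobserved by the evaluator
    baseline = 100 + 15 * motivation + rng.gauss(0, 5)  # outcome without the program
    # Self-selection: more motivated households are more likely to sign up.
    enrolls = motivation + rng.gauss(0, 1) > 0
    if enrolls:
        enrolled.append(baseline + TRUE_IMPACT)
    else:
        not_enrolled.append(baseline)

naive_estimate = mean(enrolled) - mean(not_enrolled)
print(f"true impact:    {TRUE_IMPACT:.1f}")
print(f"naive estimate: {naive_estimate:.1f}")
```

The gap between the two printed numbers is selection bias: the groups already differed before the program, for reasons the evaluator cannot observe.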

25
Impact Evaluation Example: Two counterfeit
counterfactuals

What is going on??
Which of these do we believe?
Problem with Before-After
Cannot control for other time-varying factors
Problem with Enrolled-Not Enrolled
Do not know why the treated are treated and the
others not

26
Possible Solutions

We need to understand the data generation process
How beneficiaries are selected and how benefits
are assigned
Guarantee comparability of treatment and control
groups, so ONLY difference is the intervention

27
Measuring Impact

Experimental design/randomization
Quasi-experiments
Regression Discontinuity
Double differences (diff in diff)
Other options

28
Choosing the methodology..

Choose the most robust strategy that fits the
operational context
Use program budget and capacity constraints to
choose a design, e.g. pipeline
Universe of eligible individuals typically larger
than available resources at a single point in
time
Fairest and most transparent way to assign
benefit may be to give all an equal chance of
participating → randomization

29
Randomization

The gold standard in impact evaluation
Give each eligible unit the same chance of
receiving treatment
Lottery for who receives benefit
Lottery for who receives benefit first

30
[Diagram: Population → (randomization) → Sample → (randomization) → Treatment Group and Control Group]
31
External & Internal Validity

The purpose of the first stage is to ensure that
the results in the sample will represent the
results in the population within a defined level
of sampling error (external validity).
The purpose of the second stage is to ensure that
the observed effect on the dependent variable is
due to some aspect of the treatment rather than
other confounding factors (internal validity).

32
Case 3: Randomization

Randomized treatment/controls
Community level randomization
320 treatment communities
186 control communities
Pre-intervention characteristics well balanced
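
As a rough sketch, the two randomization stages can be mimicked in a few lines. The community counts (50,000 eligible, 506 sampled, 320 treatment, 186 control) come from the slides; the baseline consumption values are simulated and purely hypothetical.

```python
import random
from statistics import mean

rng = random.Random(42)

# Hypothetical population: one simulated baseline characteristic per community.
population = [{"id": i, "baseline_cpc": rng.gauss(230, 40)} for i in range(50_000)]

# Stage 1: random evaluation sample (supports external validity).
sample = rng.sample(population, 506)

# Stage 2: random assignment within the sample (supports internal validity).
rng.shuffle(sample)
treatment, control = sample[:320], sample[320:]


def avg(group):
    """Mean baseline consumption per capita of a group of communities."""
    return mean(c["baseline_cpc"] for c in group)


print(f"population mean: {avg(population):.1f}")
print(f"treatment mean:  {avg(treatment):.1f}  (n={len(treatment)})")
print(f"control mean:    {avg(control):.1f}  (n={len(control)})")
```

Because assignment is by lottery, the treatment and control means should be close at baseline, which is the balance check the slide refers to.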

33
Baseline characteristics
34
Case 3: Randomization
35
Impact Evaluation Example: No Design vs.
Randomization
36
Measuring Impact

Experimental design/randomization
Quasi-experiments
Regression Discontinuity
Double differences (diff in diff)
Other options

37
Case 4: Regression Discontinuity

Assignment to treatment is based on a clearly
defined index or parameter with a known cutoff
for eligibility
RD is possible when units can be ordered along a
quantifiable dimension which is systematically
related to the assignment of treatment
The effect is measured at the discontinuity:
the estimated impact around the cutoff may not
generalize to the entire population

38
Indexes are common in targeting of social programs

Anti-poverty programs → targeted to households
below a given poverty index
Pension programs → targeted to population above a
certain age
Scholarships → targeted to students with high
scores on standardized test
CDD Programs → awarded to NGOs that achieve
highest scores

39
Example: effect of cash transfer on consumption

Target transfer to poorest households
Construct poverty index from 1 to 100 with
pre-intervention characteristics
Households with a score < 50 are poor
Households with a score > 50 are non-poor
Cash transfer to poor households
Measure outcomes (e.g. consumption) before and
after transfer
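
A sketch of the regression discontinuity estimate for this example, on simulated data. The cutoff of 50 comes from the slide; the bandwidth, the linear relationship between score and consumption, and the true impact of 8 are assumptions made only for illustration. A separate line is fitted on each side of the cutoff and the two fitted values are compared at the cutoff.

```python
import random


def ols_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


rng = random.Random(1)
CUTOFF, BANDWIDTH, TRUE_IMPACT = 50.0, 15.0, 8.0   # bandwidth and impact are assumed

data = []
for _ in range(20000):
    score = rng.uniform(1, 100)                     # poverty index
    treated = score < CUTOFF                        # poor households get the transfer
    consumption = 60 + 0.4 * score + TRUE_IMPACT * treated + rng.gauss(0, 5)
    data.append((score, consumption))

# Keep only households close to the cutoff, split by side.
below = [(s, y) for s, y in data if CUTOFF - BANDWIDTH <= s < CUTOFF]
above = [(s, y) for s, y in data if CUTOFF < s <= CUTOFF + BANDWIDTH]

a_lo, b_lo = ols_line(*zip(*below))
a_hi, b_hi = ols_line(*zip(*above))

# Jump in the fitted outcome at the cutoff = estimated local treatment effect.
rd_estimate = (a_lo + b_lo * CUTOFF) - (a_hi + b_hi * CUTOFF)
print(f"RD estimate at the cutoff: {rd_estimate:.2f} (true impact {TRUE_IMPACT})")
```

The estimate uses only households within the bandwidth, so it describes the effect near the cutoff rather than for the whole population.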

40
(No Transcript)
41
Non-Poor
Poor
42
(No Transcript)
43

Treatment Effect
44
Case 4: Regression Discontinuity

Oportunidades assigned benefits based on a
poverty index
Where
Treatment = 1 if score < 750
Treatment = 0 if score > 750

45
Case 4: Regression Discontinuity
Baseline: No treatment
46
Case 4: Regression Discontinuity
Treatment Period
47
Potential Disadvantages of RD

Local average treatment effects: not always
generalizable
Power: the effect is estimated at the discontinuity,
so we generally have fewer observations than in a
randomized experiment with the same sample size
Specification can be sensitive to functional
form: make sure the relationship between the
assignment variable and the outcome variable is
correctly modeled, including
Nonlinear relationships
Interactions

48
Advantages of RD for Evaluation

RD yields an unbiased estimate of treatment
effect at the discontinuity
Can often take advantage of a known rule for
assigning the benefit; such rules are common in the
design of social policy
No need to exclude a group of eligible
households/individuals from treatment

49
Measuring Impact

Experimental design/randomization
Quasi-experiments
Regression Discontinuity
Double differences (Diff in diff)
Other options

50
Case 5: Diff in diff

Compare change in outcomes between treatments and
non-treatment
Impact is the difference in the change in
outcomes
$\text{Impact} = (Y_{t1} - Y_{t0}) - (Y_{c1} - Y_{c0})$
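
A minimal worked sketch of the formula, using a handful of hypothetical household values (not real survey data). Here t and c denote treatment and control, and 0 and 1 denote before and after.

```python
from statistics import mean

# Hypothetical consumption per capita for a few households in each group.
treat_before = [210.0, 230.0, 250.0]
treat_after = [245.0, 262.0, 288.0]
control_before = [205.0, 235.0, 250.0]
control_after = [215.0, 243.0, 262.0]

change_treat = mean(treat_after) - mean(treat_before)        # Yt1 - Yt0
change_control = mean(control_after) - mean(control_before)  # Yc1 - Yc0

did_estimate = change_treat - change_control

print(f"change in treatment group: {change_treat:.1f}")
print(f"change in control group:   {change_control:.1f}")
print(f"diff-in-diff estimate:     {did_estimate:.1f}")
```

The change in the control group stands in for what would have happened to the treatment group over the same period, so subtracting it removes the common time trend that a simple before-after comparison would wrongly count as impact.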

51
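[Figure: Treatment Group and Control Group]
52
[Figure: Outcome (vertical axis) over Time (horizontal axis) for the Treatment Group and Control Group, marking the start of Treatment, the Average Treatment Effect, and the Estimated Average Treatment Effect]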
53
Diff in diff

Fundamental assumption: trends (slopes) are
the same in treatments and controls
Need a minimum of three points in time (two
pre-intervention) to verify this and estimate the
treatment effect

54
Case 5: Diff in Diff
55
Impact Evaluation Example: Summary of Results
56
Measuring Impact

Experimental design/randomization
Quasi-experiments
Regression Discontinuity
Double differences (Diff in diff)
Other options
Instrumental Variables
Matching

57
Other options for Impact Evaluation

There are a few others out there
Common scenario:
Voluntary enrollment in program
Can't control who enrolls and who does not
Possible solution: random promotion or incentives
into the program
Information
Money
Other help/incentives

58
Random Promotion

Those who get promotion are more likely to enroll
But who got promotion was determined randomly, so it is
not correlated with other observables/non-observables
Compare average outcomes of two groups:
promoted/not promoted
Effect of offering the program (ITT)
Effect of the intervention (TOT)
TOT = effect of offering program / proportion of
those who took up (difference in take-up between promoted and not promoted)

59
Example: Community Based School Management

Chaudhury, Gertler, Vermeersch (work in progress)
Estimate effect of decentralization of school
management on learning outcomes
Grant for funding of community based management
Community management of hiring, budgeting,
oversight
1500 schools in the evaluation
Each community chooses whether to participate in
program
Community submits proposal for program
participation

60
Evaluation Design

Community based school management
Provision of technical assistance and training by
NGOs for submission of grant application
Random selection of communities with NGO support
Random promotion is an Instrumental Variable

61
Technique called Instrumental Variables

Some fancy statistics
Find a variable Z which satisfies two conditions:
Correlated with T: $\operatorname{corr}(Z, T) \neq 0$
Uncorrelated with $\varepsilon$: $\operatorname{corr}(Z, \varepsilon) = 0$
Z is the random promotion in our example

62
Indirect least squares: Case 1

                 Promotion   No-Promotion   Change
Takeup (T)       0.5         0              0.5
Test Score (S)   100         80             20

63
Indirect least squares: Case 2

                 Promotion   No-Promotion   Change
Takeup (T)       0.8         0.3            0.5
Test Score (S)   100         90             10
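
The two cases above can be worked through as simple arithmetic: the change in test score (the ITT) divided by the change in take-up gives the TOT. The sketch below uses the numbers from the two tables; the function name wald_tot is our own label, not something from the slides.

```python
def wald_tot(score_promoted, score_not_promoted, takeup_promoted, takeup_not_promoted):
    """Return (ITT, TOT) from group averages under random promotion."""
    itt = score_promoted - score_not_promoted            # effect of offering promotion
    takeup_change = takeup_promoted - takeup_not_promoted
    return itt, itt / takeup_change                      # TOT = ITT / change in take-up


case1_itt, case1_tot = wald_tot(100, 80, 0.5, 0.0)
case2_itt, case2_tot = wald_tot(100, 90, 0.8, 0.3)

print(f"Case 1: ITT = {case1_itt:.0f}, TOT = {case1_tot:.0f}")
print(f"Case 2: ITT = {case2_itt:.0f}, TOT = {case2_tot:.0f}")
```

In Case 1 the score change of 20 comes from half of the promoted group taking up the program, so the implied effect on those who took it up is 40. In Case 2 the same change in take-up produces a score change of 10, implying an effect of 20.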

64
Two Stage Least Squares (2SLS)

Model with endogenous Treatment (T)
Stage 1: Regress endogenous variable on the IV
(Z) and other exogenous regressors
Calculate predicted value for each observation:
$\hat{T}$

65
Two Stage Least Squares (2SLS)

Stage 2: Regress outcome y on predicted variable
(and other exogenous variables)
Need to correct standard errors (they are based
on $\hat{T}$ rather than T)
In practice just use Stata: ivreg
Intuition: T has been "cleaned" of its
correlation with $\varepsilon$.
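
To make the two stages concrete, here is a hand-rolled sketch on simulated data with a single instrument and no other regressors. Everything in the data-generating process is an assumption for illustration: Z is a random promotion, an unobserved factor raises both take-up and the outcome, and the true effect of T is set to 40. As the slide notes, standard errors from a manual second stage are not valid, so only point estimates are shown.

```python
import random


def ols_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


rng = random.Random(7)
TRUE_EFFECT = 40.0   # assumed effect of treatment on the outcome

Z, T, Y = [], [], []
for _ in range(20000):
    z = 1.0 if rng.random() < 0.5 else 0.0            # random promotion
    unobserved = rng.gauss(0, 1)                      # affects both take-up and outcome
    t = 1.0 if z + 0.5 * unobserved + rng.gauss(0, 1) > 0.5 else 0.0
    y = 80 + TRUE_EFFECT * t + 10 * unobserved + rng.gauss(0, 5)
    Z.append(z)
    T.append(t)
    Y.append(y)

# Stage 1: regress T on Z and compute the predicted values T_hat.
a1, b1 = ols_line(Z, T)
T_hat = [a1 + b1 * z for z in Z]

# Stage 2: regress Y on T_hat; the slope is the 2SLS point estimate.
_, tsls_estimate = ols_line(T_hat, Y)

# For comparison: naive OLS of Y on the endogenous T.
_, ols_estimate = ols_line(T, Y)

print(f"naive OLS: {ols_estimate:.1f}   2SLS: {tsls_estimate:.1f}   truth: {TRUE_EFFECT}")
```

In this simulation the naive regression is pulled upward by the unobserved factor, while the 2SLS estimate stays close to the assumed true effect.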

66
Instrumental Variables

A variable correlated with treatment but nothing
else (e.g. random promotion)
Again, we really just need to understand how the
data are generated
Don't have to exclude anyone

67
Case 6: IV

Estimate TOT effect of Oportunidades on
consumption
Run 2SLS regression

68
Measuring Impact

Experimental design/randomization
Quasi-experiments
Regression Discontinuity
Double differences (Diff in diff)
Other options
Instrumental Variables
Matching

69
Matching

Pick out the ideal comparison group that matches the
treatment group from a larger survey.
The matches are selected
on the basis of
similarities in observed characteristics
This assumes no selection bias based on
unobservable characteristics.
Source: Martin Ravallion

70
Propensity-Score Matching (PSM)

Controls: non-participants with the same
characteristics as participants
In practice, it is very hard. The entire vector
of X observed characteristics could be huge.
Rosenbaum and Rubin: match on the basis of the
propensity score
$P(X_i) = \Pr(D_i = 1 \mid X)$
Instead of aiming to ensure that the matched
control for each participant has exactly the same
value of X, same result can be achieved by
matching on the probability of participation.
This assumes that participation is independent of
outcomes given X.

71
Steps in Score Matching

Representative, highly comparable surveys of
non-participants and participants.
Pool the two samples and estimate a logit (or
probit) model of program participation.
Restrict samples to assure common support
(lack of common support is an important source of
bias in observational studies)
For each participant find a sample of
non-participants that have similar propensity
scores
Compare the outcome indicators. The difference is
the estimate of the gain due to the program for
that observation.
Calculate the mean of these individual gains to
obtain the average overall gain.
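
The steps above can be sketched end to end on simulated data. This is an illustration under strong assumptions, not a production implementation: there is a single observed characteristic x, participation depends only on x, the true gain is set to 10, the logit is fitted with a small hand-written Newton-Raphson routine, and each participant is matched to the single nearest non-participant (with replacement).

```python
import bisect
import math
import random
from statistics import mean


def fit_logit(xs, ds, iterations=25):
    """Newton-Raphson fit of Pr(D=1|x) = 1/(1+exp(-(a+b*x))); returns (a, b)."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, d in zip(xs, ds):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            w = p * (1.0 - p)
            g0 += d - p
            g1 += (d - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


rng = random.Random(3)
TRUE_IMPACT = 10.0   # assumed gain from the program

# Step 1: pooled (simulated) survey of participants and non-participants.
rows = []
for _ in range(4000):
    x = rng.gauss(0, 1)                                   # observed characteristic
    d = 1 if rng.random() < 1 / (1 + math.exp(0.5 - x)) else 0
    y = 50 + 5 * x + TRUE_IMPACT * d + rng.gauss(0, 2)
    rows.append((x, d, y))

# Step 2: estimate the participation model and compute propensity scores.
a, b = fit_logit([r[0] for r in rows], [r[1] for r in rows])
scored = [(1 / (1 + math.exp(-(a + b * x))), d, y) for x, d, y in rows]

treated = [(p, y) for p, d, y in scored if d == 1]
controls = sorted((p, y) for p, d, y in scored if d == 0)
control_scores = [p for p, _ in controls]

# Step 3: restrict participants to the region of common support.
lo = max(min(p for p, _ in treated), control_scores[0])
hi = min(max(p for p, _ in treated), control_scores[-1])
treated = [(p, y) for p, y in treated if lo <= p <= hi]


def nearest_control(p):
    """Step 4: non-participant whose propensity score is closest to p."""
    i = bisect.bisect_left(control_scores, p)
    candidates = controls[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda c: abs(c[0] - p))


# Steps 5 and 6: individual gains, then their mean.
gains = [y - nearest_control(p)[1] for p, y in treated]
psm_estimate = mean(gains)
naive_estimate = mean(y for _, d, y in scored if d == 1) - mean(
    y for _, d, y in scored if d == 0
)
print(f"naive difference: {naive_estimate:.1f}   matched estimate: {psm_estimate:.1f}")
```

The matched estimate lands near the assumed gain here only because participation was simulated to depend on observables alone; with selection on unobservables, matching would not remove the bias.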

72
[Figure: Density of propensity scores for participants; horizontal axis is the propensity score from 0 to 1, with the region of common support marked]
73
PSM vs an experiment

Pure experiment does not require the untestable
assumption of independence conditional on
observables
PSM requires large samples and good data

74
Lessons on Matching Methods

Typically used when randomization, RD, and
other quasi-experimental options are not possible
(e.g. no baseline)
Be cautious of ex-post matching
Matching on endogenous variables
Matching helps control for OBSERVABLE
heterogeneity
Matching at baseline can be very useful
Estimation
combine with other techniques (e.g. diff in diff)
Know the assignment rule (match on this rule)
Sampling
selecting non-randomized evaluation samples
Need good quality data
Common support can be a problem

75
Case 7: Matching
76
Case 7: Matching
77
Impact Evaluation Example: Summary of Results
78
Measuring Impact

Experimental design/randomization
Quasi-experiments
Regression Discontinuity
Double differences (Diff in diff)
Other options
Instrumental Variables
Matching
Combinations of the above

79
Remember..

Objective of impact evaluation is to estimate the
CAUSAL effect of a program on outcomes of
interest
In designing the program we must understand the
data generation process
behavioral process that generates the data
how benefits are assigned
Fit the best evaluation design to the operational
context
