Title: Matching Methods
1Matching Methods Propensity Scores
- Kenny Ajayi
- October 27, 2008
Global Poverty and Impact Evaluation
2Program Evaluation Methods
- Randomization (Experiments)
- Quasi-Experiments
- Regression Discontinuity
- Matching, Propensity Score
- Difference-in-Differences
3Matching Methods
- Creating a counterfactual
- To measure the effect of a program, we want to measure
  - $E[Y \mid D = 1, X] - E[Y \mid D = 0, X]$
- but we only observe one of these outcomes for each individual.
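A minimal sketch of why the missing counterfactual matters. The four people and their outcomes below are hypothetical; `y0` and `y1` stand for a person's outcome without and with the program, and only one of the two is ever observed.

```python
# Hypothetical toy data: each person has two potential outcomes,
# but the evaluator only ever sees the one that matches d.
people = [
    {"y0": 10.0, "y1": 12.0, "d": 1},
    {"y0": 11.0, "y1": 13.0, "d": 1},
    {"y0": 20.0, "y1": 22.0, "d": 0},
    {"y0": 21.0, "y1": 23.0, "d": 0},
]


def observed(person):
    """Return the single outcome that is actually observed."""
    return person["y1"] if person["d"] == 1 else person["y0"]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


treated = [p for p in people if p["d"] == 1]
untreated = [p for p in people if p["d"] == 0]

# True gain for participants. This uses y0 of participants,
# which is never available in real data.
true_gain = mean(p["y1"] - p["y0"] for p in treated)

# Naive comparison of observed means: it mixes the program effect
# with pre-existing differences between the two groups.
naive_gap = mean(observed(p) for p in treated) - mean(observed(p) for p in untreated)

print(true_gain, naive_gap)
```

In this made-up example the program helps every participant, yet the naive comparison has the opposite sign because participants started out worse off.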
4Evaluation Exercise
- Argentine Antipoverty Program
5Basic Idea
- Match each participant (treated) with one or more nonparticipants (untreated) with similar observed characteristics
- Counterfactual: matched comparison group (i.e., nonparticipants with the same characteristics as participants)
- Illustrate with an example
6Basic Idea
- This assumes that there is no selection bias based on unobserved characteristics
  - i.e., there is selection on observables, and participation is independent of outcomes once we control for observable characteristics (X)
- What might some of these unobserved characteristics be?
7Propensity Score
- When the set of observed variables is large, we match participants with nonparticipants using a summary measure
- The propensity score: the probability of participating in the program (being treated), as a function of the individual's observed characteristics
  - $P(X) = \Pr(D = 1 \mid X)$
  - D indicates participation in the project
  - X is the set of observable characteristics
8Propensity Score
- We maintain the assumption of selection on observables
  - i.e., assume that participation is independent of outcomes conditional on $X_i$
  - $E[Y \mid X, D = 1] = E[Y \mid X, D = 0]$ if there had not been a program
- This is false if there are unobserved characteristics affecting participation
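Under this assumption, the matched comparison identifies the average gain for participants with characteristics X. A short derivation follows, using potential-outcome notation that is an assumption of this note rather than of the slides: $Y_1$ is the outcome with the program, $Y_0$ the outcome without it, and the observed outcome is $Y = D\,Y_1 + (1 - D)\,Y_0$.

$$
\begin{aligned}
E[Y_1 - Y_0 \mid X, D = 1] &= E[Y_1 \mid X, D = 1] - E[Y_0 \mid X, D = 1] \\
&= E[Y_1 \mid X, D = 1] - E[Y_0 \mid X, D = 0] \\
&= E[Y \mid X, D = 1] - E[Y \mid X, D = 0]
\end{aligned}
$$

The second line is where selection on observables is used: the unobserved no-program outcome of participants is replaced by the no-program outcome of nonparticipants with the same X. The third line only substitutes observed outcomes.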
9Evaluation Exercise
- Argentine Antipoverty Program
10Propensity Score Matching
- Get representative and comparable data on participants and nonparticipants (ideally using the same survey and a similar time period)
11Propensity Score Matching
- Estimate the probability of program participation as a function of observable characteristics (using a logit or other discrete choice model)
12Jalan and Ravallion (2003)
14Propensity Score Matching
- Use predicted values from estimation to generate the propensity score $p(x_i)$ for all treatment and comparison group members
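The estimation step can be sketched as follows. This is an illustrative logit fitted by plain gradient ascent, not the procedure used in any particular study; the data, the single characteristic, and the function names `fit_logit` and `propensity_scores` are all hypothetical. In practice one would normally use a statistics package.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logit(X, d, steps=5000, lr=0.1):
    """Fit Prob(D = 1 | X) by gradient ascent on the logit log-likelihood.

    X is a list of feature rows, d a list of 0/1 participation flags.
    Returns [intercept, coef_1, ..., coef_k].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for row, di in zip(X, d):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            err = di - p
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Predicted participation probability p(x_i) for every row."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row))) for row in X]


# Hypothetical data: one standardized characteristic per person.
# Higher values make participation more likely, but not certain.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [2.0], [-0.2]]
d = [0, 0, 0, 1, 0, 1, 0, 1, 1, 0]

beta = fit_logit(X, d)
scores = propensity_scores(X, beta)  # one score for every person, treated or not
```

Scores are generated for both groups because matching compares participants and nonparticipants on the same scale.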
15Propensity Score Matching
- Match participants: find a sample of nonparticipants with similar $p(x_i)$
- Restrict samples to ensure common support
16Common Support
[Figure: densities of propensity scores for participants and for nonparticipants. Horizontal axis: propensity score, from 0 (low probability of participating, given X) to 1 (high probability of participating, given X). Vertical axis: density. The region of common support is where the two densities overlap.]
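One simple way to restrict the sample to common support is a min-max rule: keep only units whose score falls inside the range covered by both groups. This rule is an assumption chosen for illustration (other trimming rules exist), and the scores below are hypothetical.

```python
def common_support(scores_treated, scores_untreated):
    """Return (low, high): the overlap of the two groups' score ranges."""
    low = max(min(scores_treated), min(scores_untreated))
    high = min(max(scores_treated), max(scores_untreated))
    if low > high:
        raise ValueError("no common support: the score ranges do not overlap")
    return low, high


def restrict(scores, low, high):
    """Return the indices of units whose score lies inside [low, high]."""
    return [i for i, s in enumerate(scores) if low <= s <= high]


# Hypothetical propensity scores for each group.
p_treated = [0.35, 0.50, 0.65, 0.80, 0.95]
p_untreated = [0.05, 0.10, 0.30, 0.45, 0.70]

low, high = common_support(p_treated, p_untreated)
kept_treated = restrict(p_treated, low, high)
kept_untreated = restrict(p_untreated, low, high)
```

Here the two participants with the highest scores are dropped because no nonparticipant has a comparable probability of participating.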
17Propensity Score Matching
- Determine a tolerance limit: how different can matched control individuals or villages be?
- Decide on a matching technique: nearest neighbors, nonlinear matching, multiple matches
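A minimal sketch of nearest-neighbor matching with a tolerance limit (often called a caliper). The choices made here are assumptions for illustration: one match per participant, matching with replacement, and a caliper of 0.05. The scores and the function name `nearest_neighbor_match` are hypothetical.

```python
def nearest_neighbor_match(p_treated, p_untreated, caliper=0.05):
    """Match each participant to the nonparticipant with the closest score.

    Matching is with replacement (a nonparticipant may be reused).
    Participants with no nonparticipant within `caliper` stay unmatched.
    Returns a dict {treated_index: untreated_index}.
    """
    matches = {}
    for i, pt in enumerate(p_treated):
        j_best = min(range(len(p_untreated)), key=lambda j: abs(p_untreated[j] - pt))
        if abs(p_untreated[j_best] - pt) <= caliper:
            matches[i] = j_best
    return matches


# Hypothetical propensity scores.
p_treated = [0.40, 0.52, 0.90]
p_untreated = [0.10, 0.38, 0.50, 0.55]

matches = nearest_neighbor_match(p_treated, p_untreated, caliper=0.05)
```

A tighter caliper gives closer matches but leaves more participants unmatched, which is the trade-off behind the tolerance question.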
18Propensity Score Matching
- Once matches are made, we can calculate impact by comparing the means of outcomes across participants and their matches
- The difference in outcomes for each participant and its match is the estimate of the gain due to the program for that observation.
- Calculate the mean of these individual gains to obtain the average overall gain.
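The calculation described above can be sketched as follows, assuming one match per participant. The outcomes and the `matches` mapping (participant index to matched nonparticipant index) are hypothetical.

```python
def average_gain(y_treated, y_untreated, matches):
    """Return (mean gain, list of individual gains) over matched pairs.

    Each individual gain is the participant's outcome minus the outcome
    of its matched nonparticipant.
    """
    gains = [y_treated[i] - y_untreated[j] for i, j in matches.items()]
    return sum(gains) / len(gains), gains


# Hypothetical outcomes and a hypothetical matching.
y_treated = [120.0, 150.0, 95.0]
y_untreated = [100.0, 140.0, 95.0, 80.0]
matches = {0: 0, 1: 1, 2: 3}

att, gains = average_gain(y_treated, y_untreated, matches)
print(gains, att)
```

The mean is taken over participants, so the result is an average gain for those who took part in the program.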
19Possible Scenarios
- Case 1: Baseline data exists
  - If we arrive at baseline, we can match participants with nonparticipants using baseline characteristics.
- Case 2: No baseline data
  - If we arrive afterwards, we can only match participants with nonparticipants using time-invariant characteristics.
20Extensions
- Matching at baseline can be very useful
  - For estimation
    - Use baseline data for matching, then combine with other techniques (e.g., a difference-in-differences strategy)
    - Know the assignment rule, then match based on this rule
  - For sampling
    - Select non-randomized (but matched) evaluation samples
- Be cautious of ex-post matching
  - Matching on variables that change due to program participation (i.e., endogenous variables)
  - What are some invariable characteristics?
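A minimal sketch of combining baseline matching with difference-in-differences: for each matched pair, compare the change over time for the participant with the change for its match. The outcomes, the matching, and the function name `matched_did` are hypothetical.

```python
def matched_did(pre_t, post_t, pre_u, post_u, matches):
    """Average over matched pairs of
    (change for the participant) - (change for its matched nonparticipant)."""
    diffs = [
        (post_t[i] - pre_t[i]) - (post_u[j] - pre_u[j])
        for i, j in matches.items()
    ]
    return sum(diffs) / len(diffs)


# Hypothetical outcomes before and after the program.
pre_t = [50.0, 60.0]          # participants at baseline
post_t = [70.0, 75.0]         # participants at follow-up
pre_u = [55.0, 58.0, 40.0]    # nonparticipants at baseline
post_u = [65.0, 66.0, 45.0]   # nonparticipants at follow-up

# Pairs formed on baseline characteristics only.
matches = {0: 0, 1: 1}

effect = matched_did(pre_t, post_t, pre_u, post_u, matches)
```

Differencing removes any fixed gap in levels between a participant and its match, which relaxes the requirement that matched units be identical in every unobserved respect.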
21Key Factors
- Identification assumption
  - Selection on observables: after controlling for observables, treated and control groups are not systematically different
- Data requirements
  - Rich data on as many observable characteristics as possible
  - Large sample size (so that it is possible to find appropriate matches)
22Additional Considerations
- Advantages
  - Might be possible to do with existing survey data
  - Doesn't require randomization/experiment/baseline data
  - Allows estimation of heterogeneous treatment effects because we have individual counterfactuals, instead of just having group averages.
  - Doesn't require assumption of linearity
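Because each participant has its own matched counterfactual, individual gains can be averaged within subgroups. A minimal sketch with hypothetical gains and hypothetical subgroup labels:

```python
from collections import defaultdict


def gains_by_group(gains, groups):
    """Average matched-pair gain within each subgroup label."""
    buckets = defaultdict(list)
    for gain, label in zip(gains, groups):
        buckets[label].append(gain)
    return {label: sum(values) / len(values) for label, values in buckets.items()}


# Hypothetical individual gains (participant outcome minus match outcome)
# and a hypothetical subgroup label for each participant.
gains = [1.0, 3.0, 8.0, 10.0]
groups = ["low income", "low income", "high income", "high income"]

by_group = gains_by_group(gains, groups)
print(by_group)
```

Subgroup averages rest on fewer matched pairs than the overall average, so they are noisier.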
23Additional Considerations
- Disadvantages
  - Strong identifying assumption that there are no unobserved differences
    - but if individuals are otherwise identical, then why did some participate and others not?
  - Requires good quality data
    - Need to match on as many characteristics as possible
  - Requires sufficiently large sample size
    - Need a match for each participant in the treatment group
24Jalan and Ravallion (2003b)
- Does piped water reduce diarrhea for children in
rural India?
25Data
- Rural Household Survey
- No baseline data
- Detailed information on
- Health status of household members
- Education levels of household members
- Household income
- Access to piped water
- What would you use for D, Y, and X?
26Propensity Score Regression
28Matching
- Prior to matching, the estimated propensity scores for those with and without piped water were, respectively, 0.5495 and 0.1933.
- After matching there was negligible difference in the mean propensity scores of the two groups
  - 0.3743 for those with piped water
  - 0.3742 for the matched control group
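A balance check of this kind compares mean propensity scores across the two groups before and after matching. The sketch below uses hypothetical scores and a hypothetical matching, not the study's data.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Hypothetical propensity scores.
p_with = [0.30, 0.40, 0.70, 0.80]            # households with piped water
p_without = [0.05, 0.10, 0.29, 0.41, 0.15]   # households without piped water

# Hypothetical pairs found within the tolerance limit
# (index in p_with -> index in p_without).
matches = {0: 2, 1: 3}

gap_before = mean(p_with) - mean(p_without)
gap_after = mean(p_with[i] for i in matches) - mean(p_without[j] for j in matches.values())

print(round(gap_before, 4), round(gap_after, 4))
```

A gap near zero after matching indicates that the matched groups are comparable on the score. It says nothing about balance on unobserved characteristics.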
29Results
- Prevalence and duration of diarrhea among children under five in rural India are significantly lower on average for families with piped water than for observationally identical households without it.
- However, our results indicate that the health gains largely bypass children in poor families, particularly when the mother is poorly educated.
30Conclusion
- Matching is a useful way to control for OBSERVABLE heterogeneity
  - Especially when randomization or an RD approach is not possible
- However, it requires relatively strong assumptions