
Title: Matching Methods


1
Matching Methods & Propensity Scores
  • Kenny Ajayi
  • October 27, 2008

Global Poverty and Impact Evaluation
2
Program Evaluation Methods
  • Randomization (Experiments)
  • Quasi-Experiments
  • Regression Discontinuity
  • Matching, Propensity Score
  • Difference-in-Differences

3
Matching Methods
  • Creating a counterfactual
  • To measure the effect of a program, we want to
    measure
  • E(Y | D = 1, X) - E(Y | D = 0, X)
  • but we only observe one of these outcomes for
    each individual.

4
Evaluation Exercise
  • Argentine Antipoverty Program

5
Basic Idea
  • Match each participant (treated) with one or more
    nonparticipants (untreated) with similar observed
    characteristics
  • Counterfactual: the matched comparison group
    (i.e. nonparticipants with the same characteristics
    as participants)
  • Illustrate Example

6
Basic Idea
  • This assumes that there is no selection bias
    based on unobserved characteristics
  • i.e. there is selection on observables and
    participation is independent of outcomes once we
    control for observable characteristics (X)
  • What might some of these unobserved
    characteristics be?

7
Propensity Score
  • When the set of observed variables is large, we
    match participants with non participants using a
    summary measure
  • the propensity score: the probability of
    participating in the program (being treated), as
    a function of the individual's observed
    characteristics
  • P(X) = Prob(D = 1 | X)
  • D indicates participation in project
  • X is the set of observable characteristics

8
Propensity Score
  • We maintain the assumption of selection on
    observables
  • i.e., assume that participation is independent of
    outcomes conditional on Xi
  • E(Y | X, D = 1) = E(Y | X, D = 0)
  • if there had not been a program
  • This is false if there are unobserved characteristics
    affecting both participation and outcomes
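
In potential-outcome notation, the assumption on this slide can be written out as below. The symbols Y_0 (outcome without the program) and Y_1 (outcome with the program) are introduced here for illustration and do not appear in the slides; this is a sketch of the standard argument, not the presenter's own derivation.

```latex
% Y_0: outcome without the program, Y_1: outcome with the program (assumed notation)
% Selection on observables: absent the program, participants and
% nonparticipants with the same X would have the same expected outcome.
E[Y_0 \mid X, D = 1] = E[Y_0 \mid X, D = 0]
\quad\Longrightarrow\quad
E[Y_1 - Y_0 \mid X, D = 1] = E[Y \mid X, D = 1] - E[Y \mid X, D = 0]
```

The right-hand side uses only observed outcomes, which is why the comparison of participants with observationally similar nonparticipants can be read as a program effect when the assumption holds.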

9
Evaluation Exercise
  • Argentine Antipoverty Program

10
Propensity Score Matching
  • Get representative and comparable data on
    participants and nonparticipants
  • (ideally using the same survey and a similar time
    period)

11
Propensity Score Matching
  • Get representative and comparable data on
    participants and nonparticipants
  • (ideally using the same survey and a similar time
    period)
  • Estimate the probability of program participation
    as a function of observable characteristics
  • (using a logit or other discrete choice model)

12
Jalan and Ravallion (2003)
14
Propensity Score Matching
  • Get representative and comparable data on
    participants and nonparticipants
  • (ideally using the same survey and a similar time
    period)
  • Estimate the probability of program participation
    as a function of observable characteristics
  • (using a logit or other discrete choice model)
  • Use predicted values from estimation to generate
    propensity score p(xi)
  • for all treatment and comparison group members
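
The estimation step above can be sketched in a few lines of Python. This is a minimal illustration, assuming a single covariate and a hand-rolled logit fitted by gradient ascent; the function names `fit_logit` and `propensity_scores` and the toy data are hypothetical, and a real analysis would normally use a statistics package and many more covariates.

```python
import math

def fit_logit(X, d, lr=0.1, n_iter=5000):
    """Fit a logit model Prob(D = 1 | X) by plain gradient ascent.

    X: list of covariate rows; d: list of 0/1 participation flags.
    Returns coefficients [intercept, b1, b2, ...].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            row = [1.0] + list(xi)
            z = sum(b * v for b, v in zip(beta, row))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, v in enumerate(row):
                grad[j] += (di - p) * v  # gradient of the log-likelihood
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(X, beta):
    """Predicted participation probability p(x_i) for every unit."""
    scores = []
    for xi in X:
        z = sum(b * v for b, v in zip(beta, [1.0] + list(xi)))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores

# Hypothetical toy data: one covariate, not perfectly separable by design.
X = [[-2.0], [-1.5], [-1.0], [-0.5], [0.0], [0.5], [1.0], [1.5], [2.0], [0.2]]
d = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

beta = fit_logit(X, d)
scores = propensity_scores(X, beta)  # one score per treated and comparison unit
```

Scores are computed for every unit, participants and nonparticipants alike, because both groups are needed in the matching step.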

15
Propensity Score Matching
  • Match participants: find a sample of
    non-participants with similar p(xi)
  • Restrict samples to ensure common support

16
Common Support
  • [Figure: density of propensity scores for participants
    and for non-participants, plotted against the
    propensity score from 0 (low probability of
    participating, given X) to 1 (high probability of
    participating, given X). The interval where the two
    densities overlap is the region of common support.]
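
One simple way to operationalize the common support region is the overlap of the two groups' score ranges. The sketch below assumes that min/max rule, which is only one of several conventions; the helper names and the scores are hypothetical.

```python
def common_support(p_treated, p_control):
    """Return (low, high): the overlap of the two groups' score ranges."""
    low = max(min(p_treated), min(p_control))
    high = min(max(p_treated), max(p_control))
    if low > high:
        raise ValueError("no overlap between the two score distributions")
    return low, high

def restrict(scores, low, high):
    """Indices of units whose score lies inside [low, high]."""
    return [i for i, p in enumerate(scores) if low <= p <= high]

# Hypothetical propensity scores for each group.
p_treated = [0.35, 0.50, 0.62, 0.80, 0.95]
p_control = [0.05, 0.12, 0.30, 0.41, 0.66]

low, high = common_support(p_treated, p_control)
kept_treated = restrict(p_treated, low, high)  # participants with comparable controls
kept_control = restrict(p_control, low, high)  # controls with comparable participants
```

Units outside the interval are dropped because no reasonably similar unit exists in the other group.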
17
Propensity Score Matching
  • Match participants: find a sample of
    non-participants with similar p(xi)
  • Restrict samples to ensure common support
  • Determine a tolerance limit
  • how different can matched control individuals or
    villages be?
  • Decide on a matching technique
  • Nearest neighbors, nonlinear matching, multiple
    matches
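
The tolerance limit and the matching technique can be illustrated together. The sketch below assumes one-to-one nearest-neighbor matching with replacement and a caliper (maximum allowed score distance) of 0.05; both choices, the function name, and the scores are illustrative rather than taken from the slides.

```python
def nearest_neighbor_match(p_treated, p_control, caliper=0.05):
    """Match each participant to the closest non-participant (with replacement).

    Returns {treated index: control index}. Participants with no control
    within the caliper are left unmatched.
    """
    matches = {}
    for i, pt in enumerate(p_treated):
        j = min(range(len(p_control)), key=lambda c: abs(p_control[c] - pt))
        if abs(p_control[j] - pt) <= caliper:
            matches[i] = j
    return matches

# Hypothetical propensity scores.
p_treated = [0.30, 0.52, 0.90]
p_control = [0.10, 0.28, 0.50, 0.55]

matches = nearest_neighbor_match(p_treated, p_control, caliper=0.05)
```

A tighter caliper gives closer matches but leaves more participants unmatched, which is the trade-off behind the tolerance question on this slide.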

18
Propensity Score Matching
  • Once matches are made, we can calculate impact by
    comparing the means of outcomes across
    participants and their matches
  • The difference in outcomes for each participant
    and its match is the estimate of the gain due to
    the program for that observation.
  • Calculate the mean of these individual gains to
    obtain the average overall gain.
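
The calculation described above is a mean of pairwise differences. A minimal sketch, assuming one match per participant and hypothetical outcome values:

```python
def average_gain(y_treated, y_control, matches):
    """Mean of (participant outcome - matched comparison outcome).

    matches maps a participant index to its matched control index.
    Returns (average gain, list of individual gains).
    """
    gains = [y_treated[i] - y_control[j] for i, j in matches.items()]
    if not gains:
        raise ValueError("no matched pairs")
    return sum(gains) / len(gains), gains

# Hypothetical outcomes and matches.
y_treated = [12.0, 16.0, 9.0]
y_control = [10.0, 11.0, 12.0, 8.0]
matches = {0: 1, 1: 2, 2: 3}

avg_gain, gains = average_gain(y_treated, y_control, matches)
```

With multiple matches per participant, the comparison outcome would instead be an average over that participant's matches.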

19
Possible Scenarios
  • Case 1: Baseline data exists
  • If we arrive at baseline, we can match participants
    with nonparticipants using baseline
    characteristics.
  • Case 2: No baseline data
  • If we arrive afterwards, we can only match
    participants with nonparticipants using
    time-invariant characteristics.

20
Extensions
  • Matching at baseline can be very useful
  • For Estimation
  • Use baseline data for matching then combine with
    other techniques (e.g. difference-in-differences
    strategy)
  • Know the assignment rule, then match based on
    this rule
  • For Sampling
  • Select non-randomized (but matched) evaluation
    samples
  • Be cautious of ex-post matching
  • Matching on variables that change due to program
    participation (i.e. endogenous variables)
  • What are some invariable characteristics?
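
Combining baseline matching with a difference-in-differences strategy can be sketched as follows. The example assumes matches were already formed on baseline characteristics and that outcomes are observed before and after the program; the function name and all numbers are hypothetical.

```python
def matched_did(pre_t, post_t, pre_c, post_c, matches):
    """Average of (participant's change) - (matched comparison's change)."""
    diffs = [(post_t[i] - pre_t[i]) - (post_c[j] - pre_c[j])
             for i, j in matches.items()]
    if not diffs:
        raise ValueError("no matched pairs")
    return sum(diffs) / len(diffs)

# Hypothetical outcomes before and after the program.
pre_t, post_t = [10.0, 20.0], [15.0, 26.0]
pre_c, post_c = [8.0, 18.0, 30.0], [10.0, 21.0, 31.0]
matches = {0: 0, 1: 1}  # formed on baseline characteristics only

effect = matched_did(pre_t, post_t, pre_c, post_c, matches)
```

Differencing removes fixed differences in outcome levels between a participant and its match, so the estimate relies less on the matched pair being identical in every unobserved respect.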

21
Key Factors
  • Identification Assumption
  • Selection on observables: after controlling for
    observables, treated and control groups are not
    systematically different
  • Data Requirements
  • Rich data on as many observable characteristics
    as possible
  • Large sample size (so that it is possible to find
    appropriate match)

22
Additional Considerations
  • Advantages
  • Might be possible to do with existing survey data
  • Doesn't require randomization/experiment/baseline
    data
  • Allows estimation of heterogeneous treatment
    effects because we have individual
    counterfactuals, instead of just having group
    averages.
  • Doesn't require assumption of linearity

23
Additional Considerations
  • Disadvantages
  • Strong identifying assumption that there are no
    unobserved differences
  • but if individuals are otherwise identical, then
    why did some participate and others not?
  • Requires good quality data
  • Need to match on as many characteristics as
    possible
  • Requires sufficiently large sample size
  • Need a match for each participant in the
    treatment group

24
Jalan and Ravallion (2003b)
  • Does piped water reduce diarrhea for children in
    rural India?

25
Data
  • Rural Household Survey
  • No baseline data
  • Detailed information on
  • Health status of household members
  • Education levels of household members
  • Household income
  • Access to piped water
  • What would you use for D, Y, and X?

26
Propensity Score Regression
27
Propensity Score Regression
28
Matching
  • Prior to matching, the mean estimated propensity
    scores for those with and without piped water
    were, respectively, 0.5495 and 0.1933.
  • After matching there was negligible difference in
    the mean propensity scores of the two groups:
  • 0.3743 for those with piped water
  • 0.3742 for the matched control group
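
A balance check of this kind compares the gap in mean propensity scores before and after matching. The sketch below uses hypothetical scores and matches, not the study's data, and the helper name `score_gap` is illustrative.

```python
def mean(values):
    return sum(values) / len(values)

def score_gap(p_treated, p_control, matches=None):
    """Difference in mean propensity scores (treated minus comparison).

    If matches is given ({treated index: control index}), only matched
    units enter the two means.
    """
    if matches is None:
        return mean(p_treated) - mean(p_control)
    treated = [p_treated[i] for i in matches]
    control = [p_control[j] for j in matches.values()]
    return mean(treated) - mean(control)

# Hypothetical scores and matches.
p_treated = [0.30, 0.52, 0.90]
p_control = [0.10, 0.28, 0.50, 0.55]
matches = {0: 1, 1: 2}

gap_before = score_gap(p_treated, p_control)
gap_after = score_gap(p_treated, p_control, matches)
```

A gap close to zero after matching indicates that the matched groups are similar in their estimated probability of participating; it says nothing about unobserved differences.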

29
Results
  • Prevalence and duration of diarrhea among
    children under five in rural India are
    significantly lower on average for families with
    piped water than for observationally identical
    households without it.
  • However, our results indicate that the health
    gains largely by-pass children in poor families,
    particularly when the mother is poorly educated.

30
Conclusion
  • Matching is a useful way to control for
    OBSERVABLE heterogeneity
  • Especially when randomization or RD approach is
    not possible
  • However, it requires relatively strong
    assumptions