Regression Discontinuity - PowerPoint PPT Presentation

About This Presentation
Title:

Regression Discontinuity

Description:

Regression Discontinuity 10/13/09 – PowerPoint PPT presentation

Slides: 34
Provided by: Garre87
Learn more at: http://cega.berkeley.edu

Transcript and Presenter's Notes

Title: Regression Discontinuity


1
Regression Discontinuity
  • 10/13/09

2
What is R.D.?
  • Regression--the econometric/statistical tool
    social scientists use to analyze multivariate
    correlations

$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + e$

where Y is some sort of dependent variable,
alpha is a constant, the Xs are a bunch of
independent variables, the betas are
coefficients, and e is the error term.
3
Discontinuity
  • Some sort of arbitrary jump/change thanks to a
    quirk in law or nature.

4
Discontinuity
  • We're interested in the ones that make very
    similar people get very dissimilar results.

5
Discontinuity Examples
  • PSAT/NMSQT
  • Basically the top 16,000 test-takers get a
    scholarship.
  • A small difference in test score can mean a
    discontinuous jump in scholarship amount.

6
Discontinuity Examples
  • School Class Size
  • Maimonides' Rule--No more than 40 kids in a class
    in Israel.
  • 40 kids in a school means one class of 40. 41
    kids means two classes, with 20 and 21.
  • (Angrist & Lavy, QJE 1999)
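
The 40-versus-41 arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming a cohort is always split into the fewest equal-as-possible classes the cap allows; the function name `predicted_class_size` is made up for this example.

```python
import math

MAX_CLASS_SIZE = 40  # the cap stated on the slide


def predicted_class_size(enrollment: int, cap: int = MAX_CLASS_SIZE) -> float:
    """Average class size when a cohort is split into the fewest classes the cap allows."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    n_classes = math.ceil(enrollment / cap)
    return enrollment / n_classes


# One extra student at the threshold roughly halves the class size.
for enrollment in (39, 40, 41, 80, 81):
    print(enrollment, predicted_class_size(enrollment))
```

The jump from 40 to 41 students is the discontinuity: enrollment barely changes, but predicted class size drops from 40 to 20.5.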

7
Discontinuity Examples
  • Union Elections
  • If employees want to unionize, the NLRB holds an
    election. 50% means the employer doesn't have to
    recognize the union, and 50% + 1 means the
    employer is required to bargain in good faith
    with the union.
  • (DiNardo & Lee, QJE 2004)

8
Discontinuity Examples
  • U.S. House Elections
  • Incumbency advantage. If you're first past the
    post in the previous election, even by just one
    vote, you get a huge advantage in the next
    election.
  • (David Lee, Journal of Econometrics 2007)

9
Discontinuity Examples
  • Air Pollution and Home Values
  • The Clean Air Act's National Ambient Air Quality
    Standards say that if the geometric mean concentration
    of 5 pollutant particulates is 75 micrograms per
    cubic meter or greater, the county is classified as
    non-attainment and is subject to much more
    stringent regulation.
  • (Ken Chay, Michael Greenstone, JPE 2005)

10
Combine the R and the D
  • Run a regression based on a situation where
    you've got a discontinuity.
  • Treat above-the-cutoff and below-the-cutoff like
    the treatment and control groups from a
    randomization.

11
Why are we doing this?
  • Why do we have to look for quirks like this?
    Can't we just control for whatever we want using
    OLS or some other line-fitting tool?
  • Just get a bunch of people's salaries and PSAT
    scores. PSATs are X, income is Y, run a
    regression in Stata, and we have causal
    inference, right? Higher test scores cause
    people to earn more later in life.

12
No.
  • The statistical methods we use are based on a lot
    of assumptions. Importantly, the error term
    (which is really full of things we can't measure,
    the unobservables) is supposed to be
    uncorrelated with the Xs and normally
    distributed.
  • In reality, those conditions probably haven't been
    met in any of the previous situations.

13
No.
  • For example, class size is probably correlated
    with some type of neighborhood quality.
  • Please turn to your neighbor and discuss what is
    probably wrong with each of the previous 5
    examples (PSAT, class size, union elections,
    House elections, air pollution).

14
No.
  • Higher PSAT kids might have higher ability.
  • Crowded classrooms might be in poorer schools.
  • (Or special needs students might be in small
    classes.)
  • Unionized workers might work for certain types of
    firms.
  • Incumbent politicians might be better. They won
    before, didn't they?
  • Pollution might be correlated with economic growth,
    which could increase home values.
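
The first worry in the list above (higher-PSAT kids might simply have higher ability) can be made concrete with a small simulation. This is only a sketch on made-up data: the variable names, the coefficients, and the assumption that scores have zero causal effect on income are all invented for illustration.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

N = 5000
TRUE_EFFECT_OF_SCORE = 0.0  # assumption: scores have NO causal effect on income here

ability = [random.gauss(0, 1) for _ in range(N)]       # the unobservable
score = [a + random.gauss(0, 1) for a in ability]      # X: partly driven by ability
income = [TRUE_EFFECT_OF_SCORE * s + 2.0 * a + random.gauss(0, 1)
          for s, a in zip(score, ability)]             # Y: driven by ability, not by score


def ols_slope(x, y):
    """Slope from a one-variable least-squares fit of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    return cov / var


naive_slope = ols_slope(score, income)
print(f"true effect: {TRUE_EFFECT_OF_SCORE}, naive OLS slope: {naive_slope:.2f}")
```

The naive slope comes out clearly positive even though the built-in effect is zero, because the error term (which contains ability) is correlated with X.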

15
Controlling for everything?
  • Focus on the Israeli schools for a second.
  • We can try and control for neighborhood poverty
    level.
  • Does that solve the problem?
  • No.
  • If neighborhood poverty level (an observable) is
    correlated with the X of interest (class size),
    why would you think it's safe to assume that the
    unobservables aren't correlated? Have you
    magically controlled for every single thing
    that's correlated with the X of interest?
    Probably not.

16
Controlling for everything?
  • So let's find a bandwidth in which these things
    are uncorrelated.

17
A Bandwidth of Randomness
  • Test scores aren't random, and neither is class
    size, nor air pollution.
  • But is a kid in the 94.9th percentile really that
    different from the 95th percentile kid?
  • Is a school with 40 kids that different from a
    school with 41?
  • Right around the cutoff, there's a good chance
    things are random.

18
No Sorting - Observables
  • Don't take my word for it. Look at the averages
    of the observables in your below-cutoff group,
    and the averages of the observables in the
    above-cutoff group. Are they the same?
    Hopefully.
  • Do people know about this cutoff? Are they doing
    some endogenous sorting? When deciding where to
    live, did good moms look for schools where their
    kids would be the 41st kid? Did certain types of
    polluters look for counties where they'd be
    below the cutoff?
  • These things can be checked to some degree--look
    at the average observables above and below the
    cutoff.
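
The check described above (compare average observables just above and just below the cutoff) might look like the following. It is a minimal sketch on hand-made toy numbers: the enrollments, the poverty shares, and the function name `balance_gap` are all hypothetical, and a real analysis would add a standard error or t-test.

```python
from statistics import mean


def balance_gap(running, covariate, cutoff, bandwidth):
    """Mean of a covariate just above the cutoff minus its mean just below."""
    below = [c for r, c in zip(running, covariate) if cutoff - bandwidth <= r < cutoff]
    above = [c for r, c in zip(running, covariate) if cutoff <= r <= cutoff + bandwidth]
    if not below or not above:
        raise ValueError("no observations on one side of the cutoff within the bandwidth")
    return mean(above) - mean(below)


# Toy data: school enrollment (running variable) and share of poor students (observable).
enrollment = [36, 37, 38, 39, 40, 41, 42, 43, 44, 45]
poverty_balanced = [0.30, 0.32, 0.31, 0.29, 0.30, 0.31, 0.30, 0.29, 0.32, 0.30]
poverty_sorted = [0.30, 0.32, 0.31, 0.29, 0.30, 0.10, 0.12, 0.11, 0.09, 0.10]

# Schools with 41+ students are the ones that get split into smaller classes.
gap_balanced = balance_gap(enrollment, poverty_balanced, cutoff=41, bandwidth=5)
gap_sorted = balance_gap(enrollment, poverty_sorted, cutoff=41, bandwidth=5)
print(f"balanced case gap: {gap_balanced:.3f}")
print(f"sorted case gap:   {gap_sorted:.3f}")
```

A gap near zero is what you hope for. A large gap, as in the second toy series, suggests that the groups on either side of the cutoff differ in ways other than treatment.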

19
No Sorting - Clumping
  • In addition to checking the observables on either
    side of the cutoff, we should check the density
    of the distribution. Is it unusually low/high
    right around the cutoff?
  • If there's some abnormally large portion of
    people right around the cutoff, it's quite
    possible that you don't have random assignment.
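
A crude version of the clumping check above is to count observations in a narrow bin on each side of the cutoff. This is only a sketch: the scores are toy data, `density_ratio` is a made-up helper, and formal density tests go well beyond a raw count ratio.

```python
def density_ratio(running, cutoff, bin_width):
    """Count of observations just below the cutoff divided by the count just above."""
    n_below = sum(1 for r in running if cutoff - bin_width <= r < cutoff)
    n_above = sum(1 for r in running if cutoff <= r < cutoff + bin_width)
    if n_above == 0:
        raise ValueError("no observations just above the cutoff")
    return n_below / n_above


CUTOFF = 45  # hypothetical eligibility threshold: scores below it qualify

# Smooth case: one observation at every score from 30 to 59.
smooth_scores = list(range(30, 60))

# Manipulated case: people who scored 45-47 get recorded as 44 so that they qualify.
manipulated_scores = [44 if 45 <= s <= 47 else s for s in smooth_scores]

smooth_ratio = density_ratio(smooth_scores, CUTOFF, bin_width=5)
manipulated_ratio = density_ratio(manipulated_scores, CUTOFF, bin_width=5)
print(f"smooth: {smooth_ratio:.1f}, manipulated: {manipulated_ratio:.1f}")
```

A ratio far from 1 is the numerical counterpart of a visible spike in the histogram right at the threshold.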

20
No Sorting - Clumping
  • You're totally cheating. Please stop.
  • Emily Conover & Adriana Camacho, "Manipulation of
    Social Program Eligibility"

21
GSP--Multiple Analyses
  • "Incentives to Learn," Ted Miguel, Michael
    Kremer, Rebecca Thornton
  • Girls' Scholarship Program, Busia, Kenya.
  • Randomize holding a scholarship competition
    across schools in Busia and Teso districts.

22
GSP--Multiple Analyses
  • Treatment: If a girl finishes in the top 15% in
    her district on the end-of-year exam, she wins a
    two-year scholarship.
  • Randomization Analysis: Does attending a school
    with the competition make you work harder/improve
    schooling outcomes?
  • RD Analysis: Does winning the award improve
    schooling outcomes?

23
P-900 in Chile
  • "The Central Role of Noise in Evaluating
    Interventions That Use Test Scores to Rank
    Schools," Kenneth Y. Chay, Patrick J. McEwan,
    Miguel Urquiola, AER 2005
  • Mean reversion: Sophomore Slump, SI Cover Curse,
    Heisman Trophy Curse, Madden Curse, and the same
    thing in the opposite direction.
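
Mean reversion is easy to reproduce with simulated schools. The sketch below assumes each school's observed score is a stable quality plus year-specific noise and that no program exists at all; every number in it is invented and none comes from the P-900 data.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

N_SCHOOLS = 2000
true_quality = [random.gauss(50, 5) for _ in range(N_SCHOOLS)]

# Observed score = stable quality + noise that is redrawn every year. No treatment anywhere.
score_year1 = [q + random.gauss(0, 5) for q in true_quality]
score_year2 = [q + random.gauss(0, 5) for q in true_quality]

# Pick the bottom 10% of schools by year-1 score, as a ranking-based program would.
threshold = sorted(score_year1)[N_SCHOOLS // 10]
selected = [i for i in range(N_SCHOOLS) if score_year1[i] < threshold]

gain_selected = mean([score_year2[i] - score_year1[i] for i in selected])
gain_all = mean([s2 - s1 for s1, s2 in zip(score_year1, score_year2)])
print(f"average gain, selected schools: {gain_selected:.2f}")
print(f"average gain, all schools:      {gain_all:.2f}")
```

The selected schools "improve" substantially with no program at all, because schools picked for an unusually bad year tend to have a more typical year next time. A before-and-after comparison of treated schools would credit that gain to the program.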

24
THIS IS THE MOST AMAZING THING EVER!
  • Look at the educational outcomes of treatment
    schools in 1990, compared to those same schools
    in 1988, before the program. AMAZING!
    FANTABULOUS!

25
Oh, wait.
  • Hmm. That's kind of disappointing.

26
So how do we actually do this?
  1. Draw two pretty pictures:
  • Eligibility criterion (test score, income, or
    whatever) vs. Program Enrollment
  • Eligibility criterion vs. Outcome

27
So how do we actually do this?
  • 2. Run a simple regression.
  • (Yes, this is basically all we ever do, and
    Stata can run the calculation in almost any
    situation, but before we do it, it's necessary to
    make sure the situation is appropriate and draw
    the graphs so that we can have confidence that
    our estimates are actually causal.)
  • Outcome as a function of test score (or
    whatever), with a binary (1 if yes, 0 if no)
    variable for program enrollment.
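
The regression described above (outcome on test score plus a 0/1 enrollment variable) can be sketched without any statistics package. Everything here is an assumption made for illustration: the data are simulated with a built-in effect of 4, enrollment follows the cutoff perfectly, and `solve` and `ols` are small hand-rolled helpers rather than any library's API.

```python
import random

random.seed(3)  # fixed seed so the sketch is reproducible

CUTOFF = 50.0
TRUE_EFFECT = 4.0  # assumption: the jump built into the fake data


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


scores = [random.uniform(0, 100) for _ in range(2000)]
enrolled = [1.0 if s >= CUTOFF else 0.0 for s in scores]  # sharp rule: cutoff decides enrollment
outcomes = [10 + 0.5 * s + TRUE_EFFECT * d + random.gauss(0, 1)
            for s, d in zip(scores, enrolled)]

# Columns: constant, test score, enrollment dummy.
rows = [[1.0, s, d] for s, d in zip(scores, enrolled)]
intercept, slope_on_score, effect_of_enrollment = ols(rows, outcomes)
print(f"estimated effect of enrollment: {effect_of_enrollment:.2f} (built in: {TRUE_EFFECT})")
```

The coefficient on the enrollment dummy is the estimated jump at the cutoff. It is only trustworthy here because the simulated outcome really is linear in the score.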

28
Is it really that simple?
  • Not quite.
  • You could totally have a situation where the
    outcome is some sort of quadratic or cubic or nth
    polynomial function of the test score. Try
    controlling for that. This is going to depend on
    the situation and is somewhat arbitrary.

29
Wait, somewhat arbitrary?
  • Lame, I know. But two things aren't universally
    clear:
  • 1. How wide a bandwidth around the cutoff are we
    looking at?
  • We're really only confident in our estimate for
    people who are close to the cutoff. This is a
    LOCAL AVERAGE TREATMENT EFFECT. We can
    confidently say that a school right around the
    cutoff would improve average test scores by X if
    it received the treatment, but we're not so
    confident that already-awesome schools would get
    the same benefit.
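
One way to see why the bandwidth matters is to re-estimate the jump at several bandwidths. The sketch below fits a separate straight line on each side of the cutoff (a swapped-in local linear approach, not necessarily what the presenter ran) on simulated data whose true shape is assumed to be cubic with a built-in jump of 4.

```python
import random

random.seed(2)  # fixed seed so the sketch is reproducible

CUTOFF = 50.0
TRUE_JUMP = 4.0  # assumption: the jump built into the fake data

scores = [random.uniform(0, 100) for _ in range(10000)]
# Assumed shape: a cubic in the score plus the jump, so a straight line is
# only a good approximation close to the cutoff.
outcomes = [
    0.0001 * (s - CUTOFF) ** 3 + (TRUE_JUMP if s >= CUTOFF else 0.0) + random.gauss(0, 1)
    for s in scores
]


def fit_line(x, y):
    """Return (intercept, slope) of a least-squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rd_jump(scores, outcomes, cutoff, bandwidth):
    """Fit a line on each side within the bandwidth; return the gap at the cutoff."""
    left = [(s - cutoff, o) for s, o in zip(scores, outcomes)
            if cutoff - bandwidth <= s < cutoff]
    right = [(s - cutoff, o) for s, o in zip(scores, outcomes)
             if cutoff <= s <= cutoff + bandwidth]
    left_intercept, _ = fit_line([p[0] for p in left], [p[1] for p in left])
    right_intercept, _ = fit_line([p[0] for p in right], [p[1] for p in right])
    return right_intercept - left_intercept


for bandwidth in (5, 10, 25, 50):
    estimate = rd_jump(scores, outcomes, CUTOFF, bandwidth)
    print(f"bandwidth {bandwidth:>2}: estimated jump {estimate:.2f}")
```

Narrow bandwidths recover something close to the built-in jump but use few observations. The widest one uses all the data and gets the answer badly wrong because the straight lines no longer fit. Reporting the whole table, not one favorite row, is one way to show your work.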

30
Wait, somewhat arbitrary?
  • 2. Without the program, what shape would the
    function naturally have?
  • What sort of function do we throw in to control
    for the fact that even if there were no National
    Merit Semifinalist scholarship, smarter kids are
    likely to earn more later in life?
  • The solution: SHOW YOUR WORK

31
Fake Programs
  • In addition to showing your work, another good
    robustness check is to test for the effects of
    non-existent programs.
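
A placebo test of this kind can be sketched by estimating the "jump" at thresholds where nothing happens. The data below are simulated with one real cutoff at 50; the fake cutoffs at 30 and 70, the bandwidth, and the helper names are all hypothetical choices for illustration.

```python
import random

random.seed(4)  # fixed seed so the sketch is reproducible

REAL_CUTOFF = 50.0
FAKE_CUTOFFS = (30.0, 70.0)  # hypothetical thresholds where no program exists
BANDWIDTH = 10.0

scores = [random.uniform(0, 100) for _ in range(10000)]
outcomes = [10 + 0.3 * s + (4.0 if s >= REAL_CUTOFF else 0.0) + random.gauss(0, 1)
            for s in scores]


def intercept_at_zero(points):
    """Least-squares line through (x, y) points, evaluated at x = 0."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    return mean_y - slope * mean_x


def jump_at(cutoff):
    """Gap between the lines fitted just above and just below `cutoff`."""
    below = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff - BANDWIDTH <= s < cutoff]
    above = [(s - cutoff, y) for s, y in zip(scores, outcomes)
             if cutoff <= s <= cutoff + BANDWIDTH]
    return intercept_at_zero(above) - intercept_at_zero(below)


for cutoff in (REAL_CUTOFF, *FAKE_CUTOFFS):
    print(f"cutoff {cutoff:.0f}: estimated jump {jump_at(cutoff):.2f}")
```

If the method reports a sizable jump at a fake cutoff, the jump at the real one deserves less trust. The fake windows are kept away from the real cutoff so that the true effect cannot leak into them.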

32
Fake Programs
33
Conclusion
  • Find a threshold
  • Look at people just above and just below
  • Make sure there's no sorting
  • It's only a local effect