1
GY460 Techniques of Spatial Analysis
Lecture 4: Techniques for dealing with spatial
sorting and selection (fixed effects,
diff-in-diff, matching, discontinuities etc.)
  • Steve Gibbons

2
Introduction
  • Sometimes we just want to eliminate problems
    induced by spatial sorting and heterogeneity
  • i.e. differences between places which may lead to
    confounding factors and biased estimates of
    relationships of interest
  • Selection (sorting) on observable and
    unobservable characteristics
  • Examples:
  • Eliminating spatial factors from models of firm
    behaviour
  • Eliminating geographical influences from models
    of school quality
  • Various methods are available for dealing with
    this; we have looked at some of these already

3
Regression models with spatial effects
4
Data with discrete zones
  • N observations in the data
  • Grouped into M zones (regions, districts,
    neighbourhoods)
  • E.g.
  • Cross-section data with more than one
    cross-sectional observation in each neighbourhood
  • Or panel data with more than one time period for
    each neighbourhood

5
Spatial variation in the mean
  • Empirical model, with discrete neighbourhoods m:
    $y_{im} = x_{im}'\beta + u_m + \varepsilon_{im}$
  • $y_{im}$ for observation i in place m, depends on
  • $x_{im}$: characteristics of observation i in place m
  • $\varepsilon_{im}$: unobserved factors for observation i
    in place m
  • $u_m$: unobserved factors common to all
    observations in place m
  • Cross-sectional case: i indexes cross-sectional
    units, m places
  • Panel data case: i indexes time units, m places

6
Random effects
  • Empirical model, with discrete neighbourhoods m
  • If $u_m$ is uncorrelated with $x_{im}$, then OLS is
    consistent, just like the spatial error model
  • Composite error terms $u_m + \varepsilon_{im}$ are
    correlated within spatial groups m
  • But uncorrelated between spatial groups
  • Use GLS or ML (assuming normality) for efficient
    estimates and unbiased standard errors
    (multi-level modelling)
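
The within-group correlation can be written out explicitly. This is a sketch under the usual random-effects assumptions, which are assumptions made here rather than stated on the slide: $u_m$ and $\varepsilon_{im}$ are mutually independent, with variances $\sigma_u^2$ and $\sigma_\varepsilon^2$, and $v_{im}$ is notation introduced for the composite error.

```latex
v_{im} = u_m + \varepsilon_{im}, \qquad
\operatorname{Var}(v_{im}) = \sigma_u^2 + \sigma_\varepsilon^2, \qquad
\operatorname{Cov}(v_{im}, v_{jm}) = \sigma_u^2 \quad (i \neq j), \qquad
\operatorname{Cov}(v_{im}, v_{jm'}) = 0 \quad (m \neq m')
```

Under these assumptions the correlation between two observations in the same place is $\sigma_u^2 / (\sigma_u^2 + \sigma_\varepsilon^2)$, which is why OLS standard errors that ignore the grouping are unreliable even when the coefficients are consistent.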

7
Fixed area effects: dummy variables
  • Empirical model, with discrete neighbourhoods m
  • If $u_m$ is correlated with $x_{im}$, then OLS is
    inconsistent
  • Options:
  • Estimate the area fixed effects using OLS
  • Least Squares Dummy Variable (LSDV) model:
    neighbourhood dummy variables

8
Fixed area effects: within groups
  • Or within-groups transformation: difference the
    variables from the neighbourhood mean
  • $y_{im} - \bar{y}_m = (x_{im} - \bar{x}_m)'\beta + (\varepsilon_{im} - \bar{\varepsilon}_m)$
  • where $\bar{y}_m$ is the mean of y in group m
  • Eliminates $u_m$
  • Estimate by OLS
  • Only uses deviations of variables from
    neighbourhood means, so only within-neighbourhood
    variation counts
  • LSDV and Within Groups (or Fixed Effects)
    models are equivalent
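
The equivalence of LSDV and within-groups estimation can be checked numerically. The following is a minimal sketch on simulated data; the sample sizes, the true coefficient and all variable names are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

M, n_per = 30, 20                       # hypothetical: 30 neighbourhoods, 20 obs each
group = np.repeat(np.arange(M), n_per)  # neighbourhood index m for each observation
u = rng.normal(size=M)                  # area effects u_m
x = 0.8 * u[group] + rng.normal(size=M * n_per)   # x is correlated with u_m
beta_true = 1.5                         # assumed true coefficient
y = beta_true * x + u[group] + rng.normal(scale=0.5, size=M * n_per)

# Pooled OLS with an intercept: inconsistent because x is correlated with u_m
X_pooled = np.column_stack([np.ones_like(x), x])
beta_pooled = np.linalg.lstsq(X_pooled, y, rcond=None)[0][1]

# LSDV: one dummy per neighbourhood (no separate intercept)
dummies = (group[:, None] == np.arange(M)[None, :]).astype(float)
X_lsdv = np.column_stack([x, dummies])
beta_lsdv = np.linalg.lstsq(X_lsdv, y, rcond=None)[0][0]

# Within groups: subtract neighbourhood means, then OLS without an intercept
def demean(v):
    means = np.bincount(group, weights=v) / np.bincount(group)
    return v - means[group]

x_w, y_w = demean(x), demean(y)
beta_within = (x_w @ y_w) / (x_w @ x_w)

print(beta_pooled, beta_lsdv, beta_within)
```

The LSDV and within-groups slopes agree up to floating-point error, while the pooled slope is pulled away from the assumed true value by the correlation between x and the area effects.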

9
Fixed area effects: panel data
  • Even better information with repeated
    observations on panel units (individuals, firms,
    regions etc.) over time
  • Panel data:
    $y_{it} = x_{it}'\beta + u_i + v_t + \varepsilon_{it}$
  • Now all relationships of interest can be
    estimated from variation within panel units over
    time
  • Use within-groups or first differences over time,
    e.g.
    $y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta + (v_t - v_{t-1}) + (\varepsilon_{it} - \varepsilon_{it-1})$
  • Q: What does $(v_t - v_{t-1})$ represent? How could
    you control for it? Then, what variation in the
    data allows us to estimate $\beta$? Hence what do we
    assume, if $\beta$ is to be estimated consistently?
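
A minimal sketch of first differencing on a simulated panel, assuming a model with unit effects $u_i$ and common time effects $v_t$. Period dummies in the differenced equation are one way to absorb $(v_t - v_{t-1})$; the data-generating process and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

N, T = 500, 5                          # hypothetical panel: 500 units, 5 periods
u = rng.normal(size=(N, 1))            # unit effects u_i
v = rng.normal(size=(1, T))            # common time effects v_t
x = u + v + rng.normal(size=(N, T))    # x is correlated with both u_i and v_t
beta_true = 2.0                        # assumed true coefficient
y = beta_true * x + u + v + rng.normal(scale=0.5, size=(N, T))

# Pooled OLS in levels: biased because x is correlated with u_i and v_t
X_lev = np.column_stack([np.ones(N * T), x.ravel()])
beta_pooled = np.linalg.lstsq(X_lev, y.ravel(), rcond=None)[0][1]

# First differences over time remove u_i
dx = np.diff(x, axis=1)                # shape (N, T-1)
dy = np.diff(y, axis=1)

# One dummy per differenced period absorbs (v_t - v_{t-1})
period = np.tile(np.arange(T - 1), N)  # matches the row-major order of dx.ravel()
period_dummies = (period[:, None] == np.arange(T - 1)[None, :]).astype(float)
X_fd = np.column_stack([dx.ravel(), period_dummies])
beta_fd = np.linalg.lstsq(X_fd, dy.ravel(), rcond=None)[0][0]

print(beta_pooled, beta_fd)
```

After differencing and controlling for the period effects, the coefficient is identified only from changes in x within units relative to the average change in that period.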

10
Dynamic panel data models
  • It would be useful to estimate a model with a
    lagged dependent variable, e.g. to estimate the
    dependence of y on past values (or control for
    mean reversion)
  • Q: Can the within-groups version of this model be
    estimated consistently by OLS?
  • See Nickell (1981), Econometrica
  • What about the first-differenced model?
  • Q: Is there a useful IV here?

11
Dynamic panel data models
  • In principle you could use lagged levels of y
    (dated t-2 and earlier) as instruments for the
    lagged first difference of y
  • This is the basis of the Arellano-Bond estimator
    (1991, Review of Economic Studies)
  • They develop a GMM estimator which weights the
    instruments, taking into account the
    first-differenced error structure, e.g.
    implemented in xtabond in Stata
  • Problems: serial correlation in error terms? Also,
    if the coefficient on lagged y is close to zero
    the instruments will be very weak (since lagged
    values don't predict current values if the
    coefficient is zero)
  • Can also use lagged first differences of y as
    instruments for lagged levels of y
  • System GMM (Blundell and Bond 1998): xtabond2
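
The inconsistency of OLS in dynamic panels, and the idea of instrumenting with lagged levels, can be illustrated by simulation. This sketch uses a simple just-identified IV estimator (instrumenting the lagged difference with the level dated t-2), not the Arellano-Bond GMM estimator or xtabond itself; the sample sizes, the value of the lag coefficient and all names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

N, T, burn = 5000, 7, 50               # hypothetical sizes; burn-in to reach stationarity
rho_true = 0.5                         # assumed coefficient on lagged y
u = rng.normal(size=N)                 # unit effects

y = np.zeros((N, T + burn))
y[:, 0] = u / (1 - rho_true) + rng.normal(size=N)
for t in range(1, T + burn):
    y[:, t] = rho_true * y[:, t - 1] + u + rng.normal(size=N)
y = y[:, burn:]                        # keep the last T periods

# Within-groups OLS of y_t on y_{t-1}: demean both within each unit
y_cur, y_lag = y[:, 1:], y[:, :-1]
yc = y_cur - y_cur.mean(axis=1, keepdims=True)
yl = y_lag - y_lag.mean(axis=1, keepdims=True)
rho_within = (yl * yc).sum() / (yl * yl).sum()

# First-differenced model: dy_t = rho * dy_{t-1} + de_t
dy = np.diff(y, axis=1)                # dy[:, k] = y_{k+1} - y_k
dy_cur = dy[:, 1:]                     # dy_t
dy_lag = dy[:, :-1]                    # dy_{t-1}, correlated with de_t through e_{t-1}
z = y[:, :-2]                          # y_{t-2}, uncorrelated with de_t

# OLS on first differences is also inconsistent
rho_fd_ols = (dy_lag * dy_cur).sum() / (dy_lag * dy_lag).sum()

# Just-identified IV: instrument dy_{t-1} with y_{t-2}
rho_iv = (z * dy_cur).sum() / (z * dy_lag).sum()

print(rho_within, rho_fd_ols, rho_iv)
```

With a short panel, both the within-groups and the first-difference OLS estimates fall well below the assumed coefficient, while the IV estimate is close to it in a sample of this size.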

12
Spatial panel data models
  • These look attractive, e.g. to eliminate sorting,
    i.e. $u_i$
  • But this still suffers from the simultaneity
    problems of the spatial y model: requires
    maximum likelihood or instruments for the
    spatial lag of y
  • Also difficult to defend that there is spatial
    correlation, but no time dynamics
  • So you have to estimate a model with both a time
    lag and a spatial lag of y
  • Have to deal with time-dynamic y and spatial y!

13
Spatial panel data models
  • Probably more useful to consider the reduced form
    e.g.

14
Difference in difference
15
Difference-in-difference
  • Suppose we have places, firms or individuals i
    observed over time
  • Treatment group (D = 1) is exposed to some
    treatment, x = 1 (otherwise x = 0), at time t = 1,
    whereas a control group (D = 0) is not
  • There is selection into the treatment group
    ($E[f \mid D] \neq 0$) and common time effects g

16
Difference-in-difference
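  • The effect of the treatment can be estimated by a
    difference-in-difference estimator: the change in
    mean y for the treatment group minus the change in
    mean y for the control group
  • Note that this is the same as you'd get from OLS
    on a regression of y on a treatment-group dummy, a
    post-treatment dummy and their interaction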
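
A minimal sketch of this equivalence, assuming the standard two-group, two-period setup and a regression of y on a group dummy, a post-period dummy and their interaction. The simulated data, parameter values and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

n = 400                                  # hypothetical number of units
D = (rng.random(n) < 0.5).astype(float)  # treatment-group indicator
f = rng.normal(size=n) + 1.0 * D         # unit effect, correlated with D (selection)
g = 0.7                                  # common time effect (assumed value)
effect = 2.0                             # true treatment effect (assumed value)

y0 = f + rng.normal(scale=0.5, size=n)                   # period t = 0
y1 = f + g + effect * D + rng.normal(scale=0.5, size=n)  # period t = 1

# Difference in differences of group means
did = (y1[D == 1].mean() - y0[D == 1].mean()) - (y1[D == 0].mean() - y0[D == 0].mean())

# Same number from OLS on stacked data with group, period and interaction dummies
y = np.concatenate([y0, y1])
group = np.concatenate([D, D])
post = np.concatenate([np.zeros(n), np.ones(n)])
X = np.column_stack([np.ones(2 * n), group, post, group * post])
coef = np.linalg.lstsq(X, y, rcond=None)[0]
did_ols = coef[3]                        # coefficient on the interaction term

# Naive post-period comparison also picks up the selection term
naive = y1[D == 1].mean() - y1[D == 0].mean()

print(did, did_ols, naive)
```

The interaction coefficient reproduces the difference in differences exactly because the regression is saturated in the four group-by-period cells.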

17
Difference-in-difference
  • The DiD estimator is commonly used for evaluation
    of policy interventions
  • DiD doesn't work if the treatment and control
    groups have different time trends
  • Or if the composition of the treatment or control
    groups changes before and after treatment

18
Matching
19
Matching estimators
  • Matching tries to do something similar, when
    treatment and control groups are not both observed
    pre and post policy
  • Suppose we observe two groups: treated (D = 1) and
    untreated (D = 0)
  • Suppose the goal is to estimate the Average
    effect of the Treatment on the Treated (ATT):
    $E[Y_1 - Y_0 \mid D = 1]$
  • As we know, a simple difference in means won't
    work
  • i.e. because the treated and non-treated would
    have different Y in the absence of treatment

20
Matching estimators
  • But suppose we have some observable
    characteristics Z for which
    $E[Y_0 \mid Z, D = 1] = E[Y_0 \mid Z, D = 0]$
  • i.e. mean pre-treatment Y for individuals with
    characteristics Z is the same, whether or not
    they are in the treatment group
  • Called the Conditional Independence Assumption
    (CIA)
  • Allows for selection into treated and non-treated
    groups by Z (selection on observables), but not
    by unobservables
  • So if you can find individuals in group 0 who
    have the same Z as those in group 1 you can
    estimate $E[Y_0 \mid Z, D = 1]$ from the
    individuals in group 0
  • If Z is discrete this is straightforward

21
Matching estimators
  • So we can estimate
  • The naïve estimate of the effect of the treatment
    is 190 - 125 = 65

22
Matching estimators
  • For the treated, $Y_0$ is unobserved but can be
    estimated by re-weighting (under the CIA)
  • So the ATT is 190 - 180 = 10
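
The re-weighting step can be sketched with a discrete Z. The cell counts and mean outcomes below are invented for illustration and are not the figures behind the numbers on the slides; the point is only the mechanics of weighting untreated cell means by the treated group's distribution of Z.

```python
# Hypothetical cell counts and mean outcomes by discrete Z (illustrative only)
treated = {"low": (20, 100.0), "high": (80, 200.0)}    # Z: (n, mean Y)
untreated = {"low": (80, 95.0), "high": (20, 190.0)}

def overall_mean(cells):
    """Mean of Y over all observations, from cell counts and cell means."""
    total_n = sum(n for n, _ in cells.values())
    return sum(n * m for n, m in cells.values()) / total_n

mean_treated = overall_mean(treated)
naive = mean_treated - overall_mean(untreated)

# Counterfactual mean Y0 for the treated: untreated cell means,
# weighted by the treated group's distribution of Z
n_treated = sum(n for n, _ in treated.values())
y0_treated = sum(n * untreated[z][1] for z, (n, _) in treated.items()) / n_treated

att = mean_treated - y0_treated
print(naive, att)
```

In this invented example the naive difference in means is 66, while the re-weighted ATT is 9, because the treated are concentrated in the high-Z cell where outcomes are higher regardless of treatment.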

23
Matching estimators
  • But what if (as is usual) Z is not discrete?
    Propensity score matching does this re-weighting
    using an estimate of the probability that an
    individual with characteristics Z is in the
    treatment group
  • (Rosenbaum and Rubin (1983), Biometrika)
  • Requires a first-stage estimate of Pr(D = 1 | Z),
    e.g. from a probit or logit regression on Z
  • Then the treatment effect for an individual i in
    the treated group can be estimated as
    $Y_i - \sum_j w_{ij} Y_j$
  • where the weights $w_{ij}$ depend on the difference
    between the propensity score for individual i and
    the untreated controls j, and sum to one over j
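
A minimal sketch of propensity score matching with kernel weights, on simulated data where selection depends only on Z so that the CIA holds by construction. The logit is fitted with a few Newton-Raphson steps written out by hand; the Gaussian kernel, the bandwidth `h` and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

n = 4000                                         # hypothetical sample size
z = rng.normal(size=n)                           # observable characteristic Z
p_true = 1.0 / (1.0 + np.exp(-0.8 * z))
d = (rng.random(n) < p_true).astype(float)       # selection on Z only
effect = 2.0                                     # true treatment effect (assumed)
y = 1.0 + z + effect * d + rng.normal(scale=0.5, size=n)

# First stage: logit of D on Z, fitted by Newton-Raphson
X = np.column_stack([np.ones(n), z])
b = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-X @ b))
    grad = X.T @ (d - p)
    hess = (X * (p * (1.0 - p))[:, None]).T @ X
    b = b + np.linalg.solve(hess, grad)
pscore = 1.0 / (1.0 + np.exp(-X @ b))            # estimated Pr(D = 1 | Z)

# Kernel weights: controls with a propensity score close to that of
# treated unit i get more weight; each row of weights sums to one
treated, control = d == 1, d == 0
h = 0.02                                         # bandwidth (assumed)
gap = (pscore[treated][:, None] - pscore[control][None, :]) / h
w = np.exp(-0.5 * gap ** 2)
w = w / w.sum(axis=1, keepdims=True)

y0_hat = w @ y[control]                          # estimated Y0 for each treated unit
att = (y[treated] - y0_hat).mean()
naive = y[treated].mean() - y[control].mean()

print(naive, att)
```

The naive difference in means overstates the assumed effect because the treated have higher Z on average; the matched estimate is close to the assumed effect, but only because selection here is on observables.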

24
Matching estimators
  • In practice, matching estimators behave like
    kitchen sink regressions: you are just
    controlling for as many observable
    characteristics as possible (Z)
  • However, you are controlling for these Z in a
    very non-linear way, like having lots of control
    variables and their interactions in an OLS
    regression
  • Matching estimators allow for heterogeneous
    treatment effects
  • You can re-weight in other ways, e.g. to estimate
    the effect of the treatment on the population, or
    on the untreated
  • No solution to selection on unobservables, which
    is surely the main issue!
  • Requires common support: if there is no overlap
    between Z in the treated and untreated groups,
    you can't match

25
Discontinuity designs
26
Discontinuity designs
  • Regression discontinuity method tries to identify
    causal effects from abrupt changes
  • Requires a discontinuity induced by institutional
    rules, policy etc.
  • e.g. majority voting
  • Class size rules, e.g. Maimonides' rule
  • Geographical administrative boundaries
  • Assumption is that assignment to treatment is
    determined by some covariate X when it reaches a
    value d
  • The outcome is otherwise only related to X by a
    smooth function, e.g. $E[y \mid X] = m(X)$

27
Discontinuity designs
28
Discontinuity designs
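  • So the treatment effect is the jump in
    $E[y \mid x]$ at the discontinuity
  • Idea is to estimate the average effect of the
    treatment at the discontinuity point
  • We could control for m(x) parametrically
    (polynomial series etc.)
  • Or restrict the sample to observations for which
    x is close to c, i.e. within a narrow window
    around the discontinuity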
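
Both approaches can be sketched on simulated data. The cutoff `c`, the form of m(x), the window width `h` and the size of the jump are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

n, c = 5000, 0.0                        # hypothetical sample size and cutoff
x = rng.uniform(-1.0, 1.0, size=n)      # assignment covariate
treat = (x >= c).astype(float)          # treatment switches on at the cutoff
jump = 1.0                              # true effect at the cutoff (assumed)
m = 1.0 + 0.5 * x + 0.3 * x ** 2        # smooth function m(x) (assumed form)
y = m + jump * treat + rng.normal(scale=0.3, size=n)

# (a) Parametric control for m(x): quadratic in (x - c) plus a treatment dummy
xc = x - c
X_poly = np.column_stack([np.ones(n), treat, xc, xc ** 2])
jump_poly = np.linalg.lstsq(X_poly, y, rcond=None)[0][1]

# (b) Restrict to a window around c, with a linear trend on each side
h = 0.2                                 # window half-width (assumed)
near = np.abs(xc) < h
X_loc = np.column_stack([
    np.ones(near.sum()), treat[near], xc[near], xc[near] * treat[near]
])
jump_local = np.linalg.lstsq(X_loc, y[near], rcond=None)[0][1]

# (c) Raw difference in means inside the window, ignoring m(x)
diff_means = y[near & (treat == 1)].mean() - y[near & (treat == 0)].mean()

print(jump_poly, jump_local, diff_means)
```

Narrowing the window reduces the bias from ignoring m(x) in the raw comparison, at the cost of fewer observations and noisier estimates.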

29
Boundary discontinuities
[Figure: prices and homeowner characteristics on either side of the boundary between school district A and school district B; +ve quality-price relationship across the boundary; unobserved local amenity]
30
Discontinuity designs
  • In principle, X is identical for treatment and
    controls exactly at the discontinuity
  • But practical applications require non-zero
    differences between X and discontinuity
  • E.g. can rarely find a large enough sample of
    housing transactions exactly on the boundary
  • Trade off between adequate sample size and
    elimination of biases due to m(x)
  • We looked at practical spatial examples, e.g.
    Black (1999), Duranton et al (2006)
  • See also Gibbons, S., Machin, S. and Silva, O.
    (2009), Valuing School Quality Using Boundary
    Discontinuity Regressions, SERC DP0018
    http://www.spatialeconomics.ac.uk/textonly/SERC/publications/download/sercdp0018.pdf

31
Applications to spatial policy evaluation
  • Research designs can incorporate elements of all
    these methods, e.g. match treatment and control
    groups using propensity score matching, then
    implement diff-in-diff
  • Machin, S., McNally, S., Meghir, C. (2007),
    Resources and Standards in Urban Schools, IZA
    DP2653 http://ftp.iza.org/dp2653.pdf
  • Busso, M. and P. Kline (2006) Do Local Economic
    Development Programs Work? Evidence from Federal
    Empowerment Zone Program,
    http://www.econ.berkeley.edu/pkline/papers/Busso-Kline20EZ20(web).pdf
  • Romero, R. and M. Noble (2008) Evaluating
    England's New Deal for Communities Programme
    Using the Difference in Difference method,
    Journal of Economic Geography 8(6) 1-20

32
The partial linear model
33
Continuous space
  • A general model with spatial heterogeneity:
    $y_i = x_i'\beta + m(s_i) + \varepsilon_i$
  • $s_i$ is an index of the location of observation i
  • Model continuous unobserved variation over space
  • m(.) is supposed to represent large-scale
    predictable variation over space, e.g. land
    values
  • $\varepsilon_i$: random shocks, e.g. sales price of
    specific houses
  • We discussed these issues in the lecture on
    smoothing
  • Could do it parametrically, e.g. polynomial series
    or Cheshire and Sheppard (1995); see earlier
    lectures

34
Partial linear model
  • Suppose
    $y_i = x_i'\beta + m(s_{1i}, s_{2i}) + \varepsilon_i$
  • If we know $\beta$, function m(.) is just the
    expected (mean) value of $y - x'\beta$ given the
    location $s_1, s_2$
  • Refer to the lecture on smoothing: this can be
    inferred from values of y in neighbouring
    locations once we know $\beta$
  • Spatial weighting again
  • Kernel weighting, nearest neighbours etc.

35
Semi-parametric spatial models
  • Must get estimates of beta first. How?
  • e.g. see Robinson (1988), Econometrica,
    Root-N-Consistent Semiparametric Regression
  • Estimate averages of y and all x at each point in
    the data, non-parametrically
  • Estimate the betas by OLS on
    $y_i - \hat{E}[y \mid s_i] = (x_i - \hat{E}[x \mid s_i])'\beta + \varepsilon_i$
  • Note analogy to the within-groups model
  • Can then estimate m(.) by smoothing
    $y - x'\hat{\beta}$ over locations
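
The steps above can be sketched as follows, using Gaussian kernel weights over two-dimensional locations. The spatial surface, the bandwidth `h`, the single regressor and all names are illustrative assumptions, and the smoother is a simple local average rather than any particular published implementation.

```python
import numpy as np

rng = np.random.default_rng(6)

n = 1500                                   # hypothetical sample size
s = rng.random((n, 2))                     # locations (s1, s2) in the unit square
# Smooth spatial surface m(s1, s2) (assumed form)
m = 2.0 * np.sin(2 * np.pi * s[:, 0]) + s[:, 1] ** 2
# Regressor that also varies smoothly over space, so it is correlated with m
x = np.sin(2 * np.pi * s[:, 0]) + np.cos(2 * np.pi * s[:, 1]) + rng.normal(size=n)
beta_true = 1.5                            # assumed true coefficient
y = beta_true * x + m + rng.normal(scale=0.5, size=n)

# Gaussian kernel weights between all pairs of locations, rows sum to one
h = 0.08                                   # bandwidth (assumed)
d2 = ((s[:, None, :] - s[None, :, :]) ** 2).sum(axis=2)
K = np.exp(-0.5 * d2 / h ** 2)
W = K / K.sum(axis=1, keepdims=True)

# Step 1: local averages of y and x at each data point
y_hat = W @ y
x_hat = W @ x

# Step 2: OLS of (y - local mean of y) on (x - local mean of x)
ry, rx = y - y_hat, x - x_hat
beta_hat = (rx @ ry) / (rx @ rx)

# Step 3: recover the spatial surface by smoothing y - x * beta_hat
m_hat = W @ (y - beta_hat * x)

# For comparison: OLS ignoring the spatial surface
X_ols = np.column_stack([np.ones(n), x])
beta_ols = np.linalg.lstsq(X_ols, y, rcond=None)[0][1]

print(beta_ols, beta_hat)
```

Subtracting local averages plays the same role as subtracting neighbourhood means in the within-groups model, with the kernel weights defining a neighbourhood around each observation.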

36
Applications to housing analysis
  • Clapp, J. M., H.-J. Kim, and A. E. Gelfand
    (2002) "Predicting Spatial Patterns of House
    Prices Using LPR and Bayesian Smoothing," Real
    Estate Economics, 30, (4), 505-532
  • Use of non-parametric methods to construct house
    price indices
  • Gibbons, S., and S. Machin (2003) "Valuing
    English Primary Schools," Journal of Urban
    Economics, 53, (2).
  • Use of the semi-parametric model for eliminating
    larger-scale neighbourhood effects on school
    performance

37
Conclusions
  • Underlying issue we have considered is selection
    or sorting, e.g. people, firms etc. of different
    types sort into different locations, and this can
    lead to biased estimates of causal relationships
  • Selection can be on unobservables, or observables
  • We considered various techniques for dealing with
    these problems
  • Other solutions (random assignment, IV) we have
    considered or will consider elsewhere