Differences-in-Differences and A Brief Introduction to Panel Data

Provided by: Sunt151
1
Differences-in-Differences and A Brief
Introduction to Panel Data
2
John Snow again
3
The Grand Experiment
  • Water supplied to households by competing private
    companies
  • Sometimes different companies supplied households
    in same street
  • In south London two main companies
  • Lambeth Company (water supply from Thames Ditton,
    22 miles upstream)
  • Southwark and Vauxhall Company (water supply from
    Thames)

4
In 1853/54 cholera outbreak
  • Death Rates per 10000 people by water company
  • Lambeth 10
  • Southwark and Vauxhall 150
  • Might be water but perhaps other factors
  • Snow compared death rates in 1849 epidemic
  • Lambeth 150
  • Southwark and Vauxhall 125
  • In 1852 Lambeth Company had changed supply from
    Hungerford Bridge to Thames Ditton

5
What would be good estimate of effect of clean
water?
6
This is basic idea of Differences-in-Differences
  • Have already seen idea of using differences to
    estimate causal effects
  • Treatment/control groups in experimental data
  • Often would like to find treatment and
    control group who can be assumed to be similar
    in every way except receipt of treatment
  • This may be very difficult to do

7
A Weaker Assumption is..
  • Assume that, in absence of treatment, difference
    between treatment and control group is
    constant over time
  • With this assumption can use observations on
    treatment and control group pre- and
    post-treatment to estimate causal effect
  • Idea
  • Difference pre-treatment is normal difference
  • Difference post-treatment is normal difference +
    causal effect
  • Difference-in-difference is causal effect
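
As a rough illustration of this idea, the death rates quoted on the cholera slide can be plugged straight into the difference-in-differences calculation. This is only a sketch in Python (not part of the original slides), treating Lambeth as the treatment group and the 1849 epidemic as the pre-treatment period.

```python
# Death rates per 10,000 people, copied from the slide on the two epidemics.
rates = {
    ("Lambeth", "1849"): 150,
    ("Lambeth", "1853/54"): 10,
    ("Southwark and Vauxhall", "1849"): 125,
    ("Southwark and Vauxhall", "1853/54"): 150,
}

treated, control = "Lambeth", "Southwark and Vauxhall"
pre, post = "1849", "1853/54"

# "Normal" difference between the groups before Lambeth changed its supply.
diff_pre = rates[(treated, pre)] - rates[(control, pre)]
# Difference afterwards: normal difference plus causal effect.
diff_post = rates[(treated, post)] - rates[(control, post)]

# Difference-in-differences estimate of the effect of cleaner water.
did = diff_post - diff_pre
print(diff_pre, diff_post, did)
```

Under the constant-difference assumption, the estimate is a fall of 165 deaths per 10,000, larger than the simple post-treatment difference of 140 because Lambeth was the worse-off group in 1849.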

8
A Graphical Representation
9
What is D-in-D estimate?
  • Standard differences estimator is AB
  • But normal difference estimated as CB
  • Hence D-in-D estimate is AC
  • Note assumes trends in outcome variables the
    same for treatment and control groups
  • This is not testable
  • With two periods can get no idea of plausibility
    but can with more periods

10
Some Notation
  • Define
  • $\mu_{it} = E(y_{it})$
  • Where $i=0$ is control group, $i=1$ is treatment
  • Where $t=0$ is pre-period, $t=1$ is post-period
  • Standard differences estimate of causal effect
    is estimate of
  • $\mu_{11} - \mu_{01}$
  • Differences-in-Differences estimate of causal
    effect is estimate of
  • $(\mu_{11} - \mu_{01}) - (\mu_{10} - \mu_{00})$

11
How to estimate?
  • Can write D-in-D estimate as
  • $(\mu_{11} - \mu_{10}) - (\mu_{01} - \mu_{00})$
  • This is simply the difference in the change of
    treatment and control groups so can estimate as

12
  • This is simply differences estimator applied to
    the difference
  • To implement this need to have repeat
    observations on the same individuals
  • May not have this: individuals observed pre- and
    post-treatment may be different
  • What can we do in this case?

13
In this case can estimate.
  • D-in-D estimate is estimate of $\beta_3$: why is this?
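
The regression equations on these slides are not in the transcript. One standard way to write them, consistent with the surrounding text, is sketched here. The treatment-group dummy $X_i$, the post-period dummy $T_t$ and the coefficients $\delta_0, \delta_1$ are assumed names, not taken from the slides.

```latex
% Repeated observations on the same individuals:
% differences estimator applied to the change in y
\Delta y_i = y_{i1} - y_{i0} = \delta_0 + \delta_1 X_i + \epsilon_i

% Different individuals pre- and post-treatment:
% levels regression with an interaction term
y_{it} = \beta_0 + \beta_1 X_i + \beta_2 T_t + \beta_3 X_i T_t + \epsilon_{it}

% Implied cell means \mu_{it} (first index = group, second = period)
\mu_{00} = \beta_0, \quad
\mu_{10} = \beta_0 + \beta_1, \quad
\mu_{01} = \beta_0 + \beta_2, \quad
\mu_{11} = \beta_0 + \beta_1 + \beta_2 + \beta_3

% Hence the D-in-D parameter is the interaction coefficient
(\mu_{11} - \mu_{01}) - (\mu_{10} - \mu_{00})
  = (\beta_1 + \beta_3) - \beta_1 = \beta_3
```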

14
A Comparison of the Two Methods
  • Where have repeated observations could use both
    methods
  • Will give same parameter estimates
  • But will give different standard errors
  • Levels version will assume residuals are
    independent: unlikely to be a good assumption
  • Can deal with this by
  • Clustering
  • Or estimating differences version
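
The claim that both methods give the same parameter estimate can be checked on simulated data. The sketch below uses Python with NumPy rather than the Stata used elsewhere in these slides. The data-generating process, the variable names and the true effect of 2 are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                          # individuals seen pre and post
treat = (np.arange(n) < n // 2).astype(float)    # hypothetical treatment dummy
alpha = rng.normal(size=n)                       # individual effect: correlates residuals
y0 = 1.0 + 0.5 * treat + alpha + rng.normal(size=n)                # pre-period
y1 = 1.3 + 0.5 * treat + 2.0 * treat + alpha + rng.normal(size=n)  # post-period

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Differences version: regress the change in y on the treatment dummy.
b_diff = ols(np.column_stack([np.ones(n), treat]), y1 - y0)

# Levels version: stack both periods, add post dummy and interaction.
y = np.concatenate([y0, y1])
x = np.concatenate([treat, treat])
post = np.concatenate([np.zeros(n), np.ones(n)])
b_lev = ols(np.column_stack([np.ones(2 * n), x, post, x * post]), y)

# The two D-in-D estimates agree up to floating-point error.
print(b_diff[1], b_lev[3])
```

The point estimates coincide. Only the standard errors (not computed here) differ, because the levels version ignores the shared individual effect unless the errors are clustered.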

15
Other Regressors
  • Can put in other regressors as before
  • Perhaps should think about way in which they
    enter the estimating equation
  • E.g. if level of W affects level of y then should
    include $\Delta W$ in differences version

16
Differential Trends in Treatment and Control
Groups
  • Key assumption underlying validity of D-in-D
    estimate is that differences between treatment
    and control group would have remained constant in
    absence of treatment
  • Can never test this
  • With only two periods can get no idea of
    plausibility
  • But can with more than two periods
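
One simple way to use extra periods is a placebo comparison: compute the difference-in-differences between two pre-treatment periods, where the true effect must be zero. The sketch below is in Python and uses entirely hypothetical group means.

```python
# Hypothetical group means: two pre-treatment periods and one post period.
means = {
    "treatment": {"t-2": 10.0, "t-1": 11.0, "t+1": 15.0},
    "control":   {"t-2":  8.0, "t-1":  9.0, "t+1": 10.0},
}

def did(a, b):
    """Change from period a to b for treatment minus the same change for control."""
    change_treat = means["treatment"][b] - means["treatment"][a]
    change_control = means["control"][b] - means["control"][a]
    return change_treat - change_control

placebo = did("t-2", "t-1")   # no treatment in either period
effect = did("t-1", "t+1")    # the actual D-in-D estimate
print(placebo, effect)
```

A placebo estimate far from zero would suggest the groups were already on different trends. A placebo near zero makes the assumption more plausible but does not prove it.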

17
An Example: Vertical Relationships and
Competition in Retail Gasoline Markets, by
Justine Hastings, American Economic Review, 2004
  • Interested in effect of vertical integration on
    retail petrol prices
  • Investigates take-over in CA of independent
    Thrifty chain of petrol stations by ARCO (more
    integrated)
  • Defines treatment group as petrol stations which
    had a Thrifty within 1 mile
  • Control group those that did not
  • Lots of reasons why these groups might be
    different so D-in-D approach seems a good idea

18
This picture contains relevant information
  • Can see D-in-D estimate of 5c per gallon
  • Also can see trends before and after change very
    similar, so D-in-D assumption looks valid

19
A Case which does not look so good: Ashenfelter's
Dip
  • Interested in effect of government-sponsored
    training (MDTA) on earnings
  • Treatment group are those who received training
    in 1964
  • Control group are random sample of population as
    a whole

20
Earnings for period 1959-69
21
Things to Note..
  • Earnings for trainees very low in 1964 as they were
    training, not working, in that year: should ignore
    this year
  • Simple D-in-D approach would compare earnings in
    1965 with 1963
  • But earnings of trainees in 1963 seem to show a
    dip so D-in-D assumption probably not valid
  • Probably because those who enter training are
    those who had a bad shock (e.g. job loss)

22
Differences-in-Differences: Summary
  • A very useful and widespread approach
  • Validity does depend on assumption that trends
    would have been the same in absence of treatment
  • Can use other periods to see if this assumption
    is plausible or not
  • Uses 2 observations on same individual: most
    rudimentary form of panel data

23
A Brief Introduction to Panel Data
  • Panel Data has both time-series and cross-section
    dimension: N individuals over T periods
  • Will restrict attention to balanced panels: same
    number of observations on each individual
  • Whole books written about it, but basics can be
    understood very simply and are not very different
    from what we have seen before
  • Asymptotics typically done on large N, small T
  • Use $y_{it}$ to denote variable for individual i at
    time t

24
The Pooled Model
  • Can simply ignore panel nature of data and
    estimate
  • $y_{it} = \beta' x_{it} + \epsilon_{it}$
  • This will be consistent if $E(\epsilon_{it} \mid x_{it}) = 0$ or
    $\mathrm{plim}(X'\epsilon/N) = 0$
  • But computed standard errors will only be
    consistent if errors uncorrelated across
    observations
  • This is unlikely
  • Correlation between residuals of same individual
    in different time periods
  • Correlation between residuals of different
    individuals in same time period (aggregate
    shocks)

25
A More Plausible Model
  • Should recognise this as model with group-level
    dummies or residuals
  • Here, individual is a group

26
Three Models
  • Fixed Effects Model
  • Treats individual effect $\alpha_i$ as parameter to be
    estimated (like $\beta$)
  • Consistency does not require anything about
    correlation with $x_{it}$
  • Random Effects Model
  • Treats individual effect $\alpha_i$ as part of residual
    (like $\epsilon_{it}$)
  • Consistency does require no correlation between
    $\alpha_i$ and $x_{it}$
  • Between-Groups Model
  • Runs regression on averages for each individual

27
Proposition 5.2: The fixed effects estimator of $\beta$
will be consistent if
  • $E(\epsilon_{it} \mid x_{it}) = 0$
  • $\mathrm{Rank}(X, D) = N + K$
  • Proof: Simple application of what you should know
    about linear regression model

28
Intuition
  • First condition should be obvious: regressors
    uncorrelated with residuals
  • Second condition requires regressors to be of
    full rank
  • Main way in which this is likely to fail in fixed
    effects model is if some regressors vary only
    across individuals and not over time
  • Such a variable is perfectly multicollinear with
    individual fixed effect

29
Estimating the Fixed Effects Model
  • Can estimate by brute force - include separate
    dummy variable for every individual but may be
    a lot of them
  • Can also estimate in mean-deviation form

30
How does de-meaning work?
  • Can do simple OLS on de-meaned variables
  • STATA command is like
  • . xtreg y x, fe i(id)
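
De-meaning works because subtracting each individual's own time average removes anything constant within the individual, including the fixed effect. The sketch below is in Python with NumPy (a substitute for the Stata command above, on simulated data with made-up names). It checks that OLS on de-meaned data reproduces the brute-force dummy-variable estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 50, 4
ids = np.repeat(np.arange(N), T)            # individual identifier per row
alpha = rng.normal(size=N)                  # fixed effects
x = alpha[ids] + rng.normal(size=N * T)     # regressor correlated with the fixed effect
y = 2.0 * x + alpha[ids] + rng.normal(size=N * T)

# Brute force: one dummy per individual (no separate constant).
D = (ids[:, None] == np.arange(N)[None, :]).astype(float)
b_dummies = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)[0][0]

def demean(v):
    """Subtract each individual's own mean (balanced panel)."""
    means = np.bincount(ids, weights=v) / T
    return v - means[ids]

# Mean-deviation form: simple OLS on the de-meaned variables.
xd, yd = demean(x), demean(y)
b_within = (xd @ yd) / (xd @ xd)

print(b_dummies, b_within)   # equal up to floating-point error
```

The de-meaned regression avoids estimating N dummy coefficients. Naive OLS standard errors from it would need a degrees-of-freedom correction, which this sketch does not compute.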

31
Problems with fixed effect estimator
  • Only uses variation within individuals:
    sometimes called within-group estimator
  • This variation may be small part of total (so low
    precision) and more prone to measurement error
    (so more attenuation bias)
  • Cannot use it to estimate effect of regressor
    that is constant for an individual

32
Random Effects Estimator
  • Treats individual effect $\alpha_i$ as part of residual
    (like $\epsilon_{it}$)
  • Consistency does require no correlation between
    $\alpha_i$ and $x_{it}$
  • Should recognise as like model with clustered
    standard errors
  • But random effects estimator is feasible GLS
    estimator

33
More on RE Estimator
  • Will not describe how we compute $\hat{\Omega}$: see
    Wooldridge
  • STATA command
  • . xtreg y x, re i(id)

34
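Proposition 5.3: The random effects estimator of
$\beta$ will be consistent if
  • a. $E(\epsilon_{it} \mid x_{i1}, \dots, x_{it}, \dots, x_{iT}) = 0$
  • b. $E(\alpha_i \mid x_{i1}, \dots, x_{it}, \dots, x_{iT}) = 0$
  • c. $\mathrm{Rank}(X'\Omega^{-1}X) = k$
  • Proof: RE estimator a special case of the
    feasible GLS estimator so conditions for
    consistency are the same.
  • Error has two components ($\alpha_i$ and $\epsilon_{it}$) so
    need a. and b.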

35
Comments
  • Assumption about exogeneity of errors is stronger
    than for FE model: need to assume $\epsilon_{it}$
    uncorrelated with whole history of x; this is
    called strong exogeneity
  • Assumption about rank condition weaker than for
    FE model, e.g. can estimate effect of variables that
    are constant for a given individual

36
Another reason why may prefer RE to FE model
  • If exogeneity assumptions are satisfied RE
    estimate will be more efficient than FE estimator
  • Application of general principle that imposing
    true restriction on data leads to efficiency
    gain.

37
Another Useful Result
  • Can show that RE estimator can be thought of as
    an OLS regression of
  • On
  • Where
  • This is sometimes called quasi-time demeaning
  • See Wooldridge (ch10, pp286-7) if want to know
    more
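
The expressions on this slide are not in the transcript. The standard quasi-time-demeaning result is sketched here. The weight $\theta$ and the variance symbols $\sigma_\epsilon^2$, $\sigma_\alpha^2$ are assumed notation, not necessarily the slide's.

```latex
% RE as OLS of quasi-de-meaned y on quasi-de-meaned x
y_{it} - \theta \bar{y}_i
\quad \text{on} \quad
x_{it} - \theta \bar{x}_i

% where the weight depends on the two error variances and on T
\theta = 1 - \sqrt{\frac{\sigma_\epsilon^2}{\sigma_\epsilon^2 + T \sigma_\alpha^2}}
```

With $\theta = 0$ this is pooled OLS and with $\theta = 1$ it is the fixed effects (fully de-meaned) regression, so RE sits between the two.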

38
Between-Groups Estimator
  • This takes individual means and estimates the
    regression by OLS
  • Stata command is xtreg y x, be i(id)
  • Condition for consistency the same as for RE
    estimator
  • But BE estimator less efficient as does not
    exploit variation in regressors for a given
    individual
  • And cannot estimate variables like time trends
    whose average values do not vary across
    individuals
  • So why would anyone ever use it? Let's think
    about measurement error

39
Measurement Error in Panel Data Models
  • Assume true model is
  • Where x is one-dimensional
  • Assume $E(\epsilon_{it} \mid x_{i1}, \dots, x_{it}, \dots, x_{iT}) = 0$ and
    $E(\alpha_i \mid x_{i1}, \dots, x_{it}, \dots, x_{iT}) = 0$ so that RE and BE
    estimators are consistent

40
Measurement Error Model
  • Assume
  • where $u_{it}$ is classical measurement error, $x_i$ is
    average value of $x$ for individual $i$ and $\eta_{it}$ is
    variation around the true value, which is assumed
    to be uncorrelated with $x_i$ and $u_{it}$ and iid.
  • We know this measurement error is likely to cause
    attenuation bias but this will vary between FE,
    RE and BE estimators.

41
Proposition 5.4
  • For FE model we have
  • For BE model we have
  • For RE model we have
  • Where
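
The formulas of Proposition 5.4 are not in this transcript. As a hedged reconstruction (not the slide's own notation), suppose the true model is $y_{it} = \beta x^*_{it} + \alpha_i + \epsilon_{it}$, the observed regressor is $x_{it} = x^*_{it} + u_{it}$ with $x^*_{it} = x^*_i + \eta_{it}$, and write $\sigma_x^2$, $\sigma_\eta^2$, $\sigma_u^2$ for the variances of $x^*_i$, $\eta_{it}$ and $u_{it}$. All of these symbols, and $\theta$ for the quasi-de-meaning weight, are assumptions. Applying the standard attenuation formula to each transformed regression would then give:

```latex
% FE: only within-individual variation is used
\mathrm{plim}\, \hat{\beta}_{FE}
  = \beta \, \frac{\sigma_\eta^2}{\sigma_\eta^2 + \sigma_u^2}

% BE: regression on individual means
\mathrm{plim}\, \hat{\beta}_{BE}
  = \beta \, \frac{T \sigma_x^2 + \sigma_\eta^2}
                  {T \sigma_x^2 + \sigma_\eta^2 + \sigma_u^2}

% RE: regression on quasi-de-meaned data, treating \theta as given
\mathrm{plim}\, \hat{\beta}_{RE}
  = \beta \, \frac{c \, \sigma_x^2 + \sigma_\eta^2}
                  {c \, \sigma_x^2 + \sigma_\eta^2 + \sigma_u^2},
\qquad
c = \frac{T (1 - \theta)^2}{T - \theta (2 - \theta)}
```

In this form FE corresponds to $c = 0$ and BE to $c = T$, and $c < T$ whenever $T > 1$, which matches the ordering of the biases described on the later slides.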

42
Proof
  • General idea is to write each model as an OLS
    estimator and then use what we know about
    attenuation bias in that model
  • Will use those earlier results

43
Proof for FE Model
  • Can write as OLS estimator on de-meaned data
  • We have that
  • And that

44
  • De-meaning we have that
  • And that
  • Take variances

45
  • Standard formula for attenuation bias gives us

46
Proof for BE estimator
  • Using earlier results we have

47
Proof for RE estimator (a bit more complicated)
  • Use result that can write it as OLS on
    quasi-de-meaned data
  • Attenuation bias can be written as

48
  • Can write these elements as
  • Leading to attenuation bias of
  • Where

49
What should we learn from this?
  • All rather complicated: don't worry too much
    about details
  • But intuition is simple
  • Attenuation bias largest for FE estimator:
    Var(x) does not appear in denominator, as FE
    estimator does not use this variation in data

50
  • Attenuation bias larger for RE than BE estimator
    as $T > 1 > \theta$
  • The averaging in the BE estimator reduces the
    importance of measurement error.
  • Important to note that these results are
    dependent on the particular assumption about the
    measurement error process and the nature of the
    variation in $x_{it}$: things would be very different
    if measurement error for a given individual did
    not vary over time
  • But general point is that measurement error
    considerations could affect choice of model to
    estimate with panel data
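
The ordering of the attenuation biases can be illustrated by simulation. The sketch below is in Python with NumPy. It assumes a true slope of 1, unit variances for every component, and measurement error that is independent across periods. All names and parameter values are made up. Pooled OLS is shown as a reference point; the RE estimator itself is not implemented.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta = 20000, 4, 1.0
s_x, s_eta, s_u = 1.0, 1.0, 1.0   # assumed standard deviations

x_i = rng.normal(scale=s_x, size=(N, 1))              # individual average of true x
x_true = x_i + rng.normal(scale=s_eta, size=(N, T))   # true x varies around it
x_obs = x_true + rng.normal(scale=s_u, size=(N, T))   # classical measurement error
alpha = rng.normal(size=(N, 1))                       # individual effect, uncorrelated with x
y = beta * x_true + alpha + rng.normal(size=(N, T))

def slope(x, y):
    """OLS slope from a regression of y on x with a constant."""
    x, y = x - x.mean(), y - y.mean()
    return float((x * y).sum() / (x * x).sum())

# FE: de-mean within individual. BE: regress individual means on means.
b_fe = slope(x_obs - x_obs.mean(axis=1, keepdims=True),
             y - y.mean(axis=1, keepdims=True))
b_be = slope(x_obs.mean(axis=1), y.mean(axis=1))
b_pooled = slope(x_obs, y)

print(round(b_fe, 2), round(b_pooled, 2), round(b_be, 2))
```

With these settings the FE slope should come out near one half of the true value and the BE slope near five sixths, with pooled OLS in between: averaging over T periods shrinks the measurement error relative to the signal, while de-meaning removes the between-individual signal altogether.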

51
Time Effects
  • Have treated time and individual dimensions
    asymmetrically: no good reason for this
  • Errors likely to be correlated for different
    individuals in same time period: most common way
    to deal with this is to include set of time
    dummies

52
Estimating Fixed Effects Model in Differences
  • Can also get rid of fixed effect by differencing

53
Comparison of two methods
  • Estimate parameters by OLS on differenced data
  • If only 2 observations then get same estimates as
    de-meaning method
  • But standard errors different
  • Why? Different assumptions about autocorrelation in
    residuals
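
The equality of the two estimates when there are only two observations per individual can be checked directly. The sketch below uses Python with NumPy on simulated data. The names and the true slope of 1.5 are made up, and neither regression includes a constant or time dummies.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 500
alpha = rng.normal(size=N)                       # fixed effects
x = alpha[:, None] + rng.normal(size=(N, 2))     # two periods per individual
y = 1.5 * x + alpha[:, None] + rng.normal(size=(N, 2))

# Differencing: regress (y2 - y1) on (x2 - x1).
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
b_fd = (dx @ dy) / (dx @ dx)

# De-meaning: subtract individual means, then pooled OLS.
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
b_fe = (xd @ yd) / (xd @ xd)

print(b_fd, b_fe)   # equal up to floating-point error
```

With T = 2 each de-meaned observation is plus or minus half the first difference, so both the numerator and the denominator of the slope are scaled by the same factor. With more than two periods the two estimates generally differ.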

54
What are these assumptions?
  • For de-meaned model
  • For differenced model
  • These are not consistent with each other
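
The two assumptions are not in the transcript. A common way to state them, offered here as a sketch and not as the slide's exact wording, is that the usual standard errors are valid for the de-meaned model when the residuals are serially uncorrelated in levels, and for the differenced model when they are serially uncorrelated in first differences:

```latex
% De-meaned model: levels of the residual serially uncorrelated
\mathrm{Cov}(\epsilon_{it}, \epsilon_{is}) = 0 \quad (t \neq s),
\qquad \mathrm{Var}(\epsilon_{it}) = \sigma^2

% Differenced model: changes in the residual serially uncorrelated
\mathrm{Cov}(\Delta\epsilon_{it}, \Delta\epsilon_{is}) = 0 \quad (t \neq s)

% But under the first assumption adjacent differences share a term
\mathrm{Cov}(\Delta\epsilon_{it}, \Delta\epsilon_{i,t-1})
  = \mathrm{Cov}(\epsilon_{it} - \epsilon_{i,t-1},\; \epsilon_{i,t-1} - \epsilon_{i,t-2})
  = -\sigma^2 \neq 0
```

So if the level residuals are uncorrelated, the differenced residuals are negatively correlated, and both assumptions cannot hold at once.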

55
This leads to time series
  • Which is better depends on which assumption is
    right: how can we decide this?
  • Allow cross-section dimension to wither
  • Focus on case where observations are a time
    series for a single unit, e.g. a whole economy.