Differences-in-Differences and A Brief Introduction to Panel Data


Slides: 49

Provided by: sunt6


1
Differences-in-Differences and A Brief
Introduction to Panel Data
2
John Snow again
3
The Grand Experiment

Water supplied to households by competing private
companies
Sometimes different companies supplied households
in same street
In south London two main companies
Lambeth Company (water supply from Thames Ditton,
22 miles upstream)
Southwark and Vauxhall Company (water supply from
Thames)

4
In 1853/54 cholera outbreak

Death Rates per 10000 people by water company
Lambeth 10
Southwark and Vauxhall 150
Might be water but perhaps other factors
Snow compared death rates in 1849 epidemic
Lambeth 150
Southwark and Vauxhall 125
In 1852 Lambeth Company had changed supply from
Hungerford Bridge to Thames Ditton

5
What would be good estimate of effect of clean
water?
6
This is basic idea of Differences-in-Differences

Have already seen idea of using differences to
estimate causal effects
Treatment/control groups in experimental data
Often would like to find treatment and
control group who can be assumed to be similar
in every way except receipt of treatment
This may be very difficult to do

7
A Weaker Assumption is..

Assume that, in absence of treatment, difference
between treatment and control group is
constant over time
With this assumption can use observations on
treatment and control group pre- and
post-treatment to estimate causal effect
Idea:
Difference pre-treatment is "normal" difference
Difference post-treatment is "normal" difference
+ causal effect
Difference-in-difference is causal effect
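The arithmetic behind this idea can be sketched with the death rates quoted for Snow's comparison. This is only an illustration: it treats Lambeth as the treatment group and Southwark and Vauxhall as the control group, and the function name `diff_in_diff` and the dictionary layout are assumptions made for this sketch.

```python
# Cholera death rates per 10,000 people, as quoted in the slides above.
rates = {
    ("Lambeth", "1849"): 150,
    ("Southwark and Vauxhall", "1849"): 125,
    ("Lambeth", "1853/54"): 10,
    ("Southwark and Vauxhall", "1853/54"): 150,
}


def diff_in_diff(rates, treated, control, pre, post):
    """Return (post-period difference, pre-period difference, D-in-D)."""
    post_diff = rates[(treated, post)] - rates[(control, post)]
    pre_diff = rates[(treated, pre)] - rates[(control, pre)]
    return post_diff, pre_diff, post_diff - pre_diff


post_diff, pre_diff, estimate = diff_in_diff(
    rates, "Lambeth", "Southwark and Vauxhall", "1849", "1853/54"
)
print("Difference post-treatment:", post_diff)   # -140
print("Difference pre-treatment:", pre_diff)     # 25
print("Difference-in-difference:", estimate)     # -165
```

The simple post-treatment comparison gives a fall of 140 deaths per 10,000, while the difference-in-difference gives 165, because Lambeth customers had the higher death rate before the supply changed.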

8
A Graphical Representation
9
What is D-in-D estimate?

Standard differences estimator is AB
But normal difference estimated as CB
Hence D-in-D estimate is AC
Note: assumes trends in outcome variables are the
same for treatment and control groups
This is not testable:
with two periods can get no idea of plausibility,
but can with more periods

10
Some Notation

Define
$\mu_{it} = E(y_{it})$
where $i = 0$ is control group, $i = 1$ is treatment group
where $t = 0$ is pre-period, $t = 1$ is post-period
Standard differences estimate of causal effect
is estimate of
$\mu_{11} - \mu_{01}$
Differences-in-Differences estimate of causal
effect is estimate of
$(\mu_{11} - \mu_{01}) - (\mu_{10} - \mu_{00})$

11
How to estimate?

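Can write D-in-D estimate as
$(\mu_{11} - \mu_{10}) - (\mu_{01} - \mu_{00})$
This is simply the difference in the change of
treatment and control groups, so can estimate by
regressing the change in the outcome,
$\Delta y_i = y_{i1} - y_{i0}$, on a treatment dummy
This is simply the differences estimator applied to
the difference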
To implement this need to have repeat
observations on the same individuals
May not have this: individuals observed pre- and
post-treatment may be different
What can we do in this case?

13
In this case can estimate.

D-in-D estimate is estimate of $\beta_3$: why is this?
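The regression equation for this slide did not survive in the transcript. A minimal sketch of the argument, assuming the usual levels specification with a treatment-group dummy $X_i$, a post-period dummy $T_t$ and their interaction (these two names are assumptions; only $\beta_3$ is taken from the slide), is:

```latex
\begin{align*}
y_{it} &= \beta_0 + \beta_1 X_i + \beta_2 T_t + \beta_3 X_i T_t + \varepsilon_{it} \\
\mu_{00} &= \beta_0, \qquad \mu_{10} = \beta_0 + \beta_1 \\
\mu_{01} &= \beta_0 + \beta_2, \qquad \mu_{11} = \beta_0 + \beta_1 + \beta_2 + \beta_3 \\
(\mu_{11} - \mu_{01}) - (\mu_{10} - \mu_{00}) &= (\beta_1 + \beta_3) - \beta_1 = \beta_3
\end{align*}
```

Under that assumed specification the four group means are exactly the four combinations of coefficients, so the coefficient on the interaction is the difference-in-difference.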

14
A Comparison of the Two Methods

Where have repeated observations could use both
methods
Will give same parameter estimates
But will give different standard errors
Levels version will assume residuals are
independent: unlikely to be a good assumption
Can deal with this by
Clustering
Or estimating differences version
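The comparison can be illustrated on simulated data. The sketch below assumes a two-period panel with an individual effect that makes residuals correlated within a person; all variable names, parameter values and the helper `ols` are assumptions for the illustration, and the standard errors are the conventional ones that assume independent residuals.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # individuals, each observed pre and post
treated = rng.integers(0, 2, size=n)      # 1 if the individual is in the treatment group
alpha = rng.normal(size=n)                # individual effect shared by both periods
y_pre = 1.0 + 0.5 * treated + alpha + rng.normal(size=n)
y_post = 1.3 + 0.5 * treated + 2.0 * treated + alpha + rng.normal(size=n)


def ols(X, y):
    """OLS coefficients and conventional (independent-error) standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    return beta, np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))


# Differences version: regress the change in y on the treatment dummy.
b_diff, se_diff = ols(np.column_stack([np.ones(n), treated]), y_post - y_pre)

# Levels version: stack both periods and include the interaction term.
y = np.concatenate([y_pre, y_post])
group = np.concatenate([treated, treated])
post = np.concatenate([np.zeros(n), np.ones(n)])
b_lev, se_lev = ols(np.column_stack([np.ones(2 * n), group, post, group * post]), y)

print("D-in-D, differences version:", b_diff[1], "s.e.", se_diff[1])
print("D-in-D, levels version:     ", b_lev[3], "s.e.", se_lev[3])
```

The two point estimates coincide, while the conventional standard errors do not, because the levels version ignores the correlation between the two observations on each individual.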

15
Other Regressors

Can put in other regressors as before
Perhaps should think about way in which they
enter the estimating equation
E.g. if level of W affects level of y then should
include $\Delta W$ in differences version

16
Differential Trends in Treatment and Control
Groups

Key assumption underlying validity of D-in-D
estimate is that differences between treatment
and control group would have remained constant in
absence of treatment
Can never test this
With only two periods can get no idea of
plausibility
But can with more than two periods
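One informal way to use extra periods is a placebo comparison: compute the D-in-D between pairs of pre-treatment periods, where the estimate should be close to zero if the groups were moving in parallel. The group means below are hypothetical numbers invented for this sketch, as are the variable names.

```python
# Hypothetical group means of the outcome by period; treatment starts in period 3.
control = [10.0, 11.0, 12.0, 13.0]
treated = [12.0, 13.1, 13.9, 18.0]
treatment_period = 3

# Treatment-minus-control gap in each period.
gap = [t - c for t, c in zip(treated, control)]

# Placebo D-in-D: change in the gap between consecutive pre-treatment periods.
placebos = [gap[t] - gap[t - 1] for t in range(1, treatment_period)]

# Actual D-in-D: change in the gap from the last pre-period to the post-period.
did = gap[treatment_period] - gap[treatment_period - 1]

print("Placebo estimates:", [round(p, 2) for p in placebos])
print("D-in-D estimate:  ", round(did, 2))
```

Small placebo estimates do not prove the assumption, but large ones are a warning that the gap between the groups was already changing before treatment.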

17
An Example: "Vertical Relationships and
Competition in Retail Gasoline Markets", by
Justine Hastings, American Economic Review, 2004

Interested in effect of vertical integration on
retail petrol prices
Investigates take-over in CA of independent
Thrifty chain of petrol stations by ARCO (more
integrated)
Defines treatment group as petrol stations which
had a Thrifty within 1 mile
Control group those that did not
Lots of reasons why these groups might be
different so D-in-D approach seems a good idea

18
This picture contains relevant information

Can see D-in-D estimate of 5c per gallon
Also can see trends before and after change are very
similar, so D-in-D assumption looks plausible

19
A Case which does not look so good: Ashenfelter's
Dip

Interested in effect of government-sponsored
training (MDTA) on earnings
Treatment group are those who received training
in 1964
Control group are random sample of population as
a whole

20
Earnings for period 1959-69
21
Things to Note..

Earnings for trainees very low in 1964 as they were
training, not working, in that year: should ignore
this year
Simple D-in-D approach would compare earnings in
1965 with 1963
But earnings of trainees in 1963 seem to show a
dip so D-in-D assumption probably not valid
Probably because those who enter training are
those who had a bad shock (e.g. job loss)

22
Differences-in-Differences: Summary

A very useful and widespread approach
Validity does depend on assumption that trends
would have been the same in absence of treatment
Can use other periods to see if this assumption
is plausible or not
Uses 2 observations on same individual: the most
rudimentary form of panel data

23
A Brief Introduction to Panel Data

Panel Data has both time-series and cross-section
dimension: N individuals over T periods
Will restrict attention to balanced panels: same
number of observations on each individual
Whole books written about it, but basics can be
understood very simply and are not very different
from what we have seen before
Asymptotics typically done on large N, small T
Use $y_{it}$ to denote variable for individual i at
time t

24
The Pooled Model

Can simply ignore panel nature of data and
estimate
$y_{it} = \beta' x_{it} + \varepsilon_{it}$
This will be consistent if $E(\varepsilon_{it} \mid x_{it}) = 0$ or
$\operatorname{plim}(X'\varepsilon / N) = 0$
But computed standard errors will only be
consistent if errors uncorrelated across
observations
This is unlikely
Correlation between residuals of same individual
in different time periods
Correlation between residuals of different
individuals in same time period (aggregate
shocks)

25
A More Plausible Model

Should recognise this as model with group-level
dummies or residuals
Here, individual is a group

26
Three Models

Fixed Effects Model
Treats the individual effect $\alpha_i$ as parameter to be estimated (like $\beta$)
Consistency does not require anything about
correlation with $x_{it}$
Random Effects Model
Treats $\alpha_i$ as part of residual (like $\varepsilon$)
Consistency does require no correlation between
$\alpha_i$ and $x_{it}$
Between-Groups Model
Runs regression on averages for each individual

27
Proposition 5.2: The fixed effect estimator of $\beta$
will be consistent if

$E(\varepsilon_{it} \mid x_{it}) = 0$
$\mathrm{Rank}(X, D) = N + K$
Proof: Simple application of what you should know
about linear regression model

28
Intuition

First condition should be obvious: regressors
uncorrelated with residuals
Second condition requires regressors to be of
full rank
Main way in which this is likely to fail in fixed
effects model is if some regressors vary only
across individuals and not over time
Such a variable is perfectly multicollinear with
individual fixed effect
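The rank failure can be checked numerically. The sketch below builds a small balanced panel with one dummy per individual and compares the rank of the design matrix with and without a regressor that is constant within each individual; the panel size and variable names are assumptions chosen for the illustration.

```python
import numpy as np

N, T = 4, 3
ids = np.repeat(np.arange(N), T)                    # individual index for each row
D = (ids[:, None] == np.arange(N)).astype(float)    # one dummy column per individual

rng = np.random.default_rng(1)
x_varying = rng.normal(size=N * T)                  # changes over time within an individual
x_fixed = np.repeat(rng.normal(size=N), T)          # constant within each individual

# One time-varying regressor plus N dummies: full column rank N + 1.
rank_ok = np.linalg.matrix_rank(np.column_stack([x_varying, D]))

# Adding the time-invariant regressor adds a column but no rank.
rank_bad = np.linalg.matrix_rank(np.column_stack([x_varying, x_fixed, D]))

print("columns:", N + 1, "rank:", rank_ok)
print("columns:", N + 2, "rank:", rank_bad)
```

The time-invariant regressor is an exact linear combination of the individual dummies, so its coefficient cannot be separated from the fixed effects.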

29
Estimating the Fixed Effects Model

Can estimate by brute force: include separate
dummy variable for every individual, but there may be
a lot of them
Can also estimate in mean-deviation form

30
How does de-meaning work?

Can do simple OLS on de-meaned variables
STATA command is like
. xtreg y x, fe i(id)
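The equivalence between the dummy-variable regression and OLS on de-meaned variables can be sketched on simulated data. Everything here (panel size, coefficient values, the helper `demean`) is an assumption for illustration; the regressor is made to be correlated with the individual effect so that pooled OLS is visibly different.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 50, 4
ids = np.repeat(np.arange(N), T)                 # individual index for each row
alpha = rng.normal(size=N)[ids]                  # individual effect
x = 0.8 * alpha + rng.normal(size=N * T)         # regressor correlated with the effect
y = 1.5 * x + alpha + rng.normal(size=N * T)


def demean(v):
    """Subtract each individual's own mean from its observations."""
    means = np.bincount(ids, weights=v, minlength=N) / np.bincount(ids, minlength=N)
    return v - means[ids]


# Within estimator: simple OLS on the de-meaned variables.
x_dm, y_dm = demean(x), demean(y)
beta_within = (x_dm @ y_dm) / (x_dm @ x_dm)

# Brute force: include a dummy variable for every individual.
D = (ids[:, None] == np.arange(N)).astype(float)
beta_lsdv = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)[0][0]

# Pooled OLS ignores the individual effect entirely.
beta_pooled = np.polyfit(x, y, 1)[0]

print("within:", beta_within, "dummies:", beta_lsdv, "pooled:", beta_pooled)
```

The first two estimates agree to numerical precision, while pooled OLS is pulled away from the true slope of 1.5 by the correlation between x and the individual effect.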

31
Problems with fixed effect estimator

Only uses variation within individuals:
sometimes called within-group estimator
This variation may be small part of total (so low
precision) and more prone to measurement error
(so more attenuation bias)
Cannot use it to estimate effect of regressor
that is constant for an individual

32
Random Effects Estimator

Treats the individual effect $\alpha_i$ as part of residual (like $\varepsilon$)
Consistency does require no correlation between
$\alpha_i$ and $x_{it}$
Should recognise as like model with clustered
standard errors
But random effects estimator is feasible GLS
estimator

33
More on RE Estimator

Will not describe how we compute $\hat{\Omega}$: see
Wooldridge
STATA command
. xtreg y x, re i(id)

34
Proposition 5.3: The random effects estimator of
$\beta$ will be consistent if

a. $E(\varepsilon_{it} \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$
b. $E(\alpha_i \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$
c. $\mathrm{Rank}(X' \Omega^{-1} X) = k$
Proof: RE estimator is a special case of the
feasible GLS estimator so conditions for
consistency are the same.
Error has two components so need a. and b.

35
Comments

Assumption about exogeneity of errors is stronger
than for FE model: need to assume $\varepsilon_{it}$
uncorrelated with whole history of x; this is
called strong exogeneity
Assumption about rank condition is weaker than for
FE model, e.g. can estimate effect of variables that
are constant for a given individual

36
Another reason why may prefer RE to FE model

If exogeneity assumptions are satisfied RE
estimate will be more efficient than FE estimator
Application of general principle that imposing
true restriction on data leads to efficiency
gain.

37
Another Useful Result

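Can show that RE estimator can be thought of as
an OLS regression of
$y_{it} - \theta \bar{y}_i$
on
$x_{it} - \theta \bar{x}_i$
where
$\theta = 1 - \sqrt{\sigma_\varepsilon^2 / (\sigma_\varepsilon^2 + T \sigma_\alpha^2)}$
This is sometimes called quasi-time demeaning
See Wooldridge (ch10, pp286-7) if want to know
more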
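Quasi-time demeaning subtracts only a fraction of each individual's mean. The sketch below assumes the standard weight $\theta = 1 - \sqrt{\sigma_\varepsilon^2 / (\sigma_\varepsilon^2 + T \sigma_\alpha^2)}$, with the variance components treated as known (a feasible GLS routine would estimate them); the symbol names, the simulated data and the helper functions are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
N, T = 200, 4
ids = np.repeat(np.arange(N), T)                 # individual index for each row
alpha = rng.normal(size=N)[ids]                  # individual effect, variance 1
x = rng.normal(size=N * T)                       # regressor independent of the effect
y = 1.0 + 2.0 * x + alpha + rng.normal(size=N * T)


def group_mean(v):
    """Individual means, expanded to one value per row (balanced panel)."""
    return (np.bincount(ids, weights=v, minlength=N) / T)[ids]


def quasi_demeaned_slope(theta):
    """Slope from OLS of y - theta*ybar_i on x - theta*xbar_i with a constant."""
    y_q = y - theta * group_mean(y)
    x_q = x - theta * group_mean(x)
    x_c, y_c = x_q - x_q.mean(), y_q - y_q.mean()
    return (x_c @ y_c) / (x_c @ x_c)


# Weight implied by the variance components used in the simulation (both equal to 1).
theta_re = 1 - np.sqrt(1.0 / (1.0 + T * 1.0))

b_pooled = quasi_demeaned_slope(0.0)       # theta = 0: no demeaning, pooled OLS
b_within = quasi_demeaned_slope(1.0)       # theta = 1: full demeaning, fixed effects
b_re = quasi_demeaned_slope(theta_re)      # in between: random effects

print("theta:", theta_re)
print("pooled:", b_pooled, "RE:", b_re, "FE:", b_within)
```

Seen this way, pooled OLS and fixed effects are the two extremes of the same transformation, and the random effects estimate lies between them.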

38
Between-Groups Estimator

This takes individual means and estimates the
regression by OLS
Stata command is xtreg y x, be i(id)
Condition for consistency the same as for RE
estimator
But BE estimator less efficient as does not
exploit variation in regressors for a given
individual
And cannot estimate variables like time trends
whose average values do not vary across
individuals
So why would anyone ever use it? Let's think
about measurement error

39
Measurement Error in Panel Data Models

Assume true model is
Where x is one-dimensional
Assume $E(\varepsilon_{it} \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$ and
$E(\alpha_i \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$ so that RE and BE
estimators are consistent

40
Measurement Error Model

Assume
where uit is classical measurement error, xi is
average value of x for individual i and ?it is
variation around the true value which is assumed
to be uncorrelated with and uit and iid.
We know this measurement error is likely to cause
attenuation bias but this will vary between FE,
RE and BE estimators.

41
Proposition 5.4

For FE model we have
For BE model we have
For RE model we have
Where

42
What should we learn from this?

All rather complicated: don't worry too much
about details
But intuition is simple
Attenuation bias largest for FE estimator
Var(x) does not appear in denominator: FE
estimator does not use this variation in data

Attenuation bias larger for RE than BE estimator
as T > 1 > ?
The averaging in the BE estimator reduces the
importance of measurement error.
Important to note that these results are
dependent on the particular assumption about the
measurement error process and the nature of the
variation in $x_{it}$; things would be very different
if measurement error for a given individual did
not vary over time
But general point is that measurement error
considerations could affect choice of model to
estimate with panel data
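The ranking of attenuation biases can be illustrated by simulation. The sketch assumes a true regressor made up of an individual average plus iid variation, observed with classical measurement error that is iid over time, and an individual effect unrelated to x. All variances are set to 1, the true slope is 1, and the RE weight is computed from those known variances; every name and parameter value here is an assumption for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, beta = 2000, 5, 1.0
ids = np.repeat(np.arange(N), T)                 # individual index for each row

x_avg = rng.normal(size=N)[ids]                  # individual average of the true x
eta = rng.normal(size=N * T)                     # variation of true x around its average
u = rng.normal(size=N * T)                       # classical measurement error
alpha = rng.normal(size=N)[ids]                  # individual effect, independent of x
eps = rng.normal(size=N * T)

x_true = x_avg + eta
x_obs = x_true + u                               # what the researcher observes
y = beta * x_true + alpha + eps


def slope(a, b):
    """OLS slope of b on a, with a constant."""
    a_c, b_c = a - a.mean(), b - b.mean()
    return (a_c @ b_c) / (a_c @ a_c)


xbar = np.bincount(ids, weights=x_obs, minlength=N) / T   # individual means
ybar = np.bincount(ids, weights=y, minlength=N) / T

b_fe = slope(x_obs - xbar[ids], y - ybar[ids])   # within variation only
b_be = slope(xbar, ybar)                         # individual averages only

# Quasi-demeaning weight: idiosyncratic variance 2 (eps plus beta*u), individual variance 1.
theta = 1 - np.sqrt(2.0 / (2.0 + T * 1.0))
b_re = slope(x_obs - theta * xbar[ids], y - theta * ybar[ids])

print("true slope:", beta)
print("FE:", b_fe, "RE:", b_re, "BE:", b_be)
```

Under these assumptions the FE estimate is pulled furthest towards zero, the BE estimate least, and the RE estimate falls between the two.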

44
Time Effects

Have treated time and individual dimensions
asymmetrically: no good reason for this
Errors likely to be correlated for different
individuals in same time period: most common way
to deal with this is to include set of time
dummies
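A minimal sketch of adding time dummies to a fixed effects model, assuming a balanced panel with an aggregate shock common to all individuals in a period. It also shows that, in a balanced panel, the same slope comes from de-meaning over both individuals and periods; the data, names and helper function are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T = 30, 5
ids = np.repeat(np.arange(N), T)                 # individual index for each row
time = np.tile(np.arange(T), N)                  # period index for each row
alpha = rng.normal(size=N)[ids]                  # individual effect
shock = rng.normal(size=T)[time]                 # aggregate shock common to a period
x = alpha + shock + rng.normal(size=N * T)
y = 2.0 * x + alpha + shock + rng.normal(size=N * T)

# Individual dummies plus time dummies (one period dropped to avoid collinearity).
D_i = (ids[:, None] == np.arange(N)).astype(float)
D_t = (time[:, None] == np.arange(1, T)).astype(float)
b_dummies = np.linalg.lstsq(np.column_stack([x, D_i, D_t]), y, rcond=None)[0][0]


def two_way_demean(v):
    """Remove individual means and period means, adding back the overall mean."""
    m = v.reshape(N, T)                          # rows: individuals, columns: periods
    return (m - m.mean(axis=1, keepdims=True)
              - m.mean(axis=0, keepdims=True) + m.mean()).ravel()


x_dm, y_dm = two_way_demean(x), two_way_demean(y)
b_two_way = (x_dm @ y_dm) / (x_dm @ x_dm)

print("with dummies:", b_dummies, "two-way de-meaned:", b_two_way)
```

The time dummies absorb anything common to all individuals in a period, in the same way that the individual dummies absorb anything constant over time for an individual.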

45
Estimating Fixed Effects Model in Differences