Title: Methods of Economic Investigation: Lent Term: First Six Weeks
1Methods of Economic Investigation Lent Term
First Six Weeks
- Rajeev Dehejia
- Office Hour Monday 11.00-12.00
- Office R451
2Administrative Details
- 3 lectures per week for the first 6 weeks, all at 10am
- Monday, 10-11
- Tuesday, 10-11
- Thursday, 10-11
3Lecture discipline
- Lectures are optional -- you don't have to come.
- But if you do choose to come
- Arrive on time.
- Pay attention in class.
- Things to do
- Ask questions!
- Things not to do
- Gossip about last night.
- Tell jokes (good or bad).
- Check e-mail, Facebook, news, the web.
- Text message (or talk on the phone!).
- Listen to music (good or bad).
4What is Econometrics For?
- To make life miserable for MSc students?
- To impress your mother with the magic of idempotent matrices?
- To provide credible answers to interesting questions?
5Econometrics is a means to an end not an end in
itself.
- Two different types of ends (may be others)
- Causal Effects
- Forecasting
- Causal effects are answers to "what if" questions
- What would happen to smoking if cigarette taxes were raised?
- Forecasting: just want the best currently available predictors; don't worry about causality
6Emphasis on means to an end
- Recommended texts: Wooldridge, Intermediate Econometrics (not very technical) and Cross-Sectional Econometrics (more advanced).
- Class exercises will contain practical work with real data.
- This serves a number of purposes
- Makes concepts less abstract, easier to understand.
- Gives real-world skills.
- Gives insight into frustrations of empirical work
- Cute theory
- Fantastic econometric methodology
- Take it to the data and.
7How to Estimate Causal Effects?
- Want the effect of X on the distribution of y, other relevant things being held constant
- Most common to be interested in the effect on the mean of y, i.e. the conditional expectation $E(y \mid X)$
8Estimation of linear regression offers promising
approach
- Can interpret regression function ($X\beta$) as estimate of $E(y \mid X)$
- If conditional expectation is linear in X then this is exact
- If conditional expectation is non-linear then $X\beta$ is a linear approximation to the true function
- This is same as
9Proposition 1.1 If $E(y \mid X) = X\beta$, the OLS estimate is an unbiased estimate of $\beta$
- Proof Can write OLS estimator as
- If X is fixed we have that
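A sketch of the standard argument for this proposition, assuming the usual OLS formula and writing $\varepsilon = y - X\beta$ for the error (a symbol introduced here for illustration), so that $E(\varepsilon \mid X) = 0$ under the proposition's assumption:

```latex
\[
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'y \\
            &= (X'X)^{-1}X'(X\beta + \varepsilon) \\
            &= \beta + (X'X)^{-1}X'\varepsilon \\[4pt]
E(\hat{\beta}) &= \beta + (X'X)^{-1}X'E(\varepsilon) = \beta
\qquad \text{(X fixed, } E(\varepsilon) = 0\text{)}
\end{aligned}
\]
```

The last step is where fixed X matters: $(X'X)^{-1}X'$ can be taken outside the expectation, leaving only $E(\varepsilon)$, which is zero.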
10Problems with Inferring Causal Effects from
Regressions
- Regressions tell us about correlations, but correlation is not causation
- Example: regression of whether currently have a health problem on whether have been in hospital in the past year

HEALTHPROB        Coef.   Std. Err.       t
---------------------------------------------
PATIENT         .262982    .0095126   27.65
_cons           .153447     .003092   49.63

- Do hospitals make you sick? That is a question about a causal effect.
11General Problems in Estimating Causal Effects
- Omitted Variables
- Reverse Causality
- Measurement Error
- Sample selection
12Omitted Variables (should be familiar)
- Suppose we want to estimate $E(y \mid X, W)$, assumed to be linear in (X, W), so that $E(y \mid X, W) = X\beta + W\gamma$, or
- $y = X\beta + W\gamma + \varepsilon$
- But you estimate
- $y = X\beta + u$
- i.e. $E(y \mid X)$. Will have
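A sketch of the standard result for this setup, treating X and W as fixed and writing $\gamma$ for the coefficient on W and $\varepsilon$ for the error in the full model (both symbols are assumptions about the original notation):

```latex
\[
\begin{aligned}
\hat{\beta} &= (X'X)^{-1}X'y \\
            &= (X'X)^{-1}X'(X\beta + W\gamma + \varepsilon) \\
            &= \beta + (X'X)^{-1}X'W\gamma + (X'X)^{-1}X'\varepsilon \\[4pt]
E(\hat{\beta}) &= \beta + (X'X)^{-1}X'W\gamma
\end{aligned}
\]
```

The extra term $(X'X)^{-1}X'W\gamma$ is the omitted variables bias. It vanishes if $X'W = 0$ (X and W uncorrelated in the sample) or if $\gamma = 0$ (W does not affect y).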
13Form of Omitted Variables Bias
- Where there is only one variable
- Extent of omitted variables bias related to
- size of correlation between X and W
- strength of relationship between y and W
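The two determinants above can be checked in a small simulation. This is a minimal sketch on made-up data, not course material: the coefficients, the sample size, and the helper `ols` are illustrative assumptions. With one regressor, the large-sample bias in the short regression is $\gamma \, \mathrm{Cov}(X, W) / \mathrm{Var}(X)$, which is 0.8 for the values chosen here.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients of y on the columns of X (X must include a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
n = 200_000
beta, gamma = 1.0, 2.0                 # assumed true coefficients (illustrative)

w = rng.normal(size=n)                 # the omitted variable W
x = 0.5 * w + rng.normal(size=n)       # X is correlated with W
y = beta * x + gamma * w + rng.normal(size=n)

const = np.ones(n)
short = ols(np.column_stack([const, x]), y)      # regression that omits W
long_ = ols(np.column_stack([const, x, w]), y)   # regression that includes W

# One-variable bias formula: gamma * Cov(X, W) / Var(X)
predicted_bias = gamma * np.cov(x, w)[0, 1] / np.var(x, ddof=1)

print("slope omitting W: ", short[1])
print("slope including W:", long_[1])
print("predicted bias:   ", predicted_bias)
```

Setting the 0.5 in the definition of `x` to zero, or `gamma` to zero, removes the bias, matching the two bullet points.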
14In hospital example
- Prior health status an obvious omitted variable

HEALTHPROB        Coef.   Std. Err.        t
--------------------------------------------
PATIENT        .1250091    .0078147    16.00
HEALTHPROB1    .6282796    .0061896   101.51
_cons          .0554544    .0026937    20.59

15Reverse Causality/ Endogeneity
- Idea is that correlation between y and X may be because it is y that causes X, not the other way round
- Interested in causal model
- $y = X\beta + \varepsilon$
- But also causal relationship in other direction
- $X = \alpha y + u$
16- Reduced form is
- $X = (u + \alpha\varepsilon)/(1 - \alpha\beta)$
- X is correlated with $\varepsilon$; we know this leads to bias in OLS estimates
- In hospital example, being sick causes you to go to hospital; it is not clear what a good solution is.
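The reduced form can be used to simulate the two-equation system and see the bias directly. This is a minimal sketch with assumed parameter values and unit-variance errors; nothing here comes from the hospital data. Under these assumptions the large-sample bias of the OLS slope is $\mathrm{Cov}(X, \varepsilon)/\mathrm{Var}(X) = \alpha(1 - \alpha\beta)/(1 + \alpha^2)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta, alpha = 0.5, 0.4        # assumed illustrative values with alpha * beta != 1

e = rng.normal(size=n)        # error in y = X*beta + e
u = rng.normal(size=n)        # error in X = alpha*y + u

# Solve the two equations jointly: this is the reduced form for X
x = (u + alpha * e) / (1 - alpha * beta)
y = beta * x + e

# OLS slope of y on x (with a constant)
slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Large-sample bias when both errors have variance 1
bias = alpha * (1 - alpha * beta) / (1 + alpha**2)

print("true beta:     ", beta)
print("OLS slope:     ", slope)
print("beta plus bias:", beta + bias)
```

Setting `alpha = 0` switches off the reverse channel, and the OLS slope then settles on `beta`.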
17Measurement Error
- Most (all?) of our data are measured with error.
- Suppose causal model is
- $y = X\beta + \varepsilon$
- But only observe $\tilde{X}$, which is X plus some error
- $\tilde{X} = X + u$
- Classical measurement error
- $E(u \mid X) = 0$
18- Can write causal relationship as
- $y = \tilde{X}\beta - u\beta + \varepsilon$
- Note that $\tilde{X}$ is correlated with the composite error $-u\beta + \varepsilon$
- Should know this leads to bias/inconsistency in OLS estimator
- Can make some useful predictions about nature of bias later on in course
- Want $E(y \mid X)$ but can only estimate $E(y \mid \tilde{X})$
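The direction of the bias can be previewed with a short simulation. This is a minimal sketch on made-up data with assumed variances. It relies on a standard textbook result that is not stated on these slides: with classical measurement error in a single regressor, the OLS slope shrinks towards zero by the factor $\mathrm{Var}(X)/(\mathrm{Var}(X) + \mathrm{Var}(u))$, which is 0.8 for the values below.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
beta = 1.0                      # assumed true coefficient
sigma_x, sigma_u = 1.0, 0.5     # assumed std. dev. of true X and of the error u

x_true = rng.normal(scale=sigma_x, size=n)
u = rng.normal(scale=sigma_u, size=n)        # classical error, independent of X
x_obs = x_true + u                           # what the researcher observes
y = beta * x_true + rng.normal(size=n)

def slope(x, y):
    """OLS slope of y on x with a constant."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

slope_true = slope(x_true, y)   # infeasible regression on the true X
slope_obs = slope(x_obs, y)     # feasible regression on the mismeasured X
attenuation = sigma_x**2 / (sigma_x**2 + sigma_u**2)

print("slope using true X:    ", slope_true)
print("slope using observed X:", slope_obs)
print("beta * attenuation:    ", beta * attenuation)
```

Raising `sigma_u` makes the observed regressor noisier and pulls the estimated slope further towards zero.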
19Selection Effects
- Following regression seems to show that women with young children earn more than those with older children

LOGWAGE       Coef.   Std. Err.        t
---------------------------------------
AGEKID0    .0942016    .0083255    11.31
AGEKID1    .1333421     .008284    16.10
AGEKID2    .0833223    .0084401     9.87
AGEKID3    .0526896    .0087102     6.05
AGEKID4     .019879    .0087995     2.26
_cons      1.808458    .0061696   293.12

- Is this sensible? Probably not.
20- One explanation is sample selection
- Only have earnings data on women who work
- Women with small children who work tend to have high earnings (e.g. to pay for childcare)
- Employment rate of mothers with babies is 28%, of those with 5-year-olds is 50%
21Why is this? A brief exposition
- Causal model for everyone
- $y = X\beta + \varepsilon$
- But only observe y if work, W = 1, so estimate $E(y \mid X, W = 1)$ not $E(y \mid X)$
- Sample selection bias if W is correlated with $\varepsilon$; this is likely
- Heckman got the Nobel prize for working out how to deal with this, but it is not part of this course
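The mechanism can be illustrated with simulated data. This is a minimal sketch under invented assumptions, not a reconstruction of the wage regression above: the true effect of a young child on log wages is set to zero, and mothers of babies are assumed to work only when their unobserved earnings potential is high enough. The variable names and the participation rule are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400_000
beta = 0.0   # assumed: young children have no true effect on log wages

young = rng.integers(0, 2, size=n).astype(float)   # X: 1 if youngest child is a baby
e = rng.normal(size=n)                             # unobserved earnings potential
y = 1.8 + beta * young + e                         # causal model for everyone

# Participation rule (W): mothers of babies need a higher draw to work.
# The shift of 1.0 is an arbitrary illustrative choice.
works = (e + rng.normal(size=n) - 1.0 * young) > 0

def slope(x, y):
    """OLS slope of y on x with a constant."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

slope_all = slope(young, y)                      # infeasible: wages for everyone
slope_workers = slope(young[works], y[works])    # feasible: workers only

print("employment rate, baby:   ", works[young == 1].mean())
print("employment rate, no baby:", works[young == 0].mean())
print("slope, everyone:         ", slope_all)
print("slope, workers only:     ", slope_workers)
```

Among workers the slope is clearly positive even though `beta` is zero, because W depends on the same unobserved term that drives wages.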
22Common Features of Problems
- All problems have an expression in everyday language: omitted variables, reverse causality, etc.
- All have an econometric form, the same one
- A correlation of X with the error
23How To Surmount the Problems?
- More sophisticated econometric methods than OLS, e.g. IV
- Better data. Griliches:
- "since it is the badness of the data that provides us with our living, perhaps it is not at all surprising that we have shown little interest in improving it"
24But Recent Trends
- Much more emphasis on good quality data and research design than statistical fixes: the "credibility revolution"
- Probably started in labour economics but now arriving in most fields
- Will illustrate this in course through wide-ranging examples
25Internal and External Validity
- Estimates have internal validity if conclusions are valid for the population being studied
- Estimates have external validity if conclusions are valid for other populations, e.g. can we generalise the impact of a class size reduction in Tennessee in the late 1980s to a class size reduction in the UK in 2005? Nothing in the data will help with this.
26Choosing your data..
- Suppose interested in causal effect of X on y.
- Can choose the way in which X is determined in your sample
- May seem fanciful, but field experiments are becoming more common in economics
- Good reason to choose to do a randomized controlled experiment
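Why randomization helps can be sketched in a simulation that compares self-selected and randomly assigned X. All numbers and names are illustrative assumptions. In the first sample, units with a high error term are more likely to take up X; in the second, X is set by a coin flip and so cannot be correlated with the error.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000
beta = 1.0                       # assumed true causal effect of X on y
e = rng.normal(size=n)           # unobserved determinants of y

def slope(x, y):
    """OLS slope of y on x with a constant."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Observational sample: units with high e are more likely to have X = 1
x_self = (e + rng.normal(size=n) > 0).astype(float)
y_self = beta * x_self + e

# Experimental sample: X assigned at random, independent of e
x_rand = rng.integers(0, 2, size=n).astype(float)
y_rand = beta * x_rand + e

slope_self = slope(x_self, y_self)
slope_rand = slope(x_rand, y_rand)

print("true effect:             ", beta)
print("slope, self-selected X:  ", slope_self)
print("slope, randomly assigned:", slope_rand)
```

Only the assignment rule differs between the two samples; the outcome equation and the error are identical, so the gap between the two slopes is entirely due to the correlation of X with the error.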