Title: Modeling Consumer Decision Making and Discrete Choice Behavior
2Econometrics in Health Economics: Discrete Choice Modeling and Frontier Modeling and Efficiency Estimation
Professor William Greene
Stern School of Business
New York University
September 2-4, 2007
3Discrete Choices
- Observed outcomes
- Inherently discrete
- Number of occurrences (e.g., family size)
- Behavior: drug use, smoking behavior
- Implicitly continuous, censored
- The observed data are discrete by construction (e.g., revealed preferences)
- Discrete decisions that reveal underlying preferences
- Implicit censoring mechanism
- Implications to be considered
- For model building
- For analysis and prediction of behavior
4Modeling Discrete Choice
- Theoretical foundations
- Econometric methodology
- Models
- Statistical bases
- Econometric methods
- Applications
5Discrete Choice Modeling
- Random Utility Models
- Binary Choice Modeling
- Extensions
- Heterogeneity
- Semiparametrics
- Panel Data
6Two Fundamental Building Blocks
- Underlying Behavioral Theory: Random utility model. The link between underlying behavior and observed data
- Empirical Tool: Stochastic, parametric model for binary choice; a platform for models of discrete choice
7Behavioral Assumptions
- Utility is defined over alternatives, j = 1,...,J(i,t)
- U(i,j,t) is a preference ordering that exists for individual i in choice situation t for alternative j.
- Preferences are transitive and complete with respect to choice situations
- Utility maximization assumption
- If U(i,1,t) > U(i,2,t), the individual will choose alternative 1, not alternative 2.
- Revealed preference (duality). If the consumer chooses alternative 1 and not alternative 2, then U(i,1,t) > U(i,2,t).
8Indirect Utility Functions
- Utility for individual i: U(x), defined over consumption choices
- Utility maximization subject to budget constraints produces x* = x*(Income, Prices)
- Indirect utility: V(Income, Prices)
- Observable heterogeneity produces indirect utility
- V(i,t,j) = V(Income, Prices, Age, Educ, Sex, ...)
- Unobservable heterogeneity produces random utility: U(i,t,j) = V(Income, Prices, Characteristics) + e
9Random Indirect Utility Functions
$U(i,j,t) = \alpha_j + \beta_i' x_{itj} + \gamma_i' z_{it} + \varepsilon_{ijt}$
- $\alpha_j$ = Choice specific constant
- $x_{itj}$ = Attributes of choice presented to person, such as Price
- $\beta_i$ = Person specific taste weights
- $z_{it}$ = Characteristics of the person (age, income)
- $\gamma_i$ = Weights on person specific characteristics
- $\varepsilon_{ijt}$ = Unobserved random component of utility
10Application
- 210 Commuters Between Sydney and Melbourne
- Available modes: Air, Train, Bus, Car
- Observed
- Choice
- Attributes: Cost, terminal time, travel time, other
- Characteristics: Household income
- First application: Fly or Other
11A Formal Model for Binary Choice
- Yes or No decision
- Example: choose to fly or not to fly to a destination when there are alternatives.
- Model: Net utility of flying
- $U_{fly} = \alpha + \beta_1 Cost + \beta_2 Time + \gamma Income + \varepsilon$
- Choose to fly if net utility is positive
- Net utility = $U_{FLY} - U_{NOT FLY}$
- Data: x = (1, cost, terminal time, travel time)
- z = income
- y = 1 if choose fly ($U_{fly} > 0$), 0 if not.
12An Econometric Model
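- Choose to fly iff $U_{FLY} > U_{OTHER} = 0$ (Normalize)
- $U_{fly} = \alpha + \beta_1 Cost + \beta_2 Time + \gamma Income + \varepsilon$
- $U_{fly} > 0 \Rightarrow \varepsilon > -(\alpha + \beta_1 Cost + \beta_2 Time + \gamma Income)$
- Probability model: For any person observed by the analyst,
- $Prob(fly) = Prob[\varepsilon > -(\alpha + \beta_1 Cost + \beta_2 Time + \gamma Income)]$
- Note the relationship between the unobserved $\varepsilon$ and the outcome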
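The probability statement above can be sketched numerically once a distribution is assumed for the unobserved term. The sketch below assumes a logistic distribution; the function name `prob_fly` and all coefficient values are hypothetical, chosen only for illustration.

```python
import math

def prob_fly(cost, time, income, alpha, b_cost, b_time, g_income):
    """Prob(fly) = Prob[eps > -index], assuming eps is logistic."""
    index = alpha + b_cost * cost + b_time * time + g_income * income
    # The logistic distribution is symmetric, so Prob[eps > -index] = F(index).
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical coefficients (not estimates from the slides).
p = prob_fly(cost=86.0, time=25.0, income=70.0,
             alpha=1.0, b_cost=-0.01, b_time=-0.05, g_income=0.02)
print(round(p, 4))
```

A larger index (for example, a lower cost with a negative cost coefficient) moves the probability toward 1.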
13 Binary Choice Data
Choose Air   Gen.Cost   Travel Time   Income
1.0000        86.000     25.000       70.000
.00000        67.000     69.000       60.000
.00000        77.000     64.000       20.000
.00000        69.000     69.000       15.000
.00000        77.000     64.000       30.000
.00000        71.000     64.000       26.000
.00000        58.000     64.000       35.000
.00000        71.000     69.000       12.000
.00000       100.00      64.000       70.000
1.0000       158.00      30.000       50.000
1.0000       136.00      45.000       40.000
1.0000       103.00      30.000       70.000
.00000        77.000     69.000       10.000
1.0000       197.00      45.000       26.000
.00000       129.00      64.000       50.000
.00000       123.00      64.000       70.000
14A Case for Randomness of Utility
- Does GC1 < GC2 imply that the person will always choose choice 1? Apparently not
- Does Income explain the difference?
- Apparently not
Choose Air   Gen.Cost   Travel Time   Income
1.0000        86.000     25.000       70.000
.00000        67.000     69.000       60.000

Choose Air   Gen.Cost   Travel Time   Income
.00000       100.00      64.000       70.000
1.0000       158.00      30.000       50.000
15What Can We Learn from the Data?
- Are the attributes important?
- Aggregate predictions Total Demand
- Value of time
16Implied Demand Curve
- Expected Demand for Flights As
- So, we can obtain a downward sloping demand
17Value of Time
- We can also compute the value of time as
- If the direct cost measure is unavailable, use
the negative of the income coefficient.
(Numerator will generally be negative.)
18Econometric Frameworks
- Nonparametric
- Semiparametric
- Parametric
- Classical (Sampling Theory)
- Bayesian
- (We will focus on classical inference methods)
19Modeling Approaches
- Nonparametric: Relationship
- Minimal Assumptions
- Minimal Conclusions
- Semiparametric: Index function
- Stronger assumptions
- Robust to model misspecification (heteroscedasticity)
- Still weak conclusions
- Parametric: Probability and index function
- Strongest assumptions: complete specification
- Strongest conclusions
- Possibly less robust. (Not necessarily)
20Nonparametric Not Very Informative
P(Air) = f(Income)
21Semiparametric Approaches
- Maximum Score
- Find b so that
- $\sum_i sign(b'x_i) \, sign(y_i)$ is maximized
- Maximize the number of observations for which $b'x_i < 0$ when $y_i = 0$ and $b'x_i > 0$ when $y_i = 1$.
- Questions: (1) What do the coefficients mean? (2) If b is a solution, Kb is a solution for any K > 0. See question (1). (Solution is scaled so b'b = 1.)
- (3) Is inference possible? (Apparently not. Abrevaya)
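The counting criterion described above can be sketched in a few lines. This is only an illustration of the objective function, not of the search over b; the function name `score` and the toy data are made up.

```python
def score(b, X, y):
    """Count observations where the sign of b'x agrees with the outcome."""
    hits = 0
    for xi, yi in zip(X, y):
        index = sum(bk * xk for bk, xk in zip(b, xi))
        if (index > 0 and yi == 1) or (index < 0 and yi == 0):
            hits += 1
    return hits

# Toy data (made up): each row is (constant, one attribute).
X = [(1.0, -2.0), (1.0, -1.0), (1.0, 0.5), (1.0, 2.0)]
y = [0, 0, 1, 1]
b = (0.0, 1.0)

print(score(b, X, y))            # score for the candidate b
print(score((0.0, 5.0), X, y))   # same score after scaling b by K = 5
```

Because only signs enter the count, multiplying b by any positive constant leaves the score unchanged, which is why a normalization such as b'b = 1 is needed.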
22MSCORE
23Semiparametric Approaches
- Klein and Spady Kernel Based
24Klein and Spady Semiparametric
Note necessary normalizations. Coefficients are
not very meaningful.
25Likelihood Based Inference Methods
Behavioral Theory
Likelihood Function
Statistical Theory
Observed Measurement
The likelihood function embodies the theoretical
description of the population. Characteristics of
the population are inferred from the
characteristics of the likelihood function.
(Bayesian and Classical)
26Parametric Logit Model
27Logit vs. MScore
- Logit fits worse
- MScore fits better, coefficients
are meaningless
28Parametric Model Estimation
- How to estimate $\alpha$, $\beta_1$, $\beta_2$, $\gamma$?
- It's not regression
- The technique of maximum likelihood
- $Prob[y=1] = Prob[\varepsilon > -(\alpha + \beta_1 Cost + \beta_2 Time + \gamma Income)]$
- $Prob[y=0] = 1 - Prob[y=1]$
- Requires a model for the probability
29Completing the Model: $F(\varepsilon)$
- The distribution
- Normal: PROBIT, natural for behavior
- Logistic: LOGIT, allows thicker tails
- Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice
- Does it matter?
- Yes, large difference in estimates
- Not much, quantities of interest are more stable.
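A minimal sketch of the three candidate distributions, to make the tail and asymmetry remarks concrete. The extreme value form used here, exp(-exp(-z)), is one common parameterization and is an assumption; the slides do not state which form the software uses.

```python
import math

def probit_cdf(z):
    # Standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_cdf(z):
    # Standard logistic CDF
    return 1.0 / (1.0 + math.exp(-z))

def extreme_value_cdf(z):
    # Assumed (Gumbel-type) form; not symmetric around zero
    return math.exp(-math.exp(-z))

for z in (-3.0, 0.0, 3.0):
    print(z, round(probit_cdf(z), 4), round(logit_cdf(z), 4),
          round(extreme_value_cdf(z), 4))
```

The logistic CDF puts more mass in the tails than the normal at the same z, and the extreme value CDF does not equal 0.5 at z = 0. The distributions also have different variances, which is one reason raw coefficient estimates differ across models while probabilities and partial effects tend to be similar.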
30Application: Doctor Visits (No Attributes of the Choices)
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set. There are altogether 27,326 observations. The number of observations per person ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987.) Note, the variable NUMOBS tells how many observations there are for each person. This variable is repeated in each row of the data for the person.
Variables in the file are:
DOCTOR  = 1(Number of doctor visits > 0)
HSAT    = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS  = number of doctor visits in last three months
HOSPVIS = number of hospital visits in last calendar year
PUBLIC  = insured in public health insurance = 1; otherwise = 0
ADDON   = insured by add-on insurance = 1; otherwise = 0
HHNINC  = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped)
HHKIDS  = children under age 16 in the household = 1; otherwise = 0
EDUC    = years of schooling
AGE     = age in years
MARRIED = marital status
31Estimated Binary Choice (Probit) Model
---------------------------------------------
Binomial Probit Model
Dependent variable DOCTOR
Number of observations 27326
Log likelihood function -17670.94
Info. Criterion AIC 1.29378
Info. Criterion BIC 1.29559
Restricted log likelihood -18019.55
McFadden Pseudo R-squared .0193462
---------------------------------------------
----------------------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
---------Index function for probability
Constant    .15500247     .05651561       2.743     .0061
HHNINC     -.11643121     .04632875      -2.513     .0120    .35208362
HHKIDS     -.14118362     .01821758      -7.750     .0000    .40273000
EDUC       -.02811531     .00350266      -8.027     .0000    11.3206310
AGE         .01283460     .00079035      16.239     .0000    43.5256898
MARRIED     .05226039     .02046202       2.554     .0106    .75861817
32Estimated Binary Choice Models
Variable    PROBIT       LOGIT        EXTREME VALUE
Constant     0.155002     0.251115     0.560723
HHNINC      -0.116431    -0.185922    -0.140951
HHKIDS      -0.141184    -0.22947     -0.182789
EDUC        -0.0281153   -0.0455878   -0.035887
AGE          0.0128346    0.0207086    0.016202
MARRIED      0.0522604    0.085293     0.068080
Log-L       -17670.9     -17673.1     -17679.5
Log-L(0)    -18019.6     -18019.6     -18019.6
33Index = $\alpha + \beta_1 Income + \beta_2 Educ + \gamma Age$
34Effect on Predicted Probability of an Increase in Age
$\alpha + \beta_1 Income + \beta_2 Educ + \gamma (Age + 1)$
($\gamma$ is positive)
35Marginal Effects in Probability Models
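- Prob[Outcome] = some $F(\alpha + \beta' x)$
- Partial effect = $\partial F(\alpha + \beta' x) / \partial x$
- (derivative)
- Partial effects are derivatives
- Result varies with model
- Logit: $\partial F(\alpha + \dots + \gamma Age) / \partial Age = Prob \times (1 - Prob) \times \gamma$
- Probit: $\partial F(\alpha + \dots + \gamma Age) / \partial Age$ = Normal density $\times \gamma$
- Scaling usually erases model differences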
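As a worked check of the probit formula, the sketch below evaluates the normal density at the sample means and multiplies by each coefficient. It uses the probit coefficients and variable means reported in the doctor visits output; only the variable names `coef`, `mean`, `index`, and `density` are introduced here.

```python
import math

# Probit coefficients and sample means as reported in the doctor visits output.
coef = {"Constant": 0.15500247, "HHNINC": -0.11643121, "HHKIDS": -0.14118362,
        "EDUC": -0.02811531, "AGE": 0.01283460, "MARRIED": 0.05226039}
mean = {"Constant": 1.0, "HHNINC": 0.35208362, "HHKIDS": 0.40273000,
        "EDUC": 11.3206310, "AGE": 43.5256898, "MARRIED": 0.75861817}

# Index function evaluated at the means of the data.
index = sum(coef[k] * mean[k] for k in coef)

# Standard normal density at the index: the probit scale factor.
density = math.exp(-0.5 * index ** 2) / math.sqrt(2.0 * math.pi)

# Partial effects for the continuous variables: density times coefficient.
for name in ("HHNINC", "EDUC", "AGE"):
    print(name, round(density * coef[name], 8))
```

The results should agree, up to rounding, with the probit partial effects at the means reported in the estimated marginal effects tables.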
36The Delta Method For Computing Standard Errors
37Marginal Effects for Binary Choice
38Marginal Effect for a Dummy Variable
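- $Prob[y_i = 1 | x_i, d_i] = F(\beta' x_i + \gamma d_i)$
- = conditional mean
- Marginal effect of d
- $Prob[y_i = 1 | x_i, d_i = 1] - Prob[y_i = 1 | x_i, d_i = 0]$
- Logit: $\Lambda(\beta' x_i + \gamma) - \Lambda(\beta' x_i)$, where $\Lambda$ is the logistic CDF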
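The difference in probabilities can be computed directly. The sketch below does this for the MARRIED dummy in the probit model, holding the other variables at their reported sample means. The coefficients and means are those in the doctor visits probit output; using the probit rather than the logit CDF here is a choice made only because those estimates are available.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Probit coefficients and sample means from the doctor visits output.
coef = {"Constant": 0.15500247, "HHNINC": -0.11643121, "HHKIDS": -0.14118362,
        "EDUC": -0.02811531, "AGE": 0.01283460, "MARRIED": 0.05226039}
mean = {"Constant": 1.0, "HHNINC": 0.35208362, "HHKIDS": 0.40273000,
        "EDUC": 11.3206310, "AGE": 43.5256898, "MARRIED": 0.75861817}

# Index at the means of every variable except the dummy of interest.
base = sum(coef[k] * mean[k] for k in coef if k != "MARRIED")

# P|1 - P|0: switch the dummy from 0 to 1.
effect = normal_cdf(base + coef["MARRIED"]) - normal_cdf(base)
print(round(effect, 8))
```

This should be close to the value reported for MARRIED in the probit panel of the estimated marginal effects.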
39Estimated Marginal Effects
           Estimate     Standard Error  t Ratio  P Value
PROBIT
HHNINC    -.04388304    .01746073       -2.513   .0120
EDUC      -.01059669    .00132014       -8.027   .0000
AGE        .00483737    .00029767       16.251   .0000
---------Marginal effect for dummy variables is P|1 - P|0.
HHKIDS    -.05341443    .00691172       -7.728   .0000
MARRIED    .01978313    .00777809        2.543   .0110
LOGIT
HHNINC    -.04321347    .01744584       -2.477   .0132
EDUC      -.01059587    .00131215       -8.075   .0000
AGE        .00481326    .00029819       16.142   .0000
---------Marginal effect for dummy variables is P|1 - P|0.
HHKIDS    -.05359813    .00692332       -7.742   .0000
MARRIED    .01993604    .00782159        2.549   .0108
Extreme Value
HHNINC    -.04067557    .01667101       -2.440   .0147
EDUC      -.01035617    .00124994       -8.285   .0000
AGE        .00467547    .00029841       15.668   .0000
---------Marginal effect for dummy variables is P|1 - P|0.
HHKIDS    -.05417190    .00697888       -7.762   .0000
MARRIED    .02018367    .00790414        2.554   .0107
40Computing Effects
- Compute at the data means?
- Simple
- Inference is well defined
- Average the individual effects
- More appropriate?
- Asymptotic standard errors.
- Is hypothesis testing about marginal effects
meaningful?
41Average Partial Effects
42Krinsky and Robb Method
43Partial Effects for a Probit Model
------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]
------------------------------------------------------
Krinsky and Robb Method With 100 Replications Using Average Partial Effects
HHNINC     -.04318266     .01619878      -2.666     .0077
HHKIDS     -.05225173     .00678277      -7.704     .0000
EDUC       -.01032588     .00131932      -7.827     .0000
AGE         .00474397     .00027718      17.115     .0000
MARRIED     .01894322     .00792639       2.390     .0169
Delta Method Using Partial Effects at Means
HHNINC     -.04388304     .01746073      -2.513     .0120
HHKIDS     -.05341443     .00691172      -7.728     .0000
EDUC       -.01059669     .00132014      -8.027     .0000
AGE         .00483737     .00029767      16.251     .0000
MARRIED     .01978313     .00777809       2.543     .0110
44Elasticities
- Elasticity
- How to compute standard errors?
- Delta method
- Bootstrap
- Bootstrap the individual elasticities? (Will neglect variation in parameter estimates.)
- Bootstrap model estimation?
45Income Elasticity Krinsky and Robb
46How Well Does the Model Fit?
- There is no R squared
- Fit measures computed from log L
- pseudo R squared = 1 - logL/logL0
- Others - these do not measure fit.
- Direct assessment of the effectiveness of the
model at predicting the outcome
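As a quick check, the likelihood ratio index can be recomputed from the two log likelihoods reported for the probit model of DOCTOR (log likelihood and restricted log likelihood). The sketch assumes the usual McFadden form, one minus the ratio of the fitted to the restricted log likelihood.

```python
# Log likelihoods reported in the probit output for DOCTOR.
log_l = -17670.94    # fitted model
log_l0 = -18019.55   # restricted (constant only) model

# McFadden pseudo R-squared (likelihood ratio index).
pseudo_r2 = 1.0 - log_l / log_l0
print(round(pseudo_r2, 7))
```

The value is small even though several coefficients are highly significant, which illustrates why this index is not a measure of fit in the regression sense.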
47Fit Measures for Binary Choice
- Likelihood Ratio Index
- Bounded by 0 and 1
- Rises when the model is expanded
- Cramer (and others)
48Fit Measures for the Probit Model
----------------------------------------
Fit Measures for Binomial Choice Model
Probit model for variable DOCTOR
----------------------------------------
Proportions  P0 = .370892    P1 = .629108
N = 27326    N0 = 10135      N1 = 17191
LogL = -17670.942            LogL0 = -18019.552
Estrella = 1-(L/L0)^(-2L0/n) = .02544
----------------------------------------
Efron        McFadden      Ben./Lerman
.02448       .01935        .54500
Cramer       Veall/Zim.    Rsqrd_ML
.02492       .04374        .02519
----------------------------------------
Information  Akaike I.C.   Schwarz I.C.
Criteria     1.29378       1.29559
----------------------------------------
Predictions for Binary Choice Model.
Actual     Predicted Value
Value      0                1                 Total Actual
0           367 (  1.3)      9768 ( 35.7)     10135 ( 37.1)
1           387 (  1.4)     16804 ( 61.5)     17191 ( 62.9)
Total       754 (  2.8)     26572 ( 97.2)     27326 (100.0)
Note: the McFadden measure is the pseudo R-squared.
49Predicting the Outcome
- Predicted probabilities
- P = F(a + b1 Age + b2 Educ + c Income)
- Predicting outcomes
- Predict y = 1 if P is large
- Use 0.5 for large (more likely than not)
- Generally, use
- Count successes and failures
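The prediction rule and the count of successes and failures can be sketched as follows. The fitted probabilities and outcomes are made up, and the helper names `classify` and `hit_table` are hypothetical.

```python
def classify(probabilities, threshold=0.5):
    """Predict 1 when the fitted probability exceeds the threshold."""
    return [1 if p > threshold else 0 for p in probabilities]

def hit_table(actual, predicted):
    """Cross-tabulate actual outcomes (first key) against predictions (second key)."""
    table = {(a, p): 0 for a in (0, 1) for p in (0, 1)}
    for a, p in zip(actual, predicted):
        table[(a, p)] += 1
    return table

# Made-up fitted probabilities and actual outcomes.
probs = [0.2, 0.7, 0.55, 0.4, 0.9]
actual = [0, 1, 0, 1, 1]

pred = classify(probs)
table = hit_table(actual, pred)
correct = table[(0, 0)] + table[(1, 1)]
print(table, correct)
```

The diagonal cells of the table are the correct predictions; changing `threshold` trades off the two kinds of error.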
50Aggregate Prediction is a Useful Way to Assess
the Importance of a Variable
Predictions for Binary Choice Model. Predicted value is 1 when probability is greater than .500000, 0 otherwise.

Model fit without Income
Actual     Predicted Value
Value      0                1                 Total Actual
0           351 (  1.3)      9784 ( 35.8)     10135 ( 37.1)
1           409 (  1.5)     16782 ( 61.4)     17191 ( 62.9)
Total       760 (  2.8)     26566 ( 97.2)     27326 (100.0)

Model fit with Income
Actual     Predicted Value
Value      0                1                 Total Actual
0           367 (  1.3)      9768 ( 35.7)     10135 ( 37.1)
1           387 (  1.4)     16804 ( 61.5)     17191 ( 62.9)
Total       754 (  2.8)     26572 ( 97.2)     27326 (100.0)

Model fit with Income has 38 more correct predictions (17171 vs. 17133).
51Hypothesis Tests
- Restrictions: Linear or nonlinear functions of the model parameters
- Structural change: Constancy of parameters
- Specification Tests: Heteroscedasticity, model specification (distribution)
52Hypothesis Testing Conventional Neyman/Pearson
- Comparisons of Likelihood Functions: Likelihood Ratio Tests
- Distance Measures: Wald Statistics
- Lagrange Multiplier Tests
53Likelihood Ratio Tests
- Null hypothesis restricts the parameter vector
- Alternative releases the restriction
- Test statistic: Chi-squared = 2(LogL|Unrestricted model - LogL|Restricted model) > 0
- Degrees of freedom = number of restrictions
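A small worked example of the statistic, using the two log likelihoods reported for the probit model of DOCTOR. It assumes the restricted model contains only a constant, so the five slope coefficients are the restrictions; the 5 percent critical value of 11.07 for 5 degrees of freedom is taken from a standard chi-squared table.

```python
# Log likelihoods reported in the probit output for DOCTOR.
log_l_unrestricted = -17670.94
log_l_restricted = -18019.55   # assumed here to be the constant-only model

# Likelihood ratio statistic.
lr = 2.0 * (log_l_unrestricted - log_l_restricted)

df = 5                 # five slope coefficients set to zero (assumption)
critical_95 = 11.07    # chi-squared 5% critical value, 5 degrees of freedom

print(round(lr, 2), lr > critical_95)
```

The statistic is far above the critical value, so the hypothesis that all five slopes are zero is rejected.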
54Wald Test
- Unrestricted parameter vector is estimated
- Discrepancy: m = Rb - q (or r(b,q) if nonlinear) is computed
- Variance of discrepancy is estimated
- Wald Statistic is $m' [Var(m)]^{-1} m$
55Lagrange Multiplier Test
- Restricted model is estimated
- Derivatives of unrestricted model and variances of derivatives are computed at restricted estimates
- Wald test of whether derivatives are zero tests the restrictions
- Usually hard to compute: difficult to program the derivatives and their variances.
56Hypothesis Tests Results
- LIKELIHOOD RATIO
- LRTEST = 88.766777
- WALD
- Matrix WALDSTAT has 1 rows and 1 columns: 89.26382
- LAGRANGE MULTIPLIER
- Binary Logit Model for Binary Choice
- Dependent variable DOCTOR
- Number of observations 27326
- LM Stat. at start values 89.88971
- LM statistic kept as scalar LMSTAT
Testing the hypothesis that the coefficients on
the income and education variables are equal to
zero in the logit model.
57Testing Structural Stability
- Fit the same model in each of K subsamples
- Unrestricted log likelihood is the sum of the subsample log likelihoods: LogL1
- Pool the subsamples, fit the model to the pooled sample
- Restricted log likelihood is that from the pooled sample: LogL0
- Chi-squared = 2(LogL1 - LogL0); degrees of freedom = (K-1) × model size.
58A Test of Structural Stability
- (Application to be examined later) Liberal arts college has gender economics course?
- Covariates: constant, size of economics faculty, academic affiliation, religious affiliation
- Data from 4 U.S. regions: West, North, South, Midwest.
- Is the same model appropriate for all 4 regions?
59Application: Men vs. Women Model for Doctor
Probit for female = 1, Lhs = Doctor, Rhs = X:   Log likelihood function  -7855.219
Probit for female = 0, Lhs = Doctor, Rhs = X:   Log likelihood function  -9541.066
Probit (pooled), Lhs = Doctor, Rhs = X:         Log likelihood function  -17670.94
2[LogL(Female) + LogL(Male) - LogL(Pooled)]
------------------------------------
Listed Calculator Results
------------------------------------
Result = 549.310000 (Chi squared with 6 D.F.)
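The calculator result can be reproduced from the three log likelihoods listed above. The variable names are introduced only for this sketch; K = 2 subsamples and a model size of 6 parameters match the 6 degrees of freedom reported.

```python
# Log likelihoods from the three probit fits listed above.
log_l_female = -7855.219
log_l_male = -9541.066
log_l_pooled = -17670.94

k_groups = 2      # female and male subsamples
model_size = 6    # parameters in each probit model

# Unrestricted log likelihood is the sum over subsamples; restricted is pooled.
chi_squared = 2.0 * ((log_l_female + log_l_male) - log_l_pooled)
df = (k_groups - 1) * model_size

print(round(chi_squared, 3), df)
```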
60Scaling
- $U_{itj} = \alpha_j + \beta_i' x_{itj} + \gamma_i' z_{it} + \varepsilon_{ijt}$
- $\varepsilon_{ijt}$ = Unobserved random component of utility
- Mean: $E[\varepsilon_{ijt}] = 0$, $Var[\varepsilon_{ijt}] = 1$
- Mean = 0 is innocent. Why assume variance = 1?
- What if there are subgroups with different variances?
- Cost of ignoring the between group variation?
- Specifically modeling
- More general heterogeneity across people
- Cost of the homogeneity assumption
- Modeling issues
61Heteroscedasticity in Binary Choice Models
- Random utility: $Y_i = 1$ iff $\beta' x_i + \varepsilon_i > 0$
- Resemblance to regression: How to accommodate heterogeneity in the random unobserved effects across individuals?
- Heteroscedasticity: different scaling
- Parameterize: $Var[\varepsilon_i] = \exp(\gamma' z_i)$
- Reformulate probabilities
- Probit
- Partial effects are now very complicated
62Heteroscedasticity in Marginal Effects
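- For the univariate case
- $E[y_i | x_i, z_i] = F[\beta' x_i / \exp(\gamma' z_i)]$
- $\partial E[y_i | x_i, z_i] / \partial x_i = f[\beta' x_i / \exp(\gamma' z_i)] \times \beta / \exp(\gamma' z_i)$
- $\partial E[y_i | x_i, z_i] / \partial z_i = f[\beta' x_i / \exp(\gamma' z_i)] \times [-\beta' x_i / \exp(\gamma' z_i)] \, \gamma$
- If the variables are the same in x and z, these are added. Sign and magnitude are ambiguous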
63Testing For Heteroscedasticity
- Likelihood Ratio, Wald and Lagrange Multiplier Tests are all straightforward
- All tests require a specification of the model of heteroscedasticity
- There is no generic test for heteroscedasticity without a specific model
64Robust Estimation
- There is no heteroscedasticity robust (White) covariance estimator.
- Robust (semiparametric) parameter estimators do not allow further analysis.
- Only ratios of coefficients are estimable
- Probabilities and partial effects cannot be computed. (Scaling is not accounted for.)
65Heteroscedasticity in the Doctor Equation
---------------------------------------------
Binomial Probit Model
Dependent variable DOCTOR
Log likelihood function -17496.19
Log likelihood function      -17670.94 (Restricted model. LR = 349.5 w/ 2 DF)
LM Stat. at start values      313.4050 (Computed separately)
----------------------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
---------Index function for probability
Constant    .06472667     .01268180       5.104     .0000
HHNINC     -.01170328     .00554442      -2.111     .0348    .35208362
HHKIDS     -.01356948     .00462470      -2.934     .0033    .40273000
EDUC       -.00084257     .00051971      -1.621     .1050    11.3206310
AGE        -.00030092     .00014827      -2.030     .0424    43.5256898  (Note negative coefficient)
MARRIED     .00610723     .00268916       2.271     .0231    .75861817
---------Variance function
AGE        -.03914159     .00629100      -6.222     .0000    43.5256898  (Note larger negative coefficient)
FEMALE     -.77274469     .05529956     -13.974     .0000    .47877479   (Highly significant.)
-------------------------------------------
Partial derivatives of E[y] = F[*] with respect to the vector of characteristics.
-------------------------------------------
HHNINC     -.03554999     .01374920      -2.586     .0097
HHKIDS     -.04121878     .00714739      -5.767     .0000
EDUC       -.00255939     .00127739      -2.004     .0451
AGE         .00350153     .00349942       1.001     .3170
MARRIED     .01855137     .00586656       3.162     .0016
---------Variance function
AGE         .00350153     .00349942       1.001     .3170    (Note positive marginal effect!)
FEMALE      .08717426     .04486398       1.943     .0520    (Insignificant?)
66Choice Based Sampling
- Sample estimator (MLE) mimics the sample
- MLE assumes the sample mimics the population
- If the sample is nonrepresentative of the population, the MLE will be also.
- Choice based samples
- Sample is biased
- Certain outcomes (choices) are over- or undersampled
- Estimator (MLE) will produce estimates that mimic this bias.
67Choice Based Sample for Transport Mode
68Weighting and Choice Based Sampling
- Weighted log likelihood for all data types
- Endogenous weights for individual data
- Biased sampling Choice Based
69Choice Based Sampling Correction
- Maximize Weighted Log Likelihood
- Covariance Matrix Adjustment
- $V = H^{-1} G H^{-1}$ (all three weighted)
- H = Hessian
- G = Outer products of gradients
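A minimal sketch of the weighting step, assuming the common choice of weights equal to the population share of each outcome divided by its sample share (the Manski-Lerman style of weighting). The shares below are hypothetical, the helper names are made up, and the covariance matrix adjustment is not shown.

```python
import math

def choice_based_weights(pop_share, sample_share):
    """Weight for each outcome: population share / sample share (assumed form)."""
    return {j: pop_share[j] / sample_share[j] for j in pop_share}

def weighted_log_likelihood(probs_chosen, choices, weights):
    """Sum of weight times log of the model probability of the observed choice."""
    return sum(weights[j] * math.log(p) for p, j in zip(probs_chosen, choices))

# Hypothetical shares: flyers are oversampled relative to the population.
pop = {"fly": 0.14, "other": 0.86}
samp = {"fly": 0.28, "other": 0.72}
w = choice_based_weights(pop, samp)

# Two made-up observations with model probabilities for the observed choice.
ll = weighted_log_likelihood([0.6, 0.7], ["fly", "other"], w)
print(w, round(ll, 4))
```

Oversampled outcomes receive weights below one and undersampled outcomes receive weights above one, so the weighted sample mimics the population shares.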
70Effect of Choice Based Sampling
Unweighted
--------------------------------------------------------------
Variable   Coefficient        Standard Error  b/St.Er.  P[|Z|>z]
--------------------------------------------------------------
Constant    1.784582594        1.2693459       1.406     .1598
GC          .2146879786E-01    .68080941E-02   3.153     .0016
TTME       -.9846704221E-01    .16518003E-01  -5.961     .0000
HINC        .2232338915E-01    .10297671E-01   2.168     .0302

Weighting variable CBWT. Corrected for Choice Based Sampling
--------------------------------------------------------------
Variable   Coefficient        Standard Error  b/St.Er.  P[|Z|>z]
--------------------------------------------------------------
Constant    1.014022236        1.1786164        .860     .3896
GC          .2177810754E-01    .63743831E-02   3.417     .0006
TTME       -.7434280587E-01    .17721665E-01  -4.195     .0000
HINC        .2471679844E-01    .95483369E-02   2.589     .0096
71Panel Data Treatments
- Pooling and robust estimation
- Clustering corrections
- Panel estimators
- Random effect
- Fixed effects
- Modeling heterogeneity
- Common effects
- Random parameters
- Mixed models
- Latent class models
72Panel Data Application
- Did firm i produce a product or process innovation in year t? $y_{it}$ = 1[Yes] / 0[No]
- Observed N = 1270 firms for T = 5 years, 1984-1988
- Observed covariates $x_{it}$: Industry, competitive pressures, size, productivity, etc.
- How to model?
- Binary outcome
- Correlation across time
- Heterogeneity across firms
73Application
74Cluster Effects in Panel and Stratified Data
- What do we mean by this?
- Clustering is with respect to the dependent variable
- Clustering is with respect to unobserved effects in the model
- Clustering with respect to independent variables is irrelevant and should be ignored.
- Correction is with respect to the covariance matrix, not the estimator
- Is the robust covariance matrix robust? To what?
- What assumptions are needed for the correction to work? The pooled estimator must be consistent!
75Cluster Correction
77Fixed and Random Effects in Regression
- $y_{it} = a_i + b' x_{it} + e_{it}$
- Random effects: Two step FGLS. First step is OLS
- Fixed effects: OLS based on group mean differences
- Neither works (even approximately) if the model is nonlinear.
- How do we proceed for a binary choice model?
- $y_{it}^* = a_i + b' x_{it} + e_{it}$
- $y_{it} = 1$ if $y_{it}^* > 0$, 0 otherwise.
78Panel Data and Binary Choice Models
- Random Utility Model for Binary Choice
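- $U_{it} = \alpha + \beta' x_{it} + \varepsilon_{it}$ + Person i specific effect
- Fixed effects using dummy variables
- $U_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$
- Random effects using omitted heterogeneity
- $U_{it} = \alpha + \beta' x_{it} + (\varepsilon_{it} + v_i)$
- Same outcome mechanism: $Y_{it} = 1[U_{it} > 0]$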
79Fixed and Random Effects Models
- Fixed Effects
- Robust to both specifications
- Inconvenient to compute (many parameters)
- Incidental parameters problem
- Random Effects
- Inconsistent if effects are correlated with X
- Small(er) number of parameters
- Easier (?) to compute
- Computation available estimators
- Other Approaches to Modeling Heterogeneity
80Random Effects
- $U_{it} = \alpha + \beta' x_{it} + (\varepsilon_{it} + \sigma_v v_i)$
- Logit model (can be generalized)
- Joint probability for individual i | $v_i$
- Unobserved component $v_i$ must be eliminated
- Maximize wrt $\alpha$, $\beta$ and $\sigma_v$
- How to do the integration?
- Analytic integration: Integral does not exist in closed form
- Quadrature: most familiar software
- Simulation
81Quadrature Butler and Moffitt
82Estimation by Simulation
The log likelihood is the sum of the logs of E[Pr(y1,y2,...|vi)]. The expectation can be estimated by sampling vi and averaging. (Use random numbers.)
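The averaging step can be sketched for one individual in a random effects probit. This is a simplified illustration with a single regressor and plain pseudo-random normal draws (not Halton draws); the function names and all numerical values are hypothetical.

```python
import math
import random

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulated_prob(y, x, beta, sigma_v, draws):
    """Average over draws of v of the joint probability of one person's outcomes."""
    total = 0.0
    for v in draws:
        joint = 1.0
        for y_t, x_t in zip(y, x):
            index = beta * x_t + sigma_v * v
            q = 2 * y_t - 1              # +1 if y = 1, -1 if y = 0
            joint *= normal_cdf(q * index)
        total += joint
    return total / len(draws)

rng = random.Random(0)                   # fixed seed for reproducibility
draws = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# One person observed for two periods (made-up data and parameters).
p_i = simulated_prob(y=[1, 0], x=[0.5, -0.3], beta=1.0, sigma_v=0.9, draws=draws)
log_l_i = math.log(p_i)
print(round(p_i, 4), round(log_l_i, 4))
```

Summing `log_l_i` over individuals gives the simulated log likelihood, which is then maximized over the parameters. With `sigma_v = 0` the draws have no effect and the joint probability reduces to the product of the period probabilities.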
83Estimated Random Effects Models
---------------------------------------------
Random Effects Binary Probit Model
Log likelihood function -16273.96
Restricted log likelihood -17670.94
Unbalanced panel has 7293 individuals.
---------------------------------------------
Butler/Moffitt Quadrature
----------------------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
Constant    .03411277     .09635399        .354     .7233
HHNINC     -.00317550     .06667150       -.048     .9620    .35208362
HHKIDS     -.15378566     .02704366      -5.687     .0000    .40273000
EDUC       -.03369428     .00628888      -5.358     .0000    11.3206310
AGE         .02014296     .00131894      15.272     .0000    43.5256898
MARRIED     .01632531     .03134693        .521     .6025    .75861817
Rho         .44789069     .01020965      43.869     .0000

Simulation with 50 Halton Points
---------------------------------------------
Random Coefficients Probit Model
Log likelihood function      -16279.97
PROBIT (normal) probability model
Simulation based on 50 Halton draws
---------------------------------------------
---------Means for random parameters
Constant    .03329051     .06322876        .527     .5985
---------Nonrandom parameters
HHNINC     -.00297343     .05201195       -.057     .9544    .35208362
HHKIDS     -.15357945     .02028593      -7.571     .0000    .40273000
EDUC       -.03348872     .00393143      -8.518     .0000    11.3206310
AGE         .02007864     .00090132      22.277     .0000    43.5256898
MARRIED     .01682560     .02277150        .739     .4600    .75861817
---------Scale parameters for dists. of random parameters
Constant    .90088375     .01126251      79.990     .0000
RHO = .90088375^2 / (1 + .90088375^2) = .447997
84Ignoring Unobserved Heterogeneity
85The Effect of Ignoring Random Effects: Logit Coefficient Estimates
logit; lhs=doctor; rhs=x; mar; pds=_groupti; ran
----------------------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
Random Effects
Constant   -.13460475     .17764130       -.758     .4486
HHNINC      .02191356     .11865884        .185     .8535
HHKIDS     -.21598299     .04773805      -4.524     .0000
EDUC       -.06357790     .01132182      -5.616     .0000
AGE         .03926718     .00246587      15.924     .0000
MARRIED     .02507118     .05628204        .445     .6560
Rho         .41607571     .00583916      71.256     .0000
Pooled
Constant    .25111543     .09113537       2.755     .0059
HHNINC     -.18592232     .07506403      -2.477     .0133    .35208362
HHKIDS     -.22947000     .02953694      -7.769     .0000    .40273000
EDUC       -.04558783     .00564646      -8.074     .0000    11.3206310
AGE         .02070863     .00128517      16.114     .0000    43.5256898
MARRIED     .08529305     .03328573       2.562     .0104    .75861817
The cluster estimator does not fix this.
86The Effect of Ignoring Random Effects: Marginal Effects
-----------------------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]  Elasticity
-----------------------------------------------------------------------
Random Effects
HHNINC      .00358351     .01940382        .185     .8535     .00198108
EDUC       -.01039686     .00184906      -5.623     .0000    -.18480732
AGE         .00642134     .00040269      15.946     .0000     .43885160
---------Marginal effect for dummy variable is P|1 - P|0.
HHKIDS     -.03544814     .00786141      -4.509     .0000    -.02241578
MARRIED     .00410498     .00922645        .445     .6564     .00488969
Pooled
HHNINC     -.04321347     .01744584      -2.477     .0132    -.02405262
EDUC       -.01059587     .00131215      -8.075     .0000    -.18962896
AGE         .00481326     .00029819      16.142     .0000     .33119369
---------Marginal effect for dummy variable is P|1 - P|0.
HHKIDS     -.05359813     .00692332      -7.742     .0000    -.03412409
MARRIED     .01993604     .00782159       2.549     .0108     .02390890
87Fixed Effects
- Dummy variable coefficients
- $U_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$
- Can be done by brute force for 10,000s of individuals
- F(.) = appropriate probability for the observed outcome
- Compute $\beta$ and $\alpha_i$ for i = 1,...,N (may be large)
- See Estimating Econometric Models with Fixed Effects at www.stern.nyu.edu/wgreene
88Models with Fixed Individual Effects
- Additive Effects
- Log Likelihood Function
- Approach
- Conditional estimation based on sufficient statistics
- Unconditional, brute force with all dummy variables
89Conditional Estimation
- Principle: $f(y_{i1}, y_{i2}, \dots |$ some statistic$)$ is free of the fixed effects for some models.
- Maximize the conditional log likelihood, given the statistic.
90Conditional Logit Model
91Conditional Estimation
- Other Distributions?
- Poisson the leading nonbinary case
- Loglinear Exponential
- Almost no others
- Estimating constants is still a problem if
marginal effects or probabilities are desired
92Example Two Period Binary Logit
93Binary Logit, cont.
- Estimate $\beta$ by maximizing conditional logL
- Estimate $\alpha_i$ by using the known $\beta$ in the FOC for the unconditional logL
- Solve for the N constants, one at a time treating $\beta$ as known.
- No solution when $y_{it}$ sums to 0 or $T_i$
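For the two period logit case, the reason conditioning works can be checked numerically: the probability of the sequence (0, 1), given that the outcomes sum to 1, does not depend on the individual constant. The sketch below uses a single regressor and made-up values; the function names are hypothetical.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def cond_prob_01(alpha, beta, x1, x2):
    """Prob(y1=0, y2=1 | y1+y2=1), built from the unconditional logit probabilities."""
    p1 = logistic(alpha + beta * x1)
    p2 = logistic(alpha + beta * x2)
    p01 = (1.0 - p1) * p2
    p10 = p1 * (1.0 - p2)
    return p01 / (p01 + p10)

beta, x1, x2 = 0.8, 0.2, 1.5

# Closed form: a logit probability in the difference x2 - x1, with no constant.
closed_form = logistic(beta * (x2 - x1))

for alpha in (-3.0, 0.0, 2.0):
    print(alpha, round(cond_prob_01(alpha, beta, x1, x2), 6), round(closed_form, 6))
```

Every value of `alpha` gives the same conditional probability, so the conditional log likelihood can be maximized over the slope alone. Individuals whose outcomes sum to 0 or 2 contribute nothing, which matches the remark that there is no solution for their constants.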
94Wooldridge on Estimating Partial Effects
- The fixed effects logit estimator of $\beta$ immediately gives us the effect of each element of $x_i$ on the log-odds ratio. Unfortunately, we cannot estimate the partial effects unless we plug in a value for $a_i$. Because the distribution of $a_i$ is unrestricted (in particular, $E[a_i]$ is not necessarily zero), it is hard to know what to plug in for $a_i$. In addition, we cannot estimate average partial effects, as doing so would require finding $E[\Lambda(x_{it}\beta + a_i)]$, a task that apparently requires specifying a distribution for $a_i$.
95Logit Constant Terms
96Unconditional Estimation
- Maximize the whole log likelihood
- Difficult! Many (thousands) of parameters.
- Feasible NLOGIT (2001) (Brute force)
97Unconditional Estimator
---------------------------------------------
FIXED EFFECTS Logit Model
Log likelihood function -9458.638
Unbalanced panel has 7293 individuals.
Bypassed 3046 groups with inestimable a(i).
LOGIT (Logistic) probability model
---------------------------------------------
----------------------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
---------Index function for probability
HHNINC     -.06097272     .17828658       -.342     .7324    .35357827
HHKIDS     -.08840685     .07439887      -1.188     .2347    .43906021
EDUC       -.11670836     .06674866      -1.748     .0804    11.3554798
AGE         .10475175     .00725480      14.439     .0000    43.0477999
MARRIED    -.05731835     .10608750       -.540     .5890    .77178591
-------------------------------------------
Partial derivatives of E[y] = F[*] with respect to the vector of characteristics.
They are computed at the means of the Xs.
Estimated E[y|means, mean alphai] = .612
Estimated scale factor for dE/dx = .237
-------------------------------------------
HHNINC     -.01447714     .04208235       -.344     .7308    .35357827
---------Marginal effect for binary independent variable
HHKIDS     -.02099100     .00415463      -5.052     .0000    .43906021
EDUC       -.02771081     .02037808      -1.360     .1739    11.3554798
AGE         .02487187     .00420872       5.910     .0000    43.0477999
---------Marginal effect for binary independent variable
MARRIED    -.01360946     .00469760      -2.897     .0038    .77178591
98Conditional Estimator
--------------------------------------------------
Panel Data Binomial Logit Model
Number of individuals        7293
Number of periods            _GROUPTI
Conditioning event is the sum of DOCTOR
--------------------------------------------------
Log likelihood function      -6299.016
------------------------------------------------------
Variable   Coefficient   Standard Error  b/St.Er.  P[|Z|>z]
------------------------------------------------------
HHNINC     -.05038297     .15887796       -.317     .7512
HHKIDS     -.07776425     .06628241      -1.173     .2407
EDUC       -.09081577     .05667292      -1.602     .1091
AGE         .08475971     .00650217      13.036     .0000
MARRIED    -.05207227     .09304411       -.560     .5757
-------------------------------------------
Partial derivatives of probabilities with respect to the vector of characteristics.
They are computed at the means of the Xs.
Observations used are All Obs.
(Constant terms estimated individually after estimation of beta by Chamberlain method)
-------------------------------------------
---------Marginal effect for variable in probability
HHNINC     -.01205716     .03807980       -.317     .7515    -.00703544
HHKIDS     -.01860977     .01891717       -.984     .3252    -.34915032
EDUC       -.02173314     .01390241      -1.563     .1180    -.03601829
AGE         .02028386     .00310658       6.529     .0000    1.46317734
---------Marginal effect for dummy variable is P|1 - P|0.
MARRIED    -.00314259     .00566690       -.555     .5792    -.00209750
99Advantages of the FE Model
- Allows correlation of effect and regressors
- Fairly straightforward to estimate
- Simple to interpret
100Disadvantages of FE
- Not necessarily simple to estimate in very large samples (Stata just creates the thousands of dummy variables)
- The incidental parameters problem: small T bias
101Incidental Parameters Problems: Conventional Wisdom
- General: Biased in samples with fixed T except in special cases such as linear or Poisson regression
- Specific: Upward bias (experience with probit and logit) in estimators of ß
102What We Know About the IP Problem in Binary Choice Models
- Andersen, Hsiao, Abrevaya (exact, analytic)
- Bias in logit estimator is exactly 100% when T = 2
- No result ever obtained for any other model for T2
- Heckman (nonreplicable Monte Carlo study)
- Bias in probit estimator is small if T ≥ 8
- Bias in probit estimator is toward 0 in some cases
- Katz (et al., numerous others), Greene
- Bias in probit and logit estimators is large
- Upward bias persists even as T → 20
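The T = 2 logit result can be checked numerically. The following is a minimal sketch, not the procedure used in the studies cited above; the sample size, seed, true slope of 1.0, and normal draws for x and the effects are all illustrative assumptions. It relies on the fact that, for T = 2, groups with y1 = y2 contribute nothing, and for the remaining groups the dummy-variable coefficient can be concentrated out as ai = -ß(x1 + x2)/2. The unconditional likelihood then reduces to a logit in (x2 - x1)/2, while the conditional (Chamberlain) likelihood is a logit in (x2 - x1).

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit_no_constant(z, y, iterations=50):
    """Newton-Raphson MLE of scalar b in Prob[y=1] = logistic(b*z)."""
    b = 0.0
    for _ in range(iterations):
        grad = 0.0
        hess = 0.0
        for zi, yi in zip(z, y):
            p = logistic(b * zi)
            grad += zi * (yi - p)
            hess -= zi * zi * p * (1.0 - p)
        step = grad / hess
        b -= step  # hess < 0, so this moves uphill on the log likelihood
        if abs(step) < 1e-12:
            break
    return b

def simulate_switchers(n_groups, beta, seed=12345):
    """Simulate a T=2 fixed effects logit; keep only groups with y1 + y2 = 1."""
    rng = random.Random(seed)
    dx, y2 = [], []
    for _ in range(n_groups):
        a = rng.gauss(0.0, 1.0)            # individual effect (assumed N(0,1))
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y_1 = 1 if rng.random() < logistic(a + beta * x1) else 0
        y_2 = 1 if rng.random() < logistic(a + beta * x2) else 0
        if y_1 + y_2 == 1:
            dx.append(x2 - x1)
            y2.append(y_2)
    return dx, y2

dx, y2 = simulate_switchers(n_groups=5000, beta=1.0)
# Conditional estimator: logit of y2 on (x2 - x1) among switchers.
b_conditional = fit_logit_no_constant(dx, y2)
# Unconditional estimator with ai concentrated out: logit on (x2 - x1)/2.
b_unconditional = fit_logit_no_constant([d / 2.0 for d in dx], y2)
print(b_conditional, b_unconditional, b_unconditional / b_conditional)
```

In this setup the ratio of the two estimates is 2 by construction of the concentrated likelihood, which is the 100% bias, while the conditional estimate should land near the true slope.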
103Some Familiar Territory: A Monte Carlo Study of the FE Estimator, Probit vs. Logit
Estimates of Coefficients and Marginal Effects at the Implied Data Means
Results are scaled so that the quantities being estimated (ß, δ, marginal effects) all equal 1.0 in the population.
104Specification Tests: RE vs. FE
- Fixed effects vs. random effects
- Unconditional FE estimator is never consistent (if T is small)
- RE is inconsistent if FE applies
- Cannot use Hausman test
- Effects vs. no effects
- Conditional FE estimator is always consistent
- Pooled estimator is consistent if no effects
- Can use a Hausman test for this specification. The test is for common effects vs. no effects, not fixed effects vs. no effects.
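For reference, the usual form of the Hausman statistic for the effects vs. no effects comparison can be written as follows. This is a sketch in assumed notation: b_C is the conditional FE slope estimator, b_P the slopes of the pooled estimator (constant excluded), V_C and V_P their estimated asymptotic covariance matrices, and K the number of slopes compared.

```latex
H = (\mathbf{b}_C - \mathbf{b}_P)'
    \left[\mathbf{V}_C - \mathbf{V}_P\right]^{-1}
    (\mathbf{b}_C - \mathbf{b}_P)
    \;\xrightarrow{d}\; \chi^2[K]
    \quad \text{under } H_0\text{: no effects.}
```

Under the null the pooled estimator is consistent and efficient and the conditional estimator is consistent but inefficient, so the two should differ only by sampling error. In finite samples the difference matrix need not be positive definite.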
105Dynamic Models
106Dynamic Probit Model A Standard Approach
107Bias Reduction
- Dynamic binary choice
- yit = 1(ßxit + δyi,t-1 + ci > 0)
- Fixed or random effects
- Common fixed effects
- yit = 1(ßxit + ci > 0)
- Presumed proportional bias: plim b = kß
- Estimate the proportionality constant, k.
- Literature is in its infancy
- How do we know b is proportionally biased?
- Known analytic results apply to trivial models
- Not yet useful for practitioners
108Ordered Outcomes
- E.g., taste test, credit rating, course grade
- Underlying random preferences: mapping to observed choices
- Strength of preferences
- Censoring and discrete measurement
- The nature of ordered data
109Modeling Ordered Choices
- Random Utility
- Uit = α + ßxit + δizit + εit
- Uit = ait + εit
- Observe outcome j if utility is in region j
- Probability of outcome = probability of cell
- Pr[Yit = j] = F(µj - ait) - F(µj-1 - ait)
110Application: Health Care Usage
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set. There are altogether 27,326 observations. The number of observations per individual ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987.) Note, the variable NUMOBS below tells how many observations there are for each person. This variable is repeated in each row of the data for the person.
Variables in the file are:
DOCTOR   = 1(Number of doctor visits > 0)
HOSPITAL = 1(Number of hospital visits > 0)
HSAT     = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS   = number of doctor visits in last three months
HOSPVIS  = number of hospital visits in last calendar year
PUBLIC   = insured in public health insurance = 1; otherwise = 0
ADDON    = insured by add-on insurance = 1; otherwise = 0
HHNINC   = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped)
HHKIDS   = children under age 16 in the household = 1; otherwise = 0
EDUC     = years of schooling
AGE      = age in years
MARRIED  = marital status
111Health Care Satisfaction (HSAT)
Self-Administered Survey: Health Care Satisfaction? (0 - 10)
Continuous Preference Scale
112Ordered Probabilities
113Four Ordered Probabilities
[Figure: four ordered probabilities. Cut points on the axis: -∞ - ßx, 0 - ßx, µ1 - ßx, µ2 - ßx, +∞ - ßx. Cells labeled y=0, y=1, y=2, y=3.]
114Coefficients
115Effects in the Ordered Probability Model
Assume that ßk is positive and that xk increases. Then ßx increases, and µj - ßx shifts to the left for all 4 cells. Prob[y=0] decreases. Prob[y=1] decreases: the mass shifted out is larger than the mass shifted in. Prob[y=2] increases for the same reason. Prob[y=3] increases.
When ßk > 0, an increase in xk decreases Prob[y=0] and increases Prob[y=J]. Intermediate cells are ambiguous, but there is only one sign change in the marginal effects from 0 to 1 to ... to J.
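The sign pattern can be read off the derivative of each cell probability. A sketch in the slides' notation, assuming F is the disturbance CDF with density f and the conventions µ-1 = -∞, µ0 = 0, µJ = +∞:

```latex
\frac{\partial \,\mathrm{Prob}[y=j]}{\partial x_k}
  = \left[\, f(\mu_{j-1} - \beta'x) - f(\mu_j - \beta'x) \,\right]\beta_k ,
  \qquad j = 0,1,\dots,J,
\\[4pt]
\frac{\partial \,\mathrm{Prob}[y=0]}{\partial x_k} = -f(-\beta'x)\,\beta_k ,
\qquad
\frac{\partial \,\mathrm{Prob}[y=J]}{\partial x_k} = f(\mu_{J-1} - \beta'x)\,\beta_k .
```

The bracketed term is negative in the lowest cell and positive in the highest, and the effects sum to zero across cells because the probabilities sum to one.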
116Ordered Probability Model for Health Satisfaction
---------------------------------------------
Ordered Probability Model
Dependent variable              HSAT
Number of observations          27326
Underlying probabilities based on Normal
Cell frequencies for outcomes
  Y   Count   Freq      Y   Count   Freq      Y   Count   Freq
  0     447   .016      1     255   .009      2     642   .023
  3    1173   .042      4    1390   .050      5    4233   .154
  6    2530   .092      7    4231   .154      8    6172   .225
  9    3061   .112     10    3192   .116
---------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
--------------------------------------------------------------------------
Index function for probability
Constant   2.61335825    .04658496         56.099     .0000
FEMALE     -.05840486    .01259442         -4.637     .0000     .47877479
EDUC        .03390552    .00284332         11.925     .0000     11.3206310
AGE        -.01997327    .00059487        -33.576     .0000     43.5256898
HHNINC      .25914964    .03631951          7.135     .0000     .35208362
HHKIDS      .06314906    .01350176          4.677     .0000     .40273000
Threshold parameters for index
Mu(1)       .19352076    .01002714         19.300     .0000
Mu(2)       .49955053    .01087525         45.935     .0000
Mu(3)       .83593441    .00990420         84.402     .0000
Mu(4)      1.10524187    .00908506        121.655     .0000
Mu(5)      1.66256620    .00801113        207.532     .0000
Mu(6)      1.92729096    .00774122        248.965     .0000
Mu(7)      2.33879408    .00777041        300.987     .0000
Mu(8)      2.99432165    .00851090        351.822     .0000
Mu(9)      3.45366015    .01017554        339.408     .0000
117Ordered Probability Effects
----------------------------------------------------
Marginal effects for ordered probability model
M.E.s for dummy variables are Pr[y|x=1]-Pr[y|x=0]
Names for dummy variables are marked by *.
----------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
--------------------------------------------------------------------------
These are the effects on Prob[Y=00] at means.
*FEMALE     .00200414    .00043473          4.610     .0000     .47877479
EDUC       -.00115962    .986135D-04      -11.759     .0000     11.3206310
AGE         .00068311    .224205D-04       30.468     .0000     43.5256898
HHNINC     -.00886328    .00124869         -7.098     .0000     .35208362
*HHKIDS    -.00213193    .00045119         -4.725     .0000     .40273000
These are the effects on Prob[Y=01] at means.
*FEMALE     .00101533    .00021973          4.621     .0000     .47877479
EDUC       -.00058810    .496973D-04      -11.834     .0000     11.3206310
AGE         .00034644    .108937D-04       31.802     .0000     43.5256898
HHNINC     -.00449505    .00063180         -7.115     .0000     .35208362
*HHKIDS    -.00108460    .00022994         -4.717     .0000     .40273000
... repeated for all 11 outcomes
These are the effects on Prob[Y=10] at means.
*FEMALE    -.01082419    .00233746         -4.631     .0000     .47877479
EDUC        .00629289    .00053706         11.717     .0000     11.3206310
AGE        -.00370705    .00012547        -29.545     .0000     43.5256898
HHNINC      .04809836    .00678434          7.090     .0000     .35208362
*HHKIDS     .01181070    .00255177          4.628     .0000     .40273000
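The effects at the means for a continuous variable can be reproduced from the slide 116 estimates. The sketch below assumes the normal (probit) form reported there, with cell probabilities Prob[Y=j] = Φ(µj - ßx) - Φ(µj-1 - ßx) and the normalization µ0 = 0. The function and variable names are illustrative, not part of NLOGIT. The coefficients, thresholds and means are copied from the output.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Estimates and sample means copied from the ordered probit output (slide 116).
constant = 2.61335825
beta = {"FEMALE": -0.05840486, "EDUC": 0.03390552, "AGE": -0.01997327,
        "HHNINC": 0.25914964, "HHKIDS": 0.06314906}
means = {"FEMALE": 0.47877479, "EDUC": 11.3206310, "AGE": 43.5256898,
         "HHNINC": 0.35208362, "HHKIDS": 0.40273000}
mu = [0.0, 0.19352076, 0.49955053, 0.83593441, 1.10524187, 1.66256620,
      1.92729096, 2.33879408, 2.99432165, 3.45366015]  # mu(0) = 0 by normalization

# Cell j lies between cuts[j] and cuts[j + 1]; there are 11 cells (HSAT = 0..10).
cuts = [-math.inf] + mu + [math.inf]
xb = constant + sum(beta[k] * means[k] for k in beta)

def cell_probability(j):
    return norm_cdf(cuts[j + 1] - xb) - norm_cdf(cuts[j] - xb)

def marginal_effect(j, name):
    """Derivative of Prob[Y=j] with respect to a continuous regressor."""
    return (norm_pdf(cuts[j] - xb) - norm_pdf(cuts[j + 1] - xb)) * beta[name]

probabilities = [cell_probability(j) for j in range(11)]
me_educ_low = marginal_effect(0, "EDUC")    # slide reports -.00115962
me_educ_high = marginal_effect(10, "EDUC")  # slide reports  .00629289
print(me_educ_low, me_educ_high, sum(probabilities))
```

The dummy variables (FEMALE, HHKIDS) are handled differently in the output, as differences in probabilities rather than derivatives, so this derivative formula applies only to EDUC, AGE and HHNINC.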
118Ordered Probit Marginal Effects
119Panel Data Treatments: FE
- No sufficient statistics ⇒ no conditional estimator
- Unconditional (brute force) is straightforward
- Transformed model
- Prob(yit > j, t = 1,...,Ti | Xi) is an ordered binary choice model
- Produces multiple estimators of ß
- Reconcile with minimum distance
120Transformed Model
Fit this model for each j = 1,...,J as a fixed or random effects binary choice model. Each has its own j-specific constant term (random effects) or estimates of (ai - µj) and its own specific vector ßj. Reconcile the multiple slope vectors with a minimum distance weighted average. (In both cases, only the part of ßj that is not the constant term.) (Note: This is a way to get a consistent estimator in the presence of fixed effects. It is not needed for random effects.)
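The reconciliation step can be illustrated for a single slope coefficient. The sketch below is a simplification: it weights each ßj by the inverse of its own estimated variance and ignores the covariances between the J estimators, which are computed from the same sample and are therefore correlated. A full minimum distance estimator would use the joint covariance matrix. The function name and the numbers are hypothetical, not taken from the slides.

```python
def minimum_distance_average(estimates, variances):
    """Inverse-variance weighted average of J scalar estimates of one slope.

    Returns the combined estimate and its variance under the (assumed)
    simplification that the J estimators are uncorrelated.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    combined = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    return combined, 1.0 / total_weight

# Hypothetical slope estimates from three of the binary models (j = 1, 2, 3).
b_j = [0.30, 0.28, 0.33]
var_j = [0.004, 0.002, 0.008]
b_md, var_md = minimum_distance_average(b_j, var_j)
print(b_md, var_md)
```

The more precisely estimated ßj receive more weight, so the combined value is pulled toward the estimate with the smallest variance.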
121Fixed Effects Estimates
---------------------------------------------
FIXED EFFECTS OrdPrb Model
Number of observations        27326
Iterations completed          6
Log likelihood function       -41876.93
Number of parameters          5680
Unbalanced panel has 7293 individuals.
Bypassed 1626 groups with inestimable a(i).
LHS variable values 0,1,...,10
---------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
--------------------------------------------------------------------------
Index function for probability
AGE        -.07137568    .00273513        -26.096     .0000     43.9209856
HHNINC      .30140100    .06919695          4.356     .0000     .35112607
EDUC        .02405894    .02649654           .908     .3639     11.3100525
HHKIDS     -.05493925    .02766401         -1.986     .0470     .40921377
MU(1)       .32485866    .02036400         15.953     .0000
MU(2)       .84476382    .02736032         30.876     .0000
MU(3)      1.39396202    .03002635         46.425     .0000
MU(4)      1.82292900    .03101930         58.768     .0000
MU(5)      2.69905222    .03227934         83.615     .0000
MU(6)      3.12710904    .03273884         95.517     .0000
MU(7)      3.79215966    .03344847        113.373     .0000
MU(8)      4.84341077    .03489769        138.789     .0000
MU(9)      5.57238334    .03629754        153.520     .0000
Time invariant variable FEMALE is dropped from the fixed effects model.
122Generalizing the Ordered Probit
- Index = ßxit
- Thresholds
- Standard model: µ-1 = -∞, µ0 = 0, µj > µj-1 > 0
- Homogeneous preference scale
- Generalized (Pudney/Shields, JAE 2000, Job Grades)
- Note the identification problem: if any variables are common to the two parts (the thresholds and the index ßxit), the coefficients are not identified.
- Harris, Zhao, Greene (2004, Drug Use)
123Multivariate Binary Choice Models
- Bivariate Probit Models
- Analysis of bivariate choices
- Marginal effects
- Prediction
- Simultaneous Equations and Recursive Models
- A Sample Selection Bivariate Probit Model
- The Multivariate Probit Model
- Specification
- Simulation based estimation
- Inference
- Partial effects and analysis
- The panel probit model
124Gross Relation Between Two Binary Variables
- Cross Tabulation Suggests Presence or Absence of a Bivariate Relationship
--------------------------------------------------------------
Cross Tabulation
Row variable is DOCTOR    (Out of range 0-49: 0)
Number of Rows = 2        (DOCTOR = 0 to 1)
Col variable is HOSPITAL  (Out of range 0-49: 0)
Number of Cols = 2        (HOSPITAL = 0 to 1)
Chi-squared independence tests:
Chi-squared[1] = 430.11235    Prob value = .00000
G-squared  [1] = 477.27393    Prob value = .00000
--------------------------------------------------------------
Joint Frequencies for Row Variable DOCTOR, Column Variable HOSPITAL
--------------------------------------------------------------
                        HOSPITAL
DOCTOR       Total          0          1
--------------------------------------------------------------
0            10135       9715        420
1            17191      15216       1975
--------------------------------------------------------------
Total        27326      24931       2395
--------------------------------------------------------------
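Both independence statistics can be recomputed from the four joint frequencies. The sketch below uses the standard Pearson chi-squared and likelihood ratio (G-squared) formulas with expected counts from the product of the margins. The function name is illustrative.

```python
import math

# Joint frequencies from the cross tabulation: rows DOCTOR = 0, 1; columns HOSPITAL = 0, 1.
observed = [[9715, 420],
            [15216, 1975]]

def independence_tests(table):
    """Return (Pearson chi-squared, G-squared) for a two-way table of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    pearson = 0.0
    g_squared = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            pearson += (count - expected) ** 2 / expected
            g_squared += 2.0 * count * math.log(count / expected)
    return pearson, g_squared

chi_squared, g_squared = independence_tests(observed)
print(chi_squared, g_squared)  # the output above reports 430.11235 and 477.27393
```

Each statistic has one degree of freedom in a 2 x 2 table, so values of this size reject independence of DOCTOR and HOSPITAL decisively.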
125Tetrachoric Correlation
http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm
http://www2.chass.ncsu.edu/garson/pa765/correl.htm
126Estimating Tetrachoric Correlation
- Numerous ad hoc algorithms suggested in the literature
- Do not appear to have noticed the connection to a bivariate probit model
- Maximum likelihood estimation is simple under the (necessary) assumption of normality
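The connection can be sketched for the DOCTOR/HOSPITAL table. A bivariate probit with only constant terms has three parameters (two thresholds and the correlation) and the 2 x 2 table has three free cell proportions, so the maximum likelihood estimates reproduce the observed proportions exactly. Under that assumption the tetrachoric correlation is the value of rho at which the bivariate normal probability of the (0,0) cell equals its sample proportion. The helper functions, the numerical integration and the bisection settings below are illustrative choices, not the algorithm used by any particular package.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_ppf(p):
    """Inverse standard normal CDF by bisection."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if norm_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bvn_cdf(h, k, rho, n=2000):
    """Prob[X <= h, Y <= k] for standard bivariate normal, by Simpson's rule on
    the integral of pdf(x) * cdf((k - rho*x)/sqrt(1 - rho^2)) over x <= h."""
    lower = -8.0  # mass below -8 is negligible
    s = math.sqrt(1.0 - rho * rho)
    width = (h - lower) / n
    total = 0.0
    for i in range(n + 1):
        x = lower + i * width
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * norm_pdf(x) * norm_cdf((k - rho * x) / s)
    return total * width / 3.0

# Joint frequencies from the DOCTOR x HOSPITAL cross tabulation.
n_total = 27326
p_doctor0 = 10135 / n_total      # Prob[DOCTOR = 0]
p_hospital0 = 24931 / n_total    # Prob[HOSPITAL = 0]
p00 = 9715 / n_total             # Prob[DOCTOR = 0, HOSPITAL = 0]

h = norm_ppf(p_doctor0)
k = norm_ppf(p_hospital0)

# The joint probability is increasing in rho, so bisection finds the match.
lo, hi = -0.99, 0.99
for _ in range(60):
    rho = 0.5 * (lo + hi)
    if bvn_cdf(h, k, rho) < p00:
        lo = rho
    else:
        hi = rho
rho = 0.5 * (lo + hi)
print(rho)
```

Once regressors enter either index the model is no longer saturated, and the correlation has to be estimated jointly with the slopes by maximizing the bivariate probit likelihood.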
127Likelihood Function
128Estimation
---------------------------------------------
FIML Estimates of Bivariate Probit Model
Maximum Likelihood Estimates
Dependent variable DOCHOS
Weighting variable None
Number of observation