1
Advanced Econometrics
Dr. Uma Kollamparambil
2
Today's Agenda
  • A quick round-up of basic econometrics
  • Introduction to NLLS, maximum likelihood
    estimation and limited dependent variable
    modelling

3
Regression analysis
  • Theory specifies the functional relationship
  • Measurement of relationship uses regression
    analysis to arrive at values of a and b.
  • Y = a + bX + e
  • Components: dependent and independent variables,
    intercept (a), coefficient (b), error term
  • Regression may be simple or multivariate
    according to the number of independent variables

4
Requirements
  • Model specification: the relationship between
    dependent and independent variables
  • scatter plot
  • specify function that best fits the scatter
  • Sufficient Data for estimation
  • cross sectional
  • time series
  • panel

5
Some Important terminology
  • Least Squares Regression: Y = a + bX + e
  • Estimation
  • point estimate
  • Interval estimates
  • Inference
  • t-statistic
  • R-square or Coefficient of Determination
  • F-statistic

6
Estimation -- OLS
  • Ordinary Least Squares (OLS)
  • We have a set of data points and want to fit a
    line to the data
  • The most efficient estimator can be shown to be OLS,
    which minimizes the squared distance between the
    line and the actual data points.

7
  • How to Estimate a and b in the linear equation?
  • The OLS estimator solves min over (a, b) of
    the sum of squared errors, sum((Yi - a - bXi)^2)
  • This minimization problem can be solved
    using calculus.
  • The result is the OLS estimators of a and b

8
Regression Analysis -- OLS
  • The basic equation: Yi = a + bXi + ei
  • OLS estimator of b:
    bhat = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
  • OLS estimator of a: ahat = Ybar - bhat * Xbar

Here a hat denotes an estimator, and a bar a
sample mean.
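As a hedged sketch of the closed-form OLS estimators above (the data here are made up for illustration, not from the slides):

```python
import numpy as np

# Illustrative data: x is the regressor, y the dependent variable.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])

# Closed-form OLS: slope from deviations, intercept from the means.
x_bar, y_bar = x.mean(), y.mean()
b_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
a_hat = y_bar - b_hat * x_bar
```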
9
Regression Analysis -- Inference
Here, the R-squared is a measure of the goodness
of fit of our model, while the standard error of
bhat gives us a measure of confidence for our
estimate of b. Dividing the estimate by its
standard error gives the t-ratio. Combined with
critical values from a Student-t distribution,
this ratio tells us how confident we are that the
value is significantly different from zero.
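The R-squared and t-ratio just described can be computed by hand; this is a sketch on made-up bivariate data:

```python
import numpy as np

# Illustrative data and the OLS fit.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.0, 6.2, 7.9, 10.1])
n = len(x)
x_bar, y_bar = x.mean(), y.mean()
b_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
a_hat = y_bar - b_hat * x_bar

resid = y - (a_hat + b_hat * x)
rss = np.sum(resid ** 2)               # residual sum of squares
tss = np.sum((y - y_bar) ** 2)         # total sum of squares
r_squared = 1.0 - rss / tss            # goodness of fit

sigma2_hat = rss / (n - 2)             # error variance (2 parameters)
se_b = np.sqrt(sigma2_hat / np.sum((x - x_bar) ** 2))
t_ratio = b_hat / se_b                 # compare to Student-t critical value
```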
10
Analysis of Variance: F ratio
  • The F ratio tests the overall significance of the
    regression.
  • Tests the marginal contribution of a new variable
  • Tests for structural change in the data

11
Multivariate regression
12
Assumptions of OLS regression
  • Model is correctly specified and is linear in
    parameters
  • X values are fixed in repeated sampling and Y
    values are continuous and stochastic
  • Each ui is normally distributed with mean
    E(ui) = 0
  • Equal variance of ui (homoscedasticity)
  • No autocorrelation, i.e. no correlation between
    ui and uj
  • Zero covariance between Xi and ui
  • No multicollinearity, Cov(Xi, Xj) = 0, in
    multivariate regression
  • Under the assumptions of the CNLRM, estimates are BLUE

13
Regression Analysis: Some problems
  • Autocorrelation: covariance between error terms
  • Identification: Durbin-Watson d test, ranging
    from 0 to 4 (near 2 indicates no autocorrelation)
  • R2 is overestimated
  • t and F tests are misleading
  • Remedy: correctly specify any missed variable;
    consider an AR scheme
  • Heteroscedasticity: non-constant variance
  • Detection: scatter plot of error terms, Park
    test, Goldfeld-Quandt test, White test, etc.
  • t and F tests are misleading
  • Remedial measures include transformation of
    variables through WLS
  • Multicollinearity: covariance between various X
    variables
  • Detection: high R2 but insignificant t tests; high
    pair-wise correlation between explanatory
    variables
  • t and F tests are misleading
  • Remedy: remove model over-specification, use pooled
    data, transform variables
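The Durbin-Watson d statistic used above to detect autocorrelation can be computed directly from the OLS residuals; the residual vector here is hypothetical:

```python
import numpy as np

# Hypothetical OLS residuals, ordered in time.
e = np.array([0.5, -0.3, 0.2, -0.4, 0.1, 0.3, -0.2])

# Durbin-Watson d: sum of squared successive differences over
# the sum of squared residuals. d near 2 suggests no first-order
# autocorrelation; near 0 positive, near 4 negative.
d = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```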

14
Introducing non-linear regression
  • When the linear regression model cannot adequately
    represent the relationships between variables,
    the nonlinear regression model approach is
    appropriate, e.g. human age and growth
  • A model non-linear in variables (but linear in
    parameters) is still considered linear regression
  • A model non-linear in parameters is non-linear
    regression if transformation to linear form is
    not possible
  • Intrinsically linear: transformation to linear
    form is possible, e.g. the Cobb-Douglas production
    function
  • Intrinsically non-linear: transformation to linear
    form is NOT possible, e.g. the exponential
    regression model used to measure growth of a
    variable like GDP
15
Estimation issues of NLR models
  • OLS is not feasible because the normal equations,
    obtained by setting the first derivatives of the
    sum of squared errors u to zero, have unknowns on
    both the LHS and RHS.
  • Therefore we use a trial-and-error method,
    substituting values for b1 and b2 to estimate u:
    an iterative process
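The trial-and-error idea can be sketched as a direct grid search: evaluate the sum of squared errors over candidate parameter values and keep the best pair. The model form y = b1 * exp(b2 * x) and the true values (b1 = 2, b2 = 0.3) are made up for illustration:

```python
import numpy as np

# Illustrative data generated from y = 2 * exp(0.3 * x).
x = np.linspace(0.0, 5.0, 20)
y = 2.0 * np.exp(0.3 * x)

# Direct search: try many (b1, b2) pairs, keep the lowest SSE.
best_sse, best_params = np.inf, None
for b1 in np.linspace(1.0, 3.0, 41):
    for b2 in np.linspace(0.0, 0.6, 61):
        sse = np.sum((y - b1 * np.exp(b2 * x)) ** 2)
        if sse < best_sse:
            best_sse, best_params = sse, (b1, b2)
```

With more parameters the grid grows combinatorially, which is exactly the limitation the next slides point out.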

16
Methods of NLLS
  • Direct search or trial and error (derivative-free
    method)
  • Direct optimisation
  • Iterative Linearization method
  • Taylor series expansion
  • Gauss Newton iterative method
  • Newton-Raphson iterative method
  • Stata commands in tutorial 1

17
Limitations of NLLS
  • Identifying the parameter values that give the
    least error through trial and error is
    extremely difficult
  • As the number of parameters in the model
    increases, it becomes very complex to try out
    various values for each parameter.
  • Inference is difficult in the context of
    small samples
  • Therefore, instead of OLS, the maximum likelihood
    estimation method is suited for NLR models

18
Maximum Likelihood estimation
  • An alternative to the least squares loss function
    is to maximize the likelihood of the specific
    dependent variable values occurring in our
    sample, given the respective regression model.
  • MLE requires an assumption regarding the nature
    of distribution of the dependent variable.
  • The maximum likelihood estimate of a parameter is
    that value that maximizes the probability of the
    observed data.

19
MLE contd
  • The PDF gives the probability of the data for given
    values of the parameters
  • The LF gives the likelihood of parameter values for
    the given data
  • The ML estimate of a parameter is that value that
    maximizes the probability of the observed data.
  • Maximising is done through differential calculus
  • MLE estimates parameter values and also provides
    tests of inference
  • If u is normal, then the parameter values
    estimated through OLS and MLE are the same; the
    variance estimates only converge in large
    samples.
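The OLS/MLE equivalence under normal errors can be sketched numerically: the normal log-likelihood of a linear model, evaluated at the OLS estimates, beats any perturbed parameter values. Data are illustrative:

```python
import numpy as np

# Illustrative data for y = a + b*x + u with normal u.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.1, 4.9, 7.2, 8.8])
n = len(x)

def log_likelihood(a, b, sigma2):
    # Normal log-likelihood of the sample given (a, b, sigma2).
    resid = y - (a + b * x)
    return (-0.5 * n * np.log(2 * np.pi * sigma2)
            - np.sum(resid ** 2) / (2 * sigma2))

# OLS closed form; under normality this is also the MLE of (a, b).
x_bar, y_bar = x.mean(), y.mean()
b_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
a_hat = y_bar - b_hat * x_bar
rss = np.sum((y - (a_hat + b_hat * x)) ** 2)
sigma2_mle = rss / n   # note: MLE variance divides by n, OLS by n - 2
```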

20
Assume Y is normally distributed
21
Maximum Likelihood estimation
  • MLE consists in finding estimates for b1 and b2
    that are the most plausible, given the data we
    observe
  • "Most plausible" means the values that maximize
    the joint likelihood of our data
  • The likelihood is determined by the joint
    probability density function
  • MLE involves maximising the likelihood function
    with respect to the parameters through differential
    calculus
  • A unique closed-form solution may not exist
  • In that case, the resulting normal equations are
    used in an iterative process at different parameter
    values to get various likelihood values
  • The parameter values that maximise the likelihood
    of obtaining the observed data are the maximum
    likelihood estimates

22
Qualitative response regression models
  • Models the relationship between a set of variables
    X and a dependent variable which may be
  • Dichotomous/binary (yes/no, fail/pass, or
    exporting/not exporting)
  • Ordinal (an ordered categorical variable like age
    group)
  • Nominal (no ordering, such as race or country)
  • Discrete (real numbers, but not continuous, e.g.
    number of times you visited the dentist, number
    of accidents)
  • These models are not CLRM because they violate the
    assumption of Y being a continuous variable
  • The u in such models will not be normally
    distributed
  • Using OLS makes it the Linear Probability Model

23
Linear Probability Model
  • Income and house ownership relationship
  • E(Yi|Xi) = b1 + b2Xi
  • Yi = 1 if house owned, Yi = 0 otherwise
  • Pi = P(Yi = 1), (1 - Pi) = P(Yi = 0)
  • Yi follows a Bernoulli probability distribution
  • E(Yi) = 1(Pi) + 0(1 - Pi) = Pi
  • Hence E(Yi|Xi) = b1 + b2Xi = Pi
  • Pi must lie between 0 and 1.

24
Problem with LPM
  • ui not normally distributed: inference is
    difficult in small samples
  • Heteroscedasticity: the error term follows a
    Bernoulli distribution, Var(ui) = Pi(1 - Pi)
  • Use WLS with the square root of wi = Pi(1 - Pi)
  • Non-fulfilment of 0 < E(Yi|Xi) < 1
  • Use constraints
  • R-sq is not a reliable measure of goodness of fit
  • Use count R-sq
  • Biggest problem: a constant slope is unrealistic
  • A non-linear model is therefore required
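The mechanical flaw in the LPM can be shown in two lines; the coefficients and income values below are made up, chosen only so the fitted values stray outside [0, 1]:

```python
import numpy as np

# Hypothetical LPM estimates and income values.
a_hat, b_hat = -0.2, 0.05
income = np.array([2.0, 10.0, 30.0])

# Fitted "probabilities" a + b*X: at the extremes they
# fall below 0 and above 1, which no probability can do.
p_fitted = a_hat + b_hat * income
```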

25
Graphically
[Figure: probability of graduation (0 to 1) plotted against ASVABC (ability score)]
26
Non-linear Binary choice models
  • Applicable when you want to predict a qualitative
    variable, e.g. the probability of exporting or the
    probability of default. These models estimate a
    probability given values of X and are thus known
    as probability models
  • Probability of success (export):
    Pi = P(Yi = 1 | Zi)
  • First issue: MLE requires identifying the
    cumulative distribution function of the model

27
The CDF best describes the S-shaped curve of a binary model
[Figure: S-shaped CDF plotted against Z]
28
Logit Model
  • The probability function of the logistic
    distribution is Pi = 1 / (1 + e^(-Zi)), where
    Zi = b1 + b2Xi
  • The above gives the probability of success.
    The probability of failure is given by
    1 - Pi = 1 / (1 + e^(Zi))
  • Thus, as Z ranges from minus to plus infinity, P
    ranges from 0 to 1 and P is non-linearly related
    to Z (i.e. X), satisfying the requirements of a
    binary model.
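A minimal sketch of that property: the logistic CDF maps any Z on the real line into (0, 1), rising monotonically, which is exactly what a binary probability model needs.

```python
import numpy as np

def logistic(z):
    # Logistic CDF: P = 1 / (1 + e^(-z)), always in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
p_success = logistic(z)        # rises from near 0 to near 1
p_failure = 1.0 - p_success    # equivalently 1 / (1 + e^z)
```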

29
Odds ratio
  • The ratio of the probability of success to the
    probability of failure gives the odds ratio in
    favour of success: Pi / (1 - Pi) = e^(Zi)
  • Taking the log of the odds ratio linearises the
    model: Li = ln[Pi / (1 - Pi)] = Zi = b1 + b2Xi
  • The slope coefficient here tells how the log-odds
    in favour of success change as X changes by 1 unit

30
Estimation of logit model
  • MLE estimation
  • Requires large samples
  • Instead of the t statistic, the z statistic is
    used to test individual significance of a
    coefficient (Wald z test)
  • R2 is not meaningful; instead count R2 is used
  • Count R2 = number of correct predictions / total
    number of observations
  • To test the null hypothesis that all slope
    coefficients are simultaneously equal to zero,
    instead of F we use the likelihood ratio (LR)
    statistic, which follows a chi-squared distribution
  • LR = 2(log-likelihood of unrestricted model -
    log-likelihood of restricted model)
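The LR statistic and count R2 are simple arithmetic on the fitted model's output. The log-likelihoods, actual outcomes, and predictions below are made-up stand-ins:

```python
import numpy as np

# Hypothetical log-likelihoods: unrestricted vs restricted fit.
logL_ur, logL_r = -45.2, -60.8
lr_stat = 2.0 * (logL_ur - logL_r)   # compare to a chi-squared critical value

# Hypothetical actual outcomes and model predictions.
y     = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_hat = np.array([1, 0, 1, 0, 0, 1, 1, 0])
count_r2 = np.mean(y == y_hat)       # correct predictions / total observations
```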

31
Logit Estimation
  • Individual data (logit): MLE
  • Grouped data (glogit): observed probability is
    available for each value of X
  • OLS may be used, with weights to address
    heteroscedasticity (WLS)
  • 1. Find the observed probability Pi = ni/Ni and
    obtain the logit for each X as
    Li = ln[Pi / (1 - Pi)]
  • 2. Estimate with OLS
  • Note that estimation of the transformed data is
    undertaken without a constant
  • Use OLS inferences

32
Interpretation
  • Take the antilog of the slope coefficient,
    subtract 1 from it, and multiply the result by 100
    to get the percent change in odds for a 1-unit
    change in X.
  • The probability of success at given levels of X
    can be estimated by plugging in values of X,
    estimating Z, and then using the formula
    Pi = 1 / (1 + e^(-Zi))
  • The rate of change of the probability of success
    with a unit change in X is given by
    dPi/dXi = b2 * Pi(1 - Pi)
  • The change is not linear but depends on the level
    of probability

33
e.g. income and house ownership
  • Data on income (X); Ni = total families with
    income Xi; ni = families owning houses at income Xi
  • Pi = ni/Ni; Li = ln[Pi / (1 - Pi)]; L* = weighted L
  • For a 1-unit increase in income, the odds of
    owning a house increase by 8.1%
  • How do we estimate the probability when X = 20?

34
Probability calculation
  • Plugging X = 20 into the estimated equation, we
    obtain L* = -0.09311; dividing by the weight
    (4.1825) we get L = -0.02226
  • The odds of owning a house increase by a factor of
    about 1.02 when income increases by 1 unit from 20
  • The probability of owning a house at income = 20
    is 49%

35
Rate of change of probability
  • The rate of change is not constant and depends on
    the level of income from which you are calculating
  • dP/dX = 0.07862 x (0.5056) x (0.4944) = 0.01965
  • The rate of change of the probability when income
    increases by one unit from the level of 20 units
    is about 1.9 percentage points
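The arithmetic on the last two slides can be re-checked directly, using the slide's own numbers (L* = -0.09311, weight 4.1825, slope 0.07862):

```python
import numpy as np

# Unweighted logit at X = 20, as on the slide.
L = -0.09311 / 4.1825                 # approx -0.02226

# Probability from the logistic formula P = 1 / (1 + e^(-L)).
P = 1.0 / (1.0 + np.exp(-L))          # approx 0.494, i.e. about 49%

# Marginal effect dP/dX = b * P * (1 - P) at this income level.
marginal = 0.07862 * P * (1.0 - P)    # approx 0.0197, about 1.9 points
```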