Linear Regression Models - PowerPoint PPT Presentation

Slides: 28
Provided by: Cos147

Transcript and Presenter's Notes


1
Linear Regression Models
  • Powerful modeling technique
  • Quantify relationships between a dependent
    variable and 1 or more independent variables
  • Models not perfect... need an error term
  • Measurement errors, wrong model, omitted
    variables, inherent randomness
  • Linear models often misused.

2
Example: Lake Water Quality
  • Chlorophyll-a (C): widely used indicator
    of eutrophication
  • Nitrogen (N): associated with eutrophication
  • Q: Golf course development. Nitrogen expected to
    increase. By how much will C increase/decrease?
  • How should we proceed?

3
Plot C vs. N

4
A Better Model
  • Explain (single) regression line (model?).
  • Negative relationship suggests a problem.
  • Omitted variable: Phosphorus (P)
  • Want to tease out effects of N and P separately.
  • Write a Multiple Linear Regression Model
  • Model designed to tease out effect of N and
    effect of P, separately, on C.
  • Define and interpret variables, parameters.
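
A multiple linear regression model of the kind this slide asks for might be written as below. The symbols b_0, b_1, b_2 and the error term are assumptions, since the slide's own equation did not survive; attaching b_2 to nitrogen is chosen only so that it lines up with the roughly 1.2 unit effect of N quoted on the Estimation slide.

```latex
C_i = b_0 + b_1 P_i + b_2 N_i + \varepsilon_i
```

Under this reading, b_2 is the expected change in C for a one-unit change in N with P held fixed, and b_1 plays the same role for P.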

5
Estimation
  • Use data to estimate parameter values that give
    best fit: b0 = -9.4, b1 = 0.3, b2 = 1.2
  • Answer: A one-unit increase in N results in
    about a 1.2 unit increase in C.
  • Importance: Omitting phosphorus from the model
    introduced significant bias!
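
The omitted-variable effect described above can be reproduced with a small sketch. The numbers are made up for illustration (they are not the lake data), and the helper names `solve` and `ols` are hypothetical; the fit uses the normal equations with only the Python standard library.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(columns, y):
    """Return [intercept, slope_1, ...] minimizing the sum of squared residuals."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * yi for r, yi in zip(rows, y)) for p in range(k)]
    return solve(xtx, xty)


# Made-up data: N and P are negatively correlated, and C rises with both.
N = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
P = [12.0, 10.0, 9.0, 7.0, 4.0, 3.0]
C = [2.0 + 1.0 * n + 1.5 * p for n, p in zip(N, P)]  # assumed true effects: 1.0 and 1.5

simple = ols([N], C)       # P omitted
multiple = ols([N, P], C)  # P included

print("slope on N, P omitted: ", round(simple[1], 3))
print("slope on N, P included:", round(multiple[1], 3))
```

With P left out, the slope on N comes out negative even though the assumed effect of N is positive; including P recovers the assumed coefficients. This mirrors the negative C vs. N relationship that the earlier slide flagged as a problem.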

6
Question: US Gas Consumption
  • Gasoline consumption produces many negative
    byproducts.
  • Policy may be directed at increasing the price of
    gas to reduce consumption.
  • But what is the effect of a price change?
  • Question: What is the price elasticity of demand
    for gasoline in the U.S.?

7
Some Gasoline Data

8
Gas Data, Cont'd
  • Gas consumption increases through time. But no
    info here about price.
  • Next plot shows (+) relationship between gas
    price and gas consumption.
  • Note: opposite of demand curve.
  • Something is wrong here.
  • Just as in the eutrophication problem, may have
    omitted important variables.
  • May have other problems, too.

9
The OLS Estimator
  • Estimator: A rule or strategy for using data to
    estimate an unknown parameter. Defined before
    the data are drawn.
  • Ordinary Least Squares (OLS) estimator: finds the
    parameter values that minimize the sum of squared
    deviations (see C vs. N plot)
  • Several assumptions must hold for the OLS
    estimator to apply to a model
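
For a single regressor, the least-squares criterion and its well-known closed-form solution can be written as follows. The notation (y_i, x_i, b_0, b_1) is assumed here rather than taken from the slides.

```latex
\begin{align}
  \min_{b_0,\, b_1} \; S(b_0, b_1) &= \sum_{i=1}^{n} \left( y_i - b_0 - b_1 x_i \right)^2 \\
  \hat{b}_1 &= \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
  \qquad
  \hat{b}_0 = \bar{y} - \hat{b}_1 \bar{x}
\end{align}
```

The rule is fixed before any data are seen; the data only supply the x_i and y_i that go into it.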

10
Linear Model
  • The model must be linear
  • Linear in parameters, not in variables.
  • Difference between parameter, variable.
  • Examples
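
The slide's own examples were not captured in this transcript. As stand-ins, the first two equations below are linear in the parameters even though they are nonlinear in x, while the last two are not linear in the parameters. These are illustrative choices, not the presenter's originals.

```latex
\begin{align}
  y &= b_0 + b_1 x + b_2 x^2 + \varepsilon   && \text{linear in } b_0, b_1, b_2 \\
  y &= b_0 + b_1 \ln x + \varepsilon         && \text{linear in } b_0, b_1 \\
  y &= b_0 + b_1 x^{b_2} + \varepsilon       && \text{not linear in } b_2 \\
  y &= b_0 e^{b_1 x} + \varepsilon           && \text{not linear in } b_1
\end{align}
```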

11
Transforming Models
  • Previous Ricker model is non-linear (in the
    parameters).
  • Sometimes can transform the model so it is linear.
  • When plotted, graph is nonlinear.
  • Take log of both sides, giving
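
The transformed equation is missing from the transcript. A sketch of the step, assuming the common stock-recruitment form of the Ricker model with a multiplicative error (the symbols R, S, a, b are assumptions, not necessarily the presenter's notation):

```latex
\begin{align}
  R &= a\, S\, e^{-bS}\, e^{\varepsilon} \\
  \ln R &= \ln a + \ln S - bS + \varepsilon \\
  \ln\!\left(\frac{R}{S}\right) &= \ln a - bS + \varepsilon
\end{align}
```

In the last line the unknowns \ln a and b enter linearly, so the transformed model fits the linear framework with \ln(R/S) as the dependent variable and S as the regressor.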

12
What's a Residual?
  • General form of linear model
  • Graphically on board.
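
The general form referred to above is not in the transcript. One standard way to write it, with assumed notation for k regressors, together with the fitted value and the residual:

```latex
\begin{align}
  y_i &= b_0 + b_1 x_{i1} + b_2 x_{i2} + \dots + b_k x_{ik} + \varepsilon_i \\
  \hat{y}_i &= \hat{b}_0 + \hat{b}_1 x_{i1} + \dots + \hat{b}_k x_{ik} \\
  e_i &= y_i - \hat{y}_i
\end{align}
```

The residual e_i is the observed counterpart of the unobserved disturbance \varepsilon_i: the vertical distance between a data point and the fitted line or surface.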

13
Residual Plots
  • Residuals vs. Fit
  • Normal Quantile Plot
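
The coordinates behind these two plots can be computed as in the sketch below, which uses only the Python standard library. The observed and fitted values are made up, the function names are hypothetical, and the plotting position (i + 0.5) / n is one common convention among several.

```python
from statistics import NormalDist


def residuals_vs_fit(y, fitted):
    """Return (fitted value, residual) pairs for a residuals-vs-fit plot."""
    return [(f, yi - f) for yi, f in zip(y, fitted)]


def normal_quantile_pairs(residuals):
    """Pair each sorted residual with the standard normal quantile for its rank."""
    ordered = sorted(residuals)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # (i + 0.5) / n keeps every probability strictly between 0 and 1.
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Hypothetical observed values and fitted values from some linear model.
y = [3.1, 4.9, 7.2, 8.8, 11.1]
fitted = [3.0, 5.0, 7.0, 9.0, 11.0]

fit_pairs = residuals_vs_fit(y, fitted)
qq_pairs = normal_quantile_pairs([resid for _, resid in fit_pairs])

for quantile, resid in qq_pairs:
    print(f"normal quantile {quantile:+.3f}   residual {resid:+.3f}")
```

Plotting `fit_pairs` should show no pattern if the model is adequate; plotting `qq_pairs` should fall close to a straight line if the residuals are roughly normal.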

14
CLRM: Assumption 1
  • Dependent variable (Y) is a function of a specific
    set of independent variables (X's).
  • Linear in parameters
  • Additive error
  • Coefficients are constant but unknown
  • Violations called specification errors, e.g.:
  • Wrong regressors (a.k.a. independent variables, X's)
  • Nonlinearity
  • Changing parameters (e.g. through time)

15
CLRM: Assumption 2
  • Disturbances (εi's) are independently and
    identically distributed (0, σ²)
  • Typically we assume εi ~ N(0, σ²)
  • Mean 0
  • Constant variance, σ² (but unknown)
  • Errors uncorrelated with one another
  • Examples of violations:
  • Measurement bias (seep gas flux)
  • Heteroskedasticity (variance differs)
  • Autocorrelated errors (disturbances correlated)
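
Written out, the three parts of this assumption and the violation attached to each look roughly as follows. The notation is assumed.

```latex
\begin{align}
  \mathrm{E}[\varepsilon_i] &= 0
    && \text{violated by measurement bias} \\
  \mathrm{Var}(\varepsilon_i) &= \sigma^2 \ \text{for all } i
    && \text{violated by heteroskedasticity} \\
  \mathrm{Cov}(\varepsilon_i, \varepsilon_j) &= 0 \ \text{for } i \neq j
    && \text{violated by autocorrelated errors}
\end{align}
```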

16
CLRM: Assumption 3
  • It is possible to repeat the sample with the same
    independent variables.
  • If we had the same levels of explanatory variables,
    would it be possible to generate the same value of Y?
  • Common violations:
  • Errors in variables: measurement error in X.
  • Autoregression: when a lagged dependent variable
    should be an independent variable
  • Simultaneous equations: several relationships
    act jointly.

17
Properties of Estimators
  • Estimators have many properties.
  • 6 is an estimator, but not a very good one.
  • Two main properties we care about
  • Unbiased: The expected distance of the estimator from
    the thing it is estimating is 0.
  • Efficient: Small variance (spread)
  • 6 is biased, but has a very small variance
    (zero).
  • OLS estimator is unbiased and has minimum
    variance of all linear unbiased estimators.
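
In symbols, with \theta standing for the unknown parameter (an assumed notation), the two properties and the constant "estimator" 6 can be compared like this:

```latex
\begin{align}
  \mathrm{Bias}(\hat{\theta}) &= \mathrm{E}[\hat{\theta}] - \theta,
  \qquad \hat{\theta} \text{ is unbiased if } \mathrm{E}[\hat{\theta}] = \theta \\
  \hat{\theta} = 6: \quad
  \mathrm{E}[\hat{\theta}] &= 6, \quad
  \mathrm{Var}(\hat{\theta}) = 0, \quad
  \mathrm{Bias}(\hat{\theta}) = 6 - \theta
\end{align}
```

The constant has the smallest possible variance but is biased unless \theta happens to equal 6, which is why low variance alone does not make a good estimator.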

18
Correlation vs. Causation
  • Now we know just enough to be dangerous!
  • Can estimate how any set of variables affects
    some other variable. Very powerful.
  • Problem: correlation doesn't imply causation!
    Why data mining is bad.
  • Chicken consumption, Global CO2.
  • May be spurious (no underlying relationship)
  • Difficult to tease out statistically.
  • Granger Causality
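
A spurious relationship of the chicken-and-CO2 kind is easy to manufacture. In the sketch below both series are invented and share nothing but an upward trend; the variable names only echo the slide's example and the numbers are not real data.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


years = list(range(10))
# Invented series: each is a time trend plus its own unrelated wiggle.
chicken = [20.0 + 1.5 * t + (0.4 if t % 2 else -0.4) for t in years]
co2 = [340.0 + 1.8 * t + (0.5 if t % 3 == 0 else -0.25) for t in years]

r_levels = correlation(chicken, co2)

# Year-to-year changes strip out the shared trend.
d_chicken = [b - a for a, b in zip(chicken, chicken[1:])]
d_co2 = [b - a for a, b in zip(co2, co2[1:])]
r_changes = correlation(d_chicken, d_co2)

print(f"correlation of levels:  {r_levels:.3f}")
print(f"correlation of changes: {r_changes:.3f}")
```

The levels are almost perfectly correlated because both grow over time, while the year-to-year changes are essentially uncorrelated. A regression of one level on the other would look impressive and mean nothing.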

19
Violations Consequences

20
Guide to Model Specification
  • Start with theory to generate model
  • Check assumptions of CLRM
  • Collect and plot data
  • Estimate model, test restrictions
  • Possibly perform Box-Cox transform
  • Check R² and adjusted R²
  • Plot residuals: look for patterns
  • Seek explanations for patterns

21
Back to Gasoline Consumption
  • Recall, interested in how gas consumption is
    affected by a price increase (say $0.10/gal.)
  • Variables:
  • Gas consumption per capita (G)
  • Gas price (Pg)
  • Income (Y)

22
2 Alternative Specifications
  • Linear specification
  • Log-log specification (often used with economic
    data)
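
The two equations are missing from the transcript. Using the variables G, Pg, and Y defined on the previous slide, they would plausibly take the forms below; the coefficient names are assumptions.

```latex
\begin{align}
  \text{Linear:} \quad & G = b_0 + b_1 P_g + b_2 Y + \varepsilon \\
  \text{Log-log:} \quad & \ln G = a_0 + a_1 \ln P_g + a_2 \ln Y + u
\end{align}
```

In the log-log form, a_1 = \partial \ln G / \partial \ln P_g is itself the price elasticity of demand, which is one reason the form is popular for economic data. In the linear form the elasticity is b_1 P_g / G and changes along the demand curve.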

23
Results of Linear Model
  Call: lm(formula = Consumption ~ Price + Income,
      data = GasMarket, na.action = na.exclude)
  Residuals:
       Min   1Q   Median     3Q    Max
    -35.85  -28  -0.5207  25.67  38.22
  Coefficients:
                    Value  Std. Error  t value  Pr(>|t|)
    (Intercept)  145.2968     25.9323   5.6029    0.0000
    Price        -85.2778     29.8378  -2.8581    0.0073
    Income         0.0191      0.0027   7.2224    0.0000
  Residual standard error: 26.53 on 33 degrees of freedom
  Multiple R-Squared: 0.7753
  F-statistic: 56.94 on 2 and 33 degrees of freedom,
      the p-value is 1.999e-011
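
Several numbers in this output can be cross-checked from the others. The sketch below recomputes the t values, the F-statistic, and the adjusted R² (which the output does not report) from the printed estimates, standard errors, R², and degrees of freedom. It is a consistency check on the summary, not a refit; the underlying data are not available here.

```python
# Values copied from the regression output: (estimate, standard error).
coefficients = {
    "(Intercept)": (145.2968, 25.9323),
    "Price": (-85.2778, 29.8378),
    "Income": (0.0191, 0.0027),
}
r_squared = 0.7753
df_residual = 33              # residual degrees of freedom
k = 2                         # regressors besides the intercept
n = df_residual + k + 1       # implied number of observations

# t value = estimate / standard error
t_values = {name: est / se for name, (est, se) in coefficients.items()}

# F-statistic for the joint test that all k slopes are zero
f_stat = (r_squared / k) / ((1.0 - r_squared) / df_residual)

# Adjusted R-squared
adj_r_squared = 1.0 - (1.0 - r_squared) * (n - 1) / df_residual

for name, t in t_values.items():
    print(f"{name:12s} t = {t:.4f}")
print(f"n = {n}, F = {f_stat:.2f}, adjusted R-squared = {adj_r_squared:.4f}")
```

The recomputed t values for the intercept and Price agree with the output to the printed precision. The Income ratio comes out near 7.07 rather than 7.2224 because the printed estimate and standard error are rounded to four decimals.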

24
Answer to Question
  • A 1 unit increase in price leads to an 85 unit
    decrease in gas consumption.
  • Units are G (gallons), Pg ($).
  • So, a $0.10 increase in gas price leads to, on
    average, an 8.5 gallon decrease in gas
    consumption... not much!
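
The arithmetic behind this answer is just the price coefficient times the price change, with income held fixed. A minimal sketch, assuming price is measured in dollars per gallon (the unit symbol is missing from the transcript):

```python
price_coefficient = -85.2778  # from the linear-model output (gallons per unit of price)
price_increase = 0.10         # assumed to be dollars per gallon

# In a linear specification the predicted change is coefficient * change in regressor.
change_in_consumption = price_coefficient * price_increase

print(f"{change_in_consumption:.1f} gallons per capita")  # prints "-8.5 gallons per capita"
```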

25
Residuals vs. fitted

26
Residuals QQ

27
Actual vs. fitted