Introduction to Bivariate Regression - PowerPoint PPT Presentation

Provided by: homeUc

1
Introduction to Bivariate Regression

2
Agenda
  • What is a regression model?
  • Derivation of the OLS estimators
  • Evaluation of Model Fit

3
What is a regression?
  • A regression is a statistical method for studying
    the relationship between a single dependent
    variable and one or more independent variables.
  • In its simplest form a regression specifies a
    linear relationship between the dependent and
    independent variables.
  • $Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + e_i$
  • for a given set of observations
  • In the social sciences, a regression is generally
    used to represent a causal process.
  • Y represents the dependent variable
  • b0 is the intercept (it represents the predicted
    value of Y if X1 and X2 equal zero).
  • X1 and X2 are the independent variables (also
    called predictors or regressors)
  • b1 and b2 are called the regression coefficients
    and provide a measure of the effect of the
    independent variables on Y (they measure the
    slope of the line)
  • e is the error term: the part of Y not explained by the causal model.
  • What might be an example of a regression?
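
As a minimal sketch of how the pieces of the equation on this slide fit together, the Python below computes a predicted value and the error for one observation. The coefficient values and the observation are hypothetical, made up purely for illustration; they are not estimates from any data.

```python
def predict(b0, b1, b2, x1, x2):
    """Systematic part of the model: b0 + b1*X1 + b2*X2."""
    return b0 + b1 * x1 + b2 * x2


# Hypothetical coefficients and one hypothetical observation (assumptions).
b0, b1, b2 = 1.0, 2.0, -0.5
x1, x2, y = 3.0, 4.0, 6.0

y_hat = predict(b0, b1, b2, x1, x2)  # predicted Y for this observation
e = y - y_hat                        # the part of Y the model does not explain

# With X1 = X2 = 0 the prediction collapses to the intercept b0.
intercept_check = predict(b0, b1, b2, 0.0, 0.0)

print(y_hat, e, intercept_check)
```

Here the error e is simply whatever is left of Y after subtracting the prediction, which is how the e term in the equation is defined.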

4
Interpreting the regression line.
  • Example: $\text{Income}_i = b_0 + b_1 \, \text{Years in School}_i + e_i$
  • b0: The intercept is simply the expected value of
    Y that would hold if all of the independent
    variables equaled zero.
  • So, b0 corresponds to the income of someone who
    had zero years in school
  • b1: The regression coefficient provides a measure
    of the slope of the regression line. The most
    useful way to interpret it is as the effect of a
    one unit change in the independent variable.
  • That is, b1 can be interpreted as the effect of
    a one unit change in the independent variable,
    such as the effect of going from 12 years to 13
    years in school.
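
The interpretation above can be checked with a short Python sketch. The values of b0 and b1 are hypothetical numbers chosen only to make the arithmetic easy to follow; they are not estimates of any real income data.

```python
# Hypothetical coefficients (assumptions for illustration only).
b0 = 10000.0  # predicted income for someone with zero years in school
b1 = 2500.0   # change in predicted income per additional year in school


def predicted_income(years_in_school):
    """Regression line: Income = b0 + b1 * Years in School."""
    return b0 + b1 * years_in_school


at_zero = predicted_income(0)                                # equals b0
one_more_year = predicted_income(13) - predicted_income(12)  # equals b1

print(at_zero, one_more_year)
```

Because the line is straight, the same difference b1 would appear for any one-year step, not just from 12 to 13 years.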

5
Why use regression?
  • Regression is used as a way of testing hypotheses
    about causal relationships.
  • Specifically, we have hypotheses about whether
    the independent variables have a positive or a
    negative effect on the dependent variable.
  • From our earlier example, what are our hypotheses
    about the factors that influence Y?
  • Just like in our hypothesis tests about variable
    means, we also would like to be able to judge how
    confident we are in our inferences (we will get
    back to this later).

6
How would we estimate a regression?
  • To begin, let's consider the simplest case where
    we have just one independent variable.
  • $Y_i = b_0 + b_1 X_{1i} + e_i$
  • How would you estimate the intercept and the
    regression coefficient?
  • Draw a scatterplot on the board and stop using
    PowerPoint.

7
A regression minimizes the error in the model's fit
  • The most reasonable way to estimate a regression
    is to try to minimize the error in the line
    created by your regression estimates.
  • Note that you cannot just minimize the sum of the
    errors, because positive and negative errors
    cancel (for any line through the point of means,
    the sum is exactly zero).
  • Instead you would either minimize the mean
    squared error or the mean absolute error.
  • We try to minimize the mean squared error because
    we can use calculus to get the answer (and
    because the solution we get when we try to
    minimize mean squared error has desirable
    properties).
  • Why might we prefer to minimize mean absolute
    error?
  • Because our regression line is more resistant to
    outliers.
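
A small Python sketch of the points on this slide, using made-up data (an assumption, not real observations). It shows that the raw errors of a badly fitting line can still sum to zero, and that a single outlier inflates the mean squared error far more than the mean absolute error.

```python
from statistics import mean

# Hypothetical data lying exactly on y = 2x (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]


def errors(b0, b1, xs, ys):
    """Errors e_i = Y_i - b0 - b1 * X_i for a candidate line."""
    return [yi - (b0 + b1 * xi) for xi, yi in zip(xs, ys)]


def mse(errs):
    """Mean squared error."""
    return mean(e * e for e in errs)


def mae(errs):
    """Mean absolute error."""
    return mean(abs(e) for e in errs)


# A flat line at Mean(Y) fits poorly, yet its errors cancel to zero.
flat = errors(mean(y), 0.0, x, y)
print(sum(flat), mse(flat), mae(flat))

# Replace the last Y with an outlier and score the line y = 2x again.
y_outlier = [2.0, 4.0, 6.0, 8.0, 100.0]
out = errors(0.0, 2.0, x, y_outlier)
print(mse(out), mae(out))
```

The outlier's error of 90 enters the squared criterion as 8100 but the absolute criterion as only 90. That is the intuition for why a line chosen by mean absolute error is pulled less toward outliers.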

8
Setting up the minimization problem
  • What we want to do is to minimize the mean
    squared error in the equation
  • $Y_i = b_0 + b_1 X_{1i} + e_i$ for observations $i = 1, \ldots, n$
  • If we rearrange terms
  • $e_i = Y_i - b_0 - b_1 X_{1i}$ and $e_i^2 = (Y_i - b_0 - b_1 X_{1i})^2$
  • So, to minimize the mean squared error, we need
    to minimize the following equation
  • $\sum_i e_i^2 = \sum_i (Y_i - b_0 - b_1 X_{1i})^2$
  • With respect to the regression coefficients b0
    and b1.
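
To make the minimization problem concrete, the sketch below writes the sum of squared errors as a Python function of b0 and b1 and searches a coarse grid of candidate values. The data are hypothetical, and the grid search is only a brute-force stand-in for illustration; it is not the calculus-based solution.

```python
# Hypothetical data (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def sse(b0, b1):
    """Sum of squared errors: sum_i (Y_i - b0 - b1 * X_i)^2."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Candidate coefficients on a grid with step 0.1:
# b0 from 0.0 to 4.0 and b1 from 0.0 to 2.0.
candidates = [(i / 10, j / 10) for i in range(0, 41) for j in range(0, 21)]

# Keep the pair (b0, b1) with the smallest sum of squared errors.
best = min(candidates, key=lambda c: sse(c[0], c[1]))

print(sse(0.0, 1.0))      # one arbitrary candidate line
print(best, sse(*best))   # best line found on the grid
```

A grid can only be as precise as its step size, which is one reason to prefer an exact solution over a search.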

9
Useful Properties of Summations
  • Rule 1. $\sum_i k X_i = k \sum_i X_i$
  • Rule 2. $\sum_i (X_i + Y_i) = \sum_i X_i + \sum_i Y_i$
  • Rule 3. $\sum_i k = k n$
  • Rule 4. $\sum_i X_i = n \, \mathrm{Mean}(X)$
  • Rule 5. $\sum_i (X_i - \mathrm{Mean}(X)) = 0$
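
The five rules can be checked numerically. The Python below is a quick sanity check on made-up numbers (the lists X and Y and the constant k are assumptions), not a proof.

```python
from math import isclose

# Hypothetical values (made up for illustration).
X = [1.0, 2.0, 4.0, 7.0]
Y = [3.0, 0.5, 2.0, 6.5]
k = 3.0
n = len(X)
mean_x = sum(X) / n

# Rule 1: a constant factor can be pulled out of a sum.
rule1 = isclose(sum(k * xi for xi in X), k * sum(X))
# Rule 2: the sum of a sum is the sum of the sums.
rule2 = isclose(sum(xi + yi for xi, yi in zip(X, Y)), sum(X) + sum(Y))
# Rule 3: summing a constant n times gives k * n.
rule3 = isclose(sum(k for _ in range(n)), k * n)
# Rule 4: the sum of X equals n times its mean.
rule4 = isclose(sum(X), n * mean_x)
# Rule 5: deviations from the mean sum to zero.
rule5 = isclose(sum(xi - mean_x for xi in X), 0.0, abs_tol=1e-12)

print(rule1, rule2, rule3, rule4, rule5)
```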

10
Solving the minimization problem
  • To minimize the expression, take the derivative
    with respect to b0 and then with respect to b1,
    set both expressions equal to zero, and solve.
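  • $\sum_i e_i^2 = \sum_i (Y_i - b_0 - b_1 X_{1i})^2$
  • $\partial \sum_i e_i^2 / \partial b_0 = 2 \sum_i (Y_i - b_0 - b_1 X_{1i})(-1) = 0$
  • $\sum_i Y_i - b_0 \sum_i 1 - b_1 \sum_i X_{1i} = 0$
  • $n \, \mathrm{Mean}(Y) - n b_0 - n b_1 \, \mathrm{Mean}(X) = 0$
  • $b_0 = \mathrm{Mean}(Y) - b_1 \, \mathrm{Mean}(X)$
  • $\partial \sum_i e_i^2 / \partial b_1 = 2 \sum_i (Y_i - b_0 - b_1 X_{1i})(-X_{1i}) = 0$
  • $\sum_i (Y_i - b_0 - b_1 X_{1i})(-X_{1i}) = 0$
  • Substitute for b0 and rearrange terms
  • $b_1 \sum_i X_{1i}^2 = \sum_i Y_i X_{1i} - [\mathrm{Mean}(Y) - b_1 \, \mathrm{Mean}(X)] \sum_i X_{1i}$
  • $= \sum_i Y_i X_{1i} - [\mathrm{Mean}(Y) - b_1 \, \mathrm{Mean}(X)] \, n \, \mathrm{Mean}(X)$
  • $= \sum_i Y_i X_{1i} - n \, \mathrm{Mean}(X) \, \mathrm{Mean}(Y) + b_1 n \, \mathrm{Mean}(X)^2$
  • $b_1 [\sum_i X_{1i}^2 - n \, \mathrm{Mean}(X)^2] = \sum_i Y_i X_{1i} - n \, \mathrm{Mean}(X) \, \mathrm{Mean}(Y)$
  • $b_1 = [\sum_i Y_i X_{1i} - n \, \mathrm{Mean}(X) \, \mathrm{Mean}(Y)] / [\sum_i X_{1i}^2 - n \, \mathrm{Mean}(X)^2]$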
  • If Mean(X) = Mean(Y) = 0, then we get $b_0 = 0$ and $b_1 = \sum_i Y_i X_{1i} / \sum_i X_{1i}^2$
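
The closed-form estimators derived on this slide can be sketched in a few lines of Python. The function name `ols_bivariate` and the data are hypothetical, introduced only for illustration. The final lines check that the two first-order conditions (both derivatives equal to zero) hold at the computed values.

```python
from math import isclose


def ols_bivariate(xs, ys):
    """Return (b0, b1) from the closed-form solution on this slide."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b1 = [sum(Y*X) - n*Mean(X)*Mean(Y)] / [sum(X^2) - n*Mean(X)^2]
    numerator = sum(xi * yi for xi, yi in zip(xs, ys)) - n * mean_x * mean_y
    denominator = sum(xi * xi for xi in xs) - n * mean_x ** 2
    b1 = numerator / denominator
    # b0 = Mean(Y) - b1 * Mean(X)
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical data (made up for illustration).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

b0, b1 = ols_bivariate(x, y)
residuals = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

# First-order conditions: sum of e_i and sum of e_i * X_i are both zero.
foc_b0 = isclose(sum(residuals), 0.0, abs_tol=1e-9)
foc_b1 = isclose(sum(e * xi for e, xi in zip(residuals, x)), 0.0, abs_tol=1e-9)

print(b0, b1, foc_b0, foc_b1)
```

Note that the denominator is zero when every X value is identical, so this sketch assumes X varies across observations.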

11
Example
  • Let's run through an example in Excel.
  • Come up with something that uses real data
  • (Crime and Temperature)