1
Linear Regression with One Predictor Variable
  • Ayona Chatterjee
  • Spring 2008
  • Math 4803/5803

2
Introduction
  • Statistical methodology that utilizes the
    relationship between two quantitative variables.
  • Use explanatory variables (independent, X) to
    predict the outcome/response (dependent, Y).
  • Introduced by Sir Francis Galton while studying
    heights of offspring and parents.

3
Examples
  • Sales of a product can be predicted by utilizing
    the relationship between sales and amount of
    advertising expenditures.
  • A student's performance on a test can be
    predicted using the student's IQ and time spent
    studying.
  • Length of a hospital stay of a surgical patient
    can be predicted by using the relationship
    between the time in the hospital and the severity
    of the operation.

4
Types of Relations between Variables
  • Functional Relationship
  • A functional relationship between two variables
    is expressed by a mathematical formula. If X
    denotes the independent variable and Y denotes
    the dependent variable, a functional relation is
    of the form Y = f(X).
  • Statistical Relationship
  • Unlike a functional relation, the observations do
    not fall exactly on a straight line (or curve);
    there is scope for some error (see the sketch
    after this slide).
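
Not part of the original slides: a minimal Python sketch, under illustrative assumptions (the line 2X + 1 and the noise level are made up), contrasting an exact functional relation with a statistical relation whose points scatter around the line.

```python
import random

random.seed(1)

xs = [1, 2, 3, 4, 5]

# Functional relation: Y is determined exactly by X, here f(X) = 2X + 1.
functional_y = [2 * x + 1 for x in xs]

# Statistical relation: Y follows the same trend but with random error,
# so the points scatter around the line instead of lying exactly on it.
statistical_y = [2 * x + 1 + random.gauss(0, 1.5) for x in xs]

for x, fy, sy in zip(xs, functional_y, statistical_y):
    print(f"X={x}  functional Y={fy}  statistical Y={sy:.2f}")
```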

5
Example of Statistical Relation
  • The performances of 10 students were obtained at
    mid-semester and at the end of the semester for a
    statistics exam. The data are plotted in the next
    slide. The end-of-semester grades are taken as the
    dependent variable (Y) and the midterm grades as
    the explanatory variable (X).

6
[Scatter plot: end-of-semester grades (Y) plotted against midterm grades (X) for the 10 students.]
7
Basic Concept
  • A tendency of the response variable Y to vary
    with the predictor variable X in a systematic
    fashion.
  • There is a probability distribution of Y for each
    level of X.
  • A scattering of points around the curve of
    statistical relationship.
  • The means of these distributions vary in some
    systematic fashion with X.

8
Note
  • A regression model can be linear or curvilinear.
  • A regression model can have more than one
    predictor variable.
  • We will look at multiple regression later on.

9
Construction of Regression Models
  • Selection of Predictor Variables.
  • Construct models with a limited number of
    explanatory variables to keep the model practical.
  • Choose variables that help in reducing variation
    in Y.
  • Functional Form of Regression Relation.
  • Depends on the explanatory variable.
  • May be available from existing literature.
  • Or else it has to be decided empirically once the
    data are collected.

10
Construction of Regression Models
  • Scope of Model.
  • The regression equation is only valid in the
    range of data used to obtain it.
  • Uses of Regression Analysis
  • Description
  • Control
  • Prediction

11
Regression and Causality
  • No cause-and-effect pattern is necessarily
    implied by the regression model.
  • Regression analysis by itself provides no
    information about causal patterns and must be
    supplemented by additional analyses.
  • Example: Data on the size of vocabulary (X) and
    writing speed (Y) for a sample of children aged
    5-10 will show a positive regression relation.
    This does not imply that an increase in
    vocabulary causes faster writing speed.

12
Simple Linear Regression Model
  • With only one predictor, the model is as follows:
  • Yi = β0 + β1 Xi + εi
  • Where
  • Yi is the value of the response variable in the
    ith trial.
  • β0 and β1 are parameters.
  • Xi is a known constant, the value of the predictor
    variable in the ith trial.
  • εi is the random error term with mean 0, variance
    σ², and zero covariance (Cov(εi, εj) = 0 for i ≠ j).
  • i = 1, …, n.
  • A small simulation sketch of this model follows
    below.
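
A minimal sketch, not from the slides, of generating data from the model Yi = β0 + β1 Xi + εi; the parameter values, sample size, and X values below are illustrative assumptions.

```python
import random

random.seed(0)

beta0, beta1, sigma = 9.5, 2.1, 1.0                  # assumed parameter values
n = 10

X = [random.randint(1, 8) for _ in range(n)]         # Xi: known constants
eps = [random.gauss(0, sigma) for _ in range(n)]     # error terms: mean 0, sd sigma
Y = [beta0 + beta1 * x + e for x, e in zip(X, eps)]  # Yi = beta0 + beta1*Xi + eps_i

for i, (x, y) in enumerate(zip(X, Y), start=1):
    print(f"trial {i}: X = {x}, Y = {y:.2f}")
```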

13
Meaning of Regression Parameters
  • The parameters β0 and β1 are called regression
    coefficients.
  • Here β1 is the slope of the regression line and
    indicates the change in the mean of the
    probability distribution of Y per unit increase
    in X.
  • When sensible, β0 is the mean of the probability
    distribution of Y when X = 0.

14
Example
  • A consultant for an electrical distributor is
    studying the relationship between the number of
    bids requested by construction contractors for
    basic lighting equipment during a week and the
    time required to prepare the bids. Let X be the
    number of bids prepared in a week and Y be the
    number of hours required to prepare the bids.
  • Suppose the regression function is
  • Y = 9.5 + 2.1X + ε
  • Here the slope 2.1 indicates that the preparation
    of one additional bid in a week leads to an
    increase of 2.1 hours in the mean of the
    probability distribution of Y (a quick numerical
    check follows below).
  • Here X = 0 is of no practical use, so β0 has no
    particular meaning.
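
A quick, illustrative check of the interpretation above using the given regression function; the particular X values (5 and 6 bids) are chosen only for the example.

```python
def mean_hours(bids):
    """Mean preparation time in hours for a given number of bids in a week."""
    return 9.5 + 2.1 * bids

print(f"{mean_hours(5):.1f}")                  # 20.0 hours, on average, for 5 bids
print(f"{mean_hours(6) - mean_hours(5):.1f}")  # 2.1: one extra bid adds 2.1 hours to the mean
```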

15
Data for Regression Analysis
  • Observational Data
  • Obtained from non-experimental studies.
  • Experimental Data
  • Completely Randomized Design (CRD)
  • Every experimental unit has an equal chance of
    receiving any one of the treatments.
  • For all our studies we shall use a CRD.

16
Estimation of Regression Function
  • We will use the method of least squares to obtain
    estimates b0 and b1 for β0 and β1.
  • Let's do it by hand! (A worked sketch follows
    below.)
  • The Gauss-Markov theorem gives us that b0 and b1
    are unbiased and have minimum variance among all
    unbiased linear estimators.
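
A minimal "by hand" least squares sketch; the data are made up for illustration, and the formulas b1 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)² and b0 = Ȳ - b1 X̄ are the standard least squares estimates.

```python
# Illustrative data (made up), fitted with the usual least squares formulas.
X = [1, 3, 4, 6, 8]
Y = [12.1, 15.9, 18.2, 22.0, 26.3]

n = len(X)
x_bar = sum(X) / n
y_bar = sum(Y) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))  # sum (Xi - Xbar)(Yi - Ybar)
s_xx = sum((x - x_bar) ** 2 for x in X)                      # sum (Xi - Xbar)^2

b1 = s_xy / s_xx           # slope estimate
b0 = y_bar - b1 * x_bar    # intercept estimate

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```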

17
Residuals
  • The ith residual is the difference between the
    observed value Yi and the corresponding fitted
    value Ŷi. This residual is denoted by ei and is
    defined in general as follows:
  • ei = Yi - Ŷi
  • where the fitted value is given by
  • Ŷi = b0 + b1 Xi
  • Remember that residuals ei are known, whereas the
    error terms εi from the model are unknown (see the
    sketch below).
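
A short sketch of the residual definition ei = Yi - Ŷi; the data and fitted coefficients b0, b1 below are illustrative assumptions.

```python
# Illustrative data and assumed fitted coefficients (e.g. from a least squares fit).
X = [1, 3, 4, 6, 8]
Y = [12.1, 15.9, 18.2, 22.0, 26.3]
b0, b1 = 10.2, 2.0

fitted = [b0 + b1 * x for x in X]                     # Yhat_i = b0 + b1*Xi
residuals = [y - yhat for y, yhat in zip(Y, fitted)]  # e_i = Yi - Yhat_i

for x, y, yhat, e in zip(X, Y, fitted, residuals):
    print(f"X = {x}: observed Y = {y}, fitted = {yhat:.2f}, residual = {e:+.2f}")
```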

18
Some Properties of Fitted Regression Line
  • The sum of the residuals is zero: Σ ei = 0.
  • The sum of the squared residuals, Σ ei², is a
    minimum.
  • The sum of the observed values equals the sum of
    the fitted values: Σ Yi = Σ Ŷi.
  • The regression line always goes through the point
    (X̄, Ȳ).
  • These properties are checked numerically in the
    sketch below.
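
A sketch that checks the listed properties numerically on made-up data after a least squares fit; small tolerances are used because of floating-point rounding.

```python
# Illustrative data; fit by least squares, then check each property numerically.
X = [1, 3, 4, 6, 8]
Y = [12.1, 15.9, 18.2, 22.0, 26.3]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in X]
residuals = [y - f for y, f in zip(Y, fitted)]

print(abs(sum(residuals)) < 1e-9)             # sum of residuals is zero
print(abs(sum(Y) - sum(fitted)) < 1e-9)       # sum of observed = sum of fitted
print(abs((b0 + b1 * x_bar) - y_bar) < 1e-9)  # line passes through (Xbar, Ybar)
```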

19
Estimation of Error Terms Variance
  • The mean square error (MSE) is used to estimate
    the error variance σ² of the data.
  • MSE is an unbiased estimator of σ².
  • Here MSE = SSE / (n - 2) = Σ ei² / (n - 2), where
    SSE is the error (residual) sum of squares (see
    the sketch below).
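
A minimal sketch of the estimate MSE = SSE / (n - 2) on made-up data, reusing the least squares fit from the earlier sketch.

```python
# Illustrative data; MSE = SSE / (n - 2), with SSE the sum of squared residuals.
X = [1, 3, 4, 6, 8]
Y = [12.1, 15.9, 18.2, 22.0, 26.3]

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
      / sum((x - x_bar) ** 2 for x in X))
b0 = y_bar - b1 * x_bar

residuals = [y - (b0 + b1 * x) for x, y in zip(X, Y)]
sse = sum(e ** 2 for e in residuals)  # error (residual) sum of squares
mse = sse / (n - 2)                   # unbiased estimate of the error variance sigma^2

print(f"SSE = {sse:.4f}, MSE = {mse:.4f}")
```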

20
Normal Error Regression Model
  • This is the same model as described before, only
    with the additional assumption that the error
    terms εi are normally distributed with mean 0 and
    variance σ².
  • For all our regression models we will assume
    normal error terms.

21
Practice Problem
  • Look at the data sheet given to you and answer
    the questions.