Least Squares Regression - PowerPoint PPT Presentation

Provided by: Department915

Transcript and Presenter's Notes

Title: Least Squares Regression


1
Least Squares Regression
  • Fitting a Line to Bivariate Data

2
Linear Relationships
  • Avg. occupants per car
  • 1980: 6/car
  • 1990: 3/car
  • 2000: 1.5/car
  • By the year 2010 every fourth car will have
    nobody in it!
  • Food for Thought
  • What kind of mathematical relationship is there
    between year and avg. no. of occupants per car?
  • Why might the relationship break down by 2010?

3
Basic Terminology
  • Scatterplots, correlation: interested in
    association between 2 variables (assign x and y
    arbitrarily)
  • Least squares regression: does one quantitative
    variable explain or cause changes in another
    variable?

4
Basic Terminology (cont.)
  • Explanatory variable: explains or causes changes
    in the other variable; the x-variable
    (independent variable)
  • Response variable: the y-variable; it responds
    to changes in the x-variable (dependent
    variable)

5
Examples
  • Fertilizer (x), corn yield (y)
  • Advertising (x), store income (y)
  • Drug dose (x), blood pressure (y)
  • Daily temperature (x), natural gas demand (y)
  • Change in min. wage (x), unemployment rate (y)

6
Simplest Relationship
  • Simplest equation that describes the dependence
    of variable y on variable x
  • y = b0 + b1x
  • linear equation
  • graph is a line with slope b1 and y-intercept b0

7
Graph
[Figure: graph of the line y = b0 + b1x, showing the y-intercept b0 and the slope b1 = rise/run]
8
Notation
  • Data: (x1, y1), (x2, y2), . . . , (xn, yn)
  • Draw the line y = b0 + b1x through the scatterplot;
    the point on the line corresponding to xi is
    (xi, ŷi), where ŷi = b0 + b1xi

9
Observed y, Predicted y
Predicted y when x = 2.7: ŷ = a + bx = a + b(2.7)
(here a and b are the intercept b0 and slope b1)
10
Scatterplot: Fuel Consumption vs Car Weight
Best line?
11
Scatterplot with least squares prediction line
12
How do we draw the line? Residuals
13
Residuals graphically
14
Criterion for choosing what line to draw: method
of least squares
  • The method of least squares chooses the line that
    makes the sum of squares of the residuals as
    small as possible
  • This line has slope b1 and intercept b0 that
    minimize Σ(yi − (b0 + b1xi))²

15
Least Squares Line ŷ = b0 + b1x: Slope b1 and
Intercept b0
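The formulas on this slide did not survive in the transcript. As a reference sketch, the standard closed-form expressions for the least squares slope and intercept are given below; the sample-mean notation (x-bar, y-bar) is assumed here rather than taken from the slide.

```latex
\[
b_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2},
\qquad
b_0 = \bar{y} - b_1\,\bar{x}.
\]
```

These are the values of b0 and b1 that minimize the sum of squared residuals for the line ŷ = b0 + b1x.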
16
Example: Income vs Consumption Expenditure
17
Questions
  • Construct scatterplot; determine if linear model
    is appropriate. If so,
  • find the least squares prediction line
  • Estimate consumption expenditure in a household
    with an income of (i) 6,000 (ii) 25,000.
    Comfortable with estimates?
  • Compute the residuals

18
Scatterplot
19
Solution
20
Calculations
21
least squares prediction line
22
Least Squares Prediction Line
23
Consumption Expenditure Prediction When x = 6,000
[Figure: prediction line; predicted value 7.4 at x = 6]
24
Consumption Expenditure Prediction When x = 25,000
[Figure: prediction line; predicted value 11.2 at x = 25]
25
The least squares line always goes through the
point with coordinates (x̄, ȳ)
(x̄, ȳ) = (9, 8)
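The predictions for the income example can be checked with a few lines of Python. This is a minimal sketch: the coefficients b0 = 6.2 and b1 = 0.2 are an assumption read off the SSE arithmetic on the "Σ residuals" slide, and the name `predict` is introduced here for illustration only.

```python
# Coefficients inferred from the SSE arithmetic elsewhere in the deck
# (an assumption; the prediction-line slide itself has no transcript).
B0 = 6.2   # intercept
B1 = 0.2   # slope

def predict(x):
    """Predicted consumption expenditure for income x (x on the graph's scale)."""
    return B0 + B1 * x

pred_6 = predict(6)       # income of 6,000 is plotted as x = 6
pred_25 = predict(25)     # income of 25,000 is plotted as x = 25
pred_mean = predict(9)    # x-bar = 9; the line should return y-bar = 8 here

print(round(pred_6, 2), round(pred_25, 2), round(pred_mean, 2))
```

Up to floating-point rounding, the three values should match the 7.4, 11.2 and 8 shown on the slides.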
26
C. Compute the Residuals
27
Residuals
28
Income Residual Plot
29
Σ residuals, Σ(residuals)²
  • Note that
  • Σ residuals = 0
  • Σ(residuals)² = 3.6
  • From formula in box on p. 7:
  • SSE = Σyi² − b0 Σyi − b1 Σxiyi
  • = 330 − 6.2(40) − .2(392)
  • = 330 − 248 − 78.4 = 3.6
  • Any other line drawn through the scatterplot will
    have
  • Σ(residuals)² > 3.6
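The shortcut formula for the sum of squared residuals can be evaluated directly from the summary sums. A small sketch follows; the variable names are illustrative, and the numbers are the ones appearing in the slide's arithmetic.

```python
# Summary quantities as they appear in the slide's arithmetic
sum_y2 = 330.0    # sum of y_i squared
sum_y = 40.0      # sum of y_i
sum_xy = 392.0    # sum of x_i * y_i
b0, b1 = 6.2, 0.2 # intercept and slope of the least squares line

# Shortcut formula: SSE = sum(y^2) - b0 * sum(y) - b1 * sum(x*y)
sse = sum_y2 - b0 * sum_y - b1 * sum_xy
print(round(sse, 4))
```

Note that this shortcut only holds when b0 and b1 are the least squares coefficients; for any other line the residuals have to be squared and summed one by one.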

30
Car Weight, Fuel Consumption Example, cont.
  • (xi, yi): (3.4, 5.5), (3.8, 5.9), (4.1, 6.5),
    (2.2, 3.3), (2.6, 3.6), (2.9, 4.6), (2, 2.9),
    (2.7, 3.6), (1.9, 3.1), (3.4, 4.9)

31
Wt (x)   Fuel (y)   x − x̄    (x − x̄)²   y − ȳ    (y − ȳ)²   (x − x̄)(y − ȳ)
3.4      5.5        .5       .25        1.11     1.2321     .555
3.8      5.9        .9       .81        1.51     2.2801     1.359
4.1      6.5        1.2      1.44       2.11     4.4521     2.532
2.2      3.3        -.7      .49        -1.09    1.1881     .763
2.6      3.6        -.3      .09        -.79     .6241      .237
2.9      4.6        0        0          .21      .0441      0
2.0      2.9        -.9      .81        -1.49    2.2201     1.341
2.7      3.6        -.2      .04        -.79     .6241      .158
1.9      3.1        -1.0     1          -1.29    1.6641     1.29
3.4      4.9        .5       .25        .51      .2601      .255
29       43.9       0        5.18       0        14.589     8.49      (col. sum)
32
Calculations
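The calculation slide has no transcript, but the fit can be reproduced from the ten data pairs. The sketch below uses the standard formulas b1 = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and b0 = ȳ − b1·x̄; the function name `least_squares_fit` is an illustrative choice, not something from the slides.

```python
# Car weight and fuel consumption pairs from the example slides
weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]

def least_squares_fit(xs, ys):
    """Return (b0, b1) for the least squares line y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)                      # 5.18 in the table
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) # 8.49 in the table
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

b0, b1 = least_squares_fit(weights, fuel)
print(f"slope b1 = {b1:.4f}, intercept b0 = {b0:.4f}")
```

The slope is the ratio of the last and fourth column sums of the table, 8.49 / 5.18, which is roughly 1.64.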
33
Scatterplot with least squares prediction line
34
The Least Squares Line Always goes Through (x̄, ȳ)
(x̄, ȳ) = (2.9, 4.39)
35
Using the least squares line for prediction. Fuel
consumption of 3,000 lb car? (x = 3)
36
Be Careful!
Fuel consumption of 500 lb car? (x = .5)
x = .5 is outside the range of the x-data that we
used to determine the least squares line
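One way to guard against this kind of extrapolation is to report, alongside each prediction, whether x lies inside the observed range. The following is a sketch under that assumption; `predict_with_check` is a hypothetical helper, and weights are taken to be in thousands of pounds, as the x = 3 and x = .5 labels suggest.

```python
# Car weight (thousands of lb) and fuel consumption pairs from the example
weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]

n = len(weights)
x_bar = sum(weights) / n
y_bar = sum(fuel) / n
sxx = sum((x - x_bar) ** 2 for x in weights)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weights, fuel))
b1 = sxy / sxx            # least squares slope
b0 = y_bar - b1 * x_bar   # least squares intercept

def predict_with_check(x):
    """Return (predicted y, True if x is inside the observed x-range)."""
    inside = min(weights) <= x <= max(weights)
    return b0 + b1 * x, inside

pred_3, ok_3 = predict_with_check(3.0)        # 3,000 lb car: within 1.9 to 4.1
pred_half, ok_half = predict_with_check(0.5)  # 500 lb car: extrapolation

print(f"x = 3.0: {pred_3:.3f} (inside data range: {ok_3})")
print(f"x = 0.5: {pred_half:.3f} (inside data range: {ok_half})")
```

The line still returns a number for x = .5, but the flag makes it explicit that the data give no support for trusting it.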
37
Avoid GIGO! Evaluating the least squares line
  1. Create scatterplot. Approximately linear?
  2. Calculate r², the square of the correlation
    coefficient
  3. Examine residual plot

38
r²: The Variation Accounted For
  • The square of the correlation coefficient r gives
    important information about the usefulness of the
    least squares line

39
r²: important information for evaluating the
usefulness of the least squares line
−1 ≤ r ≤ 1 implies 0 ≤ r² ≤ 1
The square of the correlation coefficient, r², is
the fraction of the variation in y that is
explained by the least squares regression of y on
x.
The square of the correlation coefficient, r², is
the fraction of the variation in y that is
explained by the variation in x.
40
Example: car weight, fuel consumption
  • x = car weight, y = fuel consumption
  • r² = (.9766)² ≈ .95
  • About 95% of the variation in fuel consumption
    (y) is explained by the linear relationship
    between car weight (x) and fuel consumption (y).
  • What else affects fuel consumption?
  • Driver, size of engine, tires, road, etc.
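The value of r can be recomputed from the ten data pairs using r = Σ(x − x̄)(y − ȳ) / sqrt(Σ(x − x̄)² · Σ(y − ȳ)²). A minimal sketch, with variable names chosen here for illustration:

```python
import math

# Car weight and fuel consumption pairs from the example slides
weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]

n = len(weights)
x_bar = sum(weights) / n
y_bar = sum(fuel) / n

sxx = sum((x - x_bar) ** 2 for x in weights)                      # variation in x
syy = sum((y - y_bar) ** 2 for y in fuel)                         # variation in y
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weights, fuel))

r = sxy / math.sqrt(sxx * syy)   # correlation coefficient
r_squared = r ** 2               # fraction of variation in y explained by the line

print(f"r = {r:.4f}, r^2 = {r_squared:.4f}")
```

The three sums are the column totals 5.18, 14.589 and 8.49 from the calculation table, so the result can also be checked by hand.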

41
Example: SAT scores
42
SAT scores calculations
43
SAT scores result
r² = (−.868)² = .7534
If 57% of NC seniors take the SAT, the predicted
mean score is
44
Avoid GIGO! Evaluating the least squares line
  1. Create scatterplot. Approximately linear?
  2. Calculate r², the square of the correlation
    coefficient
  3. Examine residual plot

45
Residuals
  • residual = observed y − predicted y
  • = y − ŷ
  • Properties of residuals
  • The residuals always sum to 0 (therefore the mean
    of the residuals is 0)
  • The least squares line always goes through the
    point (x̄, ȳ)
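Both properties follow from the intercept formula b0 = ȳ − b1·x̄. A short derivation, writing e_i for the i-th residual (notation assumed here, not taken from the slide):

```latex
\[
\sum_{i=1}^{n} e_i
  = \sum_{i=1}^{n}\bigl(y_i - b_0 - b_1 x_i\bigr)
  = n\bar{y} - n b_0 - n b_1 \bar{x}
  = n\bigl(\bar{y} - b_0 - b_1\bar{x}\bigr)
  = 0,
\]
\[
\text{since } b_0 = \bar{y} - b_1\bar{x}
\;\Longleftrightarrow\;
\bar{y} = b_0 + b_1\bar{x}.
\]
```

The last line says that plugging x̄ into the least squares line returns ȳ, which is the statement that the line passes through the point of means.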

46
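Graphically: residual = y − ŷ
  • [Figure: a data point (xi, yi) and the fitted line;
    the residual ei = yi − ŷi is the vertical distance
    between the observed yi and the predicted ŷi]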

47
Residual Plot
  • Residuals help us determine if fitting a least
    squares line to the data makes sense
  • When a least squares line is appropriate, it
    should model the underlying relationship; nothing
    interesting should be left behind
  • We make a scatterplot of the residuals in the
    hope of finding
  • NOTHING!

48
Car Wt/ Fuel Consump Residuals
CAR WT.   FUEL CONSUMP.   Pred FUEL CONSUMP.   Residuals
3.4       5.5             5.209498069          0.290501931
3.8       5.9             5.865096525          0.034903475
4.1       6.5             6.356795367          0.143204633
2.2       3.3             3.242702703          0.057297297
2.6       3.6             3.898301158          -0.29830115
2.9       4.6             4.39                 0.21
2         2.9             2.914903475          -0.01490347
2.7       3.6             4.062200772          -0.46220077
1.9       3.1             2.751003861          0.348996139
3.4       4.9             5.209498069          -0.309498069
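The predicted values and residuals in this table can be regenerated from the raw data. The sketch below refits the line with the standard least squares formulas and prints one row per car; the column formatting is an illustrative choice.

```python
# Car weight and fuel consumption pairs from the example slides
weights = [3.4, 3.8, 4.1, 2.2, 2.6, 2.9, 2.0, 2.7, 1.9, 3.4]
fuel = [5.5, 5.9, 6.5, 3.3, 3.6, 4.6, 2.9, 3.6, 3.1, 4.9]

n = len(weights)
x_bar = sum(weights) / n
y_bar = sum(fuel) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(weights, fuel))
      / sum((x - x_bar) ** 2 for x in weights))
b0 = y_bar - b1 * x_bar

predicted = [b0 + b1 * x for x in weights]              # y-hat for each car
residuals = [y - p for y, p in zip(fuel, predicted)]    # observed minus predicted

for x, y, p, e in zip(weights, fuel, predicted, residuals):
    print(f"{x:4.1f} {y:4.1f} {p:12.9f} {e:12.9f}")

# Sanity check: residuals from a least squares fit sum to zero (up to rounding)
print("sum of residuals:", round(sum(residuals), 9))
```

Plotting `residuals` against `weights` gives the residual plot used in the evaluation step; a patternless scatter around zero is what one hopes to see.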

49
Example: Car wt/fuel consump. residual plot (page 13)
50
SAT Residuals
51
Linear Relationship?
52
Garbage In Garbage Out
53
Residual Plot Clue to GIGO
54
(No Transcript)