Topics: Regression - PowerPoint PPT Presentation

About This Presentation
Title:

Topics: Regression

Description:

Topics: Regression Simple Linear Regression: one dependent variable and one independent variable Multiple Regression: one dependent variable and two or more ... – PowerPoint PPT presentation

Slides: 27
Provided by: AnnPo7
Learn more at: http://web.stanford.edu

Transcript and Presenter's Notes

Title: Topics: Regression


1
Topics: Regression
  • Simple Linear Regression: one dependent variable
    and one independent variable
  • Multiple Regression: one dependent variable and
    two or more independent variables.

2
Correlation
  • A correlation describes a relationship between
    two variables
  • Correlation tries to answer the following
    questions:
  • What is the relationship between variable X and
    variable Y?
  • How are the scores on one measure associated with
    scores on another measure?
  • To what extent do the high scores on one variable
    go with the high scores on the second variable?
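The questions above are usually answered with the Pearson correlation coefficient. What follows is a minimal sketch, not part of the original slides; the function name `pearson_r` and the scores are hypothetical, invented only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical scores on two measures (not data from the presentation)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(x, y)
```

A value of r near +1 means high scores on X tend to go with high scores on Y; a value near 0 means little linear association.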

3
Simple Linear Regression
  • Understanding relationships between variables
  • Prediction
  • Explanation

4
Design Requirements and Assumptions
  • Two continuous variables
  • Variables are linearly related
  • Random Sampling
  • Independence
  • Bivariate Normality
  • N > 30

5
Example
  • You are the admissions committee in the Sociology
    department of a large West Coast university. You
    are trying to make decisions about whom to admit
    to the Master's program. You would like to be
    able to predict how well the applicants you are
    deciding about will do at your school.
  • Your department has been analyzing the
    performance of its graduate students over the
    years. One thing it has been looking at is the
    relationship between undergraduate GPA and
    graduate GPA.
  • From regression analyses done over the years, you
    are able to make some educated guesses about how
    applicants will perform once admitted.

6
How Used in Making Predictions
7
The Regression Coefficient? What Slope? What
Altitude?
8
Fitting the Regression Line: The Best Fit (Least
Squares)
  • $Y' = a + b_y X$
  • The predicted value of Y ($Y'$) for a value of X is
    computed by:
  • Multiplying a score (X) by the regression
    coefficient ($b_y$)
  • Adding the regression constant (a) to this
    product
  • The prediction of Y from X is based on the linear
    relationship of X and Y, so that errors are
    minimized
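To make "errors are minimized" concrete, here is a small sketch that fits a line by least squares and checks that a shifted line fits worse. It is an illustration under stated assumptions: the helper names `fit_line` and `sum_squared_error` and the data points are hypothetical, not taken from the slides.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y' = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # regression coefficient (slope)
    a = mean_y - b * mean_x    # regression constant (intercept)
    return a, b

def sum_squared_error(a, b, xs, ys):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data, invented for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

a, b = fit_line(x, y)
best = sum_squared_error(a, b, x, y)
# Moving the line up or tilting it can only increase the squared error
shifted = sum_squared_error(a + 0.1, b, x, y)
tilted = sum_squared_error(a, b + 0.1, x, y)
```

For these points the fitted line is Y' = 2.2 + 0.6X, and both the shifted and the tilted line have a larger squared error than the fitted one.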

9
Least Squares Fit: Visual


Where the average squared distance of the points
from the regression line is minimized
10
Minimizing Prediction Error: What That Means
(For Math Types)
11
The Regression Coefficient: Close Your Eyes if
You Don't Want the Derivation
  • $b_y = r_{xy} (s_y / s_x)$
  • $b_y$ = regression coefficient
  • $r_{xy}$ = correlation between X and Y
  • $s_y$ = standard deviation of Y
  • $s_x$ = standard deviation of X
  • To compute $b_y$: divide the standard deviation of Y
    ($s_y$) by the standard deviation of X ($s_x$), then
    multiply by the Pearson correlation ($r_{xy}$) between
    X and Y
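The recipe above translates almost directly into code. This is a sketch only: the data are hypothetical, and it assumes the sample standard deviation (N - 1 in the denominator), which the slides do not specify. The N versus N - 1 choice cancels in the final slope as long as the same one is used for r, the SD of X, and the SD of Y.

```python
import statistics

# Hypothetical scores, invented for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
mean_x = statistics.mean(x)
mean_y = statistics.mean(y)
s_x = statistics.stdev(x)  # sample standard deviation of X
s_y = statistics.stdev(y)  # sample standard deviation of Y

# Pearson correlation from the sample covariance and the two SDs
covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)
r_xy = covariance / (s_x * s_y)

# Regression coefficient: correlation times the ratio of standard deviations
b_y = r_xy * (s_y / s_x)
```

For these numbers the slope comes out to 0.6, the same value a direct least-squares fit gives.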

12
The Constant (a): More Math
  • Regression constant (a): the altitude of the
    regression line; the value where the regression
    line intercepts the Y axis, where X = 0 (the Y intercept)
  • $a = \bar{Y} - b_y \bar{X}$
  • $a$ = the regression constant
  • $\bar{Y}$ = mean of Y
  • $b_y$ = regression coefficient
  • $\bar{X}$ = mean of X
  • To compute a: multiply $\bar{X}$ (mean of X) by the
    regression coefficient ($b_y$) and then subtract
    that product from $\bar{Y}$ (mean of Y)
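One consequence of this definition, added here as a short worked step rather than taken from the slides: substituting the constant into the prediction equation shows that the regression line always passes through the point of means.

```latex
\begin{aligned}
Y'(\bar{X}) &= a + b_y \bar{X} \\
            &= (\bar{Y} - b_y \bar{X}) + b_y \bar{X} \\
            &= \bar{Y}
\end{aligned}
```

In other words, an applicant with exactly the mean undergraduate GPA is predicted to earn exactly the mean graduate GPA.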

13
Plotting Regression Line
  • Need to compute two predicted scores
  • For X (undergrad GPA) = 2.75:
  • $Y' = a + b_y X = 2.93 + .24(2.75) = 3.59$
  • For X (undergrad GPA) = 3.60:
  • $Y' = a + b_y X = 2.93 + .24(3.60) = 3.79$
  • Draw the regression line through the scatter plot using
    these two points
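The two predicted scores can be checked with a few lines of code. The constant 2.93 and the coefficient .24 come from the slide; the function name `predict_grad_gpa` is a hypothetical label chosen for this sketch.

```python
A = 2.93  # regression constant (a), from the slide
B = 0.24  # regression coefficient (b_y), from the slide

def predict_grad_gpa(undergrad_gpa):
    """Predicted graduate GPA: Y' = a + b_y * X."""
    return A + B * undergrad_gpa

# The two points used to draw the regression line, rounded to two decimals
low_point = (2.75, round(predict_grad_gpa(2.75), 2))
high_point = (3.60, round(predict_grad_gpa(3.60), 2))
```

Any two X values would do for drawing the line; picking values near the low and high ends of the observed range keeps the drawn line accurate.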

14
Plotting the Regression Line: Visual


15
Errors of Prediction
16
Standard Error of Estimate
  • The magnitude of the error made in estimating Y
    from X: a measure of dispersion around the
    regression line
  • The average error of prediction
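The slides do not give a formula for the standard error of estimate. A common textbook definition, assumed here, is the square root of the summed squared residuals divided by N - 2. The sketch below uses hypothetical data and a hypothetical function name.

```python
import math

def standard_error_of_estimate(xs, ys, a, b):
    """sqrt(sum of squared residuals / (N - 2)) for the line Y' = a + b*X.

    The N - 2 denominator is an assumed convention; some texts divide by N.
    """
    n = len(xs)
    squared_residuals = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(squared_residuals / (n - 2))

# Hypothetical data, with the least-squares line for these points
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = 2.2, 0.6

see = standard_error_of_estimate(x, y, a, b)
```

The result is in the units of Y, so it reads as a typical size of prediction error: here roughly 0.89 points.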

17
The Standard Error of Estimate: A Visual
Representation
[Figure: plot of Graduate GPA (Y axis) against Undergraduate GPA (X axis); axis ticks run from 3.00 to 4.00]
18
Standard Error of Estimate: Another Visual
Representation
[Figure; vertical axis labeled Y]
19
Is the prediction worth pursuing?
  • Standard error
  • Amount of variance explained by X
  • Testing the regression coefficient (b) for
    significance

20
Explaining Variance: How Much?
[Figure: total variance in Y, divided into predicted variance and unpredicted variance]
21
Assessing Prediction Accuracy: Explaining Variance
  • Total variance = predicted variance + residual
    (unexplained) variance
  • Coefficient of Determination ($r^2$): proportion of
    total variance in Y that has been predicted by
    variable X ($r^2 = s^2_{Y'} / s^2_Y$)
  • Our example: $r = .56$, so $r^2 = .3136$
  • Coefficient of Non-Determination ($1 - r^2$):
    proportion of total variance in Y that is not
    predicted by X
  • Our example: $1 - r^2 = 1 - .31 = .69$
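The variance partition can be verified numerically. In this sketch the data and fitted line are hypothetical, and sums of squares stand in for variances (the common divisor cancels in the ratio); only r = .56 is taken from the slide.

```python
# Hypothetical data, with the least-squares line for these points
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
a, b = 2.2, 0.6

mean_y = sum(y) / len(y)
predicted = [a + b * xi for xi in x]

# Total, predicted, and residual variation around the mean of Y
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_predicted = sum((pi - mean_y) ** 2 for pi in predicted)
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

# Coefficient of determination: predicted share of the total
r_squared = ss_predicted / ss_total

# The slide's own example, using r = .56
r_example = 0.56
explained = round(r_example ** 2, 4)        # coefficient of determination
unexplained = round(1 - r_example ** 2, 4)  # coefficient of non-determination
```

Without intermediate rounding the slide's example gives .3136 explained and .6864 unexplained, which the slide reports as .31 and .69.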

22
Proportion of Explained (Predicted) and
Unexplained (Residual) Variance
[Figure: X and Y, with $r_{xy} = .56$]
  • $r^2 = .31$ (31%): explained variance
  • $1 - r^2 = .69$ (69%): unexplained variance
23
t-Test for Individual Regression Coefficients ($b_y$)
  • $H_0: \beta = 0$ (where $\beta$ is the population regression
    coefficient)
  • $H_1: \beta \neq 0$
  • Compute a t statistic
  • $t = (b - \beta)/s_b = b/s_b$ (how many standard
    errors b is from the hypothesized population
    parameter under the null hypothesis, $\beta = 0$)

24
t-Test of b: Our Example
  • $t = .24/.12 = 2.00$
  • Set alpha at .05 (two-tailed)
  • Figure out df: $N - 2 = 8$
  • $t_{critical}(.05/2, 8) = 2.306$
  • Decision: $t_{observed}$ (2.00) < $t_{critical}$ (2.306), so
    do not reject the null hypothesis
  • Conclusion: cannot conclude that the slope is
    significantly different from 0 in the population.
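The decision rule above can be written out as a short check. All numbers come from the slide; N = 10 is inferred from df = N - 2 = 8. The critical value is copied from the slide rather than computed, since looking it up in code would need a statistics library beyond the Python standard library.

```python
b = 0.24            # sample regression coefficient, from the slide
s_b = 0.12          # standard error of b, from the slide
beta_null = 0.0     # population coefficient under the null hypothesis
n = 10              # inferred from df = N - 2 = 8
t_critical = 2.306  # two-tailed, alpha = .05, df = 8 (copied from the slide)

# How many standard errors b lies from the hypothesized value
t_observed = (b - beta_null) / s_b
df = n - 2

# Two-tailed test: reject only if |t| exceeds the critical value
reject_null = abs(t_observed) > t_critical
```

Here `reject_null` is False, matching the slide's decision not to reject the null hypothesis.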

25
Our Conclusion: Do not reject the null hypothesis


26
Warnings
  • Simple regression assumes a straight line
    relationship
  • Outliers can control regression results
  • Assumes random samples for making proper
    generalizations
  • Regression is correlational and does not show
    that x causes y