Cautions about Correlation and Regression

Slides: 9
Provided by: ITCU6
Learn more at: https://www.uky.edu
Transcript and Presenter's Notes

1
Cautions about Correlation and Regression
2
Residuals
  • A residual is the difference between an observed
    value of the dependent variable and the value
    predicted by the regression line:
    residual = observed y - predicted y.

3
Residuals
  • Residuals show how far the data fall from our
    regression line.
  • They help us assess the fit of a regression line.
  • The mean of the least-squares residuals is always
    0.
  • A residual plot is a scatterplot of the
    regression residuals against the independent
    variable.
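
The residual facts on this slide can be checked numerically. The sketch below is a minimal illustration, not part of the original slides: the data values and the helper names (least_squares_fit, residuals) are made up for the example. It fits a least-squares line, computes observed minus predicted for each point, and confirms that the residuals average to zero up to rounding. The (x, residual) pairs it prints are the points a residual plot would show.

```python
def least_squares_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the fitted line."""
    a, b = least_squares_fit(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Made-up illustrative data (an assumption, not from the slides).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

res = residuals(xs, ys)
mean_residual = sum(res) / len(res)

# These (x, residual) pairs are what a residual plot displays.
for x, r in zip(xs, res):
    print(f"x = {x:.1f}  residual = {r:+.3f}")
print(f"mean residual = {mean_residual:.2e}")
```

The zero mean holds for any least-squares line that includes an intercept, so it says nothing about whether the line fits well; the pattern in the residual plot does.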

4
Outliers and Influential Observations
  • An outlier is an observation that lies outside
    the overall pattern of the other observations.
  • Points that are outliers in the y direction of a
    scatterplot have large residual values.
  • An observation is influential for a statistical
    calculation if removing it would markedly change
    the result of the calculation.
  • Points that are outliers in the x direction of a
    scatterplot are often influential for the
    least-squares regression line.
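
To make the difference between the two kinds of outlier concrete, here is a small sketch under stated assumptions: the five base points and the two extra points are invented for illustration, and fit_line and fit_with are hypothetical helper names. It refits the least-squares line after adding one point that is far out in x and, separately, one that is far out in y, then compares the slope and the residual of the added point.

```python
def fit_line(xs, ys):
    """Least-squares (intercept, slope) for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up data following a roughly linear pattern (an assumption).
base_x = [1.0, 2.0, 3.0, 4.0, 5.0]
base_y = [1.1, 1.9, 3.2, 3.8, 5.0]

# Hypothetical extra observations: one far out in x, one far out in y.
x_outlier = (15.0, 4.0)
y_outlier = (3.0, 8.0)


def fit_with(point):
    """Refit with one extra point; return (slope, residual of that point)."""
    xs = base_x + [point[0]]
    ys = base_y + [point[1]]
    a, b = fit_line(xs, ys)
    return b, point[1] - (a + b * point[0])


_, slope_base = fit_line(base_x, base_y)
slope_x, resid_x = fit_with(x_outlier)
slope_y, resid_y = fit_with(y_outlier)

print(f"slope without extra point: {slope_base:.3f}")
print(f"x-outlier added: slope {slope_x:.3f}, its residual {resid_x:+.3f}")
print(f"y-outlier added: slope {slope_y:.3f}, its residual {resid_y:+.3f}")
```

In this example the point that is extreme in x pulls the line toward itself, so the slope changes a great deal while its own residual stays modest. The point that is extreme in y sits at the mean of x, leaves the slope unchanged, and shows up as a large residual instead.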

5
Beware!
  • Correlation measures only linear association, and
    fitting a straight line makes sense only when the
    overall pattern of the relationship is linear.
  • Extrapolation (predicting outside the range of
    the observed data) often produces unreliable
    predictions.
  • Correlation and least-squares regression are
    affected by outliers and influential points.
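
Two of these cautions can be sketched with a curved relationship. The example below is illustrative only: the choice of y = x squared, the x ranges, and the helper name summarize are assumptions, not from the slides. Part 1 shows a perfect but nonlinear relationship whose correlation is zero. Part 2 fits a line on a narrow range of x and then uses it to predict far outside that range.

```python
from math import sqrt


def summarize(xs, ys):
    """Return (correlation r, intercept, slope) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return sxy / sqrt(sxx * syy), mean_y - slope * mean_x, slope


# Part 1: y is completely determined by x, but not linearly.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
r_curve, _, _ = summarize(xs, ys)
print(f"correlation for y = x^2 on a symmetric range: {r_curve:.3f}")

# Part 2: fit a line on x = 0..3, then extrapolate to x = 10.
train_x = [0, 1, 2, 3]
train_y = [x ** 2 for x in train_x]
r_train, a, b = summarize(train_x, train_y)
predicted_at_10 = a + b * 10
actual_at_10 = 10 ** 2
print(f"correlation on the fitted range: {r_train:.3f}")
print(f"line predicts {predicted_at_10:.1f} at x = 10; actual is {actual_at_10}")
```

Note that the correlation on the fitted range in Part 2 is high even though the relationship is curved, which is why a high r alone does not justify predicting beyond the data.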

6
Correlation based on averages
  • A correlation based on averages over many
    individuals is usually higher than the
    correlation between the same variables based on
    data for individuals.
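
A small constructed example shows why averaging tends to raise the correlation. Everything in it is an assumption made for illustration: four groups whose centers lie on a line, and individuals scattered symmetrically around each center. Averaging within a group cancels the individual scatter, so the group averages line up far more tightly than the individuals do.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Hypothetical group centers on the line y = x, plus individual scatter.
centers = [1.0, 2.0, 3.0, 4.0]
offsets = [(-1.0, 1.0), (1.0, -1.0), (-1.0, -1.0), (1.0, 1.0)]
groups = [[(c + dx, c + dy) for dx, dy in offsets] for c in centers]

# Correlation computed from every individual.
ind_x = [x for group in groups for x, _ in group]
ind_y = [y for group in groups for _, y in group]
r_individuals = correlation(ind_x, ind_y)

# Correlation computed from one averaged point per group.
avg_x = [sum(x for x, _ in group) / len(group) for group in groups]
avg_y = [sum(y for _, y in group) / len(group) for group in groups]
r_averages = correlation(avg_x, avg_y)

print(f"r based on individuals:    {r_individuals:.3f}")
print(f"r based on group averages: {r_averages:.3f}")
```

The offsets here were chosen to cancel exactly, so the averages fall perfectly on a line; real data would show a less extreme gap in the same direction.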

7
Explaining association
  • Even when direct causation is present, it is
    rarely a complete explanation of an association
    between two variables.
  • Even well-established causal relations may not
    generalize to other settings.

8
Warning!
  • Two variables are confounded when their effects
    on a response variable cannot be distinguished
    from each other.
  • Even a strong association between two variables
    is not by itself good evidence that there is a
    cause-and-effect link between the variables.
  • Review criteria on page 184
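
The warning about association and causation can be illustrated with a simulation. This is a sketch under assumptions, not something from the slides: a hypothetical lurking variable z drives both x and y, while x and y have no effect on each other. The variable names, the noise levels, and the seed are all arbitrary choices. Because z is simulated, its contribution can be subtracted exactly, which is not possible with real observational data.

```python
import random
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical lurking variable z influences both x and y.
z = [rng.gauss(0, 2) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

# x and y are strongly associated although neither causes the other.
r_xy = correlation(x, y)

# Subtract z's contribution (known only because the data are simulated).
x_without_z = [xi - zi for xi, zi in zip(x, z)]
y_without_z = [yi - zi for yi, zi in zip(y, z)]
r_without_z = correlation(x_without_z, y_without_z)

print(f"r between x and y:                 {r_xy:.3f}")
print(f"r after removing the effect of z:  {r_without_z:.3f}")
```

Once the shared influence of z is removed, what remains of x and y is unrelated noise, so the strong association seen at first reflected the lurking variable rather than a link between x and y.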