Sociology 601 Martin Lecture 17: November 25 - PowerPoint PPT Presentation

Provided by: smar159
Learn more at: https://socy.umd.edu
1
Sociology 601 (Martin) Lecture 17: November 25
  • Correlation (Agresti and Finlay 9.4)
  • The problem we need to fix: rescaling one or both
    axes changes the slope b.
  • Example: murder rate and poverty rate for 50 US
    states. $\hat{Y} = -0.86 + 0.58X$,
  • where Y = murder rate per 100,000 per year
  • and X = poverty rate per 100.
  • If we rescale the murder rate to murders per 100
    persons per year, then $\hat{Y} = -0.00086 + 0.00058X$.
  • (does this mean the association is now weaker?)
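
The rescaling problem can be checked numerically. The sketch below is a minimal illustration, assuming a small made-up data set (the arrays `x` and `y` are hypothetical and are not the 50-state data from the lecture). It suggests that dividing Y by 1,000 divides the slope by 1,000 but leaves the correlation unchanged, so the association is not weaker.

```python
import numpy as np

# Hypothetical illustrative data, NOT the lecture's 50-state data.
x = np.array([8.0, 10.0, 12.0, 15.0, 18.0, 22.0])   # poverty rate per 100
y = np.array([3.0, 5.5, 4.0, 9.0, 8.5, 12.0])       # murders per 100,000

def slope(x, y):
    """Least-squares slope: sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)."""
    dx = x - x.mean()
    return float(np.sum(dx * (y - y.mean())) / np.sum(dx ** 2))

b_original = slope(x, y)
b_rescaled = slope(x, y / 1000.0)    # murders per 100 persons

r_original = float(np.corrcoef(x, y)[0, 1])
r_rescaled = float(np.corrcoef(x, y / 1000.0)[0, 1])

print("slopes:      ", b_original, b_rescaled)
print("correlations:", r_original, r_rescaled)
```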

2
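The correlation: a standardized slope
  • An accepted solution for the problem of scale is
    to standardize both axes, then calculate the
    slope.
  • $b = \Delta Y / \Delta X$
  • $r = (\Delta Y / s_Y) / (\Delta X / s_X) = (\Delta Y / \Delta X)(s_X / s_Y)
    = b (s_X / s_Y)$
  • where $s_X$ and $s_Y$ are the sample standard
    deviations of X and Y.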
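
As a rough check of the standardized-slope idea, the following sketch computes r three ways: from the slope via $b(s_X/s_Y)$, as the slope of the regression on z-scores, and with NumPy's built-in correlation. The data arrays are hypothetical, chosen only for illustration.

```python
import numpy as np

# Hypothetical illustrative data (an assumption, not from the lecture).
x = np.array([2.0, 4.0, 5.0, 7.0, 9.0, 11.0])
y = np.array([1.0, 3.5, 3.0, 6.0, 6.5, 10.0])

s_x = x.std(ddof=1)   # sample standard deviation of X
s_y = y.std(ddof=1)   # sample standard deviation of Y

# np.polyfit(x, y, 1) returns [slope, intercept] for a straight-line fit.
b = np.polyfit(x, y, 1)[0]
r_from_slope = float(b * s_x / s_y)

# Standardize both axes, then fit the slope again.
z_x = (x - x.mean()) / s_x
z_y = (y - y.mean()) / s_y
b_standardized = float(np.polyfit(z_x, z_y, 1)[0])

r_numpy = float(np.corrcoef(x, y)[0, 1])

print(r_from_slope, b_standardized, r_numpy)
```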

3
More on the correlation.
  • r is called
  • the Pearson correlation (or simply the
    correlation)
  • the standardized regression coefficient (or the
    standardized slope)
  • $r = b (s_X / s_Y)$
  • r is a sample statistic; we use it to estimate a
    population parameter $\rho$.

4
Measuring linear association: the correlation.
  • Calculating r for the murder and poverty example:
  • $b = .58$, $s_X = 4.29$, $s_Y = 3.98$
  • $r = b (s_X / s_Y) = .58 (4.29 / 3.98) = .629 \approx .63$
  • Alternatively (if the murder rate is per 100
    persons),
  • $b = .00058$, $s_X = 4.29$, $s_Y = .00398$
  • $r = b (s_X / s_Y) = .00058 (4.29 / .00398) = .629 \approx .63$
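
The arithmetic on this slide can be reproduced directly. The sketch below plugs in the rounded values shown on the slide; the helper name `r_from_slope` is hypothetical. With b rounded to .58 the product comes out near .625 rather than .629, which presumably means the slide's figure was computed from an unrounded slope. Either way, both scalings give the same r.

```python
def r_from_slope(b, s_x, s_y):
    """Correlation as a standardized slope: r = b * (s_X / s_Y)."""
    return b * s_x / s_y

# Murder rate per 100,000 persons (rounded values from the slide).
r_per_100k = r_from_slope(0.58, 4.29, 3.98)

# Murder rate per 100 persons: b and s_Y both shrink by a factor of 1,000.
r_per_100 = r_from_slope(0.00058, 4.29, 0.00398)

print(round(r_per_100k, 3), round(r_per_100, 3))
```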

5
Properties of the correlation coefficient r
  • $-1 \le r \le 1$
  • r can be positive or negative, and has the same
    sign as b.
  • $r = \pm 1$ when all the points fall exactly on the
    prediction line.
  • The larger the absolute value of r, the larger
    the degree of linear association.
  • $r = 0$ when there is no linear trend in the
    relationship between X and Y.

6
More properties of the correlation coefficient r
  • The value of r does not depend on the units of X
    and Y.
  • The correlation treats X and Y symmetrically
  • (unlike the slope $\beta$).
  • This means that a correlation says nothing about
    causal direction!
  • The correlation is valid only when a straight
    line is a reasonable model for the relationship
    between X and Y.

7
Examples of the correlation coefficient r
  • $b = 1$, $r = 1$
  • $b = 5$, $r = 1$
  • $b = .2$, $r = 1$
  • $b = -1$, $r = -1$
  • $b = .5$, $r = .8$
  • $b = .5$, $r = .3$
  • $b = 0$, linear assumption holds
  • $b = 0$, linear assumption does not hold
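
The last case above (b = 0 while the linear assumption fails) can be illustrated with a constructed example. In this sketch, which uses made-up values rather than lecture data, Y is an exact function of X, yet both the slope and the correlation are essentially zero because the relationship is a symmetric curve, not a line.

```python
import numpy as np

# Constructed example: Y is completely determined by X, but not linearly.
x = np.array([-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = x ** 2

b = float(np.polyfit(x, y, 1)[0])       # least-squares slope
r = float(np.corrcoef(x, y)[0, 1])      # Pearson correlation

# Both values are zero up to floating-point error.
print(b, r)
```

A correlation near zero therefore rules out a linear trend only; it does not rule out a strong nonlinear relationship.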

8
Finding a correlation coefficient using STATA
  • Recall the religion and state control study,
    where high levels of state regulation were
    associated with low levels of weekly church
    attendance.
  • . correlate attend regul
  • (obs=18)
  •              |   attend    regul
  • -------------+------------------
  •       attend |   1.0000
  •        regul |  -0.6133   1.0000

9
An alternative interpretation of r: proportional
reduction in error
  • Old interpretation for the murder and poverty
    example:
  • $r = .63$: the murder rate for a state is expected
    to be higher by 0.63 standard deviations for each
    1.0 standard deviation increase in the poverty
    rate.
  • New interpretation:
  • by using poverty rates to predict murder rates,
    we explain ?? percent of the variation in states'
    murder rates.

10
Proportional reduction in error
  • Predicting Y without using X:
  • $Y = \bar{Y} + e_1$
  • $E_1 = \sum e_1^2 = \sum (\text{observed } Y - \text{predicted } Y)^2$
  • Predicting Y using X:
  • $Y = \hat{Y} + e_2 = a + bX + e_2$
  • $E_2 = \sum e_2^2 = \sum (\text{observed } Y - \text{predicted } Y)^2$
  • Proportional reduction in error:
  • $r^2 = PRE = (E_1 - E_2) / E_1 = (TSS - SSE) / TSS$
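
The two prediction rules can be compared in a few lines. The sketch below is a minimal illustration on hypothetical data (the arrays are assumptions, not the lecture's data): it computes E1 around the mean, E2 around the fitted line, and checks that the proportional reduction in error matches the squared correlation.

```python
import numpy as np

# Hypothetical illustrative data, not from the lecture.
x = np.array([8.0, 10.0, 12.0, 15.0, 18.0, 22.0])
y = np.array([3.0, 5.5, 4.0, 9.0, 8.5, 12.0])

# Predicting Y without X: the prediction is Ybar for every case.
e1 = y - y.mean()
E1 = float(np.sum(e1 ** 2))            # total sum of squares (TSS)

# Predicting Y with X: the prediction is Yhat = a + bX.
b, a = np.polyfit(x, y, 1)             # polyfit returns [slope, intercept]
e2 = y - (a + b * x)
E2 = float(np.sum(e2 ** 2))            # sum of squared errors (SSE)

pre = (E1 - E2) / E1                   # proportional reduction in error
r = float(np.corrcoef(x, y)[0, 1])

print(pre, r ** 2)
```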

11
Proportional reduction in error.
  • Calculating $r^2$ for the murder and poverty
    example:
  • $r^2 = .629^2 \approx .395$
  • Alternatively (using computer output),
  • $r^2 = (TSS - SSE) / TSS = (777.7 - 470.4) / 777.7
    = .395$
  • Interpretation: 39.5% of the variation in
    states' murder rates is explained by its linear
    relationship with states' poverty rates.
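
The computer-output version of this calculation is easy to verify. This short sketch uses only the TSS and SSE values quoted on the slide; taking the positive square root is an assumption justified by the positive slope in this example.

```python
tss = 777.7   # total sum of squares, from the slide
sse = 470.4   # sum of squared errors, from the slide

r_squared = (tss - sse) / tss

# r has the same sign as b; b is positive here, so take the positive root.
r = r_squared ** 0.5

print(round(r_squared, 3), round(r, 3))   # 0.395 0.629
```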

12
Proportional reduction in error.
  • $r^2$ is also called the coefficient of
    determination.
  • Properties of $r^2$:
  • $0 \le r^2 \le 1$
  • $r^2 = 1$ (its maximum value) when $SSE = 0$.
  • $r^2 = 0$ when $SSE = TSS$. (Furthermore, $b = 0$.)
  • The higher $r^2$ is, the stronger the linear
    association between X and Y.
  • $r^2$ does not depend on the units of measurement.
  • $r^2$ takes the same value when X predicts Y as
    when Y predicts X.
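
The last two properties can be checked with a small sketch. The data and the helper `fit` below are hypothetical, introduced only for illustration. The slopes differ depending on which variable is the predictor and on the units used, while $r^2$ stays the same.

```python
import numpy as np

# Hypothetical illustrative data, not the lecture's state-level data.
x = np.array([8.0, 10.0, 12.0, 15.0, 18.0, 22.0])
y = np.array([3.0, 5.5, 4.0, 9.0, 8.5, 12.0])

def fit(pred, resp):
    """Return (slope, r2) for the least-squares regression of resp on pred."""
    slope, intercept = np.polyfit(pred, resp, 1)
    sse = np.sum((resp - (intercept + slope * pred)) ** 2)
    tss = np.sum((resp - resp.mean()) ** 2)
    return float(slope), float((tss - sse) / tss)

b_xy, r2_xy = fit(x, y)                    # X predicts Y
b_yx, r2_yx = fit(y, x)                    # Y predicts X
b_scaled, r2_scaled = fit(x, y / 1000.0)   # Y in different units

print("slopes:", b_xy, b_yx, b_scaled)
print("r^2:   ", r2_xy, r2_yx, r2_scaled)
```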