Measures of Association - PowerPoint PPT Presentation

Provided by: art782

1
Measures of Association
  • Political Science 102
  • Introduction to Political Inquiry
  • Lecture 20

2
Why Use Measures of Association?
  • Cross-tabs and scatter plots are flexible tools
    for exploring relationships between variables
  • Chi-squared test evaluates statistical
    significance
  • Neither method provides a summary measure of the
    relationship
  • What is the direction?
  • How strong is the relationship?
  • Measures of Association seek to provide this
    information

3
Ordinal Linear Measures
  • Each coefficient compares pairs of cases and records
    them as concordant, discordant, or tied
  • Concordant: case 1 is higher (or lower) than
    case 2 on both X and Y
  • Discordant: case 1 is lower than case 2 on X,
    but higher than case 2 on Y (or vice versa)
  • Tied: case 1 and case 2 are equal on X,
    on Y, or on both
  • A positive coefficient indicates more concordant
    than discordant pairs; a negative coefficient
    indicates more discordant than concordant pairs
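
The pair-by-pair comparison described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `classify_pairs` and the five example cases are hypothetical and do not come from the lecture.

```python
from itertools import combinations


def classify_pairs(cases):
    """Count concordant, discordant, and tied pairs among (x, y) cases."""
    counts = {"concordant": 0, "discordant": 0, "tied": 0}
    for (x1, y1), (x2, y2) in combinations(cases, 2):
        # The sign of this product says whether X and Y move together.
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            counts["concordant"] += 1
        elif product < 0:
            counts["discordant"] += 1
        else:
            # Equal on X, on Y, or on both.
            counts["tied"] += 1
    return counts


# Hypothetical ordinal scores (1 = low, 3 = high) for five cases.
cases = [(1, 1), (2, 2), (3, 1), (2, 3), (3, 3)]
result = classify_pairs(cases)
print(result)
```

With more concordant than discordant pairs, any of the coefficients discussed here would come out positive for this toy data.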

4
Ordinal Linear Measures
  • Coefficients vary in how they weight and account
    for ties
  • Gamma ignores ties (may ignore much of the data)
  • Tau-b uses a weighted average of ties on X and Y
  • All of these coefficients focus on linear
    relationships (or at least monotonic)
  • Curvilinear and contingent relationships may be
    masked by these procedures

5
Goodman & Kruskal's Gamma
C = Concordant pairs, D = Discordant pairs
C = 23 x 68 = 1,564
D = 5 x 3 = 15
Tx = (23 x 5) + (3 x 68)
Ty = (23 x 3) + (5 x 68)
6
Goodman & Kruskal's Gamma
C = 23 x 68 = 1,564
D = 5 x 3 = 15
Tx = (23 x 5) + (3 x 68) = 319
Ty = (23 x 3) + (5 x 68) = 409
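
The gamma formula itself did not survive in this transcript. A minimal sketch, assuming the standard definition gamma = (C - D) / (C + D) and assuming the products above come from a 2 x 2 table with cells 23, 5 / 3, 68 (the layout is inferred from the arithmetic, not shown in the transcript):

```python
def gamma(concordant, discordant):
    """Goodman & Kruskal's gamma: ties are ignored entirely."""
    return (concordant - discordant) / (concordant + discordant)


# Assumed 2 x 2 layout, inferred from the cross-products on the slide.
table = [[23, 5],
         [3, 68]]

C = table[0][0] * table[1][1]   # main diagonal: concordant pairs
D = table[0][1] * table[1][0]   # off diagonal: discordant pairs
g = gamma(C, D)
print(C, D, round(g, 4))
```

Because Tx and Ty never enter the calculation, gamma here rests on 1,579 untied pairs and ignores the 728 pairs tied on only one variable.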
7
Kendall's Tau-b
C = 23 x 68 = 1,564
D = 5 x 3 = 15
Tx = (23 x 5) + (3 x 68) = 319
Ty = (23 x 3) + (5 x 68) = 409
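
The tau-b formula is likewise missing from the transcript. The sketch below assumes the usual definition, tau-b = (C - D) / sqrt((C + D + Tx)(C + D + Ty)), and plugs in the counts from this slide. The function name `tau_b` is illustrative.

```python
import math


def tau_b(c, d, tx, ty):
    """Kendall's tau-b: pairs tied on only X or only Y enlarge the denominator."""
    return (c - d) / math.sqrt((c + d + tx) * (c + d + ty))


C, D = 23 * 68, 5 * 3
Tx = 23 * 5 + 3 * 68
Ty = 23 * 3 + 5 * 68
tb = tau_b(C, D, Tx, Ty)
print(Tx, Ty, round(tb, 4))
```

Under these assumptions tau-b comes out noticeably smaller than gamma for the same table, because the tied pairs count against the strength of the relationship instead of being dropped.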
8
Linearity and the Limits of Gamma and Tau-b
Level of Interest in Politics/Current Events, by Ideology
(cell counts with column percentages beneath)

Interest             | Very Liberal  Liberal  Moderate  Conservative  Very Conservative |    Total
---------------------+------------------------------------------------------------------+---------
Not much interested  |           15       46       100            62                 21 |      244
                     |         4.69     7.01      8.25          6.78               4.36 |     6.81
---------------------+------------------------------------------------------------------+---------
Somewhat interested  |           62      232       461           272                 96 |    1,123
                     |        19.38    35.37     38.04         29.73              19.92 |    31.32
---------------------+------------------------------------------------------------------+---------
Very much interested |          243      378       651           581                365 |    2,218
                     |        75.94    57.62     53.71         63.50              75.73 |    61.87
---------------------+------------------------------------------------------------------+---------
Total                |          320      656     1,212           915                482 |    3,585
                     |       100.00   100.00    100.00        100.00             100.00 |   100.00

Pearson chi2(8) = 106.8563   Pr = 0.000
          gamma =   0.0748  ASE = 0.023
Kendall's tau-b =   0.0466  ASE = 0.014
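
To see how a clearly curvilinear table can still produce near-zero ordinal coefficients, the counts above can be fed through a small pair-counting routine. This is a sketch, not the routine Stata uses internally: the helper names are hypothetical, and it assumes rows and columns are both in ascending order.

```python
import math


def pair_counts(table):
    """Return (concordant, discordant) pair counts for an ordered table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):       # every lower row
                for l in range(n_cols):
                    if l > j:                     # lower and to the right
                        concordant += table[i][j] * table[k][l]
                    elif l < j:                   # lower and to the left
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant


def gamma_and_tau_b(table):
    c, d = pair_counts(table)
    n = sum(map(sum, table))
    all_pairs = n * (n - 1) // 2
    row_ties = sum(r * (r - 1) // 2 for r in map(sum, table))
    col_ties = sum(t * (t - 1) // 2 for t in map(sum, zip(*table)))
    gamma = (c - d) / (c + d)
    tau_b = (c - d) / math.sqrt((all_pairs - row_ties) * (all_pairs - col_ties))
    return gamma, tau_b


# Rows: not much / somewhat / very much interested.
# Columns: very liberal ... very conservative.
interest_by_ideology = [
    [15, 46, 100, 62, 21],
    [62, 232, 461, 272, 96],
    [243, 378, 651, 581, 365],
]

g, tb = gamma_and_tau_b(interest_by_ideology)
column_totals = [sum(col) for col in zip(*interest_by_ideology)]
pct_very_interested = [
    100 * v / t for v, t in zip(interest_by_ideology[2], column_totals)
]
print(round(g, 4), round(tb, 4))
print([round(p, 2) for p in pct_very_interested])
```

The share of "very much interested" respondents is highest at both ideological extremes and lowest among moderates. The concordant and discordant pairs from the two halves of that U shape largely cancel, which is why gamma and tau-b sit close to zero even though the chi-squared test is highly significant.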
9
Correlation Coefficients
  • For ratio data we can construct measures of
    association with more information about distance
    between categories
  • Gamma and tau-b make only ordinal comparisons
  • Analysts sought to construct a summary statistic
    that would allow comparison of the strength of
    relationships despite different units of
    measure
  • The correlation coefficient!

10
Origins of the Correlation Coefficient
  • Analysts wanted to summarize how much changes in
    X and Y are associated with one another
  • But X and Y are on different scales with
    different levels of variation
  • Step 1: Measure the association of variation in X
    and Y by subtracting out the mean level of each
    variable
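
The formula for Step 1 is not preserved in the transcript. A plausible reading is the sample covariance, the average product of deviations from the mean. The sketch below assumes that reading and uses made-up numbers; it also shows why the result still depends on the units of X.

```python
def covariance(xs, ys):
    """Sample covariance: average product of deviations from the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data, not from the lecture.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

cov = covariance(xs, ys)
# Re-express X in units 100 times smaller (e.g. metres to centimetres).
cov_rescaled = covariance([100 * x for x in xs], ys)
print(cov, cov_rescaled)
```

Rescaling X multiplies the covariance by the same factor, so on its own this measure cannot be compared across variables measured in different units.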

11
The Origins of the Correlation Coefficient
  • This formula focuses on deviations in X and Y,
    but X and Y are still measured in different units
  • Solution: Divide deviations in X and Y by their
    respective standard deviations
  • Puts deviations in units of standard deviations
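
A minimal sketch of this standardizing step, assuming the usual Pearson correlation (the slide's own formula is not in the transcript). The data are hypothetical.

```python
import math


def correlation(xs, ys):
    """Pearson r as the average product of standardized deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)
    ) / (n - 1)


# Hypothetical data, not from the lecture.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

r = correlation(xs, ys)
r_rescaled = correlation([100 * x for x in xs], ys)
print(round(r, 4), round(r_rescaled, 4))
```

Changing the units of X leaves r unchanged, which is the sense in which the coefficient is unit-free. It is still expressed in sample standard deviations, so it remains tied to how much X and Y happen to vary in the sample at hand.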

12
Aspirations of the Correlation Coefficient
  • Aims to be a unit-free measure of association
    that allows comparison of degrees of association
    across variables measured in different units
  • It FAILS on all counts!
  • Cannot compare correlation of X and Y to Z and Y
    because X and Z have different standard
    deviations
  • The goal of unit-free comparisons is
    wrong-headed
  • Cannot generalize correlations between the same
    variables across different samples because the
    standard deviations of the samples differ
  • Instead of being universally comparable,
    correlations are universally incommensurable!

13
What is the Solution?
  • DON'T use correlation coefficients to make
    generalizable claims about the association
    between variables
  • Assess the strength of relationships by looking
    at:
      • Variation across categories in cross-tabs
      • Difference of means or proportions tests
      • Scatter plots
  • Rely on chi-squared and t-tests for statistical
    significance
  • Rely on regression analysis to summarize the
    strength of relationships between variables

14
The Sample Dependence of Correlation Coefficients
  • I created a dataset with these characteristics:
  • X1 varies from -20 to 20, where sx = 10
  • Y is defined as Y1 = 3 + 2X1 + e
  • e is a random error term such that e ~ N(0, 20)
  • Thus we KNOW the true relationship between X
    and Y
  • We can change the sample to see if correlations
    are generalizable
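
The experiment can be approximated with Python's standard library. This is a sketch under stated assumptions, not the lecturer's dataset: it reads the definition above as Y1 = 3 + 2*X1 + e, treats 20 as the standard deviation of e, draws X1 from a normal distribution with standard deviation 10 truncated to [-20, 20], and uses 10,000 cases (the lecture used 100) so the results are stable. The seed and helper names are arbitrary.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def corr_and_slope(xs, ys):
    """Return (Pearson r, OLS slope of y on x)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy), sxy / sxx


rng = random.Random(102)  # arbitrary seed, for reproducibility only
data = []
while len(data) < 10_000:
    x = rng.gauss(0, 10)
    if -20 <= x <= 20:                       # keep X1 within [-20, 20]
        data.append((x, 3 + 2 * x + rng.gauss(0, 20)))

# Full sample versus a sample restricted to -10 < X1 < 10.
r_full, b_full = corr_and_slope(*zip(*data))
restricted = [(x, y) for x, y in data if -10 < x < 10]
r_restricted, b_restricted = corr_and_slope(*zip(*restricted))

print(f"full:       r = {r_full:.3f}, slope = {b_full:.3f}")
print(f"restricted: r = {r_restricted:.3f}, slope = {b_restricted:.3f}")
```

Restricting the range of X1 lowers the correlation substantially, while the estimated slope stays close to the true value of 2 in both samples.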

15
Correlation Coefficients Depend on the Sample
. corr y1 x1
(obs=100)

             |       y1       x1
-------------+------------------
          y1 |   1.0000
          x1 |   0.6401   1.0000

Analysis of Full Sample

. corr y1 x1 if x1>-10 & x1<10
(obs=70)

             |       y1       x1
-------------+------------------
          y1 |   1.0000
          x1 |   0.4655   1.0000

Analysis of Restricted Variation in X
Correlation coefficient drops by more than a quarter
(from 0.6401 to 0.4655) due to arbitrary changes in the sample
16
Regression Coefficients Are Generalizable
. reg y1 x1

  Source |       SS       df       MS              Number of obs =     100
---------+------------------------------           F(  1,    98) =   68.03
   Model |  30717.8966     1  30717.8966           Prob > F      =  0.0000
Residual |  44248.6995    98  451.517342           R-squared     =  0.4098
---------+------------------------------           Adj R-squared =  0.4037
   Total |  74966.5961    99  757.238344           Root MSE      =  21.249

------------------------------------------------------------------------------
      y1 |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
      x1 |    1.92611   .2335192      8.248   0.000       1.462699    2.389522
   _cons |    4.27945   2.130969      2.008   0.047       .0506116    8.508289
------------------------------------------------------------------------------
Analyzing the Full Sample
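
As a quick consistency check, the figures in this output can be tied back to the full-sample correlation of 0.6401 reported earlier. The arithmetic below uses only numbers printed in the transcript; the variable names are illustrative.

```python
# Values copied from the full-sample output.
r_full = 0.6401                       # from ". corr y1 x1"
model_ss, total_ss = 30717.8966, 74966.5961
coef_x1, se_x1 = 1.92611, 0.2335192

r_squared = model_ss / total_ss       # share of variation in y1 explained
t_x1 = coef_x1 / se_x1                # t statistic for the slope

print(round(r_squared, 4), round(r_full ** 2, 4), round(t_x1, 3))
```

In a one-predictor regression, R-squared is the square of the correlation coefficient, so it inherits the same dependence on the sample that the lecture criticizes.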
17
Regression Coefficients Are Generalizable
. reg y1 x1 if x1>-10 & x1<10

  Source |       SS       df       MS              Number of obs =      70
---------+------------------------------           F(  1,    68) =   18.81
   Model |  8038.90856     1  8038.90856           Prob > F      =  0.0000
Residual |  29063.7331    68  427.407839           R-squared     =  0.2167
---------+------------------------------           Adj R-squared =  0.2051
   Total |  37102.6416    69  537.719444           Root MSE      =  20.674

------------------------------------------------------------------------------
      y1 |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
      x1 |   1.971043   .4544842      4.337   0.000       1.064134    2.877952
   _cons |   5.884096   2.517394      2.337   0.022       .8607142    10.90748
------------------------------------------------------------------------------
Analyzing the Restricted Sample
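
One way to see why the slope is stable while the correlation is not is the identity b = r * (s_y / s_x) for a one-predictor regression. The sketch below rearranges it to back out the standard deviation of x1 in each sample, using only figures printed in the outputs (the Total MS is the sample variance of y1). The function name is hypothetical.

```python
import math


def implied_sd_x(r, total_ms, slope):
    """Rearrange b = r * s_y / s_x to recover s_x from reported output."""
    s_y = math.sqrt(total_ms)         # Total MS = sample variance of y1
    return r * s_y / slope


# Full sample: r = 0.6401, Total MS = 757.238344, slope = 1.92611
sx_full = implied_sd_x(0.6401, 757.238344, 1.92611)
# Restricted sample: r = 0.4655, Total MS = 537.719444, slope = 1.971043
sx_restricted = implied_sd_x(0.4655, 537.719444, 1.971043)

print(round(sx_full, 2), round(sx_restricted, 2))
```

The spread of x1 shrinks by roughly 40 percent in the restricted sample. The slope absorbs none of that change, while the correlation coefficient falls with it.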