1
Measuring Agreement
2
Introduction
  • Different types of agreement
  • Diagnosis by different methods
  • Do both methods give the same results?
  • Disease absent or Disease present
  • Staging of carcinomas
  • Will different methods lead to the same results?
  • Will different raters lead to the same results?
  • Measurements of blood pressure
  • How consistent are measurements made
  • Using different devices?
  • With different observers?
  • At different times?

3
Investigating agreement
  • Need to consider
  • Data type
  • Categorical or continuous
  • How are the data repeated?
  • Measuring instrument(s), rater(s), time(s)
  • The goal
  • Are ratings consistent?
  • Estimate the magnitude of differences between
    measurements
  • Investigate factors that affect ratings
  • Number of raters

4
Data type
  • Categorical
  • Binary
  • Disease absent, disease present
  • Nominal
  • Hepatitis
  • Viral A, B, C, D, E or autoimmune
  • Ordinal
  • Severity of disease
  • Mild, moderate, severe
  • Continuous
  • Size of tumour
  • Blood pressure

5
How are data repeated?
  • Same person, same measuring instrument
  • Different observers
  • Inter-rater reliability
  • Same observer at different times
  • Intra-rater reliability
  • Repeatability
  • Internal consistency
  • Do the items of a test measure the same
    attribute?

6
Measures of agreement
  • Categorical
  • Kappa
  • Weighted
  • Fleiss
  • Continuous
  • Limits of agreement
  • Coefficient of variation (CV)
  • Intraclass Correlation (ICC)
  • Cronbach's α
  • Internal consistency

7
Number of raters
  • Two
  • Three or more

8
Categorical data: two raters
  • Kappa
  • Magnitude quoted
  • > 0.75 Excellent, 0.40 to 0.75 Fair to good,
    < 0.40 Poor
  • 0 to 0.20 Slight, > 0.20 to 0.40 Fair, > 0.40 to
    0.60 Moderate, > 0.60 to 0.80 Substantial, > 0.80
    Almost perfect
  • Degree of disagreement can be included
  • Weighted kappa
  • Values close together do not count towards
    disagreement as much as those further apart
  • Linear / quadratic weightings (see the sketch below)
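A minimal sketch of computing unweighted and weighted kappa for two raters in Python, assuming scikit-learn is available; rater1 and rater2 are hypothetical ordinal scores, not data from the examples that follow.

    # Sketch: Cohen's kappa for two raters (scikit-learn assumed).
    # rater1 and rater2 are hypothetical scores (1 to 5) for the same subjects.
    from sklearn.metrics import cohen_kappa_score

    rater1 = [1, 2, 2, 3, 4, 5, 3, 2, 1, 4]
    rater2 = [1, 2, 3, 3, 4, 5, 2, 2, 1, 5]

    print(cohen_kappa_score(rater1, rater2))                       # unweighted
    print(cohen_kappa_score(rater1, rater2, weights="linear"))     # linear weighting
    print(cohen_kappa_score(rater1, rater2, weights="quadratic"))  # quadratic weighting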

9
Categorical data: more than two raters
  • Different tests for
  • Binomial data
  • Data with more than two categories
  • Online calculators (or the sketch below)
  • http://www.vassarstats.net/kappa.html
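As an alternative to the online calculator, a minimal sketch of Fleiss' kappa for more than two raters, assuming statsmodels is available; the ratings matrix below is hypothetical.

    # Sketch: Fleiss' kappa for > two raters (statsmodels assumed).
    # Hypothetical data: rows = subjects, columns = raters, values = category assigned.
    import numpy as np
    from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

    ratings = np.array([
        [1, 1, 2],
        [2, 2, 2],
        [1, 1, 1],
        [3, 3, 2],
        [2, 2, 2],
    ])

    table, _ = aggregate_raters(ratings)   # subjects x categories count table
    print(fleiss_kappa(table, method="fleiss"))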

10
Example 1
  • Two raters
  • Scores 1 to 5
  • Unweighted kappa = 0.79, 95% CI (0.62 to 0.96)
  • Linear weighting = 0.84, 95% CI (0.70 to 0.98)
  • Quadratic weighting = 0.90, 95% CI (0.77 to 1.00)

11
Example 2
  • Binomial data
  • Two raters
  • Two ratings each
  • Inter-rater agreement
  • Intra-rater agreement

12
Example 2 ctd.
  • Inter-rater agreement
  • Kappa(1,2) = 0.865 (P < 0.001)
  • Kappa(1,3) = 0.054 (P = 0.765)
  • Kappa(2,3) = -0.071 (P = 0.696)
  • Intra-rater agreement
  • Kappa(1) = 0.800 (P < 0.001)
  • Kappa(2) = 0.790 (P < 0.001)
  • Kappa(3) = 0.000 (P = 1.000)

13
Continuous data
  • Test for bias
  • Check differences not related to magnitude
  • Calculate mean and SD of differences
  • Limits of agreement
  • Coefficient of variation
  • ICC

14
Test for bias
  • Student's paired t-test (mean)
  • Wilcoxon matched pairs test (median); see the sketch below
  • If there is bias, agreement cannot be
    investigated further
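A minimal sketch of both bias tests, assuming SciPy is available; method_a and method_b are hypothetical paired measurements, not the data behind Example 3.

    # Sketch: tests for bias between two paired sets of measurements (SciPy assumed).
    from scipy.stats import ttest_rel, wilcoxon

    method_a = [120, 118, 130, 125, 140, 135, 128, 132]  # hypothetical measurements
    method_b = [122, 117, 128, 127, 138, 137, 126, 134]  # same subjects, second method

    t_stat, p_t = ttest_rel(method_a, method_b)  # Student's paired t: mean difference = 0?
    w_stat, p_w = wilcoxon(method_a, method_b)   # Wilcoxon matched pairs: median difference = 0?
    print(p_t, p_w)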

15
Example 3 Test for bias
  • Paired t test
  • P = 0.362
  • No bias

16
Check differences unrelated to magnitude
  • Clearly no relationship between the differences and the size of the measurements

17
Calculate mean and SD of the differences
                         N     Mean     Std. Deviation
  Difference             17    4.9412   21.72404
  Valid N (listwise)     17
18
Limits of agreement
  • Lower limit of agreement (LLA) = mean - 1.96s = -37.6
  • Upper limit of agreement (ULA) = mean + 1.96s = 47.5
  • 95% of differences between a pair of measurements
    for an individual lie in (-37.6, 47.5); see the sketch below
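A minimal sketch that reproduces these limits from the mean and SD of the differences given on the previous slide.

    # Sketch: 95% limits of agreement from the summary statistics on slide 17.
    mean_diff = 4.9412     # mean of the 17 paired differences
    sd_diff = 21.72404     # SD of the differences

    lla = mean_diff - 1.96 * sd_diff   # lower limit of agreement, about -37.6
    ula = mean_diff + 1.96 * sd_diff   # upper limit of agreement, about  47.5
    print(round(lla, 1), round(ula, 1))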

19
Coefficient of variation
  • Measure of variability of differences
  • Expressed as a proportion of the average measured
    value
  • Suitable when error (the differences between
    pairs) increases with the measured values
  • Other measures require this not to be the case
  • CV = 100 × s / mean of the measurements
  • = 100 × 21.72 / 447.88
  • = 4.85% (see the sketch below)
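The same arithmetic as a minimal sketch, using the figures quoted on the slide (21.72 for the SD of the differences, 447.88 for the mean of the measurements).

    # Sketch: coefficient of variation of the differences, using the slide's figures.
    sd_diff = 21.72            # SD of the paired differences (slide 17)
    mean_measurement = 447.88  # mean of the measurements themselves

    cv = 100 * sd_diff / mean_measurement
    print(round(cv, 2))        # 4.85 (per cent)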

20
Intraclass Correlation
  • Continuous data
  • Two or more sets of measurements
  • Measure of correlation that adjusts for
    differences in scale
  • Several models
  • Absolute agreement or consistency
  • Raters chosen randomly or same raters throughout
  • Single or average measures (see the sketch below)
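A minimal sketch of the ICC, assuming the pingouin package (one of several libraries that provide it); the long-format data frame of subjects, raters and scores is hypothetical.

    # Sketch: intraclass correlation with the pingouin package (an assumption).
    import pandas as pd
    import pingouin as pg

    # Hypothetical long-format data: one row per (subject, rater) measurement.
    data = pd.DataFrame({
        "subject": [1, 1, 2, 2, 3, 3, 4, 4],
        "rater":   ["A", "B", "A", "B", "A", "B", "A", "B"],
        "score":   [10, 11, 14, 13, 20, 19, 8, 9],
    })

    # Returns all ICC models: single/average measures, absolute agreement or consistency.
    icc = pg.intraclass_corr(data=data, targets="subject", raters="rater", ratings="score")
    print(icc[["Type", "ICC", "CI95%"]])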

21
Intraclass Correlation
  • > 0.75 Excellent
  • 0.4 to 0.75 Fair to Good
  • < 0.4 Poor

22
Cronbach's α
  • Internal consistency
  • Total score made up of several components (see the sketch below)
  • α > 0.8 good
  • α > 0.7 adequate
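A minimal sketch computing α directly from its definition, assuming numpy; the item-score matrix is hypothetical (rows = respondents, columns = components of the test).

    # Sketch: Cronbach's alpha from the definitional formula (numpy assumed).
    import numpy as np

    items = np.array([
        [3, 4, 3, 5],
        [2, 2, 3, 3],
        [4, 5, 4, 4],
        [1, 2, 1, 2],
        [3, 3, 4, 4],
    ])

    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each component
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the total score

    alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)
    print(round(alpha, 3))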

23
Investigating agreement
  • Data type
  • Categorical
  • Kappa
  • Continuous
  • Limits of agreement
  • Coefficient of variation
  • Intraclass correlation
  • How are the data repeated?
  • Measuring instrument(s), rater(s), time(s)
  • Number of raters
  • Two
  • Straightforward
  • Three or more
  • Help!