Statistical Methods for Multicenter Interrater Reliability Study

1
Statistical Methods for Multicenter Inter-rater
Reliability Study
  • Jingyu Liu
  • Allergan, Inc., USA

2
Overview
  • Introduction
  • Multicenter inter-rater reliability
  • Statistical methods
  • References
  • Acknowledgement

3
Introduction
  • Can BOTOX improve arm or hand function in a
    patient with post-stroke spasticity?

4
Function Assessments and Validation
  • Identify function items
  • Identify the measurement scale
  • Are the function assessments reliable, sensitive,
    and clinically meaningful?
  • What is the minimum important change (MIC) of the
    assessment?
  • Assessment (scale) validation: reliability and
    validity
  • Inter-rater reliability
  • One of the key steps in clinical outcome
    assessment scale validation.

5
Multicenter inter-rater reliability
  • Inter-rater reliability study
  • In a clinical setting, an inter-rater
    reliability study evaluates the agreement
    among different raters (or physicians) who use
    the same assessment scale on each of the subjects
    (or patients) enrolled in the study.
  • Multicenter inter-rater reliability study
  • It is an inter-rater reliability study conducted
    in multiple clinical study centers (or sites).
  • Why conduct a multicenter inter-rater reliability
    study?
  • It is difficult to conduct a large inter-rater
    reliability study in a single clinical study
    center.
  • It is closer to an actual multicenter clinical
    study setting (preferred by some regulatory
    agencies).

6
Statistical Methods
  • We will focus on the following:
  • Intraclass correlation coefficient (ICC) based
    on the ANOVA method under a random effect model
  • A new approach to evaluating the
    inter-rater reliability
  • Nonparametric methods
  • Discussion of other methods

7
Statistical Methods
  • Assessments at the i-th study center
    (i = 1, 2, ..., a)

8
Statistical Methods
  • ANOVA method to evaluate the inter-rater
    reliability by an intraclass correlation
    coefficient (ICC) under a random effect model
  • Statistical inference on the ICC is
    straightforward.
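
The slide's own multicenter model and estimator are not preserved in this transcript. As a rough illustration of the general idea only, the sketch below computes the textbook one-way random-effects ICC from ANOVA mean squares for a single center with balanced data. The function name `icc_oneway` and the ratings are made up for this example.

```python
import numpy as np

def icc_oneway(ratings):
    """One-way random-effects ICC from ANOVA mean squares.

    Assumption: a single center, balanced data, no missing values.
    ratings: 2-D array-like, rows = subjects, columns = raters.
    """
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape                      # n subjects, k raters
    grand_mean = y.mean()
    subject_means = y.mean(axis=1)
    # Between-subject mean square
    msb = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    # Within-subject (between-rater plus error) mean square
    msw = np.sum((y - subject_means[:, None]) ** 2) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical scores: 4 subjects, each assessed by 2 raters
ratings = [[1, 2], [3, 4], [5, 6], [7, 8]]
icc = icc_oneway(ratings)
print(f"ICC = {icc:.4f}")
```

A multicenter analysis would need additional variance components (for example, center and rater-within-center effects), which this sketch does not attempt to model.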

9
Statistical Methods
  • The ICC is the Pearson correlation
    coefficient of the assessments made by two raters
    on the same subject
    [equation and its definitions not preserved in
    the transcript]
  • The Pearson correlation coefficient measures the
    linear relationship between two variables. It does
    not necessarily measure absolute agreement.
  • As noted by Lin [1] in the special case of two
    raters (or two assays), good agreement between
    the two raters requires that the plot of the
    assessments of the two raters falls closely on
    the 45° line through the origin.
  • The ICC is influenced by the subject variation.
    In an inter-rater reliability study, the subject
    variation itself is not of interest and the
    subjects are usually not randomly selected.
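
To make the distinction between linear association and absolute agreement concrete, here is a small sketch with made-up scores in which one rater scores every subject exactly 2 points higher than the other. It compares the Pearson correlation with Lin's concordance correlation coefficient [1], computed here with 1/n variances. The function names are chosen for this example only.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation: measures linear association only."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.corrcoef(x, y)[0, 1]

def lin_ccc(x, y):
    """Lin's concordance correlation coefficient (1/n variances)."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    sxy = np.mean((x - x.mean()) * (y - y.mean()))
    # The squared mean difference penalizes a systematic shift
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

# Hypothetical data: rater B is always 2 points above rater A
rater_a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
rater_b = rater_a + 2.0

r = pearson_r(rater_a, rater_b)
ccc = lin_ccc(rater_a, rater_b)
print(f"Pearson r = {r:.2f}, Lin's CCC = {ccc:.2f}")
```

The points lie on a straight line, so the Pearson correlation is perfect, but the line is not the 45° line through the origin, and the concordance coefficient drops accordingly.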

10
Statistical Methods
  • For an interval variable with domain [a, b],
    Liu [2] introduces a new agreement coefficient to
    evaluate the inter-rater reliability
    [equation, given in two equivalent forms, not
    preserved in the transcript]
  • Its terms are the theoretical range of the
    variable (i.e., b − a), the
    measurement variance of the raters, and
    a pre-defined reliability scale parameter.
  • The coefficient is invariant under a linear
    transformation. With this property we can compare
    the reliabilities of outcome assessment scales
    with different domains.
  • Most clinical outcome assessments are either
    interval variables or ordinal numerical variables.
    The coefficient is applicable to these variables.

11
Statistical Methods
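  • For a multicenter inter-rater reliability
    study, the coefficient can be estimated by
    [estimator and its definitions not preserved in
    the transcript]
  • When there are only two raters within each
    center, the estimator simplifies
    [equation not preserved in the transcript]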

12
Statistical Methods
  • Statistical inferences such as hypothesis tests,
    confidence intervals, and power calculations are
    obtained based on the results provided in Liu
    [2].
  • In the special case of two raters, the
    coefficient can be applied in evaluating the
    test-retest reliability or the intra-rater
    reliability.
  • In clinical research, we select
    [value not preserved in the transcript] in
    calculating the coefficient.
  • The coefficient can also be applied in more
    complicated situations such as unbalanced data
    and data with missing values.

13
Statistical Methods
  • Nonparametric approaches
  • When the distribution assumption or the model
    assumption is not appropriate, we can use a
    nonparametric approach to evaluate the
    inter-rater reliability.
  • For a single-center study, we can directly use
    Kendall's W.
  • For a multicenter study, Liu [3] introduced a new
    statistic which can be considered an extension
    of Kendall's W to evaluate the inter-rater
    reliability. Its distributional properties,
    hypothesis test, confidence interval, etc. are
    provided.
  • Other weighted approaches
  • Other Statistical methods
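
For the single-center case, Kendall's W can be computed from the rank sums of the subjects. The sketch below is a minimal version that assumes complete data with no tied scores within a rater, so no tie correction is applied. The function name and data are invented for illustration, and the multicenter extension of Liu [3] is not reproduced here.

```python
import numpy as np

def kendalls_w(ratings):
    """Kendall's coefficient of concordance W.

    Assumptions: complete data and no ties within any rater's scores.
    ratings: 2-D array-like, rows = subjects, columns = raters.
    """
    y = np.asarray(ratings, dtype=float)
    n, m = y.shape                              # n subjects, m raters
    # Rank the subjects separately within each rater (1 = lowest score)
    ranks = y.argsort(axis=0).argsort(axis=0) + 1
    rank_sums = ranks.sum(axis=1)
    # Sum of squared deviations of the rank sums from their mean
    s = np.sum((rank_sums - rank_sums.mean()) ** 2)
    return 12.0 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical scores: 4 subjects, 3 raters who order the subjects identically
same_order = [[10, 12, 11],
              [20, 19, 25],
              [30, 35, 28],
              [40, 38, 45]]
# Hypothetical scores: 2 raters who order the subjects in reverse
reversed_order = [[1, 4], [2, 3], [3, 2], [4, 1]]

w_same = kendalls_w(same_order)
w_reversed = kendalls_w(reversed_order)
print(w_same, w_reversed)
```

W depends only on the within-rater ranks, so raters who use different parts of the scale but order the subjects identically still reach W = 1.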

14
References
  • [1] Lin LI. A concordance correlation coefficient
    to evaluate reproducibility. Biometrics 1989;
    45: 255-268.
  • [2] Liu J. Measures of inter-rater reliability
    for interval data. (submitted for publication).
  • [3] Liu J. A nonparametric approach to evaluating
    multicenter inter-rater reliability. (submitted
    for publication).
  • [4] Shoukri M. Measures of Interobserver
    Agreement. Chapman & Hall/CRC, 2004.
  • [5] Fleiss JL. The Design and Analysis of Clinical
    Experiments. John Wiley & Sons, 1986.
  • [6] Schuck P. Assessing reproducibility for
    interval data in health-related quality of life
    questionnaires: Which coefficient should be used?
    Quality of Life Research 2004; 13: 571-586.
  • [7] Searle SR. Linear Models. John Wiley & Sons,
    1971.
  • [8] Kendall MG. A new measure of rank correlation.
    Biometrika 1938; 30: 81-93.
  • [9] Mehta C, Patel N. Proc-StatXact 4. CYTEL
    Software Corporation, 1999.
  • [10] Friedman M. The use of ranks to avoid the
    assumption of normality implicit in the analysis
    of variance. Journal of the American Statistical
    Association 1937; 32: 675-701.

15
Acknowledgement
  • Thanks to Allergan Biostatistics and
    BOTOX-Neurology for their support of this
    research. I also wish to thank the sponsors of
    the 2006 International Conference on Design of
    Experiments and Its Application for their
    support.