Assessing Responsiveness of Health Measurements

1
Assessing Responsiveness of Health Measurements
  • Ian McDowell,
  • INTA, Santiago, March 20, 2001

2
Link Purpose of Measure to Validation Method
  • For example:
  • In a diagnostic instrument, inter-rater and
    test-retest reliability are important
  • For an evaluative measure, internal consistency
    is paramount.
  • For a prognostic or diagnostic instrument,
    criterion validity is relevant
  • For an evaluative measure, construct validation
    is central.

3
Responsiveness
  • For outcome measures, sensitivity to change is a
    crucial characteristic
  • Responsiveness refers to how sensitive a
    measure is in indicating change over time or
    contrast between groups
  • Normally considered an element of validity for an
    evaluative measurement.

4
(Responsiveness, cont'd)
  • There is little consensus over how responsiveness
    should be assessed.
  • This may be because responsiveness requires a
    finer breakdown than is normally given
  • Different facets of responsiveness are relevant
    to different types of measure.

5
Conceptions of Responsiveness
  • The smallest change that could potentially be
    detected
  • The smallest change that could reliably be
    detected beyond error
  • The change typically observed in a population
  • The change observed in the subset of the
    population judged to have changed
  • The change seen in those judged to have made an
    important change.

6
Preliminary Decisions (Before We Begin!)
  • What parameter is to be measured? (Pain, QoL,
    etc.)
  • Whose perspective is important: the patient's,
    the clinician's, or society's?
  • What if these conflict?
  • Responsive to what? Differences between groups,
    changes within a group over time, or comparison
    of changes over time between two groups?
  • Unit of analysis? (Average scores, or individual
    classification such as a diagnosis?)

7
Approaches to Estimating Responsiveness
1. Theoretical (equivalent to content validity)
2. Empirical: internal evaluation (equivalent to
concurrent validity)
3. Empirical: external comparison (equivalent to
criterion validity)

8
1. Modeling Approach
  • Content should reflect the types of change
    expected to occur with the therapy: states, not
    traits
  • There should be no floor or ceiling effects
  • Scoring must ensure that change is not diluted in
    other factors that do not vary
  • Scale must have fine enough gradation

9
2. Internal Empirical Approach
  • Apply the scale before and after treatment, then
    calculate an effect size statistic
  • Because measurement scales vary, results are
    expressed in standard deviation units:
    (Mt - Mc)/SDc
  • The effect size is comparable to a z score: if the
    distribution is normal, it indicates how many
    percentiles a patient will move following
    treatment.
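
A minimal sketch of the calculation above, assuming Mt and Mc are the means of the treated (or post-treatment) and comparison (or baseline) scores and SDc is the sample standard deviation of the comparison scores. The function names and the data are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist, mean, stdev


def effect_size(treated, comparison):
    """(Mt - Mc) / SDc: mean difference in comparison-group SD units."""
    return (mean(treated) - mean(comparison)) / stdev(comparison)


def percentile_after(es, start_percentile=50.0):
    """Percentile reached by a patient who starts at start_percentile
    (strictly between 0 and 100) and shifts by `es` standard deviations,
    assuming normally distributed scores."""
    z_start = NormalDist().inv_cdf(start_percentile / 100.0)
    return 100.0 * NormalDist().cdf(z_start + es)


# Hypothetical scores, invented for illustration only.
comparison = [10, 12, 14, 16, 18]
treated = [13, 15, 17, 19, 21]

es = effect_size(treated, comparison)
print(f"effect size: {es:.2f}")
print(f"a median patient moves to percentile {percentile_after(es):.1f}")
```

Under the normality assumption, an effect size of 1.0 moves a patient from the 50th percentile to roughly the 84th.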

10
Effect Size Statistics
  • 1. Use a t-test and report statistically
    significant differences as indicators of
    responsiveness
  • 2. Remove the n from the denominator to make the
    statistic independent of sample size
  • 3. Denominator can be SD of the baseline scores,
    or of scores among stable subjects, or of change
    in scores.
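
The three options above can be compared side by side. This is a sketch under assumptions: the scores are paired before/after measurements on the same patients, the t statistic is the paired t, and the function name, dictionary keys, and data are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def responsiveness_ratios(before, after, stable_scores=None):
    """Mean change divided by each candidate denominator."""
    change = [a - b for a, b in zip(after, before)]
    mean_change = mean(change)
    sd_change = stdev(change)
    ratios = {
        # Paired t statistic: depends on the sample size n.
        "paired_t": mean_change / (sd_change / sqrt(len(change))),
        # Same ratio with n removed from the denominator.
        "change_sd": mean_change / sd_change,
        # Denominator is the SD of the baseline scores.
        "baseline_sd": mean_change / stdev(before),
    }
    if stable_scores is not None:
        # Denominator is the SD of scores among stable subjects.
        ratios["stable_sd"] = mean_change / stdev(stable_scores)
    return ratios


# Hypothetical data, invented for illustration only.
before = [10, 12, 14, 16, 18]
after = [12, 15, 15, 19, 19]
ratios = responsiveness_ratios(before, after, stable_scores=[11, 13, 15, 17])
for name, value in ratios.items():
    print(f"{name}: {value:.3f}")
```

The same mean change yields quite different values depending on the denominator, so a reported effect size is only interpretable when the denominator is stated.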

11
Effect Size Statistics (2)
  • Refinements include correction for level of
    reliability. E.g., Wyrwich proposed using the
    standard error of measurement (SEM) in the
    denominator
  • SEM = SD1 × sqrt(1 - alpha)
  • However, a high alpha does not ensure
    responsiveness if the measure includes
    inter-correlated traits that do not change.
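
A sketch of the reliability-corrected effect size, taking the SEM to be the standard error of measurement, SD1 × sqrt(1 - alpha), where SD1 is the baseline standard deviation. That reading of the slide's formula is an assumption, and the function names are hypothetical. The loop uses a difference of 1.5 and an SD of 3.

```python
from math import sqrt


def standard_error_of_measurement(sd_baseline, alpha):
    """SEM = SD1 * sqrt(1 - alpha), with alpha the reliability coefficient."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    return sd_baseline * sqrt(1.0 - alpha)


def sem_effect_size(difference, sd_baseline, alpha):
    """Score difference expressed in SEM units instead of SD units."""
    return difference / standard_error_of_measurement(sd_baseline, alpha)


difference, sd_baseline = 1.5, 3.0
print(f"uncorrected effect size: {difference / sd_baseline:.2f}")
for alpha in (0.5, 0.6, 0.7, 0.8, 0.9):
    es = sem_effect_size(difference, sd_baseline, alpha)
    print(f"alpha = {alpha:.1f}  SEM-based effect size = {es:.2f}")
```

The SEM-based value grows as alpha rises, which is why a high alpha can inflate apparent responsiveness when the scale includes stable, inter-correlated traits.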

12
[Figure: Impact of including alpha in the effect size calculation (at a difference of 1.5 and SD of 3). Axes: effect size and alpha.]

13
Comment: Effect Sizes
  • Useful for comparing responsiveness of different
    health measures
  • Helpful in calculating the power of a study
  • However:
  • Formulae seem somewhat arbitrary
  • Effect sizes offer no indication of the clinical
    change represented by a given shift in scores
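
To illustrate the link between effect size and study power, here is a sketch using the common normal-approximation formula for a two-sided comparison of two group means, n per group = 2 (z(1 - alpha/2) + z(power))² / ES². The formula choice, the function name, and the default significance level and power are assumptions, not taken from the slides. Note that `alpha` here is the significance level, not the reliability coefficient.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group
    comparison of means, using the normal approximation."""
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2.0 * (z_alpha + z_power) ** 2 / effect_size ** 2)


for es in (0.2, 0.5, 1.0):
    print(f"effect size {es}: about {n_per_group(es)} patients per group")
```

A more responsive measure, one that yields a larger effect size for the same treatment, needs fewer patients to reach the same power.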

14
The MID as a Criterion
  • Introduces theme of Minimally Important
    Difference (MID) and its cousin, the MCID.
  • MCID: the smallest difference in score in the
    domain of interest which patients perceive as
    beneficial and which would mandate, in the
    absence of troublesome side effects and excessive
    cost, a change in the patient's management
  • Estimate internally (using scale itself), or
    externally (using some other criterion)

15
Setting Internal MIDs
  • 1. Apply the measurement; select the change
    threshold seen as important by clinical experts.
    How much would the outcome have to change before
    they would alter treatment?
  • 2. Present clinicians with written scenarios and
    compare each with the previous one. MCID =
    average difference in scores between pairs rated
    as "a little less" or "a little more".
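
The second method can be sketched as follows. The data layout (previous score, current score, rating), the use of absolute differences, the function name, and the example values are all assumptions made for illustration.

```python
from statistics import mean

# Ratings that count as a small but noticeable change.
SMALL_CHANGE = {"a little less", "a little more"}


def mcid_from_pairs(pairs):
    """Average absolute score difference across scenario pairs rated as
    'a little less' or 'a little more'.

    pairs: iterable of (previous_score, current_score, rating) tuples.
    """
    diffs = [abs(current - previous)
             for previous, current, rating in pairs
             if rating in SMALL_CHANGE]
    if not diffs:
        raise ValueError("no pairs were rated as a small change")
    return mean(diffs)


# Hypothetical ratings, invented for illustration only.
pairs = [
    (40, 44, "a little more"),
    (44, 41, "a little less"),
    (41, 41, "about the same"),
    (41, 55, "much more"),
    (55, 50, "a little less"),
]
print(f"estimated MCID: {mcid_from_pairs(pairs)} scale points")
```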

16
Externally-Based MIDs
  • Clinicians view patient scenarios and rate
    whether they changed significantly or not.
  • Patients can judge the change in their own
    condition: "no change", "a little better", etc.
  • Alternatively, clinically assess patients, then
    randomly assign pairs of them to hold
    conversations about their illness, leading to
    ratings of whether they were better than the
    other, much better, etc.

17
3. External Criteria for Responsiveness
  • 1. Establish MID or MCID. Group patients who
    improve (or deteriorate) by more than the MID and
    compare them to the rest using the measure
  • 2. Various statistics:
      • Sensitivity, specificity, ROC curves
      • Point-biserial correlations
      • Regression to analyse average scale change on
        the measure for each MCID unit change
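
A sketch of three of these statistics, assuming each patient has a change score on the measure and a 0/1 flag from the external criterion (1 = changed by more than the MID). The function names and data are hypothetical. The area under the ROC curve is computed by pairwise comparison, which is equivalent to the Mann-Whitney statistic.

```python
from math import sqrt
from statistics import mean, pstdev


def sensitivity_specificity(changes, improved, cutoff):
    """Classify change >= cutoff as 'improved' and compare to the criterion."""
    pos = [c for c, y in zip(changes, improved) if y]
    neg = [c for c, y in zip(changes, improved) if not y]
    sensitivity = sum(c >= cutoff for c in pos) / len(pos)
    specificity = sum(c < cutoff for c in neg) / len(neg)
    return sensitivity, specificity


def roc_auc(changes, improved):
    """Probability that an improved patient has a larger change score
    than a non-improved patient (ties count one half)."""
    pos = [c for c, y in zip(changes, improved) if y]
    neg = [c for c, y in zip(changes, improved) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def point_biserial(changes, improved):
    """Correlation between the 0/1 criterion and the change score."""
    pos = [c for c, y in zip(changes, improved) if y]
    neg = [c for c, y in zip(changes, improved) if not y]
    p = len(pos) / len(changes)
    return (mean(pos) - mean(neg)) / pstdev(changes) * sqrt(p * (1.0 - p))


# Hypothetical data, invented for illustration only.
changes = [8, 6, 5, 3, 4, 2, 1, 0]
improved = [1, 1, 1, 1, 0, 0, 0, 0]

print("sensitivity, specificity at cutoff 3:",
      sensitivity_specificity(changes, improved, cutoff=3))
print(f"ROC area: {roc_auc(changes, improved):.4f}")
print(f"point-biserial r: {point_biserial(changes, improved):.3f}")
```

Running the same functions on change scores from several instruments, scored against one shared criterion, gives a direct comparison of their responsiveness.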

18
[Figure: ROC curve for 3 instruments (SF-36, AIMS2, HAQ) in detecting an MCID. Vertical axis: sensitivity (true positives), 0.0 to 1.0; horizontal axis: 1 - specificity (false positives), 0.0 to 1.0.]

19
Questions for Discussion
  • Are MIDs constant across range? (next slide)
  • How can we encourage people to routinely report
    before and after changes in scores, SD, and
    alpha?
  • Should we apply a measure to standard scenarios
    to get X1 and X2, and use these to simulate the
    effect size?
  • How does this all apply to nutritional
    assessments?

20
[Figure: Notional size of an MCID at various levels of overall health. Vertical axis: size of change (none to large); horizontal axis: health status (poor to good); series labelled "Cognition?" and "Physical function?".]