
1
Measurement Issues in Health Disparities
Research
  • Anita L. Stewart, Ph.D.
  • University of California, San Francisco
  • Clinical Research with Diverse Communities
  • EPI 222, Spring
  • April 17, 2008

2
Background
  • U.S. population becoming more diverse
  • More minority groups are being included in
    research due to
  • NIH mandate
  • Recent health disparities initiatives

3
Types of Diverse Groups
  • Health disparities research focuses on
    differences in health between the following
    groups
  • Minority vs. non-minority
  • Low income vs. others
  • Low education vs. others
  • Limited English skills vs. others
  • ... and others

4
Health Disparities Research
  • Describe health disparities
  • Health differences across various diverse groups
  • Identify mechanisms by which health disparities
    occur
  • Individual level
  • Environmental level
  • Intervene to reduce health disparities

5
Health Care Disparities
  • Differential access to and quality of health care
    is well known
  • Thus, health care disparities become a plausible
    mechanism for health disparities
  • Understanding determinants of health care
    disparities is also of interest

6
Types of Self-Report Measures Needed
  • Measures of health, and of various mechanisms for
    disparities
  • Class 4 will present numerous mechanisms
  • Examples from this class: sense of control,
    self-efficacy for managing disease,
    health-related quality of life for various health
    conditions

7
Measurement Implications of Research in Diverse
Groups
  • Most self-reported measures were developed and
    tested in mainstream, well-educated groups
  • Subgroup analysis of measures has been rare
  • Thus, little information is available on
    appropriateness, reliability, validity, and
    responsiveness in minority and other diverse
    groups

8
The Measurement Goal: Identify Measures That Can
Be Used
  • To compare diverse groups
  • To study mechanisms within any particular
    diverse group

9
Group Comparisons are the Most Problematic
  • Disparities research involves comparing mean
    levels of health or its determinants
  • Requires equivalent concepts and measures
  • Potential true differences may be obscured
  • Observed group differences may be inaccurate

10
Alternative Explanations for Observed Group
Differences
  • Observed group mean differences in a measure can
    be due to
  • culturally- or group-mediated differences in true
    score (true differences) -- OR --
  • bias - systematic differences between group
    observed scores not attributable to true scores

11
Bias - A Special Concern
  • Measurement bias in any one group may make group
    comparisons invalid
  • Bias can be due to group differences in
  • the meaning of concepts or items
  • the extent to which a measure represents a
    concept
  • cognitive processes of responding
  • use of response scales
  • appropriateness of data collection methods

12
Example of Effect of Biased Items
  • 5 CES-D items administered to black and white men
  • 1 item subject to differential item functioning
    (bias)
  • 5-item scale including item suggested that black
    men had higher levels of somatic symptoms than
    white men (p < .01)
  • 4-item scale excluding biased item showed no
    differences between black and white men

S Gregorich, Med Care, 2006;44:S78-S94.
13
Bias or Systematic Difference?
  • Bias refers to deviation from true score
  • Cannot speak of a measure being biased in one
    group compared to another w/o knowing true score
  • Preferred term: differential item functioning
  • Item (or measure) that has a different meaning in
    one group than another

14
Typical Sequence of Developing New Self-Report
Measures
Develop concept
Create item pool
Pretest/revise
Field survey
Psychometric analyses
Final measures
16
Extra Steps in Sequence of Developing New
Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
Pretest/revise
Field survey
Psychometric analyses
Final measures
17
Extra Steps in Sequence of Developing New
Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
.. to reflect these perspectives
Pretest/revise
Field survey
Psychometric analyses
Final measures
18
Extra Steps in Sequence of Developing New
Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
.. to reflect these perspectives
.. in all diverse groups
Pretest/revise
Field survey
Psychometric analyses
Final measures
19
Extra Steps in Sequence of Developing New
Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
.. to reflect these perspectives
.. in all diverse groups
Pretest/revise
Field survey
.. in all diverse groups
Psychometric analyses
Final measures
20
Extra Steps in Sequence of Developing New
Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
.. to reflect these perspectives
.. in all diverse groups
Pretest/revise
Field survey
.. in all diverse groups
Measurement studies across groups
Psychometric analyses
Final measures
21
Extra Steps in Sequence of Developing New
Self-Report Measures for Diverse Groups
Obtain perspectives of diverse groups
Develop concept
Create item pool
.. to reflect these perspectives
.. in all diverse groups
Pretest/revise
Field survey
.. in all diverse groups
If results are non-equivalent
Psychometric analyses
Final measures
22
Measurement Adequacy vs. Measurement Equivalence
  • Making group comparisons requires conceptual and
    psychometric adequacy and equivalence
  • Adequacy - within a diverse group
  • concepts are appropriate
  • psychometric properties meet minimal criteria
  • Equivalence - between diverse groups
  • conceptual and psychometric properties are
    comparable

23
Why Not Use Culture-Specific Measures?
  • Measurement goal - identify measures that can be
    used across all groups, yet maintain sensitivity
    to diversity and have minimal bias
  • Most health disparities studies require comparing
    mean scores across diverse groups
  • need comparable measures

24
Conceptual and Psychometric Adequacy and
Equivalence
                 Adequacy in 1 Group            Equivalence Across Groups
Conceptual       Concept meaningful within      Concept equivalent across
                 one group                      groups
Psychometric     Psychometric properties meet   Psychometric properties
                 minimal standards within       invariant (equivalent)
                 one group                      across groups
25
Left Side of Matrix: Issues in a Single Group
26
Right Side of Matrix: Issues in More Than One Group
28
Conceptual Adequacy in One Group
  • Is concept relevant, meaningful, and acceptable
    to a diverse group?
  • Traditional research
  • Conceptual adequacy: simply defining a concept
  • Mainstream population assumed
  • Minority and health disparities research
  • Mainstream concepts may be inadequate
  • Concept should correspond to how a particular
    group thinks about it

29
Qualitative Approaches to Explore Conceptual
Adequacy in Diverse Groups
  • Literature reviews
  • ethnographic and anthropological
  • In-depth interviews and focus groups
  • discuss concepts, obtain their views
  • Expert consultation from diverse groups
  • review concept definitions
  • rate relevance of items

30
Example of Inadequate Concept
  • Patient satisfaction typically conceptualized in
    mainstream populations in terms of, e.g.,
  • access, technical care, communication,
    continuity, interpersonal style
  • In minority and low income groups, additional
    relevant domains include, e.g.,
  • discrimination by health professionals
  • sensitivity to language barriers

31
Method for Examining Conceptual Relevance
  • Compiled set of 33 HRQL items spanning many
    concepts
  • Assessed relevance to older African Americans
  • After answering each question, respondents were asked, "How
    relevant is this question to the way you think about your
    health?"
  • Response scale: 0-10, with endpoints labeled
  • Labels: 0 = not at all relevant, 10 = extremely
    relevant

Cunningham WE et al., Qual Life Res,
1999;8:749-768.
32
Results: Conceptual Relevance
  • Most relevant items
  • Spirituality (3 items)
  • Weight-related health (2 items)
  • Hopefulness (1 item)
  • Spirituality items
  • importance of spirituality to well-being, level
    of spirituality, being sick affected spirituality

33
Results: Conceptual Relevance
  • Least relevant items
  • Physical functioning
  • Role limitations due to emotional problems
  • All standard MOS measures ranked in the lower
    2/3, including all SF12 items

34
Conceptual Relevance of Spanish FACT-G
  • Bilingual/bicultural expert panel reviewed all 28
    items for relevance
  • One item had low cultural relevance to quality of
    life
  • One concept was missing: spirituality
  • Developed new spirituality scale (FACIT-Sp) with
    input from cancer patients, psychotherapists, and
    religious experts
  • Sample item: "I worry about dying"

Cella D et al. Med Care 1998;36:1407
35
Psychometric Adequacy in One Group
36
Psychometric Adequacy in any Group
  • Minimal standards
  • Sufficient variability
  • Minimal missing data
  • Adequate reliability/reproducibility
  • Evidence of construct validity
  • Evidence of responsiveness to change
  • Basic classical test theory approach

37
Evidence of Psychometric Inadequacy of Measure in
Various Diverse Groups
  • SF-36 social functioning scale - internal
    consistency reliability < .70 in three different
    samples
  • Chinese language, adults aged 55-96 years
  • Japanese language, Japanese elders
  • English, Pima Indians

Stewart AL, Nápoles-Springer A, Med Care,
2000;38(9 Suppl):II-102
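
The internal consistency criterion above can be checked within each group separately. The following is a minimal sketch, assuming item responses are stored as one row per respondent with no missing values; the function name `cronbach_alpha` and the toy data are illustrative only and do not come from the studies cited.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Assumes complete data (no missing items) and at least two items.
    """
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (columns) and of the summed scale score.
    item_variances = [variance(column) for column in zip(*rows)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical two-item responses for one group (not real study data).
group_rows = [[1, 2], [2, 1], [3, 4], [4, 3]]
alpha = cronbach_alpha(group_rows)
print(f"alpha = {alpha:.2f}; meets .70 criterion: {alpha >= 0.70}")
```

Running the same function on each group's rows gives a per-group check of whether reliability meets the minimal standard.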
38
Conceptual Equivalence Across Groups
39
Conceptual Equivalence
  • Is the concept relevant, familiar, acceptable to
    all diverse groups being studied?
  • Is the concept defined the same way in all
    groups?
  • all relevant domains included (none missing)
  • interpreted similarly
  • Is the concept appropriate for all diverse groups?

40
Generic/Universal vs. Group-Specific (Etic versus
Emic)
  • Concepts unlikely to be defined exactly the same
    way across diverse ethnic groups
  • Generic/universal (etic)
  • features of a concept that are appropriate across
    groups
  • Group-Specific (emic)
  • idiosyncratic or culture-specific portions of a
    concept

41
Etic versus Emic (cont.)
  • Goal in health disparities research on more than
    one group
  • identify generic/universal portion of a concept
    (could be entire concept) that can be applied
    across all groups
  • For within-group analyses or studies
  • the culture-specific portion is also relevant
  • Same as examining conceptual adequacy within
    one group

42
Approaches Similar to Those for Conceptual
Adequacy
  • Main difference: need to ensure the concept is
    equivalent across groups
  • Additional criterion
  • What do we mean by equivalent conceptually?
  • Methods are poorly developed

43
Obtain Perspective of All Diverse Groups on
Concept
Obtain perspectives of diverse groups
Develop concept
Create item pool
Pretest/revise
Field survey
Psychometric analyses
Final measures
44
Example: Develop Concept of Interpersonal
Processes of Care
  • Began with conceptual framework from literature
    and psychometric studies of preliminary survey
  • IPC Version I Conceptual Framework
  • Three major multi-dimensional categories
  • Communication
  • Decision-making
  • Interpersonal Style

45
IPC Version I Subdomains
  • Communication
  • Elicitation of concerns, explanations,
    general clarity
  • Decision-making
  • Involving patients in decisions
  • Interpersonal Style
  • Respectfulness, emotional support,
    non-discrimination, cultural sensitivity

Stewart et al., Milbank Quarterly, 1999;77:305
46
Limitations of First IPC Framework
  • Tested on small sample of 600 patients from San
    Francisco General Hospital
  • Several hypothesized concepts were not confirmed
  • e.g., cultural sensitivity
  • Needed further development and validation on a
    larger sample

47
Developed Revised IPC Concept
  • IPC Version I framework in Milbank Quarterly
  • 19 focus groups: African American, Latino, and
    White adults
  • Literature review of quality of care in diverse
    groups
  • Draft IPC II conceptual framework
48
IPC-II Conceptual Framework
I. COMMUNICATION
  • General clarity
  • Elicitation/responsiveness
  • Explanations of processes, condition,
    self-care, meds
  • Empowerment
II. DECISION MAKING
  • Responsive to patient preferences
  • Consider ability to comply
III. INTERPERSONAL STYLE
  • Respectfulness
  • Courteousness
  • Perceived discrimination
  • Emotional support
  • Cultural sensitivity
49
IPC-II Conceptual Framework (cont)
IV. OFFICE STAFF
  • Respectfulness
  • Discrimination
V. FOR LIMITED ENGLISH PROFICIENCY PATIENTS
  • MDs' and office staff's sensitivity to language
51
Psychometric Equivalence
  • Measures have similar measurement properties in
    all diverse groups of interest in your study
  • e.g., English and Spanish language, African
    Americans and Caucasians
  • Measures have similar measurement properties in
    one diverse group as in original (mainstream)
    groups on which the measures were developed

52
Equivalence of Reliability?? No!
  • Difficult to compare reliability because it
    depends on the distribution of the construct in a
    sample
  • Thus, lower reliability in one group may simply
    reflect less variability in that group
  • More important is the adequacy of the reliability
    in both groups
  • Reliability meets minimal criteria within each
    group

53
Equivalence of Criterion Validity
  • Determine if hypothesized patterns of
    associations with specified criteria are
    confirmed in both groups, e.g.
  • a measure predicts utilization in both groups
  • a cutpoint on a screening measure has the same
    specificity and sensitivity in both groups
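
One way to check the screening-cutpoint criterion above is to compute sensitivity and specificity separately in each group. This is a minimal sketch under stated assumptions: scores at or above the cutpoint count as screen-positive, a criterion ("gold standard") status is available for every respondent, and each group contains both cases and non-cases. The function name, cutpoint, and data are hypothetical.

```python
def sensitivity_specificity(scores, has_condition, cutpoint):
    """Return (sensitivity, specificity) for the rule: score >= cutpoint is positive.

    Assumes the group contains at least one case and one non-case.
    """
    tp = fn = tn = fp = 0
    for score, condition in zip(scores, has_condition):
        positive = score >= cutpoint
        if condition and positive:
            tp += 1
        elif condition:
            fn += 1
        elif positive:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores and criterion status for two groups.
groups = {
    "group_a": ([18, 20, 10, 8], [True, True, False, False]),
    "group_b": ([18, 12, 17, 8], [True, True, False, False]),
}
for name, (scores, status) in groups.items():
    sens, spec = sensitivity_specificity(scores, status, cutpoint=16)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

If the two groups show clearly different values at the same cutpoint, the cutpoint is not functioning equivalently across them.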

54
Equivalence of Construct Validity
  • Are hypothesized patterns of associations
    confirmed in both groups?
  • Example: Scores on the Spanish version of the
    FACT had similar relationships with other health
    measures as scores on the English version
  • Primarily tested through subjectively examining
    pattern of correlations
  • Can test differences using confirmatory factor
    analysis (e.g., through Structural Equation
    Modeling)

55
Item Equivalence
  • Differential Item Functioning (DIF)
  • Items are non-equivalent if they are
    differentially related to the underlying trait
  • Equivalence indicated by no DIF
  • Meaning of response categories is similar across
    groups
  • Distance between response categories is similar
    across groups

56
Equivalence of Response Choices: Spanish and
English Self-rated Health
  • Excellent / Excelente
  • Very good / Muy buena
  • Good / Buena
  • Fair / Regular
  • Poor / Mala

"Regular" in Spanish may be closer to "good" in
English, and thus is not comparable in meaning to
"fair"
57
Spanish and English Self-rated Health Responses
  • Excellent / Excelente
  • Very good / Muy buena
  • Good / Buena
  • Fair / Regular (Pasable?)
  • Poor / Mala

Another choice, "pasable," may be closer in
meaning to "fair"
58
Methods for Identifying Differential Item
Functioning (DIF)
  • Item Response Theory (IRT)
  • Examines each item in relation to underlying
    latent trait
  • Tests if responses to one item predict the
    underlying latent score similarly in two groups
  • if not, items have differential item functioning
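
The slide names IRT as the method. As a simpler illustration of the same idea (does an item behave the same way in two groups once respondents are matched on the underlying trait?), the sketch below uses the Mantel-Haenszel common odds ratio instead, which is a different, non-IRT screen for DIF on a dichotomous item. Matching on a stratum such as the total or rest score is an assumption, and all names and data are hypothetical.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records, reference="ref"):
    """Common odds ratio of endorsing an item, reference vs. focal group.

    records: iterable of (stratum, group, endorsed) tuples, where stratum is
    the matching variable (e.g., a score band) and endorsed is True/False.
    A value near 1.0 suggests no DIF; values far from 1.0 suggest DIF.
    """
    # Per stratum: [ref endorsed, ref not, focal endorsed, focal not]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for stratum, group, endorsed in records:
        if group == reference:
            index = 0 if endorsed else 1
        else:
            index = 2 if endorsed else 3
        tables[stratum][index] += 1

    numerator = denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined for these tables")
    return numerator / denominator


def make_stratum(stratum, a, b, c, d):
    """Build hypothetical records from 2x2 cell counts for one stratum."""
    return ([(stratum, "ref", True)] * a + [(stratum, "ref", False)] * b
            + [(stratum, "focal", True)] * c + [(stratum, "focal", False)] * d)


balanced = make_stratum(1, 10, 10, 10, 10) + make_stratum(2, 15, 5, 15, 5)
print(mantel_haenszel_odds_ratio(balanced))
```

An IRT-based analysis would instead fit item parameters in each group and compare them; the matching logic is the part the two approaches share.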

59
Equivalence of Factor Structure
  • Factor structure is similar in new group to
    structure in original groups in which measure was
    tested
  • In other words, the measurement model is the same
    across groups
  • Methods
  • Specify the number of factors you are looking for
  • Determine if the hypothesized model fits the data

60
Confirmatory Factor Analysis (CFA)
  • Can specify a hypothesized structure a priori
  • Can test mean and covariance structures
  • to estimate bias

61
Equivalence of Factor Structure: Testing
Psychometric Invariance
  • Psychometric invariance (equivalence)
  • Important properties of theoretically-based
    factor structure (measurement model) do not vary
    across groups (are invariant)
  • measurement model is the same across groups
  • Empirical comparison across groups
  • Not simply by examination

62
Criteria for Psychometric Invariance
  • Across all groups: a sequential process
  • Same number of factors or dimensions
  • Same items on same factors
  • Same factor loadings
  • No bias on any item across groups
  • Same residuals on items
  • No item or scale bias AND same residuals
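
Because the criteria are tested in sequence, a later criterion only counts if every earlier one held. The sketch below shows just that bookkeeping, assuming the pass/fail result of each nested model comparison has already been obtained elsewhere (for example, from multi-group confirmatory factor analysis). The level names follow the technical terms used in this lecture; the function name and inputs are hypothetical.

```python
# Ordered from weakest to strongest criterion.
LEVELS = ["dimensional", "configural", "metric", "scalar", "residual"]


def highest_invariance_level(passed):
    """Return the strongest invariance level supported by sequential tests.

    passed: dict mapping level name to True/False. Testing stops at the first
    failed (or untested) level. "strict" is returned only when both the
    scalar and residual criteria hold, along with everything before them.
    """
    reached = "none"
    for level in LEVELS:
        if not passed.get(level, False):
            return reached
        reached = level
    return "strict"


# Hypothetical result: loadings are equal across groups, but items are biased.
print(highest_invariance_level(
    {"dimensional": True, "configural": True, "metric": True, "scalar": False}
))
```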

63
Criteria for Evaluating Invariance Across Groups:
Technical Terms
  • Dimensional Invariance: Same number of factors
  • Configural Invariance: Same items load on same
    factors
  • Metric or Factor Pattern Invariance: Items have
    same loadings on same factors
  • Scalar or Strong Factorial Invariance: Observed
    scores are unbiased
  • Residual Invariance: Observed item and factor
    variances are unbiased
  • Strict Factorial Invariance: Both scalar and
    residual criteria are met
64
Interpersonal Processes of Care (IPC)
  • Social-psychological aspects of the
    patient-physician interaction
  • communication, respectfulness, patient-centered
    decision-making, and being sensitive to patients
    needs
  • Developed survey of 92 items based on principles
    outlined above

65
Conducted Survey
  • From over 16,000 primary care patients, randomly
    sampled those who
  • Made at least one visit in prior 12 months
  • Records indicated they were African American,
    Latino, or White (Caucasian)
  • Sampled within race/ethnic group

66
Sample Size (N = 1,664)
  • 383 Spanish speaking Latino
  • 435 African American
  • 428 English speaking Latino
  • 418 White

67
Results
  • Of the 92 items, 29 had similar factor structure
    across all 4 groups
  • achieved metric invariance

68
Results: Metric Invariance Across 4 Groups for 29
Items
  • Dimensional Invariance: Same number of factors
  • Configural Invariance: Same items load on same
    factors
  • Metric or Factor Pattern Invariance: Items have
    same loadings on same factors
  • Strong Factorial or Scalar Invariance: Observed
    scores are unbiased
  • Residual Invariance: Observed item and factor
    variances can be compared across groups
  • Strict Factorial Invariance: Both scalar
    invariance and residual invariance criteria are
    met
69
Seven Metric Invariant Scales (29 items)
I. COMMUNICATION
  • Hurried communication
  • Elicited concerns, responded
  • Explained results, medications
II. DECISION MAKING
  • Patient-centered decision-making
III. INTERPERSONAL STYLE
  • Compassionate, respectful
  • Discriminated
  • Disrespectful office staff
70
Continued Exploration of Invariance: Item Bias
  • Tested invariance of model parameter estimates
    across groups for scalar invariance
  • Bias in items

71
Obtained Partial Scalar Invariance Across 4
Groups for 18 Items
  • Dimensional Invariance: Same number of factors
  • Configural Invariance: Same items load on same
    factors
  • Metric or Factor Pattern Invariance: Items have
    same loadings on same factors
  • Strong Factorial or Scalar Invariance: Observed
    scores are unbiased
  • Residual Invariance: Observed item and factor
    variances can be compared across groups
  • Strict Factorial Invariance: Both scalar
    invariance and residual invariance criteria are
    met
72
Seven Scalar Invariant (Unbiased) Scales (18
items)
I. COMMUNICATION
  • Hurried communication: lack of clarity
  • Elicited concerns, responded
  • Explained results, medications: explained results
II. DECISION MAKING
  • Patient-centered decision-making: decided
    together
III. INTERPERSONAL STYLE
  • Compassionate, respectful (subset): compassionate,
    respectful
  • Discriminated: discriminated due to race/ethnicity
  • Disrespectful office staff
73
What to do if Measures Are Not Equivalent in a
Specific Study Comparing Groups
  • Need guidelines for how to handle data when
    substantial non-comparability is found in a study
  • Drop bad or biased items from scores
  • Compare results with and without biased items
  • Analyze study by stratifying diverse groups
  • The current challenge for measurement in minority
    health studies
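
The "with and without biased items" comparison above can be run as a simple sensitivity analysis. This is a minimal sketch, assuming complete item-level data, a summed scale score, and that the suspect item has already been flagged by a DIF analysis; the function names and the toy responses are hypothetical.

```python
from statistics import mean


def mean_scale_score(rows, drop_item=None):
    """Mean summed score across respondents, optionally omitting one item index."""
    return mean(
        sum(value for index, value in enumerate(row) if index != drop_item)
        for row in rows
    )


def group_difference(group_a, group_b, drop_item=None):
    """Difference in mean scale score (group_b minus group_a)."""
    return mean_scale_score(group_b, drop_item) - mean_scale_score(group_a, drop_item)


# Hypothetical 5-item responses; the last item (index 4) is the flagged one.
group_a = [[1, 1, 1, 1, 1], [2, 2, 2, 2, 1]]
group_b = [[1, 1, 1, 1, 3], [2, 2, 2, 2, 3]]

print("all items:", group_difference(group_a, group_b))
print("without flagged item:", group_difference(group_a, group_b, drop_item=4))
```

When the group difference changes substantially once the flagged item is dropped, the full-scale comparison should be interpreted with caution.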

74
Example: 20-item Spanish CES-D in Older Latinos
  • 2 items had very low item-scale correlations,
    high rates of missing data in two studies
  • "I felt hopeful about the future"
  • "I felt I was just as good as other people"

                             Study 1        Study 2
20-item version
  Item-scale correlations    -.20 to .73    .05 to .78
  Cronbach's alpha
18-item version
  Item-scale correlations    .45 to .76     .33 to .79
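
Item-scale correlations like those reported above can be computed as follows. This is a minimal sketch that correlates each item with the sum of the remaining items (the "corrected" form); whether the original studies applied this correction is not stated here, so treat that choice as an assumption. Data and function names are hypothetical, and complete data are assumed.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_rest_correlations(rows):
    """Correlate each item with the sum of all other items in the scale."""
    totals = [sum(row) for row in rows]
    correlations = []
    for column in zip(*rows):
        rest = [total - value for total, value in zip(totals, column)]
        correlations.append(pearson(column, rest))
    return correlations


# Hypothetical responses: the last item runs opposite to the rest of the scale.
rows = [
    [1, 1, 2, 5],
    [2, 2, 4, 4],
    [3, 3, 6, 3],
    [4, 4, 8, 2],
    [5, 5, 10, 1],
]
for index, r in enumerate(item_rest_correlations(rows)):
    print(f"item {index}: r = {r:.2f}")
```

Items with very low or negative values are candidates for review, as with the two CES-D items in this example.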

75
Example: Measure Can Be Modified
  • GHAA Consumer Satisfaction Survey
  • Adapted to be appropriate for African American
    patients
  • Focus groups conducted to obtain perspectives of
    African Americans
  • New domains added (e.g., discrimination/
    stereotyping)
  • New items added to existing domains

Fongwa M et al. Ethnicity and Disease,
2006;16:948-955.
76
Approaches to Conducting Studies When You Are Not
Sure
  • Use a combination of universal and group-
    specific items
  • use universal items to compare across groups
  • use specific items (added onto universal items)
    when conducting analyses within one group
  • To find a variable that correlates with a health
    measure within one group

77
Conclusions
  • Measurement in health disparities and minority
    health research is a relatively new field - few
    guidelines
  • Encourage first steps - test and report adequacy
    and equivalence
  • As evidence grows, concepts and measures that
    work better across diverse groups will be
    identified

78
Two Special Journal Issues on Measurement in
Diverse Populations
  • Measurement in older ethnically diverse
    populations
  • J Mental Health Aging, Vol 7, Spring 2001
  • Measurement in a multi-ethnic society
  • Med Care, Vol 44, November 2006

79
Homework for Class 3
  • Using the same template and measure you reviewed
    for class 2, complete sections 14-21
  • Use same file and submit entire document by email
    to anita.stewart_at_ucsf.edu