Title: Test-retest reliability of health vignette ratings: evidence from SHARE/COMPARE
1Test-retest reliability of health vignette ratings: evidence from SHARE/COMPARE
- Hendrik Jürges
- Universität Mannheim
2Introduction
- Anchoring vignettes have so far been used only in cross-sectional settings
- Longitudinal data can be created by combining COMPARE vignettes with SHARE 2004 vignettes
- Longitudinal data on vignettes allow tackling important questions surrounding vignettes that have not been answered before
3Research questions
- Do vignette ratings change as individuals age, e.g. do respondents become more lenient towards health?
- Separate cohort and age effects
- Vignettes are noisy measures of response styles
- How noisy are vignettes on the individual level?
- How noisy are vignettes on the aggregate (e.g. country) level?
4Measurement error framework
- Measured Response Style 2004 (V1) = True Response Style (T) + Error 2004
- Measured Response Style 2006 (V2) = True Response Style (T) + Error 2006
- Corr(V1, V2) estimates Var(T) / Var(V)
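The identity on this slide can be checked with a small simulation. The sketch below assumes the classical measurement model: each measured rating is the true response style plus an error, errors are independent of T and of each other, and the error variance is the same in both waves. The variance shares (0.2 and 0.8), the normal distributions, and all function names are illustrative assumptions, not estimates from SHARE/COMPARE.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_retest(var_t, var_e, n, seed=0):
    """Simulate V1 = T + E1 and V2 = T + E2 and return Corr(V1, V2)."""
    rng = random.Random(seed)
    sd_t, sd_e = math.sqrt(var_t), math.sqrt(var_e)
    t = [rng.gauss(0, sd_t) for _ in range(n)]      # true response style
    v1 = [ti + rng.gauss(0, sd_e) for ti in t]      # measured in wave 1
    v2 = [ti + rng.gauss(0, sd_e) for ti in t]      # measured in wave 2
    return pearson(v1, v2)


# Hypothetical variance components (assumption, not taken from the data).
VAR_T, VAR_E = 0.2, 0.8
r = simulate_retest(VAR_T, VAR_E, n=20000)
reliability = VAR_T / (VAR_T + VAR_E)
print(f"simulated Corr(V1, V2) = {r:.3f}, Var(T)/Var(V) = {reliability:.3f}")
```

Under these assumptions the simulated test-retest correlation should land close to the variance ratio; if the errors were correlated across waves, the two would diverge.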
5COMPARE Vignettes in SHARE Waves 1 and 2
- Pain: Paul has a headache once a month that is relieved after taking a pill. During the headache he can carry on with his day-to-day affairs. In your opinion, how much of bodily aches or pains does Paul have?
- Self-rating: Overall in the last 30 days, how much of bodily aches or pains did you have?
- Other domains: Sleep, Mobility, Memory, Breathing, Depression
6Sample
7Aggregate health changes between the 2004 and 2006 waves: bodily aches or pains
8Aggregate vignette changes between the 2004 and 2006 waves: bodily aches or pains
9Response changes between the 2004 and 2006 waves: bodily aches or pains
10Response changes between the 2004 and 2006 waves: bodily aches or pains
11Summary of aggregate trends
12Summary of aggregate findings
- Individuals report worse health as they age (except depression symptoms)
- In parallel, individuals become more lenient towards health
- Changes in self-ratings of health tend to underestimate true changes
- Measures of health change correcting for response styles have a larger variance than uncorrected measures
13Individual-level correlations
[Correlation matrix of Self-ratings 2004, Vignettes 2004, Self-ratings 2006 and Vignettes 2006]
14Individual-level reliability summary
15Illustrating country-level reliability: pain
16Country-level reliability summary
17Summary
- Individual-level vignette ratings are only weakly correlated over time
- Only 15 to 25 percent of individual variation can be explained by response styles
- The rest of the variance must be attributed to unobserved individual and random components
- Aggregated ratings (e.g. on the country level) are more reliable (as measurement errors are averaged out)
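Why averaging helps can be sketched with a simple variance decomposition. The example assumes each rating is the sum of a country-level response style, a stable individual response style, and a transient error, all mutually independent, and that the same n respondents per country are observed in both waves. The component variances and the function name are hypothetical and chosen only so that the individual-level value falls in the 15 to 25 percent range quoted above.

```python
def mean_retest_correlation(var_country, var_person, var_error, n):
    """Test-retest correlation of a country mean based on n panel respondents.

    Assumed model (illustrative): rating = country + person + error.
    Stable parts of the mean: var_country + var_person / n.
    Transient part of the mean: var_error / n.
    """
    stable = var_country + var_person / n
    return stable / (stable + var_error / n)


# Hypothetical variance components (assumption, not estimated from the data).
VAR_COUNTRY, VAR_PERSON, VAR_ERROR = 0.05, 0.15, 0.80

for n in (1, 10, 100, 1000):
    rho = mean_retest_correlation(VAR_COUNTRY, VAR_PERSON, VAR_ERROR, n)
    print(f"n = {n:5d}  retest correlation of the mean = {rho:.3f}")
```

With n = 1 the function reduces to the individual-level ratio of stable to total variance; as n grows, the error term shrinks with 1/n and the correlation of the country means approaches one.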
18Limitations
- In 2006, respondents could remember having been given the same vignettes before and remember their answers. This would bias our results towards finding a higher reliability of vignettes
- Vignettes were asked in slightly different contexts (in 2004 with 2 more on the same domain)
- Only one vignette per domain allows for large effect of measurement error
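The direction of the recall bias in the first limitation can be made explicit. As a sketch, assume measured ratings are the true response style plus an error (so that V_1 = T + E_1 and V_2 = T + E_2), that errors are uncorrelated with T, and that the error variance is the same in both waves; the symbols E_1 and E_2 are notation introduced here for the 2004 and 2006 errors.

```latex
\operatorname{Corr}(V_1, V_2)
  = \frac{\operatorname{Cov}(T + E_1,\; T + E_2)}{\operatorname{Var}(V)}
  = \frac{\operatorname{Var}(T) + \operatorname{Cov}(E_1, E_2)}
         {\operatorname{Var}(T) + \operatorname{Var}(E)}
```

If respondents repeat remembered answers, Cov(E_1, E_2) is positive, so the observed correlation exceeds Var(T)/Var(V) and overstates reliability.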
19Discussion
- Vignettes introduce another source of error into measurement
- Trade-off between biased measures (without vignettes) and measures with larger variance (with vignettes)
- Results warn against naive application of vignette methods at low levels of aggregation
- Findings call for
  - large "enough" samples when comparing different groups
  - large "enough" number of vignettes per domain
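One way to gauge how many vignettes per domain might be "enough" is the Spearman-Brown formula for the reliability of an average of k parallel measures. This is a swapped-in textbook tool, not a calculation from the slides: it assumes the vignettes in a domain measure the same response style with equal reliability and independent errors. The single-vignette reliability of 0.20 is an illustrative value, and the function name is hypothetical.

```python
def spearman_brown(single_reliability, k):
    """Reliability of the mean of k parallel measures (Spearman-Brown formula)."""
    r = single_reliability
    return k * r / (1 + (k - 1) * r)


# Illustrative single-vignette reliability (assumption).
R_SINGLE = 0.20

for k in (1, 2, 4, 8):
    print(f"k = {k}  reliability of the averaged rating = {spearman_brown(R_SINGLE, k):.3f}")
```

The gain from each additional vignette shrinks as k grows, which is one reason to weigh extra vignettes against respondent burden.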
20BACKUP SLIDES
21Aggregate health changes between the 2004 and 2006 waves: bodily aches or pains
22Aggregate vignette changes between the 2004 and 2006 waves: bodily aches or pains
23Individual health changes between the 2004 and 2006 waves: bodily aches or pains
24Individual vignette changes between the 2004 and 2006 waves: bodily aches or pains
25Illustrating individual-level reliability: bodily aches or pains