Title: Basic Study Designs in Observational Epidemiology
1. Basic Study Designs in Observational Epidemiology
2. Epidemiologic reasoning
- To determine whether a statistical association exists between a presumed risk factor and disease
- Risk factor: antecedents of adverse health outcomes that remain associated with the outcomes after adjustment for measured potential confounders (Greenland et al, Epidemiology 2004;15:529-535)
3. Epidemiologic reasoning
- To determine whether a statistical association exists between a presumed risk factor and disease
- To derive inferences regarding a possible causal relationship from the patterns of the statistical associations
4. To determine whether a statistical association exists between a presumed risk factor and a disease
- Observational studies using populations or groups of individuals as units of observation
  - Descriptive studies (prevalence, incidence, trends)
  - Analysis of birth cohorts (cohort, age, period effects)
  - Ecological studies
- Observational studies using individuals as units of observation
  - Cohort studies
  - Case-control studies
  - Cross-sectional studies
  - Other (nested case-control, case-crossover study)
5. Studies using groups as units of observation
- ANALYSIS OF BIRTH COHORTS
  - Cohort, age, and period effects
  - Classic example: WH Frost. The age selection of mortality from tuberculosis in successive decades. Am J Hyg 1939;30:91-6.
6. 1995 cross-sectional survey; cross-sectional surveys
7. [Figure: prevalence (per 1000) by age (years); curve labeled "Born in 1960"]
8. Age, cohort, and period effects
- Age effect: change in the rate according to age, irrespective of birth cohort and calendar time
- Cohort effect: change in the rate according to year of birth, irrespective of age and calendar time
- Period effect: change in the rate affecting an entire population at some point in time, irrespective of birth cohort and age
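The three effects are tied together by a simple identity: year of birth = calendar year of the survey - age at the survey. The sketch below is only an illustration of that identity: it regroups prevalence figures from several cross-sectional surveys into birth-cohort curves. The function names and every number in it are hypothetical, not data from these slides.

```python
from collections import defaultdict

def birth_year(survey_year, age):
    """Birth cohort implied by a survey year and the age at that survey."""
    return survey_year - age

def regroup_by_cohort(surveys):
    """Turn {survey_year: {age: prevalence}} into {birth_year: [(age, prevalence), ...]}."""
    cohorts = defaultdict(list)
    for year, by_age in surveys.items():
        for age, prevalence in by_age.items():
            cohorts[birth_year(year, age)].append((age, prevalence))
    return {cohort: sorted(points) for cohort, points in cohorts.items()}

# Hypothetical prevalence (per 1000) from three cross-sectional surveys.
surveys = {
    1975: {15: 4, 25: 8, 35: 14},
    1985: {25: 10, 35: 16, 45: 24},
    1995: {35: 18, 45: 26, 55: 36},
}

cohort_curves = regroup_by_cohort(surveys)
# The 1950 birth cohort is observed at ages 25, 35, and 45 across the surveys.
print(cohort_curves[1950])
```

Reading one survey year across ages mixes several birth cohorts, whereas each list in `cohort_curves` follows a single cohort as it ages.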
9. Cohort effect: rates change from cohort to cohort regardless of age
Age effect: rates vary by age within each cohort
[Figure: prevalence (per 1000) by age (years); curve labeled "Born in 1960"]
10. Period effect: an event in 1945 changes the rates in all cohorts at the corresponding ages
[Figure: birth cohorts born in 1880, 1890, 1900, 1910, and 1920; prevalence (per 1000) by age (years)]
11. Studies using groups as units of observation
- ECOLOGIC STUDIES
- To assess the correlation between a presumed risk
factor and an outcome, mean values of the outcome
(e.g., rate, mean) are plotted against mean
values of the factor (e.g., average per capita
fat intake), using groups as units of observation
- Groups can be defined by place (geographical
comparisons) or time (temporal trends).
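As a rough illustration of the ecologic approach, the sketch below computes a Pearson correlation in which each data point is a group (for example, a country), not a person. The group labels and values are hypothetical, chosen only to show the mechanics.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical groups: (mean per capita fat intake in g/day, outcome rate per 100,000).
groups = {"A": (60, 8.0), "B": (90, 12.0), "C": (120, 17.0), "D": (150, 21.0)}

mean_exposure = [exposure for exposure, _ in groups.values()]
outcome_rate = [rate for _, rate in groups.values()]

# One point per group: this is a group-level (ecologic) correlation.
r = pearson_r(mean_exposure, outcome_rate)
print(f"ecologic correlation r = {r:.3f}")
```

The coefficient describes how group means move together; it says nothing by itself about whether exposed individuals within a group have the outcome more often.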
12. A plot of the population of Oldenburg at the end of each year against the number of storks observed in that year, 1930-1936. Ornithologische Monatsberichte 1936;44(2)
14. Relation between sodium (Na) excretion and the increase in systolic blood pressure (SBP) with age, in centers in the INTERSALT cohort
Elliott, in Marmot and Elliott (eds.), Coronary Heart Disease Epidemiology, Oxford, 1992, pp. 166-78.
15. Ecological fallacy: the bias that may occur because an association observed between variables at the aggregate level does not necessarily represent the association that exists at the individual level (Last, Dictionary of Epidemiology, 1995).
16. Example of ecological bias
Traffic injuries 4/747
Based on Diez-Roux, Am J Public Health 1998;88:216.
17. Higher income is associated with higher injury rate
19. Higher income is associated with higher injury rate
Injury cases have lower mean income than non-cases
20. Which of the two levels of inference is wrong?
- Concluding that high income is a risk factor for injuries (based on the ecologic data) is subject to the ecologic fallacy.
- BUT concluding that, because injury cases tend to have lower income, communities with higher average income should have lower injury rates is also wrong!
- The real problem is cross-level inference:
  - Using ecologic data to make inferences at the individual level (ecologic fallacy).
  - Or using individual data to make inferences at the group (population) level.
- When used to make inferences at the proper level, both approaches might be right.
Morgenstern, Annu Rev Public Health 1995;16:61-81.
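The two levels of inference can point in opposite directions within the same data set. The sketch below builds a small, entirely hypothetical population of three communities (it is not the Diez-Roux data) in which communities with higher mean income have higher injury rates, while injured individuals have lower mean income than uninjured ones.

```python
from statistics import mean

# Hypothetical records per community: (income in $1000s, injured?).
# In every community the injured are the lowest-income residents.
incomes = {
    "A": [20, 22, 24, 26, 28, 30, 32, 34, 36, 38],
    "B": [20, 24, 28, 32, 36, 40, 44, 48, 52, 56],
    "C": [20, 26, 32, 40, 48, 56, 64, 72, 80, 88],
}
injured_count = {"A": 1, "B": 2, "C": 3}

communities = {
    name: [(income, rank < injured_count[name]) for rank, income in enumerate(sorted(values))]
    for name, values in incomes.items()
}

def community_summary(people):
    """Group-level view: (mean income, injury rate) for one community."""
    rate = sum(injured for _, injured in people) / len(people)
    return mean(income for income, _ in people), rate

group_level = {name: community_summary(people) for name, people in communities.items()}

# Individual-level view: pool everyone and compare cases with non-cases.
everyone = [record for people in communities.values() for record in people]
mean_income_cases = mean(income for income, injured in everyone if injured)
mean_income_noncases = mean(income for income, injured in everyone if not injured)

for name, (avg_income, rate) in group_level.items():
    print(name, avg_income, rate)
print(mean_income_cases, mean_income_noncases)
```

Both summaries are correct at their own level; the error arises only when one is used to draw conclusions about the other.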
21. Types of ecologic variables
- Analogs of individual-level characteristics
- Aggregate measures (proportion, mean)
- Prevalence of disease
- Mean saturated fat intake
- Environmental measures
- Air pollution
- Global measures
- Health care system
- Gun control law
- Herd immunity
22. Ecologic studies are the design of choice in certain situations
- When the level of inference of interest is at the population level
  - Food availability (e.g., Goldberger et al, Public Health Rep 1916;35:2673-714)
  - SES inequality and health
  - Effects of tax hikes on cigarette sales
- When the variability of exposure within the population is limited
  - Salt intake and hypertension (Elliott, 1992)
  - Fat intake and breast cancer (Wynder et al, 1997)
23. [Figure] Hypothetical data on individuals from a worldwide population: systolic blood pressure (mm Hg) plotted against usual daily salt intake
26. [Figure] Hypothetical ecologic data from 7 countries: mean systolic blood pressure (mm Hg) plotted against mean usual daily salt intake
27. Relation between sodium (Na) excretion and the increase in systolic blood pressure (SBP) with age, in centers in the INTERSALT cohort
Elliott, in Marmot and Elliott (eds.), Coronary Heart Disease Epidemiology, Oxford, 1992, pp. 166-78.
28. Studies based on individuals: 1. Cohort studies
30. Cohort study
[Diagram labels: losses to follow-up, events, final population]
31. Studies based on individuals: 2. Case-control studies
32. Case-control study
[Diagram labels: losses, hypothetical population]
33. Case-control study
[Diagram label: losses]
Recruiting only the cases with the longest survival (prevalent cases): risk of duration (incidence-prevalence) bias
34. Cross-sectional study
Snapshot of prevalent cases/non-cases
35. (Incidence density sampling)
36. Example of nested case-control study: US Air Force Cohort Study (Grayson, Am J Epidemiol 1996;143:480-6)
Cohort: 880,000 male members of the US Air Force employed for at least one year between 1970-89 (variable length of follow-up).
Cases: 230 newly developed cases of malignant brain tumor, 1970-89.
Controls: 920 non-case employees, 4 from each case's risk set, matched by age, race, and length of follow-up.
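Risk-set (incidence density) sampling can be sketched as follows: for each case, controls are drawn from cohort members still under follow-up at the time the case occurs. This is a minimal illustration, not the procedure used by Grayson. The cohort records, field names, and the handling of ties (`exit_time >= t`) are assumptions, and the additional matching on age and race is omitted.

```python
import random

def sample_risk_sets(cohort, controls_per_case, seed=0):
    """For each case, sample controls from those still at risk at the case's event time.

    cohort: list of dicts with keys "id", "exit_time", "is_case" (hypothetical layout).
    Returns a list of (case_id, [control_ids]).
    """
    rng = random.Random(seed)
    cases = sorted((p for p in cohort if p["is_case"]), key=lambda p: p["exit_time"])
    matched = []
    for case in cases:
        t = case["exit_time"]
        # Risk set: everyone else still under follow-up at time t.
        # People who become cases later are eligible as controls now.
        risk_set = [p for p in cohort if p["id"] != case["id"] and p["exit_time"] >= t]
        k = min(controls_per_case, len(risk_set))
        matched.append((case["id"], [p["id"] for p in rng.sample(risk_set, k)]))
    return matched

# Hypothetical mini-cohort; exit_time is years of follow-up until event or censoring.
cohort = [
    {"id": 1, "exit_time": 3.0, "is_case": True},
    {"id": 2, "exit_time": 5.0, "is_case": False},
    {"id": 3, "exit_time": 2.0, "is_case": False},
    {"id": 4, "exit_time": 8.0, "is_case": True},
    {"id": 5, "exit_time": 10.0, "is_case": False},
    {"id": 6, "exit_time": 9.0, "is_case": False},
    {"id": 7, "exit_time": 4.0, "is_case": False},
]

for case_id, control_ids in sample_risk_sets(cohort, controls_per_case=2):
    print(case_id, control_ids)
```

Because controls are selected at the case's event time, matching on length of follow-up is built into the sampling.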
37. Age-, race-, and senior military rank-adjusted odds ratios in brain tumor cases and controls without brain tumors, according to exposure to very low frequency electromagnetic fields or to radiofrequency/microwave (Grayson JK, Am J Epidemiol 1996;143:480-486)
Example: power general specialists, telecommunications equipment repairmen
Above permissible exposure limits (10 mW/cm²)
38. Case-cohort study
[Diagram label: initial population]
39. Example of case-cohort study: association between CMV antibodies and incident coronary heart disease (CHD) in the Atherosclerosis Risk in Communities (ARIC) Study (Sorlie et al, Arch Intern Med 2000;160:2027-32)
Cohort: 14,170 adult individuals (45-64 yrs at baseline) from 4 US communities (Jackson, MS; Minneapolis, MN; Forsyth Co, NC; Washington Co, MD), free of CHD at baseline. Followed up for up to 5 years.
Cases: 221 incident CHD cases.
Controls: random sample from the baseline cohort, n = 515 (included 10 subsequent cases).
The population with the highest antibody levels of CMV (approximately the upper 20%) showed an increased relative risk (RR) of CHD of 1.76 (95% confidence interval, 1.00-3.11), adjusting for age, sex, and race.
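A quick consistency check on a reported ratio and its interval: if the interval is a Wald-type interval, symmetric around log(RR) (an assumption here, since the slide does not say how it was computed), the point estimate should sit at the geometric mean of the two limits, and the standard error of log(RR) can be recovered from the width of the interval.

```python
from math import log, sqrt

rr, lower, upper = 1.76, 1.00, 3.11  # values reported on the slide
z = 1.96                             # normal quantile for a two-sided 95% interval

# Assumption: the interval is exp(log(RR) +/- z * SE), i.e. symmetric on the log scale.
se_log_rr = (log(upper) - log(lower)) / (2 * z)
rr_implied = sqrt(lower * upper)     # midpoint of the interval on the log scale

print(round(se_log_rr, 3), round(rr_implied, 2))  # prints 0.289 1.76
```

The implied midpoint agrees with the reported RR of 1.76, so the estimate and interval are internally consistent under that assumption. A lower limit of 1.00 means the interval just touches the null value.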
40. Case-cohort study
N ≈ 14,000
Option 1: thaw serum samples of 14,000 persons, classify by CMV titer (+) or (-), and follow up to calculate incidence in each group (exposed vs. unexposed)
Option 2: case-cohort study
[Diagram label: initial population]
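The saving from Option 2 can be put in numbers using the figures quoted for the ARIC example. The sketch assumes one serum assay per person and that the 10 subsequent cases found in the random sample are among the 221 incident cases, so they are counted only once.

```python
cohort_size = 14170        # baseline cohort (Option 1 assays everyone)
incident_cases = 221       # all incident CHD cases
subcohort_size = 515       # random sample of the baseline cohort
cases_in_subcohort = 10    # assumed to overlap with the 221 incident cases

assays_full_cohort = cohort_size
assays_case_cohort = incident_cases + subcohort_size - cases_in_subcohort
fraction = assays_case_cohort / assays_full_cohort

print(assays_case_cohort, f"{fraction:.1%}")  # prints 726 5.1%
```

Under these assumptions the case-cohort design needs assays on roughly one twentieth of the cohort.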
41. When are nested designs (case-cohort or nested case-control) the best choice?
- In well-defined cohorts when additional (expensive or burdensome) information needs to be collected
  - Laboratory determination in samples from a specimen repository (e.g., serum bank).
  - Additional record abstraction (e.g., medical, occupational records).
- Analytical techniques (analogous to methods used in cohort studies, matched case-control studies) are available.
42. A few notes on matching
- Most frequently used in case-control studies
- Frequency vs. individual matching
- Advantages
  - Intuitive, easy to explain
  - Guarantees a certain degree of comparability in small studies
  - Efficient (if matching on a strong confounder)
- Disadvantages
  - Costly, usually logistically complicated
  - Inefficient (if matching on a weak confounder)
  - Questionable representativeness of the control group (limits its use for other case-control comparisons)
  - Cannot study the matching variable (and additive interaction)
  - Possibility of residual confounding
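Frequency matching, as opposed to individual matching, only requires that the controls have the same distribution of the matching variable as the cases. A minimal sketch, in which the record layout, the age groups, and the function name are all hypothetical:

```python
import random
from collections import Counter

def frequency_match(cases, candidates, ratio, key, seed=0):
    """Sample `ratio` controls per case within each stratum of the matching variable."""
    rng = random.Random(seed)
    cases_per_stratum = Counter(key(c) for c in cases)
    controls = []
    for stratum, n_cases in cases_per_stratum.items():
        pool = [p for p in candidates if key(p) == stratum]
        # If the pool is too small, take what is available (a simplification).
        k = min(ratio * n_cases, len(pool))
        controls.extend(rng.sample(pool, k))
    return controls

def by_age_group(person):
    return person["age_group"]

# Hypothetical cases and candidate controls.
cases = [
    {"id": "c1", "age_group": "40-49"},
    {"id": "c2", "age_group": "50-59"},
    {"id": "c3", "age_group": "50-59"},
]
age_groups = ["40-49"] * 5 + ["50-59"] * 6 + ["60-69"] * 4
candidates = [{"id": f"p{i}", "age_group": g} for i, g in enumerate(age_groups)]

controls = frequency_match(cases, candidates, ratio=2, key=by_age_group)
print(Counter(by_age_group(c) for c in controls))
```

No control is tied to a particular case; only the stratum totals are fixed, which is why the matching variable itself can no longer be studied as a risk factor.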
43. A special type of case-control study: the case-crossover study
- Useful when exposures that vary over time can precipitate acute events, such as sudden cardiac deaths, asthma episodes, etc.
- Cases serve as their own controls: the subject's time of the event of interest (e.g., death) is the case period, and the subject's other times comprise the control period
- Advantages
  - Each participant is considered a matched stratum in a case-control study (self-matching) where cases and controls are case and control times (no control selection bias)
  - Controls for confounding by unchanged variables (sex, genetic factors, mental health, etc.)
- Disadvantages
  - Assumes no carry-over (cumulative) effect of the exposure of interest
  - Assumes no confounding or interaction by time-related variables (e.g., ambient temperature, day of the week)
- Challenges
  - Lag time must be taken into account (relevant exposure period)
44. A special type of case-control study: the case-crossover study. Example: Valent et al, Pediatrics 2001;107:e23
- Objective: to evaluate the association between sleep (and wakefulness) duration and childhood unintentional injury
- 292 unintentionally injured children
- Case period: 24 hours preceding injury
- Control period: 25-48 hours preceding injury
- Analysis: matched-pair and conditional logistic regression
- Adjustment for day of the week (weekend vs. weekday) and activity risk level (higher vs. lower level of energy)
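In a matched-pair analysis of a case-crossover study, each subject contributes one pair: exposure in the case period and exposure in the control period. Only discordant pairs are informative, and the odds ratio is the number of subjects exposed in the case period only divided by the number exposed in the control period only. The counts below are hypothetical and are not the Valent et al results.

```python
def matched_pair_or(pairs):
    """Matched-pair odds ratio from (exposed_in_case_period, exposed_in_control_period) pairs."""
    case_only = sum(1 for case_exp, ctrl_exp in pairs if case_exp and not ctrl_exp)
    control_only = sum(1 for case_exp, ctrl_exp in pairs if ctrl_exp and not case_exp)
    if control_only == 0:
        raise ValueError("no subjects exposed in the control period only; OR is undefined")
    return case_only / control_only

# Hypothetical subjects; exposure = sleeping less than 10 hours in the period.
pairs = (
    [(True, False)] * 12     # exposed in the case period only
    + [(False, True)] * 8    # exposed in the control period only
    + [(True, True)] * 30    # exposed in both periods (uninformative)
    + [(False, False)] * 50  # exposed in neither period (uninformative)
)

print(matched_pair_or(pairs))  # prints 1.5
```

With a single binary exposure and no other covariates, conditional logistic regression on the same pairs yields this same ratio; the regression becomes necessary once adjustment variables such as day of the week are added.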
45. Odds Ratios and 95% CIs for Sleeping Less than 10 Hours a Day
(Valent et al, Pediatrics 2001;107:e23)
- Case period: 24 hours preceding injury
- Control period: 25-48 hours preceding injury
46. Threats to Validity in Case-Crossover Studies (Maclure M, Am J Epidemiol 1991;133:144-53)
- Within-individual confounding
  - No confounding by the individual's characteristics that remain constant, but there can be confounding by variables that vary over time.
  - Example: a person who drinks coffee only after an anger outburst
- Selection bias
  - No bias in selection of control periods
  - Biased case-period selection is possible
  - Case-crossover study of incident nonfatal myocardial infarction and anger episode (Moller et al, Psychosom Med 1999;61:842-9): Survival bias implies that if cases being exposed to anger have a better prognosis for surviving MI than those not exposed to anger, a study of only nonfatal cases would overestimate the relative risk of MI. Likewise, if cases exposed to anger right before their MI are less inclined to participate, this would result in an underestimation.
47. Threats to Validity in Case-Crossover Studies (cont.) (Maclure M, Am J Epidemiol 1991;133:144-53)
- Information bias
  - When interviews are done at the time of the event, the quality of the information obtained from the case (or a proxy) about the case (hazard) period may differ from that about the control period (e.g., when the case period is the 24-hour period preceding the event, and the control period is the 25- to 48-hour period preceding the event)
  - Bias can go in either direction
    - Faulty memory regarding the control period
    - Exaggeration or denial of exposure in the case period
- External validity
  - In principle, generalizable to all acute-onset outcomes hypothesized to be caused by brief exposures with transient effects. (Maclure M)