Title: CONFOUNDING
1Bias, Confounding and Fallacies in Epidemiology
M. Tevfik DORAK http://www.dorak.info/epi
2BIAS: Definition, Types, Examples, Remedies
CONFOUNDING: Definition, Examples, Remedies (Effect Modification)
FALLACIES: Definition
3What is Bias?
Bias is one of the three major threats to internal validity:
- Bias
- Confounding
- Random error / chance
4What is Bias?
- Any trend in the collection, analysis, interpretation, publication or review of data that can lead to conclusions that are systematically different from the truth (Last, 2001)
- A process at any stage of inference tending to produce results that depart systematically from the true values (Fletcher et al, 1988)
- Systematic error in design or conduct of a study (Szklo et al, 2000)
5Bias is systematic error
Errors can be differential (systematic) or non-differential (random).
- Random error: use of an invalid outcome measure that equally misclassifies cases and controls
- Differential error: use of an invalid measure that misclassifies cases in one direction and controls in another
The term 'bias' should be reserved for differential or systematic error.
6Random Error
[Figure: distribution of induration size (mm), vertical axis in per cent. Source: WHO (www)]
7Systematic Error
[Figure: distribution of induration size (mm), vertical axis in per cent. Source: WHO (www)]
8Chance vs Bias
- Chance is caused by random error; bias is caused by systematic error
- Errors from chance will cancel each other out in the long run (large sample size)
- Errors from bias will not cancel each other out, whatever the sample size
- Chance leads to imprecise results; bias leads to inaccurate results
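The contrast between chance and bias can be illustrated with a small simulation. This is a minimal sketch and not part of the original slides: the function name `mean_of_measurements`, the true value, the noise level and the size of the systematic error are all assumed for illustration.

```python
import random
import statistics


def mean_of_measurements(n, true_value=10.0, noise_sd=2.0, bias=0.0, seed=1):
    """Average n noisy measurements of true_value (all values hypothetical).

    noise_sd controls the random error; bias is a constant systematic
    error added to every single measurement.
    """
    rng = random.Random(seed)
    return statistics.fmean(
        true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)
    )


if __name__ == "__main__":
    for n in (10, 1_000, 100_000):
        unbiased = mean_of_measurements(n)
        biased = mean_of_measurements(n, bias=1.5)
        print(f"n={n:>6}  random error only: {unbiased:7.3f}  with bias: {biased:7.3f}")
```

As the sample grows, the estimate with random error only should settle near the assumed true value of 10, while the biased estimate should settle near 11.5: a larger sample improves precision but does not remove the systematic error.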
9Types of Bias
- Selection bias: unrepresentative nature of sample
- Information (misclassification) bias: errors in measurement of exposure or disease
- Confounding bias: distortion of exposure-disease relation by some other factor
Types of bias are not mutually exclusive (effect modification is not bias).
This classification was introduced by Miettinen OS in the 1970s. See, for example, Miettinen & Cook, 1981 (www).
10Selection Bias
- Selective differences between comparison groups that affect the relationship between exposure and outcome
- Usually results from comparison groups not coming from the same study base and not being representative of the populations they come from
15Selection Bias Examples
Selective survival (Neyman's) bias
(www)
16Selection Bias Examples
Case-control study: controls have less potential for exposure than cases
- Outcome: brain tumour; exposure: overhead high-voltage power lines
- Cases chosen from province-wide cancer registry
- Controls chosen from rural areas
- Systematic differences between cases and controls
17Case-Control Studies Potential Bias
Schulz & Grimes, 2002 (www) (PDF)
18Selection Bias Examples
Cohort study: differential loss to follow-up
- Especially problematic in cohort studies
- Subjects in a follow-up study of multiple sclerosis may differentially drop out due to disease severity
- Differential attrition → selection bias
19Selection Bias Examples
Self-selection bias
- You want to determine the prevalence of HIV infection
- You ask for volunteers for testing
- You find no HIV
- Is it correct to conclude that there is no HIV in this location?
20Selection Bias Examples
Healthy worker effect
- Another form of self-selection bias
- Self-screening process: people who are unhealthy screen themselves out of the active worker population
Example:
- Course of recovery from low back injuries in 25-45 year olds
- Data captured on workers' compensation records
- But prior to identifying subjects for study, self-selection has already taken place
21Selection Bias Examples
Diagnostic or workup bias
- Also occurs before subjects are identified for study
- Diagnoses (case selection) may be influenced by the physician's knowledge of exposure
Example:
- Case-control study: outcome is pulmonary disease, exposure is smoking
- A radiologist aware of the patient's smoking status when reading an x-ray may look more carefully for abnormalities and differentially select cases
Legitimate for clinical decisions, inconvenient for research
22Types of Bias
- Selection bias: unrepresentative nature of sample
- Information (misclassification) bias: errors in measurement of exposure or disease
- Confounding bias: distortion of exposure-disease relation by some other factor
Types of bias are not mutually exclusive (effect modification is not bias).
23Information / Measurement / Misclassification
Bias
- Method of gathering information is inappropriate and yields systematic errors in measurement of exposures or outcomes
- If misclassification of exposure (or disease) is unrelated to disease (or exposure), then the misclassification is non-differential
- If misclassification of exposure (or disease) is related to disease (or exposure), then the misclassification is differential
- Distorts the true strength of association
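The difference between non-differential and differential misclassification can be made concrete with expected counts. The following is a sketch under stated assumptions: the 2x2 counts, the sensitivity and specificity values, and the helper names `odds_ratio` and `misclassify` are hypothetical and do not come from the slides.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio: a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected counts recorded as exposed/unexposed under imperfect measurement."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true exposure counts (exposed, unexposed).
cases = (200, 200)
controls = (100, 300)
true_or = odds_ratio(*cases, *controls)

# Non-differential: identical error rates in cases and controls.
nd_cases = misclassify(*cases, sensitivity=0.8, specificity=0.9)
nd_controls = misclassify(*controls, sensitivity=0.8, specificity=0.9)
non_differential_or = odds_ratio(*nd_cases, *nd_controls)

# Differential (e.g. recall bias): cases report true exposure more completely.
d_cases = misclassify(*cases, sensitivity=0.95, specificity=0.9)
d_controls = misclassify(*controls, sensitivity=0.7, specificity=0.9)
differential_or = odds_ratio(*d_cases, *d_controls)

print(f"true OR             = {true_or:.2f}")
print(f"non-differential OR = {non_differential_or:.2f}")
print(f"differential OR     = {differential_or:.2f}")
```

With these assumed numbers the true odds ratio is 3.0; the non-differential error pulls the estimate toward 1 (about 2.16), while the differential error in this scenario pushes it away from the truth in the other direction (about 3.32). Other choices of error rates could move a differential estimate either way.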
24Information / Measurement / Misclassification
Bias
Sources of information bias:
- Subject variation
- Observer variation
- Deficiency of tools
- Technical errors in measurement
25Information / Measurement / Misclassification
Bias
Recall bias: those exposed have a greater sensitivity for recalling exposure (reduced specificity)
- Especially important in case-control studies, when exposure history is obtained retrospectively
- Cases may more closely scrutinize their past history, looking for ways to explain their illness
- Controls, not feeling a burden of disease, may examine their past history less closely
Example: those who develop a cold are more likely to identify the exposure than those who do not (differential misclassification)
- Case: "Yes, I was sneezed on"
- Control: "No, can't remember any sneezing"
26Information / Measurement / Misclassification
Bias
Reporting bias
- Individuals with severe disease tend to have complete records, and therefore more complete information about exposures, so a greater association is found
- Individuals who are aware of being participants in a study behave differently (Hawthorne effect)
27Controlling for Information Bias
- Blinding: prevents investigators and interviewers from knowing the case/control or exposed/non-exposed status of a given participant
- Form of survey: mail may impose less "white coat tension" than a phone or face-to-face interview
- Questionnaire: use multiple questions that ask for the same information; this acts as a built-in double-check
- Accuracy: multiple checks in medical records; gathering diagnosis data from multiple sources
28Types of Bias
- Selection bias: unrepresentative nature of sample
- Information (misclassification) bias: errors in measurement of exposure or disease
- Confounding bias: distortion of exposure-disease relation by some other factor
Types of bias are not mutually exclusive (effect modification is not bias).
30Cases of Down Syndrome by Birth Order
EPIET (www)
31Cases of Down Syndrome by Age Groups
EPIET (www)
32Cases of Down Syndrome by Birth Order and
Maternal Age
EPIET (www)
33Confounding
- A third factor which is related to both exposure and outcome, and which accounts for some/all of the observed relationship between the two
- A confounder is not a result of the exposure
- e.g., association between child's birth rank (exposure) and Down syndrome (outcome): is mother's age a confounder?
- e.g., association between mother's age (exposure) and Down syndrome (outcome): is birth rank a confounder?
34Confounding
To be a confounding factor, two conditions must be met:
- Be associated with exposure, without being the consequence of exposure
- Be associated with outcome, independently of exposure (not an intermediary)
[Diagram: Exposure, Outcome, Third variable]
35Confounding
Birth Order
Down Syndrome
Maternal Age
Maternal age is correlated with birth order and a
risk factor even if birth order is low
36Confounding ?
Down Syndrome
Maternal Age
Birth Order
Birth order is correlated with maternal age but
not a risk factor in younger mothers
37Confounding
Coffee
CHD
Smoking
Smoking is correlated with coffee drinking and a
risk factor even for those who do not drink coffee
38Confounding ?
Smoking
CHD
Coffee
Coffee drinking may be correlated with smoking
but is not a risk factor in non-smokers
39Confounding
Alcohol
Lung Cancer
Smoking
Smoking is correlated with alcohol consumption
and a risk factor even for those who do not drink
alcohol
40Confounding ?
Smoking
CHD
Yellow fingers
Not related to the outcome Not an independent
risk factor
41Confounding ?
Diet
CHD
Cholesterol
On the causal pathway
42Confounding
Imagine you have repeated a positive finding of
birth order association in Down syndrome or
association of coffee drinking with CHD in
another sample. Would you be able to replicate
it? If not why?
Imagine you have included only non-smokers in a
study and examined association of alcohol with
lung cancer. Would you find an association?
Imagine you have stratified your dataset for
smoking status in the alcohol - lung cancer
association study. Would the odds ratios differ
in the two strata?
Imagine you have tried to adjust your alcohol
association for smoking status (in a statistical
model). Would you see an association?
43Confounding
Imagine you have repeated a positive finding of
birth order association in Down syndrome or
association of coffee drinking with CHD in
another sample. Would you be able to replicate
it? If not why?
You would not necessarily be able to replicate
the original finding because it was a spurious
association due to confounding. In another
sample where all mothers are below 30 yr, there
would be no association with birth order. In
another sample in which there are few smokers,
the coffee association with CHD would not be
replicated.
44Confounding
Imagine you have included only non-smokers in a
study and examined association of alcohol with
lung cancer. Would you find an association?
No because the first study was confounded. The
association with alcohol was actually due to
smoking. By restricting the study to non-smokers,
we have found the truth. Restriction is one way
of preventing confounding at the time of study
design.
45Confounding
Imagine you have stratified your dataset for
smoking status in the alcohol - lung cancer
association study. Would the odds ratios differ
in the two strata?
The alcohol association would yield a similar odds ratio in both strata, and it would be close to unity. In confounding, the stratum-specific odds ratios should be similar to each other and different from the crude odds ratio by at least 15%. Stratification is one way of identifying confounding at the time of analysis.
If the stratum-specific odds ratios are
different, then this is not confounding but
effect modification.
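The stratification check described on this slide can be sketched numerically. The counts below are hypothetical (they are not from the slides) and were chosen so that alcohol has no effect within either smoking stratum; the Mantel-Haenszel summary is included as one common way of pooling the strata.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio: a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel summary odds ratio over a list of (a, b, c, d) tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical counts per stratum:
# (drinker cases, non-drinker cases, drinker controls, non-drinker controls)
strata = {
    "smokers": (80, 20, 40, 10),
    "non-smokers": (5, 15, 30, 90),
}

# Collapsing the strata cell by cell gives the crude (unstratified) table.
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}
adjusted_or = mantel_haenszel_or(list(strata.values()))
relative_change = abs(crude_or - adjusted_or) / adjusted_or

print(f"crude OR = {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"OR in {name} = {value:.2f}")
print(f"Mantel-Haenszel OR = {adjusted_or:.2f}")
print(f"relative change crude vs adjusted = {relative_change:.0%}")
```

In this constructed example the crude odds ratio is about 3.47, both stratum-specific odds ratios equal 1.0, and so does the pooled estimate. Similar stratum-specific values that differ from the crude value by far more than 15% are the pattern the slide describes as confounding.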
46Confounding
Imagine you have tried to adjust your alcohol
association for smoking status (in a statistical
model). Would you see an association?
If smoking is included in the statistical model, the alcohol association would lose its statistical significance. Adjustment by multivariable modelling is another method to identify confounders at the time of data analysis.
47Confounding
For confounding to occur, the confounders should
be differentially represented in the comparison
groups. Randomisation is an attempt to evenly
distribute potential (unknown) confounders in
study groups. It does not guarantee control of
confounding. Matching is another way of
achieving the same. It ensures equal
representation of subjects with known confounders
in study groups. It has to be coupled with
matched analysis. Restriction for potential
confounders in design also prevents confounding
but causes loss of statistical power (instead
stratified analysis may be tried).
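How randomisation tends to even out a confounder between study groups can be sketched with a short simulation. Everything here is assumed for illustration: the helper names `randomise` and `proportion_with_confounder`, the 30% prevalence of the confounder, the cohort sizes and the random seeds.

```python
import random


def randomise(subjects, seed=42):
    """Randomly allocate subjects to two arms of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = subjects[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def proportion_with_confounder(group):
    """Share of subjects in the group who carry the confounder."""
    return sum(group) / len(group)


def make_cohort(size, prevalence=0.3, seed=7):
    """Hypothetical cohort: True marks a subject with the confounder (e.g. a smoker)."""
    rng = random.Random(seed)
    return [rng.random() < prevalence for _ in range(size)]


if __name__ == "__main__":
    for size in (20, 10_000):
        arm_a, arm_b = randomise(make_cohort(size))
        print(
            f"n={size:>6}  arm A: {proportion_with_confounder(arm_a):.3f}"
            f"  arm B: {proportion_with_confounder(arm_b):.3f}"
        )
```

With a large cohort the two arms should end up with nearly the same share of the confounder. With a small cohort chance imbalance remains possible, which is consistent with the slide's caution that randomisation does not guarantee control of confounding.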
48Confounding
Randomisation, matching and restriction can be tried at the time of designing a study to reduce the risk of confounding. At the time of analysis, stratification and multivariable (adjusted) analysis can achieve the same. It is preferable to address confounding at the time of designing the study.
49Effect of randomisation on outcome of trials in
acute pain
Bandolier Bias Guide (www)
50Confounding
Obesity
Mastitis
Age
In cows, older ones are heavier and older age
increases the risk for mastitis. This association
may appear as an obesity association
51Confounding
If each case is matched with a same-age control,
there will be no association (OR for old age
2.6, P 0.0001)
(www)
52No Confounding
(www)
53Cases of Down Syndrome by Birth Order and
Maternal Age
If each case is matched with a same-age control,
there will be no association. If analysis is
repeated after stratification by age, there will
be no association with birth order.
EPIET (www)
54BIAS: Definition, Types, Examples, Remedies
CONFOUNDING: Definition, Examples, Remedies (Effect Modification)
FALLACIES: Definition
55Confounding or Effect Modification
Birth Weight
Leukaemia
Sex
Can sex be responsible for the birth weight association in leukaemia?
- Is it correlated with birth weight?
- Is it correlated with leukaemia independently of birth weight?
- Is it on the causal pathway?
- Can it be associated with leukaemia even if birth weight is low?
- Is sex distribution uneven in comparison groups?
56Confounding or Effect Modification
Does the birth weight association differ in strength according to sex?
- Overall: Birth Weight - Leukaemia, OR 1.5
- BOYS: Birth Weight - Leukaemia, OR 1.8
- GIRLS: Birth Weight - Leukaemia, OR 0.9
57Effect Modification
In an association study, if the strength of the
association varies over different categories of a
third variable, this is called effect
modification. The third variable is changing the
effect of the exposure. The effect modifier may
be sex, age, an environmental exposure or a
genetic effect. Effect modification is similar
to interaction in statistics. There is no
adjustment for effect modification. Once it is
detected, stratified analysis can be used to
obtain stratum-specific odds ratios.
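Stratum-specific odds ratios for a suspected effect modifier can be obtained as in the sketch below. The counts are hypothetical: they were picked only so that the stratum odds ratios come out at 1.8 for boys and 0.9 for girls, as in the birth weight example, and the crude value they produce is not the one shown on the slide.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio: a, b = exposed/unexposed cases; c, d = exposed/unexposed controls."""
    return (a * d) / (b * c)


# Hypothetical counts per sex:
# (high birth weight cases, normal birth weight cases,
#  high birth weight controls, normal birth weight controls)
strata = {
    "boys": (90, 100, 50, 100),
    "girls": (45, 100, 50, 100),
}

stratum_ors = {sex: odds_ratio(*table) for sex, table in strata.items()}
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)

for sex, value in stratum_ors.items():
    print(f"OR in {sex} = {value:.2f}")
print(f"crude OR = {crude_or:.2f}")
print(f"ratio of stratum ORs = {stratum_ors['boys'] / stratum_ors['girls']:.2f}")
```

Here the stratum-specific odds ratios differ by a factor of two, and a single pooled figure would hide that difference. This is why the stratum-specific estimates are reported separately instead of an adjusted summary.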
58- Effect modifier
  - Belongs to nature
  - Different effects in different strata
  - Simple
  - Useful
  - Increases knowledge of biological mechanism
  - Allows targeting of public health action
- Confounding factor
  - Belongs to study
  - Adjusted OR/RR different from crude OR/RR
  - Distortion of effect
  - Creates confusion in data
  - Prevent (design)
  - Control (analysis)
59BIAS: Definition, Types, Examples, Remedies
CONFOUNDING: Definition, Examples, Remedies (Effect Modification)
FALLACIES: Definition
60Fallacies
- HISTORICAL FALLACY
- ECOLOGICAL FALLACY (Cross-Level Bias)
- BERKSON'S FALLACY (Selection Bias in Hospital-Based CC Studies)
- HAWTHORNE EFFECT (Participant Bias)
- REGRESSION TO THE MEAN (Davis, 1976) (Information Bias)
61HOW TO CONTROL FOR CONFOUNDERS?
- IN STUDY DESIGN
  - RESTRICTION of subjects according to potential confounders (i.e. simply don't include the confounder in the study)
  - RANDOM ALLOCATION of subjects to study groups to attempt to even out unknown confounders
  - MATCHING subjects on a potential confounder, thus assuring even distribution among study groups
62HOW TO CONTROL FOR CONFOUNDERS?
- IN DATA ANALYSIS
  - STRATIFIED ANALYSIS using the Mantel-Haenszel method to adjust for confounders
  - IMPLEMENT A MATCHED DESIGN after you have collected data (frequency or group)
  - RESTRICTION is still possible at the analysis stage but it means throwing away data
  - MODEL FITTING using regression techniques
63Effect of blinding on outcome of trials of
acupuncture for chronic back pain
Bandolier Bias Guide (www)
64WILL ROGERS' PHENOMENON
Assume that you are
tabulating survival for patients with a certain
type of tumour. You separately track survival of
patients whose cancer has metastasized and
survival of patients whose cancer remains
localized. As you would expect, average survival
is longer for the patients without metastases.
Now a fancier scanner becomes available, making
it possible to detect metastases earlier. What
happens to the survival of patients in the two
groups? The group of patients without
metastases is now smaller. The patients who are
removed from the group are those with small
metastases that could not have been detected
without the new technology. These patients tend
to die sooner than the patients without
detectable metastases. By taking away these
patients, the average survival of the patients
remaining in the "no metastases" group will
improve. What about the other group? The group
of patients with metastases is now larger. The
additional patients, however, are those with
small metastases. These patients tend to live
longer than patients with larger metastases. Thus
the average survival of all patients in the
"with-metastases" group will improve. Changing
the diagnostic method paradoxically increased the
average survival of both groups! This paradox is
called the Will Rogers' phenomenon after a quote
from the humorist Will Rogers ("When the Okies
left Oklahoma and moved to California, they raised
the average intelligence in both states").
(www)
See also Feinstein et al, 1985 (www)
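The stage-migration arithmetic behind the Will Rogers' phenomenon can be checked with a toy example. The survival times below (in months) are invented for illustration; the two patients moved between groups stand for those whose small metastases only the newer scanner would detect.

```python
from statistics import mean

# Hypothetical survival times in months, grouped by the OLD diagnostic method.
no_metastases = [60, 55, 50, 30, 28]
with_metastases = [20, 15, 10, 5]

# Patients whose small metastases are only found by the new scanner.
migrated = [30, 28]

# Grouping under the NEW diagnostic method.
new_no_metastases = [t for t in no_metastases if t not in migrated]
new_with_metastases = with_metastases + migrated

print(f"no metastases:   {mean(no_metastases):.1f} -> {mean(new_no_metastases):.1f}")
print(f"with metastases: {mean(with_metastases):.1f} -> {mean(new_with_metastases):.1f}")
print(
    f"all patients:    {mean(no_metastases + with_metastases):.1f} -> "
    f"{mean(new_no_metastases + new_with_metastases):.1f}"
)
```

Average survival rises in both groups (44.6 to 55.0 months and 12.5 to 18.0 months with these numbers), although no individual patient lives any longer and the overall average is unchanged.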
65Cause-and-Effect Relationship
Grimes & Schulz, 2002 (www) (PDF)
66http://www.dorak.info
67M. Tevfik DORAK
Paediatric & Lifecourse Epidemiology Research Group
School of Clinical Medical Sciences (Child Health)
Newcastle University, England, U.K.
http://www.dorak.info