Title: HSS4303B
1HSS4303B Intro to Epidemiology
- Mar 8, 2010 Matched Studies
2Summary from Last Time
- Case control study design
- Sources of cases and controls
- Problems in selection of controls
- Practical and conceptual problems
- Matching
- Recall problems
- Limitation in recall
- Recall bias
- Multiple controls
- Type of case control studies
- Nested case control study
- Prevalence study
3Summary of related studies
Table 10-10. Finding Your Way in the Terminology Jungle
Case-control study = Retrospective study
Cohort study = Longitudinal study = Prospective study
Prospective cohort study = Concurrent cohort study = Concurrent prospective study
Retrospective cohort study = Historical cohort study = Nonconcurrent prospective study
Randomized trial = Experimental study
Cross-sectional study = Prevalence survey
4Review of Cohort studies
5Design of cohort study (1)
6Design of cohort study (2)
7Prospective vs Retrospective Cohort
- Prospective study
- Identify a population and follow them prospectively until events develop
- Concurrent cohort
- Longitudinal study
8Cohort study prospective design
Pitfalls of the study
- Loss of subjects
- Loss of investigators
- Lifestyle changes in the subjects
9Prospective vs Retrospective Cohort
- Retrospective study
- Identify a population and observe the events as they occur and retrospectively determine their exposure status from historical records
- Nonconcurrent prospective study
- Historical cohort study
10Cohort study retrospective study
Pitfalls of the study
- Availability of records
- Quality of records
- Recall bias
11Prospective and retrospective studies
- The designs of prospective and retrospective studies are similar
- Exposed and unexposed populations are compared for the events
- Difference in time frame
- Prospective study: forward time frame
- Retrospective study: historical records for a similar period of time as the prospective study
12Prospective and retrospective studies
13Potential biases in cohort studies
- Bias in assessment of the outcome
- Information on exposure status biases outcome status
- Information bias
- Difference in available information for the exposed and unexposed
- Biases from non-response and losses to follow-up
- Attrition rate creates study power problems
- Analytic biases
- Subjectivity at the time of analyses
14Table 82. Comparison of the Attributes of Retrospective and Prospective Cohort Studies.
Attribute               Retrospective Approach      Prospective Approach
Information             Less complete and accurate  More complete and accurate
Discontinued exposures  Useful                      Not useful
Emerging new exposures  Not useful                  Useful
Expense                 Less costly                 More costly
Completion time         Shorter                     Longer
15Advantages and Disadvantages of Cohort Studies.
Advantages
- Direct calculation of risk ratio (relative risk)
- May yield information on the incidence of disease
- Clear temporal relationship between exposure and disease
- Particularly efficient for study of rare exposures
- Can yield information on multiple exposures
- Can yield information on multiple outcomes of a particular exposure
- Minimizes bias
- Strongest observational design for establishing cause and effect relationship
Disadvantages
- Time consuming
- Often requires a large sample size
- Expensive
- Not efficient for the study of rare diseases
- Losses to follow-up may diminish validity
- Changes over time in diagnostic methods may lead to biased results
16Review of Odds Ratios (Case-Control Study)
            Cases  Controls
Exposed     6      3
Nonexposed  4      7
Total       10     10
Compute odds ratio of this dataset
17Case control study of 10 unmatched subjects
summary
Figure 11-8 A case-control study of 10 cases and 10 unmatched controls.
            Cases  Controls
Exposed     6      3
Nonexposed  4      7
Total       10     10
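The odds ratio asked for above is the cross-product ratio of the 2 x 2 table. A minimal Python sketch, assuming the cells are labelled a (exposed cases), b (exposed controls), c (nonexposed cases) and d (nonexposed controls); the function name is illustrative and not part of the course material:

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for an unmatched 2 x 2 table.

    a = exposed cases,    b = exposed controls,
    c = nonexposed cases, d = nonexposed controls.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Table from the slide: 6 exposed cases, 3 exposed controls,
# 4 nonexposed cases, 7 nonexposed controls.
print(odds_ratio(6, 3, 4, 7))  # 3.5
```

An odds ratio above 1 means exposure is more common among the cases than among the controls.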
18But What if Data is Matched?
19Matched case control study
- Cases are matched with the controls on specific variables
- Cases and controls are analyzed in pairs rather than as individual subjects
1. Pairs in which both the case and the control were exposed
2. Pairs in which neither the case nor the control was exposed
3. Pairs in which the case was exposed but not the control
4. Pairs in which the control was exposed and not the case
Concordant pairs: types 1 and 2
Discordant pairs: types 3 and 4
20Data
Pair  Outcome Yes (cases)  Outcome No (controls)
1     Exposed              Exposed
2     Not exposed          Exposed
3     Not exposed          Exposed
4     Exposed              Not exposed
E.g., outcome: getting the runs (the cases) vs not getting the runs (controls)
exposure: did you attend the picnic and eat the egg salad?
confounder: lactose intolerance (?)
21Data
Assume this is an unmatched study. How does the
contingency table look?
22Data
                         Outcome (cases)  No outcome (controls)  Totals
Exposed (picnic)         2                3                      5
Not exposed (no picnic)  2                1                      3
Totals                   4                4                      8
Odds ratio = (2 x 1)/(3 x 2) = 0.33
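Collapsing the four picnic pairs into this unmatched table can be sketched in Python as follows. The representation of each pair as a (case exposed, control exposed) tuple and the function name are assumptions made for illustration:

```python
# Picnic data, one tuple per matched pair: (case exposed?, control exposed?)
pairs = [(True, True), (False, True), (False, True), (True, False)]


def unmatched_table(pairs):
    """Ignore the matching and count exposure separately in cases and controls."""
    n = len(pairs)
    cases_exposed = sum(1 for case, _ in pairs if case)
    controls_exposed = sum(1 for _, control in pairs if control)
    # (exposed cases, exposed controls, nonexposed cases, nonexposed controls)
    return cases_exposed, controls_exposed, n - cases_exposed, n - controls_exposed


a, b, c, d = unmatched_table(pairs)
print(a, b, c, d)                 # 2 3 2 1
print(round(a * d / (b * c), 2))  # 0.33
```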
23Data
Now let's organize the data, considering that it's
a matched study
24Data
                   Controls exposed  Controls not exposed
Cases exposed
Cases not exposed
25Data
                   Controls exposed  Controls not exposed
Cases exposed      1                 1
Cases not exposed  2                 0
26Concordant and discordant pairs
                  Control exposed  Control not exposed
Case exposed      a                b
Case not exposed  c                d
a = pairs in which both the case and the control were exposed
b = pairs in which the case was exposed but not the control
c = pairs in which the case was not exposed but the control was exposed
d = pairs in which neither the case nor the control was exposed
a and d pairs are concordant pairs
b and c pairs are discordant pairs
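The a, b, c, d tally defined above can be sketched in Python. This is a minimal illustration using the picnic pairs; the tuple encoding (case exposed, control exposed) and the function name are assumptions, not part of the course material:

```python
from collections import Counter

# Picnic data, one tuple per matched pair: (case exposed?, control exposed?)
pairs = [(True, True), (False, True), (False, True), (True, False)]


def matched_table(pairs):
    """Count the four kinds of matched pair."""
    counts = Counter(pairs)
    a = counts[(True, True)]    # both exposed (concordant)
    b = counts[(True, False)]   # only the case exposed (discordant)
    c = counts[(False, True)]   # only the control exposed (discordant)
    d = counts[(False, False)]  # neither exposed (concordant)
    return a, b, c, d


a, b, c, d = matched_table(pairs)
print(a, b, c, d)                                  # 1 1 2 0
print("concordant:", a + d, "discordant:", b + c)  # concordant: 1 discordant: 3
```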
27Data
                   Controls exposed  Controls not exposed
Cases exposed      1                 1
Cases not exposed  2                 0
Concordant pairs: 1 (both exposed) and 0 (neither exposed)
28Data
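                   Controls exposed  Controls not exposed
Cases exposed      1                 1
Cases not exposed  2                 0
Concordant pairs: 1 (both exposed) and 0 (neither exposed)
Discordant pairs: 1 (only the case exposed) and 2 (only the control exposed)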
29Individual matching (11)
- Echovirus meningitis outbreak, Germany, 2001
- Was swimming in pond A risk factor?
- Case control study with each case matched to one
control
[Matched-pairs table not recovered; label: Concordant pairs]
Source: A. Hauri, RKI Berlin
30Odds ratio for matched pairs
- Odds ratio for matched pairs is
- The ratio of the discordant pairs
- The ratio of the number of pairs in which the case was exposed and the control was not, to the number of pairs in which the control was exposed and the case was not exposed
- b / c
- The ratio of the number of pairs that support the hypothesis of an association to the number of pairs that negate the hypothesis of an association
31Matched cases and controls 2 x 2 table
                  Control exposed  Control not exposed
Case exposed      2                4
Case not exposed  1                3
Concordant pairs: 2 pairs (exposed and exposed) and 3 pairs (not exposed and not exposed)
Discordant pairs: 4 pairs (exposed and not exposed) and 1 pair (not exposed and exposed)
Odds ratio = b / c = 4 / 1 = 4
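The matched-pairs odds ratio uses only the discordant counts. A minimal Python sketch, with an illustrative function name and an added guard for the case c = 0, where the ratio is undefined:

```python
def matched_odds_ratio(b, c):
    """Odds ratio for matched pairs.

    b = pairs in which only the case was exposed,
    c = pairs in which only the control was exposed.
    """
    if c == 0:
        raise ValueError("matched odds ratio is undefined when c is zero")
    return b / c


print(matched_odds_ratio(4, 1))  # 4.0  (table above)
print(matched_odds_ratio(1, 2))  # 0.5  (picnic data: b = 1, c = 2)
```

The concordant counts a and d never enter the calculation, because pairs in which case and control share the same exposure carry no information about the association.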
32Picnic Data
Pair  Outcome Yes (cases)  Outcome No (controls)
1     Exposed              Exposed
2     Not exposed          Exposed
3     Not exposed          Exposed
4     Exposed              Not exposed
                   Controls exposed  Controls not exposed
Cases exposed      1                 1
Cases not exposed  2                 0
Matched odds ratio = b/c = 1/2 = 0.5
33Remember this example?
[Matched-pairs table not recovered; label: Concordant pairs]
Source: A. Hauri, RKI Berlin
34Individual matching (11) Analysis
- Echovirus meningitis outbreak, Germany, 2001
- Was swimming in pond A risk factor?
- Case control study with each case matched to one control
35What Else Can We Do With These Data?
- Remember the Chi-Square test?
36Chi-square
- Chi square is a non-parametric test of statistical significance for bivariate tabular analysis
- It lets you know the degree of confidence you can have in accepting or rejecting a hypothesis
- It provides information on whether or not two different samples are different enough in some characteristic or aspect of their behaviour
37Chi Square
- There are actually all sorts of chi-square tests out there
- Pearson's <- We'll be using this one
- Yates
- Mantel-Haenszel
- Portmanteau
- Fisher's Exact
38Also need to compute something called degrees of
freedom
39Chi-square calculation
            Variable 1
Variable 2  Data type 1  Data type 2  Totals
Category 1  a            b            a + b
Category 2  c            d            c + d
Total       a + c        b + d        a + b + c + d = N
$$\chi^2 = \frac{N(ad - bc)^2}{(a+b)(c+d)(b+d)(a+c)}$$
The degrees of freedom = (number of columns minus one) x (number of rows minus one), not counting the totals for rows or columns. For our data this gives (2-1) x (2-1) = 1.
40Chi-square calculations
Number of animals that survived the treatment
             Dead  Alive  Total
Treated      36    14     50
Not treated  30    25     55
Total        66    39     105
Odds ratio = (36 x 25)/(14 x 30) = 2.14
$$\chi^2 = \frac{105\,[(36)(25) - (14)(30)]^2}{(50)(55)(39)(66)} = 3.418$$
DOF = (2-1) x (2-1) = 1
Now what do we do with this?
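The chi-square and degrees-of-freedom arithmetic for the animal survival table can be checked with a short Python sketch. The function names are illustrative; the formula is the 2 x 2 shortcut N(ad - bc)^2 / [(a+b)(c+d)(b+d)(a+c)] from the previous slide:

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2 x 2 table with cells a, b (row 1) and c, d (row 2)."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (b + d) * (a + c)
    if denominator == 0:
        raise ValueError("chi-square is undefined when a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denominator


def degrees_of_freedom(n_rows, n_cols):
    """(rows - 1) x (columns - 1), not counting the totals."""
    return (n_rows - 1) * (n_cols - 1)


# Treated: 36 dead, 14 alive; not treated: 30 dead, 25 alive.
print(round(chi_square_2x2(36, 14, 30, 25), 3))  # 3.418
print(degrees_of_freedom(2, 2))                  # 1
```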
41Degrees of freedom and chi square table
Df  0.50   0.10   0.05    0.02    0.01    0.001
1   0.455  2.706  3.841   5.412   6.635   10.827
2   1.386  4.605  5.991   7.824   9.210   13.815
3   2.366  6.251  7.815   9.837   11.345  16.268
4   3.357  7.779  9.488   11.668  13.277  18.465
5   4.351  9.236  11.070  13.388  15.086  20.517
Using the chi-square table: with 1 degree of freedom, the corresponding
probability is 0.05 < P < 0.10. This is above the
conventionally accepted significance level of
0.05 (5%), so the null hypothesis that the two
distributions are the same is not rejected. In other
words, when the computed chi-square statistic exceeds the
critical value in the table for a 0.05
probability level, we can reject the null
hypothesis of equal distributions. Since our chi-square
statistic (3.418) did not exceed the critical
value for the 0.05 probability level (3.841), we
fail to reject the null hypothesis that the survival of
the animals is independent of drug treatment
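The table lookup can be mirrored in a few lines of Python. This is a sketch under two assumptions: the critical values are copied from the df = 1 row of the table above, and the exact tail probability uses the closed form erfc(sqrt(x/2)), which holds only for 1 degree of freedom. The names are illustrative:

```python
import math

# Critical values for 1 degree of freedom, copied from the table above.
CRITICAL_DF1 = {0.50: 0.455, 0.10: 2.706, 0.05: 3.841,
                0.02: 5.412, 0.01: 6.635, 0.001: 10.827}


def reject_null(chi2, alpha=0.05):
    """Reject the null hypothesis only if the statistic exceeds the critical value."""
    return chi2 > CRITICAL_DF1[alpha]


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


chi2 = 3.418
print(reject_null(chi2))                # False
print(reject_null(chi2, alpha=0.10))    # True
print(0.05 < p_value_df1(chi2) < 0.10)  # True
```

The statistic clears the 0.10 critical value but not the 0.05 one, which is exactly what the bracket 0.05 < P < 0.10 says.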
42p-value
- The p-value is the probability of drawing a sample at least as extreme as yours from the population being tested, given the assumption that the null hypothesis is true.
- A p-value of .05, for example, indicates that you would have only a 5% chance of drawing a sample at least as extreme as the one being tested if the null hypothesis were actually true.
- A p-value close to zero is strong evidence against the null hypothesis, and typically signals that a difference is very likely to exist.
- Large p-values closer to 1 imply that there is no detectable difference for the sample size used.
- A p-value of 0.05 is a typical threshold used to evaluate the null hypothesis.
43p-value
- So what does a p-value of 0.10 mean?
- We fail to reject the null hypothesis
44What is the null hyp that we are testing?
- In cohort studies, the chi-square test tells us whether to accept or reject the null hypothesis that RR = 1
- In case-control studies, the chi-square test tells us whether to accept or reject the null hypothesis that OR = 1
- Pearson's chi-square is NOT appropriate to test the null hypothesis of whether the matched study pairs are related
- For that we use something called McNemar's test, which we will not cover in this class
45Remember the Picnic Data
Pair  Outcome Yes (cases)  Outcome No (controls)
1     Exposed              Exposed
2     Not exposed          Exposed
3     Not exposed          Exposed
4     Exposed              Not exposed
                   Controls exposed  Controls not exposed
Cases exposed      1                 1
Cases not exposed  2                 0
Matched odds ratio = b/c = 1/2 = 0.5
Pretend it's unmatched and construct the
contingency table
46Data
Can you compute a chi-square for this?
                         Outcome (cases)  No outcome (controls)  Totals
Exposed (picnic)         2                3                      5
Not exposed (no picnic)  2                1                      3
Totals                   4                4                      8
Odds ratio = (2 x 1)/(3 x 2) = 0.33
47Caveat to Pearson's Chi Square
- Typically, does not work well if any cell has a count of < 5
- If it does, better off using Fisher's Exact Test or some other similar test
- We will not be doing that in this class
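The caveat can be turned into a simple check. A minimal Python sketch that follows the slide's wording (observed cell counts below 5); the function name is illustrative, and textbooks often state the same rule of thumb in terms of expected counts rather than observed ones:

```python
def has_small_cell(a, b, c, d, minimum=5):
    """True if any cell of the 2 x 2 table has a count below the minimum."""
    return min(a, b, c, d) < minimum


picnic = (2, 3, 2, 1)       # unmatched picnic table
animals = (36, 14, 30, 25)  # animal survival table

print(has_small_cell(*picnic))   # True
print(has_small_cell(*animals))  # False
```

By this rule the picnic table is too sparse for Pearson's chi-square, while the animal survival table is not.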
48Summary
$$\chi^2 = \frac{(ad - bc)^2\,(a+b+c+d)}{(a+b)(c+d)(b+d)(a+c)}$$
The degrees of freedom equal (number of columns
minus one) x (number of rows minus one), not
counting the totals for rows or columns.
49If you're lazy
- Lots of online OR, RR and chi-square calculators
- E.g., http://faculty.vassar.edu/lowry/odds2x2.html
50Homework
12 women with uterine cancer and 12 without were
asked if they'd ever used supplemental estrogen.
Each woman with cancer was matched by race,
weight and parity to a woman without cancer.
Pair  Women with cancer  Women without cancer
1     Estrogen user      Estrogen nonuser
2     Estrogen nonuser   Estrogen nonuser
3     Estrogen user      Estrogen user
4     Estrogen user      Estrogen user
5     Estrogen user      Estrogen nonuser
6     Estrogen nonuser   Estrogen nonuser
7     Estrogen user      Estrogen nonuser
8     Estrogen user      Estrogen nonuser
9     Estrogen nonuser   Estrogen user
10    Estrogen nonuser   Estrogen user
11    Estrogen user      Estrogen nonuser
12    Estrogen user      Estrogen nonuser
51Homework
- What is the estimated relative risk of cancer when analyzing this study as a matched-pairs study?
- What is the estimated relative risk of cancer when analyzing this study as an unmatched study?
- What is the chi square statistic of the (unmatched) relationship between cancer and estrogen intake?
- What is the null hypothesis being tested by the chi-square test?
- What does the p-value of the statistic tell you about whether to reject or accept the null hypothesis?