Title: Observational Studies
1Observational Studies
David Madigan
Based on Rosenbaum, P.R. (2002). Observational Studies (2nd edition). Springer.
2Introduction
- An empirical study in which the objective is to elucidate cause-and-effect relationships, and in which it is not feasible to use controlled experimentation
- Examples
  - smoking and heart disease
  - vitamin C and cancer survival
  - DES and vaginal cancer
  - aspirin and mortality
  - cocaine and birthweight
  - diet and mortality
3Asthma Study
- Have data on 2,000 kids
- What is the effect of tobacco experimentation on
asthma?
5Cameron and Pauling Vitamin C
- Gave Vitamin C to 100 terminally ill cancer patients
- For each patient found 10 controls matched for age, gender, cancer site, and tumor type
- Vitamin C patients survived four times longer than controls
- Later randomized study found no effect of vitamin C
- Turns out the control group was formed from patients already dead
LESSONS
- observational studies are tricky
- randomized study is the gold standard
Why?
6Why does randomization work?
8- The two groups are comparable at baseline
- Could do a better job manually matching patients on the 18 characteristics listed, but no guarantees for other characteristics
- Randomization did a good job without being told what the 18 characteristics were
- Chance assignment could create some imbalances, but the statistical methods account for this properly
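The point about randomization balancing characteristics it was never told about can be illustrated with a small simulation. This is only a sketch with made-up data: it assumes a single unrecorded covariate drawn from a standard normal distribution, and the function name `baseline_gap` is hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def baseline_gap(n_units=10_000, seed=0):
    """Difference in the mean of an unrecorded covariate between two
    randomly formed groups of equal size (simulated data, illustration only)."""
    rng = random.Random(seed)
    # The covariate is never used when assigning treatment.
    covariate = [rng.gauss(0.0, 1.0) for _ in range(n_units)]
    # Chance assignment: half of the units are picked at random for treatment.
    treated_ids = set(rng.sample(range(n_units), n_units // 2))
    treated = [covariate[i] for i in treated_ids]
    control = [covariate[i] for i in range(n_units) if i not in treated_ids]
    return mean(treated) - mean(control)

gap = baseline_gap()
print(f"difference in covariate means: {gap:.4f}")
```

With groups this large the gap should be close to zero for any seed; smaller studies show larger chance imbalances, which the randomization-based test accounts for.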
9The Hypothesis of No Treatment Effect
- In a randomized experiment, can test this hypothesis essentially without making any assumptions at all
- "No effect" formally means that for each patient the outcome would have been the same regardless of treatment assignment
- Test statistic, e.g., proportion(D|TT) - proportion(D|PCI)
The six possible assignments of four patients (two to TT, two to PCI), with each patient's outcome held fixed:
Patient (outcome)   1 (D)   2 (D)   3 (L)   4 (L)
Assignment 1        PCI     PCI     TT      TT
Assignment 2        TT      TT      PCI     PCI
Assignment 3        TT      PCI     TT      PCI
Assignment 4        TT      PCI     PCI     TT
Assignment 5        PCI     TT      TT      PCI
Assignment 6        PCI     TT      PCI     TT
P = 1/6 (observed)
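The enumeration on this slide can be reproduced with a short exact randomization test. The sketch below is illustrative rather than the lecture's own code. It assumes four patients with outcomes D, D, L, L and two patients per arm, codes D as 1 and L as 0, and assumes (hypothetically) that the observed assignment put both D patients in the TT arm. The function name `randomization_p_value` is made up for this example.

```python
from fractions import Fraction
from itertools import combinations

def randomization_p_value(outcomes, tt_observed):
    """One-sided exact p-value for proportion(D|TT) - proportion(D|PCI).

    outcomes: list of 1 (D) or 0 (L), held fixed under the hypothesis of no effect.
    tt_observed: set of patient indices assigned to TT in the observed experiment.
    """
    n = len(outcomes)

    def statistic(tt):
        pci = [i for i in range(n) if i not in tt]
        prop_tt = Fraction(sum(outcomes[i] for i in tt), len(tt))
        prop_pci = Fraction(sum(outcomes[i] for i in pci), len(pci))
        return prop_tt - prop_pci

    observed = statistic(tt_observed)
    # Every way of choosing the TT group is equally likely under randomization.
    assignments = [set(c) for c in combinations(range(n), len(tt_observed))]
    as_extreme = sum(1 for a in assignments if statistic(a) >= observed)
    return Fraction(as_extreme, len(assignments))

outcomes = [1, 1, 0, 0]  # D, D, L, L
p = randomization_p_value(outcomes, {0, 1})  # assumed observed assignment
print(p)  # 1/6
```

Only the known assignment mechanism enters the calculation; no model for the outcomes is needed.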
10Estimates, etc.
- Note the probability distribution needed for the test is known, not assumed or modeled
- A randomized experiment provides an unbiased estimator of the average treatment effect
- Internal versus external validity
- Confidence intervals by inverting tests
- Partially ordered outcomes, censoring, multivariate outcomes, etc.
11Overt Bias in Observational Studies
- An observational study is biased if treatment and control groups differ prior to treatment in ways that matter for the outcome under study
- Overt bias: a bias that can be seen in the data
- Hidden bias: involves factors not in the data
- Can adjust for overt bias
12Overt Bias
- $x_j$: covariate vector for unit $j$
- $Z_j$: treatment (assume binary 0 or 1); $\pi_j = \Pr(Z_j = 1)$, unknown
- An OS is free of hidden bias if the $\pi_j$s are known to depend only on the $x_j$s (i.e., $\pi_j = \lambda(x_j)$ with $\lambda$ unknown), so two units with the same x have the same prob of getting the treatment
13Stratifying on x
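- Suppose we can group units into strata with identical xs. Then, absent hidden bias, all units in a stratum have the same $\pi$
- Conditional on the number of treated units in each stratum, all treatment assignments are equally likely, just like in a uniform randomized experiment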
14Stratifying on the Propensity Score
- Obviously exact matching is not always possible
- Idea: form strata comprising units with the same $\pi$s (i.e., could have $x_j \ne x_k$ but $\pi_j = \pi_k$)
- Problem: don't know the $\pi$s
- Solution: estimate them (logistic regression, SVM, decision tree, etc.)
- Form strata containing units with similar probability of treatment
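The last step, forming strata of units with similar estimated treatment probability and comparing outcomes within them, might be sketched as follows. This is a minimal illustration, not the method of any particular study: the propensity scores are assumed to have been estimated already, the data are invented, and the helper names `propensity_strata` and `stratified_difference` are hypothetical.

```python
def propensity_strata(scores, n_strata=5):
    """Assign each unit to one of n_strata groups of roughly equal size,
    ordered by estimated propensity score (stratum 0 holds the lowest scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * n_strata // len(scores)
    return strata

def stratified_difference(y, z, strata):
    """Size-weighted average of the within-stratum treated-minus-control
    difference in mean outcome. Strata missing either group are skipped."""
    total, weight = 0.0, 0
    for s in set(strata):
        treated = [y[i] for i in range(len(y)) if strata[i] == s and z[i] == 1]
        control = [y[i] for i in range(len(y)) if strata[i] == s and z[i] == 0]
        if treated and control:
            n_s = len(treated) + len(control)
            diff = sum(treated) / len(treated) - sum(control) / len(control)
            total += n_s * diff
            weight += n_s
    if weight == 0:
        raise ValueError("no stratum contains both treated and control units")
    return total / weight

# Invented data: units with high scores are treated more often and also
# have higher outcomes, so the raw comparison overstates the effect.
scores = [0.10, 0.15, 0.20, 0.25, 0.70, 0.75, 0.80, 0.85]
z = [0, 0, 1, 0, 1, 1, 0, 1]
y = [1, 1, 2, 1, 5, 5, 4, 5]
strata = propensity_strata(scores, n_strata=2)
print(strata)                               # [0, 0, 0, 0, 1, 1, 1, 1]
print(stratified_difference(y, z, strata))  # 1.0
```

In this toy data the unadjusted difference in means is 2.5, while the stratified comparison recovers the within-stratum difference of 1.0.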
15Matched Analysis
Using a model with 29
covariates to predict VHA use, we were able to
obtain an accuracy of 88 percent
(receiver-operating-characteristic curve, 0.88)
and to match 2265 (91.1 percent) of the VHA
patients to Medicare patients. Before matching,
16 of the 29 covariates had a standardized
difference larger than 10 percent, whereas after
matching, all standardized differences were less
than 5 percent
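The quoted passage does not say how the standardized difference was computed. One common definition, used here purely as an assumption, is the difference in group means expressed as a percentage of the pooled standard deviation. The sketch below uses invented ages, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def standardized_difference(treated, control):
    """Difference in means as a percentage of the pooled standard deviation.

    Assumed definition: 100 * (mean_t - mean_c) / sqrt((var_t + var_c) / 2),
    with sample variances. The quoted study may have used a variant.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return 100 * (mean(treated) - mean(control)) / pooled_sd

# Hypothetical ages in two small groups, for illustration only.
vha_age = [62, 70, 75, 81]
medicare_age = [64, 71, 77, 84]
print(round(standardized_difference(vha_age, medicare_age), 1))
```

A balance check of this kind would be repeated for each covariate, before and after matching.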
16Conclusions
VHA patients had more coexisting
conditions than Medicare patients. Nevertheless,
we found no significant difference in mortality
between VHA and Medicare patients, a result that
suggests a similar quality of care for acute
myocardial infarction.
19What about hidden bias?
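- Sensitivity analysis!
- Consider two units j and k with the same x. Hidden bias $\Rightarrow$ they may not have the same $\pi$
- Consider this inequality:
$$\frac{1}{\Gamma} \le \frac{\pi_j (1 - \pi_k)}{\pi_k (1 - \pi_j)} \le \Gamma$$
- Sensitivity analysis will consider various $\Gamma$s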
20An equivalent latent variable model
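$$\log \frac{\pi_j}{1 - \pi_j} = \kappa(x_j) + \gamma u_j, \qquad 0 \le u_j \le 1, \quad \gamma \ge 0$$
- for two units j and k with the same x:
$$\frac{\pi_j (1 - \pi_k)}{\pi_k (1 - \pi_j)} = \exp\{\gamma (u_j - u_k)\}$$
with $u_j - u_k$ between -1 and 1, so the model implies the previous inequality with $\Gamma = e^{\gamma}$ (the implication goes the other way too)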
21Matched Pairs
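- Strata of size 2: one unit gets the treatment, one doesn't
- If $\gamma = 0$ (i.e., $\Gamma = 1$), every unit has the same chance of treatment
- Standard test statistic for matched pairs is the Wilcoxon signed rank statistic:
$$T = \sum_{s:\ \text{treated} > \text{control}} d_s$$
where $d_s$ is the rank of the absolute treated-minus-control difference in pair $s$, so $T$ is the sum of the ranks for pairs in which treated unit > control unit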
22More on Matched Pairs
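- No hidden bias $\Rightarrow$ know the null distribution of $T$, because the $s$th pair contributes $d_s$ with prob ½ and 0 with prob ½
- With hidden bias, the $s$th pair contributes $d_s$ with an unknown prob $p_s$, which for fixed $\Gamma$ lies between $\frac{1}{1+\Gamma}$ and $\frac{\Gamma}{1+\Gamma}$, and zero with prob $1 - p_s$
- So the null distribution of $T$ is unknown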
23Even More on Matched Pairs
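- The P-value we are after is $\Pr(T \ge t_{\mathrm{obs}})$
- Lower bound on P-value: $\Pr(T^{-} \ge t_{\mathrm{obs}})$, where $T^{-}$ is the sum of $S$ quantities, the $s$th one being $d_s$ with prob $p^{-} = \frac{1}{1+\Gamma}$ and 0 otherwise
- Upper bound likewise, using $T^{+}$ with $p^{+} = \frac{\Gamma}{1+\Gamma}$
- This directly provides bounds on P-values for fixed $\Gamma$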
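The bounding distributions can be computed exactly for small studies. The sketch below is an illustration under stated assumptions: the ranks are integers with no ties, the per-pair probabilities for the two bounds are taken to be 1/(1+Γ) and Γ/(1+Γ), the parameter `gamma` stands for capital Γ (not the log-scale coefficient), and the function names and the ten-pair example are hypothetical.

```python
def tail_prob(ranks, p, t_obs):
    """Pr(T >= t_obs) when T is a sum of independent terms, the sth equal to
    ranks[s] with probability p and 0 otherwise (integer ranks assumed)."""
    dist = {0: 1.0}  # maps each attainable total to its probability
    for d in ranks:
        new = {}
        for total, prob in dist.items():
            new[total] = new.get(total, 0.0) + prob * (1 - p)
            new[total + d] = new.get(total + d, 0.0) + prob * p
        dist = new
    return sum(prob for total, prob in dist.items() if total >= t_obs)

def p_value_bounds(ranks, t_obs, gamma):
    """Lower and upper bounds on the one-sided P-value for a given Gamma >= 1."""
    p_minus = 1 / (1 + gamma)
    p_plus = gamma / (1 + gamma)
    return tail_prob(ranks, p_minus, t_obs), tail_prob(ranks, p_plus, t_obs)

# Hypothetical study: 10 pairs, treated response higher in every pair,
# so the observed statistic is the sum of all ranks.
ranks = list(range(1, 11))
t_obs = sum(ranks)
for gamma in (1, 2, 3):
    low, high = p_value_bounds(ranks, t_obs, gamma)
    print(f"Gamma={gamma}: {low:.5f} <= P <= {high:.5f}")
```

At Γ = 1 the two bounds coincide with the usual randomization P-value; as Γ grows the interval widens, and the study is sensitive to hidden bias at the Γ where the upper bound crosses the chosen significance level.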
24Smoking and Lung Cancer Example
- Hammond (1964) paired 36,975 heavy smokers to non-smokers. Matched on age, race, plus 16 other factors
$\Gamma$    Minimum P-value    Maximum P-value
1           < 0.0001           < 0.0001
2           < 0.0001           < 0.0001
3           < 0.0001           < 0.0001
4           < 0.0001           0.0036
5           < 0.0001           0.03
6           < 0.0001           0.1
25Asthma Study
- Need a $\Gamma$ of three to make the effect of tobacco experimentation on asthma become non-significant