Title: Biostatistics Research Outside of Academia
1 Biostatistics Research Outside of Academia
- The experience of one alum at the National Institutes of Health
- Stuart G. Baker, Sc.D.
2 Background
- Harvard-Biostatistics (1980-84, post-doc 1985)
- Special thanks to Steve Lagakos, Christine Waternaux, Milt Weinstein, Nan Laird, and Marvin Zelen
- NIH-Cancer Prevention-Biometry (1985-present)
- Special thanks to Peter Greenwald, David Byar, Laurence Freedman, and Philip Prorok
3 Unusual co-authors
- My wife (Karen Lindeman, M.D., Dept of Anesthesiology, Johns Hopkins)
- Members of my wife's department
- My elementary school best friend, whom I had not seen in 20 years until the ENAR meeting in Memphis (Paul Pinsky, Ph.D., now at NCI)
- The editor of the Journal of the National Cancer Institute (Barry Kramer, M.D.)
- Three Nordic researcher pen pals
4 Research projects
5 Supporting general initiatives
6 Figuring out the question
- Need to make sense of all the anticipated data from early detection biomarkers
- Generally agreed that lots of computing will be needed; NASA called in
- "Computers are useless. They only give answers" - Pablo Picasso
7 Key question: Which biomarkers, if any, are promising for further study as triggers for early intervention?
- Baker (1998); Baker, Srivastava, Kramer (2002)
- Study design
- cohort study with stored specimens
- test for marker in all cases and some controls
- sample size
- Estimation
- false and true positive rates (target values)
- avoid overfitting
8 Harried Sorter and the Curse of Dimensionality
[Figure: panels for Controls (false positive rate), Cancers (true positive rate), and ratio of true to false positive rate, by levels of markers A and B]
- What regions (of A and B) optimize the ROC curve?
- No need to check 2^9 regions
- Select regions by ratio of true to false positive rate
9 Harried Sorter and the Curse of Dimensionality
[Figure: same panels as slide 8]
- Positive if A3, B3
10 Harried Sorter and the Curse of Dimensionality
[Figure: same panels as slide 8]
- Positive if A3, B3
- or if A3, B2
11 Harried Sorter and the Curse of Dimensionality
[Figure: same panels as slide 8]
- Positive if A3, B3
- or if A3, B2
- or if A2, B2
12 Harried Sorter and the Curse of Dimensionality
[Figure: same panels as slide 8]
- Positive if A3, B3
- or if A3, B2
- or if A2, B2
- or if A3, B2
- or if A2, B3
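The region selection on these slides can be sketched in Python. This is a minimal illustration, not the procedure from Baker (1998): the 3 x 3 grid of counts is hypothetical, and the name roc_from_cells is an assumption made for this sketch. Cells are ranked by the ratio of true to false positive rate and added to the positive region one at a time, which traces out an ROC curve without checking every subset of cells.

```python
from fractions import Fraction

# Hypothetical counts per (A level, B level) cell; not data from the talk.
CASES = {(3, 3): 30, (3, 2): 20, (2, 2): 15, (2, 3): 10, (3, 1): 8,
         (1, 3): 6, (2, 1): 5, (1, 2): 4, (1, 1): 2}
CONTROLS = {(3, 3): 2, (3, 2): 4, (2, 2): 6, (2, 3): 8, (3, 1): 8,
            (1, 3): 10, (2, 1): 12, (1, 2): 20, (1, 1): 30}


def roc_from_cells(cases, controls):
    """Add cells to the positive region in decreasing order of TPR/FPR ratio."""
    n_cases = sum(cases.values())
    n_controls = sum(controls.values())
    ranked = []
    for cell, count in cases.items():
        tpr = Fraction(count, n_cases)                     # cell's share of cancers
        fpr = Fraction(controls.get(cell, 0), n_controls)  # cell's share of controls
        ratio = float("inf") if fpr == 0 else tpr / fpr
        ranked.append((ratio, cell, tpr, fpr))
    ranked.sort(key=lambda item: item[0], reverse=True)

    order, points = [], [(Fraction(0), Fraction(0))]
    for _, cell, tpr, fpr in ranked:
        last_fpr, last_tpr = points[-1]
        order.append(cell)
        points.append((last_fpr + fpr, last_tpr + tpr))    # cumulative (FPR, TPR)
    return order, points


order, points = roc_from_cells(CASES, CONTROLS)
for cell, (fpr, tpr) in zip(order, points[1:]):
    print(f"add A{cell[0]}, B{cell[1]}: FPR={float(fpr):.2f}, TPR={float(tpr):.2f}")
```

With these made-up counts the first three cells added are (A3, B3), (A3, B2), and (A2, B2). Nine cells need only one sort, rather than a search over all 2^9 candidate regions.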
13 Clarifying the issues
- Validation of surrogate endpoints
- Muddled clinical view
- "Validation is one of those words ... that is constantly used and seldom defined" - Alvan Feinstein
- Confusing statistics literature
- Day and Duffy / Begg and Leung paradoxes
- Prentice criterion for estimation
- Baker, Izmirlian, and Kipnis (submitted)
14 Again, need the question
- Question: Is inference about treatment effect likely to be the same when using a potential surrogate endpoint as when using a true endpoint?
- "A correlate does not a surrogate make" - Fleming and DeMets (1996)
- "A perfect correlate does not a surrogate make" - Baker and Kramer (2003)
- Graphical clarification
15 Perfect correlation (lines), Prentice Criterion (lines coincide)
- Yields valid hypothesis testing
[Figure: true endpoint versus surrogate endpoint, with lines for the study group and the control group]
16 Perfect correlation, No Prentice Criterion (lines differ)
- Hypothesis testing not valid
- Valid estimation (if estimate lines from previous study)
[Figure: true endpoint versus surrogate endpoint, with separate lines for the study group and the control group]
17 Cancer screening - not your usual randomized trial
- Simple adjustment for dilution after screening stopped (Baker et al. 2003)
[Figure: reduction in breast cancer mortality per 10,000 due to receipt of screening (axis from 0 to 40), data from HIP trial; estimates for 5-, 10-, and 15-year follow-up and adaptive]
18 Thinking out of the box
19 Missing, missing everywhere, with finesse to spare: potential outcomes
- Baker and Lindeman (1994)
- paired availability design for combining data from multiple before-and-after studies
- effect of epidural analgesia on probability of Cesarean section
- before period: epidural less available
- after period: epidural more available
20 Thought experiment
21 Thought experiment
22 Thought experiment
23 Thought experiment
- Estimate effect of receipt of epidural in NE
- Similar model later independently proposed by Angrist, Imbens, Rubin (1996) for randomized trials
24 [Diagram: groups NN, NE, and EE by period]
- before: no epidural for NN and NE; epidural for EE
- after: no epidural for NN; epidural for NE and EE
- group probabilities: pr(NN), pr(NE), pr(EE)
- pr(epidural after) - pr(epidural before) = pr(NE)
25 [Diagram: groups NN, NE, and EE by period, as on slide 24]
- before: no epidural for NN and NE; epidural for EE
- after: no epidural for NN; epidural for NE and EE
- group probabilities: pr(NN), pr(NE), pr(EE)
26 [Diagram: probability of Cesarean section (CS) by period]
- pr(CS before) = pr(CS | NN, no epidural) x pr(NN) + pr(CS | NE, no epidural) x pr(NE) + pr(CS | EE, epidural) x pr(EE)
- pr(CS after) = pr(CS | NN, no epidural) x pr(NN) + pr(CS | NE, epidural) x pr(NE) + pr(CS | EE, epidural) x pr(EE)
- pr(CS | NE, epidural) - pr(CS | NE, no epidural) = [pr(CS after) - pr(CS before)] / [pr(epidural after) - pr(epidural before)]
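As a small numerical sketch of this estimate for a single before-and-after study: the effect of receipt of epidural in group NE is the change in the probability of Cesarean section divided by the change in the probability of receiving an epidural. The function name effect_in_ne and the proportions below are hypothetical, chosen only to show the arithmetic. Combining several studies, as the paired availability design does, is not shown.

```python
def effect_in_ne(cs_before, cs_after, epidural_before, epidural_after):
    """Estimate pr(CS | NE, epidural) - pr(CS | NE, no epidural)."""
    change_in_receipt = epidural_after - epidural_before  # estimates pr(NE)
    if change_in_receipt <= 0:
        raise ValueError("epidural receipt must increase from before to after")
    return (cs_after - cs_before) / change_in_receipt


# Hypothetical proportions, not data from any study.
estimate = effect_in_ne(cs_before=0.10, cs_after=0.12,
                        epidural_before=0.20, epidural_after=0.60)
print(round(estimate, 3))
```

Here a rise of 0.02 in Cesarean sections, alongside a rise of 0.40 in epidural receipt, gives an estimated effect of 0.05 in group NE.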
27 Applications of potential outcomes
- Paired availability design
- results compared with randomized trials, multivariate adjustment for observational data (Baker and Lindeman 2000)
- cancer screening (Baker et al., in press)
- Non-compliance and survival
- screening trial with refusers (Baker 1998)
- Non-compliance and auxiliary variable
- missing by design (no biopsy if low PSA) (Baker 2000)
28 Computational Necessity
29 Regression models with missing categorical data
- Computations to extend thesis to discrete survival, diagnostic testing, longitudinal data, case-control with haplotypes
- User specifies matrices and link functions
- Program generates EM, Newton-Raphson
- "To paraphrase the maxim that amateurs discuss strategy and professionals discuss logistics, users discuss models and developers discuss computation" - from Baker (1994)
30 Generality leads to simplicity
- How to maximize a multinomial likelihood with parameters that are ratios of functions to summations of functions?
- Poisson likelihood with an extra parameter for each multinomial constraint
- MP (Multinomial-Poisson) transformation
- "Don't prove anything unless you know it is true" - Steve Piantadosi, Johns Hopkins
31 Perfect-fit paradigm
- Models for missing, survival, non-compliance multinomial data
- Goal: Minimum assumptions
- Perfect-fit closed-form solutions
- Simple asymptotic variance
- Delta method, M-P transformation
- symbolically computed derivatives
32 Computational Serendipity
33 Models to evaluate cancer screening
- Trying to develop a model with as few underlying parameters as possible
- "It makes no sense to convey a beguiling sense of reality with irrelevant detail, when other equally important factors can only be guessed at" - Robert May
- Weak links of cancer screening models
- effect of screening on cancer mortality
- self-selection bias
34 Simulation Surprise
- Rate of cancer detection in absence of screening is identifiable using only data from subjects screened
- Reduces self-selection bias (no birth cohort effect, progressive detection)
- Baker and Chu (1990), simplified in Baker (1998) and Baker et al. (2003)
- "Make everything as simple as possible and not simpler" - Albert Einstein
35 Estimating detection rate if no screening using only data from subjects who were screened
[Figure: timelines by age for a first screen at age 50 and a first screen at age 51, showing the pre-clinical phase]
- F50 = pr(detected at first screen at age 50)
- I50 = pr(detected in interval at age 50)
- S50 = pr(detected at second screen after age 50)
- F51 = pr(detected at first screen at age 51)
- N50 = pr(detected with no screen at age 50)
- KEY EQUATION: N50 = F50 + I50 + S50 - F51
36 Analyzing data
37 Observer agreement with replicates
- Baker, Freedman, Parmar (1993)
- Goal: estimate within- and between-observer variation in pathology classification
- Data: repeated classifications by pathologists
- Method: novel latent class model
38 Double sampling survival data
- Baker, Wax, and Patterson (1993)
- Goal: Estimate the effect of a drain on the probability of a wound infection
- Data
- Full follow-up after hospital discharge
- Partial follow-up censored at discharge
- Model: informative censoring
39 Non-ignorable missing survey data
- Baker, Ko, and Graubard (2002)
- Data: large health survey
- Goal: Estimate the effect of balance on the probability of depression
- Method
- Missing in depression depends on depression and covariates
- Matrix variance formula with complex survey design
40 Improving understanding
41 Seeing is understanding
- Simpson's Paradox: "Good for men, good for women, bad for people" - Tom Louis
- BK-Plot (Baker, Kramer 2002), but independently developed earlier
- Stigler's Law of Eponymy: "no invention or discovery is ever named after the right person" - Howard Wainer (Chance magazine), who coined the name BK-Plot
42 Good for Boys, Good for Girls, Bad for Children (for 3rd grade math class)
43 BK-Plot
[Figure: BK-Plot of the fraction that grow a lot against the fraction that are girls (0 to 1), with lines for milk and no milk]
- girls: 9/10 if milk, 8/10 if no milk
- boys: 3/10 if milk, 2/10 if no milk
- fraction that are girls: 1/3 if milk, 2/3 if no milk
- both: 5/10 if milk, 6/10 if no milk
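The arithmetic behind this plot can be checked with a short Python sketch. Each line on a BK-Plot is a weighted average of the girls' and boys' fractions, weighted by the fraction that are girls. The function name overall_fraction is an assumption made for this sketch; the numbers are the ones on the slide.

```python
from fractions import Fraction as F


def overall_fraction(frac_girls, rate_girls, rate_boys):
    """Weighted average of the girls' and boys' rates: one point on a BK-Plot line."""
    return frac_girls * rate_girls + (1 - frac_girls) * rate_boys


# Milk group: 1/3 are girls. No-milk group: 2/3 are girls.
milk = overall_fraction(F(1, 3), rate_girls=F(9, 10), rate_boys=F(3, 10))
no_milk = overall_fraction(F(2, 3), rate_girls=F(8, 10), rate_boys=F(2, 10))
print(milk, no_milk)  # 1/2 3/5
```

Milk is better within each sex (9/10 versus 8/10 for girls, 3/10 versus 2/10 for boys), yet the combined fraction is 5/10 with milk and 6/10 without, because the no-milk group contains more girls.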
44 A casual remark on causality
- A bests B in one randomized trial
- B bests C in another randomized trial
- Does A best C?
- Apply BK-Plot (Baker and Kramer, 2003)
- "It's odd that logical acuity often reveals hidden ambiguities" - John Allen Paulos, mathematician
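45 Two Randomized Trials / One Randomized Trial
[Figure: two BK-Plot panels, "Two Randomized Trials" and "One Randomized Trial", each plotting benefit against percent with confounder (tick labels 10 and 30) with lines for A, B, and C; annotations: A bests B, B bests C, C bests A]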
46 Meta-analysis - binary outcomes
- Compare
- risk difference
- relative risk
- odds ratio
- When the distribution of an unobserved binary
variable (with no treatment interaction) differs
across studies
47 Risk Difference
- Constant risk difference
[Figure: probability (0 to 1) versus fraction with unobserved binary variable (0 to 1)]
48 Relative Risk
- Constant relative risk
[Figure: probability (0 to 1) versus fraction with unobserved binary variable (0 to 1)]
49 Odds Ratio
- NOT constant odds ratio
[Figure: probability (0 to 1) versus fraction with unobserved binary variable (0 to 1)]
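The contrast among the three measures can be illustrated with a small Python sketch. The risks below are hypothetical, and "no treatment interaction" is taken here to mean that the effect on the given scale is the same with and without the unobserved variable. The names marginal and summarize are assumptions made for this sketch. For each scale, the marginal measure is computed as the fraction with the variable changes across studies.

```python
from fractions import Fraction as F


def marginal(risk_without, risk_with, fraction_with):
    """Risk averaged over the unobserved binary variable."""
    return (1 - fraction_with) * risk_without + fraction_with * risk_with


def odds(p):
    return p / (1 - p)


FRACTIONS = [F(0), F(1, 4), F(1, 2), F(3, 4), F(1)]
CONTROL = (F(1, 10), F(3, 10))  # hypothetical control risks without / with the variable


def summarize(treated):
    """Marginal (risk difference, relative risk, odds ratio) at each fraction."""
    rows = []
    for f in FRACTIONS:
        p0 = marginal(*CONTROL, f)
        p1 = marginal(*treated, f)
        rows.append((p1 - p0, p1 / p0, odds(p1) / odds(p0)))
    return rows


rd_rows = summarize((F(2, 10), F(4, 10)))  # risk difference 1/10 in both strata
rr_rows = summarize((F(2, 10), F(6, 10)))  # relative risk 2 in both strata
or_rows = summarize((F(1, 4), F(9, 16)))   # odds ratio 3 in both strata

print([str(r[0]) for r in rd_rows])  # marginal risk differences
print([str(r[1]) for r in rr_rows])  # marginal relative risks
print([str(r[2]) for r in or_rows])  # marginal odds ratios
```

In this example the marginal risk difference stays at 1/10 and the marginal relative risk stays at 2 for every fraction. The marginal odds ratio equals 3 only when the fraction is 0 or 1 and drops to 52/19 at a fraction of 1/2.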
50 Missing binary outcomes in a randomized trial
- Baker and Freedman (2003)
- Missing due to an unobserved binary covariate with no interaction with treatment
- Sensitivity analysis
- one parameter instead of the usual two
- explicit use of randomization
51 Helping out
52 The FDA comes to NIH
- Design a study to compare performance of digital versus analog mammography
- Baker et al. (1998); Baker and Pinsky (2001)
- Modified paired design
- if analog negative, only a random sample get digital (reduces costs)
- partial area under ROC curves
53 Inspiration from an unlikely source: protocol review
- Estimate effect of tamoxifen on breast cancer in gene carriers (randomized trial)
- Protocol
- case-only design (observational study)
- estimate RR
- Baker and Kramer (submitted)
- nested case-control design
- estimate risk difference by age
54 It's not what the investigators want--it's why they can't get it
- Request
- Identify high-risk subjects for cancer prevention trial
- increase power
- But more complicated
- Power increases if RR is the same in low- and high-risk subjects
- Power decreases if risk difference is the same
- Baker, Kramer, Corle (submitted)
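A rough numerical sketch of the point about power, using a textbook normal approximation for comparing two proportions rather than the method of Baker, Kramer, Corle. The risks, the sample size, and the name approx_power are all hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Normal-approximation power of a two-sided test comparing two proportions."""
    se = sqrt((p_control * (1 - p_control) + p_treated * (1 - p_treated)) / n_per_arm)
    z_effect = abs(p_control - p_treated) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_crit)


N = 2000  # hypothetical subjects per arm
low_risk = approx_power(0.02, 0.01, N)      # low risk: RR 0.5, risk difference 0.01
high_same_rr = approx_power(0.10, 0.05, N)  # high risk, same RR of 0.5
high_same_rd = approx_power(0.10, 0.09, N)  # high risk, same risk difference of 0.01

print(f"low risk: {low_risk:.2f}")
print(f"high risk, same RR: {high_same_rr:.2f}")
print(f"high risk, same risk difference: {high_same_rd:.2f}")
```

With the same relative risk, the high-risk group gives a larger absolute difference and more power. With the same risk difference, the higher baseline risk only adds variance, so power falls.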
55 A simple request that led to an international collaboration
- Report on twin study
- Lichtenstein et al., NEJM, 2000
- genetics are a minor component of cancer
- Latent class model (based on genetics)
- Age-specific data from Lichtenstein et al.
- Results agreed with Lichtenstein et al.
- Baker, Lichtenstein, Kaprio, Holm (in press)
56 Final quotes
57 Take-home messages
- "Take chances, get messy, make mistakes" - The Magic School Bus, television show
- "Mathematics is the language of precise thinking" - R. Hamming, mathematician
- "In my experience, any really groundbreaking paper had difficulty being accepted, while the mundane sailed through" - Elias Zerhouni, NIH Director