Title: Using Genomics in Clinical Trial Design
1. Using Genomics in Clinical Trial Design
- Richard Simon, D.Sc.
- Chief, Biometric Research Branch
- National Cancer Institute
- http://linus.nci.nih.gov
2. BRB Website: http://linus.nci.nih.gov/brb
- Powerpoint presentations and audio files
- Reprints & Technical Reports
- BRB-ArrayTools software
- BRB-ArrayTools Data Archive
- Sample Size Planning for Targeted Clinical Trials
3.
- Many cancer treatments benefit only a small proportion of the patients to whom they are administered
- Targeting treatment to the right patients can greatly improve the therapeutic ratio of benefit to adverse effects
  - Treated patients benefit
  - Treatment more cost-effective for society
5. Genomic Targeting
- Enables patients to be treated with drugs that actually work for them
- Avoids false negative trials for heterogeneous populations
- Avoids erroneous generalizations of conclusions from positive trials
6. Biomarkers
- Surrogate endpoints
  - A measurement made before and after treatment to determine whether the treatment is working
- Predictive classifiers
  - A measurement made before treatment to select good patient candidates for the treatment
7. Validation = Fit for Purpose
- FDA terminology of "valid biomarker" and "probable valid biomarker" is not applicable to predictive classifiers
- "Validation" has meaning only as fitness for purpose, and the purpose of predictive classifiers is completely different from that of surrogate endpoints
8.
- The purpose of a multi-gene predictive classifier is to predict
- It is often much easier to develop an accurate predictive classifier than to elucidate the role of the component genes in disease biology
9. New Drug Developmental Strategy (I)
- Develop a diagnostic classifier that identifies the patients likely to benefit from the new drug
- Develop a reproducible assay for the classifier
- Use the diagnostic to restrict eligibility to a prospectively planned evaluation of the new drug
- Demonstrate that the new drug is effective in the prospectively defined set of patients determined by the diagnostic
10. Develop Predictor of Response to New Drug
[Diagram: using phase II data, develop a predictor of response to the new drug. Patients predicted responsive are randomized to new drug or control; patients predicted non-responsive are off study.]
11. Applicability of Design I
- Primarily for settings where the classifier is based on a single gene whose protein product is a target of the drug
  - Herceptin
- With a substantial biological basis for the classifier, it will often be ethically unacceptable to expose classifier-negative patients to the new drug
12.
"We don't think that this drug will help you because your tumor is test negative. But we need to show the FDA that a drug we don't think will help test-negative patients actually doesn't."
13. Evaluating the Efficiency of Strategy (I)
- Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004.
- Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.
- Reprints and interactive sample size calculations at http://linus.nci.nih.gov/brb
16. Two Clinical Trial Designs
- Un-targeted design
  - Randomized comparison of T to C without screening for expression of molecular target
- Targeted design
  - Assay patients for expression of target
  - Randomize only patients expressing target
17.
- Efficiency relative to a trial of unselected patients depends on the proportion of patients who are test positive and on the effectiveness of the drug (compared to control) for test-negative patients
- When less than half of patients are test positive and the drug has little or no benefit for test-negative patients, the targeted design requires dramatically fewer randomized patients
18. No Treatment Benefit for Assay-Negative Patients: n_std / n_targeted
Proportion Assay Positive   Randomized   Screened
0.75                        1.78         1.33
0.5                         4            2
0.25                        16           4
19. Treatment Benefit for Assay-Negative Patients Half That of Assay-Positive Patients: n_std / n_targeted
Proportion Assay Positive   Randomized   Screened
0.75                        1.31         0.98
0.5                         1.78         0.89
0.25                        2.56         0.64
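
The ratios in the two tables above can be reproduced with a short calculation. This is a minimal sketch, assuming that required sample size scales with the inverse square of the average treatment effect and that the assay is perfect; the cited Simon and Maitournam papers give the exact treatment. The function and parameter names are illustrative, not taken from the talk.

```python
def randomized_ratio(prop_positive, neg_effect_fraction):
    """Approximate n_std / n_targeted, counting randomized patients.

    prop_positive: proportion of patients who are assay positive.
    neg_effect_fraction: treatment effect in assay-negative patients as a
        fraction of the effect in assay-positive patients (0 = no benefit).
    Assumption: sample size is proportional to 1 / (effect size)^2.
    """
    # Average effect in an unselected population, relative to positives.
    diluted_effect = prop_positive + (1 - prop_positive) * neg_effect_fraction
    return (1.0 / diluted_effect) ** 2


def screened_ratio(prop_positive, neg_effect_fraction):
    """Approximate n_std / (number screened for the targeted trial).

    Assumption: with a perfect assay, randomizing n patients in the
    targeted design requires screening n / prop_positive patients.
    """
    return prop_positive * randomized_ratio(prop_positive, neg_effect_fraction)


if __name__ == "__main__":
    for fraction in (0.0, 0.5):
        for prop in (0.75, 0.5, 0.25):
            print(fraction, prop,
                  round(randomized_ratio(prop, fraction), 2),
                  round(screened_ratio(prop, fraction), 2))
```

A screened ratio below 1 means the targeted design must screen more patients than the un-targeted design would randomize, even though it randomizes fewer.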
20. Trastuzumab
- Metastatic breast cancer
- 234 randomized patients per arm
- 90% power for 13.5% improvement in 1-year survival over 67% baseline at 2-sided .05 level
- If benefit were limited to the 25% assay-positive patients, overall improvement in survival would have been 3.375%
  - 4025 patients/arm would have been required
- If assay-negative patients benefited half as much, 627 patients per arm would have been required
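
The diluted overall improvement quoted above is a weighted average of the effects in assay-positive and assay-negative patients. The sketch below shows only that arithmetic; it does not attempt to reproduce the per-arm sample sizes, whose exact formula is not stated in the slides. The function name is illustrative.

```python
def diluted_improvement(effect_positive, prop_positive, neg_effect_fraction=0.0):
    """Average improvement (percentage points) over an unselected population.

    effect_positive: improvement in assay-positive patients.
    prop_positive: proportion of patients who are assay positive.
    neg_effect_fraction: effect in assay-negative patients as a fraction
        of the effect in assay-positive patients.
    """
    effect_negative = neg_effect_fraction * effect_positive
    return prop_positive * effect_positive + (1 - prop_positive) * effect_negative


if __name__ == "__main__":
    # Benefit limited to the 25% of patients who are assay positive.
    print(diluted_improvement(13.5, 0.25))
    # Assay-negative patients benefit half as much.
    print(diluted_improvement(13.5, 0.25, 0.5))
```

The first call gives the 3.375 percentage-point improvement quoted on the slide.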
21. Comparison of Targeted to Untargeted Design
Simon R, Development and Validation of Biomarker Classifiers for Treatment Selection, JSPI
Number of events required; traditional design shown by percent of patients marker positive
Hazard Ratio for Marker Positive Patients   Targeted Design   Traditional (20%)   Traditional (33%)   Traditional (50%)
0.5                                         74                2040                720                 316
22. Randomized Ratio (sensitivity = specificity = 0.9)
Express target   δ0 = 0   δ0 = δ1/2
0.75             1.29     1.26
0.5              1.8      1.6
0.25             3.0      1.96
0.1              25.0     1.86
23. Screened Ratio (sensitivity = specificity = 0.9)
Express target   δ0 = 0   δ0 = δ1/2
0.75             0.9      0.88
0.5              0.9      0.80
0.25             0.9      0.59
0.1              4.5      0.33
24. Web Based Software for Comparing Sample Size Requirements
- http://linus.nci.nih.gov/brb/
25. Developmental Strategy (II)
26. Developmental Strategy (II)
- Do not use the diagnostic to restrict eligibility, but to structure a prospective analysis plan
- Having a prospective analysis plan is essential; stratifying (balancing) the randomization is not, except that stratification ensures that all randomized patients will have tissue available
- The purpose of the study is to evaluate the new treatment overall and for the pre-defined subsets, not to modify or refine the classifier
- The purpose is not to demonstrate that repeating the classifier development process on independent data results in the same classifier
27. Analysis Plan A
- Compare the new drug to the control for classifier-positive patients
  - If p > 0.05, make no claim of effectiveness
  - If p ≤ 0.05, claim effectiveness for the classifier-positive patients and
    - Compare new drug to control for classifier-negative patients using 0.05 threshold of significance
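
The sequential logic of Analysis Plan A can be written as a small decision function. This is an illustrative sketch only: the function name and return format are assumptions, and the two p-values are taken as given from whatever pre-specified tests the trial uses.

```python
def analysis_plan_a(p_positive, p_negative, alpha=0.05):
    """Return the list of subsets for which effectiveness may be claimed.

    p_positive: p-value for new drug vs control in classifier-positive patients.
    p_negative: p-value for new drug vs control in classifier-negative patients.
    The negative subset is tested only if the positive subset is significant.
    """
    claims = []
    if p_positive <= alpha:
        claims.append("classifier-positive")
        if p_negative <= alpha:
            claims.append("classifier-negative")
    return claims


if __name__ == "__main__":
    print(analysis_plan_a(0.01, 0.20))
    print(analysis_plan_a(0.20, 0.001))
```

Because the classifier-negative comparison is reached only after the classifier-positive comparison succeeds, a small p-value in the negative subset alone leads to no claim.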
28. Sample Size for Analysis Plan A
- 88 events in classifier-positive patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power
- If test is predictive but not prognostic, and if 25% of patients are positive, then when there are 88 events in positive patients there will be about 264 events in negative patients
  - 264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level
- Sequential futility monitoring may have enabled early cessation of accrual of classifier-negative patients
  - Not much earlier with time-to-event endpoint
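
The event counts above are consistent with a commonly used approximation for a two-arm comparison of hazards with 1:1 randomization, events ≈ 4 (z_{1-α/2} + z_{power})² / (ln HR)². The slides do not state which formula was used, so the sketch below is an assumption that happens to give values close to those quoted; the function names are illustrative.

```python
from math import log
from statistics import NormalDist


def events_required(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate number of events for a two-sided test of a hazard ratio.

    Assumption: 1:1 randomization and the approximation
    events = 4 * (z_{1-alpha/2} + z_{power})^2 / (ln HR)^2.
    """
    z = NormalDist().inv_cdf
    return 4 * (z(1 - alpha / 2) + z(power)) ** 2 / log(hazard_ratio) ** 2


def events_in_negatives(events_positive, prop_positive):
    """Expected events in test-negative patients when the test is not
    prognostic, so events accrue in proportion to subset size."""
    return events_positive * (1 - prop_positive) / prop_positive


if __name__ == "__main__":
    print(events_required(0.50))           # 50% reduction in hazard
    print(events_required(0.67))           # 33% reduction in hazard
    print(events_in_negatives(88, 0.25))   # events in negatives
```

Under this approximation a 50% reduction needs a little under 88 events and a 33% reduction needs about 262, in line with the figures on the slide.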
29.
- Study-wise false positivity rate is limited to 5% with Analysis Plan A
- It is not necessary or appropriate to require that the treatment vs control difference be significant overall before doing the analysis within subsets
30. Analysis Plan B
- Compare the new drug to the control overall for all patients, ignoring the classifier
  - If p_overall ≤ 0.03, claim effectiveness for the eligible population as a whole
- Otherwise perform a single subset analysis evaluating the new drug in the classifier-positive patients
  - If p_subset ≤ 0.02, claim effectiveness for the classifier-positive patients
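
Analysis Plan B splits the overall 5% significance level between the overall test (0.03) and a single fallback subset test (0.02). A minimal sketch of that decision rule follows; the function name and the returned strings are illustrative assumptions.

```python
def analysis_plan_b(p_overall, p_subset, alpha_overall=0.03, alpha_subset=0.02):
    """Decision rule for Analysis Plan B.

    p_overall: p-value for new drug vs control in all patients.
    p_subset: p-value for new drug vs control in classifier-positive patients.
    The subset test is consulted only when the overall test is not significant.
    """
    if p_overall <= alpha_overall:
        return "effective for the eligible population as a whole"
    if p_subset <= alpha_subset:
        return "effective for classifier-positive patients"
    return "no claim"


if __name__ == "__main__":
    print(analysis_plan_b(0.02, 0.50))
    print(analysis_plan_b(0.04, 0.01))
    print(analysis_plan_b(0.04, 0.03))
```

Note that an overall p-value of 0.04 would have been significant at the conventional 5% level but supports no overall claim here; that is the price of reserving 0.02 for the subset test.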
31.
- This analysis strategy is designed not to penalize sponsors for having developed a classifier
- It provides sponsors with an incentive to develop genomic classifiers
32. Sample Size for Analysis Plan B
- To have 90% power for detecting uniform 33% reduction in overall hazard at 3% two-sided level requires 297 events (instead of 263 for similar power at 5% level)
- If test is predictive but not prognostic, and if 25% of patients are positive, then when there are 297 total events there will be approximately 75 events in positive patients
  - 75 events provides 75% power for detecting 50% reduction in hazard at 2% two-sided significance level
  - By delaying evaluation in test-positive patients, 80% power is achieved with 84 events and 90% power with 109 events
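
The power figures for the subset test can be checked by inverting the usual events approximation: power ≈ Φ( (√events / 2) · |ln HR| − z_{1-α/2} ). As before, this formula is an assumption (the slides do not say how the numbers were derived), and it assumes 1:1 randomization; the function name is illustrative.

```python
from math import log, sqrt
from statistics import NormalDist


def power_from_events(events, hazard_ratio, alpha):
    """Approximate power of a two-sided level-alpha test of a hazard ratio.

    Assumption: 1:1 randomization, so the log hazard ratio estimate has
    standard error of about 2 / sqrt(events).
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)
    return normal.cdf(sqrt(events) / 2 * abs(log(hazard_ratio)) - z_alpha)


if __name__ == "__main__":
    # Subset test at the 2% level for a 50% reduction in hazard.
    for events in (75, 84, 109):
        print(events, round(power_from_events(events, 0.5, 0.02), 2))
```

With 75, 84 and 109 events the approximation gives power close to the 75%, 80% and 90% quoted on the slide.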
33. Song & Chi Refinement of Testing Procedure for Plan B
- Specify ?1 < ? < ?1
- e.g. ?.025, ?1.02, ?1.10
- calculate ? .013
- Reject overall null hypothesis if
- Poverall ?1 or
- P ? and Poverall ?
- Reject null hypothesis in test positive subset if
- P ? and Poverall ?1
- e.g. ?.025, ?1.02, ?1.10, ? .013
34. Adaptively Modifying the Types of Patients Accrued (Wang, O'Neill, Hung)
- Plan RCT to accrue N total patients
- Interim futility analysis of test-negative patients
- If accrual of test-negative patients is terminated, replace them with test-positive patients to achieve the planned total N
  - May prolong duration of trial substantially
- Futility for test-negative patients declared only if efficacy for control group is superior to treatment group by specified amount
  - Limited opportunity to reduce number of test-negative patients
35. Analysis Plan C
- Test for interaction between treatment effect in test-positive patients and treatment effect in test-negative patients
- If interaction is significant at level α_int, then compare treatments separately for test-positive patients and test-negative patients
- Otherwise, compare treatments overall
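
Analysis Plan C can be sketched as a gatekeeping rule driven by the interaction test. The default thresholds below (interaction level 0.10 and 0.05 for the treatment comparisons) are assumptions chosen to match the simulation and sample-size settings elsewhere in the talk; the function name and return format are illustrative.

```python
def analysis_plan_c(p_interaction, p_positive, p_negative, p_overall,
                    alpha_int=0.10, alpha=0.05):
    """Decision rule for Analysis Plan C.

    If the treatment-by-classifier interaction is significant, report the
    two subset comparisons separately; otherwise report the overall one.
    Returns a dict mapping each reported comparison to whether it is
    significant at level alpha.
    """
    if p_interaction <= alpha_int:
        return {
            "test-positive": p_positive <= alpha,
            "test-negative": p_negative <= alpha,
        }
    return {"overall": p_overall <= alpha}


if __name__ == "__main__":
    print(analysis_plan_c(0.05, 0.01, 0.50, 0.20))
    print(analysis_plan_c(0.50, 0.01, 0.50, 0.03))
```

Only one branch is ever reported: a significant interaction leads to separate subset conclusions, and a non-significant interaction leads to a single overall conclusion.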
36. Sample Size Planning for Analysis Plan C
- 88 events in classifier-positive patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power
- If test is predictive but not prognostic, and if 25% of patients are positive, then when there are 88 events in positive patients there will be about 264 events in negative patients
  - 264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level
37. Simulation Results for Analysis Plan C
- Using α_int = 0.10, the interaction test has power 93.7% when there is a 50% reduction in hazard in test-positive patients and no treatment effect in test-negative patients
- A significant interaction and significant treatment effect in test-positive patients is obtained in 88% of cases under the above conditions
- If the treatment reduces hazard by 33% uniformly, the interaction test is negative and the overall test is significant in 87% of cases
38. The Roadmap
- Develop a completely specified genomic classifier of the patients likely to benefit from a new drug
- Establish reproducibility of measurement of the classifier
- Use the completely specified classifier to design and analyze a new clinical trial to evaluate effectiveness of the new treatment with a pre-defined analysis plan
39. Guiding Principle
- The data used to develop the classifier must be distinct from the data used to test hypotheses about treatment effect in subsets determined by the classifier
  - Developmental studies are exploratory
    - And not closely regulated by FDA
  - Studies on which treatment effectiveness claims are to be based should be definitive studies that test a treatment hypothesis in a patient population completely pre-specified by the classifier
40. Use of Archived Samples
- From a non-targeted negative clinical trial to develop a binary classifier of a subset thought to benefit from treatment
- Test that subset hypothesis in a separate clinical trial
  - Prospective targeted type I trial
  - Using archived specimens from a second previously conducted clinical trial
41. Development of Genomic Classifiers
- Single gene or protein based on knowledge of therapeutic target
- Empirically determined based on evaluation of a set of candidate classifiers
  - e.g. EGFR assays
- Empirically determined based on genome-wide correlation of gene expression or genotype with patient outcome after treatment
42. Development of Genomic Classifiers
- During phase II development, or
- After failed phase III trial using archived specimens
- Adaptively during early portion of phase III trial
43. Biomarker Adaptive Threshold Design
- Wenyu Jiang, Boris Freidlin & Richard Simon
- JNCI 99:1036-43, 2007
44. Biomarker Adaptive Threshold Design
- Randomized phase III trial comparing new treatment E to control C
- Survival or DFS endpoint
45. Biomarker Adaptive Threshold Design
- Have identified a predictive index B thought to be predictive of patients likely to benefit from E relative to C
- Eligibility not restricted by biomarker
- No threshold for biomarker determined
46. Analysis Plan
- S(b) = log likelihood ratio statistic for treatment versus control comparison in subset of patients with B ≥ b
- Compute S(b) for all possible threshold values
- Determine T = max S(b)
- Compute null distribution of T by permuting treatment labels
  - Permute the labels of which patients are in which treatment group
  - Re-analyze to determine T for permuted data
  - Repeat for 10,000 permutations
47.
- If the data value of T is significant at 0.05 level using the permutation null distribution of T, then reject null hypothesis that E is ineffective
- Compute point and bootstrap confidence interval estimates of the threshold b
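
The permutation procedure described above can be sketched in a few lines. To keep the example self-contained, this sketch swaps in a simpler statistic than the one in the talk: a squared standardized difference in means for a continuous outcome, in place of the log likelihood ratio statistic for a survival endpoint. It also assumes the subset is defined by biomarker values at or above the threshold. The function names and the `min_per_arm` guard are hypothetical additions for illustration.

```python
import random


def _mean_var(values):
    """Mean and population variance of a non-empty list."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values) / len(values)


def subset_statistic(outcome, treat, members, min_per_arm=5):
    """Squared z-type statistic for E (treat == 1) vs C (treat == 0)
    among the patients whose indices are in `members`.
    Stand-in for S(b); returns 0 if either arm is too small."""
    e = [outcome[i] for i in members if treat[i] == 1]
    c = [outcome[i] for i in members if treat[i] == 0]
    if len(e) < min_per_arm or len(c) < min_per_arm:
        return 0.0
    mean_e, var_e = _mean_var(e)
    mean_c, var_c = _mean_var(c)
    se2 = var_e / len(e) + var_c / len(c)
    return (mean_e - mean_c) ** 2 / se2 if se2 > 0 else 0.0


def max_statistic(outcome, treat, biomarker):
    """T = max over candidate thresholds b of the subset statistic
    computed in patients with biomarker >= b."""
    best = 0.0
    for b in sorted(set(biomarker)):
        members = [i for i, value in enumerate(biomarker) if value >= b]
        best = max(best, subset_statistic(outcome, treat, members))
    return best


def adaptive_threshold_test(outcome, treat, biomarker, n_perm=10000, seed=0):
    """Permutation p-value for T, obtained by shuffling treatment labels."""
    rng = random.Random(seed)
    observed = max_statistic(outcome, treat, biomarker)
    labels = list(treat)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if max_statistic(outcome, labels, biomarker) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)
```

Because the maximum over thresholds is recomputed for every permutation, the resulting p-value already accounts for having searched over all candidate thresholds.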
49.
Model               Hazard reduction for those who benefit   Overall Test Power   Adaptive Test Power
Everyone benefits   33%                                      .775                 .751
50% benefit         60%                                      .888                 .932
25% benefit         60%                                      .429                 .604
50. Adaptive Biomarker Threshold Design
- Sample size planning methods described by Jiang, Freidlin and Simon, JNCI 99:1036-43, 2007
51. Adaptive Signature Design: An adaptive design for generating and prospectively testing a gene expression signature for sensitive patients
- Boris Freidlin and Richard Simon
- Clinical Cancer Research 11:7872-8, 2005
52. Adaptive Signature Design: End of Trial Analysis
- Compare E to C for all patients at significance level 0.03
  - If overall H0 is rejected, then claim effectiveness of E for eligible patients
  - Otherwise
53.
- Otherwise
  - Using only the first half of patients accrued during the trial, develop a binary classifier that predicts the subset of patients most likely to benefit from the new treatment E compared to control C
  - Compare E to C for patients accrued in second stage who are predicted responsive to E based on classifier
    - Perform test at significance level 0.02
    - If H0 is rejected, claim effectiveness of E for subset defined by classifier
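
The end-of-trial analysis above can be outlined as a short control-flow sketch. The three callables (`overall_test`, `develop_classifier`, `subset_test`) are hypothetical placeholders for the trial's pre-specified statistical methods and are not part of the published design's software; `patients` is assumed to be a list ordered by accrual.

```python
def adaptive_signature_analysis(patients, overall_test, develop_classifier,
                                subset_test, alpha_overall=0.03,
                                alpha_subset=0.02):
    """Outline of the adaptive signature end-of-trial analysis.

    patients: list of patient records in order of accrual.
    overall_test(patients) -> p-value for E vs C in the given patients.
    develop_classifier(stage1) -> function mapping a patient to True if
        predicted sensitive to E.
    subset_test(patients) -> p-value for E vs C in the given patients.
    """
    if overall_test(patients) <= alpha_overall:
        return "effective for eligible patients"

    # The classifier is built on the first half only, so the second half
    # stays independent of classifier development.
    half = len(patients) // 2
    stage1, stage2 = patients[:half], patients[half:]
    classifier = develop_classifier(stage1)

    sensitive = [p for p in stage2 if classifier(p)]
    if sensitive and subset_test(sensitive) <= alpha_subset:
        return "effective for classifier-defined subset"
    return "no claim"
```

Keeping classifier development and subset testing on disjoint halves of the accrued patients is what allows the subset test to be interpreted at its nominal 0.02 level.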
54. Treatment effect restricted to subset. 10% of patients sensitive, 10 sensitivity genes, 10,000 genes, 400 patients.
Test                                                                                        Power (%)
Overall .05 level test                                                                      46.7
Overall .04 level test                                                                      43.1
Sensitive subset .01 level test (performed only when overall .04 level test is negative)    42.2
Overall adaptive signature design                                                           85.3
55. Overall treatment effect, no subset effect. 10% of patients sensitive, 10 sensitivity genes, 10,000 genes, 400 patients.
Test                                Power (%)
Overall .05 level test              74.2
Overall .04 level test              70.9
Sensitive subset .01 level test     1.0
Overall adaptive signature design   70.9
56. Conclusions
- New technology makes it increasingly feasible to identify which patients are likely or unlikely to benefit from a specified treatment
- Targeting treatment can greatly improve the therapeutic ratio of benefit to adverse effects
57. Conclusions
- Some of the conventional wisdom about how to develop predictive classifiers and how to use them in clinical trial design and analysis is flawed
- Prospectively specified analysis plans for phase III studies are essential to achieve reliable results
  - Biomarker analysis does not mean exploratory analysis except in developmental studies
- Prospective analysis of previously conducted trials can provide reliable conclusions
58. Conclusions
- Achieving the potential of new technology requires paradigm changes in correlative science and in some aspects of design and analysis of clinical trials
59. Collaborators
- Boris Freidlin
- Aboubakar Maitournam
- Kevin Dobbin
- Wenyu Jiang
- Yingdong Zhao
60. Using Genomic Classifiers In Clinical Trials
- Dupuy A and Simon R. Critical review of published microarray studies for clinical outcome and guidelines for statistical analysis and reporting. Journal of the National Cancer Institute 99:147-57, 2007.
- Dobbin K and Simon R. Sample size planning for developing classifiers using high dimensional DNA microarray data. Biostatistics 8:101-117, 2007.
- Dobbin K, Zhao Y and Simon R. How large a training set is needed to develop a classifier for microarray data? Clinical Cancer Research (In Press).
- Simon R. Development and validation of therapeutically relevant predictive classifiers using gene expression profiling. Journal of the National Cancer Institute 98:1169-71, 2006.
- Simon R. Validation of pharmacogenomic biomarker classifiers for treatment selection. Cancer Biomarkers 2:89-96, 2006.
- Simon R. Guidelines for the design of clinical studies for development and validation of therapeutically relevant biomarkers and biomarker classification systems. In Biomarkers in Breast Cancer, Hayes DF and Gasparini G, pp 3-15, Humana Press, 2006.
- Simon R. A checklist for evaluating reports of expression profiling for treatment selection. Clinical Advances in Hematology and Oncology 4:219-224, 2006.
- Simon R. Identification of pharmacogenomic biomarker classifiers in drug development. In Pharmacogenomics, Anti-cancer Drug Discovery and Response, F Innocenti (ed), Humana Press (In Press).
- Simon R. New challenges for 21st century clinical trials. Controlled Clinical Trials 4:167-169, 2007.
61. Using Genomic Classifiers In Clinical Trials
- Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004.
- Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.
- Simon R. When is a genomic classifier ready for prime time? Nature Clinical Practice Oncology 1:4-5, 2004.
- Simon R. An agenda for Clinical Trials: clinical trials in the genomic era. Clinical Trials 1:468-470, 2004.
- Simon R. Development and validation of therapeutically relevant multi-gene biomarker classifiers. Journal of the National Cancer Institute 97:866-867, 2005.
- Simon R. A roadmap for developing and validating therapeutically relevant genomic classifiers. Journal of Clinical Oncology 23:7332-41, 2005.
- Freidlin B and Simon R. Adaptive signature design. Clinical Cancer Research 11:7872-78, 2005.
- Simon R and Wang SJ. Use of genomic signatures in therapeutics development in oncology and other diseases. The Pharmacogenomics Journal 6:166-73, 2006.
- Trepicchio WL, Essayan D, Hall ST, Schechter G, Tezak Z, Wang SJ, Weinreich D, Simon R. Designing prospective clinical pharmacogenomic trials: Effective use of genomic biomarkers for use in clinical decision-making. The Pharmacogenomics Journal 6:89-94, 2006.
62. Using Genomic Classifiers In Clinical Trials
- Simon R. Challenges of microarray data and the evaluation of gene expression profile signatures. Cancer Investigation (In Press).
- Simon R. Lost in translation: Problems and pitfalls in translating laboratory observations to clinical utility. European Journal of Cancer (In Press).