Title: Statistical challenges in the validation of surrogate endpoints
1. Statistical challenges in the validation of surrogate endpoints
Marc Buyse
International Drug Development Institute (IDDI), Brussels
Limburgs Universitair Centrum, Diepenbeek, Belgium
marc.buyse_at_iddi.com
FDA Industry Workshop, September 22-23, 2004
2. Outline
- Need for surrogates
- Definitions
- Validation criteria
  - Single trial
  - Several trials (meta-analysis)
- Case studies
  - PSA and survival (advanced prostatic cancer)
  - 3-year DFS and 5-year OS (early colorectal cancer)
3. Why do we need surrogates?
- Practicality of studies
  - Shorter duration
  - Smaller sample size (?)
- Availability of biomarkers
  - Tissue, cellular, hormonal factors, etc.
  - Imaging techniques
  - Genomics, proteomics, other "-ics"
Ref: Schatzkin and Gail, Nature Reviews (Cancer) 2001, 3.
4. Validity of a surrogate endpoint
- Evidence that biomarkers predict clinical effects
  - Epidemiological
  - Pathophysiological
  - Biological
  - Statistical
What are the conditions required to show this?
Ref: Biomarkers Definitions Working Group, Clin Pharmacol Ther 2001; 69: 89.
5. Definitions
- Clinical endpoint: a characteristic or variable that reflects how a patient feels, functions, or survives
- Biomarker: a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention
- Surrogate endpoint: a biomarker that is intended to substitute for a clinical endpoint. A surrogate endpoint is expected to predict clinical benefit (or harm, or lack of benefit or harm)
Ref: Temple, JAMA 1999; 282: 790.
6. Single trial
- Parameters of interest
  - effect of treatment on surrogate endpoint (α)
  - effect of treatment on true endpoint (β)
  - effect of surrogate on true endpoint (γ)
  - adjusted effect of treatment on true endpoint (β_S)
  - adjusted effect of surrogate on true endpoint (γ_Z)
Ref: Buyse and Molenberghs, Biometrics 1998; 54: 1014.
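As a sketch of how these five parameters can be made concrete, assume normally distributed endpoints and simple linear models, with Z_j the treatment indicator, S_j the surrogate and T_j the true endpoint for patient j. The intercepts and error terms are notation introduced here for illustration only. The coefficients are the effect of treatment on the surrogate (α), of treatment on the true endpoint (β), of the surrogate on the true endpoint (γ), and the two adjusted effects (β_S and γ_Z):

```latex
\begin{align}
S_j &= \mu_S + \alpha Z_j + \varepsilon_{Sj} \\
T_j &= \mu_T + \beta Z_j + \varepsilon_{Tj} \\
T_j &= \mu + \gamma S_j + \varepsilon_j \\
T_j &= \tilde{\mu}_T + \beta_S Z_j + \gamma_Z S_j + \tilde{\varepsilon}_{Tj}
\end{align}
```

The last model contains both Z and S, which is what "adjusted" means in the list above.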
7. [Diagram: Treatment, Surrogate endpoint, True endpoint]
8. Correlation of endpoints is not enough
- Key point: "A correlate does not a surrogate make"
- γ ≠ 0 is not a sufficient condition for validity
Ref: Fleming and DeMets, Ann Intern Med 1996; 125: 605.
9. A first formal definition and criteria
- Prentice's definition
  - H0S: α = 0 ⇔ H0T: β = 0
- Prentice's criteria
  - An endpoint can be used as a surrogate if
    - it predicts the final endpoint (γ ≠ 0)
    - it fully captures the effect of treatment upon the final endpoint (β ≠ 0 and β_S = 0)
Ref: Prentice, Statist in Med 1989; 8: 431.
10. A first formal definition and criteria
- Problems with Prentice's approach
  - rooted in hypothesis testing
  - requires significant treatment effects
  - overly stringent
  - criteria not equivalent to definition (except for binary endpoints)
  - one can never prove the null (β_S = 0)
Ref: Buyse and Molenberghs, Biometrics 1998; 54: 1014.
11. The proportion explained
- Freedman's proportion explained is defined as
  - PE = 1 - β_S / β
  - if β_S = β, PE = 0 and the surrogate explains nothing
  - if β_S = 0, PE = 1 and the surrogate explains the entire effect of treatment on the true endpoint
Ref: Freedman et al, Statist in Med 1992; 11: 167.
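The definition above can be sketched numerically. The following is a minimal illustration, assuming continuous endpoints and ordinary least-squares fits. The helper names (`ols`, `proportion_explained`) and the simulated effect sizes are hypothetical and not taken from the cited papers. PE is computed as one minus the ratio of the treatment effect on T adjusted for S to the unadjusted treatment effect on T.

```python
import numpy as np

def ols(y, *cols):
    """Least-squares fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # coef[0] is the intercept

def proportion_explained(z, s, t):
    """PE = 1 - (adjusted treatment effect) / (unadjusted treatment effect)."""
    beta = ols(t, z)[1]       # treatment effect on T, unadjusted
    beta_s = ols(t, z, s)[1]  # treatment effect on T, adjusted for S
    return 1.0 - beta_s / beta

# Simulated single trial (hypothetical values, for illustration only)
rng = np.random.default_rng(0)
n = 2000
z = rng.integers(0, 2, size=n).astype(float)  # treatment indicator
s = 1.0 * z + rng.normal(size=n)              # surrogate endpoint
t = 0.5 * z + 0.8 * s + rng.normal(size=n)    # true endpoint

pe = proportion_explained(z, s, t)
print(f"estimated PE: {pe:.2f}")
```

Because PE is a ratio of two estimates, it becomes unstable when the unadjusted treatment effect is close to zero.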
12. The proportion explained
- Problems with the proportion explained
  - PE is not a proportion (can be < 0 or > 1)
  - PE confuses two sources of variability, one at the individual level, the other at the trial level
    - PE = γ_Z α/β
  - PE can be anywhere on the real line, depending on precision of S and T
Ref: Molenberghs et al, Controlled Clin Trials 2002; 23: 607.
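The decomposition of PE into an individual-level and a trial-level component can be sketched as follows. This assumes jointly normal endpoints, with σ_SS the residual variance of S and σ_ST the residual covariance of S and T given treatment (notation introduced here for illustration). The symbols are α (effect of treatment on the surrogate), β (effect of treatment on the true endpoint), β_S (treatment effect adjusted for the surrogate) and γ_Z (surrogate effect adjusted for treatment).

```latex
\begin{align}
\gamma_Z &= \frac{\sigma_{ST}}{\sigma_{SS}}, \qquad \beta_S = \beta - \gamma_Z \, \alpha \\
PE &= 1 - \frac{\beta_S}{\beta} = \gamma_Z \, \frac{\alpha}{\beta}
\end{align}
```

Here γ_Z reflects the association between S and T within patients, while α/β compares the two treatment effects, so PE mixes individual-level and trial-level information.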
13. Statistical validation of surrogate endpoints
- The effect of treatment on a surrogate endpoint must be reasonably likely to predict clinical benefit
Ref: Biomarkers Definitions Working Group, Clin Pharmacol Ther 2001; 69: 89.
14. The relative effect
- Interest now focuses on the two components of PE
  - the surrogate must predict the true endpoint (γ_Z ≠ 0)
  - the relative effect, defined as RE = β/α, allows prediction of the effect of treatment on the true endpoint (β) based on the effect of treatment on the surrogate (α)
Ref: Buyse and Molenberghs, Biometrics 1998; 54: 1014.
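A minimal sketch of the two components, assuming continuous endpoints and least-squares fits in a single trial. The function names, the simulated effect sizes and the value `alpha_new` are hypothetical, chosen only for illustration. The relative effect is estimated as the ratio of the treatment effect on the true endpoint to the treatment effect on the surrogate, and is then used to predict the effect on the true endpoint from a surrogate effect alone.

```python
import numpy as np

def fit(y, *cols):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *cols])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def relative_effect(z, s, t):
    """Return (RE, gamma_Z): ratio of treatment effects, and adjusted S-T slope."""
    alpha = fit(s, z)[1]       # treatment effect on the surrogate
    beta = fit(t, z)[1]        # treatment effect on the true endpoint
    gamma_z = fit(t, z, s)[2]  # effect of S on T, adjusted for treatment
    return beta / alpha, gamma_z

# Simulated single trial (hypothetical values, for illustration only)
rng = np.random.default_rng(1)
n = 2000
z = rng.integers(0, 2, size=n).astype(float)
s = 1.0 * z + rng.normal(size=n)
t = 0.5 * z + 0.8 * s + rng.normal(size=n)

re, gamma_z = relative_effect(z, s, t)

# Predicted effect on the true endpoint for a hypothetical surrogate effect,
# valid only if RE can be assumed constant across trials.
alpha_new = 0.4
beta_pred = re * alpha_new
print(f"RE = {re:.2f}, gamma_Z = {gamma_z:.2f}, predicted beta = {beta_pred:.2f}")
```

With one trial, the constancy of RE cannot be checked from the data, which is the limitation that motivates the use of several trials.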
15. Prediction of true endpoint from surrogate endpoint
[Figure: endpoints observed on individual patients. x-axis: Surrogate Endpoint; y-axis: True Endpoint; regression line with slope γ; R² indicates quality of regression]
16. Prediction of treatment effect: one trial
[Figure: treatment effect observed in the trial. x-axis: Treatment Effect on Surrogate Endpoint (α), from -1 to 1; y-axis: Treatment Effect on True Endpoint (β), from -1 to 1; line with slope β/α. Regression through origin: only one point!]
17. Several trials
- For a marker to be used as a surrogate, we need repeated demonstrations of a strong correlation between the marker and the clinical outcome
Ref: Holland, 9th EUFEPS Conference on Optimising Drug Development: Use of Biomarkers, Basel, 2001.
18. Prediction of treatment effect: several trials
[Figure: treatment effects observed in all trials. x-axis: Treatment Effect on Surrogate Endpoint (α), from -1 to 1; y-axis: Treatment Effect on True Endpoint (β), from -1 to 1; regression line with slope β/α; R² indicates quality of regression]
19. Validation criteria using several trials
- Parameters of interest
  - effect of treatment on surrogate endpoint (α)
  - effect of treatment on true endpoint (β)
  - effect of surrogate on true endpoint (γ)
  - measure of association between surrogate endpoint and true endpoint (R²individual)
  - measure of association between effects of treatment on surrogate endpoint and on true endpoint (R²trial)
Refs: Buyse et al, Biostatistics 2000; 1: 49. Gail et al, Biostatistics 2000; 1: 231.
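The two measures of association can be illustrated with a simplified, unweighted two-stage sketch for continuous endpoints. This is an assumption made for illustration and not the joint mixed-effects model of the cited papers. Stage 1 fits the treatment effects on S and T separately in each trial. Stage 2 correlates the trial-specific effects (trial level) and the pooled within-trial residuals (individual level). The function name `two_stage_r2` and all simulated values are hypothetical.

```python
import numpy as np

def two_stage_r2(trial, z, s, t):
    """Return (R2_trial, R2_indiv) from a simplified, unweighted two-stage fit."""
    alphas, betas, res_s, res_t = [], [], [], []
    for k in np.unique(trial):
        m = trial == k
        X = np.column_stack([np.ones(m.sum()), z[m]])
        cs, *_ = np.linalg.lstsq(X, s[m], rcond=None)  # S on treatment
        ct, *_ = np.linalg.lstsq(X, t[m], rcond=None)  # T on treatment
        alphas.append(cs[1])           # trial-specific effect on S
        betas.append(ct[1])            # trial-specific effect on T
        res_s.append(s[m] - X @ cs)    # patient-level residuals for S
        res_t.append(t[m] - X @ ct)    # patient-level residuals for T
    r_trial = np.corrcoef(alphas, betas)[0, 1]
    r_indiv = np.corrcoef(np.concatenate(res_s), np.concatenate(res_t))[0, 1]
    return r_trial**2, r_indiv**2

# Simulated meta-analysis (hypothetical values, for illustration only)
rng = np.random.default_rng(2)
n_trials, n_per_trial = 40, 300
trial = np.repeat(np.arange(n_trials), n_per_trial)
z = rng.integers(0, 2, size=trial.size).astype(float)
alpha_k = rng.normal(1.0, 0.5, size=n_trials)               # true effects on S
beta_k = 0.8 * alpha_k + rng.normal(0.0, 0.1, size=n_trials)  # true effects on T
eps = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=trial.size)
s = alpha_k[trial] * z + eps[:, 0]
t = beta_k[trial] * z + eps[:, 1]

r2_trial, r2_indiv = two_stage_r2(trial, z, s, t)
print(f"R2_trial = {r2_trial:.2f}, R2_indiv = {r2_indiv:.2f}")
```

The sketch ignores the estimation error in the trial-specific effects and gives every trial equal weight, so it should be read as an illustration of the two levels rather than a recommended analysis.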
20. Technical difficulties: the endpoints are not normally distributed
- In practice, endpoints are often of the following types: response, survival, longitudinal. Such endpoints are not normally distributed, and therefore complex modelling is required to characterize the association between endpoints (individual-level association).
- At the trial level, however, simple linear models are still adequate to characterize the association between treatment effects on the endpoints (trial-level association).
Refs:
- Molenberghs et al, Stat Med 20: 3023, 2001
- Burzykowski et al, J Royal Stat Soc A 50: 405, 2001
- Renard et al, J Applied Statist 30: 235, 2002.
21. A case study in advanced prostatic cancer: the trials
- Two multicentric trials for patients in relapse after first-line endocrine therapy (596 patients)
- Unit of analysis for treatment effects: country (19 units)
- Patients randomized between two treatments
  - Experimental (retinoic acid metabolism-blocking agent)
  - Control (anti-androgen)
Ref: Buyse et al, in Biomarkers in Clinical Drug Development (Bloom JC, ed.), Springer-Verlag, 2003.
22. A case study in advanced prostatic cancer: the endpoints
- Potential surrogate endpoints
  - Longitudinal PSA measurements taken at pre-defined time points
  - PSA response (decrease of at least 50%)
  - Time to PSA progression (TPP)
- True endpoint
  - Overall survival
23. A case study in advanced prostatic cancer
[Diagram: randomization (Rz) between Experimental and Control treatment; Treatment, Surrogate endpoint, True endpoint]
24. PSA response as surrogate for survival
Very weak association between treatment effects
R² = 0.05
25. TPP as surrogate for survival
Weak association between treatment effects
R² = 0.22
26. Longitudinal PSA as surrogate for survival
Moderate association between treatment effects
R²trial = 0.45
27. Individual-level and trial-level measures of association
28. A case study in early colorectal cancer: the trials
- Fifteen collaborative group trials for patients after resection of colorectal tumor (12,915 patients)
- Unit of analysis for treatment effects: 18 comparisons between 33 treatment arms
- Patients randomized between various 5-FU regimens and/or control
29. A case study in early colorectal cancer: the endpoints
- Potential surrogate endpoint
  - 3-year disease-free survival
- True endpoint
  - 5-year overall survival
Ref: Sargent et al, Proceedings ASCO (Abstract 3502), 2004.
Acknowledgement: the following slides are based on Dr Daniel Sargent's presentations to ODAC on May 5 and at ASCO on June 6
30. Most recurrences occur before 3 years
31. Strong association between endpoints
32. Strong association between treatment effects
33. Predicted versus actual OS hazard ratios
34. Overview of validation approaches
- Single trial
  - full capture (Prentice)
  - proportion explained (Freedman et al)
  - relative effect (Buyse & Molenberghs)
  - likelihood reduction factor (Alonso et al)
- Several trials (meta-analysis)
  - concordance (Begg & Leung)
  - correlation of effects (Daniels & Hughes)
  - trial-level measures of association (Gail et al)
  - individual- and trial-level measures of association (Buyse et al)
  - predicted treatment effect (Baker)
  - surrogate threshold effect (Burzykowski & Buyse)
35. Conclusions on surrogate validation
- Ideally, statistical validation requires the following
  - data from randomized trials
  - replication at the trial or center level
  - at least some observations of T
  - large numbers of observations
  - range of therapeutic questions (Z1, Z2, ...)
- Hence
  - individual patient data meta-analyses are needed
  - access to such data is a problem when they are proprietary
Ref: Burzykowski, Molenberghs and Buyse (eds.), The Evaluation of Surrogate Endpoints, Springer-Verlag (in press).