Title: STATISTICS 542 Intro to Clinical Trials SURVIVAL ANALYSIS
2 Survival Analysis Terminology
- Concerned about time to some event
- Event is often death
- Event may also be, for example
- 1. Cause-specific death
- 2. Non-fatal event or death, whichever comes first
-    death or hospitalization
-    death or MI
-    death or tumor recurrence
3 Survival Rates at Yearly Intervals
[Figure: survival rates at yearly intervals for two groups; x-axis: YEARS]
- At 5 years, the survival rates are the same
- Survival experience in Group A appears more favorable, considering the 1 year, 2 year, 3 year and 4 year rates together
4 Beta-Blocker Heart Attack Trial LIFE-TABLE CUMULATIVE MORTALITY CURVE
5 Survival Analysis
- Discuss
- 1. Estimation of survival curves
- 2. Comparison of survival curves
- I. Estimation
- Simple Case
- All patients entered at the same time and followed for the same length of time
- Survival curve is estimated at various time points by 1 - (number of deaths)/(number of patients)
- As intervals become smaller and the number of patients larger, a "smooth" survival curve may be plotted
- Typical Clinical Trial Setting
6 Staggered Entry
[Figure: Subjects 1 to 4 enter the trial at different times and are each followed for T years; x-axis: Time Since Start of Trial (T years), from 0 to 2T]
- Each patient has T years of follow-up
- The calendar time at which follow-up takes place may be different for each patient
7
[Figure: Subjects 1 to 4 plotted against Time Since Start of Trial (T years), from 0 to 2T; legend: administrative censoring (o), failure, censoring due to loss to follow-up]
- Failure time is the time from entry until the time of the event
- Censoring means the vital status of the patient is not known beyond that point
8
[Figure: Subjects 1 to 4 plotted against Follow-up Time (T years), from 0 to T; legend: administrative censoring (o), failure, censoring due to loss to follow-up]
9 Clinical Trial with Common Termination Date
[Figure: Subjects 1 to 11 plotted against Follow-up Time (T years), from 0 to 2T, with the point "Trial Terminated" marked]
10 Reduced Sample Estimate (1)
Years of                     Cohort
Follow-Up    Patients      I      II     Total
1            Entered     100     100      200
             Died         20      25       45
2            Entered      80      75      155
             Died         20
             Survived     60
11 Reduced Sample Estimate (2)
- Suppose we estimate the 1 year survival rate
- a. P(1 yr) = 155/200 = 0.775
- b. P(1 yr, cohort I) = 80/100 = 0.80
- c. P(1 yr, cohort II) = 75/100 = 0.75
- Now estimate 2 year survival
- Reduced sample estimate = 60/100 = 0.60
- Estimate is based on cohort I only
- Loss of information
12 Actuarial Estimate (1)
- Ref: Berkson & Gage (1950) Proc of Mayo Clinic
-      Cutler & Ederer (1958) JCD
-      Elveback (1958) JASA
-      Kaplan & Meier (1958) JASA
- Note that we can express P(2 yr survival) as
-   P(2 yrs) = P(2 yrs survival | survived 1st yr) x P(1st yr survival)
-            = (60/80) x (155/200)
-            = (0.75) x (0.775)
-            = 0.58
- This estimate uses all the available data
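The two 2-year estimates above can be checked with a few lines of arithmetic. This is only a sketch; the variable names are chosen here for illustration and do not come from the slides.

```python
# Counts from the two-cohort example (cohort II has only 1 year of follow-up).
entered_yr1 = {"I": 100, "II": 100}
died_yr1 = {"I": 20, "II": 25}
entered_yr2_cohort_I = 80
died_yr2_cohort_I = 20

# Reduced sample estimate: uses cohort I only.
reduced_sample = (entered_yr2_cohort_I - died_yr2_cohort_I) / entered_yr1["I"]

# Actuarial estimate: P(2 yrs) = P(2 yrs | survived 1st yr) * P(1st yr),
# where the 1-year factor pools both cohorts.
total_entered = sum(entered_yr1.values())
total_died = sum(died_yr1.values())
p_year1 = (total_entered - total_died) / total_entered
p_year2_given_year1 = (entered_yr2_cohort_I - died_yr2_cohort_I) / entered_yr2_cohort_I
actuarial = p_year2_given_year1 * p_year1

print(f"reduced sample estimate: {reduced_sample:.2f}")
print(f"actuarial estimate:      {actuarial:.2f}")
```

The actuarial figure differs from the reduced sample figure only because the first-year factor draws on all 200 patients rather than the 100 in cohort I.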
13 Actuarial Estimate (2)
- In general, divide the follow-up time into a series of intervals $I_1, I_2, \ldots, I_k$
- Let $p_i$ = prob of surviving $I_i$ given patient alive at beginning of $I_i$ (i.e. survived through $I_{i-1}$)
- Then prob of surviving through $t_k$ is $P(t_k) = \prod_{i=1}^{k} p_i$
14 Actuarial Estimate (3)
- Define the following
-   $n_i$ = number of subjects alive at beginning of $I_i$ (i.e. at $t_{i-1}$)
-   $d_i$ = number of deaths during interval $I_i$
-   $l_i$ = number of losses during interval $I_i$ (either administrative or lost to follow-up)
- We know only that $d_i$ deaths and $l_i$ losses occurred in interval $I_i$
15 Estimation of $p_i$
- a. All deaths precede all losses: $p_i = (n_i - d_i)/n_i$
- b. All losses precede all deaths: $p_i = (n_i - l_i - d_i)/(n_i - l_i)$
- c. Deaths and losses uniform (1/2 deaths before 1/2 losses): $p_i = (n_i - l_i/2 - d_i)/(n_i - l_i/2)$
-    Actuarial Estimate/Cutler-Ederer
- Problem is that P(t) is a function of the interval choice.
- For some applications, we have no choice, but if we
16 Actuarial Lifetime Method (1)
- Used when exact times of death are not known
- Vital status is known at the end of an interval period (e.g. 6 months or 1 year)
- Assume losses uniform over the interval
17 Actuarial Lifetime Method (2)
Lifetable
           At Risk  Died   Lost   Adjusted           Prop.                Prop. Surv. Up to
Interval   (n_i)    (d_i)  (l_i)  No. At Risk        Surviving            End of Interval
0-1        50       9      0      50                 41/50 = 0.82         0.82
1-2        41       6      1      41 - 1/2 = 40.5    34.5/40.5 = 0.852    0.852 x 0.82 = 0.699
2-3        34       2      4      34 - 4/2 = 32      30/32 = 0.937        0.937 x 0.699 = 0.655
3-4        28       1      5      28 - 5/2 = 25.5    24.5/25.5 = 0.961    0.961 x 0.655 = 0.629
4-5        22       2      3      22 - 3/2 = 20.5    18.5/20.5 = 0.902    0.902 x 0.629 = 0.567
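The life table above can be reproduced programmatically. The following is a minimal sketch, assuming the Cutler-Ederer adjustment (losses uniform over the interval, so adjusted number at risk = n - l/2); the function name `actuarial_life_table` is made up for this illustration.

```python
def actuarial_life_table(rows):
    """rows: list of (n_at_risk, deaths, losses), one tuple per interval, in order.

    Returns a list of (adjusted_n, p_interval, p_cumulative) tuples, where
    adjusted_n = n - losses/2 (losses assumed uniform over the interval).
    """
    out = []
    cumulative = 1.0
    for n, d, l in rows:
        adjusted_n = n - l / 2.0
        p = (adjusted_n - d) / adjusted_n   # proportion surviving the interval
        cumulative *= p                     # proportion surviving to interval end
        out.append((adjusted_n, p, cumulative))
    return out


# (n_i, d_i, l_i) for intervals 0-1, 1-2, 2-3, 3-4, 4-5 from the slide.
rows = [(50, 9, 0), (41, 6, 1), (34, 2, 4), (28, 1, 5), (22, 2, 3)]
table = actuarial_life_table(rows)

for (n, d, l), (adj, p, cum) in zip(rows, table):
    print(f"n={n:2d} d={d} l={l} adjusted={adj:5.1f} p={p:.3f} cumulative={cum:.3f}")
```

Because the code carries full precision between intervals, the last cumulative value comes out near 0.568, whereas the slide, which multiplies rounded factors, shows 0.567.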
18 Actuarial Survival Curve
[Figure: actuarial survival curve drawn as a step function; y-axis ticks 100, 80, 60, 40, 20, 0; x-axis ticks 1 to 5]
19 Kaplan-Meier Estimate (1) (JASA, 1958)
- Assumptions
- 1. "Exact" time of event is known
-    Failure = uncensored event
-    Loss = censored event
- 2. For a "tie", failure always before loss
- 3. Divide follow-up time into intervals such that
-    a. Each event defines left side of an interval
-    b. No interval has both deaths and losses
20 Kaplan-Meier Estimate (2) (JASA, 1958)
- Then
-   $n_i$ = number at risk just prior to death at $t_i$
-   $p_i = (n_i - d_i)/n_i$
- Note if interval contains only losses, $p_i = 1.0$
- Because of this, we may combine intervals with only losses with the previous interval containing only deaths, for convenience
-   X o o o (a death, X, followed by losses, o, treated as one interval)
21 Estimate of S(t) or P(t)
- Suppose that for N patients, there are K distinct failure (death) times. The Kaplan-Meier estimate of the survival curve becomes $P(t) = P(\text{Survival} \ge t)$
- K-M or Product Limit Estimate
-   $P(t) = \prod_{t_i \le t} \frac{n_i - d_i}{n_i}$, $i = 1, 2, \ldots, K$
- where $n_i = n_{i-1} - l_{i-1} - d_{i-1}$
-   $l_{i-1}$ = number of censored events since death at $t_{i-1}$
-   $d_{i-1}$ = number of deaths at $t_{i-1}$
22 Estimate of S(t) or P(t)
- Variance of P(t)
- Greenwood's Formula
-   $V[P(t)] = P(t)^2 \sum_{t_i \le t} \frac{d_i}{n_i (n_i - d_i)}$
23 KM Estimate (1)
Example (see Table 14-2 in FFD)
Suppose we follow 20 patients and observe the event time, either failure (death) or censored (shown in parentheses), as
0.5, (0.6), 1.5, 1.5, (2.0), 3.0, (3.5), (4.0), 4.8, 6.2, (8.5), (9.0), 10.5, (12.0) (7 pts)
There are 6 distinct failure or death times: 0.5, 1.5, 3.0, 4.8, 6.2, 10.5
24 KM Estimate (2)
1. Failure at t1 = 0.5, interval [0.5, 1.5)
   n1 = 20, d1 = 1, l1 = 1 (i.e. 0.6)
   If t is in [0.5, 1.5), P(t) = p1 = 0.95
   V[P(t1)] = 0.95^2 [1/(20(19))] = 0.0024
25 KM Estimate (4)
Data: 0.5, (0.6), 1.5, 1.5, (2.0), 3.0, etc.
2. Failure at t2 = 1.5, interval [1.5, 3.0)
   n2 = n1 - d1 - R1 = 20 - 1 - 1 = 18
   d2 = 2, R2 = 1 (i.e. 2.0)
   If t is in [1.5, 3.0), then P(t) = (0.95)(0.89) = 0.84
   V[P(t2)] = 0.84^2 [1/(20(19)) + 2/(18(18 - 2))] = 0.0068
26 Life Table
Table 14.2 Kaplan-Meier Life Table for 20 Subjects Followed for One Year
               Interval   Time of
Interval       Number     death     n_j   d_j   R_j   p_j    P(t)   V[P(t)]
[0.5, 1.5)     1          0.5       20    1     1     0.95   0.95   0.0024
[1.5, 3.0)     2          1.5       18    2     1     0.89   0.84   0.0068
[3.0, 4.8)     3          3.0       15    1     2     0.93   0.79   0.0089
[4.8, 6.2)     4          4.8       12    1     0     0.92   0.72   0.0114
[6.2, 10.5)    5          6.2       11    1     2     0.91   0.66   0.0135
[10.5, inf)    6          10.5      8     1     7*    0.88   0.58   0.0164
n_j = number of subjects alive at the beginning of the jth interval
d_j = number of subjects who died during the jth interval
R_j = number of subjects who were lost or censored during the jth interval
p_j = estimate for p_j, the probability of surviving the jth interval given that the subject has survived the previous intervals
P(t) = estimated survival curve
V[P(t)] = variance of P(t)
* Censored due to termination of study
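The Kaplan-Meier table above can be rebuilt from the raw event times. What follows is a small sketch in plain Python, assuming the 20-subject data as listed on the KM Estimate slides (event flag 1 = death, 0 = censored); the function name `kaplan_meier` and the tuple layout are choices made here, not part of the slides.

```python
def kaplan_meier(times, events):
    """Product-limit estimate with Greenwood variance.

    times:  observed follow-up times
    events: 1 = death (failure), 0 = censored
    Returns one tuple (t, n_at_risk, deaths, p, S, var) per distinct death time.
    Assumes at least one subject survives each death time (n > d).
    """
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    result = []
    surv = 1.0
    greenwood_sum = 0.0
    for t in death_times:
        n = sum(1 for u in times if u >= t)  # at risk just prior to t
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        p = (n - d) / n
        surv *= p
        greenwood_sum += d / (n * (n - d))
        result.append((t, n, d, p, surv, surv ** 2 * greenwood_sum))
    return result


# 20 subjects; the last 7 are censored at 12.0 when the study terminates.
times = [0.5, 0.6, 1.5, 1.5, 2.0, 3.0, 3.5, 4.0, 4.8, 6.2, 8.5, 9.0, 10.5] + [12.0] * 7
events = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1] + [0] * 7

km = kaplan_meier(times, events)
for t, n, d, p, s, v in km:
    print(f"t={t:5.1f} n={n:2d} d={d} p={p:.2f} S={s:.2f} V={v:.4f}")
```

Counting subjects with time >= t as "at risk" reproduces the n_j column (20, 18, 15, 12, 11, 8). The slide values are built from rounded factors, so the final estimate here (about 0.57) can differ from the tabulated 0.58 in the last digit.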
27 Survival Curve Kaplan-Meier Estimate
[Figure: Kaplan-Meier step curve; y-axis: Estimated Survival Curve P(t), from 0.5 to 1.0; x-axis: Survival Time t (Months), from 0 to 12]
28 Comparison of Two Survival Curves
- Assume that we now have a treatment group and a control group and we wish to make a comparison between their survival experience
- 20 patients in each group
- (all patients still alive are censored at 12 months; censored times are shown in parentheses)
- Control: 0.5, (0.6), 1.5, 1.5, (2.0), 3.0, (3.5), (4.0), 4.8, 6.2, (8.5), (9.0), 10.5, (12.0) x 7
- Trt: 1.0, (1.6), (2.4), (4.2), 4.5, (5.8), (7.0), (11.0), (12.0) x 12
29 Kaplan-Meier Estimate for Treatment
- 1. t1 = 1.0: n1 = 20, d1 = 1, l1 = 3
-    p1 = (20 - 1)/20 = 0.95
-    P(t) = 0.95
- 2. t2 = 4.5: n2 = 20 - 1 - 3 = 16, d2 = 1
-    p2 = (16 - 1)/16 = 0.94
30 Kaplan-Meier Estimate
[Figure: Kaplan-Meier step curves for the TRT and CONTROL groups; y-axis: Estimated Survival Curve P(t), from 0.5 to 1.0; x-axis: Survival Time t (Months), from 0 to 12]
31 Comparison of Two Survival Curves
- Comparison of Point Estimates
- Suppose at some time t we want to compare $P_C(t)$ for the control and $P_T(t)$ for treatment
- The statistic $Z = \frac{P_C(t) - P_T(t)}{\sqrt{V[P_C(t)] + V[P_T(t)]}}$ has, approximately, a normal distribution under H0
- Example
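One way to work such an example is sketched below. It compares the two Kaplan-Meier point estimates with Z = (P_C(t) - P_T(t)) / sqrt(V[P_C(t)] + V[P_T(t)]), using Greenwood variances and the control and treatment data from the earlier slides. The time point t = 6 months is an assumption picked here purely for illustration and is not taken from the slides, and the helper name `km_at` is likewise made up.

```python
import math


def km_at(times, events, t):
    """Kaplan-Meier estimate and Greenwood variance at time t.

    events: 1 = death, 0 = censored. Returns (S(t), V[S(t)]).
    """
    surv, gsum = 1.0, 0.0
    death_times = sorted({u for u, e in zip(times, events) if e == 1 and u <= t})
    for ti in death_times:
        n = sum(1 for u in times if u >= ti)  # at risk just prior to ti
        d = sum(1 for u, e in zip(times, events) if u == ti and e == 1)
        surv *= (n - d) / n
        gsum += d / (n * (n - d))
    return surv, surv ** 2 * gsum


control_times = [0.5, 0.6, 1.5, 1.5, 2.0, 3.0, 3.5, 4.0, 4.8, 6.2, 8.5, 9.0, 10.5] + [12.0] * 7
control_events = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1] + [0] * 7
trt_times = [1.0, 1.6, 2.4, 4.2, 4.5, 5.8, 7.0, 11.0] + [12.0] * 12
trt_events = [1, 0, 0, 0, 1, 0, 0, 0] + [0] * 12

t = 6.0  # assumed comparison time (months), chosen for illustration only
p_c, v_c = km_at(control_times, control_events, t)
p_t, v_t = km_at(trt_times, trt_events, t)
z = (p_c - p_t) / math.sqrt(v_c + v_t)

print(f"P_C({t}) = {p_c:.3f}, P_T({t}) = {p_t:.3f}, Z = {z:.2f}")
```

A point comparison like this uses only the information up to the single chosen time, which is one motivation for the overall curve comparisons that follow.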
32
- Comparison of Overall Survival Curve
- H0: $P_C(t) = P_T(t)$
- A. Mantel-Haenszel Test
- Ref: Mantel & Haenszel (1959) J Natl Cancer Inst
-      Mantel (1966) Cancer Chemotherapy Reports
- Mantel and Haenszel (1959) showed that a series of 2 x 2 tables could be combined into a summary statistic (note also Cochran (1954) Biometrics)
- Mantel (1966) applied this procedure to the comparison of two survival curves
- Basic idea is to form a 2 x 2 table at each distinct death time, determining the number in each group who were at risk and the number who died
33 Comparison of Two Survival Curves (1)
- Suppose we have K distinct times for a death occurring: $t_i$, $i = 1, 2, \ldots, K$. For each death time,
              Died                   At Risk
              at t_i     Alive       (prior to t_i)
  Treatment   a_i        b_i         a_i + b_i
  Control     c_i        d_i         c_i + d_i
              a_i + c_i  b_i + d_i   N_i
- Consider $a_i$, the observed number of deaths in the TRT group, under H0
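34 Comparison of Two Survival Curves (2)
$E(a_i) = (a_i + b_i)(a_i + c_i)/N_i$
$V(a_i) = \frac{(a_i + b_i)(c_i + d_i)(a_i + c_i)(b_i + d_i)}{N_i^2 (N_i - 1)}$
Mantel-Haenszel Statistic: $MH = \frac{\left[\sum_{i=1}^{K} (a_i - E(a_i))\right]^2}{\sum_{i=1}^{K} V(a_i)}$, approximately $\chi^2$ with 1 degree of freedom under H0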
35 Table 14.3 Comparison of Survival Data for a Control Group and an Intervention Group Using the Mantel-Haenszel Procedure
Rank   Event Time   Intervention            Control                 Total
j      t_j          a_j+b_j   a_j   l_j     c_j+d_j   c_j   l_j     a_j+c_j   b_j+d_j
1      0.5          20        0     0       20        1     1       1         39
2      1.0          20        1     0       18        0     0       1         37
3      1.5          19        0     2       18        2     1       2         35
4      3.0          17        0     1       15        1     2       1         31
5      4.5          16        1     0       12        0     0       1         27
6      4.8          15        0     1       12        1     0       1         26
7      6.2          14        0     1       11        1     2       1         24
8      10.5         13        0     1       8         1             1         20
a_j + b_j = number of subjects at risk in the intervention group prior to the death at time t_j
c_j + d_j = number of subjects at risk in the control group prior to the death at time t_j
a_j = number of subjects in the intervention group who died at time t_j
c_j = number of subjects in the control group who died at time t_j
l_j = number of subjects who were lost or censored between time t_j and time t_{j+1}
a_j + c_j = number of subjects in both groups who died at time t_j
b_j + d_j = number of subjects in both groups who are at risk minus the number who died at time t_j
36 Mantel-Haenszel Test
- Operationally
- 1. Rank event times for both groups combined
- 2. For each failure, form the 2 x 2 table
-    a. Number at risk ($a_i + b_i$, $c_i + d_i$)
-    b. Number of deaths ($a_i$, $c_i$)
-    c. Losses ($l_{Ti}$, $l_{Ci}$)
- Example (see Table 14-3 FFD)
- Use previous data set (censored times in parentheses)
- Trt: 1.0, (1.6), (2.4), (4.2), 4.5, (5.8), (7.0), (11.0), (12.0)'s
- Control: 0.5, (0.6), 1.5, 1.5, (2.0), 3.0, (3.5), (4.0), 4.8, 6.2, (8.5), (9.0), 10.5, (12.0)'s
37
1. Ranked Failure Times - Both groups combined
   0.5   1.0   1.5   3.0   4.5   4.8   6.2   10.5
   C     T     C     C     T     C     C     C
   8 distinct times for death (K = 8)
2. At t1 = 0.5 (k = 1), interval [0.5, 1.0), which contains the loss at 0.6
   T: a1 + b1 = 20, a1 = 0, lT1 = 0
   C: c1 + d1 = 20, c1 = 1, lC1 = 1 (1 loss at 0.6)
        D    A    R
   T    0   20   20
   C    1   19   20
        1   39   40
   E(a1) = 1 x 20/40 = 0.5
   V(a1) = (1 x 39 x 20 x 20)/(40^2 x 39) = 0.25
38
3. At t2 = 1.0 (k = 2), interval [1.0, 1.5)
   T: a2 + b2 = (a1 + b1) - a1 - lT1 = 20 - 0 - 0 = 20, a2 = 1, lT2 = 0
   C: c2 + d2 = (c1 + d1) - c1 - lC1 = 20 - 1 - 1 = 18, c2 = 0, lC2 = 0
   so
        D    A    R
   T    1   19   20
   C    0   18   18
        1   37   38
39 Eight 2x2 Tables Corresponding to the Event Times Used in the Mantel-Haenszel Statistic in Survival Comparison of Treatment (T) and Control (C) Groups
Table   Time (mo.)   T: D   A    R     C: D   A    R     Total: D   A    R
1       (0.5)        0      20   20    1      19   20    1          39   40
2       (1.0)        1      19   20    0      18   18    1          37   38
3       (1.5)        0      19   19    2      16   18    2          35   37
4       (3.0)        0      17   17    1      14   15    1          31   32
5       (4.5)        1      15   16    0      12   12    1          27   28
6       (4.8)        0      15   15    1      11   12    1          26   27
7       (6.2)        0      14   14    1      10   11    1          24   25
8       (10.5)       0      13   13    1      7    8     1          20   21
Number in parentheses indicates time, t_j, of a death in either group
D = Number of subjects who died at time t_j
A = Number of subjects who are alive between time t_j and time t_{j+1}
R = Number of subjects who were at risk before the death at time t_j (R = D + A)
40 Compute MH Statistics
Recall the first three tables (K = 1, 2, 3)
        t1 = 0.5         t2 = 1.0         t3 = 1.5
        D    A    R      D    A    R      D    A    R
   T    0   20   20      1   19   20      0   19   19
   C    1   19   20      0   18   18      2   16   18
        1   39   40      1   37   38      2   35   37
a. $\sum a_i = 2$ (only two treatment deaths)
b. $\sum E(a_i) = 20(1)/40 + 20(1)/38 + 19(2)/37 + \ldots = 4.89$
c. $\sum V(a_i) = 2.22$
d. $MH = (2 - 4.89)^2/2.22 = 3.76$, or $Z_{MH} = (2 - 4.89)/\sqrt{2.22} = -1.94$
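The sums in steps a to d can be recomputed directly from the raw data by building one 2 x 2 table per distinct death time. The sketch below assumes the treatment and control data from the earlier slides, encoded as (time, event) pairs with event 1 = death and 0 = censored; the function name `mantel_haenszel` is chosen here for illustration.

```python
def mantel_haenszel(trt, ctl):
    """Mantel-Haenszel (logrank-type) comparison of two groups.

    trt, ctl: lists of (time, event) pairs, event 1 = death, 0 = censored.
    Returns (sum of a_i, sum of E(a_i), sum of V(a_i), MH statistic).
    """
    death_times = sorted({t for t, e in trt + ctl if e == 1})
    observed = expected = variance = 0.0
    for t in death_times:
        n_t = sum(1 for u, _ in trt if u >= t)  # a_i + b_i
        n_c = sum(1 for u, _ in ctl if u >= t)  # c_i + d_i
        a = sum(1 for u, e in trt if u == t and e == 1)
        c = sum(1 for u, e in ctl if u == t and e == 1)
        n, deaths = n_t + n_c, a + c
        observed += a
        expected += n_t * deaths / n
        if n > 1:
            variance += n_t * n_c * deaths * (n - deaths) / (n ** 2 * (n - 1))
    return observed, expected, variance, (observed - expected) ** 2 / variance


control = [(0.5, 1), (0.6, 0), (1.5, 1), (1.5, 1), (2.0, 0), (3.0, 1), (3.5, 0),
           (4.0, 0), (4.8, 1), (6.2, 1), (8.5, 0), (9.0, 0), (10.5, 1)] + [(12.0, 0)] * 7
treatment = [(1.0, 1), (1.6, 0), (2.4, 0), (4.2, 0), (4.5, 1), (5.8, 0),
             (7.0, 0), (11.0, 0)] + [(12.0, 0)] * 12

obs, exp, var, mh = mantel_haenszel(treatment, control)
print(f"sum a = {obs:.0f}, sum E = {exp:.2f}, sum V = {var:.2f}, MH = {mh:.2f}")
```

Run at full precision, the variance and statistic come out near 2.21 and 3.78, slightly different from the 2.22 and 3.76 on the slide; the gap is presumably due to rounding of intermediate values.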
41 B. Gehan Test (Wilcoxon)
Ref: Gehan, Biometrika (1965)
     Mantel, Biometrics (1967)
Gehan (1965) first proposed a modified Wilcoxon rank statistic for survival data with censoring. Mantel (1967) showed a simpler computational version of Gehan's proposed test.
1. Combine all observations $X_T$'s and $X_C$'s into a single sample $Y_1, Y_2, \ldots, Y_{N_C + N_T}$
2. Define $U_{ij}$, where $i = 1, \ldots, N_C + N_T$ and $j = 1, \ldots, N_C + N_T$
   $U_{ij} = -1$ if $Y_i < Y_j$ and death at $Y_i$
   $U_{ij} = +1$ if $Y_i > Y_j$ and death at $Y_j$
   $U_{ij} = 0$ elsewhere
3. Define $U_i = \sum_j U_{ij}$, $i = 1, \ldots, N_C + N_T$
42 Gehan Test
- Note
-   $U_i$ = (number of observed times definitely less than i) - (number of observed times definitely greater)
- 4. Define $W = \sum U_i$ (controls)
- 5. $V[W] = \frac{N_C N_T \sum_{i=1}^{N_C + N_T} U_i^2}{(N_C + N_T)(N_C + N_T - 1)}$
-    Variance due to Mantel
- 6. $Z = W/\sqrt{V[W]}$
- Example (Table 14-5 FFD)
- Using previous data set, rank all observations
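43
The Gehan statistic, G, involves the scores $U_i$ and is defined as $G = W^2/V(W)$, where $W = \sum U_i$ ($U_i$'s in control group only) and $V(W) = \frac{N_C N_T \sum_{i=1}^{N_C + N_T} U_i^2}{(N_C + N_T)(N_C + N_T - 1)}$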
44 Example of Gehan Statistics Scores U_i for Intervention (I) and Control (C) Groups
Observation   Ranked                      Definitely   Definitely
i             Observed Time   Group       Less         More         U_i
1             0.5             C           0            39           -39
2             (0.6)           C           1            0            1
3             1.0             I           1            37           -36
4             1.5             C           2            35           -33
5             1.5             C           2            35           -33
6             (1.6)           I           4            0            4
7             (2.0)           C           4            0            4
8             (2.4)           I           4            0            4
9             3.0             C           4            31           -27
10            (3.5)           C           5            0            5
11            (4.0)           C           5            0            5
12            (4.2)           I           5            0            5
13            4.5             I           5            27           -22
14            4.8             C           6            26           -20
15            (5.8)           I           7            0            7
16            6.2             C           7            24           -17
17            (7.0)           I           8            0            8
18            (8.5)           C           8            0            8
19            (9.0)           C           8            0            8
20            10.5            C           8            20           -12
21            (11.0)          I           9            0            9
22-40         (12.0)          12I, 7C     9            0            9
( ) Censored observations
45 Gehan Test
- Thus, summing the control-group scores, W = (-39) + (1) + (-33) + (-33) + (4) + . . . = -87
- and V[W] = (20)(20) [(-39)^2 + (1)^2 + (-36)^2 + . . .] / [(40)(39)] = 2314.35
- so Z = -87/sqrt(2314.35) = -1.81
- Note MH and Gehan not equal
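The scores and test statistic above can be verified with a short script. This is a sketch only: it assumes the same two data sets, encoded as (time, event) pairs with event 1 = death and 0 = censored, and uses strict inequalities for "definitely less" and "definitely more", which is adequate here because no censored time coincides with a death time. The name `gehan_scores` is invented for the example.

```python
import math


def gehan_scores(obs):
    """obs: list of (time, event) pairs; returns the Gehan score U_i for each.

    U_i = (# observations definitely less than i) - (# definitely greater).
    An observation j is definitely less if it is a death with t_j < t_i.
    Observations are definitely greater only when i itself is a death.
    """
    scores = []
    for t_i, e_i in obs:
        less = sum(1 for t_j, e_j in obs if e_j == 1 and t_j < t_i)
        more = sum(1 for t_j, _ in obs if t_j > t_i) if e_i == 1 else 0
        scores.append(less - more)
    return scores


control = [(0.5, 1), (0.6, 0), (1.5, 1), (1.5, 1), (2.0, 0), (3.0, 1), (3.5, 0),
           (4.0, 0), (4.8, 1), (6.2, 1), (8.5, 0), (9.0, 0), (10.5, 1)] + [(12.0, 0)] * 7
treatment = [(1.0, 1), (1.6, 0), (2.4, 0), (4.2, 0), (4.5, 1), (5.8, 0),
             (7.0, 0), (11.0, 0)] + [(12.0, 0)] * 12

n_c, n_t = len(control), len(treatment)
scores = gehan_scores(control + treatment)  # control scores come first

w = sum(scores[:n_c])                       # W: control-group scores only
var_w = n_c * n_t * sum(u * u for u in scores) / ((n_c + n_t) * (n_c + n_t - 1))
z = w / math.sqrt(var_w)
g = w ** 2 / var_w

print(f"W = {w}, V[W] = {var_w:.2f}, Z = {z:.2f}, G = {g:.2f}")
```

Since the scores over all 40 observations sum to zero, summing over the treatment group instead would give +87 and the same G.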
46 Cox Proportional Hazards Model
- Ref: Cox (1972) Journal of the Royal Statistical Society
- Recall simple exponential
-   $S(t) = e^{-\lambda t}$
- More complicated
-   $S(t) = \exp\left(-\int_0^t \lambda(s)\,ds\right)$
- If $\lambda(s) = \lambda$, get simple model
- Adjust for covariates
- Cox PHM
-   $\lambda(t, x) = \lambda_0(t)\, e^{\beta x}$
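47 Cox Proportional Hazards Model
- So
-   $S(t, x) = \exp\left(-e^{\beta x} \int_0^t \lambda_0(s)\,ds\right) = [S_0(t)]^{e^{\beta x}}$, where $S_0(t) = \exp\left(-\int_0^t \lambda_0(s)\,ds\right)$
- Estimate regression coefficients (non-linear estimation): $\beta$, $SE(\beta)$
- Example
-   $x_1$ = 1 Trt
-         = 2 Control
-   $x_2$ = Covariate 1
-   $\beta_1$ = indicator of treatment effect, adjusted for $x_2, x_3, \ldots$
- If no covariates, except for treatment group ($x_1$), PHM = logrank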
48 Homework Problem
1. Kaplan-Meier
2. Gehan-Wilcoxon
3. Mantel-Haenszel
a: D = drug, P = placebo
b: In weeks
c: A = alive, D = dead
Source: P.B. Gregory (1974)
49 Survival Analysis Summary
- Time to event methodology very useful in multiple settings
- Can estimate time to event probabilities or survival curves
- Methods can compare survival curves
- Can stratify for subgroups
- Can adjust for baseline covariates using regression model
- Need to plan for this in sample size estimation and overall design