Title: Survival Analysis
1. Survival Analysis
- Bandit Thinkhamrop, PhD. (Statistics)
- Department of Biostatistics and Demography
- Faculty of Public Health, Khon Kaen University
2. Begin at the conclusion
3. Type of the study outcome: the key to selecting appropriate statistical methods
- Study outcome
  - Dependent variable or response variable
  - Focus on the primary study outcome if there is more than one
- Type of the study outcome
  - Continuous
  - Categorical (dichotomous, polytomous, ordinal)
  - Numerical (Poisson) count
  - Event-free duration
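4. The outcome determines the statistics
Outcome type         Summary statistics                        Regression method
Continuous           Mean, median                              Linear regression
Categorical          Proportion (prevalence or risk)           Logistic regression
Count                Rate per space                            Poisson regression
Event-free duration  Median survival, risk of events at T(t)   Cox regression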
5. Statistics quantify errors for judgments
6. Back to the conclusion
- Appropriate statistical methods
  - Mean, median
  - Proportion (prevalence or risk)
  - Rate per space
  - Median survival, risk of events at T(t)
- Magnitude of effect, 95% CI, P-value
- Answer the research question based on the lower or upper limit of the CI
7. Study outcome
- Survival outcome: event-free duration
  - Event (1 = Yes, 0 = Censored)
  - Duration or length of time between
    - Start date
    - End date
- At the start, no one has had the event (event = 0) at time t(0)
- At any point after the start, the event can occur, giving a failure (event = 1) at time t(t)
- At the end of the study period, if the event has not occurred, the observation is censored (event = 0)
- Thus, the duration is either time-to-event or time-to-censoring
8. Censoring
- Censored data: incomplete time-to-event data
- In the presence of censoring, the exact time to event is not known
- The duration indicates that no event has occurred from the start date up to the last date assessed or observed, a.k.a. the end date
- The end date could be
  - End of the study
  - Last observation prior to the end of the study due to
    - Loss to follow-up
    - Withdrawn consent
    - Competing events that prevent progression to the event under observation
    - Changes in explanatory variables that make them irrelevant to the occurrence of the event under observation
9. Magnitude of effects
- Median survival
- Survival probability
- Hazard ratio
10. SURVIVAL ANALYSIS
- Study aims
  - Median survival
    - Median survival of liver cancer
  - Survival probability
    - Five-year survival of liver cancer
    - Five-year survival rate of liver cancer
  - Hazard ratio
    - Factors affecting liver cancer survival
    - Effect of chemotherapy on liver cancer survival
11. SURVIVAL ANALYSIS
Event
- Negative: dead, infection, relapsed, etc.
- Positive: cured, improved, conception, discharged, etc.
- Neutral: smoking cessation, etc.
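12. Natural History of Cancer
13. Accrual, Follow-up, and Event
[Figure: timeline for subjects ID 1 to 6 from the beginning to the end of the study. The recruitment period runs from the start of accrual to the end of accrual, and the follow-up period runs to the end of follow-up. Two subjects are marked Dead.]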
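14. Time since the beginning of the study
[Figure: follow-up time of each subject on a time axis with ticks from 0 to 4. ID 1: 48 months; ID 2: 22 months; ID 3: 14 months (dead); ID 4: 40 months (dead); ID 5: 26 months; ID 6: 13 months.]
The data: >48, >22, 14, 40, >26, >13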
15. DATA
ID  SURVIVAL TIME (Months)  OUTCOME AT THE END OF THE STUDY                  EVENT
1   48                      Still alive at the end of the study              Censored
2   22                      Dead due to accident                             Censored
3   14                      Dead caused by the disease under investigation   Dead
4   40                      Dead caused by the disease under investigation   Dead
5   26                      Still alive at the end of the study              Censored
6   13                      Lost to follow-up                                Censored
16. DATA
ID  TIME  EVENT     EVENT (1 = Dead, 0 = Censored)
1   48    Censored  0
2   22    Censored  0
3   14    Dead      1
4   40    Dead      1
5   26    Censored  0
6   13    Censored  0
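The coding step on this slide can be sketched in a few lines of Python. This is only an illustration added to the slides; the names `records` and `EVENT_CODE` are made up for the example.

```python
# Data from the slide: (ID, survival time in months, status at the end of follow-up).
records = [
    (1, 48, "Censored"),
    (2, 22, "Censored"),
    (3, 14, "Dead"),
    (4, 40, "Dead"),
    (5, 26, "Censored"),
    (6, 13, "Censored"),
]

# Event indicator used in survival analysis: 1 = event occurred, 0 = censored.
EVENT_CODE = {"Dead": 1, "Censored": 0}

ids = [r[0] for r in records]
times = [r[1] for r in records]
events = [EVENT_CODE[r[2]] for r in records]

for subject, time, event in zip(ids, times, events):
    print(subject, time, event)
```

Note that subject 2, who died in an accident, is coded 0: the death was not caused by the disease under investigation, so follow-up simply stops at 22 months.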
17. ANALYSIS
ID  TIME  EVENT
1   48    0
2   22    0
3   14    1
4   40    1
5   26    0
6   13    0
- Prevalence: 2/6
- Incidence density: 2/163 person-months
- Proportion surviving at month t
- Median survival time
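The arithmetic behind the first two quantities can be sketched as follows. This is a minimal illustration using the six subjects on the slide; the variable names are chosen here and are not part of the original.

```python
times = [48, 22, 14, 40, 26, 13]   # months of follow-up, from the slide
events = [0, 0, 1, 1, 0, 0]        # 1 = dead, 0 = censored

n_events = sum(events)             # number of deaths
person_months = sum(times)         # total time at risk

# Crude proportion: ignores how long each subject was followed.
crude_proportion = n_events / len(times)

# Incidence density: deaths per unit of person-time at risk.
incidence_density = n_events / person_months
rate_per_100 = 100 * incidence_density

print(f"Crude proportion: {n_events}/{len(times)} = {crude_proportion:.2f}")
print(f"Incidence density: {n_events}/{person_months} person-months "
      f"= {rate_per_100:.1f} per 100 person-months")
```

The crude proportion treats a subject censored at 13 months the same as one followed for 48 months, which is why the person-time denominator and the survival methods that follow are preferred for this kind of outcome.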
18. RESULTS
ID  TIME  EVENT
1   48    0
2   22    0
3   14    1
4   40    1
5   26    0
6   13    0
- Incidence density: 1.2 per 100 person-months (95% CI: 0.1 to 4.4)
- Proportion surviving at 24 months: 80% (95% CI: 20% to 97%)
- Median survival time: 40 months (95% CI: 14 to 48)
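The slide does not say how the interval for the incidence density was obtained. One common choice is the exact Poisson interval, and the sketch below assumes that method; it solves for the limits by bisection so that only the standard library is needed. The function names are illustrative.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson variable with mean lam."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

def bisect(f, lo, hi, tol=1e-10):
    """Root of f between lo and hi; f(lo) and f(hi) must differ in sign."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(lo) > 0) == (f(mid) > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(k, alpha=0.05):
    """Exact (assumed method) confidence limits for a Poisson count k."""
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k | lam) = alpha / 2
        lower = bisect(lambda lam: 1 - poisson_cdf(k - 1, lam) - alpha / 2, 0.0, k + 1.0)
    # Upper limit: P(X <= k | lam) = alpha / 2
    upper = bisect(lambda lam: poisson_cdf(k, lam) - alpha / 2, float(k), k + 50.0)
    return lower, upper

deaths, person_months = 2, 163
low, high = exact_poisson_ci(deaths)
low_per_100 = 100 * low / person_months
high_per_100 = 100 * high / person_months
print(f"95% CI: {low_per_100:.1f} to {high_per_100:.1f} per 100 person-months")
```

Under this assumption the limits round to 0.1 and 4.4 per 100 person-months, which agrees with the interval reported on the slide.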
19. Type of Censoring
- Left censoring: the patient experiences the event in question before the beginning of the study observation period.
- Interval censoring: the patient is followed for a while, then goes on a trip for a while, and then returns to continue being studied.
- Right censoring
  - Single censoring: the patient does not experience the event during the study observation period
    - A patient is lost to follow-up within the study period.
    - A patient experiences the event after the observation period.
  - Multiple censoring: the patient may experience the event multiple times after study observation ends, when the event in question is not death.
20. Summary description of survival data set: stdes
- This command describes summary information about the data set. It provides summary statistics about the number of subjects, records, time at risk, failure events, etc.
21. Computation of S(t)
- Suppose the study time is divided into periods, the number of which is designated by the letter t.
- The survivorship probability is computed by multiplying together the proportions of people surviving each period of the study.
- Subtracting the conditional probability of the failure event in each period from one gives the proportion surviving that period.
- The product of these quantities constitutes the survivorship function.
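The steps above can be sketched in plain Python for the six-subject example. This is a minimal illustration of the product-limit (Kaplan-Meier) calculation, with the periods taken to be the distinct event times; the function names are chosen for this example.

```python
times = [48, 22, 14, 40, 26, 13]   # months, from the six-subject example
events = [0, 0, 1, 1, 0, 0]        # 1 = dead, 0 = censored

def kaplan_meier(times, events):
    """Return [(t, number at risk, deaths, S(t))] at each distinct event time."""
    steps = []
    surv = 1.0
    for t in sorted({u for u, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        # One minus the conditional probability of failure in this period.
        surv *= (at_risk - deaths) / at_risk
        steps.append((t, at_risk, deaths, surv))
    return steps

def survival_at(steps, t):
    """S(t): the value of the step function at time t."""
    surv = 1.0
    for step_time, _, _, s in steps:
        if step_time <= t:
            surv = s
    return surv

def median_survival(steps):
    """First event time at which S(t) drops to 0.5 or below (None if never)."""
    for step_time, _, _, s in steps:
        if s <= 0.5:
            return step_time
    return None

steps = kaplan_meier(times, events)
for t, at_risk, deaths, s in steps:
    print(f"t={t:2d}  at risk={at_risk}  deaths={deaths}  S(t)={s:.2f}")
print("S(24) =", round(survival_at(steps, 24), 2))
print("Median survival =", median_survival(steps), "months")
```

At 14 months five subjects are still at risk and one dies, so S = 4/5 = 0.80; at 40 months two are at risk and one dies, so S = 0.80 × 1/2 = 0.40. This reproduces the 80% survival at 24 months and the 40-month median reported on the results slide.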
22. Kaplan-Meier Methods
23. Kaplan-Meier survival curve
24. Median survival time
25. Survival Function
- The number in the risk set is used as the denominator.
- For the numerator, the number dying in period t is subtracted from the number in the risk set.
- The product of these ratios over the study time gives the survival function.
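Written as a formula, the description above corresponds to the usual product-limit estimator. The symbols are assumptions introduced here, not taken from the slide: $t_i$ are the distinct event times, $n_i$ is the number in the risk set just before $t_i$, and $d_i$ is the number dying at $t_i$.

```latex
\hat{S}(t) = \prod_{i:\, t_i \le t} \frac{n_i - d_i}{n_i}
```

For the six-subject example this gives $\hat{S}(24) = \frac{5-1}{5} = 0.80$ and $\hat{S}(40) = \frac{5-1}{5} \times \frac{2-1}{2} = 0.40$.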
26. Survival experience
27. Survival curve: more than one group
28. Comparing survival between groups
ID TIME DEAD DRUG
1 48 0 1
2 22 0 1
3 14 1 1
4 40 1 1
5 26 0 1
6 13 0 1
7 13 0 0
8 6 1 0
9 12 1 0
10 14 1 0
11 22 1 0
12 13 1 0
29. Kaplan-Meier curve
30. Log-rank test
- t: Time
- n: Number at risk for both groups at time t
- n1: Number at risk for group 1 at time t
- n2: Number at risk for group 2 at time t
- d: Dead for both groups at time t
- c: Censored for both groups at time t
- O1: Number of dead for group 1 at time t
- O2: Number of dead for group 2 at time t
- E1: Number of expected dead for group 1 at time t
- E2: Number of expected dead for group 2 at time t
31. Log-rank test example
- DRUG = 1: 48, 22, 26, 13, 14, 40
- DRUG = 0: 13, 6, 12, 14, 22, 13
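The log-rank calculation for this example can be sketched in plain Python using the quantities defined on the previous slide. The death indicators come from the "Comparing survival between groups" table; treating DRUG = 1 as group 1 and using the hypergeometric variance are assumptions of this sketch, which is not necessarily the exact computation performed by any particular package.

```python
import math

# (time in months, dead) pairs; group 1 is DRUG = 1 and group 2 is DRUG = 0.
group1 = [(48, 0), (22, 0), (14, 1), (40, 1), (26, 0), (13, 0)]
group2 = [(13, 0), (6, 1), (12, 1), (14, 1), (22, 1), (13, 1)]

event_times = sorted({t for t, dead in group1 + group2 if dead == 1})

O1 = E1 = O2 = E2 = V = 0.0
for t in event_times:
    n1 = sum(1 for u, _ in group1 if u >= t)      # at risk in group 1
    n2 = sum(1 for u, _ in group2 if u >= t)      # at risk in group 2
    n = n1 + n2
    d1 = sum(dead for u, dead in group1 if u == t)
    d2 = sum(dead for u, dead in group2 if u == t)
    d = d1 + d2
    O1 += d1
    O2 += d2
    E1 += d * n1 / n                               # expected deaths, group 1
    E2 += d * n2 / n                               # expected deaths, group 2
    if n > 1:
        V += n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))

chi2 = (O1 - E1) ** 2 / V
# Upper tail of a chi-square distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"O1={O1:.0f}  E1={E1:.2f}  O2={O2:.0f}  E2={E2:.2f}")
print(f"chi-square(1) = {chi2:.2f}, p = {p_value:.4f}")
```

Group 1 has 2 observed deaths against about 4.87 expected, while group 2 has 5 against about 2.13, so the drug group has fewer deaths than expected under the hypothesis of equal survival.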
32. Hazard Function
33. Survival Function vs Hazard Function
$H(t) = -\ln(S(t))$, $S(t) = \exp(-H(t))$
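The two identities are inverses of each other, which a short sketch makes concrete. The value S = 0.80 is the 24-month survival from the results slide; the other values are illustrative only.

```python
import math

def cumulative_hazard(s):
    """H(t) = -ln(S(t))"""
    return -math.log(s)

def survival_from_hazard(h):
    """S(t) = exp(-H(t))"""
    return math.exp(-h)

for s in (1.0, 0.8, 0.5):
    h = cumulative_hazard(s)
    print(f"S = {s:.2f}  ->  H = {h:.4f}  ->  S = {survival_from_hazard(h):.2f}")
```

A survival probability of 1 corresponds to a cumulative hazard of 0, and the cumulative hazard grows without bound as survival approaches 0.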
34. Hazard rate
- The conditional probability of the event under study, provided the patient has survived up to and including that time period
- Sometimes called the intensity function, the failure rate, or the instantaneous failure rate
35. Formulation of the hazard rate
The hazard rate can vary from 0 to infinity. It can increase, decrease, or remain constant over time. It is the focal point of much survival analysis.
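The formula shown on the original slide did not survive extraction. For reference, the standard textbook definition of the hazard rate in continuous time is given below; the symbols $T$ (the event time), $f(t)$ (its density) and $\Delta t$ are introduced here and may differ from the slide's notation.

```latex
h(t) = \lim_{\Delta t \to 0} \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
     = \frac{f(t)}{S(t)}
     = -\frac{d}{dt} \ln S(t)
```

Because it is a rate per unit time rather than a probability, $h(t)$ is not bounded above by 1, which is why it can range from 0 to infinity.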
36. Cox Regression
- The Cox model presumes that the ratio of the hazard rate to a baseline hazard rate is an exponential function of the parameter vector.
- $h(t) = h_0(t) \times \exp(b_1X_1 + b_2X_2 + b_3X_3 + \dots + b_pX_p)$
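To make the model concrete, the sketch below fits a Cox model with a single covariate (DRUG) to the 12 subjects from the "Comparing survival between groups" slide. It is a teaching illustration under stated assumptions: the partial likelihood uses the Breslow approximation for tied death times and is maximised by Newton-Raphson. In practice a tested routine such as Stata's `stcox` would be used instead.

```python
import math

# (time, dead, drug) for the 12 subjects in the two-group example.
data = [
    (48, 0, 1), (22, 0, 1), (14, 1, 1), (40, 1, 1), (26, 0, 1), (13, 0, 1),
    (13, 0, 0), (6, 1, 0), (12, 1, 0), (14, 1, 0), (22, 1, 0), (13, 1, 0),
]

def score_and_information(b):
    """Score and information of the Breslow partial log-likelihood at b."""
    score = info = 0.0
    for t in sorted({time for time, dead, _ in data if dead == 1}):
        risk_set = [x for time, _, x in data if time >= t]
        deaths = [x for time, dead, x in data if time == t and dead == 1]
        weights = [math.exp(b * x) for x in risk_set]
        total = sum(weights)
        # Weighted mean and variance of the covariate over the risk set.
        mean_x = sum(w * x for w, x in zip(weights, risk_set)) / total
        var_x = sum(w * x * x for w, x in zip(weights, risk_set)) / total - mean_x ** 2
        score += sum(deaths) - len(deaths) * mean_x
        info += len(deaths) * var_x
    return score, info

b = 0.0                        # start from "no effect"
for _ in range(50):            # Newton-Raphson iterations
    score, info = score_and_information(b)
    step = score / info
    b += step
    if abs(step) < 1e-10:
        break

hazard_ratio = math.exp(b)     # exp(b) is the hazard ratio for DRUG = 1 vs 0
final_score, final_info = score_and_information(b)
print(f"b = {b:.3f}, hazard ratio = {hazard_ratio:.3f}")
```

The estimated coefficient is negative, so the hazard ratio is below 1: in this small example the hazard of death in the drug group is lower than in the comparison group at any given time. The baseline hazard $h_0(t)$ never has to be specified, because it cancels out of the partial likelihood.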
37. Hazard ratio
38. Testing the Adequacy of the model
- We save the Schoenfeld residuals of the model and the scaled Schoenfeld residuals.
- For persons censored, the value of the residual is set to missing.
borrowed from Professor Robert A. Yaffee
39. A graphical test of the proportional hazards assumption
- A graph of the log hazard would reveal 2 lines over time, one for the baseline hazard (when x = 0) and the other for when x = 1
- The difference between these two curves over time should be constant and equal to B
- If we plot the Schoenfeld residuals against the line y = 0, the best-fitting line should be parallel to y = 0.
borrowed from Professor Robert A. Yaffee
40. Graphical tests
- Criteria of adequacy
  - The residuals, particularly the rescaled residuals, plotted against time should show no trend (slope) and should be more or less constant over time.
borrowed from Professor Robert A. Yaffee
41. Other issues
- Time-Varying Covariates
- Interactions may be plotted
- Conditional Proportional Hazards models
- Stratification of the model may be performed. Then the stphtest should be performed for each stratum.
borrowed from Professor Robert A. Yaffee
42. Suggested Readings for beginners
43. Suggested Readings for advanced learners
44. Survival analysis in practice
- For what type of research question should survival analysis be used?
45. Stata for one-group survival analysis
- stset time, failure(event)
- stdescribe
- tab event
- stsum
- strate
- stci
- sts list, at(12 24)
46. Stata for one-group survival analysis (cont.)
- sts g
- sts g, atrisk
- sts g, lost
- sts g, enter
- sts g, risktable
- sts g, cumhaz
- sts g, cumhaz ci
- sts g, hazard
47. Stata for multiple-group survival analysis
- stset time, failure(event)
- stdescribe
- stsum, by(group)
- sts test group
- sts test group, wilcoxon
- strate group
- stci , by(group)
- sts g, by(group) atrisk
- sts g, by(group) risktable
- sts g, by(group) cumhaz lost
- sts g, by(group) hazard ci
48. Stata for multiple-group survival analysis (cont.)
- sts list, by(group) at(12 24)
- sts list, by(group) at(12 24) compare
- ltable group, interval()
- ltable group, graph
- ltable group, hazard
- stmh group
- stmh group, by(strata)
- stmc group
- stcox group
- stir group
49. Stata for Model Fitting
- Continuous covariate
  - xtile newvar = varname, nq(4)
  - tabstat varlist, stat(n min max) by(newvar)
  - xi: stcox i.newvar
  - stsum, by(newvar)
- Categorical covariate
  - tab exposure outcome, col
  - xi: stcox i.exposure
50. Sample size for Cox Model
- stpower cox, failprob(.2) hratio(0.1 0.3) sd(.3) r2(.1) power(0.8 0.9) hr