A Brief Introduction to Survival Analysis

Provided by: UAMS

Transcript and Presenter's Notes

1
A Brief Introduction to Survival Analysis
Page C. Moore, Ph.D.
Department of Biostatistics
GCRC Clinical Research Course
October 12, 2006
2
Outline and Course Objectives

What are survival data?
Why do we need special methods?
Assumptions about censoring
Estimating survival curves
Comparing survival curves
Incorporating covariates and prognostic factors

3
Methods are called survival analysis for historical reasons, but they are useful for analyzing time to events other than death, e.g.,
- time to relapse (pediatric ALL studies)
- time to pregnancy (infertility studies)
- time to developmental milestones (infant studies related to size at birth)
- time to divorce (marital studies)
- time to drop-out (high school retention studies)
Why are standard methods of estimation (e.g., sample mean/median) and analysis (t-tests, chi-square, linear regression) inadequate for these situations?
4
Censoring
5
Censoring
Censoring occurs when a subject is observed for
some period of time without the event of interest
(death, relapse, bone marrow engraftment, etc.)
occurring.

Censoring may result from
- Loss to follow-up
- Follow-up ends before event occurs
- Competing risks (e.g., bone marrow transplant patient dies of opportunistic infection before engraftment; ALL patient dies in automobile accident before relapsing)

6
When the prolonged observation of an individual is not necessary to assess occurrence of the event
Example: Surgical mortality
Statistical analysis: 2x2 contingency table chi-square analysis may be used to assess differences in survival between groups of subjects.

          ...    Non-Emergency    Total
Dead       24          9            33
Alive     289        100           389
Total     313        109           422

Chi-Square = 0.04, Degrees of Freedom = (2-1)(2-1) = 1, p = 0.84
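The chi-square value on this slide can be checked with a few lines of code. The sketch below assumes the slide's table held the counts 24 dead / 289 alive in one group and 9 dead / 100 alive in the non-emergency group; the label of the first group did not survive, so the name `emergency` in the comments is an assumption. It uses the uncorrected Pearson statistic and only the standard library.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper tail of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Rows: dead, alive. Columns: first group (assumed "emergency"), non-emergency.
chi2, p_value = chi_square_2x2(24, 9, 289, 100)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2f}")
```

With these counts the statistic rounds to 0.04 and the p-value is far from significance, which is what a chi-square this small on 1 degree of freedom implies.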
7
Assumption -

The censoring process is independent of the event
(failure) process.
Violations can be subtle,
e.g., patients might drop out of a study because
advanced disease makes them feel they are too
weak or ill to continue

8
How can we account for partial information provided by censored observations?
- Time measured (approximately) continuously (e.g., days or weeks): Kaplan-Meier plots (a.k.a. actuarial curves, product-limit curves, survival curves)
- Event times grouped into larger time intervals (e.g., years or decades): use special but similar methods

9
Basis - Survival Rates
Probability of surviving 2 days is the probability of surviving day 2 given survival of day 1, multiplied by the probability of surviving day 1.
Probability of surviving 3 days is the probability of surviving day 3 given survival of day 2, multiplied by the probability of surviving 2 days (see above).
. . . Etc.
10
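Survival Rates - In Notation
$$
\begin{aligned}
P(\text{surviving } t \text{ days}) ={} & P(\text{surviving day } t \mid \text{survived day } t-1) \\
& \times P(\text{surviving day } t-1 \mid \text{survived day } t-2) \\
& \times P(\text{surviving day } t-2 \mid \text{survived day } t-3) \\
& \times \cdots \\
& \times P(\text{surviving day } 3 \mid \text{survived day } 2) \\
& \times P(\text{surviving day } 2 \mid \text{survived day } 1) \\
& \times P(\text{surviving day } 1)
\end{aligned}
$$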
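As a small illustration of this chain rule, the sketch below multiplies a run of conditional daily survival probabilities into cumulative survival rates. The three probabilities are hypothetical values chosen only for illustration; they do not come from the presentation.

```python
from itertools import accumulate
from operator import mul

# Hypothetical conditional probabilities (illustration only):
# P(survive day 1), P(survive day 2 | survived day 1), P(survive day 3 | survived day 2)
conditional = [0.98, 0.95, 0.97]

# The running product gives P(surviving t days) for t = 1, 2, 3.
cumulative = list(accumulate(conditional, mul))

for day, prob in enumerate(cumulative, start=1):
    print(f"P(surviving {day} days) = {prob:.4f}")
```

Each cumulative value reuses the previous one, which is why the slide can say "see above" for the probability of surviving 2 days.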
11
Example: Remission time of acute leukemia

Purpose: evaluate the drug's ability to maintain remissions
Patients randomly assigned
Study terminated after 1 year
Different follow-up times due to sequential enrollment
6-MP (weeks; + = censored): 6, 6, 6, 7, 10, 22, 23, 6+, 9+, 10+, 11+, 17+, 19+, 20+, 25+, 32+, 32+, 34+, 35+
Placebo (weeks): 1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23

12
Example: Remission time of acute leukemia

Statistic of interest: t-year survival rate (here t is in weeks)
$$\text{rate}(t) = \frac{\text{number of individuals relapse-free longer than } t \text{ weeks}}{\text{total number of individuals in data set}}$$
Without censoring - Placebo group
10-wk remission duration rate = 8/21 x 100 = 38.1%
What can we do about censoring?
Kaplan-Meier (product-limit) method for estimating survival rates
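The 10-week figure for the placebo group can be reproduced directly from the listed remission times. This is a minimal sketch; the function name `relapse_free_rate` is made up for illustration, and the simple proportion is only appropriate because none of the placebo times is censored.

```python
# Placebo remission times in weeks, as listed in the example.
placebo_weeks = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8,
                 11, 11, 12, 12, 15, 17, 22, 23]

def relapse_free_rate(times, t):
    """Fraction of subjects relapse-free longer than t (no censoring allowed)."""
    return sum(1 for x in times if x > t) / len(times)

rate = relapse_free_rate(placebo_weeks, 10)
print(f"10-week remission duration rate = {100 * rate:.1f}%")
```

Eight of the 21 placebo patients stayed in remission longer than 10 weeks, so the script prints 38.1%.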

13
How Can I Calculate a Survival Rate?
Column 1
Column 2
Column 3
Column 4
Column 5
Ranks 1 to n
Ranks for uncensored observations only
Time t survival rate: multiply values in Col. 4 up to and including t
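The same product can be computed programmatically. The sketch below is one way to code the Kaplan-Meier (product-limit) estimate, not necessarily the tabular layout used on the slide: it works from the number at risk and the number of events at each distinct time, which gives the same result as the rank-based columns. The 6-MP times are those listed in the example, under the assumption that the second run of values (starting again at 6) are censored observations.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event (relapse) was observed, 0 if censored
    """
    n_events = Counter(t for t, e in zip(times, events) if e)
    n_leaving = Counter(times)  # each subject leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(n_leaving):
        d = n_events.get(t, 0)
        if d:
            # Conditional probability of getting past time t, given at risk at t.
            survival *= (at_risk - d) / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving[t]  # remove events and censored subjects
    return curve

# 6-MP group (weeks); the censoring split is an assumption, see the text above.
relapsed = [6, 6, 6, 7, 10, 22, 23]
censored = [6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35]
times = relapsed + censored
events = [1] * len(relapsed) + [0] * len(censored)

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"week {t:2d}: S(t) = {s:.3f}")
```

Subjects censored at a time when an event occurs are counted as at risk for that event, which is the usual convention.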
14
(No Transcript)
15
(No Transcript)
16
Comments and Observations

The Kaplan-Meier curve is a step function (i.e., it does not change on days when no events occur).
Step sizes are not all equal; they depend on changes in the denominator.
Even with heavy censoring, the Kaplan-Meier curve
is an unbiased estimate of the true (population)
survival curve. Censoring affects the precision
but not the accuracy (bias).
Censoring must be independent of occurrence of
endpoint for estimate to be unbiased.

17
Comments and Observations

If there is no censoring, the Kaplan-Meier
estimate is the same as the simple observed
proportion surviving.
For example, if there are 100 observations and no
censoring the curve will have the value 0.99
between the first and second failure times
(assuming only one individual failed at time 1).
If there are two failures at time 2 (3 failures
now), the curve will be 0.97 between times 2 and
3.
Don't overinterpret plateaus!
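The arithmetic in the 100-observation example can be written out as a product of conditional proportions. Here $\hat S$ denotes the Kaplan-Meier estimate and $t_1$, $t_2$ the first two failure times; this notation is assumed for illustration.

$$
\hat S(t_1) = \frac{99}{100} = 0.99, \qquad
\hat S(t_2) = \frac{99}{100} \times \frac{97}{99} = \frac{97}{100} = 0.97
$$

The 99s cancel, which is why, without censoring, the estimate reduces to the observed proportion still surviving.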

18
(No Transcript)
19
(No Transcript)
20
Additional Comments

There are estimators of the variance of the Kaplan-Meier estimate at any time point t. These can be used to calculate a confidence interval for the proportion surviving at time t (e.g., a five-year survival rate for breast cancer patients).
There are statistical tests for comparing the
survival durations in two or more groups. The
most frequently used are the log-rank test and
the Gehan test. Both have test statistics that
are compared to critical values of the chi-square
distribution.
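A minimal sketch of the two-group log-rank test follows, written from the standard observed-minus-expected formulation and using only the standard library. The function name `logrank_test` is made up for illustration. The data are the remission times from the leukemia example, under the assumption that the second run of 6-MP values are censored observations.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value) on 1 df."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for x, _, g in data if x >= t and g == 0)  # at risk, group A
        n = sum(1 for x, _, _ in data if x >= t)                # at risk, both groups
        d_a = sum(1 for x, e, g in data if x == t and e and g == 0)
        d = sum(1 for x, e, _ in data if x == t and e)
        observed_a += d_a
        expected_a += d * n_a / n  # expected events in A if the groups do not differ
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    chi2 = (observed_a - expected_a) ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p

# 6-MP: relapses followed by (assumed) censored times; placebo: all relapses.
mp_times = [6, 6, 6, 7, 10, 22, 23, 6, 9, 10, 11, 17, 19, 20, 25, 32, 32, 34, 35]
mp_events = [1] * 7 + [0] * 12
placebo_times = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8,
                 11, 11, 12, 12, 15, 17, 22, 23]
placebo_events = [1] * len(placebo_times)

chi2, p_value = logrank_test(mp_times, mp_events, placebo_times, placebo_events)
print(f"log-rank chi-square = {chi2:.2f}, p = {p_value:.5f}")
```

The statistic is compared with the chi-square distribution on 1 degree of freedom, whose 5% critical value is 3.84.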

21
Covariates and Prognostic Factors

Regression models for survival data allow us to

- Evaluate more than one risk factor at a time
- Evaluate relative treatment effects while controlling for potential confounding factors
- Investigate interactive effects among factors

They are not a panacea for flawed study designs!
The model most often used is the proportional hazards model developed by Cox in 1972, often referred to simply as the Cox model.

22
Covariates and Prognostic Factors

The hazard (instantaneous failure rate) function is more convenient mathematically than the survivor function.
- There is a one-to-one relationship between the functions, so identifying factors which affect the hazard identifies those factors which affect survival.
A strong advantage of the proportional hazards model is that we do not need to make assumptions about the form of the failure time distribution for a given set of covariate values.
- However, it is assumed that the covariate value has the same proportional effect on increasing or decreasing an individual's hazard relative to the baseline, regardless of time.
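In symbols, the proportional hazards assumption is commonly written as below. The slide itself gives no formula, so the notation is an assumption: $h_0(t)$ is the baseline hazard, $x_1, \dots, x_p$ are the covariate values, $\beta_1, \dots, \beta_p$ are the regression parameters, and $S(t)$ is the survivor function.

$$
h(t \mid x) = h_0(t)\,\exp(\beta_1 x_1 + \cdots + \beta_p x_p),
\qquad
\frac{h(t \mid x)}{h(t \mid x^{*})} = \exp\!\big(\beta_1 (x_1 - x_1^{*}) + \cdots + \beta_p (x_p - x_p^{*})\big)
$$

$$
S(t \mid x) = \exp\!\left(-\int_0^t h(u \mid x)\,du\right)
$$

The ratio of hazards for two covariate patterns $x$ and $x^{*}$ does not involve $t$, which is the "regardless of time" condition. The second line is the one-to-one link between hazard and survivor functions.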

23
Covariates and Prognostic Factors

We estimate the regression parameters β using a principle called maximum likelihood.
We use what we know about the asymptotic (infinite sample size) behavior of these estimates to make inferences about our finite samples.
- Clearly, the smaller our sample size, the more questionable are our approximations.
In practice, we usually don't do too badly.

24
Rule of Thumb: Sample Size

Need 10 times as many observed events as factors in the model (e.g., 3 factors require 30 observed events, 10 events for each factor).
The distribution across categories is important
as well as the total sample size.
For example, if failure to thrive (FTT) is a
factor you wish to control for but you only have
two patients out of your sample of 100 who have
FTT, the estimated effect of FTT will be
unreliable.
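The rule of thumb is simple enough to express as a one-line check. The helper name `max_factors` and its parameter names are hypothetical, and the second call assumes the 6-MP group from the leukemia example has 7 observed relapses.

```python
def max_factors(observed_events, events_per_factor=10):
    """Largest number of model factors the 10-events-per-factor rule would support."""
    return observed_events // events_per_factor

print(max_factors(30))  # 3 factors, matching the example on the slide
print(max_factors(7))   # 0: too few observed events to support even one factor
```

Note that the count is of observed events, not subjects, so heavily censored data support fewer factors than the sample size alone would suggest.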

25
Why Study Prognostic Factors?
1. To learn about natural history of disease
2. To adjust for imbalances in comparing treatments
3. To aid in designing future studies
4. To look for treatment-covariate interaction
5. To predict outcome for individual patients
6. To intervene in the course of disease
7. To explain variation and detect interaction
--Byar (in Buyse, et al., 1988)
26

How Do We Identify Prognostic Factors?
A. Initial screening
B. Developing multivariate models

27
Developing Multivariate Models

We draw conclusions about the importance of the factor in question by making inferences about the magnitude and sign (+/-) of the regression coefficient associated with that factor.
Because of inter-relationships among the
prognostic factors, the values of the
corresponding regression coefficients (and hence,
their statistical significance) will depend on
what other factors are in the model.
The purpose of the modeling determines the
modeling strategy.

28
Additional Resources

Text
Kleinbaum, D.G. and Klein, M., Survival Analysis: A Self-Learning Text, Springer, New York, 2005.
Klein, J.P. and Moeschberger, M.L., Survival Analysis, Springer, New York, 2005.
Computer Software
SAS (http://www.sas.com/)
S-plus (http://www.splus.com/)
NCSS (http://www.ncss.com/download.html)
