Analysis of Survival Data Time to an Event - PowerPoint PPT Presentation

1
Analysis of Survival Data (Time to an Event)
Paula K. Roberson, Ph.D., Biostatistics
GCRC Research Education and Training Seminar, March 11, 2005
2
Outline
  • What is survival data?
  • Why do we need special methods?
  • Assumptions about censoring
  • Estimating survival curves
  • Comparing survival curves
  • Incorporating covariates and prognostic factors

3
Methods are called survival analysis for
historical reasons, but are useful for analyzing
time to events other than death -- e.g.,
  • time to relapse (pediatric ALL)
  • time to neutropenia (bactrim vs. amoxicillin for
    otitis media, serial WBCs)
  • time to pregnancy (infertility studies)
  • time to palpable tumor (animal carcinogenicity
    studies)
Why are standard methods of estimation (e.g.,
sample mean/median) and analysis (t-tests,
chi-square, linear regression) inadequate for
these situations?
4
(No Transcript)
5
Censoring
Censoring occurs when a subject is observed for
some period of time without the event of interest
(death, relapse, bone marrow engraftment, etc.)
occurring.
Censoring may result from
  • Loss to follow-up
  • Follow-up ends before event occurs
  • Competing risks -- e.g., bone marrow transplant
    patient dies of opportunistic infection before
    engraftment; ALL patient dies in automobile
    accident before relapsing

6
When the prolonged observation of an individual
is not necessary to assess occurrence of the
event (as in surgical mortality), 2x2
contingency chi-square analysis may be used to
assess differences in survival between groups
of subjects.
Example
             ...    Non-Emergency    Total
  Dead        24          9            33
  Alive      289        100           389
  Total      313        ...           ...
Chi-Square = 0.04, Degrees of Freedom = (2-1)(2-1) = 1,
p = 0.84
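The arithmetic behind this example can be checked with a short sketch. It assumes the four cell counts given for the surgical-mortality example (24 and 9 deaths, 289 and 100 survivors) and uses the Pearson chi-square without continuity correction; the function name `chi_square_2x2` is made up for illustration.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Rows: dead, alive. Columns: the two groups being compared.
chi2, p_value = chi_square_2x2(24, 9, 289, 100)
print(f"chi-square = {chi2:.2f}, p = {p_value:.2f}")
```

A statistic this close to zero means the observed counts are almost exactly what independence would predict, so the two groups show no detectable difference in mortality.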
7
Assumption: The censoring process is
independent of the event (failure) process.
-- Violations can be subtle, e.g.,
patients might drop out of a study because
advanced disease makes them feel they are too
weak or ill to continue.
8
How do we account for the partial information
provided by censored observations?
With time measured (approximately) continuously
(in days or weeks): Kaplan-Meier plots (other names:
actuarial curves, product-limit curves, survival
curves).
(If event times are grouped into larger time
intervals such as years or decades, use special
but similar methods.)
9
Basis: The probability of surviving 2 days is the
probability of surviving day 2 given survival of
day 1, multiplied by the probability of
surviving day 1. The probability of surviving 3 days
is the probability of surviving day 3 given survival
of day 2, multiplied by the probability of
surviving 2 days (see above). Etc.
10
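In notation:
$$P(\text{surviving } t \text{ days}) = P(\text{surviving day } t \mid \text{survived day } t-1) \times P(\text{surviving day } t-1 \mid \text{survived day } t-2) \times P(\text{surviving day } t-2 \mid \text{survived day } t-3) \times \cdots \times P(\text{surviving day } 3 \mid \text{survived day } 2) \times P(\text{surviving day } 2 \mid \text{survived day } 1) \times P(\text{surviving day } 1)$$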
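The product of conditional probabilities can be illustrated with a running product. This is only a sketch: the daily probabilities in `conditional` are hypothetical values chosen for illustration, not data from the presentation.

```python
from itertools import accumulate
from operator import mul

# Hypothetical conditional survival probabilities:
# P(survive day 1), P(survive day 2 | survived day 1), and so on.
conditional = [0.95, 0.90, 1.00, 0.80]

# The running product gives P(surviving t days) for t = 1, 2, 3, 4.
cumulative = list(accumulate(conditional, mul))

for day, prob in enumerate(cumulative, start=1):
    print(f"P(surviving {day} days) = {prob:.3f}")
```

Note that a day with conditional probability 1.00 (no events) leaves the cumulative probability unchanged, which is why the estimated curve is flat between event times.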
11
Example: Remission time of acute leukemia
  • Patients randomly assigned
  • Purpose: evaluate drug's ability to maintain
    remissions
  • Study terminated after 1 year
  • Different follow-up times due to sequential
    enrollment
6-MP: 6,6,6,7,10,22,23,6,9,10,11,17,19,20,25,32,32,34,35
Placebo: 1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23
12
Statistic of Interest: t-year survival rate
(in this case, actually, remission duration in
weeks)
= (number of individuals relapse-free longer
than t weeks) / (total number of individuals in
data set)
Without censoring -- Placebo group: 10-wk
remission duration rate = 8/21 x 100 = 38.1%
What to do about censoring? Product-limit
or Kaplan-Meier method for estimating
survival rates
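As a quick check of the uncensored calculation, the sketch below counts the placebo remission times (in weeks, as listed in the example) that exceed 10 weeks. The helper name `crude_rate` is an assumption for illustration, and the approach is valid only when no observation is censored.

```python
# Placebo remission times in weeks, as listed in the example.
placebo_weeks = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 8,
                 11, 11, 12, 12, 15, 17, 22, 23]

def crude_rate(times, t):
    """Fraction of subjects event-free longer than t (no censoring allowed)."""
    return sum(1 for x in times if x > t) / len(times)

rate = crude_rate(placebo_weeks, 10)
print(f"{100 * rate:.1f}%")  # 8 of 21 patients
```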
13
How to Calculate Survival Rate?
Column 1
Column 2: Ranks 1 to n
Column 3: Ranks for uncensored observations only
Column 4
Column 5: Time t survival rate: multiply values in
Col. 4 up to and including t
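A rank-based version of this calculation might look like the sketch below. It assumes, since the transcript does not say, that Column 4 holds the factor (n - r) / (n - r + 1) for each uncensored rank r, which is the usual product-limit factor. The function name and the six-subject data set are hypothetical.

```python
def kaplan_meier_by_rank(times, events):
    """Product-limit estimate computed from ranks.

    times:  follow-up time for each subject
    events: True if the event was observed, False if censored
    Returns a list of (time, estimated survival) at each event time.
    """
    n = len(times)
    # Rank by time; at tied times, events are ranked before censored subjects.
    order = sorted(range(n), key=lambda i: (times[i], not events[i]))
    survival = 1.0
    by_time = {}
    for rank, i in enumerate(order, start=1):
        if events[i]:
            # Assumed Column 4 factor for an uncensored rank.
            survival *= (n - rank) / (n - rank + 1)
            by_time[times[i]] = survival  # keep the value after the last tie
    return sorted(by_time.items())

# Hypothetical data: six subjects, two of them censored.
times = [3, 5, 5, 8, 10, 12]
events = [True, False, True, True, False, True]
curve = kaplan_meier_by_rank(times, events)
for t, s in curve:
    print(f"t = {t:>2}: S(t) = {s:.3f}")
```

Censored subjects contribute no factor of their own, but they still occupy a rank, so they shrink the denominator for every later event.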
14
(No Transcript)
15
(No Transcript)
16
Observations
  • The Kaplan-Meier curve is a step function --
    i.e., it does not change on days when no events
    occur. Step sizes are not all equal; they depend
    on changes in the denominator.
  • Even with heavy censoring, the Kaplan-Meier curve
    is an unbiased estimate of the true (population)
    survival curve. Censoring affects the precision
    but not the accuracy (bias).
  • Censoring must be independent of occurrence of
    endpoint for estimate to be unbiased.

17
Observations (Cont.)
  • If there is no censoring, the Kaplan-Meier
    estimate is the same as the simple observed
    proportion surviving. For example, if there
    are 100 observations and no censoring the curve
    will have the value 0.99 between the first and
    second failure times (assuming only one
    individual failed at time 1). If there are two
    failures at time 2 (3 failures now), the curve
    will be 0.97 between times 2 and 3.
  • Don't over-interpret plateaus!
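The no-censoring case with 100 observations can be reproduced in a few lines. In this sketch the failure times after time 2 are hypothetical (one failure at each of times 3 through 99), and `km_no_censoring` is an illustrative name.

```python
from collections import Counter

def km_no_censoring(event_times):
    """Kaplan-Meier estimate when every subject's event is observed."""
    n_at_risk = len(event_times)
    survival = 1.0
    curve = {}
    for t, deaths in sorted(Counter(event_times).items()):
        survival *= (n_at_risk - deaths) / n_at_risk
        curve[t] = survival
        n_at_risk -= deaths
    return curve

# One failure at time 1, two at time 2, then 97 hypothetical later failures.
times = [1] + [2, 2] + list(range(3, 100))
curve = km_no_censoring(times)

for t in (1, 2):
    observed = sum(1 for x in times if x > t) / len(times)
    print(f"t = {t}: KM = {curve[t]:.2f}, observed proportion = {observed:.2f}")
```

The product telescopes: (99/100) x (97/99) = 97/100, which is exactly the observed proportion still event-free.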

18
(No Transcript)
19
(No Transcript)
20
  • There are estimators of the variance of the
    Kaplan-Meier estimate at any time point t. These
    can be used to calculate a confidence interval
    for the proportion surviving at time t (e.g., a
    five-year survival rate for breast cancer
    patients).
  • There are statistical tests for comparing the
    survival durations in two or more groups. The
    most frequently used are the log-rank test and
    the Gehan test. Both have test statistics that
    are compared to critical values of the chi-square
    distribution.
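A two-group log-rank test can be sketched as follows. This is a minimal illustration, not a substitute for a statistics package: it covers only the log-rank test (not the Gehan test), uses the hypergeometric variance, and gets the p-value from the chi-square distribution with 1 degree of freedom. The function name and both data sets are hypothetical.

```python
import math

def log_rank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    events_*: True if the event was observed, False if censored.
    Returns (chi-square statistic on 1 df, p-value).
    """
    pooled = ([(t, e, 0) for t, e in zip(times_a, events_a)] +
              [(t, e, 1) for t, e in zip(times_b, events_b)])
    event_times = sorted({t for t, e, _ in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for x, _, _ in pooled if x >= t)                # at risk, both groups
        n_a = sum(1 for x, _, g in pooled if x >= t and g == 0)   # at risk, group A
        d = sum(1 for x, e, _ in pooled if x == t and e)          # events at t
        d_a = sum(1 for x, e, g in pooled if x == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Hypothetical follow-up times (weeks) with censoring flags.
group_a = ([6, 7, 10, 15, 19, 25], [True, True, False, True, False, False])
group_b = ([1, 2, 3, 5, 8, 11], [True, True, True, True, True, False])
chi2, p = log_rank_test(*group_a, *group_b)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```

At each event time the test compares the number of events seen in group A with the number expected if both groups shared one survival curve, then sums those differences over all event times.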

21
Covariates and Prognostic Factors
  • Regression models for survival data allow us
    to
  • evaluate more than one risk factor at a time
  • evaluate relative treatment effects while
    controlling for potential confounding factors
  • investigate interactive effects among factors
  • They are not a panacea for flawed study designs!!
  • The model most often used is the proportional
    hazards model developed by Cox in 1972, often
    referred to simply as the Cox model.

22
  • The hazard (instantaneous failure rate) function
    is more conventional mathematically than the
    survivor function. There is a one-to-one
    relationship between the functions, so
    identifying factors which affect the hazard
    identifies those factors which affect survival.
  • A strong advantage of the proportional hazards
    model is that we do not need to make assumptions
    about the form of the failure time distribution
    for a given set of covariate values. We do
    assume, however, that the covariate value has the
    same proportional effect on increasing or
    decreasing an individual's hazard relative to the
    baseline, regardless of time.
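In symbols, the two points above can be written as follows. The notation is the standard textbook form rather than anything shown on the slides: $h_0(t)$ is the unspecified baseline hazard, $x_1, \dots, x_p$ are covariates, and $\beta_1, \dots, \beta_p$ are regression coefficients.

```latex
% One-to-one relationship between hazard and survivor function
S(t) = \exp\left( -\int_0^t h(u)\, du \right)

% Proportional hazards model: baseline hazard times a covariate effect
h(t \mid x) = h_0(t) \exp(\beta_1 x_1 + \cdots + \beta_p x_p)

% Hazard ratio for two covariate vectors x and x^*: free of t
\frac{h(t \mid x)}{h(t \mid x^*)}
  = \exp\left( \beta_1 (x_1 - x_1^*) + \cdots + \beta_p (x_p - x_p^*) \right)
```

Because $h_0(t)$ cancels in the ratio, the covariate effect is the same multiple of the baseline at every time point, which is the proportionality assumption stated above.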

23
  • We estimate the regression parameters β using a
    principle called maximum likelihood. We use what
    we know about the asymptotic (infinite sample
    size) behavior of these estimates to make
    inferences about our finite samples. Clearly,
    the smaller our sample size, the more
    questionable are our approximations.
  • In practice, we usually don't do too badly.
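To make the estimation step concrete, here is a minimal sketch for a single covariate. It assumes the quantity being maximized is the Cox partial likelihood (the usual choice for the proportional hazards model, though the slides say only "maximum likelihood"), that there are no tied event times, and that Newton's method converges. The function name and the eight-subject data set are hypothetical.

```python
import math

def cox_fit_one_covariate(times, events, x, iterations=25):
    """Maximize the Cox partial log-likelihood for one covariate.

    Assumes no tied event times. Returns (beta_hat, standard_error).
    """
    beta = 0.0
    information = 0.0
    for _ in range(iterations):
        score = 0.0        # first derivative of the partial log-likelihood
        information = 0.0  # negative second derivative
        for i, (t_i, e_i) in enumerate(zip(times, events)):
            if not e_i:
                continue
            # Risk set: subjects still under observation at time t_i.
            risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
            weights = [math.exp(beta * x[j]) for j in risk]
            total = sum(weights)
            mean_x = sum(w * x[j] for w, j in zip(weights, risk)) / total
            mean_x2 = sum(w * x[j] ** 2 for w, j in zip(weights, risk)) / total
            score += x[i] - mean_x
            information += mean_x2 - mean_x ** 2
        if information == 0:
            break
        step = score / information  # Newton update
        beta += step
        if abs(step) < 1e-10:
            break
    se = 1 / math.sqrt(information) if information > 0 else float("inf")
    return beta, se

# Hypothetical data: x = 1 marks subjects with a risk factor.
times = [2, 3, 5, 7, 8, 11, 13, 17]
events = [True] * 8
x = [1, 1, 0, 1, 0, 1, 0, 0]
beta_hat, se = cox_fit_one_covariate(times, events, x)
print(f"beta = {beta_hat:.3f}, SE = {se:.3f}, hazard ratio = {math.exp(beta_hat):.2f}")
```

The standard error comes from the asymptotic approximation mentioned above; with only eight subjects it is large relative to the estimate, which illustrates why small samples make the approximation questionable.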

24
Rule of Thumb
Need 10 times as many observed events as factors
in the model, e.g., 3 factors, 30 events.
The distribution across categories is important as
well as the total sample size. For example, if
failure to thrive (FTT) is a factor you wish to
control for but you only have two patients out of
your sample of 100 who have FTT, the estimated
effect of FTT will be unreliable.
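The rule of thumb is simple arithmetic, sketched here with a made-up helper name. Note that it counts observed events, not subjects, so censored observations do not add to the budget.

```python
def max_factors(observed_events, events_per_factor=10):
    """Largest number of model factors the rule of thumb supports."""
    return observed_events // events_per_factor

print(max_factors(30))  # 30 events support 3 factors
print(max_factors(45))  # 45 events support 4 factors
```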
25
Why Study Prognostic Factors?
1. To learn about natural history of disease
2. To adjust for imbalances in comparing treatments
3. To aid in designing future studies
4. To look for treatment-covariate interaction
5. To predict outcome for individual patients
6. To intervene in the course of disease
7. To explain variation and detect interaction
--Byar (in Buyse, et al.)
26
How do we identify prognostic factors?
A. Initial screening
B. Developing multivariate models
27
Developing Multivariate Models
  • We draw conclusions about the importance of the
    factor in question by making inferences about the
    magnitude and sign (+/-) of the regression
    coefficient associated with that factor.
  • Because of inter-relationships among the
    prognostic factors, the values of the
    corresponding regression coefficients (and hence,
    their statistical significance) will depend on
    what other factors are in the model.
  • The purpose of the modeling determines the
    modeling strategy.

28
All models are wrong, but some are
useful. --G. E. P. Box
29
All who drink of this remedy recover in a short
time except those whom it does not help, who
all die. Therefore, it is obvious that it fails
only in incurable cases.
--attributed to Galen, 2nd century
30
...the objectives of science in medicine are
merely to set limits to our ignorance rather
than providing us with certainty in all
therapeutic decision-making.
--Baum, Houghton and Abrams, Statistics in
Medicine 13:1465, 1994.