Introduction to Biostatistics II - PowerPoint PPT Presentation

Slides: 29
Provided by: Constantin86

1
Introduction to Biostatistics II
  • Survival analysis

2
Survival analysis
  • The outcome is survival.
  • More generally, the outcome is the time to an
    event, for example:
  • response
  • failure
  • death
  • pregnancy
  • infection

3
Survival curves for three population groups

4
USA life table 1979-1981

5
Survival curve for the US population, 1979-1981

6
Hemophiliac data example

7
Estimates of the survival curve
  • Consider the probability that an individual
    younger than 40 years of age in the previous data
    set will die at time t = 6 months after initiation
    of observation.
  • This individual must have survived up to the
    six-month point and then expired a short time
    after that, at time point $t + \delta$, where
    $\delta$ symbolizes a very small unit of time.
    How will the probability of a death at six months
    be calculated?
  • Recall the definition of the conditional
    probability of an event B given an event A:
    $P(B \mid A) = P(A \cap B) / P(A)$
  • which results in the multiplicative law of
    probability:
    $P(A \cap B) = P(A)\, P(B \mid A)$

8
Estimates of the survival curve (contd)
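  • Consider two events: A = "survived up to time t"
    and B = "failed at time $t + \delta$" (i.e.,
    shortly after t).
  • The probability of the event "survived up to time
    $t + \delta$", written $S(t + \delta)$, is
    $S(t + \delta) = P(A)\,[1 - P(B \mid A)] = S(t)\,[1 - P(B \mid A)]$
  • i.e., the probability of surviving up to time
    $t + \delta$ is equal to the probability of
    surviving up to t times the probability of not
    failing at $t + \delta$ given survival up to t.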

9
The life table method
  • The life-table method of estimation of the
    survival curve works as follows
  • Split the time scale into J time intervals of
    the type $[t_{j-1}, t_j)$ for $j = 1, \ldots, J$
  • The number of people dying in each interval is $d_j$
  • The number of people alive at the beginning of
    the interval (number at risk) is $r_j$

10
Derivation of the life table method
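  • To derive the life-table estimate of the survival
    distribution we need to estimate the following
    quantities
  • Conditional probability of dying in interval j
    given survival up to j:
    $P(B_j \mid A_j) \approx \hat{q}_j = d_j / r_j$
  • Thus, the conditional probability of surviving
    interval j given survival up to j is estimated by
    $1 - \hat{q}_j$
  • The life-table estimate of the survival
    distribution is constructed as follows, with
    $S(0) = 1$:
    $S(j) = S(j-1)\,(1 - \hat{q}_j) = \prod_{i=1}^{j} (1 - \hat{q}_i)$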
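
The life-table recursion can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure: the failure times are read off the Kaplan-Meier output elsewhere in this deck, the function name `life_table` is hypothetical, and treating the last interval as open-ended is an assumption made so that every observation falls into some interval.

```python
# Failure times in months for the 12 under-40 patients, as listed in the
# Kaplan-Meier output of this deck (all are deaths, no censoring).
times = [2, 3, 6, 6, 7, 10, 15, 15, 16, 27, 30, 32]


def life_table(times, width=5, n_intervals=7):
    """Return rows (start, r_j, d_j, q_j, S_j) for intervals [start, start + width)."""
    rows = []
    at_risk = len(times)   # r_j: number alive at the start of interval j
    surv = 1.0             # running survival estimate, S(0) = 1
    for j in range(n_intervals):
        start = j * width
        last = j == n_intervals - 1   # assumption: last interval is open-ended
        d = sum(1 for t in times if t >= start and (last or t < start + width))
        q = d / at_risk if at_risk else 0.0   # q_j = d_j / r_j
        surv *= 1.0 - q                       # S(j) = S(j-1) * (1 - q_j)
        rows.append((start, at_risk, d, q, surv))
        at_risk -= d
    return rows


table = life_table(times)
for start, r, d, q, s in table:
    print(f"{start:5.1f} {r:4d} {d:3d} {q:7.4f} {s:7.4f}")
```

With 5-month intervals, the cumulative survival column produced this way agrees with the SPSS life table for the under-40 group (.8333, .5833, .5000, .2500, .2500, .1667, .0000).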

11
Life table of under-40 hemophiliac data
This subfile contains 12 observations

Life Table
Survival Variable: survival

Intrvl  Number  Number  Number  Number  Propn   Propn   Cumul   Proba-  Hazard
Start   Entrng  Wdrawn  Exposd  of      Termi-  Sur-    Propn   bility  Rate
Time    this    During  to      Termnl  nating  viving  Surv    Densty
        Intrvl  Intrvl  Risk    Events                  at End
------  ------  ------  ------  ------  ------  ------  ------  ------  ------
    .0    12.0      .0    12.0     2.0   .1667   .8333   .8333   .0333   .0364
   5.0    10.0      .0    10.0     3.0   .3000   .7000   .5833   .0500   .0706
  10.0     7.0      .0     7.0     1.0   .1429   .8571   .5000   .0167   .0308
  15.0     6.0      .0     6.0     3.0   .5000   .5000   .2500   .0500   .1333
  20.0     3.0      .0     3.0      .0   .0000  1.0000   .2500   .0000   .0000
  25.0     3.0      .0     3.0     1.0   .3333   .6667   .1667   .0167   .0800
  30.0     2.0      .0     2.0     2.0  1.0000   .0000   .0000

These calculations for the last interval are meaningless.
The median survival time for these data is 15.00

12
Life table estimate of the survival distribution
13
The Kaplan-Meier method
  • The K-M method differs from the life-table method
    in that it separates the time spectrum according
    to failure times (instead of fixed-width
    intervals).
  • The first interval is (0, 2) (2 is the time of
    the first failure), when 1 of 12 individuals failed
    (died), so 11 of 12 survived. The survival estimate
    at t = 2 is S(2) = 11/12 = 0.9167.
  • The second interval is (2, 3) (the second
    failure happens at t = 3), when 1 of 11 individuals
    fails. The survival estimate at t = 3 is
    S(3) = (11/12)(10/11) = 10/12 = 0.8333, since to
    survive beyond t = 3 you must survive beyond t = 2
    and (given that you survived beyond t = 2) then
    survive beyond t = 3.
  • And so on
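
The "and so on" step is a running product over the distinct failure times. The sketch below assumes no censoring and uses exact fractions so that the intermediate values (11/12, 10/12, ...) stay visible; the function name `km_no_censoring` is hypothetical.

```python
from collections import Counter
from fractions import Fraction

# Failure times in months for the 12 under-40 patients (no censoring).
times = [2, 3, 6, 6, 7, 10, 15, 15, 16, 27, 30, 32]


def km_no_censoring(times):
    """Kaplan-Meier estimate when every observation is a failure."""
    deaths = Counter(times)      # number of failures at each distinct time
    at_risk = len(times)
    surv = Fraction(1)
    curve = {}
    for t in sorted(deaths):
        # Multiply by the conditional probability of surviving past t.
        surv *= Fraction(at_risk - deaths[t], at_risk)
        curve[t] = surv
        at_risk -= deaths[t]
    return curve


curve = km_no_censoring(times)
for t, s in curve.items():
    print(t, s, f"{float(s):.4f}")
```

Because the curve only changes at failure times, the dictionary holds one entry per distinct failure time; tied failures (such as the two at t = 6) are handled in a single step.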

14
The product-limit Method
  • Nothing happens except at the time of failure.

Survival Analysis for survival

Time  Status    Cumulative  Standard  Cumulative  Number
                Survival    Error     Events      Remaining
   2  Selected       .9167     .0798           1         11
   3  Selected       .8333     .1076           2         10
   6  Selected                                 3          9
   6  Selected       .6667     .1361           4          8
   7  Selected       .5833     .1423           5          7
  10  Selected       .5000     .1443           6          6
  15  Selected                                 7          5
  15  Selected       .3333     .1361           8          4
  16  Selected       .2500     .1250           9          3
  27  Selected       .1667     .1076          10          2
  30  Selected       .0833     .0798          11          1
  32  Selected       .0000     .0000          12          0

Number of Cases: 12    Censored: 0 ( .00%)    Events: 12

         Survival Time  Standard Error  95% Confidence Interval
Mean                14               3  ( 8, 20 )
Median              10               5  ( 1, 19 )

Percentiles       25.00   50.00   75.00
Value             16.00   10.00    6.00
Standard Error     9.00    4.62    2.45

15
Kaplan-Meier estimate of the survival distribution
  • This plot is the Kaplan-Meier estimate of the
    hemophiliac-patient survival distribution
    corresponding to the previous output.

16
Censoring
  • When failure has not been observed, then the only
    information from the data is that the failure
    time is no less than the time of the last
    available observation (e.g., clinical visit).
    This is easily incorporated into the estimation
    procedure.
  • For example, consider the following data where
    subjects 2 and 6 completed observation without
    failure at months 3 and 10 (censor = 0 means
    censoring)

17
Life table method in the presence of censoring
  • To carry out the life-table estimate of the
    survival distribution when the data include censored
    observations, we account for the number of censored
    observations in interval j.
  • $c_j$ is the number of censored observations in
    interval j
  • Since we do not know exactly when the censoring
    occurred, we have the following options for
    calculating the number of individuals at risk in
    interval j, assuming that the censoring occurred
  • at the beginning of the interval (so the number
    at risk at the beginning of interval j is
    $r'_j = r_j - c_j$)
  • at the end of the interval (so the number at risk
    is $r'_j = r_j$)
  • at the middle of the interval (assuming that
    censoring happens uniformly through the interval,
    so $r'_j = r_j - c_j/2$).
  • The latter case is called the actuarial estimator
    of survival.
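
The three conventions differ only in how much of the censored count is subtracted from the number at risk. A small Python sketch makes the comparison concrete; the function name `effective_at_risk` and its `censoring_at` argument are hypothetical names chosen for illustration.

```python
def effective_at_risk(r_j, c_j, censoring_at="middle"):
    """Effective number at risk in interval j under three censoring conventions."""
    if censoring_at == "start":    # censored subjects never contribute
        return r_j - c_j
    if censoring_at == "end":      # censored subjects contribute the whole interval
        return r_j
    if censoring_at == "middle":   # actuarial: each contributes half an interval
        return r_j - c_j / 2
    raise ValueError(f"unknown convention: {censoring_at!r}")


# Example: 12 subjects enter the interval and 1 is censored during it.
for convention in ("start", "end", "middle"):
    print(convention, effective_at_risk(12, 1, convention))
```

With 12 subjects entering and 1 censored, the three choices give 11, 12 and 11.5, so a single failure in that interval yields a conditional failure probability of 1/11, 1/12 or 1/11.5 respectively.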

18
Derivation of the life-table method
  • To calculate the life-table estimate up to the
    period between 5 and 10 months (interval j = 2) in
    our example we proceed as follows
  • There is one failure and one censored observation
    in the first interval (i.e., between 0 and 5
    months). Assuming that the censoring happened at
    the midpoint of the interval (actuarial survival),
    the (effective) number at risk is
    $r'_1 = r_1 - c_1/2 = 12 - 1/2 = 11.5$.
  • Thus, $\hat{q}_1 = 1/11.5 = 0.0870$, so
  • $S(1) = 1 - \hat{q}_1 = 0.9130$
  • For the second interval (j = 2, time between 5 and
    10 months) three failures occurred with no
    censoring. Thus, after removing the first
    failure and the censored observation,
    $r'_2 = r_2 = 10$ and $\hat{q}_2 = 3/10 = 0.3000$, so
    $S(2) = S(1)\,(1 - \hat{q}_2) = 0.9130 \times 0.7000 = 0.6391$

19
Analysis via the life-table method
Life Table
Survival Variable: survival

Intrvl  Number  Number  Number  Number  Propn   Propn   Cumul   Proba-  Hazard
Start   Entrng  Wdrawn  Exposd  of      Termi-  Sur-    Propn   bility  Rate
Time    this    During  to      Termnl  nating  viving  Surv    Densty
        Intrvl  Intrvl  Risk    Events                  at End
------  ------  ------  ------  ------  ------  ------  ------  ------  ------
    .0    12.0     1.0    11.5     1.0   .0870   .9130   .9130   .0174   .0182
   5.0    10.0      .0    10.0     3.0   .3000   .7000   .6391   .0548   .0706
  10.0     7.0     1.0     6.5      .0   .0000  1.0000   .6391   .0000   .0000
  15.0     6.0      .0     6.0     3.0   .5000   .5000   .3196   .0639   .1333
  20.0     3.0      .0     3.0      .0   .0000  1.0000   .3196   .0000   .0000
  25.0     3.0      .0     3.0     1.0   .3333   .6667   .2130   .0213   .0800
  30.0     2.0      .0     2.0     2.0  1.0000   .0000   .0000

  • These calculations for the last interval
    are meaningless.
  • The median survival time for these data is
    17.18
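
The actuarial life table can be sketched in Python as follows. The data are the under-40 times with the observations at months 3 and 10 flagged as censored (event = 0). The function names are hypothetical, the last interval is assumed open-ended, and the linear interpolation used for the median is an assumption: it happens to give the reported 17.18, but the source does not state how SPSS computes it.

```python
times = [2, 3, 6, 6, 7, 10, 15, 15, 16, 27, 30, 32]
events = [1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]   # 1 = failure, 0 = censored


def actuarial_life_table(times, events, width=5, n_intervals=7):
    """Rows of (start, entering, withdrawn, exposed, deaths, q_j, S_j)."""
    rows = []
    entering = len(times)
    surv = 1.0
    for j in range(n_intervals):
        start = j * width
        last = j == n_intervals - 1   # assumption: last interval is open-ended
        flags = [e for t, e in zip(times, events)
                 if t >= start and (last or t < start + width)]
        d = sum(flags)                # failures in the interval
        c = len(flags) - d            # censored (withdrawn) in the interval
        exposed = entering - c / 2    # actuarial number at risk r'_j
        q = d / exposed if exposed > 0 else 0.0
        surv *= 1.0 - q
        rows.append((start, entering, c, exposed, d, q, surv))
        entering -= d + c
    return rows


def median_by_interpolation(rows, width=5):
    """Time at which the survival estimate crosses 0.5 (linear interpolation)."""
    prev = 1.0
    for start, *_, surv in rows:
        if surv <= 0.5:
            return start + width * (prev - 0.5) / (prev - surv)
        prev = surv
    return None   # survival never drops to 0.5


rows = actuarial_life_table(times, events)
median = median_by_interpolation(rows)
for row in rows:
    print(row)
print(f"median = {median:.2f}")
```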

20
Life table estimate of the survival distribution
in the presence of censoring
21
K-M estimate in the presence of censoring
  • Consider how censoring is handled in the K-M
    procedure
  • In the first interval (time 0-2) one out of 12
    individuals fails at 2 months, so that the
    estimate of survival at t = 2 is
    $S(2) = 1 - 1/12 = 0.9167$
  • No one fails at t = 3 months (second interval), so
    the survival estimate stays at 0.9167.
  • At t = 6 months two subjects fail out of the
    remaining ten (since one subject was censored at
    3 months and is no longer part of the at-risk
    sample at six months), so $\hat{q}_6 = 2/10 = 0.2000$
    is the probability of failure at t = 6 months. The
    estimate of the survival distribution is
  • $S(6) = S(2)\,(1 - \hat{q}_6) = 0.9167 \times (1 - 0.2000) = 0.7333$
  • So, censored observations are present up to the
    interval where they are censored and disappear
    after that.
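
A minimal Python sketch of the K-M estimate with censoring is shown below. The function name `kaplan_meier` is hypothetical. The standard error uses Greenwood's formula, which is an assumption: the source does not say how SPSS computes its standard-error column, although Greenwood's formula gives values that match it to four decimals.

```python
import math

times = [2, 3, 6, 6, 7, 10, 15, 15, 16, 27, 30, 32]
events = [1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]   # 1 = failure, 0 = censored


def kaplan_meier(times, events):
    """Return (time, survival, standard error) at each distinct failure time."""
    data = sorted(zip(times, events))
    n = len(data)
    surv = 1.0
    greenwood = 0.0    # running sum of d / (r * (r - d))
    out = []
    i = 0
    while i < n:
        t = data[i][0]
        at_risk = n - i                               # everyone with time >= t
        tied = [e for u, e in data[i:] if u == t]     # observations at time t
        d = sum(tied)
        if d > 0:                                     # curve changes only at failures
            surv *= (at_risk - d) / at_risk
            if at_risk > d:
                greenwood += d / (at_risk * (at_risk - d))
            out.append((t, surv, surv * math.sqrt(greenwood)))
        i += len(tied)                                # failures and censored leave
    return out


km = kaplan_meier(times, events)
for t, s, se in km:
    print(f"{t:3d} {s:.4f} {se:.4f}")
```

A subject censored at time t is counted in the at-risk set at t and dropped afterwards, which is why ten subjects remain at risk at t = 6.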

22
Kaplan-Meier estimate with censored observations

Survival Analysis for survival

Time  Status  Cumulative  Standard  Cumulative  Number
              Survival    Error     Events      Remaining
   2    1.00       .9167     .0798           1         11
   3     .00                                 1         10
   6    1.00                                 2          9
   6    1.00       .7333     .1324           3          8
   7    1.00       .6417     .1441           4          7
  10     .00                                 4          6
  15    1.00                                 5          5
  15    1.00       .4278     .1565           6          4
  16    1.00       .3208     .1495           7          3
  27    1.00       .2139     .1325           8          2
  30    1.00       .1069     .1005           9          1
  32    1.00       .0000     .0000          10          0

23
The K-M plot with censoring
  • The K-M estimate of the survival distribution in
    the presence of censoring is as shown in the
    figure.

24
Testing
  • Consider the survival curves of hemophiliacs
    contracting AIDS above 40 years of age and before
    40 years of age.

25
Survival distribution of >40-year-olds

Survival Analysis for survival
Factor: age = >40

Time  Status  Cumulative  Standard  Cumulative  Number
              Survival    Error     Events      Remaining
   1    1.00                                 1          8
   1    1.00                                 2          7
   1    1.00                                 3          6
   1    1.00       .5556     .1656           4          5
   2    1.00       .4444     .1656           5          4
   3    1.00                                 6          3
   3    1.00       .2222     .1386           7          2
   9    1.00       .1111     .1048           8          1
  22    1.00       .0000     .0000           9          0

Number of Cases: 9    Censored: 0 ( .00%)    Events: 9

26
Comparing two survival distributions
27
The log-rank test
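  • The log-rank test evaluates the null hypothesis
  • $H_0: S_{<40}(t) = S_{>40}(t)$ versus the alternative
  • $H_1: S_{<40}(t) \neq S_{>40}(t)$
  • The test is based on a statistic that compares the
    observed with the expected numbers of deaths
  • where, for each failure time $t_j$ and group
    $i = 1, 2$, the expected number of deaths is
  • $e_{ij} = d_j\, Y_i(t_j) / Y(t_j)$, where $d_j$ is
    the number of deaths at $t_j$, $Y(t)$ is the total
    number at risk (alive) at time t, $Y_1(t)$ and
    $Y_2(t)$ are the numbers at risk in group 1
    and 2 respectively, and
  • $E_i = \sum_j e_{ij}$ and $O_i = \sum_j d_{ij}$ are the
    total numbers of expected and observed deaths in
    group i ($d_{ij}$ is the number of deaths in group
    i at $t_j$).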
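
The observed and expected counts can be accumulated with a short Python sketch. Several things here are assumptions: the function name `log_rank` is hypothetical, the chi-square form $(O_1 - E_1)^2 / \mathrm{Var}$ with the hypergeometric variance is one standard version of the statistic and not necessarily the one on the original slide, and the under-40 data are taken with the two censored observations. The result therefore need not match the SPSS value exactly.

```python
import math


def log_rank(times1, events1, times2, events2):
    """Two-group log-rank test; returns (observed, expected, chi2, p)."""
    pooled = ([(t, e, 0) for t, e in zip(times1, events1)]
              + [(t, e, 1) for t, e in zip(times2, events2)])
    failure_times = sorted({t for t, e, _ in pooled if e == 1})
    observed = [0.0, 0.0]
    expected = [0.0, 0.0]
    variance = 0.0
    for tj in failure_times:
        # Y_i(t_j): number at risk in group i just before t_j.
        y = [sum(1 for t, _, g in pooled if t >= tj and g == i) for i in (0, 1)]
        # d_ij: deaths in group i at t_j.
        d = [sum(1 for t, e, g in pooled if t == tj and e == 1 and g == i)
             for i in (0, 1)]
        y_tot, d_tot = sum(y), sum(d)
        for i in (0, 1):
            observed[i] += d[i]
            expected[i] += d_tot * y[i] / y_tot   # e_ij = d_j * Y_i / Y
        if y_tot > 1:
            variance += (d_tot * (y[0] / y_tot) * (y[1] / y_tot)
                         * (y_tot - d_tot) / (y_tot - 1))
    # Assumes variance > 0, i.e. both groups are at risk at some failure time.
    chi2 = (observed[0] - expected[0]) ** 2 / variance
    p = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return observed, expected, chi2, p


# Under-40 group (with the two censored observations) and over-40 group.
young_t = [2, 3, 6, 6, 7, 10, 15, 15, 16, 27, 30, 32]
young_e = [1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
old_t = [1, 1, 1, 1, 2, 3, 3, 9, 22]
old_e = [1] * 9

obs, exp, chi2, p = log_rank(young_t, young_e, old_t, old_e)
print(obs, [round(x, 3) for x in exp], round(chi2, 2), round(p, 4))
```

The total expected deaths always equal the total observed deaths, which is a quick sanity check on the bookkeeping.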

28
The log-rank test with SPSS
  • The SPSS output for the log-rank test is as
    follows
  • Since p = 0.006 < 0.05, there is a statistically
    significant difference in survival between the
    two groups.
  • Since the log-rank test is two-sided, we must
    check the median survival time to see the
    direction of the difference (here it is the
    younger, <40-year-old patients who survive longer).

Test Statistics for Equality of Survival Distributions for age

            Statistic    df    Significance
Log Rank         7.61     1           .0058