Title: Introduction%20to%20Frailty%20Models
1Introduction to Frailty Models
- STA635 Project by Benjamin Hall
2Cox proportional hazards model
- Model hyi (t) h0(t)e?1X1... ?kXk is the
hazard function of the ith individual - Assumption The hazard function for each
individual is proportional to the basine hazard,
h0(t). This assumption implies that the hazard
function is fully determined by the covariate
vector. - Problem There may be unobserved covariates that
cause this assumption to be violated.
3Simluation Ex Unobserved covariate
- Consider a situation with the following
population - Obviously Drug A is effective for the entire
population. - But what happens in the Cox Model if the group is
unobservable?
Group Proportion of Population Hazard Rate with placebo Hazard Rate with Drug A
1 40 1 .5
2 40 2 1
3 20 10 8
4Simulation Example, continued
- Lets simulate this example and apply the Cox
model - Have R simulate 100 people according to the
previous tables probabilities and randomly
assign them to treatment or placebo - For each person, have R simulate of incidents
within period of length 1. At the end of period
of length 1 right-censoring occurs.
5Simulation Example, continued
- Here is some of the data generated by R (see
final slide for code)
id group treat time status
1, 1 3 0 .38 1
2, 1 3 0 .07 1
3, 1 3 0 .23 1
4, 1 3 0 .15 1
5, 1 3 0 .11 1
6, 1 3 0 .89 0
7, 2 3 1 .22 1
8, 2 3 1 .04 1
... ... ... ... ... ...
6Simulation Example, continued
- Now we run coxph on our data
- gt myfit1
- Call
- coxph(formula Surv(time, status) treat)
- coef exp(coef) se(coef)
z p - treat -0.128 0.88 0.11
-1.17 0.24 - Likelihood ratio test1.37 on 1 df, p0.242 n
445 - Notice that the LRT has a p-value of .242 which
is not significant. But we know that treatment is
effective for everyone. What is happening?
7Simulation Example, continued
- The problem is that we have heterogeneity in the
data due to the unobservable groups. - Since we cannot include group in our model, the
assumption of proportional hazards is violated. - What can we do to solve this problem? Use a
frailty model.
8Frailty Model
- Frailty models can help explain the unaccounted
for heterogeneity. - Frailty Model hyi (t) z ? h0(t)e?1X1... ?kXk
is the hazard function of the ith individual - The distribution of z is specified to be, say,
Gamma. (Note z must be non-negative since the
hazard is non-negative.) - In this situation, the shared frailty model is
appropriate, that is multiple observations of the
same individual always has the same value of z.
9Frailty Model in R
- Lets apply the frailty model to our simulated
data - gt myfit2
- Call
- coxph(formula Surv(time, status) treat
frailty(id)) - coef se(coef)
se2 Chisq DF p - treat -0.147 0.160
0.111 0.85 1.0 3.6e-01 - frailty(id)
93.89 43.5 1.4e-05 - Iterations 5 outer, 17 Newton-Raphson
- Variance of random effect 0.294
I-likelihood -1887.9 - Degrees of freedom for terms 0.5 43.5
- Likelihood ratio test117 on 44.0 df, p1.37e-08
n 445 - Notice that the LRT now has a highly significant
p-value.
10Frailty Model in R
- Now lets try implementing the frailty model to a
real data set, the kidnet data set. - Here are the results for the regular Cox Model
- gt kfit1
- Call
- coxph(formula Surv(time, status) age sex,
data kidney) - coef exp(coef) se(coef)
z p - age 0.00203 1.002 0.00925 0.220
0.8300 - sex -0.82931 0.436 0.29895 -2.774
0.0055 - Likelihood ratio test7.12 on 2 df, p0.0285 n
76 - Here the LRT is significant with a p-value of
.0285 even without considering frailty.
11Frailty Model in R
- However, a frailty model seems applicable in this
situation since their are multiple oberservations
(i.e. 2 kidneys) per person. Below considers
frailty - gt kfit2
- Call
- coxph(formula Surv(time, status) age sex
frailty(id), data kidney) - coef se(coef) se2
Chisq DF p - age 0.00525 0.0119 0.0088 0.2
1 0.66000 - sex -1.58749 0.4606 0.3520 11.9
1 0.00057 - frailty(id)
23.1 13 0.04000 - Iterations 7 outer, 49 Newton-Raphson
- Variance of random effect 0.412
I-likelihood -181.6 - Degrees of freedom for terms 0.5 0.6 13.0
- Likelihood ratio test46.8 on 14.1 df,
p2.31e-05 n 76 - Now the LRT is even more significant.
12Resources
- Therneau and Grambsch, Modeling Survival Data,
Chapter 9 - Wienke, Andreas, Frailty Models,
http//www.demogr.mpg.de/papers/working/wp-2003-03
2.pdf - Govindarajulu, Frailty Models and Other Survival
Models, www.ms.uky.edu/statinfo/nonparconf/govin
darajulu.ppt
13R Code
- library(survival)
- GEN_TIME
- gen_time lt- function(group, treat)
- if (group 1)
- return (round(rexp(1, 1-(.5treat)),2))
- if (group 2)
- return (round(rexp(1, 2-treat),2))
- if (group 3)
- return (round(rexp(1, 10-2treat),2))
- PERSON DATA
- person_data lt- function()
- treat lt- rbinom(1,1,.5)
- x lt- runif(1)
- t1 lt- matrix(NA, nrow1, ncol25)
- if (x lt .4) group lt- 1
- if (x gt .4 x lt .8) group lt- 2
- if (x gt .8) group lt- 3
- elapse lt- 0
m1 lt- matrix (NA, nrow100, ncol25) for (i in
1100) m1i, lt- person_data() samp_size lt-
sum(m1,3) samp lt- matrix(NA, nrow samp_size,
ncol 5) colnames(samp) lt- c("id", "group",
"treat", "time", "status") count2 lt- 1 for (i in
1100) for (j in 1m1i,3) sampcount2, 1
lt- i sampcount2, 2 lt- m1i,1 sampcount2, 3
lt- m1i,2 sampcount2, 4 lt- m1i,j3 sampcount
2, 5 lt- 1 if(jm1i,3) sampcount2, 5 lt- 0
count2 lt- count2 1 myfit1 lt-
coxph(Surv(samp,4, samp,5) samp,3) myfit2
lt- coxph(Surv(samp,4, samp,5) samp,3
frailty(samp,1)) myfit1 myfit2