Title: Confidence Intervals
1. Confidence Intervals
Clinical Trials in 20 Hours
- Elizabeth S. Garrett
- esg_at_jhu.edu
- Oncology Biostatistics
- March 27, 2002
2. What is a confidence interval?
- It is an interval that tells the precision with which a sample statistic estimates a parameter of interest.
- Examples
  - Parameter of interest: progression-free survival time. The 95% confidence interval on progression-free survival is 13 to 26 weeks.
  - Parameter of interest: response rate. The 95% confidence interval on response rate is 0.20 to 0.40.
  - Parameter of interest: change in CD34 cells. The 95% confidence interval for CD34 cells is 0.2 to 0.4.
3. Different Interpretations of the 95% confidence interval
- We are 95% sure that the TRUE parameter value is in the 95% confidence interval.
- If we repeated the experiment many, many times, 95% of the time the TRUE parameter value would be in the interval.
- Before performing the experiment, the probability that the interval would contain the true parameter value was 0.95.
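The repeated-experiment interpretation can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the population (normal, mean 10, standard deviation 2), the sample size, and the function name `coverage` are all hypothetical choices made for illustration.

```python
import math
import random
import statistics

def coverage(true_mean=10.0, true_sd=2.0, n=50, reps=2000, z=1.96, seed=1):
    """Fraction of simulated 95% intervals that contain the true mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # One "experiment": draw a sample and build mean +/- z * sem
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        xbar = statistics.mean(sample)
        sem = statistics.stdev(sample) / math.sqrt(n)
        if xbar - z * sem <= true_mean <= xbar + z * sem:
            hits += 1
    return hits / reps

print(coverage())  # expected to land close to 0.95
```

Each simulated interval either contains the true mean or it does not; the 95% describes how often the procedure succeeds across repetitions.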
4. Example
- Leisha Emens, M.J. Kennedy, John H. Fetting, Nancy E. Davidson, Elizabeth Garrett, Deborah A. Armstrong
- A phase 1 toxicity and feasibility trial of sequential dose dense induction chemotherapy with doxorubicin, paclitaxel, and 5-fluorouracil followed by high dose consolidation for high risk primary breast cancer
- 83 patients underwent leukapheresis for peripheral blood stem cell collection after conventional dose adjuvant therapy, and 14 patients underwent the procedure on the dose dense adjuvant protocol 9626.
- Results: Compared to the standard dose doxorubicin-containing adjuvant therapy, the dose dense regimen decreased CD34 peripheral blood stem cell (PBSC) yields, requiring that 50% of patients have a supplemental bone marrow harvest.
- Question: What can we say about CD34 peripheral blood stem cell yields in each of the two groups?
5. Example
- CD34 PBSC in trials 9601 and 9626.
- We can estimate the mean CD34 PBSC in each trial:
  - 0.40 in the standard group
  - 0.30 in the dose-dense group.
- We can conclude:
  - We estimate that CD34 PBSC in the standard group is 0.40 and in the dose dense group is 0.30.
- But how sure are we about those estimates?
6. Quantifying Uncertainty
- Standard deviation measures the variation of a variable in the population.
- The standard deviation of CD34 PBSC is 0.27 in the standard group and 0.20 in the dose dense group.
- Technically,
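For reference, the sample standard deviation is usually computed as below, where $x_i$ are the individual values, $\bar{x}$ is their mean, and $n$ is the sample size. The $n - 1$ divisor is an assumption here; some texts divide by $n$ instead.

```latex
s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
```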
7. For normally distributed variables:
68% of individuals' values fall within ±1 standard deviation of the mean.
[Figure: normal curve with mean ± s marked, covering 68% of values]
8. For normally distributed variables:
95% of individuals' values fall within ±1.96 standard deviations of the mean.
[Figure: normal curve with mean ± 1.96s marked, covering 95% of values]
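The 68% and 95% figures can be confirmed from the standard normal distribution. A minimal sketch using Python's standard library; the helper name `central_probability` is made up for this example.

```python
from statistics import NormalDist

def central_probability(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

print(round(central_probability(1.0), 4))   # 0.6827
print(round(central_probability(1.96), 4))  # 0.95
```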
9. Standard deviation versus standard error
- The standard deviation (s) describes variability between individuals in a population.
- The standard error describes variation of a sample statistic.
- Example: We are interested in the mean CD34 PBSC. (We denote the sample mean by x̄.)
- The standard deviation (0.27 in standard and 0.20 in dose dense) describes how individuals differ.
- The standard error of the mean describes the precision with which we can make inference about the true mean.
10. Standard error of the mean
- Standard error of the mean (sem): $\mathrm{sem} = s / \sqrt{n}$
- Comments
  - n = sample size
  - even for large s, if n is large, we can get good precision for sem
  - always smaller than the standard deviation (s) when n > 1
11. Example
- In the standard group, s = 0.27 and n = 83
- In the dose dense group, s = 0.20 and n = 14
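The arithmetic for the two groups can be sketched as follows, taking the standard error of the mean to be s divided by the square root of n. The function name `sem` is chosen here for illustration.

```python
import math

def sem(s, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return s / math.sqrt(n)

print(round(sem(0.27, 83), 3))  # standard group: 0.03
print(round(sem(0.20, 14), 3))  # dose dense group: 0.053
```

The dose dense group has the smaller standard deviation but the larger standard error, because it has far fewer patients.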
12. Sampling Distribution
- The sampling distribution of a sample statistic refers to what the distribution of the statistic would look like if we chose a large number of samples from the same population.
[Figure: distribution with mean = 3, s = 2.45]
The sample statistic of interest to us is the mean.
13. Sampling Distribution of the Mean
- By the Central Limit Theorem, even if a variable is NOT normally distributed, for large sample size the sampling distribution of the mean is approximately normal.
[Figure: distribution with mean = 3, s = 2.45]
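A simulation makes the Central Limit Theorem concrete. The sketch below draws samples from a skewed gamma population whose mean (3) and standard deviation (about 2.45) match the figure labels. The choice of a gamma distribution is an assumption, since the slides do not say which distribution was plotted; the function name `sample_means` is also hypothetical.

```python
import math
import random
import statistics

def sample_means(n, reps=2000, seed=0):
    """Means of `reps` samples of size n drawn from a skewed gamma population."""
    rng = random.Random(seed)
    # gammavariate(1.5, 2.0) has mean 1.5 * 2 = 3 and sd sqrt(1.5) * 2 = sqrt(6), about 2.45
    return [
        statistics.mean([rng.gammavariate(1.5, 2.0) for _ in range(n)])
        for _ in range(reps)
    ]

for n in (5, 30, 100):
    means = sample_means(n)
    observed_sd = statistics.stdev(means)       # spread of the simulated sample means
    predicted_sem = math.sqrt(6) / math.sqrt(n)  # population sd / sqrt(n)
    print(n, round(statistics.mean(means), 2), round(observed_sd, 2), round(predicted_sem, 2))
```

As n grows, the sample means stay centered near 3 and their spread shrinks in line with the population standard deviation divided by the square root of n.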
14. Sampling Distributions
[Figure: sampling distributions of the mean; panels labeled sem = 0.47, sem = 0.23, sem = 0.10, and sem = 0.47]
15. Central Limit Theorem: Main Ideas
- The sampling distribution of a sample statistic is often normally distributed.
- The mathematical result comes from the Central Limit Theorem. For the theorem to work, n should be large.
- Statisticians have derived formulas to calculate the standard deviation of the sampling distribution; it is called the standard error of the statistic.
16. Sampling Distribution of the Mean
- In general, for large n, means have a normal distribution.
- It is true that 95% of sample means will be within ±1.96 sem of the true mean, μ.
The 95% confidence interval for the mean: $\bar{x} \pm 1.96 \times \mathrm{sem}$
17. General formula for 95% confidence interval
- Notes
  - sample size must be sufficiently large for non-normal variables.
  - how large is large? It depends on the skewness of the variable.
  - VERY often people use 2 instead of 1.96.
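The general pattern these notes refer to can be written as an estimate plus or minus 1.96 standard errors. The symbols $\hat{\theta}$ (the sample estimate) and $\mathrm{se}(\hat{\theta})$ (its standard error) are notation assumed here, not taken from the slides.

```latex
\hat{\theta} \pm 1.96 \times \mathrm{se}(\hat{\theta})
```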
18. Example
- In the standard group, the mean was 0.40, s = 0.27, and n = 83
- In the dose dense group, the mean was 0.30, s = 0.20, and n = 14
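With these numbers, the intervals can be worked out as mean ± 1.96 × s/√n. The sketch below assumes the normal approximation; the helper name `mean_ci` is chosen for illustration.

```python
import math

def mean_ci(xbar, s, n, z=1.96):
    """Normal-approximation confidence interval for a mean."""
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lo, hi = mean_ci(0.40, 0.27, 83)
print(f"standard:   {lo:.3f} to {hi:.3f}")  # 0.342 to 0.458
lo, hi = mean_ci(0.30, 0.20, 14)
print(f"dose dense: {lo:.3f} to {hi:.3f}")  # 0.195 to 0.405
```

With only 14 patients in the dose dense group, a t multiplier (about 2.16 for 13 degrees of freedom) would arguably be more appropriate than 1.96 and would give a somewhat wider interval.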
19. Not only 95%.
- 90% confidence interval
  - NARROWER than 95%
- 99% confidence interval
  - WIDER than 95%
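The width changes because the multiplier on the standard error changes with the confidence level. A minimal sketch, assuming the normal approximation; `z_multiplier` is a name made up for this example.

```python
from statistics import NormalDist

def z_multiplier(level):
    """Two-sided normal multiplier for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.90, 0.95, 0.99):
    print(level, round(z_multiplier(level), 3))  # 1.645, 1.96, 2.576
```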
20. But why do we always see 95% CIs?
- Duality between confidence intervals and p-values
- Example: Assume that we are testing for a significant change in QOL due to an intervention, where QOL is measured on a scale from 0 to 50.
  - 95% confidence interval: (-2, 13)
  - p-value: 0.07
- It is true that if the 95% confidence interval includes 0, then a t-test testing that the treatment effect is 0 will be insignificant at the alpha = 0.05 level.
- It is true that if the 95% confidence interval does not include 0, then a t-test testing that the treatment effect is 0 will be significant at the alpha = 0.05 level.
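The duality can be illustrated numerically. This sketch uses a normal (z) approximation rather than the t-test named on the slide, and the estimate and standard error pairs are hypothetical values, not results from any trial.

```python
from statistics import NormalDist

Z = NormalDist()
Z975 = Z.inv_cdf(0.975)  # about 1.96

def ci_and_p(estimate, se):
    """95% interval and two-sided p-value for H0: effect = 0 (normal approximation)."""
    lo, hi = estimate - Z975 * se, estimate + Z975 * se
    p = 2 * (1 - Z.cdf(abs(estimate) / se))
    return (lo, hi), p

# Hypothetical (estimate, standard error) pairs, chosen only for illustration
for est, se in [(5.5, 4.0), (5.5, 2.0), (-3.0, 1.0), (0.5, 1.0)]:
    (lo, hi), p = ci_and_p(est, se)
    includes_zero = lo <= 0 <= hi
    print(f"CI ({lo:.2f}, {hi:.2f})  p = {p:.3f}  includes 0: {includes_zero}")
```

In every row the interval includes 0 exactly when p is at least 0.05, because both are computed from the same estimate and standard error.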
21. Other Confidence Intervals
- Differences in means
- Response rates
- Differences in response rates
- Hazard ratios
- Median survival
- Difference in median survival
- ...
22. Difference in Means
- Example: What is the 95% confidence interval for the difference in CD34 PBSCs in the two trials?
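One common way to answer this, sketched under the assumption of independent groups and unpooled variances, is to combine the two standard errors as sqrt(s1²/n1 + s2²/n2). The function name `diff_means_ci` is hypothetical, and the slides may have used a different (for example, pooled or t-based) formula.

```python
import math

def diff_means_ci(x1, s1, n1, x2, s2, n2, z=1.96):
    """Normal-approximation CI for the difference of two independent means."""
    diff = x1 - x2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # unpooled standard error (an assumption)
    return diff - z * se, diff + z * se

# Standard group (mean 0.40, s 0.27, n 83) minus dose dense group (mean 0.30, s 0.20, n 14)
lo, hi = diff_means_ci(0.40, 0.27, 83, 0.30, 0.20, 14)
print(f"{lo:.3f} to {hi:.3f}")  # -0.020 to 0.220
```

Under these assumptions the interval for the difference includes 0, so the data are compatible with no difference as well as with a difference of roughly 0.2.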
23. 95% Confidence Intervals for Proportions
- Socinski et al., Phase III Trial Comparing a Defined Duration of Therapy versus Continuous Therapy Followed by Second-Line Therapy in Advanced-Stage IIIB/IV Non-Small-Cell Lung Cancer, JCO, March 1, 2002.
- Patients and Methods: Arm A (4 cycles of carboplatin at an AUC of 6 and paclitaxel), Arm B (continuous treatment with carboplatin/paclitaxel until progression). At progression, patients from each arm receive second-line weekly paclitaxel at 80 mg/m2/week.
- Results: 230 patients were randomized (114 in Arm A and 116 in Arm B). Overall response rates were 22% and 24% for Arms A and B. Grade 2 to 4 neuropathy was seen in 14% and 27% of Arm A and B patients, respectively.
24. 95% Confidence Intervals for Proportions
- What are 95% confidence intervals for the response rates in the two arms?
- The standard error of a sample proportion is $\mathrm{se}(\hat{p}) = \sqrt{\hat{p}(1 - \hat{p}) / n}$
- An equation for the confidence interval for a proportion: $\hat{p} \pm 1.96 \sqrt{\hat{p}(1 - \hat{p}) / n}$
- Note: this is an approximation based on the central limit theorem! Using statistical programs, you can get exact confidence intervals.
- Assumptions
  - n is reasonably large
  - p is not too close to 0 or 1
  - rule of thumb: pn > 5
25. Example: Response Rate to Treatment
26. Example: Grade 2 to 4 Neuropathy
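Both examples can be worked with the normal-approximation interval for a proportion, p ± 1.96 × sqrt(p(1 − p)/n). The sketch below assumes the denominators are the numbers randomized (114 and 116); the number of evaluable patients in the trial may differ, and the function name `proportion_ci` is chosen for illustration.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    se = math.sqrt(p * (1 - p) / n)
    return p - z * se, p + z * se

# (label, observed proportion, assumed denominator = number randomized)
cases = [
    ("response, Arm A", 0.22, 114),
    ("response, Arm B", 0.24, 116),
    ("neuropathy, Arm A", 0.14, 114),
    ("neuropathy, Arm B", 0.27, 116),
]
for label, p, n in cases:
    lo, hi = proportion_ci(p, n)
    print(f"{label}: {lo:.3f} to {hi:.3f}")
```

Under these assumptions the response-rate intervals come out at roughly 0.14 to 0.30 (Arm A) and 0.16 to 0.32 (Arm B), and the neuropathy intervals at roughly 0.08 to 0.20 and 0.19 to 0.35.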
27. 95% Confidence Interval for Difference in Proportions
- What is the 95% confidence interval for the difference in rates of neuropathy in arms A and B?
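A sketch of one standard approach: treat the arms as independent and add the two variance terms, giving a standard error of sqrt(pA(1 − pA)/nA + pB(1 − pB)/nB). The denominators (114 and 116, the numbers randomized) and the function name `diff_proportions_ci` are assumptions for illustration.

```python
import math

def diff_proportions_ci(p1, n1, p2, n2, z=1.96):
    """Normal-approximation CI for the difference of two independent proportions."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se

# Neuropathy: Arm B (27% of 116) minus Arm A (14% of 114)
lo, hi = diff_proportions_ci(0.27, 116, 0.14, 114)
print(f"{lo:.3f} to {hi:.3f}")  # 0.027 to 0.233
```

Under these assumptions the interval excludes 0, which suggests a higher neuropathy rate in Arm B, although the plausible size of the difference ranges from about 3 to 23 percentage points.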
28. Recap
- 95% confidence intervals are used to quantify certainty about parameters of interest.
- Confidence intervals can be constructed for any parameter of interest (we have just looked at some common ones).
- The general formulas shown here rely on the central limit theorem.
- You can choose the level of confidence (it does not have to be 95%).
- Confidence intervals are often preferable to p-values because they give a reasonable range of values for a parameter.
29. Some Confidence Intervals in Survival Analysis
Example: Urba et al., Randomized Trial of Preoperative Chemoradiation Versus Surgery Alone in Patients with Locoregional Esophageal Carcinoma, JCO, Jan 15, 2001.

                    Hazard Ratio   95% CI
Chemo v. surgery    0.69           0.46-1.06

                    Arm I          Arm II
                    % (95% CI)     % (95% CI)
1 year survival     58 (46-73)     72 (58-84)
3 year survival     16 (8-30)      30 (20-46)

- What about the confidence interval for the 1 year and 3 year difference?
30. Why not provide confidence intervals for...
- Difference in median survival
- Difference in 1 year survival
- Difference in 3 year survival
- Would give readers a reasonable range of values to consider for treatment effect that are intuitive.
- What is remembered?
  - P = 0.09, which means insignificant result
  - But can anyone remember the treatment effect?
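The hazard ratio interval reported for this trial (0.69, 95% CI 0.46 to 1.06) carries the information behind such a p-value. As a rough sketch, assuming the interval is symmetric on the log scale under a normal approximation, an approximate p-value can be backed out of it. This is not necessarily how the trial's own p-value was computed, and `p_from_ratio_ci` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def p_from_ratio_ci(ratio, lower, upper, level=0.95):
    """Approximate two-sided p-value implied by a ratio and its confidence interval.

    Assumes the interval is symmetric on the log scale (normal approximation).
    """
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    se_log = (math.log(upper) - math.log(lower)) / (2 * z_star)  # SE of log(ratio)
    z = math.log(ratio) / se_log  # test statistic for H0: ratio = 1
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(p_from_ratio_ci(0.69, 0.46, 1.06), 2))  # 0.08
```

The interval shows both the estimated effect (a 31% reduction in hazard) and the range of effects compatible with the data, which the p-value alone does not convey.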
31. Confidence Intervals for Reporting Results of Clinical Trials, Simon
- Hypothesis tests are sometimes overused and their results misinterpreted.
- Confidence intervals are of more than philosophical interest, because their broader use would help eliminate misinterpretations of published results.
- Frequently, a significance level or p-value is reduced to a significance test by saying that if the level is greater than 0.05, then the difference is not significant and the null hypothesis is not rejected. The distinction between statistical significance and clinical significance should not be confused.
32. Caveats
- They should not be interpreted as reflecting the
absence of a clinically important difference in
true response probabilities.
33. Excellent References on Use of Confidence Intervals in Clinical Trials
- Richard Simon, Confidence Intervals for Reporting Results of Clinical Trials, Annals of Internal Medicine, v. 105, 1986, 429-435.
- Leonard Braitman, Confidence Intervals Extract Clinically Useful Information from the Data, Annals of Internal Medicine, v. 108, 1988, 296-298.
- Leonard Braitman, Confidence Intervals Assess Both Clinical and Statistical Significance, Annals of Internal Medicine, v. 114, 1991, 515-517.