Title: Statistical Issues in Interpreting Clinical Trials
1 Statistical Issues in Interpreting Clinical Trials
- D. L. DeMets
- Journal of Internal Medicine
- 255: 529-537, 2004
Lies, Damn Lies, and Clinical Statistics
Justin L. Grobe
September 1, 2004
2 Drug Development Paradigm
- Medicinal Chemistry
- Targeted development of new compounds
- Animal Testing
- Test efficacy, potency, safety
- Human Clinical Trials
- Multiple phases to test efficacy, potency,
safety, and to compare new intervention to
standard
3 Clinical Trials Design Paradigm
- Randomization
- Assignment to treatment group
- Order effects
- Placebo
- Control
- (Ethical considerations)
4 Input Return
- No clever analysis can rescue a flawed design or poorly conducted trial.
- Compliance issues
5 Five Major Statistical Issues
- Intention-to-treat principle
- Surrogate outcome measures
- Subgroup analyses
- Missing data
- Noninferiority trials
6 Statistical Issue 1: Intention-to-treat principle
- All patients are accounted for in the primary analysis, and primary events observed during the follow-up period are to be accounted for as well.
- Results can be biased if either of these aspects is not adhered to
7 Myths and examples
- Myth: Large trials are free of these concerns
- Increasing the number of patients decreases the variability of the response variable, thereby making detection of differences easier
- EXCEPT, this also amplifies the effect of any bias in the outcome measurement
- WHICH MAY cause detection of differences which do not actually exist
8 To include or not to include
- Two common reasons to drop patient data
- Post hoc ineligibility assessment
- Lack of patient compliance
9 TABLE 1 Post-hoc ineligibility assessment: Anturane Reinfarction Trial
- 1629 patients who had survived a heart attack
- 813 patients received Anturane
- 816 patients received placebo
- 71 patients deemed ineligible for analysis by
protocol
Table 1: 1980 Anturane mortality results
                                          Anturane (%)     Placebo (%)      P-value
Randomized                                74/813 (9.1)     89/816 (10.9)    0.20
Eligible                                  64/775 (8.3)     85/783 (10.9)    0.07
Ineligible                                10/38 (26.3)     4/33 (12.1)      0.12
P-values for eligible versus ineligible   0.0001           0.92
Including or excluding patients changes the comparison strikingly (P = 0.20 for all randomized patients versus 0.07 for eligible patients only); thus post hoc exclusions bias the results
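The arithmetic behind Table 1 can be rechecked with a short sketch. The slide does not say which test produced the published P-values, so the pooled two-proportion z-test below is an assumption; it should land near the table's values, not reproduce them exactly, and it is rough for the small ineligible group.

```python
import math

# (deaths, patients) per arm, copied from Table 1
TABLE_1 = {
    "Randomized": {"anturane": (74, 813), "placebo": (89, 816)},
    "Eligible": {"anturane": (64, 775), "placebo": (85, 783)},
    "Ineligible": {"anturane": (10, 38), "placebo": (4, 33)},
}


def mortality_pct(deaths, n):
    """Mortality as a percentage of patients in the group."""
    return 100.0 * deaths / n


def two_proportion_p(deaths_a, n_a, deaths_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test.

    Assumption: the original analysis may have used a different test.
    """
    p_a, p_b = deaths_a / n_a, deaths_b / n_b
    pooled = (deaths_a + deaths_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


for row, arms in TABLE_1.items():
    a, p = arms["anturane"], arms["placebo"]
    print(
        f"{row:<11} Anturane {mortality_pct(*a):5.1f}%  "
        f"Placebo {mortality_pct(*p):5.1f}%  "
        f"approx. P = {two_proportion_p(*a, *p):.2f}"
    )
```

Dropping the 71 ineligible patients lowers the approximate P-value for the drug comparison, which is the effect the slide warns about.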
10 TABLE 2 Patient compliance: Coronary Drug Project
- 3885 post-heart attack men were given clofibrate or placebo
- 708 clofibrate and 1813 placebo patients were at least 80% compliant
Table 2: Coronary Drug Project 5-year mortality
                       Clofibrate              Placebo
                       n       Deaths (%)      n       Deaths (%)
Total (as reported)    1103    20.0            2782    20.9
By compliance          1065    18.2            2695    19.4
  <80%                 357     24.6            882     28.2
  ≥80%                 708     15.0            1813    15.1
Compliance is itself an outcome; basing the interpretation of the drug outcome on the compliance outcome is therefore confounded
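A minimal sketch of why compliance confounds the comparison, using only the numbers in Table 2: the patient-weighted average of the two compliance strata recovers each "By compliance" row, and the mortality gap between poor and good compliers is at least as large on placebo as on clofibrate. The function names are illustrative only.

```python
# (patients, percent deaths) per compliance stratum, copied from Table 2
TABLE_2 = {
    "clofibrate": {"<80%": (357, 24.6), ">=80%": (708, 15.0)},
    "placebo": {"<80%": (882, 28.2), ">=80%": (1813, 15.1)},
}


def pooled_mortality(strata):
    """Patient-weighted average mortality (%) across compliance strata."""
    total_n = sum(n for n, _ in strata.values())
    return sum(n * pct for n, pct in strata.values()) / total_n


def compliance_gap(strata):
    """Mortality of poor compliers minus good compliers, in percentage points."""
    return strata["<80%"][1] - strata[">=80%"][1]


for arm, strata in TABLE_2.items():
    print(
        f"{arm:<10} pooled {pooled_mortality(strata):.1f}%  "
        f"gap {compliance_gap(strata):.1f} points"
    )
```

Because the gap appears in the placebo arm as well, it cannot be attributed to the drug; good compliers differ from poor compliers in ways that affect mortality on their own.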
11 Dealing with noncompliance
- Larger sample sizes are required to compensate for the dilution effect of noncompliance
- 10% noncompliance requires a 23% increase in sample size
- 20% noncompliance requires a 56% increase in sample size
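The slide does not state the formula behind these figures. A common adjustment, assumed here, inflates the sample size by 1/(1 - r)^2 for a noncompliance rate r, because the observable treatment effect shrinks by a factor of (1 - r). This sketch reproduces the 23% and 56% figures under that assumption.

```python
def sample_size_inflation(noncompliance_rate):
    """Factor by which the sample size must grow to offset dilution.

    Assumes the 1 / (1 - r)^2 adjustment; the slide does not name its formula.
    """
    if not 0.0 <= noncompliance_rate < 1.0:
        raise ValueError("noncompliance_rate must be in [0, 1)")
    return 1.0 / (1.0 - noncompliance_rate) ** 2


for rate in (0.10, 0.20):
    increase = (sample_size_inflation(rate) - 1.0) * 100.0
    print(f"{rate:.0%} noncompliance: about {increase:.0f}% more patients")
```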
12 Statistical Issue 2: Surrogate outcome measures
- Outcome measures of primary question must be
- Clinically relevant
- Sensitive to intervention
- Ascertainable in all patients
- Resistant to bias
- Result: Large, time-consuming, costly studies
- Alternative approach: surrogate outcome measures
13 Surrogate outcome measure: Assumption
- If the intervention will modify surrogate
outcome, it will modify the primary clinical
outcome
14 Surrogate outcome measure: Requirements
- Surrogate outcome must be predictive of clinical outcome
- Surrogate outcome must fully capture the total effect of the intervention on the clinical outcome
Necessary and sufficient
15 Surrogate outcome measures: Difficult to obtain and validate
- Intervention may modify the surrogate and have no or only partial effect on the clinical outcome
- Intervention may modify the clinical outcome without affecting the surrogate
(Note: NOT surprisingly, the track record for use of surrogate outcome measures is very bad)
16 Surrogate outcome measures: Example: Cardiac Arrhythmia Suppression Trial (CAST)
- Three drugs tested for suppression of cardiac arrhythmias
- All three drugs had been shown to suppress premature cardiac ventricular contractions (surrogate)
- Two drugs terminated early (10-15% into study) because both drugs dramatically increased cause-specific sudden death and total mortality
Table 3: Cardiac Arrhythmia Suppression Trial, early termination in two drug arms
                   Drugs    Placebo
Sudden death       33       9
Total mortality    56       22
Clearly the interventions (drugs) had
differential effects on the surrogate measure
(premature cardiac ventricular contractions) and
the clinical outcome (mortality)
17 Statistical Issue 3: Subgroup analyses
- Clinical trials usually try to include as many (diverse) patients as possible for multiple reasons
- Large sample size
- Reasonable recruitment time
- Assess internal consistency of results
- A seemingly logical use of the large data set is to do many post hoc analyses on subgroups
18 Subgroup analysis: Mathematical problems
- Introduction of subgroups increases probability of false positives
- 5 subgroups yield a greater than 20% chance of at least one (p<0.05) statistically significant difference BY CHANCE
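The "greater than 20%" figure follows from the familywise error rate. A minimal sketch, assuming the subgroup tests are independent and each is run at the 0.05 level (real subgroups often overlap, so this is an approximation):

```python
def familywise_error(num_tests, alpha=0.05):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


for k in (1, 5, 10, 20):
    print(f"{k:>2} subgroups: {familywise_error(k):.1%} chance of a spurious finding")
```

For five subgroups the value is 1 - 0.95^5, or about 22.6%.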
19 Subgroup analysis: MERIT trial
- Beta-blocker (metoprolol) treatment for patients with congestive heart failure
- Showed a 34% reduction in mortality overall
20 Subgroup analysis: MERIT trial
Consistency of mortality results across lots of
subgroups found with subgroup analysis
21 Subgroup analysis: MERIT trial
In the USA, total mortality is not reduced, yet
total mortality plus any hospitalization is?
22 Subgroup analysis: MERIT trial
- Two other similar heart failure trials evaluating other beta-blockers showed no regional difference
- THUS, it is likely that the MERIT finding is due to chance alone.
23 Subgroup analysis: PRAISE-I and PRAISE-II trials
- PRAISE-I performed to evaluate amlodipine for the treatment of congestive heart failure
- Subgroups
- Ischemia
- Nonischemia
- Analysis of subgroups separately showed a significant (p<0.001) effect of amlodipine on heart failure in nonischemic patients, but no effect on ischemic patients
- Researchers decided to perform PRAISE-II trial on nonischemic patients only
24 Subgroup analysis: PRAISE-I and PRAISE-II trials
- PRAISE-II showed remarkably similar mortality results in the drug and placebo groups
- PRAISE-II directly contradicted the exciting results of PRAISE-I's subgroup analysis
25 Statistical Issue 4: Missing data
- Missing data are often simply dropped
- This violates two rules
- Intention-to-treat rule: all patients must be accounted for in primary outcome analysis
- Common sense rule: if patient is too sick to complete trial, this may be informative!
26 Missing data
- In time-to-event trials (like mortality), data can be missing because the study ends before the event happens
- Patients are then censored (dropped)
- This can introduce serious mathematical bias
- (Mortality studies in USA have no excuse: death indices allow follow-up without help from patient)
27 Statistical Issue 5: Noninferiority trials
- New intervention is not worse than the standard
- New intervention may be
- Easier to administer
- Better tolerated
- Less toxic
- Less expensive
- Any given study may be a superiority and/or
noninferiority trial, depending on results
28 Noninferiority trials
29 Noninferiority trials
- Three challenges must be met
- Noninferiority trial must be of highest quality to detect clinically meaningful differences
- Noninferiority trial must have a strong, effective control intervention (state-of-the-art care)
- Margin of indifference is arbitrary, depending on medical importance of treatment and risk-to-benefit tradeoffs
30 Noninferiority trials: OPTIMAAL Trial
- Losartan (angiotensin II receptor blocker) vs captopril (ACE inhibitor) in heart failure patient population
- Losartan has fewer (and less severe) side effects than captopril
- OPTIMAAL
- Designed to detect 20% reduction in relative risk, with 95% power
- Margin of indifference set at 1.1
- Thus 95% confidence interval needed to exclude risk of 1.1 to declare losartan noninferior to captopril
31 Noninferiority trials: OPTIMAAL Trial
- Mortality results for OPTIMAAL
- Relative risk of 1.126 with 95% confidence interval upper limit of 1.28
- NEITHER superiority nor noninferiority was achieved
- Researchers computed that captopril had (historical data) a relative risk of 0.806 vs. placebo, and thus calculated that losartan must therefore have a relative risk of 0.906 vs. placebo
- The statistically appropriate conclusion at this point is
- NO ACCEPTABLE CONCLUSIONS POSSIBLE FROM THIS DATA
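The decision rule from the OPTIMAAL slides can be sketched as a comparison of the upper confidence limit against two thresholds. This reads the slide's 1.28 as the upper 95% limit of the relative risk (an assumption) and treats a relative risk below 1 as favoring the new drug. The function name and the two extra example limits are hypothetical, for illustration only.

```python
def classify_trial(ci_upper, margin, null_value=1.0):
    """Classify a new-vs-standard relative risk by its upper confidence limit.

    Assumes relative risk < 1 favors the new intervention.
    """
    if ci_upper < null_value:
        return "superior"
    if ci_upper < margin:
        return "noninferior"
    return "neither shown"


# OPTIMAAL as reported on the slide: upper limit 1.28, margin 1.1
print(classify_trial(ci_upper=1.28, margin=1.1))

# Hypothetical upper limits, not from any trial, to show the other outcomes
print(classify_trial(ci_upper=1.05, margin=1.1))
print(classify_trial(ci_upper=0.95, margin=1.1))
```

Since 1.28 exceeds the 1.1 margin, the interval cannot exclude the margin, which matches the slide's statement that neither superiority nor noninferiority was achieved.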
32 CONCLUSIONS
- Statistics cannot make up for bad design
- Statistics cannot make up for poor execution of design
- Statistics is very limited in being able to compensate for
- Ineligible patients being enrolled
- Noncompliance
- Unreliable outcome measures
- Missing data
- Underpowered trials