Title: Endpoints in clinical studies
1 Endpoints in clinical studies
- Mark Conaway
- Div of Biostatistics and Epidemiology
2 Outline
- Background: why discuss endpoints?
- Common theme to observational studies and RCTs
- Importance to clinical research
- Multiple endpoints
- Surrogate endpoints
3 Endpoints
- Quantitative measurements required by or implied by the objectives of the study/trial
- Often see a distinction between:
- hard and soft endpoints, or
- objective and subjective endpoints
4 Prefer hard endpoints
- Hard endpoints: clinical landmarks that are
- well-defined in study protocol
- definitive with respect to disease process
- not subjective
- Examples
- death,
- time to disease progression/relapse,
- some laboratory measurements
5 Soft endpoints
- Soft endpoints:
- not directly related to disease process, or
- require subjective assessment by patient/physician
- Examples:
- Quality of life questionnaires
- Symptom questionnaires
6 Not all endpoints can be classified
- Some endpoints are useful and reliable but require some subjectivity
- Examples:
- Pathology
- Imaging
7 Why is a discussion of endpoints important?
- Original intent of this lecture was as a follow-up to:
- the observational studies lecture (Dr. Bovbjerg), and
- the randomized clinical trial lecture (Dr. Petroni)
- Both are attempts to apply rigorous scientific principles to study benefits of therapies, so why do they disagree so often?
8 Disagree
- Findings suggested in observational studies are not the same as those found in the RCT
- Usually that means an effect suggested by observational studies is not supported in the RCT.
9 Many reasons for this...
- Different patient populations?
- Potential confounders not adequately controlled for in the observational studies?
- Design of RCT?
- Choice of endpoints?
- Hobart et al. (2007) raise the issue that potentially useful therapies in neurology have been discarded because of the RCT endpoints used.
10 Example: association between periodontal disease and preterm birth
- Michalowicz et al., Treatment of Periodontal Disease and the Risk of Preterm Birth (NEJM, Nov 2, 2006)
- Multi-center RCT of 820 patients, half assigned to receive periodontal therapy during pregnancy, half to receive usual dental care.
11 Supporting evidence for the trial: epidemiological
- Offenbacher et al. (2006), OCAP study
- 1020 pregnant women, assessed periodontal health at enrollment (< 26 wks GA) and postpartum.
- Results: rates of preterm births (< 37 wks)
- Healthy 11.2%, moderate-severe disease 28.6%
- Rates of very preterm births (< 32 wks)
- Healthy 1.8%, moderate-severe disease 6.4%
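Taking the rates above as percentages, one quick way to read them is as crude rate ratios between the moderate-severe and healthy groups. The sketch below is only arithmetic on the four figures quoted on the slide; the function name `rate_ratio` is made up for illustration, and the ratios are unadjusted, so they are not the estimates reported in the paper.

```python
def rate_ratio(rate_exposed: float, rate_unexposed: float) -> float:
    """Crude ratio of two event rates (both in the same units, e.g. percent)."""
    if rate_unexposed <= 0:
        raise ValueError("rate_unexposed must be positive")
    return rate_exposed / rate_unexposed


# Rates (percent) as quoted on the slide: healthy vs moderate-severe disease.
preterm_ratio = rate_ratio(28.6, 11.2)       # preterm birth, < 37 weeks
very_preterm_ratio = rate_ratio(6.4, 1.8)    # very preterm birth, < 32 weeks

print(f"Preterm (< 37 wks) crude rate ratio:      {preterm_ratio:.2f}")
print(f"Very preterm (< 32 wks) crude rate ratio: {very_preterm_ratio:.2f}")
```

Both ratios are well above 1, which is the kind of observational signal that motivated the trial.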
12 Supporting evidence for the trial: epidemiological
- Goepfert et al. (2004), case-control study
- Cases: 59 women with spontaneous delivery at < 32 weeks
- Controls: 44 women with early indicated births at < 32 weeks and 36 women with term births
- Results:
- Cases more likely to have periodontal disease
13 Supporting evidence for the trial: epidemiological
- Boggess et al. (2003)
- Single cohort of 763 births
- Studied rate of preeclampsia (39 women)
- Results: periodontal disease associated with risk of preeclampsia
- Many other papers looking at birth weight, Apgar scores, ...
14 Other evidence
- Small intervention trials
- Animal studies to assess specific types of bacteria
- Overall, a good deal of evidence to support the notion that reducing periodontal disease might reduce rates of adverse pregnancy outcomes
15 Multi-center RCT (Michalowicz et al.)
- Results: gestational age at delivery
- No difference between the groups
- Why?
- Different patient populations?
- Treatment effective enough?
- Was this the right endpoint?
16 What other endpoints?
- Rate of preterm birth
- < 37 weeks? Or < 35 weeks? Or < 32 weeks?
- Birth weight?
- Gestational age at delivery?
- Stillbirths?
- Apgar scores?
- Admission to NICU?
17 The study also considered other possible endpoints
- Rate of preterm birth (< 37 weeks)
- Birth weight
- Proportion of infants who are small for gestational age
- Apgar scores
- Admissions to NICU
- Stillbirths (5 in treated group, 14 in control group, p = 0.08)
No difference
18 One source of discrepancy
- Can see a situation where the observational studies each put forth their best endpoint
- Is this a good idea?
- Clinical trials focus on one particular endpoint as primary
- Why?
19 In principle
- Have chosen an endpoint
- Clinically relevant
- Have an appropriate sample size
- to have a type I error rate of 5%
- sufficient power for a clinically meaningful difference
20 In practice
- Rare that trials use a single endpoint
- Endpoints
- cover clinical events
- symptoms
- physiologic measures
- side effects
- quality of life
21 What's the problem?
- If we test each endpoint at the 5% level
- overall chance of finding at least one endpoint where there is a significant difference is larger than 5%, even if the treatments are identical
- Prone to distorted reporting (i.e. pick most significant)
- Good reference: Pocock (1997), Controlled Clinical Trials, p 530-545
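To get a feel for how quickly the overall error rate grows, here is a minimal sketch that assumes the endpoints are statistically independent, which the slide does not claim and which rarely holds in practice. Under that assumption the chance of at least one false positive among k tests at level alpha is 1 - (1 - alpha)^k. The function name is hypothetical.

```python
def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among k independent tests,
    each run at level alpha, when the treatments are truly identical."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - alpha) ** k


# Independence is an assumption made for illustration only.
for k in (1, 2, 5, 8, 10):
    print(f"k = {k:2d} endpoints -> chance of at least one false positive: "
          f"{familywise_error_rate(k):.3f}")
```

With correlated endpoints the inflation is smaller than this formula suggests, but the overall rate still exceeds the nominal per-endpoint level.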
22 What's the problem?
- If you lock in on a single endpoint, in either the RCT or observational studies
- Could miss important therapies
- If you allow yourself to look at many endpoints
- Could wind up doing RCTs based on spurious findings, or
- Recommending treatments that aren't truly effective
23 What to do?
- Have a pre-defined strategy
- Some advocate
- all results pre-written, with results filled in as trial concludes
- Alternative view
- need to be flexible
- need to allow for unexpected findings
- but recognize potential for problems: type I error rate is not 5%
24 Delineate primary and secondary outcomes
- Many advocate having a single primary endpoint
- drives sample size calculations
- test based on this endpoint has a 5% type I error rate
- All other endpoints are secondary
25 Example: Michalowicz et al. (NEJM, Nov 2, 2006)
- Primary
- Gestational age at end of pregnancy
- Secondary
- Preterm births (< 37 weeks), birth weight, proportion of infants who are small for gestational age, Apgar scores, admissions to NICU
26 Delineate primary and secondary outcomes
- Can be hard to adhere to in practice
- For example, what if primary outcome is not different among groups, but all secondary outcomes are?
27 Bonferroni procedure
- If you have k endpoints
- Multiply observed p-value by number of endpoints
- For example, with k = 8, convert an observed p-value of 0.01 to 0.08
- Ensures that if H0 is true for all endpoints, probability of rejecting H0 for at least one endpoint is less than or equal to α
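The adjustment described above can be sketched in a few lines. The function name and the list of eight p-values are hypothetical; only the first value (0.01) and k = 8 come from the slide. Capping the adjusted value at 1 is a common convention added here so that the result remains a valid probability.

```python
def bonferroni_adjust(p_values: list[float]) -> list[float]:
    """Multiply each observed p-value by the number of endpoints, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


# Hypothetical p-values for k = 8 endpoints; only the first is from the slide.
observed = [0.01, 0.20, 0.35, 0.04, 0.60, 0.08, 0.90, 0.15]
adjusted = bonferroni_adjust(observed)

for p, p_adj in zip(observed, adjusted):
    print(f"observed p = {p:.2f} -> adjusted p = {p_adj:.2f}")
```

Comparing each adjusted p-value with the chosen level is equivalent to comparing each observed p-value with the level divided by k.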
28 Limitations of the Bonferroni procedure
- Endpoints tend to be correlated, so this is conservative
- probability of type I error can be much smaller than α
- Treats all outcomes as equal in importance
- Can lead to illogical results
- Trial with p-values 0.01, 0.75, 0.75, 0.75, 0.75 is significant
- Trial with p-values 0.02, 0.02, 0.02, 0.02, 0.02 is not significant
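The two trials above can be checked directly. This sketch assumes an overall level of 0.05 (used elsewhere in the lecture but not stated for this example) and calls a trial "significant" when at least one endpoint passes the Bonferroni threshold of alpha / k. The function name is hypothetical.

```python
def trial_is_significant(p_values: list[float], alpha: float = 0.05) -> bool:
    """True if the smallest p-value meets the Bonferroni threshold alpha / k."""
    k = len(p_values)
    # The tiny tolerance guards against floating-point rounding at the boundary.
    return min(p_values) <= alpha / k + 1e-12


trial_a = [0.01, 0.75, 0.75, 0.75, 0.75]  # one small p-value, four large ones
trial_b = [0.02, 0.02, 0.02, 0.02, 0.02]  # five consistently small p-values

print("Trial A significant:", trial_is_significant(trial_a))
print("Trial B significant:", trial_is_significant(trial_b))
```

With five endpoints the threshold is 0.01, so Trial A passes on a single endpoint while Trial B, with consistent evidence across all five, does not.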
29 Limitations of the Bonferroni procedure
- Procedure reduces power to detect real differences in specific outcomes, if they exist
- Protects type I error at expense of power
- Difficult to apply strictly in many cases
30 So what's the answer? One person's opinion...
- Selection of a primary outcome is important
- Need to allow for surprises
- Full disclosure of endpoints, instead of selected endpoints, can alleviate a lot of the problems
- Adjust p-values?
- What's the goal of the study?
31 Surrogate endpoints
- Hesitate to use the term
- Has a specific technical definition
- Issue
- Quicker, less expensive, less clinically relevant endpoint, or
- More expensive, clinically definitive endpoint?
32 Example
- Treatment for osteoporosis
- Endpoint
- Bone density via DEXA?
- Fracture?
- If fracture, how would this be ascertained?
33 Example
- Choice is, for same amount of resources
- More patients with less clinically relevant outcome (bone density)
- Fewer patients with more clinically relevant outcome (fracture)
- Frequently see the quick endpoint in earlier stage trials.
34 Summary
- Choice of endpoints is crucial to the success of the study, either RCT or observational
- Issues about
- Which one?
- How many?
- Primary vs secondary?