Title: EPI-820 Evidence-Based Medicine
Slide 1: EPI-820 Evidence-Based Medicine
- Lecture 9: Meta-Analysis I
- Mat Reeves BVSc, PhD
Slide 2: Objectives
- Understand the rationale for quantitative synthesis
- Describe the steps in performing a meta-analysis
  - identification, selection, abstraction, and analysis
- Know the appropriate analytic approach for meta-analysis of key study designs
  - Experimental (RCTs)
  - Observational (cohort, case-control, diagnostic tests)
- Other issues
  - Publication bias
  - Quality assessment
  - Random versus fixed effects models
  - Meta-regression
Slide 3: Background
- Facts
  - For most clinical problems/public health issues there is an overwhelming amount of existing information, as well as new information produced every year
  - However, much of this information
    - isn't very good (= poor quality)
    - is derived from different methods and definitions (= poor standardization)
    - is often contradictory (= heterogeneity)
  - Very few single studies resolve an issue unequivocally (i.e., a "home run" study)
- So how should we go about summarizing medical information?
Slide 4: How do we summarize medical information?
- Traditional Approach
  - Expert opinion
  - Narrative review articles
    - Validity? Unbiased? Reproducible?
    - Methods? (one study, one vote?)
  - Consensus statements (group expert opinion)
- New Approach (Meta-analysis)
  - Explicit quantitative synthesis of ALL the evidence
Slide 5: Definition - Meta-analysis
- A technique for quantitatively combining the results of previous studies to
  - Generate a summary estimate of effect, OR
  - Identify and explain heterogeneity
- Alternate definition: a study of studies, to help guide further research and identify reasons for heterogeneity between studies.
- Also known as an Overview or Synthesis
Slide 6: Overview
- Initially developed in the social sciences in the mid-1960s
- Adapted to medical studies in the early 1980s
- Initially applied to RCTs, esp. when individual studies were small and under-powered
- Also applied to observational epidemiologic studies, often with little forethought, which generated much controversy
- Explosion in the number of published meta-analyses in the last 10-15 years.
Slide 7: Overview
- Often the initial step of a cost-effectiveness analysis, decision analysis, or grant application (esp. for RCTs).
- Are much cheaper than a big RCT!
- Results usually agree with those of later large randomized trials, but not always (LeLorier, 1997)
Slide 8: Discrepancies between meta-analyses and subsequent large RCTs (LeLorier, NEJM 1997)

                             Results of RCTs
Results of meta-analyses     Positive    Negative
Positive                     13          6
Negative                     7           14

Agreement: 27/40 (68%)
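
The agreement figure can be reproduced from the four cells of the table. The sketch below also computes Cohen's kappa, a chance-corrected agreement statistic; the slide itself reports only raw agreement, so the kappa step is an added illustration, and the function and argument names are hypothetical.

```python
def agreement_and_kappa(both_pos, ma_pos_rct_neg, ma_neg_rct_pos, both_neg):
    """Raw agreement and Cohen's kappa for a 2x2 table of paired verdicts."""
    n = both_pos + ma_pos_rct_neg + ma_neg_rct_pos + both_neg
    observed = (both_pos + both_neg) / n
    # Agreement expected by chance, from the row and column totals.
    ma_pos = both_pos + ma_pos_rct_neg
    ma_neg = ma_neg_rct_pos + both_neg
    rct_pos = both_pos + ma_neg_rct_pos
    rct_neg = ma_pos_rct_neg + both_neg
    expected = (ma_pos * rct_pos + ma_neg * rct_neg) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Cell counts taken from the table above.
observed, kappa = agreement_and_kappa(13, 6, 7, 14)
print(f"observed agreement = {observed:.3f}, kappa = {kappa:.2f}")
```

Raw agreement is 27/40 = 0.675, which the slide rounds to 68%. Because half of that agreement would be expected by chance given the margins, kappa is considerably lower than the raw figure.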
Slide 9: When is a meta-analysis appropriate?
- When several studies are known to exist
- When studies disagree (= heterogeneity), resulting in a lack of consensus
- When both exposures and outcomes are quantified and presented in a usable format.
- When existing individual studies are under-powered
  - M-A could then produce a precise estimate of effect
- When you want to identify reasons for heterogeneity
  - M-A could illustrate why and identify important sub-group differences
- When no one else has done it (yet!), or an update of an existing meta-analysis is justified.
Slide 10: Before you begin - plan
- M-As appear easy to do but require careful planning and adequate resources (time)
- Need to develop a study protocol
  - Specify primary and secondary objectives
  - Methods
    - Describe search strategy (sources, published studies only?, fugitive literature?, blinding?, reliability checks?)
    - Define eligibility criteria
    - Type of quality assessment (if any)
  - Analysis
    - Type of model (fixed vs random, use of quality scores?)
    - Subgroup analyses?
    - Sensitivity analysis?
Slide 11: Estimating Time Required to do a M-A
- Meta-Works (Boston, MA), a private company
  - Provided estimates based on 37 M-As
- Size of the body of literature, quality, complexity, reviewer pool, and support services are all important
- Average total hours per study (i.e., per M-A): 1139 (range 216-2516)
  - Search, selection, abstraction: 588 hrs
  - Statistical analysis: 144 hrs
  - Write-up: 206 hrs
  - Other tasks: 201 hrs
- Size of the body of literature before any deletions (x) is the best single guide: $\text{Hrs} = 721 + 0.243x - 0.0000123x^2$
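
As a rough planning aid, the regression above can be evaluated for a few literature sizes. This is a minimal sketch that assumes the operators lost from the slide are a plus on the linear term and a minus on the quadratic term; the function name and the example citation counts are hypothetical.

```python
# Average hours per task as listed on the slide.
TASK_HOURS = {
    "search, selection, abstraction": 588,
    "statistical analysis": 144,
    "write-up": 206,
    "other tasks": 201,
}


def estimated_hours(x):
    """Estimated total hours for a M-A given x citations before any deletions.

    Assumes the form Hrs = 721 + 0.243x - 0.0000123x^2.
    """
    return 721 + 0.243 * x - 0.0000123 * x**2


# The four task averages add up to the reported overall average of 1139.
print("sum of task averages:", sum(TASK_HOURS.values()))

for citations in (500, 1000, 2500):
    print(citations, "citations ->", round(estimated_hours(citations)), "hours")
```

The intercept dominates: under this form, even a small literature implies several hundred hours of work.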
Slide 12: Steps in a meta-analysis
- 1. Identification (Search)
- 2. Selection
- 3. Abstraction
- 4. Analysis
- 5. Write-up
Slide 13: 1. Identification - Sources
- M-As use systematic, explicit search procedures (cf. qualitative literature review)
- MEDLINE
  - 4100 journals
  - 1966 - present
  - Web search at PubMed: http://www.ncbi.nlm.nih.gov/PubMed
  - other search engines: BRS Colleague, WinSPIRS, etc.
- EMBASE
  - similar to MEDLINE, European version
  - Expensive, not widely available in US
Slide 14: Identification - Sources
- Cochrane Collaboration Controlled Trials Register
  - Over 160,000 trials, including abstracts (and translations)
  - available by subscription (MSU Electronic Library database)
  - includes
    - MEDLINE, EMBASE
    - non-English publications
    - non-indexed publications
    - hand-search of journals
- Other MEDLARS
  - CancerLit, AIDSLINE, TOXLINE, Dissertation Abstracts Online
- Index Medicus
  - important if searching before 1966
  - hand-search only
Slide 15: Identification - Steps
- 1. Search own personal files
- 2. Search electronic databases
  - Review titles and on-line abstracts to eliminate irrelevant articles
  - Retrieve remaining articles, review, and determine whether they meet inclusion/exclusion criteria
- 3. Review reference lists of articles for missed references
- 4. Consult experts/colleagues/companies
- 5. Conduct hand-searches of non-electronic databases and/or relevant journals
- 6. Consider consulting an expert (medical librarian) with training in MEDLINE and use of MeSH terms.
Slide 16: Limitations of electronic databases
- Electronic resources have been essential for growth of M-A, but they are far from perfect
- 1. Databases are incomplete
  - MEDLINE contains only about one third of all biomedical journals
- 2. Indexing is never perfect
  - Want search to have high Se (include all relevant studies) and high Sp (exclude the irrelevant!)
  - Ratio of retrieved articles to relevant articles can vary widely
Slide 17: Limitations of electronic databases
- 2. Indexing is never perfect
  - Accuracy of indexing per se relies on
    - authors understanding how studies are categorized
    - database assigning the correct category to each study
  - Indexing also depends on the ability of search strategies (e.g., MeSH) to identify relevant articles
Slide 18: Limitations of electronic databases
- 3. Search strategies are never perfect
  - It's hard to find all the relevant studies
  - Average Se of expert searchers using MEDLINE (vs known registries of studies) = 0.51
- Example: National Perinatal RCT Registry (number of studies found)

Topic                          Perinatal RCT Registry    MEDLINE (expert searcher)    MEDLINE (amateur searcher)
Neonatal hyperbilirubinemia    88                        28                           17
Intraventricular hemorrhage    29                        19                           11
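
The per-topic search sensitivity implied by this example can be worked out directly. The sketch below treats the registry count as the full set of relevant trials and assumes every trial found through MEDLINE is also in the registry; both are simplifying assumptions, and the variable names are hypothetical.

```python
# (registry total, found by expert searcher, found by amateur searcher)
TOPICS = {
    "Neonatal hyperbilirubinemia": (88, 28, 17),
    "Intraventricular hemorrhage": (29, 19, 11),
}


def sensitivity(found, total_relevant):
    """Proportion of known relevant studies that the search retrieved."""
    return found / total_relevant


for topic, (registry, expert, amateur) in TOPICS.items():
    print(
        f"{topic}: expert Se = {sensitivity(expert, registry):.2f}, "
        f"amateur Se = {sensitivity(amateur, registry):.2f}"
    )
```

Sensitivity differs markedly between the two topics as well as between searchers, which is one reason a single database search is rarely treated as sufficient.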
Slide 19: Other search issues
- Non-English Studies
  - MEDLINE
    - Translation of title usually provided but abstracts often not. But N.B. that many non-English journals are not included anyway!
  - No a priori justification for excluding non-English studies
    - Quality is often equivalent or even better!
  - Excluding non-English studies can affect conclusions
  - But including them means you need a translation just to determine eligibility!
Slide 20: Fugitive Literature
- unpublished studies (why are they unpublished?)
- dissertations
- drug company studies
- book chapters
- non-indexed studies and abstracts
- conference proceedings
- government reports
- pre-MEDLINE (before 1966)
- Sometimes important sources of information
- Hard to track down: contact experts/colleagues
- Need to decide whether to include or not; general consensus is that you should.
Slide 21: Publication bias
- Published studies are not representative of all studies that have been performed
- Articles with positive findings (P < 0.05) are more likely to be published
- Hence published studies are a biased sub-set
- Publication bias = systematic error of M-A that results from using only published studies
Slide 22: Evidence of Publication Bias
Easterbrook (1991): 285 analyzed studies reviewed by Oxford Ethics Committee, 1984-87

Study Status      N      % with P < 0.05
Published         138    67
Presented Only    69     55
Neither           78     29
Total             285    54
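
The total row is the N-weighted average of the three groups, which can be checked with a few lines. This is only an arithmetic sketch: it assumes the right-hand column is the percentage of studies with P < 0.05, and the variable names are hypothetical.

```python
# (number of studies, percent with P < 0.05) for each publication status.
GROUPS = {
    "Published": (138, 67),
    "Presented only": (69, 55),
    "Neither": (78, 29),
}

total_n = sum(n for n, _ in GROUPS.values())
# Approximate count of significant studies, reconstructed from rounded percents.
significant = sum(n * pct / 100 for n, pct in GROUPS.values())
overall_pct = 100 * significant / total_n

print(f"total studies = {total_n}")
print(f"overall percent significant = {overall_pct:.1f}")
```

The weighted figure rounds to the 54% shown in the total row, while published studies were significant more than twice as often as studies that were neither published nor presented.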
Slide 23: Implications of Publication Bias
Simes (1986): Chemotherapy for Advanced Ovarian CA. Comparison of Published Trials vs Registered Trials

Results                  Published      Registered
N                        16             13
Median Survival Ratio    1.16           1.06
95% CI                   1.06-1.27      0.97-1.15
P value                  0.02           0.24
Slide 24: Publication Bias
- Probably results from a combination of author and editor practices and decisions (Ioannidis, 98)
- Emphasizes the importance of registries of trials (N.B. Similar registries of observational studies are probably not feasible, although in the social sciences the Campbell Collaboration is attempting to do this)
- Simple Solution
  - Don't base publication decisions on statistical significance!
  - Focus on interval estimation.
  - Yeah right!
Slide 25: Publication bias - Approaches
- 1. Attempt to Retrieve all Studies
  - Required for Cochrane Publications
  - Difficult to identify unpublished studies and then to find out details about them
- Worst Case Adjustment
  - Number of unpublished negative studies needed to negate a positive meta-analysis
  - $X = [N \times ES / 1.645]^2 - N$
    - where N = number of studies in meta-analysis,
    - ES = effect size
  - Example
    - If N = 25 and ES = 0.6, then X ≈ 58
    - Almost 60 unpublished negative studies would be required to negate the meta-analysis of 25 studies.
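
The worst-case adjustment can be sketched in a few lines. This assumes the slide's "1.6452" is 1.645 squared (the one-sided 5% critical value of the standard normal), so that the formula reads X = (N × ES / 1.645)² − N with ES on a Z-score scale; the function name is hypothetical.

```python
def fail_safe_n(n_studies, effect_size, z_crit=1.645):
    """Unpublished null studies needed to negate a positive meta-analysis.

    Assumes X = (N * ES / z_crit)^2 - N, with ES as a mean Z-type effect.
    """
    return (n_studies * effect_size / z_crit) ** 2 - n_studies


x = fail_safe_n(25, 0.6)
print(f"fail-safe number of studies: {x:.1f}")
```

With N = 25 and ES = 0.6 this gives a value of roughly 58, in line with the slide's "almost 60".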
Slide 26: 2. Graphical Approaches - Funnel plot
[Figure: funnel plot with effect size on the horizontal axis and sample size (precision) on the vertical axis. Annotation: missing studies = small effect size with negative findings.]
Slide 27: 2. Selection
- Inclusion/eligibility criteria essential to
  - Produce a more focused (valid) study
  - Ensure reproducibility and minimize bias
- Apply criteria systematically and rigorously
- Balance between highly restrictive versus non-restrictive criteria in terms of
  - face validity, homogeneity, power (N), generalizability
- Always develop in advance and include clinical expert(s) in the team
Slide 28: Typical inclusion criteria
- study design (e.g., RCTs?, DBPC?, Cohort, CCS?)
- setting (emergency department, outpatient, inpatient)
- age (adults only, > 60 only, etc.)
- year of publication or conduct (esp. if technology or typical dosing changes)
- similarity of exposure or treatment (e.g., drug class, or dosage)
- similarity of outcomes (case definitions)
- minimum sample size or follow-up
- languages?
- complete vs incomplete (abstracts)
- published vs fugitive?
- pre-1966?
Slide 29: Selection - Other Issues
- multiple publications from same study?
  - Include only one! (double dipping is common!)
- report should provide enough information for analysis (i.e., point estimate and variability = SD or SE)
- Selection process should be done independently by at least 2 reviewers
  - Measure agreement (kappa) and resolve discrepancies
- Document excluded studies and reasons for exclusion
  - Keep pertinent but excluded studies
Slide 30: Typical Searching and Selection Results
- First pass, using title in computer search: 300 - 500 articles
- Second pass, using abstract in computer search: 60 - 100 articles
- Final pass, using copy of entire article: 30 - 60 articles
- Included in study: 30 articles
Slide 31: 3. Abstraction
- Goal: to abstract reliable, valid, and bias-free information from all written sources
- Should expect a degree of unreliability
  - intra- and inter-rater reliability is rarely if ever 100%!
- Many sources of potential error
  - Article may be wrong due to typographical or copyediting errors
  - Reported results can be misinterpreted
  - Errors in data entry during abstraction process
Slide 32: Abstraction
- Ways to minimize error
  - Develop and pilot test abstraction forms
  - Develop definitions, abstraction instructions, and rules
  - Train abstractors, pilot test, get feedback, and refine
- Abstraction Forms
  - Number each data item
  - Require a response for EVERY item
  - Distinguish between negative, missing, and not-applicable
  - Simple instructions/language
  - Clear skip and stop instructions
  - Items clearly linked to definitions and abstraction rules
Slide 33: Abstraction
- Typical process
  - 2 independent reviewers
  - Practice with 2 or 3 articles to calibrate
  - Use a 3rd reviewer or consensus meeting to resolve conflicts
  - Measure agreement (kappa) and resolve discrepancies
Slide 34: Other Issues - Abstraction
- Outcome measures of interest may have to be calculated from original data
  - For example, data to calculate relative risk may be present but not described as such.
- Multiple estimates from same study?
  - Experimental: intention-to-treat vs not, adjusted for loss to follow-up
  - Observational: crude vs age-adjusted vs multiply adjusted (model)
  - Include only one estimate per study; avoid over-fitted model estimates (as these are often more imprecise)
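
When an article reports only event counts, the relative risk and its confidence interval can be derived during abstraction. The sketch below uses the standard large-sample formula for the standard error of ln(RR); the counts are invented purely for illustration and the function name is hypothetical.

```python
import math


def relative_risk(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Relative risk with a 95% CI based on the log-scale standard error."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(
        1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical trial: 15/100 events on treatment vs 30/100 on control.
rr, lower, upper = relative_risk(15, 100, 30, 100)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Recording the counts themselves on the abstraction form, rather than only the derived estimate, lets a second reviewer repeat the calculation.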
Slide 35: Investigator Bias
- Abstractor may be biased in favor of (or against!) a particular outcome (positive or negative finding), researcher/institution, or journal.
  - prominent journals may be given greater weight or authority (rightly or wrongly)
- If this may be an issue, have a research assistant eliminate identifiers from articles (= blind review)
Slide 36: Blind Review
- Remove study information that could affect inclusion or quality of abstraction, like
  - author, title, journal, institution, country
- Berlin (97)
  - compared blinded vs non-blinded reviews
  - Found discrepancy in which studies to include but little difference in summary effect sizes
- Time consuming
- Probably can be avoided, esp. if well-defined abstraction procedures are used
Slide 37: Assessment of study quality
- Quality is an implicit measure of validity
  - Poor quality studies have lower validity
  - Using quality scoring should theoretically improve the validity of M-As
- Process
  - Develop criteria (how?)
  - Develop scale (= scoring system)
  - Abstract information and score each study
- Example RCT scoring systems
  - Chalmers (1981): 36 item scale! (see HWK 5)
  - Jadad (1997): 5 point scale
Slide 38: Jadad Criteria for Scoring RCTs (1997 Cont Clin Trials 17:1-12)
- 1. Randomization
  - Appropriate (= 1 point) if each patient had equal chance of receiving intervention and investigators could not predict the assignment
  - Add 1 point if mechanism described and appropriate
  - Deduct 1 point if mechanism described and inappropriate
- 2. Double blinding
  - Appropriate (= 1 point) if stated that neither the patient nor investigators could identify intervention, or if active placebo, identical placebo, or dummies mentioned
  - Add 1 point if method described and appropriate
  - Deduct 1 point if method described and inappropriate
- 3. Withdrawals and dropouts
  - Appropriate (= 1 point) if number and reasons for loss-to-FU in each group described.
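
The scoring rules above translate into a small function. This is a sketch of the criteria as listed on this slide, not an official implementation of the scale; the function name, argument names, and the "appropriate"/"inappropriate" labels are hypothetical.

```python
# Bonus or penalty for how a method was described; None means not described.
METHOD_ADJUSTMENT = {"appropriate": 1, "inappropriate": -1, None: 0}


def jadad_score(randomized, randomization_method,
                double_blind, blinding_method,
                withdrawals_described):
    """Score a trial report from 0 to 5 using the three criteria above."""
    score = 0
    if randomized:
        score += 1 + METHOD_ADJUSTMENT[randomization_method]
    if double_blind:
        score += 1 + METHOD_ADJUSTMENT[blinding_method]
    if withdrawals_described:
        score += 1
    return score


# A report described as randomized and double-blind, with appropriate methods
# for both and a full account of withdrawals, receives the maximum score.
print(jadad_score(True, "appropriate", True, "appropriate", True))
# A report that only says "randomized" and "double-blind", and accounts for
# withdrawals, without describing either method.
print(jadad_score(True, None, True, None, True))
```

Encoding "not described" separately from "inappropriate" mirrors the abstraction advice to distinguish missing from negative responses.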
Slide 39: Uses of Quality Scores
- Threshold (minimum score for inclusion)
- Categorize study quality
  - High, medium, low quality
  - Use as sub-group analyses
- Sensitivity analysis
- Combine study-specific scores with variance (based on N) to generate modified weights
  - Poorer studies count less
  - Generally not recommended
- Meta-regression
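
To make the "modified weights" option concrete, the sketch below compares an ordinary inverse-variance pooled estimate with one in which each weight is multiplied by a quality score between 0 and 1. All study values are invented for illustration, the multiplicative weighting is only one possible scheme, and the slide itself notes that this use of quality scores is generally not recommended.

```python
def pooled_estimate(effects, variances, quality=None):
    """Weighted mean of study effects using inverse-variance weights.

    If quality scores are given, each weight is multiplied by its score.
    """
    weights = [1 / v for v in variances]
    if quality is not None:
        weights = [w * q for w, q in zip(weights, quality)]
    return sum(w * e for w, e in zip(weights, effects)) / sum(weights)


# Hypothetical log relative risks, their variances, and quality scores.
effects = [-0.5, -0.2, 0.1]
variances = [0.04, 0.02, 0.01]
quality = [0.4, 0.8, 1.0]

plain = pooled_estimate(effects, variances)
adjusted = pooled_estimate(effects, variances, quality)
print(f"inverse-variance only: {plain:.3f}")
print(f"quality-adjusted:      {adjusted:.3f}")
```

In this made-up example the low-quality study has the largest effect, so down-weighting it pulls the pooled estimate toward the null.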
Slide 40: Other Issues - Quality Scoring
- Quality is difficult to measure
- No consensus on method of scale development, not even for RCTs
- Few reliability/validity studies of scoring systems
  - inter-rater reliability of quality assessment often poor
- Relies on quality of the reporting itself
  - sometimes a study is blinded or randomized, but if this is not explicitly stated then it suffers in quality assessment
- Difficult to detect bias from publications
- More recent studies score higher partly because they conform to recent standardized reporting protocols (e.g., CONSORT for RCTs)