Title: Information Mastery
1. Information Mastery
- Evaluating Articles about Diagnostic Tests
- (Is Bayes' Theorem Really Important?)
2. Goals
- What's your bet? Odds and probabilities in medicine
- Distinguishing the technical precision of a test from the clinical precision of a test
- How the prevalence of the disease can improve or worsen a test
3. This is about . . .
- . . . Uncertainty!
4. "Physicians can do more to admit the existence of uncertainty, both to themselves and to their patients. Although this will undoubtedly be unsettling, it is honest, and it opens the way for a more intensive search for ways to reduce uncertainty."
- DAVID M. EDDY, MD, PhD
- Eddy DM. Clinical Decision Making. Chicago: American Medical Association, 1996.
5. Technical vs. Clinical Precision
- Baby Jeff: the case of screening for muscular dystrophy at HH
- Technical precision of the CPK test
- Sensitivity (ability to rule out disease): 100%
- Specificity (ability to identify disease): 99.98%
- But,
- The prevalence of MD is 1 in 5,000 (0.02%)
6. Does Baby Jeff have M.D.?
- Of 100,000 males, 20 will have M.D.
- (1 in 5,000, or 0.02% prevalence)
- The test will correctly identify all 20 who have the disease (sensitivity 100%)
7. Does Baby Jeff have M.D.?
- Of the 99,980 without M.D.:
- Specificity 99.98%
- 99,980 x 0.9998 = 99,960 will be negative
- Therefore, false positives = 20
8. . . . The Rest of the Story
- Therefore,
- Out of 100,000 infants, 20 will be truly positive and 20 will be false positive
- Positive predictive value = 50% (see the sketch below)
- The child with a positive screening test only has a 50/50 chance of actually having MD!
- HARM!
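The arithmetic on this slide generalizes to any test. A minimal Python sketch (the function name and structure are illustrative, not part of the talk):

  def positive_predictive_value(sensitivity, specificity, prevalence):
      # Bayes' theorem: P(disease | positive test)
      true_positives = sensitivity * prevalence
      false_positives = (1 - specificity) * (1 - prevalence)
      return true_positives / (true_positives + false_positives)

  # Baby Jeff's CPK screen: sensitivity 100%, specificity 99.98%, prevalence 1 in 5,000
  print(positive_predictive_value(1.00, 0.9998, 1 / 5000))  # ~0.50, i.e. a 50/50 chance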
9. Another Example: Lyme Disease
- Antibody assay
- Sensitivity 95%, specificity 95%
- High Lyme disease prevalence (20%)
- Positive predictive value 83%
- Low Lyme disease prevalence (2%)
- Positive predictive value 28%
- Brown SL. Role of serology in the diagnosis of Lyme disease. JAMA 1999;282:62-6.
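The same Bayes arithmetic reproduces both predictive values; a quick illustrative check (the helper name is mine, not from the article):

  def ppv(sens, spec, prev):
      tp = sens * prev
      fp = (1 - spec) * (1 - prev)
      return tp / (tp + fp)

  # Antibody assay: sensitivity 95%, specificity 95%
  print(ppv(0.95, 0.95, 0.20))  # high-prevalence setting: ~0.83
  print(ppv(0.95, 0.95, 0.02))  # low-prevalence setting: ~0.28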
10. Another Example: Mammography
- Mammography in women between 40-50 yrs
- If 100,000 women are screened:
- 6,034 mammograms will be abnormal
- 5,998 (99.4%) will be false-positive
- 36 will actually have breast cancer
- Why? Prevalence 0.04% (including 4 false negatives)
- Hamm RM, Smith SL. The accuracy of patients' judgments of disease probability and test sensitivity and specificity. J Fam Pract 1998;47:44-52.
- Kerlikowske K, et al. Likelihood ratios for modern screening mammography. Risk of breast cancer based on age and mammographic interpretation. JAMA 1996;276:39-43.
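Working backward from these counts, the slide implies a sensitivity of about 90% and a specificity of about 94%; a hedged reconstruction (the two test characteristics are my inference, not figures quoted on the slide):

  women = 100_000
  prevalence = 0.0004                      # 0.04%: 40 women with breast cancer
  sensitivity = 0.90                       # implied by 36 true positives out of 40 (4 missed)
  specificity = 1 - 5_998 / 99_960         # ~0.94, implied by 5,998 false positives

  with_cancer = women * prevalence                        # 40
  true_pos = with_cancer * sensitivity                    # 36
  false_pos = (women - with_cancer) * (1 - specificity)   # 5,998
  print(true_pos + false_pos)                             # ~6,034 abnormal mammograms
  print(true_pos / (true_pos + false_pos))                # PPV ~0.006, i.e. ~0.6%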
11. Heart Disease and Echo Results
- Patients at low risk (example: yearly physical): prevalence 10%
- Sensitivity 90%, specificity 90%
- Positive predictive value 50%
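A quick check of that 50% figure (illustrative arithmetic only):

  prev, sens, spec = 0.10, 0.90, 0.90
  tp = prev * sens              # 0.09 of all patients: diseased and test positive
  fp = (1 - prev) * (1 - spec)  # 0.09 of all patients: healthy but test positive
  print(tp / (tp + fp))         # 0.50: only half of positive echoes are true positives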
12. And the WINNER!
- The proteomic pattern test for screening for ovarian cancer
- Better than CA-125 at identifying ovarian cancer (Petricoin EF, Ardekani AM, Hitt BA, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002;359:572-7.)
- Sensitivity 100%
- Specificity 95%
- Prevalence in women: 1 in 2,500
13. How many women with a positive test will have ovarian cancer?
- One out of every 126
- 0.8%
- 99.2% of tests will be falsely positive
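The "1 in 126" follows directly from the stated test characteristics; the counts per 100,000 below are my own back-of-the-envelope arithmetic:

  women = 100_000
  with_cancer = women / 2_500                 # 40 women (prevalence 1 in 2,500)
  true_pos = with_cancer * 1.00               # sensitivity 100%: all 40 detected
  false_pos = (women - with_cancer) * 0.05    # specificity 95%: 4,998 false positives
  ppv = true_pos / (true_pos + false_pos)
  print(ppv)          # ~0.008: about 1 positive test in 126 is a real cancer
  print(1 - ppv)      # ~0.992: roughly 99.2% of positive tests are false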
14. THE CLASSIC 2x2 TABLE
             Disease +   Disease -
  Test +        TP          FP
  Test -        FN          TN
15. Sensitivity
- Read down the Disease + column of the 2x2 table: Sensitivity = TP / (TP + FN)
16. Specificity
- Read down the Disease - column of the 2x2 table: Specificity = TN / (TN + FP)
17. Technical vs. Clinical Precision
- Sensitivity
- The percentage of patients with the disease who have a positive test
- Number with the disease and a positive test / Number with the disease
- Positive Predictive Value
- The percentage of patients with a positive test who have the disease
- Number with a positive test and the disease / Number with a positive test
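The distinction is easiest to see when both are computed from the same 2x2 counts (function and variable names below are illustrative):

  def from_two_by_two(tp, fp, fn, tn):
      return {
          "sensitivity": tp / (tp + fn),   # of those with disease, how many test positive
          "specificity": tn / (tn + fp),   # of those without disease, how many test negative
          "ppv":         tp / (tp + fp),   # of positive tests, how many have disease
          "npv":         tn / (tn + fn),   # of negative tests, how many are disease-free
      }

  # Baby Jeff's screen per 100,000 newborns: 20 TP, 20 FP, 0 FN, 99,960 TN
  print(from_two_by_two(tp=20, fp=20, fn=0, tn=99_960))  # sensitivity 1.0, ppv 0.5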
18. Technical Precision
- Specificity: Remember SpPin
- When a test has a high Specificity, a Positive test rules IN the disorder.
- Sensitivity: Remember SnNout
- When a test has a high Sensitivity, a Negative result rules OUT the disorder.
19. Specificity: Large holes catch most of the big fish but let through the small fish (most of the fish caught will be the big fish you want: SpPin)
20. Sensitivity: Small holes catch all the big fish and many small fish. (If there are no big fish in the net, they probably aren't out there: SnNout)
21. The Yin-Yang of Sensitivity and Specificity
- Benefit
- Sensitivity and specificity are unaffected by prevalence of disease
- Detriment
- Sensitivity and specificity are unaffected by prevalence of disease
22. Predictive Values
- Positive Predictive Value
- The proportion of patients with a positive test who have the disease
- Negative Predictive Value
- The proportion of patients with a negative test who don't have the disease
- Predictive values are affected by prevalence
23. Positive Predictive Value
- Read across the Test + row of the 2x2 table: PPV = TP / (TP + FP)
24. Negative Predictive Value
- Read across the Test - row of the 2x2 table: NPV = TN / (TN + FN)
25. Putting It All Together
- Columns of the 2x2 table: Sensitivity = TP / (TP + FN), Specificity = TN / (TN + FP)
- Rows of the 2x2 table: PPV = TP / (TP + FP), NPV = TN / (TN + FN)
26. Likelihood Ratios
- Similar to the concepts of "ruling in" and "ruling out" disease
- Pre-Test Odds x LR = Post-Test Odds
- The problem: we don't think in terms of odds
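Converting between probability and odds is the only extra step; a minimal sketch (helper names are mine), reusing the Lyme disease numbers from earlier:

  def prob_to_odds(p):
      return p / (1 - p)

  def odds_to_prob(odds):
      return odds / (1 + odds)

  # Pre-test odds x LR = post-test odds
  pretest_prob = 0.02                  # 2% pre-test probability (low-prevalence Lyme setting)
  lr_positive = 0.95 / (1 - 0.95)      # sensitivity / (1 - specificity) = 19
  posttest_odds = prob_to_odds(pretest_prob) * lr_positive
  print(odds_to_prob(posttest_odds))   # ~0.28, matching the 28% predictive value above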
27. Likelihood Ratios
- Allow many levels of interpretation for a test result
- LR / Meaning
- >10: Strong evidence to rule in a disease
- 5-10: Moderate evidence to rule in
- 0.5-2: Indeterminate
- 0.2-0.5: Weak evidence to rule out
- 0.1-0.2: Moderate evidence to rule out
- <0.1: Strong evidence to rule out
- However, even a high-LR test can be misleading if the disease has a low prevalence
- CPK testing in newborns: LR = 5,000 (see the check below)
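The CPK example shows why: despite an enormous likelihood ratio, the tiny pre-test probability keeps the post-test probability at only 50% (an illustrative check of the slide's numbers):

  sens, spec, prevalence = 1.00, 0.9998, 1 / 5_000

  lr_positive = sens / (1 - spec)                # 1.00 / 0.0002 = 5,000
  pretest_odds = prevalence / (1 - prevalence)   # ~0.0002
  posttest_odds = pretest_odds * lr_positive     # ~1.0
  print(posttest_odds / (1 + posttest_odds))     # ~0.50 post-test probability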
28. So . . . the Importance of Bayes' Theorem
- At low prevalence (e.g. screening, primary care), even great tests can have significant false positives
- At high prevalence (confirmatory testing), great tests can have significant false negatives, leading to confusion
- Hazards of inappropriate testing/diagnosis: remember Baby Jeff
- The clinician's role: responsibility
29. Practice
30. The serum test screens pregnant women for babies with Down's syndrome. The test is a very good one, but not perfect. Roughly 1% of babies have Down's syndrome. If the baby has Down's syndrome, there is a 90% chance that the result will be positive. If the baby is unaffected, there is still a 1% chance that the result will be positive. A pregnant woman has been tested and the result is positive. What is the chance her baby actually has Down's syndrome?
Answer: 47.4%
Answered incorrectly by 20/21 OBs, 22/22 midwives, 21/22 pregnant women, and 17/20 companions
31. Assuming 1,000 pregnant women are screened
- 1% of 1,000 women = 10 babies with Down's syndrome; 990 are unaffected
- Of the 10 affected: 90% are identified, so 9 test positive and 1 is missed
- Of the 990 unaffected: 1% test positive, so about 10 are false positives and 980 test negative
- Positive tests: 9 true + 10 false = 19, and 9/19 = 47.4% (see the check below)
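The tree above reduces to the same calculation used throughout the talk (checked here in the same illustrative style):

  women = 1_000
  affected = women * 0.01                       # 10 pregnancies with Down's syndrome
  true_pos = affected * 0.90                    # 9 detected
  false_pos = round((women - affected) * 0.01)  # ~10 of the 990 unaffected test positive
  print(true_pos / (true_pos + false_pos))      # 9 / 19 = 0.474, the 47.4% answer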
32. Evaluating a Study
33. Are The Results Valid?
- Comparison with the gold standard
- Blinded comparison
- Independent testing
34. Are The Results Valid?
- Was the test applied to patients with a spectrum of the disease in question (consecutive vs. random vs. convenience sample)?
- Is the test reasonable? Limited?
35. Are The Results Valid?
- What are the results?
- Sensitivity, specificity, and predictive values
- Likelihood ratio calculation
- Prevalence of disease in the study population
- Typical?
- Similar to your practice?
36. Levels of POEMness for Diagnostic Tests
- Sensitivity and specificity
- Does it change diagnoses?
- Does it change treatment?
- Does it change outcomes?
- Is it worthwhile (to patients and/or society)?
- (Examples: HbA1c for DM, CPK vs. T4/PKU in newborns, electron beam tomography for CAD, CRP, BMD)
- Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med Decis Making 1991;11:88-94.