Title: Diagnostic tests
1 Diagnostic Tests
Subodh S Gupta, MGIMS, Sewagram
2 Standard 2 x 2 Table (for Diagnostic Tests)
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)          a              b          a+b
Diagnostic test negative (T-)          c              d          c+d
Total                                 a+c            b+d           N
3 Standard 2 x 2 Table (for Diagnostic Tests)
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)
Diagnostic test positive (T+)         TP             FP
Diagnostic test negative (T-)         FN             TN
4 Gold standard
- In any study of diagnosis, the method being evaluated has to be compared to something.
- The best available test used as the comparison is called the GOLD STANDARD.
- Remember that gold standards are not always "gold": the new test may be better than the gold standard.
5 Test parameters
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)          a              b          a+b
Diagnostic test negative (T-)          c              d          c+d
Total                                 a+c            b+d           N
- Sensitivity = Pr(T+ | D+) = a/(a+c)
  -- Sensitivity is PID (Positive In Disease)
- Specificity = Pr(T- | D-) = d/(b+d)
  -- Specificity is NIH (Negative In Health)
6 Test parameters
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)          a              b          a+b
Diagnostic test negative (T-)          c              d          c+d
Total                                 a+c            b+d           N
- False positive rate (FP rate) = Pr(T+ | D-) = b/(b+d)
- False negative rate (FN rate) = Pr(T- | D+) = c/(a+c)
- Diagnostic accuracy = (a+d)/N
7 Test parameters
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)          a              b          a+b
Diagnostic test negative (T-)          c              d          c+d
Total                                 a+c            b+d           N
- Positive predictive value (PPV) = Pr(D+ | T+) = a/(a+b)
- Negative predictive value (NPV) = Pr(D- | T-) = d/(c+d)
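All of these parameters are simple functions of the four cell counts. A minimal Python sketch, assuming the cell labels a, b, c, d above (the function name test_parameters and its dictionary output are illustrative choices):

```python
# Minimal sketch: compute the test parameters defined above from the 2 x 2 cells.
# a = TP, b = FP, c = FN, d = TN (disease status by the gold standard in columns).

def test_parameters(a, b, c, d):
    n = a + b + c + d
    return {
        "sensitivity": a / (a + c),   # Pr(T+ | D+)
        "specificity": d / (b + d),   # Pr(T- | D-)
        "fp_rate": b / (b + d),       # Pr(T+ | D-) = 1 - specificity
        "fn_rate": c / (a + c),       # Pr(T- | D+) = 1 - sensitivity
        "accuracy": (a + d) / n,      # proportion of all results that are correct
        "ppv": a / (a + b),           # Pr(D+ | T+)
        "npv": d / (c + d),           # Pr(D- | T-)
    }
```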
8 Test parameters: Example
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)         90              5           95
Diagnostic test negative (T-)         10             95          105
Total                                100            100          200
Sensitivity = 90/(90+10) = 90%
Specificity = 95/(95+5) = 95%
FP rate = 5/(95+5) = 5%
FN rate = 10/(90+10) = 10%
Diagnostic accuracy = (90+95)/(90+10+5+95) = 92.5%
PPV = 90/(90+5) = 94.7%
NPV = 95/(95+10) = 90.5%
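A quick sketch checking the arithmetic above with the same cell counts (values print as proportions rather than percentages):

```python
# Checking the arithmetic above with the same cell counts (proportions, not %).
a, b, c, d = 90, 5, 10, 95                       # TP, FP, FN, TN
print("sensitivity:", a / (a + c))               # 0.90
print("specificity:", d / (b + d))               # 0.95
print("accuracy:", (a + d) / (a + b + c + d))    # 0.925
print("PPV:", a / (a + b))                       # ~0.947
print("NPV:", d / (c + d))                       # ~0.905
```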
9 PPV and NPV with Prevalence
Sensitivity: 90%
Specificity: 95%
False negative rate: 10%
False positive rate: 5%
PPV: 94.7%
NPV: 90.5%
Diagnostic accuracy: 92.5%
12 Healthy population vs sick population
[Figure: "Healthy" and "Sick" populations]
13 Predictive Values in hospital-based data
14 Predictive Values in population-based data
15 Test Parameters: Example
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)         90              5           95
Diagnostic test negative (T-)         10             95          105
Total                                100            100          200
Prevalence 50%; PPV 94.7%; NPV 90.5%; Diagnostic accuracy 92.5%
16 Test Parameters: Example
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)         90             95          185
Diagnostic test negative (T-)         10           1805         1815
Total                                100           1900         2000
Prevalence 5%; PPV 48.6%; NPV 99.4%; Diagnostic accuracy 94.8%
17 Test Parameters: Example
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)         90            995         1085
Diagnostic test negative (T-)         10          18905        18915
Total                                100          19900        20000
Prevalence 0.5%; PPV 8.3%; NPV 99.9%; Diagnostic accuracy 95%
18 Test Parameters: Example
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)    Total
Diagnostic test positive (T+)         90           9995         10085
Diagnostic test negative (T-)         10         189905        189915
Total                                100         199900        200000
Prevalence 0.05%; PPV 0.9%; NPV 100%; Diagnostic accuracy 95%
19 PPV and NPV with Prevalence
Prevalence              50%      5%       0.5%     0.05%
Sensitivity             90%      90%      90%      90%
Specificity             95%      95%      95%      95%
PPV                     94.7%    48.6%    8.3%     0.9%
NPV                     90.5%    99.4%    99.9%    100%
Diagnostic accuracy     92.5%    94.8%    95%      95%
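The same PPV and NPV figures can be reproduced from sensitivity, specificity and prevalence alone using Bayes' theorem. The sketch below is illustrative (the helper name predictive_values is an assumption, not from the slides):

```python
# Sketch: PPV and NPV from sensitivity, specificity and prevalence (Bayes' theorem).

def predictive_values(sens, spec, prev):
    ppv = sens * prev / (sens * prev + (1 - spec) * (1 - prev))
    npv = spec * (1 - prev) / (spec * (1 - prev) + (1 - sens) * prev)
    return ppv, npv

for prev in (0.50, 0.05, 0.005, 0.0005):
    ppv, npv = predictive_values(0.90, 0.95, prev)
    print(f"prevalence {prev:.2%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```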
20 Trade-offs between Sensitivity and Specificity
21 Sensitivity and Specificity solve the wrong problem!
- When we use a diagnostic test clinically, we do not know who actually has and who does not have the target disorder; if we did, we would not need the diagnostic test.
- Our clinical concern is therefore not the vertical one of sensitivity and specificity, but the horizontal one of the meaning of positive and negative test results.
22 When a clinician uses a test, which question is important?
- If I obtain a positive test result, what is the probability that this person actually has the disease?
- If I obtain a negative test result, what is the probability that the person does not have the disease?
23 Test parameters
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)          a              b          a+b
Diagnostic test negative (T-)          c              d          c+d
Total                                 a+c            b+d           N
- Sensitivity = Pr(T+ | D+) = a/(a+c)
- Specificity = Pr(T- | D-) = d/(b+d)
- PPV = Pr(D+ | T+) = a/(a+b)
- NPV = Pr(D- | T-) = d/(c+d)
24 Likelihood Ratios
- A likelihood ratio is a ratio of two probabilities.
- Likelihood ratios state how many times more (or less) likely a particular test result is observed in patients with the disease than in those without it.
- LR+ tells how much the odds of the disease increase when a test is positive.
- LR- tells how much the odds of the disease decrease when a test is negative.
25
- The likelihood ratio for a positive result (LR+) tells how much the odds of the disease increase when a test is positive.
- The likelihood ratio for a negative result (LR-) tells you how much the odds of the disease decrease when a test is negative.
26 Likelihood Ratios
The LR for a positive test is defined as:
LR(+) = Prob(T+ | D) / Prob(T+ | no D)
LR(+) = [TP/(TP+FN)] / [FP/(FP+TN)]
LR(+) = Sensitivity / (1 - Specificity)
27 Likelihood Ratios
The LR for a negative test is defined as:
LR(-) = Prob(T- | D) / Prob(T- | no D)
LR(-) = [FN/(TP+FN)] / [TN/(FP+TN)]
LR(-) = (1 - Sensitivity) / Specificity
28 What is a good Likelihood Ratio?
- An LR(+) greater than 10 or an LR(-) less than 0.1 provides convincing diagnostic evidence.
- An LR(+) greater than 5 or an LR(-) less than 0.2 is considered to give strong diagnostic evidence.
29 Likelihood Ratio: Example
                                  Disease status (gold standard)
                                  Present (D+)   Absent (D-)   Total
Diagnostic test positive (T+)         90              5           95
Diagnostic test negative (T-)         10             95          105
Total                                100            100          200
Likelihood ratio for a positive test = (90/100) / (5/100) = 90/5 = 18
Likelihood ratio for a negative test = (10/100) / (95/100) = 10/95 = 0.11
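A small sketch of the LR formulas applied to this example (the helper name likelihood_ratios is illustrative):

```python
# Sketch: LR+ and LR- from sensitivity and specificity (0.90 and 0.95 here).

def likelihood_ratios(sens, spec):
    lr_pos = sens / (1 - spec)        # odds multiplier for a positive result
    lr_neg = (1 - sens) / spec        # odds multiplier for a negative result
    return lr_pos, lr_neg

lr_pos, lr_neg = likelihood_ratios(0.90, 0.95)
print(lr_pos, lr_neg)                 # ~18 and ~0.105
```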
30 Exercise
- In a hypothetical example of a diagnostic test, serum levels of a biochemical marker of a particular disease were compared with the known diagnosis of the disease. A marker level of 100 international units or greater was taken as an arbitrary positive test result.
31 Example
                    Disease present   Disease absent   Total
Marker > 100              431               30           461
Marker < 100               29              116           145
Total                     460              146           606
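One hedged way to work this exercise (not the deck's own answer key) is to read the usual parameters straight off the table:

```python
# Sketch of one way to work the exercise, straight from the table above.
a, b, c, d = 431, 30, 29, 116        # TP, FP, FN, TN
print("sensitivity:", a / (a + c))   # 431/460 ~ 0.937
print("specificity:", d / (b + d))   # 116/146 ~ 0.795
print("PPV:", a / (a + b))           # 431/461 ~ 0.935
print("NPV:", d / (c + d))           # 116/145 = 0.800
```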
32 Exercise
- Initial creatine phosphokinase (CK) levels were related to the subsequent diagnosis of acute myocardial infarction (MI) in a group of patients with suspected MI. Four ranges of CK result were chosen for the study.
33 Exercise
                 Disease present   Disease absent   Total
CK > 280               97                1             98
CK 80-279             118               15            133
CK 40-79               13               26             39
CK 1-39                 2               88             90
Total                 230              130            360
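For a test reported in several ranges, each range can be given its own likelihood ratio: the proportion of diseased patients with that result divided by the proportion of non-diseased patients with it. A hedged sketch for the CK table above (not the deck's own answer key):

```python
# Sketch: stratum-specific likelihood ratios for the CK ranges above.
# For each range, LR = (proportion of MI patients with that result)
#                    / (proportion of non-MI patients with that result).
ck_strata = {
    "> 280": (97, 1),      # (MI present, MI absent)
    "80-279": (118, 15),
    "40-79": (13, 26),
    "1-39": (2, 88),
}
total_present = sum(p for p, _ in ck_strata.values())   # 230
total_absent = sum(a for _, a in ck_strata.values())    # 130
for ck_range, (present, absent) in ck_strata.items():
    lr = (present / total_present) / (absent / total_absent)
    print(f"CK {ck_range}: LR = {lr:.2f}")
```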
35 Odds and Probability
                 Disease present   Disease absent   Total
                        a                 b           a+b
Probability of disease = (number with disease) / (number with + number without disease) = a/(a+b)
Odds of disease = (number with disease) / (number without disease) = a/b
Probability = Odds / (Odds + 1)
Odds = Probability / (1 - Probability)
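A small sketch of the two conversions stated above (the helper names are illustrative):

```python
# Sketch of the two conversions stated above.

def prob_to_odds(p):
    return p / (1 - p)

def odds_to_prob(odds):
    return odds / (odds + 1)

print(prob_to_odds(0.2))    # 0.25
print(odds_to_prob(0.25))   # 0.2
```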
36 Use of the Likelihood Ratio
A three-step procedure is used:
1. Identify the pre-test probability and convert it to pre-test odds.
2. Determine the post-test odds using the formula: Post-test odds = Pre-test odds × Likelihood ratio.
3. Convert the post-test odds into the post-test probability.
37 Likelihood Ratio: Example
- A 52-year-old woman presents after detecting a 1.5 cm breast lump on self-examination. On clinical examination, the lump is not freely movable. If the pre-test probability is 20% and the LR for a non-movable breast lump is 4, calculate the probability that this woman has breast cancer.
38 Likelihood Ratio: Solution
- First step:
  - Pre-test probability = 0.2
  - Pre-test odds = Pre-test probability / (1 - pre-test probability)
  - Pre-test odds = 0.2/(1 - 0.2) = 0.2/0.8 = 0.25
- Second step:
  - Post-test odds = Pre-test odds × LR
  - Post-test odds = 0.25 × 4 = 1
- Third step:
  - Post-test probability = Post-test odds / (1 + Post-test odds)
  - Post-test probability = 1/(1+1) = 1/2 = 0.5
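The three-step procedure and this worked example translate directly into a short sketch (the function name post_test_probability is illustrative):

```python
# Sketch: the three-step procedure applied to the breast-lump example.

def post_test_probability(pre_test_prob, lr):
    pre_odds = pre_test_prob / (1 - pre_test_prob)   # step 1: probability -> odds
    post_odds = pre_odds * lr                        # step 2: multiply by the LR
    return post_odds / (1 + post_odds)               # step 3: odds -> probability

print(post_test_probability(0.2, 4))   # 0.5
```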
41 Receiver Operating Characteristic (ROC)
- Finding the best test
- Finding the best cut-off
- Finding the best combination
43 ROC curve constructed from multiple test thresholds
44 Receiver Operating Characteristic (ROC)
- The ROC curve allows comparison of different tests for the same condition without (before) specifying a cut-off point.
- The test with the largest AUC (area under the curve) is the best.
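A hedged sketch of how an ROC curve and its AUC can be computed by sweeping the cut-off over a set of test values; the scores and labels below are hypothetical, and plotting is omitted:

```python
# Sketch: build ROC points by sweeping the cut-off over hypothetical test values,
# then measure the area under the curve (AUC) by the trapezoidal rule.

def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per cut-off, starting at (0, 0)."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]     # hypothetical test values
labels = [1, 1, 1, 0, 1, 0, 0, 0]                      # 1 = disease present
print(f"AUC = {auc(roc_points(scores, labels)):.2f}")  # 0.94 for these data
```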
47 Features of a good diagnosis study
- Comparative (compares the new test against an old test).
- There should be a gold standard.
- It should include both positive and negative results.
- It will usually involve blinding of the patient, the tester, and the investigator.
48 Gold standard
- In any study of diagnosis, the method being evaluated has to be compared to something.
- The best available test used as the comparison is called the GOLD STANDARD.
- Remember that gold standards are not always "gold": the new test may be better than the gold standard.
49 Typical setting for finding Sensitivity and Specificity
- Best if everyone who gets the new test also gets the gold standard.
- This doesn't happen in the real world.
- Often there is not even a sample of each (case-control type design).
- What is usually available is a case series of patients who had both tests.
50 Setting for finding Sensitivity and Specificity
- Sensitivity should not be tested in the "sickest of the sick"; the sample should include the spectrum of disease.
- Specificity should not be tested in the "healthiest of the healthy"; the sample should include similar conditions.
51 Precision
- How precise are the estimates of sensitivity, specificity, false positive rate, false negative rate, positive predictive value, and negative predictive value?
- If these are reported without a measure of precision, clinicians cannot know the range within which the true values of the indices are likely to lie.
- When evaluations of diagnostic accuracy are reported, the precision of the test characteristics should be stated.
52 Sample size for adequate sensitivity
53 Sample size for adequate specificity
54 Exercise
- Dr Egbert Everard wants to test a new blood test (Sithtastic) for the diagnosis of the dark side gene. He wants the test to have a sensitivity of at least 70% and a specificity of 90%, at the 5% significance level (95% confidence). Disease prevalence in this population is 10%.
  (i) How many patients does Egbert need to be 95% sure his test is more than 70% sensitive?
  (ii) How many patients does Egbert need to be 95% sure that his test is more than 90% specific?
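One widely cited approach to sample-size questions of this kind is Buderer's (1996) formulas, which give the number of subjects needed to estimate sensitivity or specificity to within an absolute margin d at a chosen confidence level. The sketch below assumes a margin of 0.05 at 95% confidence and uses the exercise's figures for illustration only; the intended textbook answer may use a different formula.

```python
# Sketch of Buderer's (1996) sample-size formulas, assuming the goal is to
# estimate sensitivity/specificity to within an absolute margin d at 95%
# confidence. The margin of 0.05 is an assumption, not from the exercise.

Z_95 = 1.96   # standard normal quantile for 95% confidence

def n_for_sensitivity(sens, prevalence, d):
    n_diseased = Z_95 ** 2 * sens * (1 - sens) / d ** 2
    return n_diseased / prevalence            # scale up to total subjects

def n_for_specificity(spec, prevalence, d):
    n_non_diseased = Z_95 ** 2 * spec * (1 - spec) / d ** 2
    return n_non_diseased / (1 - prevalence)  # scale up to total subjects

print(round(n_for_sensitivity(0.70, 0.10, 0.05)))   # ~3227 subjects
print(round(n_for_specificity(0.90, 0.10, 0.05)))   # ~154 subjects
```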
55 Biases in Research on Diagnostic Tests
- Observer bias
- Spectrum bias
- Reference test bias
- Bias index
- Work-up (verification) bias
- Diagnostic suspicion bias
56 Observer bias
- Blinding
- Investigators should be blinded to the test results when interpreting the reference test, and blinded to the reference test results when interpreting the test.
- Should they also be blinded to other patient characteristics?
57 Spectrum bias
- Indeterminate results dropped from analysis
58 Reference Test Bias
- What if the gold standard is not "gold" after all?
- Absence of a gold standard
- Methods to deal with the absence of a gold standard:
  - Correcting for reference test bias (Gart and Buck)
  - Bayesian estimation (Joseph, Gyorkos, Coupal)
  - Latent class modelling (Walter, Cook, Irwig)
59 BIAS INDEX
- What if the test itself commits certain types of errors more commonly than others?
- BI = (b - c)/N
60 Work-up (Verification) Bias
- Occurs when a test efficacy study is restricted to patients in whom the disease status is known.
- A study by Borow et al (Am Heart J, 1983) on patients who were referred for valve surgery on the basis of echocardiographic assessment reported excellent diagnostic agreement between the findings at echocardiography and at surgery.
61 Review Bias
- The test and the gold standard should follow a randomized sequence of administration.
- This tends to offset the diagnostic suspicion bias that may creep in when the gold standard is always applied and interpreted last.
- It will also balance any effect of time on rapidly increasing severity of the disease, and thereby avoid a bias towards more positives in whichever test is performed later.
62 Ethical Issues in Diagnostic Test Research
- Invasive techniques
- Labeling
- Confidentiality
- Human subjects
63 QUALITIES OF STUDIES EVALUATING DIAGNOSTIC TESTS
- Reid MC et al. Use of methodological standards in diagnostic test research: getting better but still not good. JAMA 1995; 274: 645.
- Review of studies published between 1990 and 1993:
  - Work-up bias: 38% of studies
  - Observer bias (blinding): 53% of studies
  - Bias from indeterminate results: 62% of studies
  - No assessment of variability across test observers, test instruments, or time: 68% of studies
64 QUALITIES OF STUDIES EVALUATING DIAGNOSTIC TESTS
- Small sample size, with no description of confidence intervals: 76% of studies
- Patient characteristics not described: 68% of studies
- Possible interactions or effect modification ignored: 88% of studies
- Only two (6%) of the 34 articles published from 1990-1993 (N Engl J Med, JAMA, Lancet, BMJ) met six or more of the standards.
65 USERS' GUIDES TO THE MEDICAL LITERATURE
- How to use an article about a diagnostic test
- Are the results of the study valid?
- What are the results, and will they help me in caring for my patients?
66 Methodological Questions for Appraising Journal Articles about Diagnostic Tests
1. Was there an independent, blind comparison with a gold standard of diagnosis?
2. Was the setting for the study, as well as the filter through which the study patients passed, adequately described?
3. Did the patient sample include an appropriate spectrum of disease?
4. Was an analysis of the pertinent subgroups done?
5. Were the tactics for carrying out the test described in sufficient detail to permit their exact replication?
67 Methodological Questions (continued)
6. Was the reproducibility of the test result (precision) and its interpretation (observer variation) determined?
7. Was the term "normal" defined sensibly?
8. Was the precision of the test statistics given?
9. Were indeterminate test results presented?
10. If the test is advocated as part of a cluster or sequence of tests, was its contribution to the overall validity of the cluster or sequence determined?
11. Was the utility of the test determined?
68 Thank you