Solving Classification Problems for Symptom Validity Tests with Mixed Groups Validation

Solving Classification Problems for Symptom Validity Tests with Mixed Groups Validation Richard Frederick, Ph.D., ABPP (Forensic) US Medical Center for Federal Prisoners –



1
Solving Classification Problems for Symptom
Validity Tests with Mixed Groups Validation
  • Richard Frederick, Ph.D., ABPP (Forensic)
  • US Medical Center for Federal Prisoners
  • Springfield, Missouri

2
I am not a neuropsychologist.
My view of brain
Your view of brain
3
My board certifications
  • Forensic Psychology
  • American Board of Professional Psychology
  • Assessment Psychology
  • American Board of Assessment Psychology

4
My professional goal
  • Use tests properly in forensic psychological
    assessments

5
Goals of workshop
Participants in this workshop will be able to
employ Excel graphing methods
  • to evaluate classification characteristics of symptom validity tests
  • to adapt symptom validity test scores to their individual, local base
    rates
  • to combine information from local base rate and multiple symptom
    validity tests
6
richardfrederick.com
7
Something is terribly wrong
  • The SIRS has sensitivity .485 and specificity .995.
  • The SIRS was administered to 131 criminal defendants who were strongly
    suspected of feigned psychopathology.
  • 68 of them were categorized as feigning by the SIRS.

8
What is a classification test? A structured
routine for determining which individuals belong
to which of two groups.
9
  • (1) There are two groups.
  • (2) It's not easy to determine which group an individual belongs to
    without the help of the test.

10
Real World
11
The distributions represent our estimations of
how the populations of the two groups score on
the test. We generally estimate the population
distributions by sampling. We notice that the
populations have two separate, but overlapping
distributions. The extent of the overlap is of
concern to us.
12
Questions that must be addressed in research before we can continue:
  • Are there really two separate groups?
  • Can we effectively represent the population distributions by sampling?

13
Real World
14
What we notice next. The mean separation between
the groups is 10 points. Persons in Population
A have a mean score that is 10 points below
persons in Population B. The sd for each
population is the same. The mean separation
between groups is one sd.
15
When researchers talk about mean separation,
they often refer to effect size. Often,
Cohen's d is the statistic used to refer to
standardized mean separation. Here, Cohen's d = 1.
This is often referred to as a large, or
very large, effect size.
16
Mean separation 0
Making tests often means finding those
characteristics that best separate the
distributions of the two groups.
Two distributions of gender with respect
to Intelligence
17
Moderately large mean separation
Two distributions of gender with respect
to Longevity
18
Large mean separation
Two distributions of gender with respect to Hair
Length
19
Very large mean separation
Two distributions of gender with respect to Body
Mass
20
Real World
21
Summary
  • (1) We have two groups.
  • (2) We have a test for which the two groups score differentially.
  • (3) The difference in mean scores represents a very large effect.

22
Foundations of TPR and FPR
24
More commonly, researchers report Sensitivity
and Specificity. These terms are common, but
not the most helpful. We are going to use the
terms True Positive Rate (TPR) and False
Positive Rate (FPR). TPR = Sensitivity.
FPR = 1 - Specificity.
25
What are TPR and FPR? TPR is the proportion of
individuals who do have the condition who
generate positive scores. Among those who have
the condition, TPR is the rate of scores beyond
the cut in the direction that indicates the
presence of the condition. FPR is the proportion
of individuals who do NOT have the condition who
generate positive scores. Among those who do not
have the condition, FPR is the rate of scores
beyond the cut in the direction that indicates
the presence of the condition.
26
The green line represents the cut score. Scores
to the LEFT of the line are classified
NEGATIVE. Scores to the RIGHT are classified
POSITIVE.
Have nots
Haves
Here, the False Positive Rate is 92.4%. The True
Positive Rate is 100%. As we move the line to
the right, both rates DECREASE.
30
To totally eliminate false positives, we have to
be willing to identify almost no one as a
positive.
31
Test / Truth     Have disorder      Don't              Total
Has disorder     True Positives     False Positives    Positives
Doesn't          False Negatives    True Negatives     Negatives
Total            Haves              Have Nots
TPR = True Positives / Haves
FPR = False Positives / Have Nots
32
Haves
Have nots
33
A positive score will be one that is associated
with Population A membership. If we set a point
at which a score will be used to say, "This
score represents Population A," such a score
will be referred to as a positive score. A
positive score can be a true positive or a false
positive; which one is unknown to us.
34
The True Positive Rate is the proportion of
Population A members who generate a positive
score. In our figure, the point at which we
begin to identify positive scores is at 50,
the mean of population A. Scores at or below 50
are called positive, and a person who generates
a positive score is classified as a Population A
member.
35
We can pick any value to be our cut score, but
it's hard to pick one that doesn't result in
some Population B members producing positive
scores. In our figure, 50% of the Population A
members have scores at 50 or below. This is the
True Positive Rate. TPR = .50. In our figure,
16% of the Population B members have scores at
50 or below. This is the False Positive Rate.
FPR = .16.
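The figures on the last few slides can be reproduced with a small sketch. It assumes both populations are normally distributed, with Population A at mean 50, Population B at mean 60, and a common SD of 10. The mean of B and the normality are inferred from the slides (one SD of separation, 16% of B at or below 50), not stated outright. The helper names are illustrative.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Assumed values: A has mean 50, B has mean 60, both with SD 10.
MEAN_A, MEAN_B, SD = 50.0, 60.0, 10.0

def rates_at_cut(cut):
    """Scores at or below `cut` count as positive (classified as Population A)."""
    tpr = normal_cdf(cut, MEAN_A, SD)  # share of Population A flagged
    fpr = normal_cdf(cut, MEAN_B, SD)  # share of Population B flagged
    return tpr, fpr

cohens_d = (MEAN_B - MEAN_A) / SD
tpr, fpr = rates_at_cut(50.0)
print(f"d = {cohens_d:.1f}, TPR = {tpr:.2f}, FPR = {fpr:.2f}")
# prints: d = 1.0, TPR = 0.50, FPR = 0.16
```

Calling rates_at_cut with a different cut shows how both rates move together as the cut score moves.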
37
We note that it is not the test that has a
certain TPR and FPR. It is the chosen test score
that has a certain TPR and FPR. A different
test score will almost certainly have different
TPR and FPR.
38
Overcoming limiting factors of known groups
validation in determining test score sensitivity
and specificity
39
We think of a test as a way to characterize a
dependency. As you have more of X, you have more
of Y. Y depends on X. X predicts Y. X is some
construct. Y is some test score. There is a
relationship that we wish to characterize and
quantify.
40
Let's consider feigning. As you are more likely
to feign, you are more likely to engage in
certain behavior. This behavior might be
providing answers to items on a test at a
certain rate. You might choose more items, or
you might choose fewer items, than normals.
41
We develop the idea that we can identify
individuals who respond at a certain rate as
feigners, and we decide to make a decision point
about when we call test takers feigners and when
we don't. We call that decision point a cut
score. We call test scores at or beyond the cut
score "positive scores." Some positive scores
are correct: true positives. Some positive
scores are incorrect: false positives.
42
If our test is any good, and if the relationship
between X and Y is strong, then our rate of true
positives is much higher than our rate of false
positives. Let's skip to the end. We are now
using the test in our clinic. We look over our
results. We see a number of positive
scores. We know that those positive scores
are some unknown mixture of true positives and
false positives. We'd like to know what the
ratio of that mixture is.
43
Here's how we do it. First, we estimate what the
true positive rate of the cut score is. Then,
we estimate what the false positive rate of the
cut score is. Then, we figure out what
percentage of people in our sample are
feigning. Then we can get the ratio of the
mixture of our true positives and false positives
in all the positive scores in our clinic. (We
call this positive predictive power.)
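Once the three estimates are in hand, the last step is plain arithmetic. The sketch below is one way to write it. The function name is illustrative, and the example values (TPR .80, FPR .10, base rate .30) are assumptions chosen only to show the calculation.

```python
def predictive_power(tpr, fpr, base_rate):
    """Return (PPP, NPP) for a cut score with the given TPR and FPR at a base rate."""
    true_pos = base_rate * tpr                   # feigners who score positive
    false_pos = (1.0 - base_rate) * fpr          # non-feigners who score positive
    false_neg = base_rate * (1.0 - tpr)          # feigners who score negative
    true_neg = (1.0 - base_rate) * (1.0 - fpr)   # non-feigners who score negative
    all_pos = true_pos + false_pos
    all_neg = true_neg + false_neg
    ppp = true_pos / all_pos if all_pos > 0 else float("nan")
    npp = true_neg / all_neg if all_neg > 0 else float("nan")
    return ppp, npp

# Illustrative values only (assumed, not taken from any study).
ppp, npp = predictive_power(tpr=0.80, fpr=0.10, base_rate=0.30)
print(f"PPP = {ppp:.3f}, NPP = {npp:.3f}")
```

With these inputs, about 77% of positive scores would be true positives. The same test at a lower base rate gives a lower PPP.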
44
Getting TPR and FPR We depend on researchers to
tell us what the estimates of true positive rate
and false positive rate are. They usually do
this through a process called criterion groups
validation. People with more confidence than
might be called for refer to this process as
known groups validation.
45
The process is seemingly straightforward.
Identify two groups. One group has the condition.
All the positives in this group are true
positives. One group doesn't have the
condition. All the positives in this group are
false positives. The rate of true positives
is the sensitivity of the test. TPR =
sensitivity. The rate of false positives is
the non-specificity of the test. FPR = 1 -
specificity.
46
There are many problems with this process, but
let's focus on the main two. Problem 1: In
Study 1, for a given cut score, researchers
report the TPR is .67 and the FPR is .12. In
Study 2, for the same cut score, researchers
report TPR = .58 and FPR = .09. Which values do
you use?
47
Problem 2: In Study 1, for a given cut score,
researchers report the TPR is .67 and the FPR is
.12. In Study 2, for a different cut score,
researchers report TPR = .58 and FPR = .09.
Which cut score do you use?
48
Known groups validation
49
Let's validate a test!
50
God whispers to us what truth is and we identify
100 honest responders and 100 feigners.

100 100
51
We take our best shot at a test.
TRUTH



100 100
TEST
52
Test results
                 TRUTH
TEST             Feigning   Honest   Total
Positive            49         1       50
Negative            51        99      150
Total              100       100
53
We say for our test:
True positive rate = 49/100 = 49%, sensitivity = 49%
False positive rate = 1/100 = 1%, specificity = 99%
                 TRUTH
TEST             Feigning   Honest   Total
Positive            49         1       50
Negative            51        99      150
Total              100       100
54
Because God does not whisper to us anymore, we
take this test, our best test, and we say, "This
is the best we can do. Let's call it our Gold
Standard." We will now make criterion groups
with this test, and we will call the groups
"Known Groups." We will then validate tests,
based on these "Known Groups."
55
We say for our test:
True positive rate = 49/100 = 49%, sensitivity = 49%
False positive rate = 1/100 = 1%, specificity = 99%
                 TRUTH
TEST             Feigning   Honest   Total
Positive            49         1       50
Negative            51        99      150
Total              100       100
56
Our move from TRUTH to KNOWN GROUPS
                 KNOWN GROUPS
TRUTH            Positive   Negative   Total
Feigning            49         51       100
Honest               1         99       100
Total               50        150
57
We forget what truth is and develop faith in our
gold standard
KNOWN GROUPS



50 150
58
Let's validate a new test, which just happens to
be a perfect test. What test diagnostic
efficiencies will we assign our new, perfect,
test?
                 KNOWN GROUPS
PERFECT TEST     Positive   Negative   Total
Positive            49         51       100
Negative             1         99       100
Total               50        150
59
Let's validate a new test, which just happens to
be a perfect test. What test diagnostic
efficiencies will we assign our new, perfect,
test?
                 KNOWN GROUPS
PERFECT TEST     Positive   Negative   Total
Positive            49         51       100
Negative             1         99       100
Total               50        150
Our belief that we can make perfect criterion
groups from imperfect criteria has led us to
misunderstand tremendously what we are doing.
TPR = 49/50 = 98%, FPR = 51/150 = 34%
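The distortion on this slide can be sketched in a few lines. The function below is illustrative. It assumes that the new test's errors are independent of the gold standard's errors once true status is fixed, an assumption the slides do not state. It returns the TPR and FPR a new test would appear to have when the "known groups" come from an imperfect gold standard.

```python
def apparent_rates(n_feign, n_honest, gold_tpr, gold_fpr, new_tpr, new_fpr):
    """Apparent (TPR, FPR) of a new test validated against imperfect 'known groups'.

    Assumes the new test and the gold standard err independently given true status.
    """
    # True composition of the gold standard's two groups.
    feign_in_pos = n_feign * gold_tpr       # feigners the gold standard catches
    honest_in_pos = n_honest * gold_fpr     # honest responders it mislabels
    feign_in_neg = n_feign - feign_in_pos   # feigners it misses
    honest_in_neg = n_honest - honest_in_pos
    pos_group = feign_in_pos + honest_in_pos
    neg_group = feign_in_neg + honest_in_neg
    # Rate of new-test positives inside each "known" group.
    apparent_tpr = (feign_in_pos * new_tpr + honest_in_pos * new_fpr) / pos_group
    apparent_fpr = (feign_in_neg * new_tpr + honest_in_neg * new_fpr) / neg_group
    return apparent_tpr, apparent_fpr

# Gold standard from the slides (TPR .49, FPR .01); new test is perfect.
app_tpr, app_fpr = apparent_rates(100, 100, 0.49, 0.01, new_tpr=1.0, new_fpr=0.0)
print(f"apparent TPR = {app_tpr:.2f}, apparent FPR = {app_fpr:.2f}")
# prints: apparent TPR = 0.98, apparent FPR = 0.34
```

The perfect test is charged with a 34% false positive rate only because 51 true feigners sit in the "known negative" group.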
60
Let's begin to address these problems in a
non-traditional way.
61
Table for Computation of Test Characteristics
                 Positive (Feigners)   Negative (Not Feigning)
Test Positive            80                     10               Computation for Positive Predictive Power
Test Negative            20                     90               Computation for Negative Predictive Power
                 Sensitivity = 80%     Specificity = 90%
62
Table for Computation of Test Characteristics
                 Positive (Feigners)   Negative (Not Feigning)
Test Positive            80                     10               PPP = Ratio of True Positives to All Positives
Test Negative            20                     90               NPP = Ratio of True Negatives to All Negatives
                 True Positive Rate (TPR) = 80%     False Positive Rate (FPR) = 10%
64
Table for Computation of Test Characteristics
                 Base Rate of Feigning
                 100%        0%
Test Positive     80         10
Test Negative     20         90
True Positive Rate (TPR) = 80%     False Positive Rate (FPR) = 10%
NOTE: Calculations of TPR and FPR are INDEPENDENT of Base Rate
65
Table for Computation of Test Characteristics
                 Base Rate of Feigning
                 100%        0%
Test Positive     80         10
True Positive Rate (TPR) = 80%     False Positive Rate (FPR) = 10%
66
Table for Computation of Test Characteristics
                             Base Rate of Feigning
                             1.00       0
Proportion Tests Positive    .80        .10
True Positive Rate (TPR) = .80     False Positive Rate (FPR) = .10
68
REMINDER: Here is what we are working on:
figuring out which positives in our clinic are
true positives. First, we estimate what the true
positive rate of the cut score is. Then, we
estimate what the false positive rate of the cut
score is. Let's do that part now. Then, we
figure out what percentage of people in our
sample are feigning. Then we can get the ratio
of the mixture of our true positives and false
positives in all the positive scores in our
clinic.
69
Mixed groups validation
70
Table for Computation of Test Characteristics
Base Rate of Malingering     1.0    .8     .6     .4     .2     0
Proportion Tests Positive    .8     .66    .52    .38    .24    .1
                             TPR = .8                           FPR = .1
(.8, .6, .4, .2 are mixed groups, not pure)
71
Table for Computation of Test Characteristics
Base Rate of Malingering     0      .2     .4     .6     .8     1
Proportion Tests Positive    .1     .24    .38    .52    .66    .8
                             FPR = .1                           TPR = .8
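The row of proportions in this table follows a straight line between FPR (at base rate 0) and TPR (at base rate 1). A minimal sketch of that relationship, with an illustrative function name:

```python
def proportion_positive(base_rate, tpr, fpr):
    """Expected proportion of positive scores in a mixed group.

    Feigners contribute at rate TPR, everyone else at rate FPR.
    """
    return fpr + base_rate * (tpr - fpr)

base_rates = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
pps = [round(proportion_positive(br, tpr=0.8, fpr=0.1), 2) for br in base_rates]
print(pps)
# prints: [0.1, 0.24, 0.38, 0.52, 0.66, 0.8]
```

Read in the other direction, the intercept of this line estimates FPR and its value at base rate 1 estimates TPR, which is the idea behind mixed groups validation.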
73
When BR = 0, 10% of scores are positive, all false
positives
74
When we say FPR = .16 and TPR = .50, what
we're saying is that, no matter what samples we
test, we expect to see no fewer than 16% positive
scores and no more than 50% positive scores.
Movement along this line from left to
right represents increasing rate of Population A
and increasing rate of positive scores.
75
FPR = .10, TPR = .80
77
Table for Computation of Test Characteristics
Base Rate of Malingering     0      .2     .4     .6     .8     1
Proportion Tests Positive           .24    .38    .52    .66

81
TOMM (no simulation studies): FPR = .056, SE =
.025; TPR = .742, SE = .093
82
For any imperfect test, PPP ranges from 0 to
1 as base rate ranges from 0 to 1. NPP ranges
from 0 to 1 as base rate ranges from 1 to 0.
NPP
PPP
83
Using MGV to estimate test diagnostic
efficiencies of the Reliable Digit Span
  • Laurie Ragatz, PhD
  • Richard Frederick, PhD

84
What is Reliable Digit Span?
  • RDS is a symptom validity measure for Digit Span.
    The value of RDS is derived by adding the longest
    strings for which both trials were passed on
    forward and backward Digit Span.
  • Researched cut scores include 5 or lower, 6 or
    lower, 7 or lower, or 8 or lower.

85
Reliable Digit Span Example
Forward Digit Span
  • Directions: Examinee recalls numbers in the same
    order they were provided by the examiner
Item          Response
1 4           Correct / Incorrect
2 5           Correct / Incorrect
5 7 1         Correct / Incorrect
8 3 4         Correct / Incorrect
5 9 4 6       Correct / Incorrect
7 2 3 9       Correct / Incorrect
Backward Digit Span
  • Directions: Examinee recalls numbers in the
    reverse order they were provided by the examiner
Example       Correct Answer     Response
1 2           2 1                Correct / Incorrect
7 4           4 7                Correct / Incorrect
5 3 9         9 3 5              Correct / Incorrect
8 2 4         4 2 8              Correct / Incorrect

Reliable Digit Span = 4 + 3 = 7
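A minimal sketch of the scoring rule follows, assuming RDS is the longest forward string length at which both trials were passed plus the same quantity for backward span. The data layout and function names are illustrative. The trial outcomes are hypothetical, since the slide does not show which responses were marked correct; they are chosen only to reproduce 4 + 3 = 7.

```python
def longest_span_both_passed(trials):
    """trials maps string length -> (first trial passed, second trial passed).

    Returns the longest length at which both trials were passed, or 0 if none.
    """
    passed = [length for length, (first, second) in trials.items() if first and second]
    return max(passed, default=0)

def reliable_digit_span(forward, backward):
    """RDS = longest reliable forward span + longest reliable backward span."""
    return longest_span_both_passed(forward) + longest_span_both_passed(backward)

# Hypothetical outcomes: forward reliable through 4 digits, backward through 3.
forward = {2: (True, True), 3: (True, True), 4: (True, True), 5: (True, False)}
backward = {2: (True, True), 3: (True, True), 4: (False, False)}

rds = reliable_digit_span(forward, backward)
print(rds)  # prints: 7
```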
86
(1) We found all available articles dealing with
RDS and identified the cut scores investigated.
We included simulator studies.
(2) Based on the authors' decisions about criterion
group membership, we calculated the overall base
rate of malingering in the study.
(3) We observed the overall rate of positive scores
in the study at the identified cut score.
87
(4) We did not include any data for persons with
mental retardation. The rate of positive scores
among persons with mental retardation was
exceedingly high for all cut scores.
88
Example: Smith (2010) reported 203 TOMMs at cut < 45.
                       Criterion group
Test outcome           Is malingering   Is not malingering   Total
Test score positive          42                15              57
Test score negative          21               125             146
Total                        63               140             203
We have 63 malingerers in a sample of 203. BR =
63/203 = 0.31. We have 57 positive scores.
Proportion positive scores (PPS) is 57/203 = .28.
For this study, we plot (BR, PPS) = (.31, .28):
x = .31, y = .28. Our n for WLS = 203.
89
RDS 5 or lower
Using weighted least squares regression (with N
as the weight), we regressed Proportion Positive
Scores (PPS) on Base Rate (BR) to generate the
Proportion Positive Score Line. We obtained
y-intercept of -.015 (all negative values are
truncated to 0), and slope of .265.
90
RDS 5 or lower
Study N BR PPS
1 96 0.49 0.052
2 54 0.444 0.093
3 157 0.223 0.057
4 60 0.333 0.083
5 65 0.554 0.215
6 62 0.532 0.113
7 133 0.113 0.008
scatterplot
put these data in WLS to obtain regression line
characteristics
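The regression step can be sketched without any statistics package. The closed-form weighted least squares below uses each study's N as its weight and the seven (BR, PPS) pairs from the table. Treating the intercept (truncated at 0) as the FPR estimate and the line's value at BR = 1 as the TPR estimate is this sketch's reading of the method. Standard errors are not computed here.

```python
def weighted_least_squares(x, y, w):
    """Fit y = intercept + slope * x, minimizing the sum of w * residual**2."""
    total_w = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope

# RDS 5 or lower: study N, base rate (BR), proportion positive scores (PPS).
n = [96, 54, 157, 60, 65, 62, 133]
br = [0.49, 0.444, 0.223, 0.333, 0.554, 0.532, 0.113]
pps = [0.052, 0.093, 0.057, 0.083, 0.215, 0.113, 0.008]

intercept, slope = weighted_least_squares(br, pps, n)
fpr_estimate = max(0.0, intercept)  # negative intercepts are truncated to 0
tpr_estimate = intercept + slope    # height of the line at BR = 1
print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
print(f"FPR = {fpr_estimate:.3f}, TPR = {tpr_estimate:.3f}")
```

On these data the fit lands close to the reported y-intercept of -.015 and slope of .265.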
91
RDS 5 or lower, FPR = 0, TPR = .265
92
RDS 6 or lower
y-intercept = .015, slope = .419
93
RDS 6 or lower, FPR = .015, TPR = .434
94
RDS 7 or lower
y-intercept = .187, slope = .39
95
RDS 7 or lower, FPR = .187, TPR = .618
96
RDS 8 or lower
y-intercept = .236, slope = .824
97
RDS 8 or lower, FPR = .236, TPR = .824
98
As we move from a cut score of 5 or lower to 6
or lower, we obtain substantial improvement in
TPR estimate with little cost in FPR increase.
99
Our choice for best cut score for RDS: RDS 6 or
lower, FPR = .015, TPR = .434
100
Cut score      FPR (SE)        TPR (SE)
5 or lower     0 (.038)        .25 (.07)
6 or lower     .015 (.053)     .434 (.082)
7 or lower     .187 (.102)     .618 (.155)
8 or lower     .236 (.112)     .824 (.190)
By using WLS regression, we can obtain standard
errors of our estimates of FPR and TPR. So, new
researchers can test hypotheses about parametric
values of FPR and TPR.
101
Overcoming limiting factors of known groups
validation in determining test score sensitivity
and specificity
102
Summary
  • The Test Validation Summary (TVS) and Mixed Groups Validation (MGV)
    allow powerful research into existing published data sets. Summary
    data are used.
  • Understanding of parametric values of TPR and FPR is facilitated when
    researchers publish results on a variety of cut scores that should be
    considered. A frequency distribution would be ideal, for example:

RDS   n      RDS   n      RDS   n
0     5      5     7      10    88
1     0      6     51     11    74
2     0      7     68     12    61
3     1      8     79     13    32
4     3      9     98     14    12
103
  • Combining studies in this way allows us to generate stable values of
    TPR and FPR with SEs so that new research can test those values.
  • Researchers should focus on the basis for estimating BRs in their
    research groups. All research estimating FPR and TPR is vulnerable to
    error when the purity of research groups is overestimated. Working
    towards a reliable estimate of mixed group base rate will facilitate
    better validation studies.

104
Reliably estimate local base rates of feigning
for proper allocation of sensitivity and
specificity information
105
How can the Test Validation Summary help me determine my local BR?
  • 1. Get the best estimate of the test FPR and TPR for a certain test
    score.
  • 2. Find the proportion of test scores in your sample that are positive
    scores.

108
From a sample, observe rate of positive
scores. Use TVS to estimate condition BR in that
sample, PPP and NPP for that BR.
527 criminal defendants who took RMT and
VIP concurrently. Rate of positive scores in
this sample was .113. PPP = .814, 1 - NPP = .077
109
TOMM (no simulation studies): FPR = .056, SE =
.025; TPR = .742, SE = .093
110
Beth A. Caillouet, Bernice A. Marcopulos, Jesse
G. Brand, Julie Ann Kent, Richard I. Frederick
Question: What are the BRs of malingering in the
two samples?
111
Question: What are the BRs of malingering in the
two samples?
Information needed: Estimates of TOMM FPR and
TPR. From TOMM TVS, we get FPR = .056, TPR =
.742.
Sample 1: Secondary gain present.
Proportion positive scores = 55/220 = .25.
Sample 2: Secondary gain absent.
Proportion positive scores = 34/299 = .11.
Use TOMM TVS to estimate BR of each sample.
112
When PPS = .25, BR = .28.
When PPS = .11, BR = .08.
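Reading the base rate off the Test Validation Summary amounts to inverting the straight line that runs from FPR to TPR. A minimal sketch follows, with an illustrative function name. Clamping the result to the range 0 to 1 is an assumption of this sketch, made so that sampling noise cannot produce an impossible base rate.

```python
def estimate_base_rate(pps, tpr, fpr):
    """Estimate the base rate from the observed proportion of positive scores.

    Inverts PPS = FPR + BR * (TPR - FPR); the result is clamped to [0, 1].
    """
    base_rate = (pps - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, base_rate))

TOMM_FPR, TOMM_TPR = 0.056, 0.742  # TOMM estimates quoted on the slides

br_gain = estimate_base_rate(55 / 220, TOMM_TPR, TOMM_FPR)     # secondary gain present
br_no_gain = estimate_base_rate(34 / 299, TOMM_TPR, TOMM_FPR)  # secondary gain absent
print(f"{br_gain:.2f} {br_no_gain:.2f}")
# prints: 0.28 0.08
```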
113
Defensibly choose symptom validity cut scores
that are ideally suited for their local base rates
114
M-FAST
117
                 Malingering   Genuine    Total
M-FAST > 5
M-FAST < 6
Total                                      86
TPR = .93, FPR = .17, BR of malingering = 35%, N = 86
118
                 Malingering   Genuine    Total
M-FAST > 5
M-FAST < 6
Total            .35(86)       .65(86)     86
TPR = .93, FPR = .17, BR of malingering = 35%, N = 86
119
                 Malingering   Genuine    Total
M-FAST > 5       .93(30)       .17(56)
M-FAST < 6
Total            30            56          86
TPR = .93, FPR = .17, BR of malingering = 35%, N = 86
120
                 Malingering   Genuine    Total
M-FAST > 5       28            9.52
M-FAST < 6       2             46.48
Total            30            56          86
TPR = .93, FPR = .17, BR of malingering = 35%, N = 86
121
                 Malingering   Genuine    Total
M-FAST > 5       28            10
M-FAST < 6       2             46
Total            30            56          86
TPR = .93, FPR = 10/56 = .18, BR of malingering =
35%, N = 86
122
                 Malingering   Genuine    Total
M-FAST > 5       28            10          38
M-FAST < 6       2             46          48
Total            30            56          86
TPR = 28/30 = .93, FPR = 10/56 = .18, PPP = 28/38
= .737, NPP = .958, BR of malingering = .35
NPP PPP 1 FPR TPR
123
                 Malingering   Genuine    Total
M-FAST > 5       28            103         131
M-FAST < 6       2             467         469
Total            30            570         600
TPR = 28/30 = .93, FPR = 10/56 = .18, PPP = 28/131
= .213, NPP = .996, BR of malingering = .05
NPP PPP 1 FPR TPR
124
Test validation summary for M-FAST cut score
recommended by test manual.
PPP does not even reach 50% correct decisions
until BR > .16
M-FAST > 5: FPR = .17, TPR = .93
At recommended cut score, FPR is very high
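The base rate at which PPP first reaches a target can also be found by solving the PPP formula for BR instead of reading it off the graph. The sketch below is a rearrangement of Bayes' rule with an illustrative function name, not a procedure taken from the slides.

```python
def min_base_rate_for_ppp(target_ppp, tpr, fpr):
    """Smallest base rate at which PPP reaches target_ppp.

    Solves BR*TPR / (BR*TPR + (1 - BR)*FPR) = target_ppp for BR.
    """
    numerator = target_ppp * fpr
    return numerator / (tpr * (1.0 - target_ppp) + numerator)

# M-FAST > 5 with the values on the slide.
br_needed = min_base_rate_for_ppp(0.50, tpr=0.93, fpr=0.17)
print(f"{br_needed:.3f}")  # about .155
```

The result is roughly consistent with the slide's reading that PPP stays below 50% until the base rate is around .16.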
125
At BR = .05, PPP does not exceed .50 until the cut
score is adjusted to > 9 on M-FAST
126
Combining information from local base rate and
multiple symptom validity tests
You can get estimates of PPP and NPP for the
sample you work with, IF you can reliably
estimate the BR.
127
737 defendants were administered:
  • Rey 15 Item Memory Test (RMT): memorize and reproduce 15 items; a
    very easy test. Score is items reproduced (0 to 15).
  • Word Recognition Test (WRT): memorize 15 words, identify those 15 and
    correctly reject 15 from a list of 30. Score is number of hits and
    correct rejections (0 to 30).
128
RMT validation using MGV with clinical probability
judgments. FPR = .025, TPR = .574 (Frederick &
Bowden, 2009)
129
RMT < 9: FPR = .025, TPR = .574
We found 726 defendants who completed BOTH RMT
and WRT. 81/726 failed the RMT = .111 proportion
positive score. By observation of TVS, then BR =
.16, PPP = .814, NPP = .923
130
From a sample, observe rate of positive
scores. Use TVS to estimate condition BR in that
sample, PPP and NPP for that BR.
527 criminal defendants who took RMT and
VIP concurrently. Rate of positive scores in
this sample was .113. PPP = .814, 1 - NPP = .077
131
We found 726 defendants who completed BOTH RMT
and WRT. 81/726 failed the RMT = .111 proportion
positive score. By observation of TVS, then BR =
.16, PPP = .814, NPP = .923. If PPP = .814, then
in this sample, the probability of feigning if
RMT is positive is .814. If NPP = .923, then
in this sample, the probability of feigning if
RMT is negative is .077, or 1 - .923.
To conduct MGV, we sampled from two groups:
  • The 645 individuals who passed the RMT (had a
    negative score).
  • The 81 individuals who failed the RMT (had a
    positive score).

132
Example of sampling
645 individuals with negative scores, p(mal) =
.077
81 individuals with positive scores, p(mal) = .814
Sample n = 360
Sample n = 40
400 cases, 10% failures, 90% passes
Overall p(mal) = (40 * .814 + 360 * .077) / 400 = .151
Sample 25 times, plot x = .151, y = observed rate
of positive WRT scores, n for WLS = 400
133
Group   Ratio   Failures   Passes   N     BR       Samples
1       0       0          645      645   0.077    1
2       0.1     40         360      400   0.1507   25
3       0.2     40         160      200   0.2244   25
4       0.3     40         93       133   0.2981   25
5       0.4     40         60       100   0.3718   25
6       0.5     40         40       80    0.4455   25
7       0.6     40         27       67    0.5192   25
8       0.7     40         17       57    0.5929   25
9       0.8     40         10       50    0.6666   25
10      0.9     40         4        44    0.7403   25
11      1       81         0        81    0.814    1
For each sample, BR was pre-estimated. Then we
observed rate of positive WRT scores at each
potential cut score.
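The BR column of this table can be reproduced as a weighted average of the two probabilities of malingering, p(mal) = .814 for RMT failures and .077 for RMT passes, using the intended ratio of failures as the weight. The sketch below is illustrative; the function and constant names are not from the source.

```python
P_MAL_IF_FAIL = 0.814  # PPP for a positive RMT score
P_MAL_IF_PASS = 0.077  # 1 - NPP for a negative RMT score

def mixed_group_base_rate(failure_ratio):
    """Average probability of malingering in a group with this share of RMT failures."""
    return failure_ratio * P_MAL_IF_FAIL + (1.0 - failure_ratio) * P_MAL_IF_PASS

for ratio in [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]:
    print(f"ratio {ratio:.1f}: BR = {mixed_group_base_rate(ratio):.4f}")
```

Each mixed group then supplies one (BR, proportion positive) point for the weighted regression at every WRT cut score.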
134
Word Recognition Test (WRT): Range 4 to 30, Mean =
23.2. Within group of RMT < 9, mean =
18.7. Within group of RMT > 8, mean = 23.8
135
Word Recognition Test (WRT): For every potential
cut score of WRT (4 - 30), we plotted all x, y
pairs obtained from sampling. We performed WLS
to obtain the FPR and TPR estimates at every
potential cut score. We plotted the FPR and TPR
estimates at every potential cut score to
generate the ROC curve. AUC = .905, SE = .012,
95% CI for AUC = .881-.930. Best cut scores:
LTE 18 (TPR = .563, FPR = .034), LTE 19 (TPR =
.620, FPR = .066)
137
We plotted the FPR and TPR estimates at every
potential cut score to generate the ROC
curve. AUC = .905, SE = .012, 95% CI for AUC =
.881-.930. Best cut scores: LTE 18 (TPR =
.563, FPR = .034), LTE 19 (TPR = .620, FPR =
.066)
ROC curve for the WORD RECOGNITION TEST (WRT): TPR plotted against FPR
138
Summary
  • We can use tests to form mixed groups for validation.
  • The best estimates of FPR and TPR for a test cut score allow us to
    estimate PPP and NPP at our sample BR.
  • Unlike in a known groups design (which is misleading), we do not
    presume to know (or care) about the status of any individual. We
    assign individuals probabilities of having the condition based on
    their test score.
  • Mixed groups have an overall probability of having the condition,
    which is the average of the individual probabilities.
  • We do not need to be certain about group memberships.
  • We gain much flexibility by working with probabilities of having the
    condition vs. certainties of having the condition.

139
Another example
140
Dawes (1967) showed that valid probability
judgments are excellent base rate indicators.
His work was substantiated in Frederick (2000) and
Frederick and Bowden (2009). To conduct MGV, we
formed groups of defendants for whom individual
ratings of likelihood of malingering psychosis
were generated by forensic psychologists, before
any testing took place. The BR of malingered
psychosis for each group was then the mean of the
probability ratings. If each member of the group
had been rated as 10% likely to feign psychosis,
then the BR of the group was estimated to be 10%.
141
We then observed the hit rate (proportion
positive scores) for the groups for a variety of
F-family indicators of feigning on the MMPI-2
and MMPI-2-RF. We formed 15 groups of 30
individuals. For each group, we had a static
base rate, which was the mean of the probability
judgments assigned before testing. Within each
group, we iteratively observed the hit rate of
positive F-family indicators at each potential
cut score. Using the BR estimate and the
proportion positive scores at each potential
cut score, we performed WLS to generate estimates
of FPR and TPR. From these estimates, we
generated ROC curves.
142
15 groups, 30 defendants in each group, 450
defendants. Each defendant was rated from 0 to 100
before testing, with respect to likelihood he
would feign psychosis. Groups were formed after
first sorting individuals by ratings, from lowest
to highest. Mean ratings of groups (each group,
n = 30): 0, 0, 1.2, 4.2, 5.0, 5.0, 5.0, 5.0, 8.1,
10, 14.5, 22.2, 30.3, 45.7, 72.3. Rates of positive
F-family scores at each potential cut were observed.
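The group-forming step can be sketched as sort, split, and average. The function name is illustrative, and the six ratings are made-up values used only to show the mechanics; the study itself used 450 ratings in groups of 30.

```python
def grouped_base_rates(ratings, group_size):
    """Sort 0-100 ratings, split them into consecutive groups, and return
    each group's mean rating as a proportion (the group's estimated base rate)."""
    ordered = sorted(ratings)
    groups = [ordered[i:i + group_size] for i in range(0, len(ordered), group_size)]
    return [sum(group) / len(group) / 100.0 for group in groups]

# Hypothetical ratings for six defendants, formed into groups of three.
example_ratings = [0, 10, 5, 80, 0, 60]
group_brs = grouped_base_rates(example_ratings, group_size=3)
print([round(b, 3) for b in group_brs])
```

Because the ratings are sorted first, the groups span a wide range of base rates, which is what gives the regression of positive-score rate on base rate its leverage.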
143
Scale               AUC     SE      95% CI
F                   .904    .015    .874-.933
Fp                  .870    .018    .834-.906
Fp (no L items)     .905    .015    .877-.934
F-r                 .940    .011    .919-.962
Fp-r                .926    .013    .901-.950
Estimates by Nicholson, Mouton, Bagby, Buis,
Peterson, and Buigas (1998), AUCs and SE:
F (.929, .021), Fp (.885, .027)
144
Scale               Cutoff     FPR     TPR
F                   GTE 28     .043    .635
Fp                  GTE 8      .054    .484
Fp (no L items)     GTE 7      .055    .537
F-r                 GTE 20     .050    .640
Fp-r                GTE 8      .055    .652
145
Summary
  • Using the estimates of likelihood of feigning based only on clinician
    judgment prior to testing did not result in random results. We can
    assume that mean probability judgments were effective base rate
    estimates.
  • Our estimates of F and Fp are consistent with estimates in large,
    well-validated analysis.
  • In this study, MMPI-2-RF indicators have higher mean AUC and lower SE
    than their MMPI-2 counterparts.

146
Scale Cutoff FPR TPR
F GTE 28 .043 .635
Combine information about F with the SIRS-2
147
131 defendants who took MMPI and SIRS
F        Frequency   Percent   Valid Percent   Cumulative Percent
4        4           2.7       3.1             3.1
6        1           .7        .8              3.8
9        1           .7        .8              4.6
10       2           1.3       1.5             6.1
11       3           2.0       2.3             8.4
12       2           1.3       1.5             9.9
13       3           2.0       2.3             12.2
14       4           2.7       3.1             15.3
15       2           1.3       1.5             16.8
16       3           2.0       2.3             19.1
17       3           2.0       2.3             21.4
18       2           1.3       1.5             22.9
19       4           2.7       3.1             26.0
20       5           3.4       3.8             29.8
21       2           1.3       1.5             31.3
22       7           4.7       5.3             36.6
23       3           2.0       2.3             38.9
24       1           .7        .8              39.7
25       5           3.4       3.8             43.5
26       6           4.0       4.6             48.1
27       6           4.0       4.6             52.7
28       7           4.7       5.3             58.0
29       2           1.3       1.5             59.5
30       2           1.3       1.5             61.1
31       6           4.0       4.6             65.6
32       5           3.4       3.8             69.5
33       3           2.0       2.3             71.8
34       4           2.7       3.1             74.8
35       4           2.7       3.1             77.9
36       3           2.0       2.3             80.2
37       7           4.7       5.3             85.5
38       4           2.7       3.1             88.5
39       2           1.3       1.5             90.1
40       3           2.0       2.3             92.4
41       3           2.0       2.3             94.7
43       1           .7        .8              95.4
45       1           .7        .8              96.2
46       1           .7        .8              96.9
47       1           .7        .8              97.7
48       2           1.3       1.5             99.2
52       1           .7        .8              100.0
Total    131         87.9      100.0
Missing  18          12.1
Total    149         100.0
148
52.7% of cases are 27 or lower. 47.3% of cases
are 28 or higher. What is the base rate of
feigned psychopathology?
Scale    Cutoff    FPR     TPR
F        GTE 28    .043    .635
150
BR TPR FPR NPP PPP
What we say: Within our sample of 131
defendants, the BR of feigned psychopathology is
.73 (NOT .475). At BR = .73, the PPP of F GTE 28
is .976. At BR = .73, the NPP of F GTE 28 is .492,
so p(feigning if LTE 27) is still .508.
(Remember, they're being given the SIRS for a
reason.)
151
F < 28
NPP about .66
152
F > 27
153
Application of MGV to a CGV estimation of FPR
and TPR
154
Greve, Bianchini, Love, Brennan, & Heinly (2006)
articulated six separate groups with increasing
base rate of malingering based on formal criteria
for malingering (the Slick criteria) to validate
the MMPI-2 Fake Bad Scale
  • No incentive (no evidence of external incentive and no test
    performance suggestive of malingering; n = 18, mean FBS = 15.4)
  • Incentive (external incentive, but no test performance suggestive of
    malingering; n = 79, mean FBS = 19.5)
  • Suspect (external incentive and at least one indicator suggestive of
    malingering; n = 66, mean FBS = 22.7)
  • Statistically Likely (external incentive and at least two indicators
    suggestive of malingering; n = 51, mean FBS = 22.8)
  • Probable (external incentive and strong indicators of malingering;
    n = 31, mean FBS = 26.9)
  • Definite (external incentive and very strong indicators of
    malingering; n = 14, mean FBS = 29.8)

155
Even though it is clear that
  • BR Definite > BR Probable > BR Statistically Likely > BR Suspect >
    BR Incentive Only > BR No Incentive
they were required, to conduct Known groups validation, to ignore this
obvious circumstance and to define
  • BR No Incentive = BR Incentive Only = 0
  • BR Statistically Likely = BR Probable = BR Definite = 1.0
and drop all participants defined as Suspect, to yield the following ROC

156
FBS ROC generated by Known groups validation
by Greve & Bianchini

157
If we had estimates of the BR for each of the
subgroups formed by Greve and Bianchini, we
could use MGV to estimate FPR and TPR for each
potential cut score.
158
We have our stable estimate of TOMM FPR and TPR
159
TOMM (no simulation studies): FPR = .056, SE =
.025; TPR = .742, SE = .093
160
We can get estimates of BRs for those groups
from other work by Greve & Bianchini. They
formed similar groups using the Slick criteria
to investigate the TOMM. We can use the
proportion of positive TOMMs in each of these
subgroups to estimate the BRs in each of them.
161
From Greve, Bianchini, & Doane (2006)
            Proportion Positive TOMM Scores
No Inc      0%
Inc Only    5%
Suspect     20%
Probable    47%
Definite    78%
162
            Proportion Positive TOMM Scores    Est BR of malingering
No Inc      0                                  0
Inc Only    .05                                0
Suspect     .20                                .21
Probable    .47                                .633
Definite    .78                                1
163
We take these BR estimates and reapply them to
the Greve & Bianchini FBS data.
164
Example of MGV for FBS based on BR estimates for
Greve & Bianchini groups established by Slick
criteria
At FBS > 27
Base Rate of Malingering     0      0      .21    .633   1
Proportion Tests Positive    .11    .09    .23    .52    .79
n                            18     79     66     31     14
For FBS > 27, using WLS Regression, FPR = .091,
TPR = .773
(For WLS, n is the weighting variable)
166
Evaluate constructs that underlie symptom
validity tests
167
10 clinical studies using Rey 15-Item Test. No
simulators. All clinical data.
168
RMT validation using MGV with clinical probability
judgments. FPR = .025, TPR = .574 (Frederick &
Bowden, 2009)
170
CI, TPR = .574, SE = .044. We will generate a
TVS based on these values and find PPP and 1 -
NPP to estimate the probability of bad
intent represented by the RMT score.
172
Validity Indicator Profile
174
VIP Verbal Subtest Items
  • Easy Baby Drink Infant
  • Moderate People Ally Folk
  • Difficult Nimiety Conceit Surfeit

175
Suppression
176
Not guessing, knowledgeable responding
Guessing
Guessing is imminent
Easy items
Difficult items
177
Inconsistent curves
178
527 criminal defendants who took RMT and
VIP concurrently. Rate of positive scores in
this sample was .113. PPP = .814, 1 - NPP = .077
179
Here we are matching VIP categories to the
construct most likely captured by the
VIP. Points in the scatterplot represent groups of
25 individuals. We sorted defendants by clinical
ratings of malingering, then took 20 groups of
25 and one group of 27, for 527 defendants.
The BR of .42 estimated for this group is the mean
of PPP for positive RMT scores in this group and
(1 - NPP) for negative RMT scores in the group.
180
Same 21 subgroups, N = 527 defendants
181
527 criminal defendants. VRIN was converted to
probability of invalid responding by dividing
VRIN raw score by 12. VRIN raw scores > 12 were
assigned p = 1. We are interested in FPR and
TPR for Invalid