Title: Personality
1. Personality
- First text published by G. Allport in the 1930s
- First theoretical models date back to William James and Sigmund Freud (both late 1800s)
- Considerable variability in explanations for personality (biological/genetic models, psychodynamic, self-theory, etc.)
- Common Elements
- Stability (situations and time) differentiated from mood
- Research on stability over the lifespan (greater as we age)
- Affective, cognitive and behavioral components
2. Personality assessment
- Recent survey of practicing Ph.D.s, Psy.D.s, and Ed.D.s revealed that only 32% use personality tests and only 43% do treatment planning.
- De-emphasis in personality training occurred at the same time as the "Mischel shock" in 1968, so clinicians trained in the late 1960s and 1970s did not value personality assessment
- Today, treatment planning based on assessments is essential from both an ethical standpoint and for insurance reimbursement
3. Objective assessments?
- How can personality assessment be more objective?
- assess any biases and correct for them (lie, defensiveness)
- find a method to avoid such biases
- look for convergence with reports from others
- assess with low face valid instruments and look for consistent patterns (though this only really addresses intentional faking)
- Personality assessment is used to further describe the client, just as a diagnosis does (note that you would not say that depression is causing the patient's behaviors; you merely use the term to summarize a cluster of behaviors. The diagnosis itself also does not necessarily imply a causal mechanism or an explanation; those from different perspectives would define it differently)
- e.g., if someone is depressed it could be explained biologically, cognitively, behaviorally, or even in psychodynamic terms
4. The structure of personality
- Personality involves stable patterns of behavior, affect, and cognitions. So how stable is stable? (states vs. traits)
- Levels of analysis
- 1. factors - groups of traits that show better global predictive utility (e.g., Big 5 of N, E, O, A, C; the Big 3 of N, E, P; Big 2)
- 2. traits - clusters of consistent individual behaviors
- 3. habits - consistent (over time) individual behaviors
- 4. single acts - individual behaviors
- All levels are used to predict future behavior with the top being the most robust
- Consider this model when recommending or implementing change in clients
5. Predicting behavior
- Difficult to predict specific single behaviors from global trends (Epstein, 1983)
- For clinical evaluations, if the context of interest is known, then you may want to trade off the generalizability and give a specific prediction
- e.g., Pt.'s test scores indicate that he is generally impulsive. This may be exacerbated when in the company of other individuals who are also impulsive and when the individual is drinking, as alcohol minimizes any inhibition processes that he might have. This substantially increases the likelihood that he will act impulsively when...
6. Two key discussions
- Read material in advance and know your MMPI
- Scheduled discussion
- Should we use projective tests?
- Are they tests or techniques?
7. Assessing Axis I and II
- Personality addresses both AXIS I and AXIS II disorders.
- What are some AXIS I disorders that might be related to personality traits? e.g.,
- depression and NA/neuroticism
- anxiety and NA/neuroticism
- impulse control disorders and extraversion/sensation seeking
- AXIS II personality disorders explicitly link up with personality assessments (video: DSM-IV)
- Cluster A (odd): Paranoid, Schizoid, Schizotypal
- Cluster B (emotional): ASPD, Borderline, Histrionic, Narcissistic
- Cluster C (anxious): Avoidant, Dependent, Obsessive-Compulsive
- PD NOS: features of several Dx, but does not meet criteria for any one.
8. Selecting a test battery (see Beutler, 1995)
- What is the referral question?
- Single most important determinant
- Are there any limiting factors with regard to the client?
- Context of the evaluation? (work, school, hospital, etc.)
- Follow up assessment relevant to trait findings (e.g., patients who show impulse control problems should also be assessed for potential for acting out violently)
- Problem focused or broad, multipurpose battery
- Nomothetic (allows for normative evaluations) or ipsative (allows for the evaluation of the individual) analysis
9. If using qualitative methods, consider
- 1. Method appropriateness: are there quantitative methods that you could use instead?
- 2. Openness: make clear the theoretical orientation that undergirds the qualitative assessment
- 3. Theoretical sensitivity: use qualitative methods that are based on accepted theories, not your own theories
- 4. Bracketing of expectation: you must explicitly state where your conclusions depart from accepted theories
- 5. Responsibility: how were the qualitative methods administered and interpreted?
- 6. Saturation/generalizability: when assessing traits, sample from a large number and wide range of situations
- 7. Verification of methods: cross-validate your methods using other reports and other test material to see if they agree with your conclusions, whether findings predict outcomes, etc.
10. If using qualitative methods, consider (cont)
- 8. Grounding: stay close to the data when making interpretations (no big theoretical leaps)
- 9. Coherence: do all of the interpretations fit together to make a coherent story?
- 10. Believability/usefulness: does the use of the qualitative method provide more info on the client, or just raise more questions? Does it result in a believable narrative?
- 11. Intelligibility: Is the report readable and jargon free?
11. MMPI (Hathaway & McKinley, 1943)
- 10 clinical scales and 3 validity scales
- Empirical scale development with items selected based on their ability to differentiate normals from a target group (another clinical group with similar symptoms was sometimes also employed)
- Clients should be 18 or older with a 6th grade education
- Generally lower face validity (breaks with tradition of items that clearly sample the domain of interest); most relevant for clinical population
12. MMPI development
- Item pool derived from psychological and psychiatric reports, textbooks, previous scales, etc.
- Criterion group composition
- Minnesota normals: 724 relatives and visitors of patients at the U. of M. Hospitals, 265 recent high school grads, 265 administration workers, and 254 medical patients
- Clinical groups: 221 patients representing the major psychiatric categories (excludes those with multiple diagnoses, or questionable diagnoses)
- Item analysis to identify those items differentiating the clinical and normal groups
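
The item analysis step can be illustrated with a minimal sketch of empirical criterion keying. The item labels, the responses, and the 0.20 difference threshold below are all hypothetical and chosen only for illustration; the statistical criteria actually used in developing the MMPI are not given in these notes.

```python
def endorsement_rate(responses):
    """Proportion of True answers in a list of booleans."""
    return sum(responses) / len(responses)


def select_items(normal, clinical, min_diff=0.20):
    """Return items whose 'True' rate differs between groups by at least min_diff.

    normal / clinical: dict mapping item label -> list of True/False answers.
    The keyed direction is 'T' if the clinical group endorses the item more
    often than the normal group, otherwise 'F'.
    """
    keyed = {}
    for item in normal:
        diff = endorsement_rate(clinical[item]) - endorsement_rate(normal[item])
        if abs(diff) >= min_diff:
            keyed[item] = "T" if diff > 0 else "F"
    return keyed


# Hypothetical data: three items answered by five people per group.
normal = {
    "item_a": [False, False, True, False, False],   # 20% True
    "item_b": [True, True, False, True, True],      # 80% True
    "item_c": [True, False, True, False, True],     # 60% True
}
clinical = {
    "item_a": [True, True, True, False, True],      # 80% True
    "item_b": [False, True, False, False, False],   # 20% True
    "item_c": [True, False, True, True, False],     # 60% True
}

print(select_items(normal, clinical))  # item_a keyed 'T', item_b keyed 'F'
```

Note that nothing in the selection looks at item content, which is why empirically keyed scales can end up with low face validity.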
13. MMPI development cont.
- The items that could differentiate were then cross validated with new groups of normals and patients
- Later developed two non-clinical scales
- M/F: initially to identify male homosexuals, was augmented with broader items
- Si: derived from an introversion/extraversion scale and cross validated by predicting involvement in college activities in a second sample (all female college students)
- Validity scales were either derived rationally (L & K) or from base rates in the normal group (F)
14. Utility of the MMPI
- Not considered a diagnostic inventory (as was originally intended)
- Ineffective at differential diagnosis (based on how it was originally developed)
- Numerical scale labels were intended to further minimize the connection with a specific diagnostic label
15. Some problems with MMPI
- Method of determining the criterion group
- The PIGs were not a truly random group (relatives and friends of those in the hospital, though largely the medical patients): a convenience sample
- Criterion and PIGs were largely from the Midwest, in the late 1930s/early 1940s
- Utility of some of the scales, as they matched diagnostic concerns of that era; dated and culture-specific item content; and representativeness of the norm group.
16. MMPI vs. MMPI-2 (1989)
- MMPI was the most widely used personality test in all pops (though only validated for inpatient adult samples)
- MMPI validation and norm samples were ones of convenience with limited variability on education (M = 8 years), coming from a rural background in the Midwest
- Normative data collected in the 1930s
- Clinical cut-off now defined by T-score of 65 vs. 70 on the MMPI
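
As a rough illustration of what the two cut-offs mean, a linear T-score rescales a raw score to a mean of 50 and a standard deviation of 10 in the norm group. The sketch below assumes normally distributed scores (real scale distributions can be skewed, so the percentiles are only approximate), and the raw mean and SD are hypothetical.

```python
from statistics import NormalDist


def linear_t(raw, norm_mean, norm_sd):
    """Linear T-score: mean 50, SD 10 in the normative sample."""
    return 50 + 10 * (raw - norm_mean) / norm_sd


def percentile_of_t(t):
    """Percentile rank of a T-score under a normal-distribution assumption."""
    return 100 * NormalDist(mu=50, sigma=10).cdf(t)


# Hypothetical scale with a normative raw mean of 12 and SD of 4.
print(linear_t(18, norm_mean=12, norm_sd=4))   # 65.0
print(round(percentile_of_t(65), 1))           # 93.3
print(round(percentile_of_t(70), 1))           # 97.7
```

Under these assumptions, moving the cut-off from 70 to 65 flags roughly the top 7% of the norm group rather than the top 2%.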
17. MMPI vs. MMPI-2
- Advantages of updating the test
- more representative norms (based on projected census data)
- relevance of the items
- language employed for the items (both temporally laden references like "drop the hanky," and gender biases in item content)
- addition of new scales of relevance today
- Uniform T-score transformation now used so that T-scores reflect percentile ranks that are the same across all clinical scales
18. MMPI vs. MMPI-2
- Disadvantages to all updates
- over 20,000 published studies no longer apply
- MMPI-2 must revalidate all of the scales
- inability to make comparisons with adolescent scores (MMPI-2 vs. MMPI-A)
- Many of the new scales are very short and lack appropriate psychometric properties
- How often should we redevelop or renorm the scale?
19. MMPI-2 (1989): 567 items
- Norm group: 2,600 community based subjects
- 1138 m & 1462 f, aged 18-85 (M = 41, SD = 15.3), education 3-20 yrs, 61% married, median incomes $25-35,000, 3% of m and 6% of f receiving mental health treatment
- 81% Caucasian, 12% A-A, 3% Hispanic, 3% Native American, 1% Asian-American
20. Validity scales
- Assumption that the clinical population will not be able to answer forthrightly
- Lie: naive or unsophisticated lying (low SES and education)
- K: less obvious (high SES and education); defensiveness is a component of all responding
- F: answering questions in such a way so as to be different from 90% or more of the population (non-normative responses). See fake bad/fake good profiles
- F - K Index can be used to indicate fake bad, with larger numbers making it more likely (little evidence to suggest that fake good can be detected); see p. 38
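
The index is simply the raw F score minus the raw K score. A minimal sketch follows; the raw scores and the cutoff are hypothetical, since these notes do not give a cutoff and any real one should come from the manual or the research literature for the setting in question.

```python
def f_minus_k(f_raw, k_raw):
    """F minus K index computed from raw F and K scores."""
    return f_raw - k_raw


def flag_possible_fake_bad(f_raw, k_raw, cutoff):
    """True when the index exceeds a caller-supplied cutoff.

    The cutoff is NOT taken from the notes; it is an input the user must
    justify for the population being assessed.
    """
    return f_minus_k(f_raw, k_raw) > cutoff


# Hypothetical raw scores and a hypothetical cutoff, for illustration only.
print(f_minus_k(22, 9))                          # 13
print(flag_possible_fake_bad(22, 9, cutoff=11))  # True
print(flag_possible_fake_bad(6, 15, cutoff=11))  # False
```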
21. Clinical Scales
- 1. Hs - exaggerated concerns re physical illness, or tendency to report symptoms
- 2. D - Clinical dep.: unhappy, pessimistic about the future
- 3. Hy - conversion reactions (substitute illness for emotions)
- 4. Pd - History of delinquency, antisocial behavior (non-conventional re moral standards)
22. Clinical scales - continued
- 5. Mf - prototypical gender identity (military recruits, stewardesses, homosexual male students)
- 6. Pa - paranoid symptoms (ideas of reference, persecution, grandeur)
- 7. Pt - anxious, obsessive-compulsive, guilt ridden, self-doubts
- 8. Sc - thought disorder, perceptual abnormalities (various types of Schiz.)
23. Clinical Scales - continued
- 9. Ma - exhibition of mania, elevated mood, excessive activity, distractibility (possible manic-depression or BP II)
- 10. Si - college students scoring in the extreme range on introversion-extraversion
- Costa & McCrae (1990) suggest that the MMPI-2 won't work in the normal pop., as people don't respond passively to items
24. New Validity Indexes
- Basic validity comes from L, F, K
- VRIN (variable response inconsistency)
- 47 pairs of items that should be answered similarly or in the opposing direction. Client gets a point for each inconsistent response.
- A completely random response set results in T scores of 96 for m and 98 for f (> 80 = invalid)
- acquiescent responding: T = 50
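
The counting rule can be sketched as below. The item numbers and pairings are hypothetical (they are not the real VRIN key), and the sketch simplifies by treating each pair as either "should match" or "should differ".

```python
def vrin_raw(answers, pairs):
    """Count inconsistent answers across item pairs.

    answers: dict mapping item number -> True/False
    pairs: list of (item_a, item_b, expect_same); expect_same is True when a
           consistent respondent should give the same answer to both items,
           False when the two answers should be opposite.
    """
    score = 0
    for a, b, expect_same in pairs:
        same = answers[a] == answers[b]
        if same != expect_same:
            score += 1  # one point per inconsistent pair
    return score


# Hypothetical pairs and answers, for illustration only.
pairs = [(1, 2, True), (3, 4, False), (5, 6, True)]
answers = {1: True, 2: False, 3: True, 4: True, 5: False, 6: False}
print(vrin_raw(answers, pairs))  # 2
```

Under this simplification, an all-true or all-false protocol is only inconsistent on the "should differ" pairs, so a low VRIN score does not by itself rule out yea-saying or nay-saying.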
25. New Validity cont.
- TRIN (true response inconsistency)
- 23 pairs of items that are opposite in content
- either T/T or F/F to assess acquiescent or non-acquiescent responding
- larger raw scores = true responding while smaller raw scores = false responding
- raw scores should be between 6 and 12 in order to consider the profile valid
- Fb - back infrequency: items for latter part
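
One way to picture the TRIN score is the sketch below. The scoring scheme is an assumption that should be checked against the manual: a point is added for each inconsistent True/True pair, a point is subtracted for each inconsistent False/False pair, and a constant of 9 keeps the score non-negative. The item pairs are hypothetical; only the 6-12 range comes from the notes.

```python
def trin_raw(answers, true_pairs, false_pairs, constant=9):
    """TRIN-style raw score (assumed scoring scheme, see lead-in).

    true_pairs: pairs where answering True to both is inconsistent (+1 each)
    false_pairs: pairs where answering False to both is inconsistent (-1 each)
    constant: offset added to the total (assumed value)
    """
    score = constant
    for a, b in true_pairs:
        if answers[a] and answers[b]:
            score += 1
    for a, b in false_pairs:
        if not answers[a] and not answers[b]:
            score -= 1
    return score


def trin_in_valid_range(raw, low=6, high=12):
    """Range check using the 6-12 guideline from the notes."""
    return low <= raw <= high


# Hypothetical key with 4 True/True pairs and 2 False/False pairs.
true_pairs = [(1, 2), (3, 4), (5, 6), (7, 8)]
false_pairs = [(9, 10), (11, 12)]

all_true = {item: True for item in range(1, 13)}
raw = trin_raw(all_true, true_pairs, false_pairs)
print(raw, trin_in_valid_range(raw))  # 13 False
```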
26. Coding the Profile
- List scale codes in order of their T-score elevations (from highest to lowest)
- usually only interpret 4 scale codes and order does not matter
- Welsh coding system involves adding symbols to numerical scale codes
- e.g., Scale: L F K 1 2 3 4 5 6 7 8 9 0
- T: 57 75 43 69 88 75 94 52 81 75 79 59 65
- Welsh: 4268371095 FLK
27. Codes (listed to the right)
- 100-109, 90-99, 80-89, 70-79, 65-69, -60-64, /50-59, .40-49, 30-39
- Some coding forms use ! to denote scores of 110-119 and !! for 120 or greater
- Underline identical T-scores (and list in ascending order) as well as those within one point of each other
- e.g., 4268371095/ FL/K.
- Code Types: 2, 3 and 4 point codes; 5 point diff between lowest code T and T of highest scale not in the code.
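
The ordering step and the 5-point rule for code types can be sketched as follows, using the T-scores from the example profile above. The function names are illustrative, and the sketch leaves out the elevation symbols and underlining.

```python
def order_scales(t_scores):
    """Scale labels from highest to lowest T-score.

    sorted() is stable, so tied scales keep the order in which they were
    listed; listing scales in ascending numerical order reproduces the
    tie rule described above.
    """
    return sorted(t_scores, key=lambda scale: -t_scores[scale])


def is_defined(t_scores, n_points, min_gap=5):
    """True if the n-point code sits at least min_gap above the next scale."""
    ordered = order_scales(t_scores)
    lowest_in_code = t_scores[ordered[n_points - 1]]
    next_highest = t_scores[ordered[n_points]]
    return lowest_in_code - next_highest >= min_gap


clinical = {"1": 69, "2": 88, "3": 75, "4": 94, "5": 52,
            "6": 81, "7": 75, "8": 79, "9": 59, "0": 65}
validity = {"L": 57, "F": 75, "K": 43}

print("".join(order_scales(clinical)), "".join(order_scales(validity)))
# 4268371095 FLK
print(is_defined(clinical, 2))  # True: 4-2 code, 88 - 81 = 7
print(is_defined(clinical, 3))  # False: 4-2-6 code, 81 - 79 = 2
```

For this profile the 4-2 code meets the 5-point rule, while adding scale 6 does not, because scale 8 is only 2 points lower.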
28. MMPI-2 practice case: M.S.
- Integrate the MMPI-2 data with the client information (vs. laundry list). Note profile valid.
- e.g., profile 3-2/2-3 should revolve around the discussion of depression and the manifestation of symptoms (physical symptoms tend to be substituted)
- How does this relate to M.S.?
- Recent loss, seeing her physician, isolation
- What does the 8 (or 2-3-8) tell you?
- How might psychotic symptoms relate to M.S.?
- Confusion from malnutrition, confusion as a result of depression, her age re dementia? All are possible
29. M.S. - continued
- Include discussion of (or section on) prognosis, recommendations, and diagnosis
- AXIS I: 296.24, Major depression, single episode, with psychotic features
- AXIS II: No diagnosis (or deferred)
- AXIS III: Malnutrition, dehydration, poor hygiene & personal care
- AXIS IV: Death of spouse (Severity: extreme; acute event)
- AXIS V: GAF current, 24; highest past year, 52
30. MMPI-2 with other pops.
- MMPI was originally developed using Caucasian groups of patients
- Although some research has shown mean score differences between majority and minority groups, this is less relevant than the issue of whether there is differential predictive validity (few studies on this)
- Hall, Bansal, & Lopez (2000) conducted a meta-analysis of 30 years of research on minority groups and the MMPI (both versions)
31. Hall et al., 2000 - summary
- AA: first note that cultural identification moderates all findings (cf. acculturation)
- Inconsistent findings re mean differences, with F, 8, 9 sometimes higher by approximately 5 T-score points
- Many matched group studies of patients have found no differences, though Ns were small (meaning what?)
- Generally no differences in predictive validity that achieve statistical or clinical significance and any differences can be attributed to SES and age
- MMPI-2 has representative norms
- Minimal information on the supplemental scales and even less for the content scales
32. Hall et al., 2000 - summary cont.
- Hispanics likewise show few differences from Caucasians
- Possible differences for scales 3 and 0, with Hispanics scoring higher on 3 and lower on 0, but these effects were small with minimal clinical or statistical sig.
- Much stronger effect for acculturation in this ethnic group
- Few studies on Native Americans, but they show this pop. to score slightly higher on most scales
- Few studies for Asian Americans, and they show slight elevations for scales F, 2, 8.
- Generally valid to use for these pops given appropriate acculturation and understanding of the language
33. Other populations
- Given its original construction, there should be no problems using the MMPI in medical settings
- Medical problems do not necessarily result in higher scores (i.e., more distress)
- In substance abuse settings, no profile emerged to detect substance abuse, but scale 4 was a good predictor (see also the supplemental scales)
- We will discuss forensic applications later in the semester (see chapter 13)
- MMPI-2 can be used in non-clinical settings to screen for psychopathology, but there are some concerns.
- False positives are more common
- Has not been validated to predict success in other settings (e.g., jobs), which is true of most personality tests (predict interest)
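
Why false positives become more common in non-clinical settings follows from base rates. The sketch below applies Bayes' rule to a hypothetical screen; the sensitivity, specificity, and base rates are invented for illustration and are not MMPI-2 figures.

```python
def positive_predictive_value(base_rate, sensitivity, specificity):
    """Probability that a person who screens positive truly has the condition."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


# Hypothetical screen with 80% sensitivity and 90% specificity.
clinical_setting = positive_predictive_value(0.50, 0.80, 0.90)
community_setting = positive_predictive_value(0.05, 0.80, 0.90)
print(round(clinical_setting, 2))   # 0.89
print(round(community_setting, 2))  # 0.3
```

With the same test, about 9 in 10 positives are correct when half the population has the condition, but only about 3 in 10 when the base rate is 5%.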
34. MMPI-A (1992)
- Do we need a different inventory for adolescents? Why? Scales of concern?
- M/F for adolescents may be less defined
- Theoretically Pd is thought to be elevated, but actually it tends to be lower
- Personality is less stable overall so we need different norms to better interpret scores and relevant items for this age group
- Valid for those aged 14-18 (for 18 y.o., the decision is based on life circumstances, e.g., at home? working?)
- Important to score on both adult and adolescent norms as there can be substantial differences (T-score shifts of 15 points)
- 478 items (some new, some from the original inventory)
- written & auditory forms both in English and Spanish
35. MMPI-A
- Includes all of the clinical, some new supplemental & content scales. So we use basically the same scales but different descriptors (i.e., a high score on Hs will not mean exactly the same thing for the MMPI-A; e.g., Pd equates more with acting out)
- Biggest change was with the F scale since it is a norm defined scale (we need new norms)
- Norms: 805 boys & 815 girls aged 14-18 solicited randomly from schools in 7 states. Represents the U.S. for SES and ethnicity (again minimal diffs for ethnicity)
- Change from MMPI which had separate norms for different adolescent age groups (now only one)
- F scale now has 2 parts: F1 = 1st part of test, F2 = 2nd part (F total)
36. MMPI-A: New scales
- New Supplemental scales
- Alcohol/drug problem proneness (PRO): empirically derived to assess the likelihood of alcohol or other drug problems. Items differentiate adolescents in tx from those having other psychological problems
- Alcohol/drug problem acknowledgement (ACK): face valid items that reflect the admission of problems
- Immaturity (IMM): reporting behaviors, attitudes, and perceptions that reflect immaturity (e.g., poor impulse control, judgment, and self-awareness). Items predict academic problems and cognitive limitations.
- Check for diagnoses such as oppositional-defiant, conduct disorder, and in adulthood ASPD
37. MMPI-A Psychometrics
- For the most part, the psychometric properties of the MMPI-A are sound. The reliability values are lower than the MMPI-2 values, but still within acceptable limits.
- Why might there be less temporal stability in the MMPI-A?
- General interpretative data from the MMPI-2 can be generalized to the MMPI-A, but these data should be considered in light of the client's position in life (i.e., consider how the scores relate to school life, problems with parents, need for independence, etc.)
- Note: no K-correction for clinical scales even though a defensiveness score is calculated. So what are the clinical scale implications for a high K?
38. MCMI-III (Millon, 1990)
- 175 item scale assessing problematic personality styles and classic psychiatric disorders (drawn from the DSM)
- In contrast to the MMPI, this scale was derived theoretically to match the nosology (taxonomy) of the DSM to facilitate diagnosis and intervention planning. Assumes that any assessment is theory driven (vs. MMPI which tried to be atheoretical)
- The theory is grounded in evolutionary principles assessing 4 spheres: existence (from serendipity to an organized structure), adaptation (survival), replication (reproductive styles that maximize diversity), and abstraction (the emergence of competencies to foster planning).
- Scored according to a polarity model, e.g., self vs. other orientation (reproduction), pleasure vs. pain (existential, or aim of, existence)
- Illustration: Schizoid is marked by deficits in both pleasure and pain as indicated by the lack of emotion and apathy
39. MCMI-III properties
- A brief inventory (175 items) that takes only 30 minutes to complete
- 3 modifier scales that correspond to the validity scales
- Disclosure: defensiveness
- Desirability: favorable response set
- Debasement: lying
- 11 clinical personality patterns: schizoid, avoidant, depressive, dependent, histrionic, narcissistic, antisocial, aggressive (sadistic), compulsive, passive-aggressive, self-defeating
- 3 scales denoting severe personality patterns: schizotypal, borderline, paranoid
- 7 clinical syndromes: anxiety, somatoform, bipolar, dysthymia, alcohol dependence, drug dependence, PTSD
- 3 severe syndromes: thought disorder, major depression, delusional disorder
40. MCMI-III - continued
- Scales interpreted based on base rates for each dx and it assumes that disorders are interconnected (consistent with comorbidity data)
- Initial studies had classification rates of 90%, but follow-up studies have been much lower (50% or less)
- Validity data have been equivocal and the reliability data are likewise lower than the MMPI-2 (these are related, and both linked to number of items)
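
The link between reliability and number of items is usually expressed with the Spearman-Brown prophecy formula, which is not named in the notes but is the standard way to project reliability for a lengthened or shortened scale. The starting reliability of .60 below is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when test length is multiplied by length_factor."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical scale with reliability .60: effect of halving or doubling items.
print(round(spearman_brown(0.60, 0.5), 2))  # 0.43
print(round(spearman_brown(0.60, 2), 2))    # 0.75
```

Because unreliable scores carry more measurement error, a shorter scale with lower reliability will also tend to show weaker validity coefficients.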
41. CPI (Harrison Gough)
- Developed at the same time as the MMPI and served as the personality test for the normal population (MMPI for the clinical pop.). Drew from a similar item pool.
- 480 T/F questions (some overlap with MMPI and others are new)
- Emphasizes more positive/normal aspects of personality
- 3 validity scales: well being (normals asked to fake bad), good impression (normals asked to fake good), communality (popular/obvious responding that may reflect defensiveness and conformity)
- 15 general scales assessing a wide range of traits such as intellectual efficiency, capacity for status, achievement via conformity
- Grouped into 4 quadrants (factors): norm favoring vs. norm doubting and externalizing vs. internalizing
42. CPI - continued
- CPI was revised in 1986 with norms based on 13,000 males & females
- Most commonly used personality inventory overall
- It has been replaced by the NEO-PI as most common in the last 15 years.
- Psychometrically sound (reliability and validity coefficients are high and stable for different pops), but a very long instrument.
- Also some question as to the need for validity scales in the normal pop.
- Burisch suggests this is unnecessary provided 1) no reason to lie, 2) knowledge of the construct(s), and 3) self awareness.
43. NEO-PI (Costa & McCrae, 1985, 1992)
- Based on the empirically derived 5 factor model
- Assumption that 5 factors can represent all of normal personality
- Evaluated this model in a variety of contexts, with samples from all over the world and in different languages
- Assumes that language is the best place to start examining how to describe behavior (132 Eskimo words for snow indicates it is a meaningful construct)
- Neuroticism (emotional stability), extraversion, openness to new experience, agreeableness (quality of interactions) and conscientiousness (dutiful, organized).
- 5 factors have been recovered from other inventories like the Myers-Briggs, 16PF, etc.
44. NEO-PI
- Full version is 220 items and has 6 facets for each of the 5 factors
- Short form (NEO-FFI) has 60 items and provides factor scores only
- Norms are available for adults, college students and adolescents (though minimal differences between the latter two groups)
- Strong psychometric properties including very stable retest coefficients, internal reliability, and validation against other personality scales.
- Can be used to predict job interests (though vocational inventories such as the Strong Interest Inventory are better suited for this), but it does not predict job success (same is true for interest inventories)
- Often used for intuitive purposes and not empirically validated purposes (e.g., assume that a manager should be low on N and high on C vs. empirically testing this assumption with current managers)
45. Structure of affect and other issues
- Big two (PA/NA) vs. 5 factor
- Bipolarity of affect (vs. orthogonality)
- Temporal question for what defines affect vs. personality
- Problem of temporal language (e.g., "at this moment")
46. Measures of Affect
- Note: The EPI (Eysenck) likewise measures personality (extraversion and neuroticism) in the normal population, and these two factors are usually the first two to emerge in factor analysis.
- These factors correspond to the Big Two affect constructs (PA and NA)
- Note: most of these measures do not address validity of responding
- Nevertheless, research suggests that these scales tend to be fairly accurate and reflect actuarial rates for affective disorders (5-9% of adult women and 2-3% of adult men)
- BDI: published in 1961 and revised in '74, '78, and '96.
- Among the most commonly used inventories, with comprehensive manuals published in 1987, 1993, and 1996 (BDI-II)
- Normed for adolescents and adults aged 13 and older. 21 items with items arranged in a Guttman approach (increasing order of severity)
- Suicide potential in items 2 and 9. For dx of Depression see neurovegetative items
47. BDI - continued
- Internally consistent and reliabilities range from .48 to .86 for periods ranging from several hours to four weeks
- Why are retest coefficients smaller?
- No way to correct for faked scores
- Validated extensively for use in clinical settings
- BDI-II validated on 500 outpatients drawn from across the country and a student sample of 120
- 1 week retest was .93 and coefficient alphas were .92 or higher
- Average BDI-II scores are 3 points higher than the original BDI
- BDI-II time frame for each item focuses on last two weeks to match the DSM criteria
48. BAI (Beck & Steer, 1993)
- 21 item symptomatic inventory
- Items rated on a 0-3 scale
- Validated for use for inpatient (N = 1,086), outpatient (N = 160) and college student samples (N = 65).
- Shows convergent validity with other measures of anxiety and some discriminant validity with depression measures (though they are correlated, sharing 10-25% variance)
- Rapid self-report tool
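
Shared variance is the squared correlation, so the 10-25% figure corresponds to correlations between anxiety and depression measures of roughly .32 to .50:

```latex
\[
r^2 = 0.10 \;\Rightarrow\; r = \sqrt{0.10} \approx 0.32,
\qquad
r^2 = 0.25 \;\Rightarrow\; r = \sqrt{0.25} = 0.50
\]
```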
49. CES-D (Radloff, 1977)
- Developed by NIMH for use as a screening tool in the general population (also in college and geriatric pops)
- Optimal test for this purpose in this population
- 20 Likert-type items focusing on the last week
- Better than the BDI-II at differentiating among those experiencing lower levels of depression
- Internal consistency is high (.85 in general pop. and .90 in patient samples).
- Retest figures tend to be low (.48) but this is less relevant for this construct
- A score of 16 is clinical cutoff and it assesses depressed affect, positive affect, somatic activity, and interpersonal functioning
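
Scoring against the cutoff can be sketched as follows. Only the 20 items and the cutoff of 16 come from the notes; the 0-3 rating per item and the reverse scoring of the positive-affect items (assumed here to be items 4, 8, 12, and 16) are assumptions that should be checked against the published scale.

```python
REVERSE_SCORED = {4, 8, 12, 16}   # positive-affect items (assumed numbering)
CUTOFF = 16                       # clinical cutoff given in the notes


def cesd_total(ratings):
    """Sum 20 item ratings (each 0-3), reversing the positive-affect items.

    ratings: dict mapping item number (1-20) -> rating 0..3
    """
    if set(ratings) != set(range(1, 21)):
        raise ValueError("expected ratings for items 1-20")
    total = 0
    for item, rating in ratings.items():
        if rating not in (0, 1, 2, 3):
            raise ValueError(f"item {item}: rating must be 0-3")
        total += (3 - rating) if item in REVERSE_SCORED else rating
    return total


# Hypothetical respondent who rates every item 1.
ratings = {item: 1 for item in range(1, 21)}
score = cesd_total(ratings)
print(score, score >= CUTOFF)  # 24 True
```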
50. MAACL-R (Zuckerman & Lubin, 1985)
- Originally published in 1965 and revised in '85. (132 checklist type items)
- Normed on over 1500 adults, 400 adolescents (approx. 90% Caucasian, 10% Black)
- Scores for Anxiety, Depression, Hostility, PA, and SS (the latter has very poor internal reliability)
- A rapid assessment but not as good psychometrically
- Can be used to evaluate states or traits and reliability figures are better (though not very high) for the latter
- Scales don't corr with social desirability and do converge with MMPI ratings
51. Behavioral Assessments
- Assumption: behaviors can reflect cognitions and emotions (e.g., FACS; Ekman & Friesen, 1978)
- Proliferation of behavioral assessments with limited validity due to the assumption that behavior can be easily defined and that it represents a meaningful (typically underlying) construct, e.g., sweating, pacing
- How to improve behavioral assessments?
- Identify the actual behavior being assessed (lip turned downward vs. sadness)
- Habitual behaviors may indicate underlying condition
- Acknowledge role of both traits and situations
52. Beh. assessments cont.
- Also influenced by factors such as social desirability (varies depending if one is aware of the assessment)
- Difficult to organize and systematize behaviors (e.g., how does one smile equate with the absence of a frown re depression?)
- Very inconsistent findings regarding the organization of individual behaviors (even physical symptoms) via F.A.
- Why might self-report and behavioral assessments not overlap? What does this mean?
- Recall behavioral reactivity phenomenon: change in behavior as a function of its assessment
53. Physiological measures
- "Some people want to fill the world with silly physiological measures. And what's wrong with that?" (McCartney et al., 1976)
- Biofeedback: long history but very mixed findings
- Plethysmography: changes in blood volume that may relate to emotional changes
- Pupillary responses: attraction and fear?
- Polygraph: arousal related to lying?
54. Cognitive testing refresher
- WAIS-III score interpretations for reports
- With regard to the index scores, which declines the most with age?
- Quick, it's PS!
- Which show the greatest decrements secondary to organic dysfunction (trauma or disease)?
- PS, WM, and PO. Depends on the area of the brain that is damaged. If diffuse, then all three. If temporal then WM, if more right hemisphere then PO.
- Which is the best indicator of premorbid functioning?
- VC (or subtests of vocabulary, similarities & info.)
55. Cognitive and personality functioning
- What are meaningful ways to integrate these two pieces of information?
- What interpretations might one make for high IQ individuals relative to low IQ individuals re personality?
- Overlap with maturity? Less complex presentations?
- What PD is associated with extremist thinking (splitting), inability to recognize subtleties?
- Other implications?
- Ease of use for clients, alternative test format, wider range of responses (variability), alternative approach to detecting pathology, difficult for client to identify socially desirable or undesirable responding, theory based
- Defensiveness strategies (see MMPI-2)?
56. Projective test/technique
- MMPI/MMPI-2 is most frequently used test in inpatient settings
- Rorschach & TAT are not too far behind
- Advantages of projectives?
- Disadvantages of projectives?
- Administration and scoring are generally less standardized so reliability and validity are compromised
57. Minimal criteria for a test
- Standardized administration
- Rorschach has numerous administration procedures (Beck, Klopfer, Exner, etc.)
- Standardized scoring
- Rorschach has numerous scoring approaches (Beck, Klopfer, Exner, etc.)
- Standard of comparison for interpretations (norm group)
- Minimal information with regard to representative norms
58. Exner's scoring system
- Location: part of the blot
- W, D, d, S, (WS)
- How common is the location (normative comparisons from manual)
- Determinant: what led to response
- Form, Color, FC or CF, Movement, etc.
- Evaluate form quality (normative decision based on manual of responses). Low F = psychosis/poor reality contact
- Content: focus on what specifically
- Human or animal, whole or detail, nature, etc.
- Populars: determines normative responding
59. Rorschach: Exner
- Exner's (1987) scoring system involves an attempt to increase validity by objectifying the scoring, increasing the number of responses (minimum of 14), and standardizing the administration
- This has resulted in significant improvements in the test's reliability and validity
- In a meta-analysis, Hiller et al. (1999) found the Rorschach (using Exner's scoring) to have larger validity coefficients than the MMPI-2 for studies using objective criterion variables
60. Other projective tests
- TAT (Thematic Apperception Test, Murray)
- Stimuli are less ambiguous than the ink blots
- Tell a story, though little standardization re which pictures to be used, scoring (typically a content analysis), etc.
- Used extensively with less literate pops like children (CAT), geriatric pops (GAT), non-English speaking individuals, etc.
- Draw-a-figure test (figure drawings)
- Person, family, house, tree, etc. all are interpreted as you
- Minimal standardization for scoring
- Sentence completion
- Sentence stems like "Mom is", "Life", etc. largely scored from a thematic standpoint
- Bender-Gestalt (the same test used for neuropsychological screens)
- Copying figures and making personality interpretations
61. Test or technique?
- Review articles and come up with an opinion. Come ready to debate/discuss.
- On Tuesday.
62. Assessment of malingering
- What is malingering? What must it include?
- Intentional? Awareness? Personal gain?
- Very complex phenomenon that may change over time
- e.g., A lie (or lies) that become real/true for the individual over time, or a truthful statement that becomes a lie.
- Most statements can't be categorized as one or the other, and typically involve aspects of both
- Berry et al. (1995) suggest that faking good and faking bad are distinct constructs (not opposite ends of the same continuum)
- Harder to detect specific faking vs. general faking
- Content nonresponsivity (CNR): random responding, all true or all false
- Content response faking (CRF): fake good or bad; research suggests that these may be independent dimensions (client may fake good on some parts and fake bad on others)
- Should always be considered (in some form) when there are contingencies for the patient
63Classifications of Misrepresentation
- Are symptoms under conscious control? Are
physical/psychological symptoms motivated by
internal or external gains? - Factitious disorders: intentional production of
symptoms (feigning) that are motivated by
internal gains - Motivation is to assume the sick role as there
are no external incentives for the behavior
(e.g., economic gain, avoiding legal
responsibility, etc.) - Somatoform disorder: unintentional (i.e.,
unconscious) production of symptoms for internal
gains - Malingering: intentional production or
exaggeration of symptoms (i.e., conscious)
motivated by external incentives - Indicators: lack of cooperation during the evaluation,
presence of ASPD, discrepancy between
self-reported data and objective findings,
medicolegal context for referral (e.g., attorney,
police, etc.) - Note: exaggeration rather than fabrication makes
differential very difficult
64Pros and Cons of Malingering Dx
- What are the costs of labeling someone a
malingerer? - Calls into question all present and future clinical
presentations - What are the limits of our measures to make this
differential? - After weighing the strength of any claim of
malingering (relatively weak given the limits of
our measures) and the costs of making an
erroneous judgment, we need to act very carefully - Use converging, independent evidence to make any
determinations - e.g., objective inventories like the MMPI-2,
strong contextual factors (i.e., to provide the
motive and baserates), interview, low probability
baserates for responding (e.g., incorrect on all
options when this would be well below chance
responding), and response to the evaluator's
feedback (e.g., "Actually, you're doing quite
well" followed by decrements in performance)
65Mind of a murderer: the Bianchi tapes
- Identify the circumstances that could be seen as
contingencies for malingering (reinforcers for
malingering) - Why would that particular malingering behavior be
manifested? - How could client have obtained the information
necessary to provide the malingering profile? Any
evidence that this information was obtained? - Any indications of malingering in his
presentation? (Be objective) - What are some reasons why he might not be
malingering? - Predict response sets in advance of testing (vs.
scoring in hindsight) - What pattern of responses do you predict for the
Rorschach? - What pattern of responses would you predict for
the MMPI-2? - What's your call?
66Measures of malingering Berry et al
- The pasta strainer and photocopy machine
incident - MMPI-2: F, F-K (note: these two indices are not
independent), VRIN (random), TRIN (all true or
all false), and Fb - Also look for discrepancies between some of your
subtle and obvious supplemental scales (though
this can also just assess sophistication in
malingering) - The D scale has also been used with some success,
as the items appear to reflect a less
sophisticated (popular) view of mental illness - MCMI evaluates random responding, low frequency
responding, willingness to disclose information,
debasement (willingness to endorse psychological
problems), and desirability (unwilling to endorse
psychological problems). Also as with the D scale
of the MMPI, the well-being scale can likewise
assess psychopathology
67Measures of malingering 2 continued
- CPI (Gough, 1957): intended to assess
personality in the normal population - Has 3 validity scales: good impression (faking
good), communality (items with either very high
or very low endorsement frequency that assess
random responding), well-being (assesses fake
bad) - Basic Personality Inventory (BPI; Jackson, 1989):
contains 12 scales, each with 20 T/F items.
Research is limited on its utility for detecting malingering. - Deviation scale is comparable to the MMPI-2 F
scale - Personality Assessment Inventory (PAI; Morey,
1991) has 344 items - 4 validity scales: inconsistency, infrequency,
negative impression management and positive
impression management - NEO-PI-R (Costa & McCrae, 1991): no effective
validity index, so should not be used in this
context - 16 PF also lacks adequate validity measures and
should not be used
68Measures to specifically detect malingering
- These measures should be administered when the
referral question specifically implicates
malingering and/or when there are substantial
contingencies to suggest that malingering is
likely - Structured Interview of Reported Symptoms (SIRS)
- Has shown some promise, though it is susceptible
to acquiescence and false positives (claiming
malingering when it is not present) - The M test is a 33-item T/F test with three
scales: genuine symptoms of schizophrenia,
atypical attitudes not characteristic of mental
illness, and bizarre and unusual symptoms rarely
found in mental illness - Showed some ability to differentiate patients
from directed malingerers and from suspected
malingerers (Note: the problem with using the
latter criterion group is that there is no definitive
knowledge about those individuals)
69Measures to specifically detect malinger. - 2
- Test battery approach including WAIS-III and the
MMPI-2: the more tests administered, the harder
it is to present a consistent profile - This approach should use baserates for incorrect
responses as the primary means of classifying
(see also TOMM) - Provide response options (typically no more than
two) such that a chance correct criterion can be
calculated (e.g., 50% for a two-option version);
this should be no lower than 30% to avoid floor
effects - Track responses over at least 30 trials (the more
the better as this minimizes chance outcomes). - Calculate the probabilities for deviations from
.50 correct and apply them to the client's correct
response rate (i.e., what are the odds that they
would have missed as many as they did if they
were truly guessing) - Evaluate responsiveness to your feedback (e.g.,
"You're actually not doing that bad" vs. "Most
people with your type of injury do better") - With less sophisticated malingering, there will be
an immediate and relatively large response to
your comments
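A minimal sketch of the "odds if truly guessing" calculation described above, assuming a two-option forced-choice task where each trial is independent and chance is .50. The function name and the example client (8 correct out of 30) are hypothetical and only illustrate the arithmetic; this is not the scoring procedure of the TOMM or any published instrument.

```python
from math import comb


def prob_at_most(correct: int, trials: int, p_chance: float = 0.5) -> float:
    """Probability of getting `correct` or fewer items right by pure guessing.

    Uses the binomial distribution: X ~ Binomial(trials, p_chance).
    """
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct + 1)
    )


# Hypothetical client: 8 correct out of 30 two-option trials.
p = prob_at_most(8, 30)
print(f"P(8 or fewer correct by guessing alone) = {p:.4f}")
```

A very small probability means the score is worse than guessing would plausibly produce, which is the "well below chance" pattern. It is still only one piece of converging evidence, not a determination by itself.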
70Who is your client?
- Why is this question important in addressing the
malingering issue? - If the suspected malingerer is your client who is
undergoing therapy with you (or someone else) to
whom is your obligation and what are the
costs/benefits of undertaking an evaluation of
malingering? - Does it help the therapeutic process? Focus on
why one might be deceptive to better understand
the client's behavior - If the client is the court, then to whom is
your obligation and what are the costs/benefits
of undertaking an evaluation of malingering? - Question now is to determine if client is being
deceptive/evasive.
71Assessing psychopathic personality
- Psychopathic personality: behavior characterized
by remorseless and callous disregard for others
and a chronic antisocial lifestyle. Thus, most
ASPDs are not necessarily psychopathic. - Drawing data from various sources (at least
three) - In person interview
- Testing
- Independent historical information (anything that
is not self-report; it is important to note that
other official records are not necessarily based
on anything other than self-report) - Although all three of the above are important in
order to provide converging evidence, the test
data will be the strongest tool in court (due to
its psychometric strengths)
72Assessment (Meloy & Gacono, 1995)
- The Psychopathy Checklist-Revised (PCL-R; Hare, 1991):
20-item test with a 3-point (0-2) rating scale
response format. Largely intended for males
(little data on females) - To be completed by the clinician after a clinical
interview and review of historical data (includes
descriptors falling under a single dimension of
psychopathy), e.g., impulsive, irresponsible,
shallow emotions, etc. - Items must be scored in a particular sequence,
with more structured items first, followed by the
least structured items (with the former
contributing to the latter) - Cutoff score of 30 or greater to define
psychopathy, with higher scores denoting more
extreme presentations - Adequate reliability and validity, though note
the overlap between some of the validity criteria
and the info used to determine the score (e.g.,
extent of criminal record is used for both)
73Assessment (Meloy & Gacono, 1995) p. 2
- The Rorschach should still pursue the minimum
number of responses (14 or more) as suggested by
Exner (1986) - Include an assessment of defenses and object
relations (both of which appear to have modest
reliability) that suggest more narcissism
(self-references), violations of boundaries, etc.
in the psychopathic personality (specific ratios
from Exner's scoring system are described) - MMPI-2: primary focus is on scale 4 (also
content subscales drawn from 4; be cautious with
the latter) - If administering scale 4 alone, note that you
will not have the benefit of the K correction.
Thus, scores will be suppressed. - L and F will also predict psychopathy (tendency
to be untruthful) - Cognitive abilities (e.g., WAIS-III) are
unrelated to the presence of psychopathy, but may
be informative as to the nature of the
presentation (e.g., level of sophistication,
concordance with traditional/normative concepts
of intelligence, etc.)
74Integrity testing
- Evaluating integrity as a trait, whereas such
behavior may be situation specific (e.g., someone
who would not lie in interpersonal settings might
not hesitate to cheat on their taxes). - Characterological view of integrity downplays
situational factors - Integrity is a very broad concept that can
include diverse responses (e.g., passive vs.
active lying, cheating vs. theft, etc.) - Early paper and pencil tests were validated with
the polygraph - Employed in low-end, entry-level jobs where people have
to interact with money (retail, financial
services, etc.) - Today, such tests attempt to predict a wide range
of behaviors including violations of work rules,
fraud, absenteeism, etc.
75Integrity testing p. 2
- Overt integrity tests evaluate beliefs about
the incidence of theft and other
counterproductive behaviors, punitive attitudes
towards theft, endorsement of common
rationalizations for theft, and direct questions
about one's own involvement in such activities. - Personality-oriented measures: much broader than
integrity tests and tend to have lower face
validity (e.g., high conscientiousness on the
NEO) - Clinical measures like the MMPI validity scales
- All are difficult to validate because the
behavior we are trying to predict goes largely
undetected. So if a test score does not predict
the behavior, it could just mean that this is a false positive
or someone who was not caught
76The polygraph test
- Measures physiological arousal that is presumed
to be associated with lying. e.g., perspiration
as indicated by galvanic skin response, brain
activity suggesting arousal, etc. to the question
(not answer) - Is this assumption reasonable?
- Confounds?
- Under what circumstances can lying not be
associated with arousal? - Habituation effect from repeated lying?
- Lack of awareness of the lying? (issue of
conscious vs. unconscious) - What is the best way to quantify arousal? Should
we evaluate this normatively or ipsatively? - Control Question Test (CQT): compares relevant
questions to control questions which are intended
to elicit a strong physiological response from
innocent subjects (e.g., "Prior to 1993, did you
ever do anything that was illegal or dishonest?") - While innocent people know they didn't commit the
crime, they are either uncertain or lying about
the CQ. Guilty persons should not respond as much
to the CQ
77The polygraph test p. 2
- Criticisms of the CQT
- Difficult to develop good control questions that
will produce similar responses relative to
relevant questions for innocent people. This
results in many false positives (Note: bias toward a
positive outcome is why most of these tests have
artificially high success rates in forensic
settings, where most examinees are guilty) - CQ are designed for each individual, so
standardization is compromised - Direct Lie Control Test (DLCT): if a person
answers truthfully to a question, they are
told to lie about it when
asked again (a known lie for comparison) - Can be standardized and the power of the DLCT is
from the instruction (which is standardized) not
the content of the question - Can reduce the rate of false positives and
generally does better than the CQT - Initially employed absolute standards for arousal (arousal =
lying) and this was not at all effective
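The note in the CQT criticisms about artificially high success rates when most examinees are guilty can be illustrated with a small base-rate calculation. This is a sketch only: the sensitivity, specificity, and base-rate figures below are invented for illustration and are not estimates from the polygraph literature, and the function names are hypothetical.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              base_rate: float) -> float:
    """Proportion of 'deceptive' calls that are actually correct."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)


def accuracy_if_always_deceptive(base_rate: float) -> float:
    """Hit rate of an examiner who calls everyone deceptive."""
    return base_rate


# Illustrative (made-up) test characteristics: many false positives.
sens, spec = 0.90, 0.50

for base_rate in (0.90, 0.10):
    ppv = positive_predictive_value(sens, spec, base_rate)
    naive = accuracy_if_always_deceptive(base_rate)
    print(f"base rate {base_rate:.2f}: PPV = {ppv:.2f}, "
          f"'always deceptive' accuracy = {naive:.2f}")
```

With a high base rate of guilt, even a test with poor specificity looks accurate, and so would calling everyone deceptive. With a low base rate, the same test produces mostly false positives.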
78The polygraph test p. 3
- The guilty knowledge test (GKT): not designed to
detect deception; rather, it tries to
differentiate between those who have knowledge
about a particular event (crime) and those who do
not (the innocent) - The concealed information test (CIT) is similar
to the above approach and likewise tries to
assess familiarity with specific information as
opposed to lying - Both of these approaches have the advantage of
asking the exact same questions of all
individuals and comparing responses both within
and between subjects - Minimal data on these approaches, as the bulk of
the research is on the CQT
79Does it work?
- Honts (1994) reviewed the literature on the
effectiveness of the polygraph and found that it
does about as well as chance in experimental
settings. Most of the reviewed research uses the
DLCT - In real life and experimental settings, the
majority of errors are false negatives (saying
someone is innocent when they are guilty) - Most deceptive individuals (up to 95%) are
misclassified - This is because the cost of a false positive (saying
someone is guilty when really they are innocent)
is deemed to be higher in our legal system.
Therefore, the cutoff scores (criteria) have been
altered so as to make false negatives more likely - Why does it fail?
- If high arousal to control questions, then more
difficult to discriminate - Idiosyncratic responses to lying
80Admissibility of the polygraph (Saxe &
Ben-Shakhar, 1999)
- Courts have almost universally rejected the
polygraph, though this question has been and
continues to be litigated extensively - Courts are increasingly being made responsible
for evaluating the merits of test data, despite
lacking the expertise to do so. - Note: the literature has become increasingly
discrepant in its view on the polygraph
(disagreement on its validity even in the
scientific community) - What criteria should be used to evaluate this
information and what should we tell the courts? - History
- Marston (1917) used a blood pressure cuff to
determine truthfulness (arousal) in a defendant
(Frye), based on the assumption that while truth
required little or no energy, lies do; rejected
by the courts
81History of the Polygraph
- Note the court's use of the term "experimental" to mean
evidence that is not well established - The Frye ruling adequately reflects the courts'
treatment of the polygraph even today, though now
based on the Federal Rules of Evidence (FRE)
which require that the evidence (polygraph or
otherwise) be relevant and that it aid the jury
(i.e., be valid). - Daubert (1993) was based on the FRE and
highlights 4 considerations when ruling on
evidence - Testability or falsifiability (see Popper and the
method of science) - Error rate
- Peer review and publication
- General acceptance
- This basically requires juries and judges to
evaluate scientific issues
82History of the Polygraph p. 2
- In trials like Daubert, scientists with opposing
views on the polygraph present their views and
the jury must decide on the merits of their
arguments - Generally there has been no legal distinction
between the concepts of reliability and validity
(you can see where this is going, since, from a
scientific standpoint, reliability limits
validity) - An additional problem with these concepts is that
the data is collected as a series of discrepancy
scores and these are then summed to reflect a
qualitative assessment of truthful, deceptive,
and inconclusive. Thus, very different
discrepancy readings might still result in
similar qualitative assessments. - Two accepted approaches for reliability are
- Test the same person twice on the same issue
using the same polygraph technique with 2
different testers - Test the person once, but have the chart scored
by two different people
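The point above about summed discrepancy scores can be made concrete with a toy scoring rule. This is a sketch under assumptions: the sign convention (negative = stronger response to relevant than to control questions), the cutoff of 6, and the example charts are all hypothetical and are not taken from any specific polygraph scoring manual.

```python
def classify_chart(discrepancies: list[int], cutoff: int = 6) -> str:
    """Sum per-question discrepancy scores and map the total to a category.

    Assumed convention: negative scores mean a stronger response to the
    relevant question than to its control question.
    """
    total = sum(discrepancies)
    if total <= -cutoff:
        return "deceptive"
    if total >= cutoff:
        return "truthful"
    return "inconclusive"


# Two hypothetical charts with very different readings.
uniform_chart = [-1, -1, -1, -1, -1, -1]   # small, consistent differences
erratic_chart = [-3, -3, -3, 3, 0, 0]      # large swings in both directions

print(classify_chart(uniform_chart))  # same category for both charts
print(classify_chart(erratic_chart))
```

Both charts sum to the same total and so receive the same qualitative label, even though the underlying readings differ considerably. Agreement on the label can therefore hide disagreement in the data.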
83History of the Polygraph p. 3
- The latter approach deals only with the error
involved in chart scoring and ignores (or
equates) administration error - The real issue is whether the procedure as a
whole is reliable (e.g., the creation and
administration of control questions), thereby
getting at internal reliability (do different
parts of the test agree), test retest reliability
(different administrations of the test agree),
inter-rater reliability (different test
administrators agree as to the outcome) - Note: there are practical limitations to how
often the same test could be given to the same
individual - What little data exists on reliability focuses
only on the between examiners approach
(inter-rater reliability), though this
reliability is reasonable (not high). Thus, the reliability of
the procedure as a whole remains an unevaluated component of the polygraph
(major limitation)
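One common way to quantify the between-examiners agreement mentioned above is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The slides do not say which statistic the reliability studies used, so kappa is an assumption here, and the two scorers' calls below are invented (D = deceptive, T = truthful, I = inconclusive).

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Chance-corrected agreement between two raters on the same charts."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(rater_a) | set(rater_b)
    # Agreement expected if the raters assigned categories independently.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    # Note: undefined (division by zero) if expected agreement is exactly 1.
    return (observed - expected) / (1 - expected)


# Hypothetical calls by two scorers on the same ten charts.
scorer_1 = ["D", "D", "D", "T", "T", "I", "D", "T", "D", "D"]
scorer_2 = ["D", "D", "T", "T", "T", "I", "D", "D", "D", "D"]

print(f"kappa = {cohens_kappa(scorer_1, scorer_2):.2f}")
```

Here the scorers agree on 8 of 10 charts, but because "deceptive" is the most frequent call for both, a good share of that agreement is expected by chance, so kappa is noticeably lower than .80. Note that this only addresses chart scoring, not administration.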
84History of the Polygraph p. 4
- Because the courts do not distinguish between
reliability and validity, the minimal reliability
that does exist carries far more weight than it
should. - Modern views of validity highlight the
integrative component of validity (recall
Messick, 1995), though to evaluate it, it is
necessary to consider different aspects
separately - Different types of validity are more relevant
depending on the question at hand - e.g., predictive validity for integrity testing
in job placement/hiring, vs. criterion validity
being more relevant for determining truth/lying - Construct validity gets at the theoretical issue
of what is a lie. Is it a situational phenomenon
or a trait? Can it be represented by
physiological responding? Etc. - No theory to explain why a stronger response
should occur for lies vs. truth
85History of the Polygraph p. 5
- Similar physiological responses to lying appear
to occur for experiences such as surprise/novelty - Note: for the CQT, questions about the crime are
expected to be well rehearsed for the criminal - Thus, they have questionable construct validity
(not necessarily measuring what they propose to
measure) - Under-represents the construct of interest and
over-represents irrelevant constructs (surprise,
stress, etc.) - What criterion can be used?
- Outcome of a trial? If the case is dismissed?
- Do either of these assure that we know the
client's status re: lying? - Note also that a true evaluation of the polygraph
would mean that the examiner only has access to
the polygraph data (that's never the case).
86History of the Polygraph p. 6
- The criterion and predictor are rarely
independent. - e.g., if the polygraph is used to get a
confession and the confession helps get a
conviction, then by definition, the polygraph is
part of the criterion (polygraphs are frequently
used to get confessions) - Experimental criteria for the polygraph generally
lack external validity (is lying in an experiment equivalent
to lying in a crime involving yourself? That
is, are all types of deception equal?), while
real life evaluations of the polygr