Title: Abstract
Joni L Mihura
The Validity of Psychological Tests as Measures
of Aggressive Behavior: A Review of the
Literature
E.M. Farrer, J.L. Mihura
University of Toledo
Abstract
This study reviews the empirical literature on
the validity of psychological tests as measures
of aggressive behavior. The psychological tests
were categorized into two groups: (a) self-report
questionnaires (e.g., BDHI, JI, PAI) and (b)
performance personality tests (e.g., Rorschach
and Hand Test). For the criterion variable of
aggressive behavior, only studies using
observational measures are included in the review
(e.g., ward reports, patient records, chart
reviews). The effect sizes of the psychological
tests compared to observational measures are
presented and then compared using a
monotrait-multimethod approach. Also of interest
are similar studies using a multitrait-multimethod
approach, comparing and contrasting similar
constructs (e.g., aggression, anger, antisocial
behavior) using the same or different methods
(self-report measures, performance personality
tests, and observational measures). The goal of
the study is therefore twofold: (1) to review the
literature on the validity of psychological tests
as measures of aggressive behavior and (2) to
place this aggression literature in a
psychometric context regarding more general
issues of monomethod-heteromethod approaches to
validity.
Method cont.
The observational measures needed to be clear in
how they measured the aggressive behavior.
Records of past aggressive behavior (e.g., chart
reviews, criminal file reviews) also had to have
a well-defined way of measuring aggressive
behavior, including objective systems such as the
number of institutional infractions for forensic
samples. Only studies written and conducted in
English with no clear criterion contamination
(e.g., behavior ratings blind to psychological
test data) were included. For comparative
purposes, the findings are reported as effect
sizes, converted where necessary to Pearson r
as the common metric. As a rule of thumb, the
magnitude of effect sizes (r) can be classified
as (a) small = .10, (b) moderate = .30, and (c)
large = .50.

Figure 1. Multitrait-Multimethod Table
Different method, moderate construct overlap:
  E.g., self-report anger measure compared to
  aggressive behavior.
Different method, high construct overlap:
  Major Study Question. Self-report or performance
  personality aggression measures compared to
  aggressive behavior.
Same method, moderate construct overlap:
  E.g., self-report anger measure compared to
  self-report aggression measure.
Same method, high construct overlap:
  E.g., self-report aggression measure compared to
  self-report aggression measure.

Table 2. Aggressive Behavior as Measured by
Performance Personality and Self-Report Tests:
Summary Statistics
Measurement Method             k    N     rw
Performance Personality Tests  3    246   .31
Self-Report Tests              5    498   .27
Total Tests                    8    744   .29
Note. k = number of effect sizes included in the
summary statistic.
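
The Method states that findings were converted to Pearson r where necessary, but it does not say which conversions were applied. The sketch below shows three common textbook conversions as an illustration only. The function names are hypothetical, and the d-to-r formula assumes two groups of equal size.

```python
import math


def r_from_t(t: float, df: int) -> float:
    """Convert a t statistic to r: r = t / sqrt(t^2 + df). Keeps the sign of t."""
    return t / math.sqrt(t * t + df)


def r_from_d(d: float) -> float:
    """Convert Cohen's d to r, assuming equal group sizes: r = d / sqrt(d^2 + 4)."""
    return d / math.sqrt(d * d + 4.0)


def r_from_chi2(chi2: float, n: int) -> float:
    """Convert a 1-df chi-square to r (phi): r = sqrt(chi2 / n).

    The direction of the effect cannot be recovered from chi-square alone.
    """
    return math.sqrt(chi2 / n)


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the reviewed studies.
    print(f"t = 2.0, df = 96   -> r = {r_from_t(2.0, 96):.2f}")
    print(f"d = 0.5            -> r = {r_from_d(0.5):.2f}")
    print(f"chi2 = 4.0, N = 100 -> r = {r_from_chi2(4.0, 100):.2f}")
```

Putting every study on the r metric is what allows the effect sizes in the tables to be averaged and compared across tests.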
Table 3. MTMM Results: Weighted Mean Effect Sizes
                    Construct Overlap
Measurement Method  Moderate                 High
Different           .16 (k = 8, N = 924)     .28 (k = 9, N = 814)
Same                .46 (k = 6, N = 1,771)   .77 (k = 4, N = 371)

Introduction
Most often in psychology, a general picture of a
person's level of functioning and personality
is obtained from the person him- or herself. This
can be done with self-report measures, such as the
Personality Assessment Inventory (PAI), or with
performance personality tests, such as the
Rorschach. These measures, however, rely heavily
on the respondent as the source of information,
whereas behavior measures rely on others as the
source of information.
Many self-report measures used for screening are
broadband inventories such as the PAI or the
Jesness Inventory (JI). Several are also
specifically designed to measure the construct of
interest. The construct of particular interest
to this review is aggression. Aggression can
be defined as the act or practice of attacking
without provocation (Coccaro et al., 1997).
Aggression can be verbal or physical and, for
this study, directed outwardly. The reliance on
self-report measures and performance personality
tests of aggression is of particular interest due
to the implications that could arise if the
aggression is carried out. How well can
self-report measures and performance personality
tests designed to measure aggression actually
predict aggressive behavior? Further,
aggression also has similar constructs with
similar implications. Anger and antisocial
behavior are among those. How well do tests
specifically measuring those related constructs
predict aggressive behavior? Also, how well do
the same constructs measured by different methods
compare? This multitrait-multimethod approach is
of particular interest to the study. According
to Campbell and Fiske (1959), the same construct
measured by different methods should agree and
should agree better than different constructs
measured by different methods. Thus, the study
has two goals. The first is to review how
self-report and performance personality measures
of aggression compare to observable behavior.
The second is to compare similar but slightly
different constructs to themselves and each other
using the same and different methods.
Results
For studies that reported more than one effect
size, these were averaged so that each study
contributes one effect size (see Table 1,
Aggressive Behavior as Measured by Performance
Personality and Self-Report Tests).
Discussion
According to Jacob Cohen (1988), when one
looks at near-maximum correlation coefficients
of personality measures with real-life criteria,
the values one encounters fall at the order of
r = .30. This corresponds to the present findings.
The values were obtained using personality
measures whose constructs were the same as, or
highly overlapping with, the real-life criteria
in question. Self-report and performance
personality measures also show similar effect
sizes (rw = .27 and .31). For Campbell and
Fiske's (1959) MTMM approach, the data correspond
quite well. Measuring constructs with moderate
overlap using different methods results in low
effect sizes (.16). On the other hand, with high
construct overlap and the same method, the
correlation is quite high (.77), about what one
would expect for test-retest reliability. This
also corresponds with Meyer et al.'s (2001)
finding that a single measure represents only a
certain portion of one's personality and that
different sources of information tend to provide
their own unique interpretation of someone's
personality or behavior. Further work remains.
While performance personality tests and
self-report measures were compared to each other,
to behavioral measures, and to themselves, the
next step would be to see how well observational
measures compare to themselves.
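
The MTMM pattern described in the Discussion can be checked against the four weighted mean effect sizes reported in Table 3. The sketch below is a simplified 2 x 2 comparison and not a full Campbell and Fiske matrix analysis. The dictionary layout and function names are assumptions made for illustration.

```python
# Weighted mean effect sizes from Table 3, keyed by
# (measurement method, construct overlap).
mtmm = {
    ("different", "moderate"): 0.16,
    ("different", "high"): 0.28,
    ("same", "moderate"): 0.46,
    ("same", "high"): 0.77,
}


def convergent_exceeds_discriminant(table: dict) -> bool:
    """With different methods, high construct overlap should yield a
    larger effect size than moderate construct overlap."""
    return table[("different", "high")] > table[("different", "moderate")]


def shared_method_gain(table: dict, overlap: str) -> float:
    """Increase in effect size when both measures use the same method,
    holding construct overlap fixed."""
    return table[("same", overlap)] - table[("different", overlap)]


if __name__ == "__main__":
    print("Convergent > discriminant:", convergent_exceeds_discriminant(mtmm))
    for overlap in ("moderate", "high"):
        gain = shared_method_gain(mtmm, overlap)
        print(f"Shared-method gain ({overlap} overlap): {gain:.2f}")
```

In both columns of Table 3 the same-method value is larger than the different-method value, which is the shared-method effect the Discussion points to.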
Table 1. Aggressive Behavior as Measured by
Performance Personality and Self-Report Tests
Study  Measure                      Sample  N    r
Performance Personality Tests
3.     Rorschach AgC                C       94   .27
14.    Hand Test AOSACT-MOV         MR      36   .57
13.    Hand Test AOSACT-MOV         MR      116  .27
Self-Report Tests
18.    PAI AGG                      F       169  .26
4.     Aggression Questionnaire PA  S       91   .33
17.    BDHI Assault                 F       60   .26
15.    BDHI Assault                 C       51   .40
7.     PAI AGG-P                    F       127  .20
Note. C = Clinical; MR = Mentally Retarded; F =
Forensic; S = Student.

References
- Campbell, D.T., & Fiske, D.W. (1959). Convergent
and discriminant validation by the
multitrait-multimethod matrix. Psychological
Bulletin, 56, 81-105.
- Cohen, J. (1988). Set correlation and contingency
tables. Applied Psychological Measurement, 12(4),
425-434.
- Meyer, G.J., Finn, S.E., Eyde, L.D., Kay, G.G.,
Moreland, K.L., Dies, R.R., Eisman, E.J.,
Kubiszyn, T.W., & Reed, G.M. (2001).
Psychological testing and psychological
assessment. American Psychologist, 56(2),
128-165.
- See Handout for the List of Reviewed Studies.
Results cont.
Table 1 shows the effect sizes for performance
personality tests and self-report measures as
compared to behavior, which range from .20 to .57.
Summary statistics were also computed for
self-report and performance personality test
effect sizes by taking the mean of each grouping
of tests weighted by N. As shown in Table 2, both
performance personality and self-report tests had
overall medium effect size relationships with
aggressive behavior (r = .31 and r = .27,
respectively). Table 3 shows what happens to the
effect sizes when slightly different constructs
are measured using different methods and when the
same constructs are measured by the same methods.
Again, the effect sizes in the table are weighted
to compensate for the varying sample sizes.
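
The summary statistics described above are means of r weighted by N. A minimal sketch of that calculation, using the study values listed in Table 1, is given below. The variable and function names are hypothetical, and the weighting is applied to r directly with no Fisher z transformation, which is an assumption based on the description "weighted by N".

```python
# (study number, N, r) as listed in Table 1.
performance = [(3, 94, 0.27), (14, 36, 0.57), (13, 116, 0.27)]
self_report = [
    (18, 169, 0.26),
    (4, 91, 0.33),
    (17, 60, 0.26),
    (15, 51, 0.40),
    (7, 127, 0.20),
]


def weighted_mean_r(studies):
    """Mean of r across studies, with each study weighted by its sample size N."""
    total_n = sum(n for _, n, _ in studies)
    return sum(n * r for _, n, r in studies) / total_n


if __name__ == "__main__":
    groups = [
        ("Performance Personality Tests", performance),
        ("Self-Report Tests", self_report),
        ("Total Tests", performance + self_report),
    ]
    for label, studies in groups:
        k = len(studies)
        total_n = sum(n for _, n, _ in studies)
        print(f"{label}: k = {k}, N = {total_n}, rw = {weighted_mean_r(studies):.2f}")
```

Run on the Table 1 values, this gives weighted means that round to .31, .27, and .29, matching the rw column of Table 2.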
Method
Studies were located by conducting a PsycINFO
search of articles published within the past 30
years with Aggressive Behavior, Antisocial
Behavior, or Violence as Subject terms.
These were further limited by a classification
code of personality scales and inventories or
clinical psychological testing. This limit was
applied because of the high volume of articles
retrieved without the classification code
(19,651). The limit reduced the number of
articles to 387. These remaining articles were
kept or eliminated based on the following
criteria. The tests in the study had to include
a self-report or performance personality measure
of aggression, anger, or antisocial behavior. The
next criterion was that the study used an
observational measure that could be correlated
with the self-report or performance personality
test.