Title: The end of construct validity
1. The end of construct validity
- Denny Borsboom
- University of Amsterdam
2. Two kinds of validity
- The working researcher's idea: Validity concerns
the question of whether a test measures what it
should measure
- The construct validity idea: Validity is an
evaluative, integrated judgement of the degree to
which test score interpretations are justified in
the light of empirical evidence and theoretical
rationales (and, possibly, social consequences
that follow from test use)
3. What I will argue
- The working researcher's conception is
theoretically and practically superior
- The construct validity position has some
sophistication, but that is mainly window dressing;
in general, it precisely misses the point of what
validity is
4. The pillars of construct validity
- Construct validity is
  - an evaluative judgement
  - about test score interpretations
  - in terms of constructs
  - that is a function of evidence
  - and a matter of degree
- I will argue that this view
  - does not align with the working researcher's view
    at all
  - has quite unreasonable consequences that one
    should not be comfortable with
6. Why construct validity theory is dysfunctional
7. The social consequences of construct validity
theory
8. The social consequences of construct validity
theory
9. A black hole that traps all psychometric problems
10. Why construct validity has nothing to do with
tests (and why this is wrong)
11. Every interpretation can have construct validity
12. There are as many construct validities as there
are judges
13. Measurement instruments can become valid
14. Some measurement instruments were valid...
15. ...but then ceased to be valid...
16. Reference is unimportant
- Aether
- DNA
- Black hole
- Phlogiston
17. Validity depends on the presence of interpreters
18. How construct validity is sold
- Construct validity is an evaluative, integrated
judgement of the degree to which test score
interpretations are justified in the light of
empirical evidence and theoretical rationales
(and, possibly, social consequences that follow
from test use)
19. What construct validity really is
- Somebody's evaluative, integrated and fluctuating
judgement of the degree to which test score
interpretations, which may have nothing to do with
measurement, are justified in the light of
time-dependent empirical evidence and that
person's theoretical rationales (and, possibly,
that person's guesses about social consequences
that follow from test use as well as his or her
valuation of these outcomes)
20. Why all this sophistication misses the point
21.
- Construct validity is an evaluative, integrated
judgement of the degree to which test score
interpretations are justified in the light of
empirical evidence and theoretical rationales
(and, possibly, social consequences that follow
from test use)
- However, validity is...
  - a property, not a judgment
  - a property of instruments, not of inferences
  - a function of truth, not of evidence
  - the object of validation research, not its result
22. VALIDITY
23. A simple alternative
- A test is valid for measuring an attribute if and
only if variation in the attribute causally
produces variation in the measurement outcomes
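A minimal sketch of what this definition asks for, assuming a toy data-generating process that is not part of the talk. The names `background`, `test_a`, and `test_b` and all numeric settings are hypothetical. Both tests correlate with the attribute, but only `test_a` responds when the attribute is set directly to a chosen level.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, set_attribute=None, seed=1):
    """Toy population (assumed, for illustration only).

    test_a: the attribute causally produces the score.
    test_b: the score only shares a background cause with the attribute.
    set_attribute: if given, the attribute is fixed at that level for
    everyone, which cuts its link to the background variable.
    """
    rng = random.Random(seed)
    attribute, test_a, test_b = [], [], []
    for _ in range(n):
        background = rng.gauss(0, 1)
        level = background + rng.gauss(0, 1)
        if set_attribute is not None:
            level = set_attribute
        attribute.append(level)
        test_a.append(level + rng.gauss(0, 0.5))
        test_b.append(background + rng.gauss(0, 0.5))
    return attribute, test_a, test_b


# Observational data: both tests correlate with the attribute.
attribute, test_a, test_b = simulate()
corr_a = pearson(attribute, test_a)
corr_b = pearson(attribute, test_b)

# Setting the attribute to a low and a high level (same random seed).
_, low_a, low_b = simulate(set_attribute=-2.0)
_, high_a, high_b = simulate(set_attribute=2.0)
shift_a = mean(high_a) - mean(low_a)
shift_b = mean(high_b) - mean(low_b)

print("correlation with attribute:", round(corr_a, 2), round(corr_b, 2))
print("shift in mean score when attribute changes:", shift_a, shift_b)
```

Under these assumptions only `test_a` would count as valid for the attribute in the sense of the definition, even though `test_b` also correlates with it.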
26-29. [Figure, built up over four slides: diagram linking Attribute structure, Response process, and Score structure]
30-33. [Figure, built up over four slides: diagram linking g, a Response process, and IQ-scores X (82, 134, 70, 115, 99), with a Formal model f(X | ?) of the IQ-score patterns and a Substantive theory; question marks are added to the diagram on slides 31-33]
34. Where to look for validity
- Traditionally, evidence for validity is sought in
external relations: relations between test scores
and other test scores
- In criterion validity, the evidence comes from
correlations with a criterion (or with the
criterion)
- In construct validity, the evidence comes from
correlations with lots of other variables (MTMMs)
35. [Figure: network of correlations (values from .09 to .78) among IQ-scores and other variables: Attractiveness, Working memory, Extraversion, Masculinity, Race, Visual memory, Job performance, Annual income, Sex, Numerical ability, SES, Physique, Length, Genetic differences]
But even if we knew all correlations between all
conceivable tests, the validity problem would
remain
36. Where to look for validity
- Validity is not a matter of external relations
between the test scores and other test scores
- It is a matter of which processes take attribute
differences into response differences
- For many tests we have no idea of what happens
between item administration and item response
- This is the reason that the validity problem has
proven hard to crack
37. Where to look for validity
- Ingredients for validity
  - A theory on the structure of the attribute
  - A theory on the processes that take levels of the
    attribute into observed score patterns
  - A formal model to test the theory against data
- The question of validity then becomes: is this
theory true?
38. Example: The balance scale test
[Figure: three balance scale items: Weight item, Distance item, Conflict Weight item]
What happens when the blocks are removed?
39. Example: The balance scale test
- Theory on the structure of the attribute
  - Cognitive development involves an ordered series
    of discrete transitions between stages
- Theory on the processes that take levels of the
attribute into observed score patterns
  - Children in different developmental stages use
    different cognitive rules to solve balance scale
    items, which results in different response
    patterns
- Statistical model to test the theory against data
  - Developmental stages are conceptualized as latent
    classes with theoretically driven response
    vectors
40. Balance scale: Test scores
[Figure: diagram linking Developmental stages (Latent classes), through a Response process (Rule 1, Rule 2, Rule 3), to test score patterns X (001100, 111100, 001100, 110011, 110011), with the model P(X = x | ?)]
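The latent class idea can be sketched in a few lines. This is an illustration under stated assumptions, not the model used in the talk: the three response patterns are taken from the slide, but the assignment of patterns to rules, the equal class sizes, and the single error rate are all hypothetical. In an actual analysis, class sizes and error rates would be estimated from data.

```python
from math import prod

# Ideal response vectors on six items (1 = correct, 0 = incorrect).
# The patterns appear on the slide; which rule yields which pattern
# is an assumption made for this sketch.
IDEAL_PATTERNS = {
    "Rule 1": (0, 0, 1, 1, 0, 0),
    "Rule 2": (1, 1, 1, 1, 0, 0),
    "Rule 3": (1, 1, 0, 0, 1, 1),
}
CLASS_SIZES = {"Rule 1": 1 / 3, "Rule 2": 1 / 3, "Rule 3": 1 / 3}  # assumed
ERROR_RATE = 0.1  # assumed chance that a response deviates from the rule


def parse(pattern):
    """Turn a string such as '001100' into a tuple of ints."""
    return tuple(int(ch) for ch in pattern)


def pattern_probability(observed, ideal, error_rate=ERROR_RATE):
    """P(X = x | class), with items independent given the latent class."""
    return prod(
        (1 - error_rate) if x == expected else error_rate
        for x, expected in zip(observed, ideal)
    )


def class_posterior(observed):
    """P(class | X = x) by Bayes' rule over the latent classes."""
    joint = {
        rule: CLASS_SIZES[rule] * pattern_probability(observed, ideal)
        for rule, ideal in IDEAL_PATTERNS.items()
    }
    total = sum(joint.values())
    return {rule: value / total for rule, value in joint.items()}


# Classify a few observed patterns; the last one contains one deviation.
for pattern in ("001100", "111100", "110011", "001000"):
    posterior = class_posterior(parse(pattern))
    best = max(posterior, key=posterior.get)
    print(pattern, best, round(posterior[best], 3))
```

If observed response patterns cluster around the vectors the rules predict, that supports the theory of response behavior. If they do not, the theory is in trouble.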
41.
- The question of validity
  - Is this theory of response behavior correct?
42. How does this relate to other issues?
- The validity concept is usually applied to many
questions simultaneously
  1. Does the test measure the intended attribute?
  2. How well do the test scores predict other
     attributes?
  3. Is the use of the test legally defensible?
  4. Will using the test improve the human condition?
- ...which are put under one umbrella; I only deal
with (1)
- (2-...) are better left to psychometrics, law,
politics, etc.
44. Does this mean that other issues are unimportant?
- No. Interpretations, uses, and consequences
matter a great deal
- But they are not thereby issues of validity
- Moreover, they usually belong in the public
sphere, not in the domain of validity theory
45. Bottom line
- To find out what you measure, you have to find
out how your instrument works; there is no other
way
- If you know how your instrument is supposed to
work, and you know how it works, you have a
definite answer to the validity problem
- However, if you don't know how your instrument is
supposed to work, and you don't know how it
works, you are in trouble
47. Validity is...
...measuring the right thing