Title: Untitled slide
1Lecture 10
STRATEGIES OF TEST DEVELOPMENT
2References
- Murphy, K. R., & Davidshofer, C. O. (1998).
Psychological testing: Principles and
applications, International Edition (6/e). Upper
Saddle River, NJ: Prentice-Hall, Inc. (Chapter
11).
3Strategy of test development - definition
A strategy of test development is a well-planned
and carefully executed procedure aimed at
maximizing the validity of the test.
All strategies rest on questions about the
existence of the given psychological phenomena.
4Types of strategies
- Deductive (theoretical).
- Inductive (internal).
- Criterion-oriented (external).
5Aspects of strategy
- The theoretical basis of the strategy
- Crucial phase of the strategy
- Psychometric model applied
- Final psychological, diagnostic and
psychometric properties of the test
- Examples of the instruments
6Theoretical basis of the strategy
- Deductive (theoretical) strategy is based on
theory. It was previously called the rational
strategy. Its goal is the diagnosis of traits
described by a theoretical concept (the traits
are known from the beginning), so the whole
procedure should confirm the theoretical
expectations and yield an instrument that
assesses a given set of psychological constructs.
7Theoretical basis of the strategy
- Inductive (internal) strategy is based on
methodology. Its goal is the diagnosis of the
traits that explain a set of behaviors (those
traits become known only at the end of the
process). The whole procedure should lead to the
discovery of this set of traits and to an
instrument that makes it possible to assess them.
8Theoretical basis of the strategy
- Criterion-oriented (external) strategy is based
on knowledge about the psychological syndrome
related to a given criterion. Its goal is the
diagnosis of the personality type (or syndrome
of traits) that diagnoses (or predicts) the
criterion. The whole procedure should confirm
the expectations about this syndrome and yield
an instrument that makes it possible to assess
it.
9Special assumptions of the external strategy
- The syndrome of traits diagnoses (predicts) the
criterion.
- The syndrome of traits may be assessed by the
inventory.
- The inventory is used to diagnose the criterion
and, indirectly, to identify this syndrome of
traits.
10Crucial phase of the strategy
- Deductive (theoretical) strategy - a
representative pool of items designed to measure
the traits chosen a priori. In this strategy,
procedures for assessing content validity are
applied and the items are newly written.
11Crucial phase of the strategy
- Inductive (internal) strategy - a very broad
pool of items, representative of all forms of
human behavior (otherwise some traits may not be
covered). In this strategy item banks are used
(based on dictionaries or on other instruments,
such as Cattell's 16PF).
12Crucial phase of the strategy
- Criterion-oriented (external) strategy - a broad
pool of items specific to the given syndrome only
(otherwise some aspects of the syndrome may not
be covered, or other syndromes will also be
diagnosed). In this strategy items are typically
borrowed from other instruments (for example,
clinical inventories borrowing from the MMPI).
13Psychometric model
- Deductive (theoretical) strategy - item-total
correlation, leading to the selection of the
items that best measure the assumed trait (as
assessed by the whole set of items).
- Criterion-oriented (external) strategy -
item-criterion correlation, leading to the
selection of the items that best measure the
criterion.
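Both selection rules can be sketched with one routine: correlate every item with a target and keep the items that reach a cutoff. The sketch below is a minimal illustration, not a prescribed procedure. The function names, the example data and the cutoff of 0.30 are assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def select_items(responses, target, cutoff=0.30):
    """Return indices of items whose correlation with `target` reaches `cutoff`.

    responses: one list of item scores per examinee.
    target: total scores (deductive strategy) or criterion scores
            (criterion-oriented strategy), one value per examinee.
    cutoff: assumed threshold, chosen here only for illustration.
    """
    n_items = len(responses[0])
    keep = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        if pearson(item, target) >= cutoff:
            keep.append(j)
    return keep


# Hypothetical data: 6 examinees, 3 items scored 0-3.
responses = [
    [0, 1, 3],
    [1, 0, 2],
    [1, 2, 3],
    [2, 2, 0],
    [3, 3, 1],
    [3, 2, 2],
]
totals = [sum(row) for row in responses]   # internal target (test score)
criterion = [0, 0, 1, 1, 1, 1]             # hypothetical external criterion

by_total = select_items(responses, totals)
by_criterion = select_items(responses, criterion)
print(by_total, by_criterion)
```

The only difference between the two strategies in this sketch is the target passed to `select_items`: the test's own total score or an external criterion.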
14Psychometric model
- Inductive (internal) strategy - exploratory
factor analysis, leading to the selection (on the
basis of factor loadings) of the items that best
measure the traits. Factor analysis is used to
discover the structure of traits and the set of
items measuring those traits.
15Final psychological properties of the instrument
- Deductive (theoretical) strategy - promotes
theory development, or may lead to reformulation
(or even falsification) of the theory.
- Inductive (internal) strategy - may become the
basis for a new theory, or leads to a simple
description of personality.
16Final psychological properties of the instrument
- Criterion-oriented (external) strategy -
enriches the knowledge about personality
syndromes and provides very useful instruments,
but leads to theoretical chaos (thousands of
personality constructs, a violation of Ockham's
razor).
17Final diagnostic properties of the instrument
- Deductive (theoretical) strategy - the
instrument diagnoses traits from some domains of
personality (typically not the whole
personality).
- Inductive (internal) strategy - the instrument
diagnoses the basic dimensions of personality
(very general traits).
- Criterion-oriented (external) strategy - the
instrument diagnoses the syndrome of traits (the
personality type).
18Final psychometric properties of the instrument
- Inductive (internal) strategy - scales are very
economical (short), with high reliability, very
high construct-oriented validity (MT-MM matrix,
factor analysis) and low criterion-related
validity (in a multiscale instrument, the
criterion-related validity of the whole
inventory is acceptable).
19Final psychometric properties of the instrument
- Criterion-oriented (external) strategy - scales
have very high convergent validity with their
criterion, but unpredictable discriminant
validity with respect to other criteria; the
scales are not economical (very long), with low
reliability and low construct-oriented validity
(factor analysis).
20Final psychometric properties of the instrument
- Deductive (theoretical) strategy - scales of
intermediate properties: economical, with
sufficient (but not very high) reliability and
criterion-related validity; acceptable
construct-oriented validity, especially in its
convergent aspect, but weak in its discriminant
aspect (MT-MM matrix, factor analysis).
21Conclusions
- Each strategy has its advantages and
disadvantages; for specific purposes a given
classical strategy may still be applied in modern
psychology.
- The suggested "golden mean" of test development
is to use mixed or sequential strategies (which
preserve the strong points and eliminate the
weak ones).
22Lecture 11
THE PROCEDURE OF TEST DEVELOPMENT
23References
- Murphy, K. R., & Davidshofer, C. O. (1998).
Psychological testing: Principles and
applications, International Edition (6/e). Upper
Saddle River, NJ: Prentice-Hall, Inc. (Chapters
10-11).
24The general properties of the procedure of test
development
- The process of test development is long and
comprises several phases. For these reasons it
should be planned and carried out very carefully.
Some phases of the procedure are common to all
tests, but most of them are specific to the
chosen strategy of test development.
25Phases of test development
Phase I - the choice of the strategy and the plan
of constructing the test.
Phase II - analysis of the theoretical basis of
the instrument: theoretical constructs,
methodology, or knowledge about the syndrome and
criterion.
Phase III - generation of the items (forming the
item pool).
Phase IV - writing the test items.
26The properties of the good personality test item
- A good personality test item should be:
- Comprehensible:
- short,
- kept in a simple and positive grammatical form
(no negations or double negations),
- written using well-known expressions,
- in Polish: given in both female and male forms
- Unambiguous with regard to content
- Free from social desirability
27The properties of the good personality item
- A good personality test item should have:
- An extended response format:
- an optimal number of answer categories, from 4
up to 7
- An appropriate scoring rule for the answers:
- numbers with regular intervals
- the best solution: natural numbers anchored at
zero
28Phases of test development
- Phase V - pilot testing of the items
- Content validity (in the case of the theoretical
strategy) and
- comprehensibility,
- ambiguity,
- social desirability, checked in the itemmetric
analyses.
29Phases of test development
Phase VI - preparation of the test for empirical
studies: the title of the test, the instructions,
a random order of items in inventories, and a
progressive order of items with regard to their
difficulty in ability tests.
Phase VII - testing a sample of examinees and
psychometric analyses of the data, leading to the
final form of the test.
30Sample requirements
- Sample size - sufficiently large: 5-10 examinees
per item (psychometrics does not tolerate small
samples).
- Sample composition - heterogeneous with regard
to gender, age, education level, job, region of
residence, etc.
- Number of samples - at least two fully
comparable samples, to allow cross-validation.
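The arithmetic behind these requirements can be sketched as follows. The helper names are hypothetical, and the random half-split is an assumption: it is only one simple way of obtaining two comparable samples for cross-validation.

```python
import random


def required_sample_size(n_items, per_item_min=5, per_item_max=10):
    """Return the (lower, upper) bounds of the 5-10 examinees-per-item rule."""
    return n_items * per_item_min, n_items * per_item_max


def split_for_cross_validation(examinee_ids, seed=0):
    """Randomly split examinees into two halves of (nearly) equal size."""
    ids = list(examinee_ids)
    random.Random(seed).shuffle(ids)   # seeded, so the split is reproducible
    half = len(ids) // 2
    return ids[:half], ids[half:]


# Hypothetical pool of 100 items: the rule asks for 500 to 1000 examinees.
low, high = required_sample_size(100)
sample_a, sample_b = split_for_cross_validation(range(low))
print(low, high, len(sample_a), len(sample_b))
```

A random split does not by itself guarantee heterogeneity; the composition of each half with respect to gender, age, education and so on would still need to be checked.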
31Item analysis - a definition and indices
Item analysis - a psychometric analysis that
comprises item difficulty, item-total correlation
(discriminative power) and the external validity
of the item.
Item-total correlation - the correlation between
the item and the test as a whole; it indicates
how well the item measures the trait that the
whole test measures.
32Item analysis - a definition and indices
Item difficulty - the proportion of correct
answers to the item (or the mean item score
relative to the maximum possible score on the
answer scale).
External validity of the item - the correlation
between the criterion and the item; it indicates
how well the item diagnoses (or predicts) the
criterion.
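The definition of item difficulty translates directly into a short computation. The sketch below is illustrative only; the function name and the example answers are assumptions, and items are taken to be scored from zero up to `max_score`.

```python
def item_difficulty(item_scores, max_score=1):
    """Mean item score divided by the maximum possible score.

    For a dichotomous item (0 = wrong, 1 = correct) this equals the
    proportion of correct answers.
    """
    return sum(item_scores) / (len(item_scores) * max_score)


# Hypothetical answers of 8 examinees to an ability item (5 correct).
ability_item = [1, 0, 1, 1, 0, 1, 1, 0]
# Hypothetical answers of 5 examinees on a 0-4 rating scale.
rating_item = [0, 2, 4, 3, 1]

p_ability = item_difficulty(ability_item)              # 5 / 8
p_rating = item_difficulty(rating_item, max_score=4)   # 10 / 20
print(p_ability, p_rating)
```

Note that under this definition a higher value means an easier item (more examinees answer it correctly or endorse it strongly).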
33The item-total correlation
- Biserial correlation coefficient
- Point-biserial correlation coefficient
- Phi coefficient
- Corrected item-total correlation (CITC).
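Two of the listed coefficients have simple closed forms, sketched below under the assumption that items are scored 0/1. The function names and data are hypothetical. The biserial coefficient additionally requires the ordinate of the normal distribution and is not shown here.

```python
from math import sqrt


def point_biserial(item, total):
    """Point-biserial correlation between a 0/1 item and a continuous score."""
    n = len(item)
    ones = [t for i, t in zip(item, total) if i == 1]
    zeros = [t for i, t in zip(item, total) if i == 0]
    p = len(ones) / n                  # proportion answering 1
    q = 1 - p
    mean_all = sum(total) / n
    # Population standard deviation of the continuous score.
    sd = sqrt(sum((t - mean_all) ** 2 for t in total) / n)
    m1 = sum(ones) / len(ones)
    m0 = sum(zeros) / len(zeros)
    return (m1 - m0) / sd * sqrt(p * q)


def phi(x, y):
    """Phi coefficient between two 0/1 variables, from the 2x2 table."""
    a = sum(1 for u, v in zip(x, y) if u == 1 and v == 1)
    b = sum(1 for u, v in zip(x, y) if u == 1 and v == 0)
    c = sum(1 for u, v in zip(x, y) if u == 0 and v == 1)
    d = sum(1 for u, v in zip(x, y) if u == 0 and v == 0)
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical data: one dichotomous item and the total test score.
item = [1, 0, 1, 1, 0, 0]
total = [8, 3, 7, 9, 4, 5]
r_pb = point_biserial(item, total)

# Hypothetical data: two dichotomous variables (e.g. item and criterion).
r_phi = phi([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
print(round(r_pb, 3), round(r_phi, 3))
```

Both coefficients are special cases of the Pearson correlation: the point-biserial for one dichotomous variable, phi for two.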
34Corrected item-total correlation coefficient
Corrected item-total correlation (CITC) - a
coefficient obtained by analysis of variance (as
a simple effect); it indicates the correlation
between the item and the total test score without
the score of the given item (so there is no
auto-correlation effect), that is, the
correlation between the item and the sum of the
other items in the scale.
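The last part of this definition (the correlation between the item and the sum of the other items) can be computed directly. The sketch below is a minimal illustration with hypothetical data and function names; it uses a plain Pearson correlation rather than the analysis-of-variance route mentioned above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def item_total(responses, corrected=True):
    """Item-total correlation for every item.

    With corrected=True the item's own score is removed from the total
    before correlating, so the item is not correlated with itself.
    """
    totals = [sum(row) for row in responses]
    result = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        if corrected:
            target = [t - s for t, s in zip(totals, item)]  # sum of other items
        else:
            target = totals
        result.append(pearson(item, target))
    return result


# Hypothetical data: 6 examinees, 3 items scored 0-3.
responses = [
    [0, 1, 3],
    [1, 0, 2],
    [1, 2, 3],
    [2, 2, 0],
    [3, 3, 1],
    [3, 2, 2],
]
uncorrected = item_total(responses, corrected=False)
citc = item_total(responses, corrected=True)
print([round(r, 2) for r in uncorrected])
print([round(r, 2) for r in citc])
```

In this small example every corrected coefficient is lower than its uncorrected counterpart, which illustrates how the item's own score inflates the uncorrected value, especially in short scales.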
35External validity and factor validity indices
External validity - correlation with the
criterion (Pearson's coefficient or phi
coefficient).
Factor validity of the item - the factor loading
on its own factor and on other factors.
36Phases of test development
Phase VIII - test norms and publication of the
test (handbook).
Phase IX - revision of the test.
37The content of the test handbook
- Theoretical basis of the test
- Procedure of test development
- Reliability and validity indices (with sample
characteristics)
- The description of test administration and the
rules for scoring and interpreting the test
scores (standardization and objectivity)
- Norms (tables of norms).
38Revision of the test
The test should be revised after some years of
use, that is, a new form of the test should be
developed. The revision should be carried out
both at the item level and for the test as a
whole. It should be done after about 20 years of
use of the test in psychological practice,
because of changes in culture or weak points of
the test (by then the constructor has results
from many studies, which are needed to draw
conclusions about the desired test properties).
39Example development of the FCB-TI
- Stage I
- Regulative Theory of Temperament: separation
and analysis of the energetic and temporal
characteristics of temperament
- Stage II
- Generation of items (students, N = 15;
psychologists, N = 5)
- Writing the items: an initial pool of 600 items
- Stage III
- Linguistic assessment of items (students,
N = 30; linguist)
40Example development of the FCB-TI
- Stage IV.
- Evaluation of content validity
- Sorting the items into scales (experts, N = 12)
- Evaluation of the items within scales (experts,
N = 9)
- Sorting the items into subscales (experts,
N = 12)
- Result: reduction to 392 items
41Example development of the FCB-TI
- Stage V.1
- Evaluation of the formal characteristics of the
items: itemmetric analyses (N = 334)
- Reduction to 381 items
- Stage V.2
- Research on the preferred answer format
(N = 334)
42Example development of the FCB-TI
- Stage VI.
- Final preparation of the inventory: title,
instructions, item order (381 items)
- Stage VII.
- Administration of the questionnaire to 2
samples: development sample (N = 1012) and test
sample (N = 1011)
43Example development of the FCB-TI
- Stage VIII. Psychometric studies on the inventory
- Construction of subscales based on corrected
item-total correlation
- Factor analysis, separate for energetic and
temporal characteristics
- Construction of scales: reduction from 12 to 6
scales, selection of the best items with regard
to corrected item-total correlation
- Result: reduction to 120 items (6 scales)
44Example development of the FCB-TI
- Stage IX.
- Verification of the psychometric properties of
the questionnaire
- Distributions of test scores
- Internal consistency: 5 samples (1: N = 828,
2: N = 475, 3: N = 527, 4: N = 392, 5: N = 407)
- Temporal stability (interval of 2 weeks - 6
months)
45Example development of the FCB-TI
- Stage X.
- Evaluation of test validity
46Schematic procedure for constructing the
inventory Formal Characteristics of Behaviour -
Temperament Inventory (FCB-TI; Polish: FCZ-KT)
- Evaluation of the validity of FCB-TI measurement
- 1. Comparative analysis of the diagnosis of
temperament traits by self-report and by rating
- 2. Comparative analysis of the diagnosis of
temperament traits by self-report and by
laboratory methods (selected psychophysical and
psychophysiological indices)
- 3. Analysis of the relations between temperament
traits and other biological dimensions of
personality
- 4. Study of the relationships between
temperament traits and abilities
- 5. Analysis of the functional significance of
temperament: temperament traits versus
occupational adaptation and health status
- 6. Analysis of the relations between the
structure of temperament and behavior styles
associated with the risk of psychosomatic disease
- 7. Study of the structure of temperament in ill
persons
47Example development of the FCB-TI
- Stage XI.
- Norming (development of test norms)
- Stage XII.
- Preparation of test manual
48Assumptions of the Classical Test Theory and Item
Response Theory
The relationship between the item score and the
test score is:
- Linear - Classical Test Theory (it can be
assessed by simple correlations)
- Curvilinear - Item Response Theory (it can be
assessed by curvilinear regression).
49Fig. 28. The item characteristic curve (ICC)
50Parameters of the item characteristic curve
- Difficulty parameter
- Discrimination parameter
- Guessing parameter
- Carelessness parameter
51Parameters of the item characteristic curve
- Difficulty - the intensity of the trait
necessary to have a chance higher than 0.50 of a
correct answer
- Discrimination - the slope of the curve near
the difficulty point
- Guessing - the difference of the probability
from zero among low test scorers
52Parameters of the item characteristic curve
- Carelessness - the difference of the probability
from one among high test scorers.
- Typically, two-parameter models are applied
(difficulty and discrimination).
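The four parameters can be combined in a single curve. The sketch below assumes a logistic form for the ICC, which is a common choice in Item Response Theory but is not stated on the slides; the function and argument names are hypothetical.

```python
from math import exp


def icc(theta, difficulty, discrimination=1.0, guessing=0.0, carelessness=0.0):
    """Probability of a correct (keyed) answer at trait level `theta`.

    Assumed logistic curve with lower asymptote `guessing` and upper
    asymptote `1 - carelessness`. With both left at 0.0 this is the
    two-parameter model (difficulty and discrimination only).
    """
    logistic = 1.0 / (1.0 + exp(-discrimination * (theta - difficulty)))
    return guessing + (1.0 - guessing - carelessness) * logistic


# Two-parameter model: at theta equal to the difficulty the chance is 0.50.
p_at_difficulty = icc(theta=0.5, difficulty=0.5, discrimination=1.7)

# Guessing lifts the curve above zero for low scorers,
# carelessness keeps it below one for high scorers.
p_low = icc(theta=-10.0, difficulty=0.0, guessing=0.2)
p_high = icc(theta=10.0, difficulty=0.0, carelessness=0.1)
print(p_at_difficulty, round(p_low, 3), round(p_high, 3))
```

In the two-parameter case the probability exceeds 0.50 exactly when the trait level exceeds the item's difficulty, and a larger discrimination value makes the curve steeper around that point.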
53Fig. 29. ICCs for different items
54Fig. 30a. ICC for item no 35 from ER scale
(FCB-TI inventory)
55Fig. 30b. ICC for item no 102 from ER scale
(FCB-TI inventory)