Title: 46-320-01 Tests and Measurements
2 Writing Items
- DeVellis (1991)
- Define clearly what you want to measure
- Generate an item pool
- Avoid long items
- Appropriate level of reading
- Avoid double-barreled items
- Mix positively and negatively worded items
- Cultural/ethnic sensitivity
3 Item Format
- Dichotomous format
- Two alternatives
- Pros: ease of construction and scoring, absolute judgment
- Cons: encourages memorization, 50% chance of being correct by guessing
4 Item Format
- Polytomous format
- More than two alternatives
- Pros: less chance of guessing correctly, takes little time, distractors
- Corrected scores
- Guessing?
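The "corrected scores" bullet can be illustrated with the standard correction-for-guessing formula, corrected = R - W / (n - 1), where R is the number right, W the number wrong (omitted items are not counted), and n the number of alternatives per item. This is a minimal sketch; the function name and the example numbers are illustrative assumptions, not taken from the slides.

```python
def corrected_score(right: int, wrong: int, n_alternatives: int) -> float:
    """Correction for guessing: R - W / (n - 1). Omitted items are not counted."""
    if n_alternatives < 2:
        raise ValueError("an item needs at least two alternatives")
    return right - wrong / (n_alternatives - 1)


# Hypothetical test-taker: 30 right, 12 wrong, 8 omitted on a 4-option test.
print(corrected_score(30, 12, 4))  # 30 - 12/3 = 26.0
```

With only two alternatives the penalty is one point per wrong answer, which is why guessing matters most on dichotomous items.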
5 Item Format
- Likert format
- Degree of agreement
- Five alternatives vs. six
- Reverse scoring
- Category format
- 10-point scale: why 10?
- Remember context
- Visual Analogue scale
- 100 mm line
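Reverse scoring for Likert items can be sketched as follows. On a scale running from low to high, a negatively worded item is flipped with low + high - response before the items are summed. The item names (q1, q2, q3) and the 1 to 5 scale are assumptions made for illustration.

```python
def reverse_score(response: int, low: int = 1, high: int = 5) -> int:
    """Flip a response on a low..high scale (low <-> high, midpoint unchanged)."""
    if not low <= response <= high:
        raise ValueError(f"response {response} is outside {low}..{high}")
    return low + high - response


def scale_total(responses: dict, reversed_items: set, low: int = 1, high: int = 5) -> int:
    """Sum all items, reverse-scoring the negatively worded ones first."""
    return sum(
        reverse_score(value, low, high) if name in reversed_items else value
        for name, value in responses.items()
    )


# Hypothetical answers; q2 is a negatively worded item.
answers = {"q1": 5, "q2": 1, "q3": 4}
print(scale_total(answers, {"q2"}))  # 5 + (6 - 1) + 4 = 14
```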
6 Item Format
- Checklist
- Usually adjectives
- Q-Sort
- Increases options (9)
- Form normal distribution
7 Item Analysis
- Purpose: shorten a test and increase reliability and validity
- Item difficulty
- Proportion who get the item correct
- Probability of answering correctly by chance
- Optimum level
- Variable difficulty (0.3 to 0.7)
- Internal criterion: total test score
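Item difficulty and the optimum level can be sketched in a few lines. The optimum here follows one common rule of thumb, halfway between the chance level and 1.0; that rule is an assumption about what the "optimum level" bullet refers to, and the response data are made up.

```python
def item_difficulty(responses: list) -> float:
    """Proportion of test-takers who got the item correct (1 = correct, 0 = not)."""
    return sum(responses) / len(responses)


def optimum_difficulty(n_alternatives: int) -> float:
    """Rule of thumb: halfway between chance performance and a perfect 1.0."""
    chance = 1 / n_alternatives
    return (1.0 + chance) / 2


print(item_difficulty([1, 1, 0, 1, 0]))  # 3 of 5 correct = 0.6
print(optimum_difficulty(4))             # (1.0 + 0.25) / 2 = 0.625
print(optimum_difficulty(2))             # (1.0 + 0.5) / 2 = 0.75
```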
8 Discriminability
- Extreme group method
- Discrimination index
- Negative discriminator
- Point Biserial method
- Small test n
- The higher the correlation, the better the item
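The point biserial method correlates a right/wrong item with the total test score. A minimal sketch, using the textbook form r_pb = ((mean of those who passed - mean of all) / SD of all) * sqrt(p / (1 - p)) with the population SD; the data are invented for illustration.

```python
from math import sqrt


def point_biserial(item: list, totals: list) -> float:
    """Point biserial correlation between a 0/1 item and total test scores."""
    n = len(item)
    p = sum(item) / n  # proportion passing the item
    if p in (0.0, 1.0):
        raise ValueError("item must have both correct and incorrect responses")
    mean_all = sum(totals) / n
    mean_pass = sum(t for i, t in zip(item, totals) if i == 1) / sum(item)
    sd_all = sqrt(sum((t - mean_all) ** 2 for t in totals) / n)  # population SD
    if sd_all == 0:
        raise ValueError("total scores have no variance")
    return (mean_pass - mean_all) / sd_all * sqrt(p / (1 - p))


# Hypothetical class of 8: item responses and total test scores.
item = [1, 1, 1, 0, 0, 1, 0, 0]
totals = [9, 8, 7, 4, 5, 6, 3, 2]
print(round(point_biserial(item, totals), 3))  # 0.873
```

With so few test-takers the estimate is unstable, which is one reading of the "small test n" caution.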
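9 Discrimination
| Item | U (20) | M (20) | L (20) | Difficulty (U + M + L) | Discrimination (U - L) |
|---|---|---|---|---|---|
| 1 | 15 | 9 | 7 | 31 | 8 |
| 2 | 20 | 20 | 16 | 56 | 4 |
| 3 | 19 | 18 | 9 | 46 | 10 |
| 4 | 10 | 11 | 16 | 37 | -6 |
| 5 | 11 | 13 | 11 | 35 | 0 |
| 6 | 16 | 14 | 9 | 39 | 7 |
| 7 | 5 | 0 | 0 | 5 | 5 |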
10 Table Explained
- Class n = 60
- Discrimination: rough index, U - L
- Item difficulty: U + M + L
- Items
- 2: too easy
- 7: too difficult
- 4 and 5: negative or zero discriminative value
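The reading of the table can be reproduced with a short script. The counts are the U, M, and L columns from the slide; the 0.9 and 0.1 cutoffs for "too easy" and "too difficult" are assumptions chosen for illustration, not values given in the deck.

```python
# (U, M, L) counts of correct answers per item; 20 test-takers in each group.
counts = {
    1: (15, 9, 7), 2: (20, 20, 16), 3: (19, 18, 9), 4: (10, 11, 16),
    5: (11, 13, 11), 6: (16, 14, 9), 7: (5, 0, 0),
}
CLASS_N = 60


def analyze(upper, middle, lower, n=CLASS_N, easy=0.9, hard=0.1):
    """Return (difficulty, discrimination, flags) for one item."""
    difficulty = upper + middle + lower  # number passing the item
    discrimination = upper - lower       # rough index of discrimination
    flags = []
    if difficulty / n >= easy:
        flags.append("too easy")
    if difficulty / n <= hard:
        flags.append("too difficult")
    if discrimination <= 0:
        flags.append("non-positive discriminator")
    return difficulty, discrimination, flags


for item, (u, m, l_count) in counts.items():
    print(item, *analyze(u, m, l_count))
```

Under these cutoffs the script flags item 2 as too easy, item 7 as too difficult, and items 4 and 5 as non-positive discriminators.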
11 Further Item Analysis
Response options chosen, by group
| Item | Group | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 2 | Upper | 0 | 0 | 0 | 20 | 0 |
| 2 | Lower | 2 | 0 | 1 | 16 | 1 |
| 4 | Upper | 0 | 10 | 9 | 0 | 1 |
| 4 | Lower | 2 | 16 | 2 | 0 | 0 |
| 5 | Upper | 2 | 3 | 3 | 11 | 2 |
| 5 | Lower | 1 | 3 | 3 | 11 | 2 |
| 7 | Upper | 5 | 3 | 5 | 4 | 3 |
| 7 | Lower | 0 | 5 | 8 | 3 | 4 |
12 Discrimination Index Percentages
| Item | Percent Passing (Upper) | Percent Passing (Lower) | Index of Discrimination (D) |
|---|---|---|---|
| 1 | 75 | 35 | 40 |
| 2 | 100 | 80 | 20 |
| 3 | 95 | 45 | 50 |
| 4 | 50 | 80 | -30 |
| 5 | 55 | 55 | 0 |
| 6 | 80 | 45 | 35 |
| 7 | 25 | 0 | 25 |
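The percentages in this table follow from the raw upper and lower counts (20 test-takers per group): D is the percent passing in the upper group minus the percent passing in the lower group. A minimal sketch; the function name is an illustrative assumption.

```python
def discrimination_index(upper_correct: int, lower_correct: int, group_size: int = 20):
    """Return (percent passing upper, percent passing lower, D)."""
    pct_upper = 100 * upper_correct / group_size
    pct_lower = 100 * lower_correct / group_size
    return pct_upper, pct_lower, pct_upper - pct_lower


print(discrimination_index(15, 7))   # item 1: (75.0, 35.0, 40.0)
print(discrimination_index(10, 16))  # item 4: (50.0, 80.0, -30.0)
```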
13 Item Characteristic Curve
- X axis: total test score (trait estimate)
- Y axis: proportion of test-takers who get the item correct
- Often use class intervals
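An empirical item characteristic curve can be tabulated by grouping total scores into class intervals and computing the proportion correct in each. This is a sketch under assumptions: the interval width of 5 and the data are invented, and plotting is left out.

```python
from collections import defaultdict


def empirical_icc(total_scores, item_correct, interval_width=5):
    """Map each class interval (by its lower bound) to the proportion correct."""
    tallies = defaultdict(lambda: [0, 0])  # lower bound -> [n correct, n test-takers]
    for score, correct in zip(total_scores, item_correct):
        start = (score // interval_width) * interval_width
        tallies[start][0] += correct
        tallies[start][1] += 1
    return {start: right / n for start, (right, n) in sorted(tallies.items())}


# Hypothetical total scores and 0/1 responses to one item.
totals = [3, 7, 8, 12, 14, 16, 18, 19]
item = [0, 0, 1, 0, 1, 1, 1, 1]
print(empirical_icc(totals, item))  # {0: 0.0, 5: 0.5, 10: 0.5, 15: 1.0}
```

The keys give the X axis (score intervals) and the values the Y axis; a good item shows the proportion rising with total score.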
14 Discriminability
15 Item Response Theory
- Each item has an item characteristic curve
- Specific range of difficulty can be identified with a test characteristic curve
- Difficulty and discriminability
- Sample items
- Peaked conventional vs. rectangular conventional vs. adaptive
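One common way to write an item characteristic curve with a difficulty and a discriminability parameter is the two-parameter logistic model, P(theta) = 1 / (1 + exp(-a * (theta - b))). The slides do not name a specific model, so the 2PL form here is an assumption; summing the item curves gives a test characteristic curve.

```python
from math import exp


def icc_2pl(theta: float, difficulty: float, discrimination: float) -> float:
    """Probability of a correct response at trait level theta (2PL model)."""
    return 1 / (1 + exp(-discrimination * (theta - difficulty)))


def expected_total_score(theta: float, items: list) -> float:
    """Test characteristic curve: sum of the item curves at theta."""
    return sum(icc_2pl(theta, b, a) for b, a in items)


# Hypothetical items as (difficulty b, discrimination a) pairs.
items = [(-1.0, 1.0), (0.0, 1.5), (1.0, 2.0)]
for theta in (-2, -1, 0, 1, 2):
    print(theta, round(expected_total_score(theta, items), 2))
```

At theta equal to an item's difficulty the probability is 0.5, and a larger discrimination value makes the curve steeper around that point.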
16 Criterion-Referenced Tests
- Specifying objectives aids learning
- Give test to two groups
- Exposed vs. not
- Antimode: cutting score
- Any problems with this?
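The antimode approach can be sketched as follows: pool the scores of the unexposed and exposed groups, then take the least frequent score between the two group modes as the cutting score. This is one possible reading of the procedure, and the scores are invented.

```python
from collections import Counter


def antimode_cutting_score(unexposed: list, exposed: list) -> int:
    """Least frequent pooled score lying strictly between the two group modes."""
    low_mode = Counter(unexposed).most_common(1)[0][0]
    high_mode = Counter(exposed).most_common(1)[0][0]
    if high_mode - low_mode < 2:
        raise ValueError("group modes are too close to locate an antimode")
    pooled = Counter(unexposed) + Counter(exposed)
    # Missing scores count as frequency 0; ties go to the lowest score.
    return min(range(low_mode + 1, high_mode), key=lambda score: pooled[score])


# Hypothetical integer scores for the two groups.
unexposed = [2, 3, 3, 3, 4, 4, 5, 5, 6]
exposed = [7, 7, 8, 8, 8, 9, 9, 10]
print(antimode_cutting_score(unexposed, exposed))  # 6
```

When the two distributions overlap heavily there is no clear valley, which is one answer to the "any problems" question.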
17 Test Manuals
- Proprietary - qualifications
- Nonproprietary
- Standards for Educational and Psychological Testing
- reflects changes in federal law and measurement trends affecting validity
- testing individuals with disabilities or different linguistic backgrounds
- new types of tests as well as new uses of existing tests
Taken from apa.org
18 Test Manuals
- Should include
- How to administer (standard conditions)
- How to score
- How to interpret
- Information on reliability, validity, norms
- Be critical!
19 Base Rates and Hit Rates
- What does this test contribute beyond what is already known?
- Cutting score: not necessarily the correct decision
- Hit rate vs. base rate comparison
- False negatives and false positives
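The hit rate versus base rate comparison can be made concrete with a 2 x 2 decision table. A minimal sketch with invented counts; comparing the hit rate with the accuracy obtainable from the base rate alone (always predicting the more common outcome) is one way to frame the comparison and is an assumption here.

```python
def decision_rates(true_pos: int, false_pos: int, true_neg: int, false_neg: int) -> dict:
    """Proportions of all decisions that fall in each outcome category."""
    total = true_pos + false_pos + true_neg + false_neg
    base_rate = (true_pos + false_neg) / total  # proportion who actually succeed
    return {
        "base_rate": base_rate,
        "hit_rate": (true_pos + true_neg) / total,  # correct decisions
        "false_positives": false_pos / total,
        "false_negatives": false_neg / total,
        "accuracy_from_base_rate_alone": max(base_rate, 1 - base_rate),
    }


# Hypothetical outcomes for 100 people classified by a cutting score.
rates = decision_rates(true_pos=30, false_pos=10, true_neg=50, false_neg=10)
for name, value in rates.items():
    print(f"{name}: {value:.2f}")
```

In this example the test's hit rate (0.80) beats what the base rate alone would give (0.60), so the test adds information.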
20 Taylor-Russell Tables
- What does the test contribute beyond the base rate?
- Need
- Definition of success
- Base rate
- Selection ratio
- Test validity coefficient
- Determines likelihood someone selected on basis of test will succeed
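The quantity a Taylor-Russell table reports can be approximated by simulation. The sketch below assumes test and criterion scores are bivariate normal with correlation equal to the validity coefficient; it is a Monte Carlo approximation, not the published tables, and the function name and sample size are illustrative.

```python
import random
from statistics import NormalDist


def success_rate_among_selected(base_rate, selection_ratio, validity,
                                n=100_000, seed=0):
    """Estimate the proportion of selected applicants who succeed."""
    for value in (base_rate, selection_ratio):
        if not 0 < value < 1:
            raise ValueError("base rate and selection ratio must be in (0, 1)")
    rng = random.Random(seed)
    test_cut = NormalDist().inv_cdf(1 - selection_ratio)  # top scorers selected
    success_cut = NormalDist().inv_cdf(1 - base_rate)     # defines "success"
    residual_sd = (1 - validity ** 2) ** 0.5
    selected = succeeded = 0
    for _ in range(n):
        test = rng.gauss(0, 1)
        criterion = validity * test + residual_sd * rng.gauss(0, 1)
        if test >= test_cut:
            selected += 1
            succeeded += criterion >= success_cut
    return succeeded / selected


for validity in (0.0, 0.5):
    for ratio in (0.1, 0.9):
        rate = success_rate_among_selected(0.5, ratio, validity)
        print(f"validity={validity}, selection ratio={ratio}: {rate:.2f}")
```

With zero validity the result stays near the base rate; with positive validity, a lower selection ratio gives a higher proportion of successes among those selected.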
21 Taylor-Russell Tables
Source: Fisher, Schoenfeldt, & Shaw (2003), Table 7.2
22 Taylor-Russell Tables
- Best: validity high, selection ratio low
- Bad: validity low, selection ratio high
- Useless: no validity
- Selecting low scorers?
23 Incremental Validity
- Unique information from using a test
- Predicting future behavior and self-ratings
- Prediction should consider
- Simpler method?
- Less expensive method?
- Less subject strain?
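One common way to quantify the unique information a test adds is the gain in squared multiple correlation when the test is added to an existing predictor. For two predictors, R^2 = (r_y1^2 + r_y2^2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12^2). The slides do not prescribe this measure, so treat it as an assumption; the correlations below are invented.

```python
def r_squared_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Squared multiple correlation of a criterion with two predictors."""
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def incremental_validity(r_existing: float, r_new: float, r_between: float) -> float:
    """Gain in R^2 from adding the new test to the existing method."""
    return r_squared_two_predictors(r_existing, r_new, r_between) - r_existing ** 2


# Hypothetical: existing method r = .50, new test r = .40 with the criterion.
print(round(incremental_validity(0.5, 0.4, 0.6), 4))  # overlapping predictors: 0.0156
print(round(incremental_validity(0.5, 0.4, 0.0), 4))  # independent predictors: 0.16
```

A test that overlaps heavily with a simpler or cheaper method adds little, even when its own validity looks respectable.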
24 Mental Measurements Yearbook