Identifying Causes of Differential Item Functioning Using Optimal Appropriateness Measurement

1
Identifying Causes of Differential Item
Functioning Using Optimal Appropriateness
Measurement

Sasha Chernyshenko, Stephen Stark,
and Fritz Drasgow
University of Illinois at Urbana-Champaign

2
Research Issue

DIF may bias ability estimates and adversely affect hiring decisions.
When DIF is found, practitioners are often advised to eliminate or replace suspect items.
But item writing is time-consuming and expensive.
There is no guarantee that revised items would not exhibit DIF.
Thus, before revising a test, one should attempt to identify potential sources of DIF.

3
Causes of DIF

DIF occurs when subgroups differ on secondary dimensions that are unaccounted for by unidimensional models
Potential sources of DIF include:
Educational background
Test-taking strategies
Unmotivated responding

4
Overview of this Study

Examined unmotivated responding as a potential source of DIF on a national licensing exam
Unmotivated responding was modeled using optimal appropriateness measurement (OAM) methods
DIF results were compared before and after removing examinees who were identified as unmotivated

5
Factors that Affect Test-Taking Motivation

Beliefs about test validity and fairness
Characteristics of the selection/certification process
In compensatory systems, good performance on one exam can make up for poor performance on another
For professional licensing, one must usually demonstrate competency in various subdomains
Multiple exams are required
Not all exams must be passed at the same time
Typically, a window of several months is allowed for passing

6
Factors that Affect Test-Taking Motivation
Professional Licensing Exam

Licensing exam consists of 4 subtests
To become certified, candidates must pass all subtests in an 18-month window
Must pass two subtests and earn minimal scores on the others, or all exams must be retaken
Offers incentives for examinees to engage in strategic preparation
Respondents may be unmotivated on two subtests

7
Unmotivated Responding Affects Psychometric Properties of Exam

Affects classical test theory statistics and IRT item parameters
Increases exam dimensionality
Contributes to differential item/test functioning

8
Identifying Unmotivated Examinees

Optimal appropriateness measurement (OAM; Levine & Drasgow, 1988) can be used to identify unmotivated examinees
Method:
Specify models for normal and unmotivated responding
For each examinee, compute the marginal likelihood of the response pattern under each model
Compute the likelihood ratio (LR) and classify the examinee as motivated or unmotivated
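
The classification step can be written compactly as follows. This is a sketch, not the slide's own notation: the direction of the ratio (aberrant over normal) is an assumption inferred from the rule, used later in the deck, that large LR values flag unmotivated examinees, and the symbols for the two marginal likelihoods and the cut score k are introduced here only for illustration.

```latex
\[
  LR(\mathbf{u})
  = \frac{P_{\text{aberrant}}(\mathbf{u})}{P_{\text{normal}}(\mathbf{u})},
  \qquad
  \text{classify as unmotivated if } LR(\mathbf{u}) > k .
\]
```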

9
Marginal Likelihood for Normal Model

Assume a single, general ability underlies performance on all four subtests

where φ is a standard normal density function, and n is the number of items in a subtest
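
The equation on this slide did not survive in the transcript. A plausible form, assuming the usual marginal likelihood of a response pattern under a unidimensional 3PL model with a standard normal ability distribution, is sketched below. The subscript s (subtest), the notation n_s for the slide's n in subtest s, and the scaling constant 1.7 are assumptions made for this illustration.

```latex
\[
  P_{\text{normal}}(\mathbf{u})
  = \int_{-\infty}^{\infty}
    \prod_{s=1}^{4} \prod_{i=1}^{n_s}
    P_{si}(\theta)^{u_{si}}
    \bigl[1 - P_{si}(\theta)\bigr]^{1 - u_{si}}
    \, \phi(\theta) \, d\theta ,
\]
\[
  P_{si}(\theta)
  = c_{si} + \frac{1 - c_{si}}{1 + \exp\!\bigl[-1.7\, a_{si} (\theta - b_{si})\bigr]} .
\]
```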

10
Marginal Likelihood for Aberrant Model

Assume the examinee is unmotivated on two of the four subtests and thus responds based on two separate abilities
m1 and m2 represent the numbers of items on the examinee's two highest-scoring subtests
m3 and m4 represent the numbers of items on the examinee's two lowest-scoring subtests
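
The slide's equation is missing from the transcript. One way to write a two-ability model consistent with the definitions above is sketched here. It assumes the two abilities are independent, each with a standard normal density φ, so the marginal likelihood factors into two integrals; the original may instead have used a joint density for the two abilities.

```latex
\[
  P_{\text{aberrant}}(\mathbf{u})
  = \left[ \int_{-\infty}^{\infty}
      \prod_{i=1}^{m_1 + m_2}
      P_i(\theta_1)^{u_i} \bigl[1 - P_i(\theta_1)\bigr]^{1 - u_i}
      \, \phi(\theta_1) \, d\theta_1 \right]
    \left[ \int_{-\infty}^{\infty}
      \prod_{j=1}^{m_3 + m_4}
      P_j(\theta_2)^{u_j} \bigl[1 - P_j(\theta_2)\bigr]^{1 - u_j}
      \, \phi(\theta_2) \, d\theta_2 \right]
\]
```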

11
OAM Analyses of Licensing Exam Data

Data: N = 40,029
Estimated 3PLM item parameters using BILOG
Computed LR value for each examinee
Based on a simulation study, chose LR = 10 as the cut score for classification (1% FP)
If LR > 10, then unmotivated
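
The LR computation and cut-score rule above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' program: the function names are hypothetical, the logistic scaling constant 1.7 is assumed, the two abilities in the aberrant model are treated as independent standard normal variables, the caller supplies which items belong to the examinee's highest-scoring subtests, and the item parameters are toy values rather than the study's BILOG estimates.

```python
import numpy as np

D = 1.7  # assumed logistic scaling constant for the 3PL model


def log_marginal(u, a, b, c, n_nodes=61):
    """Log marginal likelihood of responses u under a 3PL model,
    integrating over a standard normal ability by Gauss-Hermite quadrature."""
    u, a, b, c = (np.asarray(x, dtype=float) for x in (u, a, b, c))
    # Nodes and weights for the weight function exp(-x^2 / 2).
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    log_w = np.log(weights) - 0.5 * np.log(2.0 * np.pi)  # N(0, 1) weights
    p = c + (1.0 - c) / (1.0 + np.exp(-D * a * (nodes[:, None] - b)))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)  # keep the logarithms finite
    log_like = (u * np.log(p) + (1.0 - u) * np.log1p(-p)).sum(axis=1)
    total = log_w + log_like
    peak = total.max()  # log-sum-exp for numerical stability
    return float(peak + np.log(np.exp(total - peak).sum()))


def likelihood_ratio(u, a, b, c, high_mask):
    """LR = aberrant / normal. high_mask marks items on the examinee's
    highest-scoring subtests (assumed to be determined beforehand)."""
    u, a, b, c = (np.asarray(x, dtype=float) for x in (u, a, b, c))
    high = np.asarray(high_mask, dtype=bool)
    log_normal = log_marginal(u, a, b, c)
    log_aberrant = (log_marginal(u[high], a[high], b[high], c[high])
                    + log_marginal(u[~high], a[~high], b[~high], c[~high]))
    return float(np.exp(log_aberrant - log_normal))


def cut_score_for_fp_rate(lr_simulated_normal, fp_rate):
    """Cut score such that about fp_rate of simulated normal examinees exceed it."""
    return float(np.quantile(lr_simulated_normal, 1.0 - fp_rate))


# Toy example: 12 identical items, the first 6 on the "high" subtests.
a, b, c = np.ones(12), np.zeros(12), np.zeros(12)
high_mask = np.arange(12) < 6
discrepant = np.array([1] * 6 + [0] * 6)  # strong on one half, weak on the other
consistent = np.array([1, 0] * 6)         # same performance on both halves
lr_discrepant = likelihood_ratio(discrepant, a, b, c, high_mask)
lr_consistent = likelihood_ratio(consistent, a, b, c, high_mask)
print(lr_discrepant > 10, lr_consistent > 10)
```

Working on the log scale avoids underflow when the response pattern covers several hundred items, as it would for a four-subtest exam.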

12
DIF Analyses

Randomly sampled groups of White and Black examinees (N = 1600) for one subtest (121 items)
Used the ITERLINK program to link the metrics and compute Lord's chi-square DIF statistics
Removed 440 examinees with LR > 10
Repeated DIF analyses using only motivated examinees (N = 1160)
To control for sample size sensitivity, reran the DIF analyses with random samples of N = 1160
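
For readers unfamiliar with Lord's chi-square, the statistic for a single item can be sketched as below. This is not ITERLINK's code: the function names are hypothetical, the parameter estimates are assumed to be already linked to a common metric, and the example compares only the a and b parameters (2 degrees of freedom), which is a common convention but an assumption here. The numbers are toy values.

```python
import math

import numpy as np


def lords_chi_square(est_ref, cov_ref, est_foc, cov_foc):
    """Lord's chi-square for one item: d' (S_ref + S_foc)^-1 d, where d is the
    difference between focal and reference parameter estimates and S_ref,
    S_foc are their sampling covariance matrices."""
    diff = np.asarray(est_foc, dtype=float) - np.asarray(est_ref, dtype=float)
    pooled = np.asarray(cov_ref, dtype=float) + np.asarray(cov_foc, dtype=float)
    return float(diff @ np.linalg.solve(pooled, diff))


def p_value_df2(chi2):
    """Upper-tail p-value of a chi-square with 2 degrees of freedom,
    which has the closed form exp(-x / 2)."""
    return math.exp(-chi2 / 2.0)


# Toy item: (a, b) estimates in each group, with equal diagonal covariances.
est_ref = [1.0, 0.0]
est_foc = [1.2, 0.4]
cov = np.diag([0.01, 0.02])
chi2 = lords_chi_square(est_ref, cov, est_foc, cov)
p = p_value_df2(chi2)
print(round(chi2, 3), round(p, 4))
```

Because the covariance matrices shrink as samples grow, the same parameter difference yields a larger statistic in a larger sample, which is why the size-matched random samples serve as a control.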

13
Results and Conclusions

Results
Initial sample (N = 1600): 57 DIF items
Motivated only (N = 1160): 20 DIF items
Undifferentiated (N = 1160): 32 DIF items
Conclusions
Unmotivated responding increased the number of items identified as problematic
If a 5% FP rate had been chosen, the number of DIF items would have decreased further

14
Implications

Findings of DIF don't necessarily indicate problems with item content
Extraneous factors may induce aberrant responding (e.g., low motivation, faking)
When considering test revision, one should:
Remove aberrant examinees before analyzing the psychometric properties of a test
Attempt to control factors that contributed to aberrant responding
