Title: Classical and Bayesian Computerized Adaptive Testing Algorithms
1Classical and Bayesian Computerized Adaptive
Testing Algorithms
- Richard J. Swartz
- Department of Biostatistics (rswartz_at_mdanderson.org)
2Outline
- Principle of computerized adaptive testing
- Basic statistical concepts and notation
- Trait estimation methods
- Item selection methods
- Comparisons between methods
- Current CAT Research Topics
3Computerized Adaptive Tests (CAT)
- First developed for assessment testing
- Test tailored to an individual
- Only questions relevant to individual trait level
- Shorter tests
- Sequential adaptive selection problem
- Requires item bank
- Fit with IRT models
- Extensive initial development before CAT
implementation
4Item Bank Development I
- Qualitative item development
- Content experts
- Response categories
- Test model fit
- Likelihood ratio based methods
- Model fit indices
5Item Bank Development II
- Test assumption: Unidimensionality
- Factor analysis
- Confirmatory factor analysis
- Multidimensional IRT models
- Test assumption: Local independence (check for local dependence)
- Residual correlation after 1st factor removed
- Multidimensional IRT models
6Item Bank Development III
- Test assumption: Invariance
- DIF: differential item functioning
- Over time
- Across groups (e.g., men vs. women)
- Many different methods (logistic regression method, area between response curves, and others)
7CAT Implementation
[Figure: numbered items from the item bank arranged along a depression continuum running between "Lo Depression" and "Hi Depression"]
8CAT Item Selection
9Basic Concepts/ Notation
10Basic Concepts/ Notation II
11Trait Estimation
12Estimating Traits
- Assumes item parameters are known
- Represents the individual's ability
- Done sequentially in CAT
- Estimate is updated after each additional response
- Maximum Likelihood Estimator
- Bayesian Estimators
13Likelihood
- Model describing a person's response pattern
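As a minimal sketch of such a likelihood, the snippet below assumes a two-parameter logistic (2PL) model for 0/1 responses and local independence across items. The 2PL choice, the function names, and the item parameters are illustrative assumptions, not taken from the slides (the study reported later uses a graded response model).

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model, for illustration)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def likelihood(theta, items, responses):
    """Likelihood of a 0/1 response pattern at theta under local independence."""
    value = 1.0
    for (a, b), u in zip(items, responses):
        p = p_endorse(theta, a, b)
        value *= p if u == 1 else (1.0 - p)
    return value

# Hypothetical item parameters (discrimination a, location b) and responses.
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]
responses = [1, 1, 0]
print(likelihood(0.0, items, responses))
```

Because the item parameters are treated as known, the likelihood is a function of the trait value alone.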
14Maximum Likelihood Estimate
- Frequentist: the trait value most likely to have generated the responses
- Consistency and efficiency depend on the selection methods and item bank used
- Does not always exist
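The points above can be illustrated with a simple grid search, assuming a 2PL model and hypothetical item parameters (both are assumptions made for this sketch). For a mixed response pattern the maximum lies inside the grid; when every item is endorsed, the log-likelihood keeps increasing and the search simply returns the upper edge of the grid, which is how the non-existence of a finite MLE shows up in practice.

```python
import math

def log_likelihood(theta, items, responses):
    """2PL log-likelihood of a 0/1 response pattern (assumed model)."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def grid_mle(items, responses, lo=-4.0, hi=4.0, step=0.01):
    """Return the grid point with the largest log-likelihood."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical item parameters (discrimination a, location b).
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]

mle_mixed = grid_mle(items, [1, 1, 0])    # interior maximum
mle_all_yes = grid_mle(items, [1, 1, 1])  # runs to the grid boundary
print(mle_mixed, mle_all_yes)
```

A production implementation would more likely use Newton-Raphson or a bracketing root finder on the score function; the grid is used here only to keep the example short.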
15Bayesian Framework
- θ is a random variable
- A distribution on θ describes knowledge prior to data collection (prior distribution)
- Update information about θ (trait) as data are collected (posterior distribution)
- Describes the distribution of θ values instead of a point estimate
16Bayes Rule
- Combines information about θ (prior) with information from the data (likelihood)
- Posterior ∝ Likelihood × Prior
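The proportionality above can be sketched numerically by evaluating likelihood times prior on a grid of trait values and normalizing. The 2PL model, the standard normal prior, the grid, and the item parameters are all assumptions chosen for illustration.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def posterior_on_grid(grid, prior, items, responses):
    """Normalized posterior weights: likelihood times prior at each grid point."""
    unnormalized = []
    for t, w in zip(grid, prior):
        like = 1.0
        for (a, b), u in zip(items, responses):
            p = p_endorse(t, a, b)
            like *= p if u == 1 else (1.0 - p)
        unnormalized.append(like * w)
    total = sum(unnormalized)
    return [v / total for v in unnormalized]

grid = [-4.0 + 0.1 * i for i in range(81)]
prior = [math.exp(-0.5 * t * t) for t in grid]  # standard normal kernel
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]   # hypothetical (a, b) pairs
post = posterior_on_grid(grid, prior, items, [1, 1, 0])
print(max(zip(post, grid))[1])  # grid point with the highest posterior weight
```

The prior does not need to be normalized beforehand, since the final division takes care of the constant.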
17Maximum A Posteriori (MAP) Estimate
- Properties
- With a uniform prior, equivalent to the MLE over the support of the prior
- For some prior/likelihood combinations, the posterior can be multimodal
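A small sketch of the first property, assuming a 2PL model and a grid search (both are illustrative assumptions): with a flat log-prior the MAP estimate coincides with the grid MLE, while a standard normal prior pulls the estimate toward the prior mean of zero.

```python
import math

def log_lik(theta, items, responses):
    """2PL log-likelihood of a 0/1 response pattern (assumed model)."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def grid_map(grid, log_prior, items, responses):
    """Grid point maximizing log-prior plus log-likelihood."""
    return max(grid, key=lambda t: log_prior(t) + log_lik(t, items, responses))

grid = [-4.0 + 0.01 * i for i in range(801)]
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]  # hypothetical (a, b) pairs
responses = [1, 1, 0]

mle = max(grid, key=lambda t: log_lik(t, items, responses))
map_uniform = grid_map(grid, lambda t: 0.0, items, responses)
map_normal = grid_map(grid, lambda t: -0.5 * t * t, items, responses)
print(mle, map_uniform, map_normal)
```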
18Expected A Posteriori (EAP) Estimate
- Properties
- Always exists for a proper prior
- Easy to calculate with numerical integration techniques
- Prior influences estimate
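The numerical integration mentioned above can be sketched as a weighted average over an equally spaced grid. The 2PL model, the standard normal prior, and the item parameters are assumptions for illustration. Unlike the MLE, the estimate stays finite even when every item is endorsed.

```python
import math

def eap(grid, prior, items, responses):
    """Posterior mean of theta by summation over a grid (2PL model assumed)."""
    weights = []
    for t, w in zip(grid, prior):
        for (a, b), u in zip(items, responses):
            p = 1.0 / (1.0 + math.exp(-a * (t - b)))
            w *= p if u == 1 else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

grid = [-4.0 + 0.1 * i for i in range(81)]
prior = [math.exp(-0.5 * t * t) for t in grid]  # standard normal kernel
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]   # hypothetical (a, b) pairs

print(eap(grid, prior, items, [1, 1, 0]))
print(eap(grid, prior, items, [1, 1, 1]))  # finite despite an extreme pattern
```

With no responses at all, the function returns the prior mean, which makes the influence of the prior on short tests easy to see.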
19Posterior Variance
- Describes variability of θ
- Can be used as conditional Standard Error of
Measurement (SEM) for a given response pattern.
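As a sketch under the same illustrative assumptions (2PL model, standard normal prior on a grid, hypothetical item parameters), the posterior variance and the corresponding conditional SEM for one response pattern can be computed as follows.

```python
import math

def posterior_mean_and_variance(grid, prior, items, responses):
    """Posterior mean and variance of theta on a grid (2PL model assumed)."""
    weights = []
    for t, w in zip(grid, prior):
        for (a, b), u in zip(items, responses):
            p = 1.0 / (1.0 + math.exp(-a * (t - b)))
            w *= p if u == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, var

grid = [-4.0 + 0.1 * i for i in range(81)]
prior = [math.exp(-0.5 * t * t) for t in grid]  # standard normal kernel
items = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.7)]   # hypothetical (a, b) pairs

mean, var = posterior_mean_and_variance(grid, prior, items, [1, 1, 0])
sem = math.sqrt(var)  # conditional SEM for this response pattern
print(mean, var, sem)
```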
20 ITEM SELECTION
21Item Selection Algorithms
- Choose the item that is best for the individual being tested
- Define best
- Most information about trait estimate
- Greatest reduction in expected variability of trait estimate
22Fisher's Information
- Information of a given item at a trait value
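For a concrete case, the sketch below assumes a 2PL item, for which Fisher's information at a trait value θ has the closed form a² P(θ) (1 − P(θ)). The model choice and the parameter values are assumptions; polytomous models such as the graded response model have a different, longer expression.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Information peaks where theta equals the item location b,
# and its height grows with the discrimination a.
for theta in (-2.0, 0.0, 2.0):
    print(theta, item_information(theta, a=1.5, b=0.0))
```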
23Maximum Fisher's Information
- Myopic algorithm
- Pick the item i_k at stage k (i_k ∈ R_k) that maximizes Fisher's information at the current trait estimate (classically the MLE)
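The selection rule above can be sketched in a few lines, assuming 2PL items and a hypothetical bank stored as a dictionary from item id to (a, b). The names `bank`, `remaining`, and `select_mfi` are illustrative, with `remaining` playing the role of the set of items not yet administered.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_mfi(theta_hat, bank, remaining):
    """Return the remaining item with maximum information at theta_hat."""
    return max(remaining, key=lambda i: item_information(theta_hat, *bank[i]))

# Hypothetical bank: item id -> (discrimination a, location b).
bank = {1: (1.0, -1.0), 2: (1.0, 0.0), 3: (1.0, 1.5)}
remaining = {1, 2, 3}

first = select_mfi(0.2, bank, remaining)
remaining.discard(first)
second = select_mfi(0.2, bank, remaining)
print(first, second)
```

The rule is myopic in the sense that each call looks only one item ahead, and it depends entirely on the quality of the current point estimate.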
24MFI - Selection
25Minimum Expected Posterior Variance (MEPV)
- Selects the item that yields the minimum predicted posterior variance given previous responses
- Uses predictive distribution
- Is a myopic Bayesian decision theoretic approach (minimizes Bayes risk)
- First described by Owen (1969, 1975)
26Predictive Distribution
- Predict the probability of a response to an item
given previous responses
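A minimal sketch of this prediction, assuming a 2PL model and a grid posterior (illustrative assumptions): the predictive probability of endorsing a candidate item is the model probability averaged over the current posterior for θ.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def predictive_prob(grid, post, a, b):
    """P(next response = 1 | previous responses), averaged over the posterior."""
    total = sum(post)
    return sum(p_endorse(t, a, b) * w for t, w in zip(grid, post)) / total

grid = [-4.0 + 0.1 * i for i in range(81)]
prior = [math.exp(-0.5 * t * t) for t in grid]  # standard normal kernel

# Posterior weights after one endorsed response to a hypothetical item (1.2, -0.5).
post = [w * p_endorse(t, 1.2, -0.5) for t, w in zip(grid, prior)]

print(predictive_prob(grid, prior, 1.0, 0.0))  # before any response
print(predictive_prob(grid, post, 1.0, 0.0))   # after one endorsed item
```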
27Bayesian Decision Theory
- Dictates optimal (sequential adaptive) decisions
- In addition to prior and Likelihood, specify a
loss function (squared error loss)
28Bayesian Decision Theory Item Selection
- Optimal estimator for squared-error loss is the posterior mean (EAP)
- Select item that minimizes Bayes risk
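The first point follows from a standard decomposition of the posterior expected squared-error loss. The notation below (d for a candidate point estimate, u_1, ..., u_k for the responses observed so far) is introduced here for illustration and is not taken from the slides.

```latex
\begin{aligned}
E\big[(\theta - d)^2 \mid u_1,\dots,u_k\big]
  &= \operatorname{Var}(\theta \mid u_1,\dots,u_k)
   + \big(E[\theta \mid u_1,\dots,u_k] - d\big)^2, \\
\hat{\theta}_{\mathrm{EAP}}
  &= \arg\min_{d}\, E\big[(\theta - d)^2 \mid u_1,\dots,u_k\big]
   = E[\theta \mid u_1,\dots,u_k].
\end{aligned}
```

At the minimizing value the second term vanishes, so the remaining expected loss is the posterior variance. This is why minimizing Bayes risk under squared-error loss leads to an item selection rule based on expected posterior variance.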
29Minimum Expected Posterior Variance (MEPV)
- Pick the item i_k remaining in the bank at stage k (i_k ∈ R_k) that minimizes the expected posterior variance (with respect to the predictive distribution)
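The rule above can be sketched as follows, assuming dichotomous 2PL items, a grid posterior, and a hypothetical three-item bank (all illustrative assumptions). For each candidate item, the posterior variance that would result from each possible response is weighted by the predictive probability of that response.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def variance(grid, weights):
    """Variance of theta under (possibly unnormalized) grid weights."""
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    return sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total

def expected_posterior_variance(grid, post, item):
    """Average the updated posterior variance over the predictive distribution."""
    a, b = item
    total = sum(post)
    epv = 0.0
    for u in (0, 1):
        updated = [w * (p_endorse(t, a, b) if u == 1 else 1.0 - p_endorse(t, a, b))
                   for t, w in zip(grid, post)]
        predictive = sum(updated) / total  # P(response = u | previous responses)
        epv += predictive * variance(grid, updated)
    return epv

def select_mepv(grid, post, bank, remaining):
    """Return the remaining item with minimum expected posterior variance."""
    return min(remaining,
               key=lambda i: expected_posterior_variance(grid, post, bank[i]))

grid = [-4.0 + 0.1 * i for i in range(81)]
prior = [math.exp(-0.5 * t * t) for t in grid]  # standard normal kernel
bank = {1: (0.5, 0.0), 2: (2.0, 0.0), 3: (2.0, 3.5)}  # hypothetical (a, b)

print(select_mepv(grid, prior, bank, set(bank)))
```

For polytomous items the inner loop would run over all response categories instead of just 0 and 1, which is the main source of the extra computational cost relative to MFI.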
30Other Information Measures
- Weighted Measures
- Maximum Likelihood Weighted Fisher's Information (MLWI)
- Maximum Posterior Weighted Fisher's Information (MPWI)
- Kullback-Leibler Information: a global information measure
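The two weighted measures can be sketched with one helper that integrates an item's information over the trait grid, weighting by either the likelihood (MLWI) or the posterior (MPWI). The 2PL model, the grid, the prior, and the bank are assumptions for illustration. Normalizing the weights is unnecessary for selection because the constant is the same for every candidate item.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def weighted_information(grid, weights, a, b):
    """Sum of item information over the grid, weighted point by point."""
    return sum(w * item_information(t, a, b) for t, w in zip(grid, weights))

grid = [-4.0 + 0.1 * i for i in range(81)]

# One endorsed response to a hypothetical item (a=1.2, b=-0.5).
like = [p_endorse(t, 1.2, -0.5) for t in grid]                # likelihood weights
post = [l * math.exp(-0.5 * t * t) for t, l in zip(grid, like)]  # times N(0, 1) kernel

bank = {1: (1.0, -2.0), 2: (1.0, 0.5), 3: (1.0, 3.0)}  # hypothetical (a, b)
mlwi_pick = max(bank, key=lambda i: weighted_information(grid, like, *bank[i]))
mpwi_pick = max(bank, key=lambda i: weighted_information(grid, post, *bank[i]))
print(mlwi_pick, mpwi_pick)
```

Early in a test the likelihood can be flat or monotone, so the likelihood weights may spread over a wide range of trait values, whereas the posterior weights are anchored by the prior.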
31Hybrid Algorithms
- Maximum Expected Information (MEI)
- Use observed information
- Predict information for next item
- Maximum Expected Posterior Weighted Information (MEPWI)
- Use observed information
- Predict information for next item
- Weight with Posterior
- MEPWI ≈ MPWI
32Mix N Match
- MAP with uniform prior to approximate MLE
- MFI using EAP instead of MLE (any point information function)
- Use EAP for item selection, but MFI for final trait estimate
33COMPARISONS
34Study Design
- Real Item Bank
- Depressive symptom items (62)
- 4 categories (fit with Graded Response IRT Model)
- Peaked bank: items have narrow coverage
- Flat bank: items have wider coverage
- Fixed-length 5-, 10-, and 20-item CATs
35Datasets Used
- Post hoc simulation using real data
- 730 patients and caregivers at MDA
- Real bank only
- Simulated data
- θ grid: -3 to 3 by 0.5
- 500 simulees per θ
- Simulated and Real banks
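The simulated-data design can be sketched as below: a grid of 13 true trait values from -3 to 3, with 500 simulees at each value responding to a four-category graded response model item. The single item's parameters, the random seed, and the function names are hypothetical; the actual study used a full item bank.

```python
import math
import random

def grm_category_probs(theta, a, thresholds):
    """Graded response model: probabilities of the ordered categories."""
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]

def simulate_response(theta, a, thresholds, rng):
    """Draw one category (0-based) from the GRM category probabilities."""
    probs = grm_category_probs(theta, a, thresholds)
    return rng.choices(range(len(probs)), weights=probs)[0]

theta_grid = [-3.0 + 0.5 * i for i in range(13)]  # -3 to 3 by 0.5
rng = random.Random(0)                            # fixed seed for repeatability
a, thresholds = 1.5, [-1.0, 0.0, 1.0]             # hypothetical 4-category item

data = {
    theta: [simulate_response(theta, a, thresholds, rng) for _ in range(500)]
    for theta in theta_grid
}
print({theta: sum(r) / len(r) for theta, r in data.items()})  # mean category
```

Threshold parameters must be in increasing order for the category probabilities to be non-negative.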
36Real Item Bank Characteristics
37Real Bank, Real Data, 5 Items
38Real Bank, Real Data, 5 Items
39Peaked Bank, Sim. Data, 5 Items
40Peaked Bank, Sim. Data, 5 Items
41Summary
- Polytomous items
- Choi and Swartz (in press)
- Classic MFI with MLE, and MLWI, not as good as the others
- MFI with EAP and all others perform essentially the same
- Dichotomous items
- van der Linden (1998)
- MFI with MLE not as good as all others
- Difference more pronounced for shorter tests
42Adaptations/ Active Research Areas
- Constrained adaptive tests/ content balancing
- Exposure Control
- A-stratified adaptive testing
- Item selection including burden
- Cheating detection
- Response times
44References and Further Reading
- Choi SW, Swartz RJ. (in press). Comparison of CAT item selection criteria for polytomous items. Applied Psychological Measurement.
- Owen RJ. (1969). A Bayesian approach to tailored testing (Research Report 69-92). Princeton, NJ: Educational Testing Service.
- Owen RJ. (1975). A Bayesian sequential procedure for quantal response in the context of adaptive mental testing. Journal of the American Statistical Association, 70, 351-356.
- van der Linden WJ. (1998). Bayesian item selection criteria for adaptive testing. Psychometrika, 63(2), 201-216.
- van der Linden WJ, Glas CAW (Eds.). (2000). Computerized Adaptive Testing: Theory and Practice. Dordrecht; Boston: Kluwer Academic.
46MLE Properties
- Usually has desirable asymptotic properties
- Consistency and efficiency depend on the selection criteria and item bank
- A finite estimate does not exist when all responses fall in category 1 or all fall in category m