Title: Introduction to Bayesian Inference in Item Response Theory
1 Introduction to Bayesian Inference in Item Response Theory
- Robert J. Mislevy
- University of Maryland
- March 31, 2003
2 Topics
- What is item response theory (IRT)?
- Examples with the Rasch model
- A full Bayesian model for IRT
- Extensions
3 What is IRT? (1)
- Under classical test theory (CTT), measures of examinees are
confounded with the characteristics of test items. One cannot
compare examinees who have taken different tests, or items that
have been administered to different groups of examinees.
- Item Response Theory (IRT) can be used to make
predictions about test properties using item
properties, and to manipulate parts of tests to
achieve targeted measurement properties.
4 What is IRT? (2)
- As under CTT, a single variable measures
students' overall proficiency in some domain of
tasks.
- The structure of the probability model is the
same as for CTT: conditional independence among
observations given an underlying, inherently
unobservable, proficiency variable q.
- But now the observations are responses to
individual tasks.
5 What is IRT? (3)
6 What is IRT? (4)
- For Item j, the IRT model expresses the
probability of a given response $x_j$ as a function
of q and parameters $b_j$ that characterize Item j
(such as its difficulty): $f(x_j \mid q, b_j)$.
7 The Rasch model for dichotomous (right/wrong) items
- $\mathrm{Prob}(X_{ij} = 1 \mid q_i, b_j) = f(1 \mid q_i, b_j) = Y(q_i - b_j)$, where
  - $X_{ij}$ is the response of Student i to Item j (1 = right, 0 = wrong)
  - $q_i$ is the proficiency parameter of Student i
  - $b_j$ is the difficulty parameter of Item j
  - $Y(x)$ is the logistic function, $Y(x) = \exp(x) / [1 + \exp(x)]$.
- The probability of an incorrect response is then
$\mathrm{Prob}(X_{ij} = 0 \mid q_i, b_j) = f(0 \mid q_i, b_j) = 1 - Y(q_i - b_j)$.
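The response function on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk; the function names `logistic` and `rasch_prob` are hypothetical, and q and b follow the slide's notation for proficiency and difficulty.

```python
import math


def logistic(x: float) -> float:
    """Logistic function Y(x) = exp(x) / (1 + exp(x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def rasch_prob(x: int, q: float, b: float) -> float:
    """Probability of response x (1 = right, 0 = wrong) for a student with
    proficiency q on an item with difficulty b, under the Rasch model."""
    p_right = logistic(q - b)
    return p_right if x == 1 else 1.0 - p_right


# When proficiency equals difficulty, right and wrong are equally likely.
print(rasch_prob(1, q=0.0, b=0.0))  # 0.5
```

Only the difference q - b matters, so shifting every proficiency and every difficulty by the same constant leaves all probabilities unchanged.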
8 The Rasch model for dichotomous (right/wrong) items
- Two Rasch model curves, with $b_1 = -1$ and $b_2 = 2$.
9 The Rasch model for dichotomous (right/wrong) items
- Conditional independence means that for a given
value of q, the probability of Student i making
responses $x_{i1}$ and $x_{i2}$ to the two items is the
product of probabilities item by item, given q:
$\mathrm{Prob}(X_{i1} = x_{i1}, X_{i2} = x_{i2} \mid q_i, b_1, b_2) = \mathrm{Prob}(X_{i1} = x_{i1} \mid q_i, b_1)\,\mathrm{Prob}(X_{i2} = x_{i2} \mid q_i, b_2)$.
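The product rule above can be checked numerically. The sketch below is illustrative only: `pattern_prob` is a hypothetical helper, the difficulties -1 and 2 are those of the two-curve slide, and q = 0 is an arbitrary value chosen for the check.

```python
import math


def rasch_prob(x, q, b):
    """Rasch probability of response x (1 = right, 0 = wrong)."""
    p_right = 1.0 / (1.0 + math.exp(-(q - b)))
    return p_right if x == 1 else 1.0 - p_right


def pattern_prob(responses, q, difficulties):
    """Joint probability of a response pattern given q: because responses are
    conditionally independent, it is the product of item-by-item probabilities."""
    prob = 1.0
    for x, b in zip(responses, difficulties):
        prob *= rasch_prob(x, q, b)
    return prob


difficulties = (-1.0, 2.0)
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]

# The four possible patterns exhaust the sample space, so they sum to 1.
total = sum(pattern_prob(p, 0.0, difficulties) for p in patterns)
print(abs(total - 1.0) < 1e-12)  # True
```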
10 The Rasch model for dichotomous (right/wrong) items
- The IRT Likelihood Function Induced by Observing
$X_{i1} = 0$ and $X_{i2} = 1$ (MLE = .75)
11 The Rasch model for dichotomous (right/wrong) items
- N(0,1) Prior Distribution for q
12 The Rasch model for dichotomous (right/wrong) items
- Posterior Distribution for q after Observing
$X_{i1} = 0$ and $X_{i2} = 1$, with N(0,1) Prior (Posterior Mean = .30)
13 The Rasch model for dichotomous (right/wrong) items
- The IRT Likelihood Function Induced by Observing
$X_{i1} = 0$ and $X_{i2} = 0$ (MLE = $-\infty$)
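A short numerical check shows why the likelihood has no finite maximum when both responses are wrong: each factor 1 - Y(q - b) falls as q rises, so the likelihood keeps growing as q decreases. The sketch below assumes the difficulties -1 and 2 from the two-curve slide; the grid limits are arbitrary and `likelihood` is a hypothetical helper.

```python
import math


def likelihood(q, responses, difficulties):
    """Rasch likelihood of q for a fixed response pattern."""
    value = 1.0
    for x, b in zip(responses, difficulties):
        p_right = 1.0 / (1.0 + math.exp(-(q - b)))
        value *= p_right if x == 1 else 1.0 - p_right
    return value


# Evaluate the likelihood for the all-wrong pattern on a grid from -4 to 4.
grid = [k / 10 for k in range(-40, 41)]
values = [likelihood(q, (0, 0), (-1.0, 2.0)) for q in grid]

# Every step to the right lowers the likelihood, so no interior maximum exists.
is_decreasing = all(left > right for left, right in zip(values, values[1:]))
print(is_decreasing)  # True
```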
14 The Rasch model for dichotomous (right/wrong) items
- Posterior Distribution for q after Observing
$X_{i1} = 0$ and $X_{i2} = 0$, with N(0,1) Prior (Posterior Mean = -.66)
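With a proper prior the posterior is well defined even though the likelihood has no finite maximum. One simple way to approximate it is to evaluate prior times likelihood on a grid of q values and normalize. The sketch below does this under stated assumptions: difficulties -1 and 2 from the two-curve slide, a grid from -6 to 6, and a hypothetical helper `posterior_on_grid`. It is not claimed to be how the original figures were produced.

```python
import math


def likelihood(q, responses, difficulties):
    """Rasch likelihood of q for a fixed response pattern."""
    value = 1.0
    for x, b in zip(responses, difficulties):
        p_right = 1.0 / (1.0 + math.exp(-(q - b)))
        value *= p_right if x == 1 else 1.0 - p_right
    return value


def posterior_on_grid(responses, difficulties, lo=-6.0, hi=6.0, n=1201):
    """Grid approximation to the posterior of q under a N(0,1) prior.
    Returns the grid points and normalized posterior weights."""
    step = (hi - lo) / (n - 1)
    grid = [lo + k * step for k in range(n)]
    # Unnormalized posterior: N(0,1) kernel times the likelihood.
    unnorm = [math.exp(-0.5 * q * q) * likelihood(q, responses, difficulties)
              for q in grid]
    total = sum(unnorm)
    return grid, [u / total for u in unnorm]


grid, weights = posterior_on_grid((0, 0), (-1.0, 2.0))
post_mean = sum(q * w for q, w in zip(grid, weights))
print(round(post_mean, 2))
```

Two wrong answers pull the posterior mean below the prior mean of 0, but the prior keeps it finite.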
15 A full Bayesian model: A generic measurement model
- $X_{ij}$: Response of Person i to Item j
- $q_i$: Parameter(s) of Person i
- $b_j$: Parameter(s) of Item j
- h: Parameter(s) for the distribution of the q's
- t: Parameter(s) for the distribution of the b's
- Note: Exchangeability is assumed here for the q's and
for the b's, i.e., all are modeled with the same prior.
Later we'll incorporate additional information about
people and/or items.
16 A full Bayesian model: The recursive expression of the model
- The measurement model: item response given person and item parameters
- Distributions for person parameters
- Distributions for item parameters
- Distribution for parameter(s) of distributions for item parameters
- Distribution for parameter(s) of distributions for person parameters
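One way to write the recursive structure listed on this slide is as a factored joint distribution. The expression below is a sketch in the notation of the generic measurement model (X, q, b, h, t), with the factors in the order listed above. It assumes exchangeable persons and items, as stated on the previous slide, and additionally assumes that h and t are independent a priori, which the slide does not state.

```latex
p(\mathbf{X}, \mathbf{q}, \mathbf{b}, h, t)
  = \prod_{i} \prod_{j} p(x_{ij} \mid q_i, b_j)
    \; \prod_{i} p(q_i \mid h)
    \; \prod_{j} p(b_j \mid t)
    \; p(t) \, p(h)
```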
17 A full Bayesian model: A BUGS diagram
[Diagram: nodes $b_j$, $p_{ij}$, $q_i$, t, h, and $X_{ij}$, with plates for Items j and Persons i]
- Plates for people and items
- Item parameters explicit
- q population distribution structure explicit
- In dichotomous IRT, item and person parameters give
the probability parameter in a binomial distribution
for the observed response.
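The structure in the diagram can be read as a recipe for generating data: draw person and item parameters from their distributions, compute $p_{ij}$, then draw each response. The sketch below follows that reading with the Rasch model. The normal distributions and the fixed hyperparameter values are assumptions made for illustration, and `simulate_responses` is a hypothetical name. In the full Bayesian model h and t would have their own prior distributions rather than fixed values.

```python
import math
import random


def simulate_responses(n_persons, n_items, seed=0):
    """Simulate dichotomous responses from a Rasch model with normal
    population distributions for persons and items (assumed for illustration)."""
    rng = random.Random(seed)
    # Hypothetical fixed hyperparameters: h = (mean, sd) for the q's,
    # t = (mean, sd) for the b's.
    h_mean, h_sd = 0.0, 1.0
    t_mean, t_sd = 0.0, 1.0
    q = [rng.gauss(h_mean, h_sd) for _ in range(n_persons)]  # persons plate
    b = [rng.gauss(t_mean, t_sd) for _ in range(n_items)]    # items plate
    x = []
    for i in range(n_persons):
        row = []
        for j in range(n_items):
            # p_ij is a deterministic function of q_i and b_j.
            p_ij = 1.0 / (1.0 + math.exp(-(q[i] - b[j])))
            # X_ij is a single Bernoulli draw with success probability p_ij.
            row.append(1 if rng.random() < p_ij else 0)
        x.append(row)
    return q, b, x


q, b, x = simulate_responses(n_persons=5, n_items=3)
print(len(x), len(x[0]))  # 5 3
```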
18 Extensions (1)
- 3-parameter logistic IRT function:
$\mathrm{Prob}(X_{ij} = 1 \mid q_i, a_j, b_j, c_j) = c_j + (1 - c_j)\, Y(a_j (q_i - b_j))$
- $a_j$ is item slope or discrimination: steepness of the curve
- $b_j$ is item difficulty, as in the Rasch model
- $c_j$ is lower asymptote: probability of getting an
item right even when q is very low.
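The roles of the three item parameters are easy to see in code. The sketch below uses the standard form of the 3-parameter logistic function, c + (1 - c) Y(a(q - b)), without any additional scaling constant. That form is an assumption here, and `three_pl_prob` is a hypothetical name.

```python
import math


def three_pl_prob(q, a, b, c):
    """Probability of a right answer under the 3-parameter logistic model:
    a = slope (discrimination), b = difficulty, c = lower asymptote."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (q - b)))


# With a = 1 and c = 0 the function reduces to the Rasch model.
print(three_pl_prob(q=0.0, a=1.0, b=0.0, c=0.0))  # 0.5

# For a very low q the probability approaches the lower asymptote c.
low = three_pl_prob(q=-50.0, a=1.0, b=0.0, c=0.2)
print(abs(low - 0.2) < 1e-9)  # True
```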
19 Extensions (2)
- Responses X in ordered categories, rather than
just right/wrong (includes attitude scales)
- Responses are "unfolding" data: more likely to
respond positively when the attitude expressed by the
item is near your opinion, less likely when it
differs either way.
- Multivariate q
- Parameters for additional facets of the observational
setting, e.g., parameters for rater harshness.
20 Extensions (3)
- Collateral information Z about students
  - Means modeling the distribution for $q_i$ conditional on
  $z_i$, and including hyperparameters for those
  distributions.
- Collateral information Y about items
  - Means modeling the distribution for $b_j$ conditional on
  $y_j$, and including hyperparameters for those
  distributions.
- Conditional dependence among Xs
  - As with multiple questions about the same reading
  passage, ratings of multiple aspects of the same
  complex performance, or tasks where performance
  in one step depends on success of previous steps.
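As an illustration of the first two points, collateral information can enter through the means of the person and item distributions. The normal linear form below is only one possible choice and is an assumption of this sketch. The regression coefficients $\gamma$ and $\delta$ and the variances $\sigma^2$ and $\omega^2$ are hypothetical symbols that play the role of the hyperparameters mentioned above.

```latex
q_i \mid z_i \sim N\!\left(z_i^{\top} \gamma,\; \sigma^2\right),
\qquad
b_j \mid y_j \sim N\!\left(y_j^{\top} \delta,\; \omega^2\right),
\qquad
(\gamma, \sigma^2) \sim p(\gamma, \sigma^2),
\quad
(\delta, \omega^2) \sim p(\delta, \omega^2)
```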