Item Response Theory - PowerPoint PPT Presentation

Provided by: lin108

1
Item Response Theory
2
Shortcomings of Classical True Score Model

Sample dependence
Limitation to the specific test situation
Dependence on the parallel forms
Same error variance for all examinees

3
Sample Dependence

The first shortcoming of CTS is that the values
of item statistics commonly used in test
development, such as item difficulty and item
discrimination, depend on the particular examinee
samples in which they are obtained. The average
level of ability and the range of ability scores
in an examinee sample influence, often
substantially, the values of the item statistics.
The difficulty level changes with the sample's
ability level, and the discrimination index
differs between heterogeneous and homogeneous
samples.
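
As a rough illustration of sample dependence, the sketch below computes the classical item difficulty (proportion correct) for one item in two samples. The response vectors are made-up illustrative data, and the helper name `item_difficulty` is an assumption, not something taken from the slides.

```python
def item_difficulty(responses):
    """Classical item difficulty: proportion of examinees answering correctly."""
    return sum(responses) / len(responses)

# Hypothetical 0/1 responses to the SAME item from two different samples.
low_ability_sample = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
high_ability_sample = [1, 1, 1, 0, 1, 1, 1, 1, 0, 1]

p_low = item_difficulty(low_ability_sample)    # 3 of 10 correct
p_high = item_difficulty(high_ability_sample)  # 8 of 10 correct
print(p_low, p_high)
```

The item itself has not changed, yet its classical difficulty index differs because the samples differ in ability.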

4
Limitation to the Specific Test Situation

The task of comparing examinees who have taken
samples of test items of differing difficulty
cannot easily be handled with standard testing
models and procedures.

5
Dependence on the Parallel Forms

The fundamental concept, test reliability, is
defined in terms of parallel forms.

6
Same Error Variance For All

CTS presumes that the variance of errors of
measurement is the same for all examinees.
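
For reference, the classical quantities behind the last two shortcomings can be written out in standard CTS notation. The symbols are not in the original slides: X is the observed score, T the true score, and X' a parallel form of the test.

```latex
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2},
\qquad
\mathrm{SEM} = \sigma_X \sqrt{1 - \rho_{XX'}}
```

Because the standard error of measurement (SEM) is built only from group-level quantities, it yields a single value that is applied to every examinee.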

7
Item Response Theory

The purpose of any test theory is to describe how
inferences from examinee item responses and/or
test scores can be made about unobservable
examinee characteristics or traits that are
measured by a test.
An individual's expected performance on a
particular test question, or item, is a function
of both the level of difficulty of the item and
the individual's level of ability.
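
One simple way to write this dependence is the one-parameter logistic (Rasch) model. The slides do not name a specific model at this point, so the choice of model and the symbols θ (ability) and b (item difficulty) are assumptions made for illustration.

```latex
P(u = 1 \mid \theta) = \frac{e^{\theta - b}}{1 + e^{\theta - b}}
```

Under this form, the probability of a correct response is 0.5 when θ = b and increases as θ rises above b.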

8
Item Response Theory

Examinee performance on a test can be predicted
(or explained) by defining examinee
characteristics, referred to as traits or
abilities; estimating scores for examinees on
these traits (called "ability scores"); and using
the scores to predict or explain item and test
performance. Since traits are not directly
measurable, they are referred to as latent traits
or abilities. An item response model specifies a
relationship between the observable examinee test
performance and the unobservable traits or
abilities assumed to underlie performance on the
test.

9
Assumptions of IRT

Unidimensionality
Local independence

10
Unidimensionality Assumption

It is possible to estimate an examinee's ability
on the same ability scale from any subset of
items in the domain of items that have been
fitted to the model. The domain of items needs to
be homogeneous in the sense of measuring a single
ability. If the domain of items is too
heterogeneous, the ability estimates will have
little meaning.
Most of the IRT models that are currently being
applied make the specific assumption that the
items in a test measure a single, or
unidimensional, ability or trait, and that the
items form a unidimensional scale of measurement.

11
Local Independence

This assumption states that an examinee's
responses to different items in a test are
statistically independent once the examinee's
ability is taken into account. For this
assumption to be true, an examinee's performance
on one item must not affect, either for better or
for worse, his or her responses to any other
items in the test.
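
In practice, local independence means that, for a fixed ability level, the probability of a whole response pattern is the product of the per-item probabilities. The sketch below illustrates this with a one-parameter logistic item model; the model choice, function names, and item difficulties are assumptions for illustration only.

```python
import math

def p_correct(theta, b):
    """One-parameter logistic probability of a correct response (illustrative model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pattern_probability(theta, difficulties, responses):
    """Probability of a whole 0/1 response pattern under local independence."""
    prob = 1.0
    for b, u in zip(difficulties, responses):
        p = p_correct(theta, b)
        # Multiply by P for a correct answer, by (1 - P) for an incorrect one.
        prob *= p if u == 1 else (1.0 - p)
    return prob

# Hypothetical three-item test and one response pattern.
difficulties = [-1.0, 0.0, 1.0]
print(pattern_probability(0.0, difficulties, [1, 1, 0]))
```

Summing `pattern_probability` over all possible response patterns at a fixed ability gives 1, as expected for a product of independent probabilities.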

12
Item Characteristic Curves

Specific assumptions about the relationship
between the test taker's ability and his or her
performance on a given item are explicitly stated
in a mathematical formula, the item
characteristic curve (ICC).

13
Item Characteristic Curves

The form of the ICC is determined by the
particular mathematical model on which it is
based. The types of information about item
characteristics may include:
(1) the degree to which the item discriminates
among individuals of differing levels of ability
(the 'discrimination' parameter a),

14
Item Characteristic Curves

(2) the level of difficulty of the item (the
'difficulty' parameter b), and
(3) the probability that an individual of low
ability can answer the item correctly (the
'pseudo-chance' or 'guessing' parameter c).
One of the major considerations in the
application of IRT models, therefore, is the
estimation of these item parameters.
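
A common model that uses all three parameters is the three-parameter logistic (3PL) ICC. The slides do not give the formula, so the sketch below assumes the 3PL form without a scaling constant; the function name and the example parameter values are hypothetical.

```python
import math

def icc_3pl(theta, a, b, c):
    """Three-parameter logistic ICC: probability of a correct response.

    a: discrimination, b: difficulty, c: pseudo-chance (lower asymptote).
    No scaling constant (such as D = 1.7) is applied in this sketch.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item with a = 1.2, b = 0.5, c = 0.2,
# evaluated at a low, a middle, and a high ability level.
for theta in (-3.0, 0.5, 3.0):
    print(theta, round(icc_3pl(theta, 1.2, 0.5, 0.2), 3))
```

Setting c = 0 gives a two-parameter model, and additionally fixing a to a common value for all items gives a one-parameter model.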

15
ICC

Pseudo-chance parameter c: p = 0.20 for two items.
Difficulty parameter b: the ability level at which
the probability of a correct response is halfway
between the pseudo-chance parameter and one.
Discrimination parameter a: proportional to the
slope of the ICC at the point of the difficulty
parameter. The steeper the slope, the greater the
discrimination parameter.

[Figure: item characteristic curves. Axes:
Probability, Ability Scale.]
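
Assuming the three-parameter logistic form (an assumption; the slide shows only plotted curves), the statements about b and a on this slide can be checked directly:

```latex
P(\theta) = c + \frac{1 - c}{1 + e^{-a(\theta - b)}}
\quad\Longrightarrow\quad
P(b) = c + \frac{1 - c}{2} = \frac{1 + c}{2},
\qquad
\left.\frac{dP}{d\theta}\right|_{\theta = b} = \frac{a\,(1 - c)}{4}
```

With c = 0.20, the curve passes through P = 0.60 at θ = b, and for a fixed c the slope at that point grows in proportion to a.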
16
Ability Score

1. The test developer collects a set of observed
item responses from a relatively large number of
test takers.
2. After an initial examination of how well
various models fit the data, an IRT model is
selected.
3. Through an iterative procedure, parameter
estimates are assigned to items and ability
scores to individuals, so as to maximize the
agreement, or fit between the particular IRT
model and the test data.
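
Step 3 estimates item parameters and ability scores together. As a simplified sketch of the ability half only, the code below assumes the item parameters are already known, assumes a two-parameter logistic model, and finds the ability that maximizes the likelihood by a plain grid search. The function names, grid range, and item values are illustrative assumptions; operational software uses more refined iterative methods.

```python
import math

def icc_2pl(theta, a, b):
    """Two-parameter logistic ICC (illustrative model choice)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of a 0/1 response pattern at a given ability."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = icc_2pl(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def estimate_ability(items, responses, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood ability estimate on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

# Hypothetical (a, b) item parameters, treated as known.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
print(estimate_ability(items, [1, 1, 0]))
print(estimate_ability(items, [1, 0, 0]))
```

A pattern with all items correct (or all incorrect) has no finite maximum, so the grid search simply returns an endpoint of the grid in that case.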

17
Ability Score
18
Item Information Function

The limitations of CTS theory approaches to
precision of measurement are addressed by the IRT
concept of the information function. The item
information function refers to the amount of
information a given item provides for estimating
an individual's level of ability, and is a
function of both the slope of the ICC and the
amount of variation at each ability level.
The information function of a given item will be
at its maximum for individuals whose ability is
at or near the value of the difficulty parameter.
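
For a two-parameter logistic item (an assumed model, since the slides do not fix one), the item information is a² · P · (1 − P): the a² factor reflects the slope of the ICC, and P(1 − P) is the variance of the 0/1 response at that ability level. The sketch below scans a grid of abilities for a hypothetical item to locate the peak.

```python
import math

def item_information(theta, a, b):
    """Item information for a 2PL item: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Hypothetical item with a = 1.5, b = 0.5, scanned from -4.0 to 4.0 in steps of 0.1.
grid = [i / 10.0 for i in range(-40, 41)]
peak_theta = max(grid, key=lambda t: item_information(t, 1.5, 0.5))
print(peak_theta, item_information(peak_theta, 1.5, 0.5))
```

In this model the peak falls exactly at θ = b. With a nonzero pseudo-chance parameter the peak shifts slightly above b, which is consistent with "at or near".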

19
Item Information Function
20
Item Information Function
21
Item Information Function

The information function of a given item will be
at its maximum for individuals whose ability is
at or near the value of the difficulty parameter.
Item (1) provides the most information about
differences in ability at the lower end of the
ability scale.
Item (2) provides relatively little information
at any point on the ability scale.
Item (3) provides the most information about
differences in ability at the high end of the
ability scale.
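
The figure these three statements describe is not preserved in the transcript. As a stand-in, the sketch below evaluates the two-parameter logistic information a² · P · (1 − P) for three hypothetical items whose parameter values were chosen only to reproduce the described pattern; none of the values come from the slides.

```python
import math

def info_2pl(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Hypothetical (a, b) values, not taken from the slides.
items = {
    1: (1.5, -2.0),  # easy, discriminating: informative at low ability
    2: (0.4, 0.0),   # weakly discriminating: little information anywhere
    3: (1.5, 2.0),   # hard, discriminating: informative at high ability
}

for theta in (-2.0, 0.0, 2.0):
    row = {k: round(info_2pl(theta, a, b), 3) for k, (a, b) in items.items()}
    print(theta, row)
```

The flat profile of the second item comes from its small a value: its information can never exceed a²/4.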

22
Test Information Function

The test information function (TIF) is the sum of
the item information functions, each of which
contributes independently to the total, and is a
measure of how much information a test provides
at different ability levels.
The TIF is the IRT analog of CTS theory
reliability and the standard error of
measurement.
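
A minimal sketch of this relationship, assuming two-parameter logistic items with hypothetical parameters: the test information is the sum of the item informations, and the standard error of the ability estimate at a given ability is 1 / sqrt(TIF). The function names `tif` and `standard_error` are illustrative.

```python
import math

def item_information(theta, a, b):
    """2PL item information: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def tif(theta, items):
    """Test information: sum of the item information functions at theta."""
    return sum(item_information(theta, a, b) for a, b in items)

def standard_error(theta, items):
    """Standard error of the ability estimate: 1 / sqrt(test information)."""
    return 1.0 / math.sqrt(tif(theta, items))

# Hypothetical (a, b) item parameters.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(tif(theta, items), 3), round(standard_error(theta, items), 3))
```

Unlike the single CTS standard error of measurement, this standard error varies with ability: it is smallest where the test is most informative.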

23
Item Bank

If there is a need for regular test
administration and analysis, the construction of
an item bank may be taken into consideration.
An item bank is not simply a collection of test
items in their raw form; the items are organized
with parameters assigned on the basis of CTS or
IRT models.
An item bank should also have a data processing
system that assures the steady quality of the
data in the bank (describing, classifying,
accepting, and rejecting items).
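
One way such a bank could be represented is sketched below. Every name here (`BankItem`, `ItemBank`, the `status` field) and both acceptance thresholds are hypothetical choices for illustration; the slides do not specify a data model or any cutoff values.

```python
from dataclasses import dataclass, field

@dataclass
class BankItem:
    item_id: str
    text: str
    a: float  # IRT discrimination
    b: float  # IRT difficulty
    c: float  # IRT pseudo-chance
    status: str = "pending"  # "pending", "accepted", or "rejected"

@dataclass
class ItemBank:
    items: dict = field(default_factory=dict)

    def add(self, item):
        """Store an item under its identifier."""
        self.items[item.item_id] = item

    def review(self, min_a=0.5, max_c=0.35):
        """Accept or reject every item using illustrative (hypothetical) thresholds."""
        for item in self.items.values():
            ok = item.a >= min_a and item.c <= max_c
            item.status = "accepted" if ok else "rejected"

    def accepted(self):
        """Return the items currently marked as accepted."""
        return [i for i in self.items.values() if i.status == "accepted"]

bank = ItemBank()
bank.add(BankItem("Q1", "Hypothetical item text", a=1.1, b=-0.3, c=0.20))
bank.add(BankItem("Q2", "Hypothetical item text", a=0.2, b=1.4, c=0.25))
bank.review()
print([i.item_id for i in bank.accepted()])
```

Keeping the parameters next to the item text is what distinguishes the bank from a raw item collection: items can be selected or screened by their statistical properties.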

24
Specifications in CTS Item Bank