Title: IRT basics: Theory and parameter estimation
1IRT basics Theory and parameter estimation
- Wayne C. Lee, David Chuah, Patrick Wadlington,
Steve Stark, Sasha Chernyshenko
2Overview
- How do I begin a set of IRT analyses?
- What do I need?
- Software
- Data
- What do I do?
- Input/ syntax files
- Examination of output
3Eye-ARE-What?
- Item response theory (IRT)
- Set of probabilistic models that
- Describe the relationship between a respondent's magnitude on a construct (a.k.a. latent trait; e.g., extraversion, cognitive ability, affective commitment)
- To his or her probability of a particular response to an individual item
4But what does that buy you?
- Provides more information than classical test theory (CTT)
- Classical test statistics depend on the set of items and sample examined
- IRT modeling is not dependent on the sample examined
- Can examine item bias/measurement equivalence and provide conditional standard errors of measurement
5Before we begin
- Data preparation
- Raw data must be recoded if necessary (negatively worded items must be reverse coded so that all items in the scale indicate a positive direction)
- Dichotomization (optional)
- Reducing multiple options into two separate values (0, 1; right, wrong)
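The recoding steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the 5-option scale, the cutoff of 4, and the choice of which items are negatively worded are all assumptions made for the example.

```python
def reverse_code(response, n_options=5):
    """Reverse a 1..n_options response so that high values mean 'more of the trait'."""
    return n_options + 1 - response


def dichotomize(response, cutoff=4):
    """Collapse a multi-option response to 0/1 (1 if at or above the assumed cutoff)."""
    return 1 if response >= cutoff else 0


# Hypothetical respondent: five items, with items at index 1 and 3 negatively worded.
raw = [5, 1, 4, 2, 3]
negatively_worded = {1, 3}

recoded = [reverse_code(r) if i in negatively_worded else r
           for i, r in enumerate(raw)]
binary = [dichotomize(r) for r in recoded]

print(recoded)  # [5, 5, 4, 4, 3]
print(binary)   # [1, 1, 1, 1, 0]
```

Where to place the cutoff is a substantive decision about the scale; the value used here is only a placeholder.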
6Calibration and validation files
- Data are split into two separate files
- Calibration sample for estimating IRT parameters
- Validation sample for assessing the fit of the model to the data
- Data files for the programs that we will be discussing must be in ASCII/text format
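A rough sketch of the split and the ASCII layout, assuming a random 50/50 split and records made of a 4-character ID followed by one character per item (the layout the `(4A1,10A1)` format statements in the input files describe). The function names, the seed, and the sample data are hypothetical.

```python
import random


def split_sample(records, calibration_fraction=0.5, seed=0):
    """Randomly split records into (calibration, validation) lists."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * calibration_fraction)
    return shuffled[:cut], shuffled[cut:]


def to_ascii_line(respondent_id, responses, id_width=4):
    """Zero-padded ID followed by one character per item response."""
    return f"{respondent_id:0{id_width}d}" + "".join(str(r) for r in responses)


# Hypothetical data: 20 respondents, 10 dichotomous items each.
records = [(i, [i % 2] * 10) for i in range(1, 21)]
calibration, validation = split_sample(records)

calibration_text = "\n".join(to_ascii_line(rid, resp) for rid, resp in calibration)
print(calibration_text.splitlines()[0])  # one 14-character record
```

Writing `calibration_text` and its validation counterpart to plain text files gives the two files described above.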
7Investigating dimensionality
- The models presented make a common assumption of unidimensionality
- Hattie (1985) reviewed 30 techniques
- Some propose the ratio of the 1st eigenvalue to the 2nd eigenvalue (Lord, 1980)
- On-line we describe how to examine the eigenvalues following Principal Axis Factoring (PAF)
8PAF and scree plots
- If the data are dichotomous, factor analyze tetrachoric correlations
- Assume continuum underlies item responses
Dominant first factor
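The eigenvalue-ratio check can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions: it works on the unreduced correlation matrix (PAF would first replace the diagonal with communality estimates), it takes the correlation matrix as given (tetrachoric correlations would be computed elsewhere for dichotomous data), and it uses power iteration where a linear-algebra library would normally be used. The matrix is made up.

```python
def dominant_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its unit eigenvector, by power iteration."""
    n = len(matrix)
    vector = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        product = [sum(matrix[r][c] * vector[c] for c in range(n)) for r in range(n)]
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    product = [sum(matrix[r][c] * vector[c] for c in range(n)) for r in range(n)]
    eigenvalue = sum(v * p for v, p in zip(vector, product))  # Rayleigh quotient
    return eigenvalue, vector


def eigenvalue_ratio(corr):
    """Ratio of the first to the second eigenvalue of a correlation matrix."""
    n = len(corr)
    first, v = dominant_eigenpair(corr)
    # Deflate: remove the first component, then find the next-largest eigenvalue.
    deflated = [[corr[r][c] - first * v[r] * v[c] for c in range(n)] for r in range(n)]
    second, _ = dominant_eigenpair(deflated)
    return first / second


# Hypothetical 4-item matrix in which every pair of items correlates 0.5.
corr = [[1.0 if r == c else 0.5 for c in range(4)] for r in range(4)]
ratio = eigenvalue_ratio(corr)
print(round(ratio, 3))  # eigenvalues are 2.5 and 0.5, so the ratio is 5.0
```

A large ratio is read as evidence of a dominant first factor; what counts as large is a judgment call.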
9Two models presented
- The Three Parameter Logistic model (3PL)
- For dichotomous data
- E.g., cognitive ability tests
- Samejima's Graded Response model
- For polytomous data where options are ordered along a continuum
- E.g., Likert scales
Common models among applied psychologists
10The 3PL model
- Three parameters
- a = item discrimination
- b = item extremity/difficulty
- c = lower asymptote, pseudo-guessing
- Theta refers to the latent trait
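For reference, the 3PL item response function is commonly written as below. The item subscript i is added for clarity, and the scaling constant D is an assumption: D = 1.7 places the logistic curve on the normal-ogive metric, D = 1 leaves it on the logistic metric.

```latex
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + \exp\left[-D\,a_i\,(\theta - b_i)\right]}
```

At theta = b_i the probability is halfway between c_i and 1, and a_i controls how steeply the curve rises around that point.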
11Effect of the a parameter
12Effect of the a parameter
13Effect of the b parameter
14Effect of the b parameter
b is inversely related to the CTT p value (proportion correct)
15Effect of the c parameter
16Effect of the c parameter
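The effects of a, b, and c can be checked numerically with a short sketch of the 3PL curve. The default scaling constant D = 1.7 and all parameter values below are assumptions chosen for illustration.

```python
import math


def irf_3pl(theta, a, b, c, D=1.7):
    """Probability of a correct/positive response under the 3PL model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# At theta == b the curve sits halfway between c and 1.
midpoint = irf_3pl(0.0, a=1.0, b=0.0, c=0.25)
print(midpoint)  # 0.625

# Larger a: steeper rise around b.
steep = irf_3pl(0.5, a=2.0, b=0.0, c=0.0) - irf_3pl(-0.5, a=2.0, b=0.0, c=0.0)
flat = irf_3pl(0.5, a=0.5, b=0.0, c=0.0) - irf_3pl(-0.5, a=0.5, b=0.0, c=0.0)

# Larger b: lower probability at the same theta (a harder or more extreme item).
hard = irf_3pl(0.0, a=1.0, b=1.0, c=0.0)
easy = irf_3pl(0.0, a=1.0, b=-1.0, c=0.0)

# c: the probability that very low-theta respondents approach.
floor = irf_3pl(-10.0, a=1.0, b=0.0, c=0.2)

print(steep > flat, hard < easy, round(floor, 3))
```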
17Estimating 3PL parameters
- DOS version of BILOG (Scientific Software)
- Multiple files in directory, but small size overall
- Easier to estimate parameters for a large number of scales or experimental groups
- Data file must be saved as ASCII text
- ID number
- Individual responses
- Input file (ASCII text)
18BILOG input file (.BLG)
- AGREEABLENESS CALIBRATION FOR IRT TUTORIAL.
- >COMMENT
- >GLOBAL DFN='AGR2_CAL.DAT', NIDW=4, NPARM=3, OFNAME='OMIT.KEY', SAVE;
- >SAVE SCO='AGR2_CAL.SCO', PARM='AGR2_CAL.PAR', COV='AGR2_CAL.COV';
- >LENGTH NITEMS=(10);
- >INPUT SAMPLE=99999;
- (4A1,10A1)
- >TEST TNAME=AGR;
- >CALIB NQPT=40, CYC=100, NEW=30, CRIT=.001, PLOT=0;
- >SCORE MET=2, IDIST=0, RSC=0, NOPRINT;
23BILOG input file (.BLG)
- >CALIB NQPT=40, CYC=100, NEW=30, CRIT=.001, PLOT=0;
Estimation specifications (not the default for BILOG)
25Phase one output file (.PH1)
CLASSICAL ITEM STATISTICS FOR SUBTEST AGR
                NUMBER   NUMBER                        ITEM*TEST CORRELATION
ITEM  NAME      TRIED    RIGHT    PERCENT  LOGIT/1.7   PEARSON   BISERIAL
-------------------------------------------------------------------------
   1  0001     1500.0   1158.0    0.772      0.72       0.535     0.742
   2  0002     1500.0    991.0    0.661      0.39       0.421     0.545
   3  0003     1500.0   1354.0    0.903      1.31       0.290     0.500
   4  0004     1500.0   1187.0    0.791      0.78       0.518     0.733
   5  0005     1500.0    970.0    0.647      0.36       0.566     0.728
   6  0006     1500.0   1203.0    0.802      0.82       0.362     0.519
   7  0007     1500.0    875.0    0.583      0.20       0.533     0.674
   8  0008     1500.0    810.0    0.540      0.09       0.473     0.594
   9  0009     1500.0   1022.0    0.681      0.45       0.415     0.542
  10  0010     1500.0    869.0    0.579      0.19       0.426     0.538
-------------------------------------------------------------------------
Can indicate problems in parameter estimation
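The PERCENT and LOGIT/1.7 columns can be reproduced from the TRIED and RIGHT counts. The sketch below assumes LOGIT/1.7 is the log-odds of a correct response divided by 1.7; the printed values for the items shown are consistent with that reading, but the definition is inferred from the numbers rather than taken from BILOG documentation.

```python
import math


def classical_item_stats(n_tried, n_right):
    """Proportion correct and log-odds divided by 1.7 for one item."""
    p = n_right / n_tried
    logit_over_17 = math.log(p / (1.0 - p)) / 1.7
    return p, logit_over_17


# Item 0001 from the output above: 1158 right out of 1500 tried.
p, logit = classical_item_stats(1500.0, 1158.0)
print(f"{p:.3f} {logit:.2f}")  # 0.772 0.72
```

Items with very high or very low proportions correct, or with weak item-test correlations, are the ones most likely to cause trouble in the estimation phase.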
26Phase two output file (.PH2)
- CYCLE 12 LARGEST CHANGE 0.00116
- -2 LOG LIKELIHOOD 15181.4541
- CYCLE 13 LARGEST CHANGE 0.00071
- FULL NEWTON STEP
- -2 LOG LIKELIHOOD 15181.2347
- CYCLE 14 LARGEST CHANGE 0.00066
27Phase three output file (.PH3)
- Theta estimation
- Scoring of individual respondents
- Required for DTF analyses
28Parameter file (specified, .PAR)
- AGREEABLENESS CALIBRATION FOR IRT TUTORIAL.
- >COMMENT
- 1 10
- 10
0001AGR  111    1.130784    1.533393   -0.737439    0.652148    0.147203
                0.101834    0.185726    0.135455    0.078989    0.053688
0002AGR  211    0.360630    0.870309   -0.414371    1.149018    0.132796
                0.087236    0.097709    0.098866    0.129000    0.054461
0003AGR  311    1.474175    0.743095   -1.983831    1.345723    0.197127
                0.108974    0.084487    0.250499    0.153003    0.087578
0004AGR  411    1.196368    1.256263   -0.952323    0.796012    0.090901
                0.087856    0.114710    0.123613    0.072684    0.042937
0005AGR  511    0.544388    1.403904   -0.387767    0.712300    0.056774
                0.071490    0.133486    0.080438    0.067727    0.026086
0006AGR  611    0.892399    0.777440   -1.147869    1.286273    0.173882
                0.093109    0.082096    0.152846    0.135828    0.075829
0007AGR  711    0.174395    1.369223   -0.127368    0.730341    0.088135
                0.083777    0.159712    0.085084    0.085190    0.032376
(32X,2F12.6,12X,F12.6)
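The format statement `(32X,2F12.6,12X,F12.6)` reads as: skip 32 columns, read two 12-column numbers, skip 12 columns, read one more 12-column number. A minimal Python sketch of the same column slicing follows. The sample record is rebuilt from item 0001's values, and the exact contents of its first 32 columns are an assumption made so that the example is self-contained.

```python
def read_abc(line):
    """Slice a, b, c out of a fixed-width record per (32X,2F12.6,12X,F12.6)."""
    a = float(line[32:44])   # first F12.6 after the 32 skipped columns
    b = float(line[44:56])   # second F12.6
    c = float(line[68:80])   # F12.6 after 12 more skipped columns
    return a, b, c


# Hypothetical record: 20 columns of labels plus one 12-column number fill
# the 32 skipped columns, followed by four more 12-column numbers.
prefix = "0001AGR".ljust(20) + f"{1.130784:12.6f}"
line = prefix + f"{1.533393:12.6f}{-0.737439:12.6f}{0.652148:12.6f}{0.147203:12.6f}"

print(read_abc(line))  # (1.533393, -0.737439, 0.147203)
```

In the records shown, the fourth number is the reciprocal of the second (1 / 1.533393 is about 0.652148), which is why the format skips it.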
29PARTO3PL output (.3PL)
0001AGR  111    1.130784    1.533393   -0.737439    0.652148    0.147203
0002AGR  211    0.360630    0.870309   -0.414371    1.149018    0.132796
0003AGR  311    1.474175    0.743095   -1.983831    1.345723    0.197127
0004AGR  411    1.196368    1.256263   -0.952323    0.796012    0.090901
0005AGR  511    0.544388    1.403904   -0.387767    0.712300    0.056774
0006AGR  611    0.892399    0.777440   -1.147869    1.286273    0.173882
0007AGR  711    0.174395    1.369223   -0.127368    0.730341    0.088135
0008AGR  811    0.042231    0.979045   -0.043135    1.021403    0.056546
0009AGR  911    0.441586    0.839144   -0.526234    1.191691    0.129646
0010AGR 1011    0.104452    0.879683   -0.118738    1.136773    0.101087
                                   a           b                       c
30Scoring and covariance files
- Like the .PAR file, specifically requested
- .COV: provides parameters as well as the variances/covariances between the parameters
- Necessary for DIF analyses
- .SCO: provides ability score information for each respondent
31Samejima's Graded Response model
- Used when options are ordered along a continuum, as with Likert scales
- v = response to the polytomously scored item i
- k = particular option
- a = discrimination parameter
- b = extremity parameter
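Using the symbols defined above, the graded response model is usually written as a difference between two adjacent cumulative curves. The notation below is one common convention and is an assumption here: options are numbered k = 1, ..., m, each item has m - 1 ordered extremity parameters, and some presentations also multiply the exponent by a scaling constant of 1.7.

```latex
P(v_i = k \mid \theta)
  = \frac{1}{1 + \exp\left[-a_i(\theta - b_{i,k-1})\right]}
  - \frac{1}{1 + \exp\left[-a_i(\theta - b_{i,k})\right]},
\qquad b_{i,0} = -\infty,\; b_{i,m} = +\infty
```

With these boundary conventions the first term equals 1 for the lowest option and the second term equals 0 for the highest option, so the option probabilities sum to 1.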
32Sample SGR Plot
33Sample SGR Plot
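Option response curves like those in a sample plot can be generated from a short function. This is a sketch under assumptions: no 1.7 scaling constant is applied, and the discrimination and extremity values are invented for a five-option item.

```python
import math


def sgr_option_probabilities(theta, a, thresholds):
    """Option probabilities under the graded response model.

    thresholds: ordered extremity (b) parameters, one fewer than the options.
    """
    # Cumulative probabilities of responding in option k or higher,
    # bounded by 1 below the lowest option and 0 above the highest.
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


# Hypothetical five-option item with symmetric extremity parameters.
probs = sgr_option_probabilities(theta=0.0, a=1.2, thresholds=[-2.0, -0.5, 0.5, 2.0])
print([round(p, 3) for p in probs])
```

Evaluating the function over a grid of theta values and plotting one line per option gives the familiar set of overlapping option curves.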
34Running MULTILOG
- MULTILOG for DOS
- Example with DOS batch file
- INFORLOG with MULTILOG
- INFORLOG is typically interactive
- Process automated with batch file and an input file (described on-line)
- .IN1 (parameter estimation)
- .IN2 (scoring)
35The first input file (.IN1)
- CALIBRATION OF AGREEABLENESS GRADED RESPONSE MODEL
- >PRO IN RA NI=10 NE=1500 NCHAR=4 NG=1;
- >TEST ALL GR NC=(5,5,5,5,5,5,5,5,5,5);
- >EST NC=50;
- >SAVE;
- >END;
- 5
- 01234
- 1111111111
- 2222222222
- 3333333333
- 4444444444
- 5555555555
- (4A1,10A1)
Title line
36The first input file (.IN1)
- >PRO IN RA NI=10 NE=1500 NCHAR=4 NG=1;
Number of items, examinees, characters in the ID field, single group
38The first input file (.IN1)
- >EST NC=50;
Number of cycles for estimation
- >END;
End of command syntax
39The first input file (.IN1)
- 5
- 01234
Five characters denoting five options
40The first input file (.IN1)
- 1111111111
- 2222222222
- 3333333333
- 4444444444
- 5555555555
Recoding of options for MULTILOG
41The second input file (.IN2)
- SCORING AGREEABLENESS SCALE SGR MODEL
- >PRO SCORE IN RA NI=10 NE=1500 NCHAR=4 NG=1;
- >TEST ALL GR NC=(5,5,5,5,5,5,5,5,5,5);
- >START;
- Y
-
- >SAVE;
- >END;
- 5
- 12345
- 1111111111
- 2222222222
- 3333333333
- 4444444444
- 5555555555
- (4A1,10A1)
Scoring
Yes to INFORLOG (parameters in a separate file)
42Running MULTILOG
- Run the batch file
- .IN1 → .LS1 (.lis file renamed as .ls1)
- Ensure that the data were read in and the model specified correctly
- Also provides a report of the estimation procedure with the estimated item parameters
- Things of note
43Collapsing options
44Scoring output
- .IN2 → .LS2
- Last portion of the file contains the person
parameters (estimated theta, standard error, the
number of iterations used, and the respondent's
ID number).
45What now?
- Review
- Data requirements for IRT
- Two models: 3PL (dichotomous), SGR (polytomous); more on-line!
- MODFIT
- Can plot IRFs, ORFs
- Model-data fit: input parameters, validation sample