Creating Valid and Reliable Classroom Tests

Transcript and Presenter's Notes
1
Creating Valid and Reliable Classroom Tests
  • James A. Wollack, PhD
  • John Siegler, PhD
  • Taehoon Kang
  • Craig S. Wells
  • Testing & Evaluation Services

2
Creating Valid and Reliable Classroom Tests
Session IV: Evaluating the Test
  • Recap of Session III
  • Item analysis overview
  • Requesting TE analyses
  • Completing the SRF form
  • Explanation of output
  • Item analysis and item revision exercise
  • Question & Answer Session
  • Workshop Evaluation

3
Recap of Session III: Writing Essay and
Short-Answer Tests
  • Rules for Writing Constructed-Response Items
  • Scoring Considerations
  • Developing Scoring Rubrics
  • Group Exercise: Developing a Scoring Rubric
  • Question & Answer Session

4
The Testing Cycle
  • Typical classroom testing
  • Item Development → Test Administration → Scoring

5
The Testing Cycle
  • Better classroom testing
  • Item Development → Test Administration → Scoring
  • Test Blueprint
6
The Testing Cycle
  • Ideal model for classroom testing
  • Test data should inform you about the
    appropriateness of the content and the
    effectiveness of the individual items in future
    exams.
  • Students in your classes change, but assessment
    is ongoing

7
Item Evaluation
  • People spend a lot of time developing items, but
    too often don't analyze how well the items worked
  • Administering the test will provide lots of data
    that can be used to study items.
  • Item analysis
  • Provides breakdown of how different types of
    students performed on various aspects of each
    item.
  • Particularly useful for multiple-choice items

8
Item Analysis Overview
  • Item analysis can help answer the following
    questions
  • How hard is this item?
  • How well does performance on this item predict
    overall achievement level?
  • Are students finding the item distractors
    attractive?
  • Is the item confusing?
  • Does the item have more than one right answer?
  • For what type of student is this item ideal?
  • Is the timing of the test appropriate?

9
Sample Item Analysis for One Item
  • The item analysis (IA) contains two parts
  • a picture on the left: PERCENT RESPONDING CORRECTLY BY QUINTILE
  • a matrix of numbers on the right: MATRIX RESPONDING BY QUINTILE

  [Plot: percent correct (0-100) for each quintile group, 5th through 1st]

          A      B      C      D      E      O      M
  5TH     9      2      2      3      0      0      0
  4TH     7      1      6      3      0      0      0
  3RD     4      2      7      3      0      0      0
  2ND     2      6      7      2      0      0      0
  1ST     7      4      3      1      0      0      1
  PROP  0.35   0.18   0.30   0.15   0.00   0.00   0.01
  RPBI  0.18  -0.21  -0.07   0.11   0.00   0.00  -0.09
10
Left Hand Side of Item Analysis
  • PERCENT RESPONDING CORRECTLY BY QUINTILE
  [Plot: percent correct (0-100) for each quintile group, 5th through 1st]

Students are divided into quintile groups based
on total score. The top quintile (5th) includes the
top 20% of the students; the 4th quintile includes
students in the 61st-80th percentiles; and so on down
to the 1st quintile, which includes students in the
1st-20th percentiles.
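The grouping just described can be sketched in code. This is a minimal, hypothetical helper (not part of the Testing & Evaluation Services output): it ranks students by total score and assigns quintile labels, 1 for the bottom 20% up to 5 for the top 20%.

```python
# Hypothetical sketch: assign each student a quintile label (1 = bottom
# 20% of total scores, 5 = top 20%), as described on this slide.
def quintile_groups(total_scores):
    """Return a quintile label (1-5) for each student, by total score."""
    n = len(total_scores)
    # Rank students from lowest to highest total score.
    order = sorted(range(n), key=lambda i: total_scores[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        # Integer arithmetic splits the ranks into five equal-size bands.
        labels[i] = min(5, rank * 5 // n + 1)
    return labels

scores = [12, 25, 17, 30, 9, 22, 14, 28, 19, 11]
print(quintile_groups(scores))
```

With ten students, each quintile gets exactly two; real class sizes rarely divide evenly, and ties in total score would need a tie-breaking rule, which this sketch leaves to list position.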
11
Left Hand Side of Item Analysis
  • PERCENT RESPONDING CORRECTLY BY QUINTILE
  [Plot: percent correct (0-100) for each quintile group, 5th through 1st]

Shows the percentage of students in each quintile group
answering the item correctly. Ideally these points
will form a straight line with a relatively steep
slope, i.e., large jumps in percent correct for each
unit increase in quintile. The picture is often not
clean, particularly with fewer than 100
examinees. At a minimum, the picture should have a
positive slope. The picture is a heuristic device; use it
cautiously.
12
Right Hand Side of Sample Item Analysis
  • MATRIX RESPONDING BY QUINTILE
  • A B C D E O M
  • 5TH 9 2 2 3 0 0 0
  • 4TH 7 1 6 3 0 0 0
  • 3RD 4 2 7 3 0 0 0
  • 2ND 2 6 7 2 0 0 0
  • 1ST 7 4 3 1 0 0 1
  • PROP 0.35 0.18 0.30 0.15 0.00 0.00 0.01
  • RPBI 0.18 -0.21 -0.07 0.11 0.00 0.00 -0.09

Students are again divided into quintile groups
based on total score
13
Right Hand Side of Sample Item Analysis
  • MATRIX RESPONDING BY QUINTILE
  • A B C D E O M
  • 5TH 9 2 2 3 0 0 0
  • 4TH 7 1 6 3 0 0 0
  • 3RD 4 2 7 3 0 0 0
  • 2ND 2 6 7 2 0 0 0
  • 1ST 7 4 3 1 0 0 1
  • PROP 0.35 0.18 0.30 0.15 0.00 0.00 0.01
  • RPBI 0.18 -0.21 -0.07 0.11 0.00 0.00 -0.09

A-E correspond to the item alternatives; O = omits
(i.e., item not answered); M = multiple (i.e.,
more than one answer selected)
14
Right Hand Side of Sample Item Analysis
  • MATRIX RESPONDING BY QUINTILE
  • A B C D E O M
  • 5TH 9 2 2 3 0 0 0
  • 4TH 7 1 6 3 0 0 0
  • 3RD 4 2 7 3 0 0 0
  • 2ND 2 6 7 2 0 0 0
  • 1ST 7 4 3 1 0 0 1
  • PROP 0.35 0.18 0.30 0.15 0.00 0.00 0.01
  • RPBI 0.18 -0.21 -0.07 0.11 0.00 0.00 -0.09

Indicates the number of students in each quintile
group who selected each item alternative.
For example, 6 students in the 4th quintile selected
alternative C.
We want to see numbers decreasing from the 5th to the 1st
quintile for the key, and increasing from the 5th to the 1st
quintile for the distractors
15
Right Hand Side of Sample Item Analysis
  • MATRIX RESPONDING BY QUINTILE
  • A B C D E O M
  • 5TH 9 2 2 3 0 0 0
  • 4TH 7 1 6 3 0 0 0
  • 3RD 4 2 7 3 0 0 0
  • 2ND 2 6 7 2 0 0 0
  • 1ST 7 4 3 1 0 0 1
  • PROP 0.35 0.18 0.30 0.15 0.00 0.00 0.01
  • RPBI 0.18 -0.21 -0.07 0.11 0.00 0.00 -0.09

Short for Proportion. Indicates the proportion of
students selecting each alternative (column).
The PROP for the correct answer (shown in brackets) is
referred to as the item difficulty; the PROPs for the
incorrect answers are called distractor
difficulties
16
Right Hand Side of Sample Item Analysis
  • MATRIX RESPONDING BY QUINTILE
  • A B C D E O M
  • 5TH 9 2 2 3 0 0 0
  • 4TH 7 1 6 3 0 0 0
  • 3RD 4 2 7 3 0 0 0
  • 2ND 2 6 7 2 0 0 0
  • 1ST 7 4 3 1 0 0 1
  • PROP 0.35 0.18 0.30 0.15 0.00 0.00 0.01
  • RPBI 0.18 -0.21 -0.07 0.11 0.00 0.00 -0.09

Item difficulties range from 0.00 to 1.00. Hard
items have difficulties less than 0.35; easy
items have difficulties above 0.85. Items that
are too hard or too easy will not contribute much
to the test's reliability
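As a rough illustration of these definitions, the difficulty calculation and the hard/easy cutoffs quoted above can be sketched as follows (the function names are hypothetical; the 0.35 and 0.85 thresholds are the ones on this slide):

```python
# Hypothetical sketch: item difficulty (PROP for the keyed alternative)
# and the hard/easy classification from this slide.
def item_difficulty(responses, key):
    """Proportion of examinees who chose the keyed alternative."""
    return sum(r == key for r in responses) / len(responses)

def classify(p):
    """Label a difficulty using the slide's rules of thumb."""
    if p < 0.35:
        return "hard"
    if p > 0.85:
        return "easy"
    return "moderate"

responses = list("AACBD" * 4)   # 20 simulated answer selections
p = item_difficulty(responses, key="A")
print(p, classify(p))           # 8 of 20 chose A -> 0.4, "moderate"
```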
17
Right Hand Side of Sample Item Analysis
  • MATRIX RESPONDING BY QUINTILE
  • A B C D E O M
  • 5TH 9 2 2 3 0 0 0
  • 4TH 7 1 6 3 0 0 0
  • 3RD 4 2 7 3 0 0 0
  • 2ND 2 6 7 2 0 0 0
  • 1ST 7 4 3 1 0 0 1
  • PROP 0.35 0.18 0.30 0.15 0.00 0.00 0.01
  • RPBI 0.18 -0.21 -0.07 0.11 0.00 0.00 -0.09

Short for Point-Biserial Correlation. Indicates
the correlation between a student's score on the
item alternative (1 = selected, 0 = not
selected) and their total score on the test.
The RPBI for the correct answer is referred to as the
item discrimination; the RPBIs for the incorrect answers
are called distractor discriminations
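The point-biserial correlation just described can be computed directly as an ordinary Pearson correlation between the 0/1 selection indicator and total score. A minimal sketch (hypothetical function; it assumes at least one student selected the alternative and at least one did not, so neither standard deviation is zero):

```python
import math

# Hypothetical sketch: RPBI between selecting an alternative (1/0)
# and total test score, computed as a Pearson correlation.
def point_biserial(selected, totals):
    n = len(totals)
    mean_t = sum(totals) / n
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in totals) / n)
    mean_s = sum(selected) / n
    # Population SD of a 0/1 variable is sqrt(p * (1 - p)).
    sd_s = math.sqrt(mean_s * (1 - mean_s))
    cov = sum((s - mean_s) * (t - mean_t)
              for s, t in zip(selected, totals)) / n
    return cov / (sd_s * sd_t)

# High scorers selected this alternative: RPBI should be strongly positive.
selected = [1, 1, 1, 0, 0, 0]
totals = [30, 25, 20, 15, 10, 5]
print(round(point_biserial(selected, totals), 3))
```

Reversing the selection pattern (low scorers selecting) flips the sign, which is the behavior described for distractors on the next slide.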
18
Item Discrimination
  • Range from -1.0 to 1.0
  • Interpreting the sign
  • Positive values mean that students who selected
    the alternative tended to have high scores and
    students who did not select the alternative
    tended to have low scores.
  • The RPBI for the key (i.e., item discrimination)
    should be positive.
  • Negative values mean that students who selected
    the alternative tended to have low scores and
    students who did not select the alternative
    tended to have high scores.
  • The RPBI for the distractors should be negative.
  • Values near zero mean that there is no
    relationship between that item alternative and
    total score.

19
Item Discrimination
  • Range from -1.0 to 1.0
  • Interpreting the magnitude
  • Values of 1.0 (or -1.0) mean that there is a
    perfect linear relationship between selecting the
    alternative and total score.
  • Will never happen in practice.
  • On classroom tests, discriminations rarely get
    above .65 in absolute magnitude.
  • The higher the values, the better that choice is
    able to discriminate between strong and weak
    students.

20
What Are We Looking For In An Item?
  • Item Difficulty
  • Ideally, should be between .35 and .85
  • Items that are too easy or too hard will often
    not discriminate well
  • Distractor Difficulties
  • Should be at least .02
  • Item Discrimination
  • At least 0.20 for classroom exams
  • Higher is better
  • .30 or higher for standardized measures.
  • Distractor Discriminations
  • All should be negative
  • The more negative, the better
  • The larger the distractor difficulty, the
    stronger the distractor discrimination should be
  • RPBI = -0.05, PROP = 0.08: OK
  • RPBI = -0.05, PROP = 0.25: problem with the
    alternative
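The rules of thumb above can be combined into a simple screening check. This is a hypothetical sketch (not the actual TES output format), applied to the sample item from these slides, where the key is A:

```python
# Hypothetical sketch: screen an item against this slide's rules of thumb.
def flag_item(key_prop, key_rpbi, distractor_rpbis):
    flags = []
    # Item difficulty should ideally fall between .35 and .85.
    if not 0.35 <= key_prop <= 0.85:
        flags.append("difficulty outside .35-.85")
    # Item discrimination should be at least .20 for classroom exams.
    if key_rpbi < 0.20:
        flags.append("discrimination below .20")
    # All distractor discriminations should be negative.
    for alt, r in distractor_rpbis.items():
        if r >= 0:
            flags.append(f"distractor {alt} has a non-negative RPBI")
    return flags

# Sample item from these slides: key A (PROP .35, RPBI .18).
print(flag_item(0.35, 0.18, {"B": -0.21, "C": -0.07, "D": 0.11}))
```

For the sample item this flags the weak item discrimination and distractor D, matching the revision discussion on the slides that follow; note it does not apply the "the larger the PROP, the stronger the distractor RPBI should be" judgment, which is left to the item writer.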

21
Using Item Analyses to Guide Item Revision
  • Items with negative or low positive RPBIs should
    be either revised or deleted from item bank.
  • To understand how to revise, if at all, look at
    distractor characteristics
  • Distractors with RPBIs that are either positive,
    or negative but too low considering the PROP,
    should be replaced.
  • Consider replacing distractors that are selected
    by too many or too few people
  • Don't change a distractor if the rest of the item is working well
  • For an item to be revised successfully, it is
    often necessary to have at least one solid
    distractor that will not be changed.
  • If either all distractors are poor, or none is
    particularly strong, delete item and write a
    brand new one.
  • Change only pieces of the item that caused
    problems
  • If an item fails, is revised, and fails again,
    delete it and write a new item.

22
Right Hand Side of Sample Item Analysis
  • MATRIX RESPONDING BY QUINTILE
  • A B C D E O M
  • 5TH 9 2 2 3 0 0 0
  • 4TH 7 1 6 3 0 0 0
  • 3RD 4 2 7 3 0 0 0
  • 2ND 2 6 7 2 0 0 0
  • 1ST 7 4 3 1 0 0 1
  • PROP 0.35 0.18 0.30 0.15 0.00 0.00 0.01
  • RPBI 0.18 -0.21 -0.07 0.11 0.00 0.00 -0.09

Item discrimination is lower than desired
The item is pretty hard
Alternative (D) has a positive discrimination
Alternative (C) has a low discrimination, given
its difficulty
Alternative (B) is working very well
Revision decision: certainly replace (D);
consider replacing (C) also
23
Requesting Item Analyses and Test Scoring
  • Testing & Evaluation Services
  • 373 Educational Sciences Bldg.
  • Pick up scannable answer sheets before testing
  • Requesting Output
  • Service Request Form (SRF) describes the nature
    of your data and the types of output you want.

24
Review of Item Analysis for Workshop Test
  • Divide into 4 groups
  • A-Er with Jim in front
  • Es-J with John in middle
  • K-R with Taehoon in back by the entry
  • S-Z with Craig in back on the other side

25
Questions?
26
Thanks for Coming and Participating
The workshop is scheduled to run again in October
Thanks to the UW Teaching Academy
27
Please Complete a Workshop Evaluation Form