Title: Designing a Classroom Test
1. Designing a Classroom Test
- Anthony Paolo, PhD
- Director of Assessment & Evaluation
- Office of Medical Education
- Psychometrician for CTC
- Teaching & Learning Technologies
- September 2008
2. Content
- Purpose of classroom test
- Test blueprint specifications
- Item writing
- Assembling the test
- Item analysis
3. Purpose of a Classroom Test
- Establish a basis for assigning grades
- Determine how well each student has achieved course objectives
- Diagnose student problems
- Identify areas where instruction needs improvement
- Motivate students to study
- Communicate what material is important
4. Test Blueprint
- To ensure the test assesses what you want to measure
- To ensure the test assesses the level or depth of learning you want to measure
5. Bloom's Revised Cognitive Taxonomy
- Remembering & Understanding
- Remembering: retrieving, recognizing, and recalling relevant knowledge.
- Understanding: constructing meaning from information through interpreting, classifying, summarizing, inferring, and explaining.
- ITEM TYPES: MC, T/F, Matching, Short Answer
- Applying & Analyzing
- Applying: implementing a procedure or process.
- Analyzing: breaking material into constituent parts and determining how the parts relate to one another and to an overall structure or purpose through differentiating, organizing, and attributing.
- ITEM TYPES: MC, Short Answer, Problems, Essay
- Evaluating & Creating
- Evaluating: making judgments based on criteria and standards through checking and critiquing.
- Creating: putting elements together to form a coherent or functional whole; reorganizing elements into a new pattern or structure through generating, planning, or producing.
- ITEM TYPES: MC, Essay
6. Test Blueprint
7. Test Specifications
- To ensure the test covers the content and/or objectives in the proper proportions
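In practice, a specification table reduces to proportional arithmetic: multiply each content area's weight by the total item count. A minimal sketch, assuming hypothetical content areas and weights (none of these come from the slides):

```python
# Hypothetical content areas and emphasis weights (should sum to 1.0)
weights = {"Anatomy": 0.40, "Physiology": 0.35, "Pathology": 0.25}
total_items = 50

# Items allocated to each area in proportion to its weight
allocation = {area: round(w * total_items) for area, w in weights.items()}
print(allocation)   # rounding may need a manual tweak to hit the exact total
```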
8. Test Specifications
9. Item Writing - General Guidelines (1)
- Present a single, clearly defined problem that is based on a significant concept rather than trivial or esoteric ideas
- Use simple, precise, unambiguous wording
- Exclude extraneous or irrelevant information
- Eliminate any systematic pattern of answers that may allow guessing correctly
10. Item Writing - General Guidelines (2)
- Avoid cultural, racial, ethnic, or sexual bias.
- Avoid presupposed knowledge that favors one group over another (e.g., a "fly ball" item favors those who know baseball).
- Refrain from providing unnecessary clues to the correct answer.
- Avoid negatively phrased items (e.g., "except", "not").
- Arrange answers in alphabetical / numerical order.
11. Item Writing - General Guidelines (3)
- Avoid "None of the above" or "All of the above" type answers
- Avoid "Both A & B" or "Neither A nor B" type answers
12. Item Writing - The Correct Answer:
- Is longer
- Is more qualified or more general
- Uses familiar phraseology
- Is grammatically consistent with the item stem
- Is one of two similar statements
- Is one of two opposite statements
13. Item Writing - The Wrong Answer:
- Is usually the first or last option
- Contains extreme words (always, never, nonsense, etc.)
- Contains unexpected language or technical terms
- Contains flippant remarks or completely unreasonable statements
14. Item Writing - Grammatical Cues
15. Item Writing - Logical Cues
16. Item Writing - Absolute Terms
17. Item Writing - Word Repeats
18. Item Writing - Vague Terms
19. Item Writing - Vague Terms
20. Item Writing
- Effective test items match the desired depth of learning as directly as possible.
- Applying & Analyzing
- Applying: implementing a procedure or process.
- Analyzing: breaking material into constituent parts and determining how the parts relate to one another and to an overall structure or purpose through differentiating, organizing, and attributing.
- ITEM TYPES: MC, Short Answer, Problems, Essay
21. Comparison of MC & Essay (1)
22. Comparison of MC & Essay (2)
23. Item Writing - Application
- MC application-of-knowledge items tend to have long vignettes that require decisions.
- Case et al. at the NBME investigated the impact on item performance of increasing the levels of interpretation, analysis, and synthesis required to answer a question. (Academic Medicine, 1996;71:528-530)
24. Item Writing - Application
25. Item Writing - Application
26. Item Writing - Application
27. Preparing & Assembling the Test
- Provide general directions
- Time allowed (allow enough time to complete the test)
- How items are scored
- How to record answers
- How to record name / ID
- Arrange items systematically
- Provide adequate space for short-answer and essay responses
- Placement of easier & harder items
28. Interpreting Test Scores
- Teachers:
- High scores = good instruction
- Low scores = poor students
- Students:
- High scores = smart, well-prepared
- Low scores = poor teaching, bad test
29. Interpreting Test Scores
- High scores:
- Test too easy, only measured simple educational objectives, biased scoring, cheating, unintentional clues to right answers
- Low scores:
- Test too hard, tricky questions, content not covered in class, grader bias, insufficient time to complete the test
30. Item Analysis
- The main purpose of item analysis is to improve the test
- Analyze items to identify:
- Potential mistakes in scoring
- Ambiguous / tricky items
- Alternatives that do not work well
- Problems with time limits
31. Reliability
- The reliability of a test refers to the extent to which it is likely to produce consistent results.
- Test-Retest
- Split-Half
- Internal consistency
- Reliability coefficients range from 0 (no reliability) to 1 (perfect reliability)
- Internal consistency is usually measured by Kuder-Richardson 20 (KR-20) or Cronbach's coefficient alpha
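For illustration, KR-20 can be computed directly from a 0/1 score matrix using the standard formula KR-20 = k/(k-1) * (1 - sum(p*q) / variance of total scores). A minimal sketch, assuming a made-up five-student, five-item matrix (nothing here comes from the slides beyond the formula itself):

```python
import numpy as np

# Hypothetical data: rows = students, columns = items (1 = correct)
scores = np.array([
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
])

k = scores.shape[1]                          # number of items
p = scores.mean(axis=0)                      # proportion correct per item
q = 1 - p                                    # proportion incorrect per item
var_total = scores.sum(axis=1).var(ddof=1)   # variance of students' total
                                             # scores (conventions on ddof vary)

kr20 = (k / (k - 1)) * (1 - (p * q).sum() / var_total)
print(f"KR-20 = {kr20:.2f}")                 # 0.55 for this toy data
```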
32. Internal Consistency Reliability
- High reliability means that the questions on the test tended to hang together: students who answered a given question correctly were more likely to answer other questions correctly.
- Low reliability means that the questions tended to be unrelated to each other in terms of who answered them correctly.
33. Reliability Coefficient Interpretation
- General guidelines for homogeneous tests:
- .80 and above: Very good reliability
- .70 to .80: Good reliability; a few test items may need to be improved
- .50 to .70: Somewhat low; several items will likely need improvement (unless the test is short, 15 or fewer items)
- .50 and below: Questionable reliability; the test likely needs revision
34. Item Difficulty (1)
- Proportion of students who got the item correct (ranges from 0 to 100)
- Helps evaluate whether an item is suited to the level of examinee being tested.
- Very easy or very hard items cannot adequately discriminate between student performance levels.
- The spread of student scores is maximized with items of moderate difficulty.
35. Item Difficulty (2)
- Moderate item difficulty is the point halfway between a perfect score and a chance score.
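To make that rule concrete: the chance score on a k-option MC item is 100/k percent, so the moderate-difficulty target is (100 + 100/k) / 2. A small sketch of the arithmetic (the option counts chosen are just examples):

```python
def moderate_difficulty(n_options: int) -> float:
    """Halfway point between a chance score and a perfect score (in %)."""
    chance = 100.0 / n_options           # expected % correct by guessing alone
    return (100.0 + chance) / 2.0

for k in (2, 4, 5):
    print(f"{k}-option item: target difficulty = {moderate_difficulty(k):.1f}%")
# 2-option: 75.0%   4-option: 62.5%   5-option: 60.0%
```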
36. Item Discrimination (1)
- How well the item separates those who know the material from those who do not.
- In LXR, measured by the point-biserial (rpb) correlation (ranges from -1 to 1).
- rpb is the correlation between item and exam performance.
37. Item Discrimination (2)
- A positive rpb means that those scoring higher on the exam were more likely to answer the item correctly (better discrimination).
- A negative rpb means that high scorers on the exam answered the item wrong more frequently than low scorers (poor discrimination).
- A desirable rpb correlation is 0.20 or higher.
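Because the point-biserial is simply a Pearson correlation between a 0/1 item score and the total exam score, it is easy to verify by hand. A minimal sketch with invented response data (note that some programs correlate against a corrected total that excludes the item itself):

```python
import numpy as np

item = np.array([1, 1, 0, 1, 0, 1, 0, 0])           # 1 = answered correctly
exam = np.array([92, 85, 60, 78, 55, 88, 70, 48])   # total exam scores

rpb = np.corrcoef(item, exam)[0, 1]                 # point-biserial correlation
print(f"rpb = {rpb:.2f}")                           # >= 0.20 is desirable
```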
38. Evaluation of Distractors
- Distractors are designed to fool those who do not know the material; those who do not know the answer guess among the choices.
- Distractors should be equally popular.
- (% expected = % who answered the item wrong / # of distractors)
- Distractors ideally have a low or negative rpb.
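A quick way to apply that check: split the wrong answers evenly across the distractors and compare each distractor's observed count to that expectation. The response counts below are invented for illustration:

```python
counts = {"A": 6, "B": 40, "C": 10, "D": 4}   # hypothetical response counts
key = "B"                                     # keyed correct answer

n_wrong = sum(n for opt, n in counts.items() if opt != key)
expected = n_wrong / (len(counts) - 1)        # equal share per distractor

for opt, n in counts.items():
    if opt != key:
        print(f"Distractor {opt}: chosen {n}, expected ~{expected:.1f}")
```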
39. LXR Example 1 (* = correct answer)
Very easy item; would probably review the alternatives to make sure they are not ambiguous and do not provide clues that they are wrong.
40. LXR Example 2 (* = correct answer)
Three of the alternatives are not functioning well; would review them.
41. LXR Example 3 (* = correct answer)
Probably a miskeyed item; the correct answer is likely option E.
42. LXR Example 4 (* = correct answer)
Relatively hard item with good discrimination. Would review alternatives C & D to see why they attract a relatively low / high number of students.
43. LXR Example 5 (* = correct answer)
Poor discrimination for correct choice B. Choice E actually does a better job of discriminating. Would review the item for proper keying, ambiguous wording, proper wording of alternatives, etc. This item needs revision.
44. Resources
- Constructing Written Test Questions for the Basic and Clinical Sciences (www.nbme.org)
- How to Prepare Better Multiple-Choice Test Items: Guidelines for University Faculty, Brigham Young University (testing.byu.edu/info/handbooks/betteritems.pdf)
45. Thank you for your time. Questions?