Title: Evaluation: Testing, Objective-to-Test-Item Matching, and Judgments of Worth
1. Evaluation: Testing, Objective-to-Test-Item Matching, and Judgments of Worth
2. Session Overview
- Evaluation approaches
- Testing: one possible data point in evaluation
  - Norm-referenced
  - Criterion-referenced
- Objective-to-test-item matching
- Measurement error, reliability, and validity
3. Evaluation, Typically
- Typically, it doesn't happen! That said, it should, and it is required for many funded projects.
- What happened? Were goals and objectives achieved? How can we find that out?
- The end is NOT the only time to measure worth. When else?
- Strategies: tests, observations, surveys, chats with managers, looking at work, results
4. Evaluation Approaches
- Objectivist
  - Belief in a reality that can be known and measured. Prevalent in education and our business.
  - Objectives-based and deceptively simple: establish goals, set objectives, tailor instruction to the objectives, judge effectiveness.
  - Measures are analytical/quantitative in nature.
- Examples
  - Do first-graders know the letters of the alphabet?
  - Can the new account representative describe the features of each checking account as defined by the bank?
  - Others?
- Advantages/disadvantages?
5. Evaluation Approaches
- Constructivist
  - Belief that people construct their own realities. Advocates believe that truth is a matter of consensus, not measurement against an objective reality.
  - Evaluation creates detailed descriptions of what is inside the head of the learner.
  - Reliance upon open-ended exercises, observation, cases, and immersion in the field.
  - Observation is useful for us, in that IDs build prototypes, conduct formative evaluations, revise, and cycle again.
  - Measures are qualitative in nature.
- Examples
  - Role-play exercise to deal with a hostile customer
  - Theme Park Tycoon: running a theme park for a year
  - Essay question asking you to describe your understanding of Educational Technology
- Advantages/disadvantages?
6. Evaluation Approaches
- Postmodern/Critical
  - Objectivists proclaim objectivity. Constructivists approve of subjectivity. Postmoderns are social activists.
  - Focus on questions of power: Who are you to set objectives for others? Use of deconstruction to see what's inside texts and materials.
  - Most interested in the hidden curriculum, such as the teaching of traditional gender roles.
- What does the curriculum teach?
- Why should IDs care about this evaluation approach?
7. Evaluation Frameworks: Kirkpatrick's Model
- Level 4: Does it matter? Does it advance strategy?
- Level 3: Are they doing it (the objectives) consistently and appropriately?
- Level 2: Can they do it (the objectives)? Do they show the skills and abilities?
- Level 1: Did they like the experience? Satisfaction? Use? Repeat use?
8. Evaluation Frameworks: CIPP
- Context: assesses program/product needs, problems, or opportunities specific to the project environment.
- Input: assesses, evaluates, and allocates project resources in order to meet identified needs and objectives, solve problems, and optimize program impact.
- Process: assesses project implementation.
- Product: assesses planned and unintended (unforeseen) outcomes, both to keep a project on track and to determine effectiveness or impact.
9. Types of Tests
- Used to evaluate changes in skills and knowledge
- Is testing alone sufficient?
10. Test Types: Norm-Referenced
- Compare an individual's performance to the performance of other people.
- Require varying item difficulties.
- Assume not everybody is going to "get it"
- Discern those who "got it" from those who didn't.
11. Normal Distribution
12. Test Types: Norm-Referenced
- Norm-referenced tests compare the individual to the group.
- Accomplished statistically by norming the test with large numbers of people (see the sketch after this slide).
- Consider:
  - You sat for the GRE and received the following scores. You need to retake the test.
  - What is your study plan?
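To make the norming idea concrete, here is a minimal sketch (my own illustration, not from the course materials) of converting a raw score to a standing relative to a norm group; the function name `percentile_rank` and the sample scores are hypothetical.

```python
import math
import statistics

def percentile_rank(raw_score: float, norm_group: list[float]) -> tuple[float, float]:
    """Return (z-score, approximate percentile) for raw_score against a norm group."""
    mean = statistics.mean(norm_group)
    sd = statistics.stdev(norm_group)
    z = (raw_score - mean) / sd
    # Percentile under a normal-distribution assumption (the bell curve on slide 11).
    percentile = 0.5 * (1 + math.erf(z / math.sqrt(2))) * 100
    return z, percentile

# Hypothetical norm group; real norming uses large samples of test-takers.
norms = [62, 71, 68, 75, 80, 66, 73, 77]
z, pct = percentile_rank(74, norms)
print(f"z = {z:.2f}, roughly the {pct:.0f}th percentile")
```

The point of the sketch: the meaning of a norm-referenced score comes entirely from where it sits in the group's distribution, not from any fixed standard.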
13. Test Types: Norm-Referenced
- Limitations
- Not especially helpful for
- identifying individual skill deficiencies
- identifying weaknesses in the instruction
14. Test Types: Criterion-Referenced
- Compares an individual's performance to the acceptable standard of performance for those tasks.
- Requires completely specified objectives.
- Asks: Can this person do what has been specified in the objectives?
- Results in yes/no decisions about competence (a sketch of that decision logic follows this slide).
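As a rough illustration of that yes/no logic (my own sketch; the objectives and cut scores are hypothetical, loosely echoing the map and shoe examples on slide 19), a criterion-referenced decision compares each score to its criterion rather than to other learners:

```python
# Hypothetical objectives mapped to their criteria (proportion correct required).
cut_scores = {
    "state_abbreviations": 45 / 50,  # write 45 of 50 state abbreviations
    "shoe_diagnosis": 1.0,           # identify every specified fault
}

def mastery_report(scores: dict[str, float]) -> dict[str, bool]:
    """Compare each learner score to the criterion for that objective."""
    return {obj: scores.get(obj, 0.0) >= cut for obj, cut in cut_scores.items()}

print(mastery_report({"state_abbreviations": 47 / 50, "shoe_diagnosis": 0.8}))
# {'state_abbreviations': True, 'shoe_diagnosis': False}
```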
15. Test Types: Criterion-Referenced
- Applications
  - Diagnosis of individual skill deficiencies
  - Certification of skills
  - Evaluation and revision of instruction
- Limitations
  - Tend to focus on specific skills
  - Results may not reflect general aptitudes
  - Everyone may get an A
16. Which Test Is Which? (NR or CRT?)
- IQ test
- GRE
- SDSU Writing Competency
- Red Cross Lifesaving Certificate
- EDTEC 540 midterm and final exams
17. Which Test Is Which? (NR or CRT?)
- Give out a CA driver's license
- Pick students for Russian lang. training
- Determine entrance into medical school
- PADI Scuba Certification
- Select one EDTEC scholarship recipient
- Figure out where to revise a course
- Decide which students need remediation
18. Utility of Test Scores
- Selection screening (before)
  - mastery of prerequisites, for remediation/placement
  - mastery of course objectives, for acceleration (testing out)
- Individual diagnosis and prescription (along the way)
- Practice (along the way)
- Grades and summative scores (at or after the end)
  - promotion
  - certification and licensure
- Administrative
  - course evaluation
  - trainer accountability
19. Criterion-Referenced Test Items: Objectives and Matching Items
- Objective: Given a map of the USA with state borders marked, the learner will be able to (lwbat) write the abbreviation for 45 of 50 states in 15 minutes.
  Item: Here is a map of the USA with the states outlined, but no names. Use the state abbreviations and fill them in; you've got 15 minutes to get at least 45.
- Objective: Given a pair of well-worn shoes, the lwbat identify what's wrong with the shoes and the tools and materials necessary to fix them.
  Item: Take a look at this pair of shoes. What problems do you see? What will you need to fix them?
- Objective: Given a goal, the lwbat write at least two appropriate objectives with proper ABCD parts.
  Item: The goal of the instruction is "IDs will know how to write resumes." Write at least 2 objectives with all four parts.
20. Matching Test Items to Objectives
- Matching ensures validity.
- Validity is the extent to which the test measures what is important to performance. Does a high score on the test equate to high performance on the job?
- The validity of a criterion-referenced test is enhanced when
  - objectives match real-world performances (based on solid analysis)
  - test items match stated objectives (including the condition).
21. Match, or Not?
- Objective: Given any stocked fruit or vegetable, the Ralphs Grocery checker will be able to verbally state the code that matches the produce provided, with 100% accuracy.
- Item: Here is a persimmon from the produce department and the produce code job aid. Please state the produce code for this item. You may examine the persimmon and reference the job aid.
22. Match, or Not?
- Objective: Given a tree in need of pruning, the gardener's apprentice will be able to select the correct tree-pruning device, based upon the type of tree presented.
- Item: Here is an overgrown elm tree. Please select the appropriate tool with which you will prune the tree.
23. Match, or Not?
- Objective: Given a descriptive order for a Café Mocha, including size, caf/decaf, and type of milk, the barista will be able to create the drink as specified in the Starbucks Guide to Coffee Creations.
- Item: A customer has just ordered a Grande non-fat mocha. Please list the ingredients you will need, and describe the steps you would take to create the drink.
24. Evaluating a Training Program
- Consider:
  - Your evaluation uses a criterion-based test to see if the new account representatives can describe the different types of accounts offered by the bank.
  - All representatives were able to meet the specified criteria.
  - Case closed? Or do you want to know more?
25. Ideas in Testing
- Measurement Error
- Validity
- Reliability
26. Measurement Error
- Many causes (a small sketch of how error shows up in scores follows this list):
  - mechanical or scoring errors
  - poor wording (confusing, ambiguous)
  - poor subject matter or content (validity)
  - score variation from one time to another (reliability)
  - score variation from "equivalent" tests
  - test administration procedure
  - inter-rater reliability
  - mood of the student
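One way to picture these sources of error (an assumed illustration, using the classical "observed score = true score + error" framing rather than anything stated on the slide):

```python
import random

random.seed(1)
true_score = 80  # the learner's (unknowable) true score
# Six hypothetical retests, each nudged by random error with SD of 5 points.
observed = [true_score + random.gauss(0, 5) for _ in range(6)]
print([round(s, 1) for s in observed])
# The spread among these observed scores reflects measurement error, not the learner.
```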
27. Validity
- Does the test assess what's important? Does it really seek out the skill and knowledge linked to the world? (content validity)
- Types
  - Content validity (most important to us)
  - Predictive validity (e.g., SAT, GRE)
28. Reliability
- Are the scores produced by the test trustworthy and stable over time?
- Assessed by (see the sketch below)
  - parallel (equivalent) forms or test-retest
  - internal consistency
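A minimal sketch of the test-retest idea, assuming hypothetical scores from two administrations of the same test; a Pearson correlation near 1.0 suggests stable, trustworthy scores.

```python
import statistics  # statistics.correlation requires Python 3.10+

first_attempt  = [70, 82, 65, 90, 74, 88, 61]  # hypothetical scores, time 1
second_attempt = [72, 80, 67, 91, 70, 85, 63]  # same learners, time 2

r = statistics.correlation(first_attempt, second_attempt)  # Pearson's r
print(f"test-retest reliability ~ {r:.2f}")
```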
29. Testing and Evaluation: A Look Ahead
- ED 690: Procedures of Investigation
  - Provides an introduction to evaluation procedures and methods
  - Introduces the research process and statistical analysis
- ED 791A, 791B, 791C
  - Evaluation sequence most often completed by EDTEC students, rather than writing a thesis
  - Conduct a full-scale evaluation (design, research, report) for a living, breathing client over a two-semester timeframe