1
Standard Setting Methods with High Stakes
Assessments
  • Barbara S. Plake
  • Buros Center for Testing
  • University of Nebraska

2
Setting Passing Scores
  • Essential for making high stakes decisions
  • Must ensure that qualified candidates pass
  • Must ensure that unqualified candidates fail
  • 70% correct is NOT the right answer!
  • Standard Setting -- setting the standard or
    passing score

3
Approaches
  • Empirically based
  • Regression
  • Contrasting groups/Borderline groups
  • Norm-based
  • Test Based
  • Judgmental
  • Test and candidate based

4
Empirically based methods
  • Need to know status of candidate (worthy of
    passing or not)
  • More likely in classroom settings
  • Not likely the case in licensure settings
  • Norm-based
  • Not tied to the KSAs (knowledge, skills, and
    abilities) needed to function effectively/safely
    in the profession
  • Capricious and arbitrary

5
Test Based
  • KSAs form basis for test content
  • Focus on target candidate
  • MCC (minimally qualified candidate)
  • JQC (just qualified candidate)

6
Assessment Tasks
  • Multiple choice questions
  • Good content coverage
  • Efficient scoring
  • Can measure higher order reasoning if well
    constructed

7
Constructed Response
  • More directly related to target skill?
  • Some differences by candidate
  • Time consuming to administer and score
  • Increases costs

8
Judgmental task
  • How will the minimally qualified candidate (MCC)
    perform on the tasks in the test?
  • Need qualified, well-trained judges
  • Often subject matter experts (SMEs)
  • Need to modify SMEs' perceptions to focus on
    entry-level performance
  • Feedback

9
Decision Rules
  • Compensatory
  • Performance on total is what matters
  • Weaknesses in one area can be compensated by
    strengths in another
  • Higher reliability

10
Decision Rules
  • Conjunctive
  • Passing scores set on parts of the test
  • Candidates must pass all parts in order to pass
    the test
  • Sometimes candidates are allowed to bank passed
    parts (the two rules are contrasted in the
    sketch below)
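To make the contrast concrete, here is a minimal sketch in Python (a language chosen only for illustration); the part names, candidate scores, and cut scores are all invented:

```python
# Invented part scores for one candidate.
part_scores = {"pharmacology": 34, "patient_care": 28, "law_ethics": 19}

# Compensatory rule: only the total matters, so strength in one part
# can offset weakness in another.
TOTAL_CUT = 80
passes_compensatory = sum(part_scores.values()) >= TOTAL_CUT

# Conjunctive rule: a separate cut score is set for each part, and the
# candidate must reach every one of them.
PART_CUTS = {"pharmacology": 30, "patient_care": 30, "law_ethics": 15}
passes_conjunctive = all(part_scores[p] >= cut for p, cut in PART_CUTS.items())

print(passes_compensatory, passes_conjunctive)
# True False: the total clears the compensatory cut, but the candidate
# misses the patient_care cut under the conjunctive rule.
```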

11
Test Based Methods
  • Multiple choice questions
  • Angoff Method
  • Yes/No Extension
  • Bookmark

12
Test Based Methods
  • Constructed Response
  • Analytical Judgment
  • Paper selection

13
Angoff Method
  • SMEs estimate the probability that a
    hypothetical, randomly selected MCC will be able
    to answer each question correctly.
  • Sum of an SME's estimates = that SME's passing
    score
  • Average across SMEs = recommended passing score
  • Range of probable values (SEE); see the sketch
    below
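A minimal sketch of the Angoff arithmetic with invented ratings. The range of probable values is computed here as the cutpoint plus or minus a standard error taken as the standard deviation of the SMEs' passing scores divided by the square root of the number of SMEs; that is one common convention, not necessarily the SEE formula used in any particular workshop.

```python
from math import sqrt
from statistics import mean, stdev

# rows = SMEs, columns = items; each entry is the estimated probability
# that a minimally qualified candidate answers the item correctly
ratings = [
    [0.60, 0.75, 0.40, 0.90, 0.55],   # SME 1
    [0.65, 0.70, 0.50, 0.85, 0.60],   # SME 2
    [0.55, 0.80, 0.45, 0.95, 0.50],   # SME 3
]

sme_cuts = [sum(r) for r in ratings]           # each SME's passing score
cutpoint = mean(sme_cuts)                      # recommended passing score
see = stdev(sme_cuts) / sqrt(len(sme_cuts))    # assumed standard-error convention

print(f"SME passing scores: {sme_cuts}")
print(f"Cutpoint: {cutpoint:.2f}, range (1 SEE): "
      f"{cutpoint - see:.2f} to {cutpoint + see:.2f}")
```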

14
Angoff variations
  • Multiple rounds of ratings
  • Feedback in between
  • SME results
  • Candidate performance
  • P-values
  • % passing

15
Criticisms of Angoff Methods
  • Cognitively challenging
  • Impossible task
  • "Fatally flawed" (NRC report)
  • Research has shown that ratings are consistent
    across years and raters
  • Need strong training/discussion of KSAs of MCCs

16
Yes/No Variation
  • SMEs estimate whether or not the MCC will be able
    to answer the item correctly (Y/N)
  • Response probability
  • More likely than not (.50)
  • Fairly certain (.67)
  • Add the Y's to get the SME's passing score
  • Average across SMEs = recommended passing score
  • Cutpoint ± SEE (1 or 2); see the sketch below
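The Yes/No tally works the same way, counting Y's instead of summing probabilities; again the ratings are invented and the standard-error convention is the same assumption as in the Angoff sketch.

```python
from math import sqrt
from statistics import mean, stdev

# rows = SMEs, columns = items; True means the SME judged that the MCC
# would answer the item correctly at the agreed response probability
yes_no = [
    [True, True, False, True, False, True],   # SME 1
    [True, False, False, True, True, True],   # SME 2
    [True, True, True, True, False, True],    # SME 3
]

sme_cuts = [sum(row) for row in yes_no]        # count of Y's per SME
cutpoint = mean(sme_cuts)
see = stdev(sme_cuts) / sqrt(len(sme_cuts))
print(sme_cuts, round(cutpoint, 2), round(see, 2))   # [4, 4, 5] 4.33 0.33
```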

17
Yes/No Variation
  • More popular with SMEs
  • Feedback not necessarily needed
  • Quicker to implement

18
Bookmark Method
  • Often used with IRT calibrated items but not
    necessary
  • Test questions ordered from easy to hard
  • Response probability
  • Insert bookmark between pages where the MCC's
    probability of a correct response dips below the
    response probability

19
Bookmark Method
  • Number of items preceding the bookmark is the
    SME's passing score
  • Often little discussion on KSAs of MCC
  • Multiple small groups
  • Discussion between rounds
  • Multiple rounds; data usually isn't shared until
    the 2nd or 3rd round

20
Bookmark Method
  • Results often shown graphically across rounds
  • Frequently convergence occurs after 1st round
  • Average across SMEs = recommended cutpoint
  • SEE formula: cutpoint ± SEE (1 or 2); see the
    sketch below
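A minimal sketch of how a bookmark placement becomes a cut score. The MCC probabilities are invented and already sorted from easiest to hardest item, as in an ordered item booklet, and the response probability of .67 matches the "fairly certain" level mentioned earlier; the SEE convention is the same assumption as in the Angoff sketch.

```python
from math import sqrt
from statistics import mean, stdev

RP = 0.67  # agreed response probability

# one list per SME: judged probability that the MCC answers each item
# (in easy-to-hard booklet order) correctly
mcc_probs = [
    [0.95, 0.90, 0.82, 0.71, 0.64, 0.55, 0.40],   # SME 1
    [0.93, 0.88, 0.80, 0.66, 0.60, 0.52, 0.45],   # SME 2
    [0.96, 0.91, 0.78, 0.70, 0.68, 0.58, 0.42],   # SME 3
]

def bookmark_cut(probs, rp=RP):
    """Count the items before the first one whose probability dips below rp."""
    for i, p in enumerate(probs):
        if p < rp:
            return i          # bookmark is placed in front of this item
    return len(probs)         # bookmark falls after the last page

sme_cuts = [bookmark_cut(p) for p in mcc_probs]    # [4, 3, 5]
cutpoint = mean(sme_cuts)                          # recommended cutpoint
see = stdev(sme_cuts) / sqrt(len(sme_cuts))        # assumed SEE convention
print(sme_cuts, cutpoint, round(see, 2))
```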

21
Constructed response tasks
  • Extended Angoff
  • Analytical Judgment

22
Extended Angoff
  • SMEs estimate how many of the total points
    available for the task will be earned by the MCC.
  • Cutpoint is determined in a similar fashion to
    Angoff: sum points for each SME, average across
    SMEs
  • Range of probable values (see the sketch below)
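The Extended Angoff arithmetic is the same pattern with points instead of probabilities; a minimal sketch with invented estimates:

```python
from statistics import mean

# rows = SMEs, columns = constructed-response tasks; each entry is the
# number of points (out of the task maximum) the SME expects the MCC to earn
point_estimates = [
    [3.0, 5.5, 2.0],   # SME 1
    [2.5, 6.0, 2.5],   # SME 2
    [3.5, 5.0, 2.0],   # SME 3
]

sme_cuts = [sum(row) for row in point_estimates]   # [10.5, 11.0, 10.5]
cutpoint = mean(sme_cuts)                          # about 10.67
print(sme_cuts, round(cutpoint, 2))
```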

23
Analytical Judgment
  • SMEs see prescored candidate responses (but
    scores aren't revealed)
  • Task is to sort candidate responses into
    performance categories
  • Clearly passing
  • Passing
  • Not Passing

24
Analytical Judgment
  • Clearly passing set aside
  • Candidate responses in the Passing and Not
    Passing categories are ordered from lowest
    performance to highest.
  • Top responses in the Not Passing category are
    identified (usually 3)
  • Lowest responses in the Passing category are
    identified (usually 3)

25
Analytical Judgment
  • Average of the scores of these 6 papers is the
    SME's passing score
  • Feedback provided on SME passing scores
  • Round 2
  • Cutpoint is the average across SMEs' passing
    scores
  • Range of probable values (see the sketch below)
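A minimal sketch of turning one SME's sort into a passing score. The paper IDs, hidden scores, and category assignments are invented; in the actual procedure only the facilitator works with the hidden scores.

```python
from statistics import mean

# (hidden score, category the SME assigned) for each prescored response
papers = [
    (18, "Clearly passing"), (15, "Passing"), (14, "Passing"),
    (13, "Passing"), (12, "Passing"), (11, "Not passing"),
    (10, "Not passing"), (9, "Not passing"), (6, "Not passing"),
]

passing = sorted(score for score, cat in papers if cat == "Passing")
not_passing = sorted(score for score, cat in papers if cat == "Not passing")

# 3 lowest Passing responses plus 3 highest Not Passing responses
boundary = passing[:3] + not_passing[-3:]
sme_cut = mean(boundary)
print(boundary, sme_cut)   # [12, 13, 14, 9, 10, 11] 11.5

# The recommended cutpoint is then the average of these per-SME passing
# scores across the panel.
```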

26
Paper Selection
  • Exemplar candidate work is selected for each
    score point (typically 2/score point)
  • SMEs' task is to pick the two papers that best
    represent the work of the MCC
  • Scores are not revealed to SMEs
  • Average of an SME's selected papers = that SME's
    passing score
  • Average across SMEs = cutpoint
  • Range of probable values (see the sketch below)
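A minimal sketch of the Paper Selection arithmetic; the exemplar papers, hidden scores, and SME selections are invented.

```python
from statistics import mean

# hidden scores of the exemplar papers (never shown to the SMEs)
hidden_scores = {"A": 8, "B": 10, "C": 12, "D": 14}

# the two papers each SME picked as best representing MCC work
selections = {"SME 1": ("B", "C"), "SME 2": ("C", "D"), "SME 3": ("B", "D")}

sme_cuts = [mean(hidden_scores[p] for p in picks)
            for picks in selections.values()]       # [11, 13, 12]
cutpoint = mean(sme_cuts)                            # 12
print(sme_cuts, cutpoint)
```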

27
Who Makes the Final Decision?
  • Each approach yields a cutpoint and a range of
    probable values
  • This information should be communicated to the
    policy makers for their final decision.
  • Standard setting methods only yield a range of
    consistent, defensible cutpoints
  • Final decision is a policy matter!

28
Providing Validity Evidence
  • What evidence is useful in supporting the results
    of the standard setting process?
  • This evidence should be gathered to have
    available in case of a legal challenge.
  • Responsibility of test developer to provide at
    least procedural validity evidence.
  • Collateral evidence could be part of a long-term
    validity research program

29
Procedural Evidence
  • SMEs
  • Representative of profession
  • Qualifications
  • Confidentiality
  • Conflict of interest statements
  • Cannot teach preparation classes or sit for
    examination

30
Training
  • Did SMEs understand method?
  • Was sufficient time allotted to training?
  • Did the SMEs have a clear conceptualization of
    the MCC?
  • Did they understand the purpose of the standard
    setting procedure?
  • Do they understand that the final decision will
    be based on their work, but not dictated by it?

31
Practice
  • Was enough time devoted to practice?
  • Were the practice materials sufficiently similar
    to the operational materials?
  • Did the SMEs feel they had a reasonable
    opportunity to ask questions and receive
    clarifications?
  • Did they understand the feedback information?

32
Operational
  • Was enough time devoted to their work (across
    rounds)?
  • How confident did the SMEs feel about their
    ratings (across rounds)?
  • How useful/influential was the feedback?
  • Did the facilities support their work?

33
Overall
  • Confidence that the method used will result in an
    appropriate minimum passing score?
  • Was the workshop handled in a professional
    manner?
  • Was the workshop well organized?
  • Opportunity for comments

34
Main Point
  • Many methods, all aimed at providing a structured
    and reasoned approach to identifying
  • Cutpoint
  • Range of probable values
  • Procedural validity evidence

35
Match of Method to Assessment
  • Method selected should be appropriate for the
    assessment (MCQ, constructed response).
  • Logistically feasible
  • Published in peer-reviewed journals?
  • Should be replicable
  • Multiple methods? Multiple panels?

36
Purpose of Presentation
  • Provide an orientation to current standard
    setting methods
  • Provide background on the needed processes and
    procedures to conduct a professional (and legally
    defensible) standard setting workshop.

37
Thank you
  • I am honored to be asked to share my expertise in
    this area
  • I hope the presentation has been useful and
    meaningful
  • Best outcome for me is if it raised your
    awareness of methods and issues in standard
    setting.