1
Direct Assessments of Writing Proficiency
  • Psychometric Characteristics, Warranted
    Inferences, Importance
  • Steve Ferrara
  • American Institutes for Research
  • June 25, 2006

2
Overview for this talk
  • Another view (not a discussant), 12 years of
    direct experience
  • Defining inferences
  • Evidence of technical adequacy
  • Importance as influence on instructional
    approaches
  • My talk
  • Maryland Writing Test (MWT) and others as
    illustrations
  • What can be achieved, not an evaluation of the
    state of the art in the US
  • Art/invention, constraints, judgment,
    psychometrics

3
Use and design considerations
  • Interpretation and use
  • Individual proficiency, group proficiency and
    accountability
  • Design
  • One mode or more, one prompt or more
  • Administration conditions (e.g., time
    constraints, paper or computer)
  • Expected length of response, response complexity
  • Prompt scaffolding and think-abouts
  • Scoring rubrics, rigor of training and monitoring
  • Multiple choice component

4
Inferences
  • Inferences about proficiency (of individual
    students)
  • Generalized
  • Specified writing skills
  • The composing process
  • Other inferences
  • Performance on this test
  • Writing skills
  • Writing ability
  • Subtly important distinctions

5
Inferences (cont.)
  • Inferences about student proficiency in writing
  • With tasks sampled from a domain
  • In one or more modes
  • In writing in general
  • In the composing process
  • Inferences about effectiveness of writing
    instruction
  • Individual students and groups of students
  • Effectiveness of writing instruction, writing
    program
  • Accountability for use of resources to teach
    writing

6
Four considerations to evaluate technical adequacy
  • Rater agreement and rater accuracy
  • Score reliability and generalizability
  • Test form equivalence
  • Consequences for writing instruction and
    development of student proficiency
  • To what degree does a writing assessment support
  • Inferences about what students (know and) can do
    in writing
  • Appropriate and effective writing instruction

7
Rater agreement
8
Rater agreement (cont.)
  • Review of technical reports for 11 state writing
    assessments
  • Exact agreement rates for writing assessments
    tend to be in the 60-70% range (e.g., Illinois
    and other states; Ferrara & DeMauro, in press)

9
Validity papers
  • Four packets of papers
  • ≥ 90% in Scoring Committee
  • Clear and borderline papers
  • Narrative
  • 86.5-94.4% of scorers achieved exact agreement
    with assigned scores
  • Packet 1 to packet 4 improvement
  • Explanatory
  • 74.4-89.0% of scorers achieved exact agreement
    with assigned scores
  • Up and down

10
Rating quality
  • Rater agreement
  • Exact and adjacent
  • Resolution of 2-point disagreements
  • Rater accuracy, quality control
  • Read-behinds
  • Validity papers
  • Recalibration
  • Retraining and dismissal

11
Score reliability
12
Decision consistency (two test forms)
13
Score generalizability
  • Synthesis of six score generalizability studies
    (Ferrara, 1993)
  • Largest variance components include the task
    facet
  • Single biggest gain in precision comes from
    adding a second rater
  • G and D coefficients abysmal to acceptable
    (.65-.97, .16-.87)
  • Depends on prompts, testing conditions,
    examinees, scoring, etc.
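For a persons x tasks x raters design, the G (relative) and phi (absolute) coefficients follow directly from the variance components. The components below are hypothetical, chosen so that the task facet is large, as the slide notes; they are not taken from the cited studies:

```python
def g_and_phi(var, n_tasks, n_raters):
    """G (relative) and phi (absolute) coefficients for a p x t x r design."""
    rel_err = (var["pt"] / n_tasks
               + var["pr"] / n_raters
               + var["ptr,e"] / (n_tasks * n_raters))
    abs_err = rel_err + (var["t"] / n_tasks
                         + var["r"] / n_raters
                         + var["tr"] / (n_tasks * n_raters))
    return var["p"] / (var["p"] + rel_err), var["p"] / (var["p"] + abs_err)

# Hypothetical variance components; the task facets (t, pt) dominate.
vc = {"p": 0.30, "t": 0.25, "r": 0.02,
      "pt": 0.20, "pr": 0.03, "tr": 0.01, "ptr,e": 0.15}

for nt, nr in [(1, 1), (1, 2), (2, 2)]:
    g, phi = g_and_phi(vc, nt, nr)
    print(f"tasks={nt} raters={nr}  G={g:.2f}  phi={phi:.2f}")
```

Adding tasks or raters shrinks the corresponding error terms, which is why the coefficients vary so widely with design choices.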

14
Score generalizability (cont.)
15
Score equivalence across prompts/test forms
  • Field testing
  • Test forms equating
  • Scaling studies

16
Prompt field testing
  • Data collection
  • One month prior to operational administration
  • Anchor prompts and field test prompts
  • 250 responses per prompt
  • Stratified sampling of districts and schools to
    match achievement and racial/ethnic composition
  • Random assignment of prompt pairs within
    classrooms
  • Selection criteria
  • Scorability agreement and rater comments
  • Match of means, SDs, and frequency
    distributions to anchor prompts
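The match criterion in the list above can be sketched as a simple screening check. The tolerance values and score data here are illustrative assumptions, not the program's actual criteria:

```python
# Sketch: flag a field-test prompt whose score mean and SD track
# its anchor prompt's. Tolerances and data are hypothetical.
from statistics import mean, stdev

def matches_anchor(candidate, anchor, mean_tol=0.25, sd_tol=0.25):
    """True if the candidate prompt's mean and SD match the anchor's."""
    return (abs(mean(candidate) - mean(anchor)) <= mean_tol
            and abs(stdev(candidate) - stdev(anchor)) <= sd_tol)

anchor    = [3, 4, 4, 5, 3, 4, 2, 5, 4, 3, 4, 5]
candidate = [4, 3, 4, 5, 3, 4, 3, 5, 4, 3, 4, 4]

print("means:", round(mean(anchor), 2), round(mean(candidate), 2))
print("match:", matches_anchor(candidate, anchor))
```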

17
Test forms equating
  • Prompt pre-equating
  • Data collection in random groups design
  • Equipercentile equating of prompt pairs with
    anchor prompt pairs
  • Estimation of interpolated raw score that is
    equivalent to anchor pair raw score
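The equating steps above can be sketched as inverting the anchor pair's percentile-rank function by interpolation. The score range, sample sizes, and distributions below are hypothetical stand-ins for the random-groups data:

```python
# Sketch of equipercentile equating: for each raw score on a
# field-test prompt pair, find the (interpolated) anchor-pair raw
# score with the same percentile rank. Data are hypothetical.
import numpy as np

def percentile_ranks(scores, max_score):
    """Percentile rank of each integer score point (midpoint convention)."""
    scores = np.asarray(scores)
    n = len(scores)
    return np.array([(np.sum(scores < x) + 0.5 * np.sum(scores == x)) / n
                     for x in range(max_score + 1)])

def equipercentile(x_scores, y_scores, max_score):
    """Map each raw score on form X to an interpolated form-Y score."""
    pr_x = percentile_ranks(x_scores, max_score)
    pr_y = percentile_ranks(y_scores, max_score)
    # invert Y's percentile-rank function by linear interpolation
    return np.interp(pr_x, pr_y, np.arange(max_score + 1))

rng = np.random.default_rng(0)
x = rng.binomial(12, 0.55, 250)   # field-test prompt pair (raw 0-12)
y = rng.binomial(12, 0.60, 250)   # anchor prompt pair, slightly easier

table = equipercentile(x, y, 12)
for score in (4, 6, 8):
    print(f"X raw {score} -> Y equivalent {table[score]:.2f}")
```

Because the conversion is interpolated, equated scores need not be integers, matching the slide's "interpolated raw score."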

18
Test form equivalence
19
Prompt scaling studies
  • Summaries of eight scaling and linking studies
    (Ferrara, 1993)
  • Scale discreteness and spikiness
  • Sometimes misfit with one-parameter models (two
    studies), sometimes not (one study)
  • In short tests, unbiased estimates for
    individuals, biased estimates for percents above
    cuts (PACs) due to over-estimated score
    variability

20
Consequences
  • Appropriate and effective instruction
  • Development of student proficiency in writing

21
Warranted inferences and considerations
  • Clear that writing assessments do not achieve
    levels of psychometric rigor of longer tests
  • By design, because of time and dollar constraints
  • Art and judgment mixed with psychometrics (more
    than usual)
  • Should not make high stakes decisions solely on
    writing assessment performances
  • As always
  • However
  • Can we make reasonable inferences about student
    proficiency in writing?
  • Would we be better off without assessments of
    writing, administered under motivated conditions?

22
References
  • Ferrara, S. (1993, April). Generalizability
    theory and scaling: Their roles in writing
    assessment and implications for performance
    assessments in other content areas. Paper
    presented at the annual meeting of the National
    Council on Measurement in Education, Atlanta.
  • Ferrara, S., & DeMauro, G. E. (in press).
    Standardized assessment of individual
    achievement in K-12. In R. L. Brennan (Ed.),
    Educational measurement (4th ed.). Westport, CT:
    American Council on Education/Praeger.