Title: Developing Alternate Assessment Technical Adequacy
1Developing Alternate Assessment Technical Adequacy
- ASES DAATA Project
- Large-Scale Assessment Conference
- June 20, 2005
2Enhanced Assessment Instrument Grant
- Title VI, Part A, Subpart 1,
- Section 6112
3-
- West Virginia is the fiscal agent.
4Preview
- Overview and purposes of Project DAATA - Beth Cipoletti
- Role of ASES/CCSSO - Sandra Warren
- Content Validity Paper/Generalizability Study/Website - Jerry Tindal
- Handbook Prospectus - Pat Almond
- State Perspective - Dan Farley
- Reflections/Questions and Answers - Jan Barth
5- DAATA Project Overview
- Beth Cipoletti
- DAATA Project Director
- Office of Student Assessment Services, WVDE
6Overview of Project DAATA
- Develop systems for states to engage in self-studies in order to determine the degree of technical merit in their assessment instruments
- Assemble exemplary measures, forms, and protocols to help states develop or adopt the requisite instrumentation to conduct a self-study on their alternate assessments
- Propose a reporting system to communicate results and to use them to inform decisions
7Purposes of Project DAATA
- Support state efforts to prepare for and respond to the NCLB reviews
- Develop a process for states to ensure the technical adequacy of their alternate assessments so that measurement has implications for instruction
- Address three types of alternate assessments (performance events, portfolios, and observations) to assure products are applicable to all states
- Disseminate products via NASDSE, NCEO, RRCs, and CCSSO
8Work Scope
9DAATA Project Topics for Review
- Content validity
- The development of measures that are aligned with learning and lead to valid inferences within specific domains of performance
- Generalizability
- The partitioning of variance into sources, or facets, that help explain performance; facets can include the type of task used to measure students, the raters who judge them, or the multiple occasions on which students are tested (a variance decomposition for one such design is sketched below)
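As an illustration of this partitioning (a minimal sketch, assuming a fully crossed person x task x rater design rather than the specific design any participating state uses), observed-score variance decomposes as:

```latex
% Variance components for a fully crossed p (person) x t (task) x r (rater) design
\sigma^2(X_{ptr}) = \sigma^2_{p} + \sigma^2_{t} + \sigma^2_{r}
                  + \sigma^2_{pt} + \sigma^2_{pr} + \sigma^2_{tr}
                  + \sigma^2_{ptr,e}
```

The person component is the signal of interest; the remaining components quantify how much tasks, raters, and their interactions with persons contribute to measurement error.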
10Topics (cont.)
- Reliability
- The consistency of measures over multiple administrations and scorings (over time, occasion, form, person, or item), which is a necessary prerequisite to establishing the validity of interpretations
- Criterion and predictive validity
- The relationship among measures of performance that form a coherent and interpretable construct and help define the meaning of a measure in both convergent and divergent ways
11Topics (cont.)
- Consequential validity
- The impact of measures within the context of practice, addressing individual and social outcomes from the use of assessment systems to make decisions about students, teachers, administrators, institutions, and programs
12Accomplishments to Date
13Content Validity Study
- Active Participating States (APS) included Arkansas, Maryland, Michigan, New Mexico, Washington, and West Virginia
- The APS shared the following alternate assessment materials: administration manual, scoring manual, technical reports, and samples of alternate assessments
14Content Validity (cont.)
- The actual study of content-related validity evidence supporting alternate assessments was conducted in April and May. Approximately 65 teachers in 7 states collected classroom artifacts (instructional program plans, student work samples, perception surveys) and submitted alternate assessments. These materials are being analyzed for alignment and for consistency in opportunity to learn.
15Case Studies
- Researchers at the University of Oregon's Behavioral Research and Teaching (BRT) analyzed the data collected from the APS to assemble case studies documenting the breadth and depth of different states' alternate assessments.
- Each APS verified the accuracy of the researchers' analysis during the January 2005 meeting in Orlando.
16Case Studies (cont.)
- Each case study includes the following
- Section I: Test Development, Administration, and Scoring
- Perspective/theory: background
- Overview: purpose
- Definitions/glossary: key words and terms
- Type of Assessment
- Domain-Sampling Plan: description of all possible tasks
- Test Specifications/Blueprint: description of target population, format of tasks, and content
17Case Studies (cont.)
- Administrative Procedures: directions to collect student work, purpose
- Items/Tasks (format and amount): setting, context
- Scoring: method to assign value to a student's response
- Score Metric: method to aggregate and implement decision rules to combine scores
18Case Studies (cont.)
- Section II: Study of Alignment with State Content Standards
- Application of alignment procedures (an illustrative computation follows this slide)
- Categorical Concurrence
- Depth of Knowledge
- Range of Knowledge
- Balance of Representation
- Standard Setting
- Analysis and interpretation of findings from the study, with recommendations for next steps
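The criteria above follow the general logic of Webb-style alignment reviews. The sketch below is purely illustrative: the item codings are invented and the thresholds shown (e.g., at least six items per standard for categorical concurrence, 50% of items at or above the objective's depth-of-knowledge level) are commonly cited rules of thumb, not necessarily the cut points used in the DAATA case studies.

```python
"""Illustrative computation of Webb-style alignment indices.

Hypothetical item codings and commonly cited thresholds; the procedures
and cut points used in the actual DAATA case studies may differ.
"""
from collections import defaultdict

# Each item is coded to a standard, an objective, and a depth-of-knowledge level.
items = [
    {"standard": "Reading 1", "objective": "R1.a", "dok": 2},
    {"standard": "Reading 1", "objective": "R1.b", "dok": 1},
    {"standard": "Reading 1", "objective": "R1.b", "dok": 3},
    {"standard": "Reading 1", "objective": "R1.c", "dok": 2},
    {"standard": "Reading 1", "objective": "R1.a", "dok": 2},
    {"standard": "Reading 1", "objective": "R1.c", "dok": 1},
    {"standard": "Reading 2", "objective": "R2.a", "dok": 2},
    {"standard": "Reading 2", "objective": "R2.a", "dok": 2},
]

# DOK level assigned to each objective by the content reviewers.
objective_dok = {"R1.a": 2, "R1.b": 2, "R1.c": 3, "R2.a": 2}

# All objectives under each standard (whether or not any item assesses them).
objectives_by_standard = {
    "Reading 1": ["R1.a", "R1.b", "R1.c", "R1.d"],
    "Reading 2": ["R2.a", "R2.b"],
}

items_by_standard = defaultdict(list)
for item in items:
    items_by_standard[item["standard"]].append(item)

for standard, its in items_by_standard.items():
    n = len(its)

    # Categorical concurrence: enough items (often >= 6) target the standard.
    categorical_concurrence = n >= 6

    # Depth-of-knowledge consistency: share of items at or above the DOK level
    # of the objective they measure (often judged against 50%).
    at_or_above = sum(1 for i in its if i["dok"] >= objective_dok[i["objective"]])
    dok_consistency = at_or_above / n

    # Range of knowledge: share of the standard's objectives hit by at least
    # one item (often judged against 50%).
    hit = {i["objective"] for i in its}
    range_of_knowledge = len(hit) / len(objectives_by_standard[standard])

    # Balance of representation: 1 - sum(|1/n_obj - n_k/n|) / 2 over the
    # objectives that were hit, where n_k is the item count for objective k.
    counts = defaultdict(int)
    for i in its:
        counts[i["objective"]] += 1
    n_obj = len(hit)
    balance = 1 - sum(abs(1 / n_obj - c / n) for c in counts.values()) / 2

    print(f"{standard}: items={n}, CC={categorical_concurrence}, "
          f"DOK={dok_consistency:.2f}, ROK={range_of_knowledge:.2f}, "
          f"BOR={balance:.2f}")
```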
19Content Validity Study (cont.)
- The content validity technical paper was revised to provide a more focused direction for states
- Three dimensions were highlighted
- Domain for sampling tasks and behaviors, including the specifications for representation
- Alignment between alternate assessments and state content standards
- Linkages with classroom opportunity to learn, and the overlap or underlap between the tasks on the alternate assessment and those practiced in the classroom
20Content Validity Study (cont.)
- The draft content validity technical paper has undergone an electronic review
- Reviewers
- Eileen Ahearn, NASDSE, ASES
- Sue Bechard, Measured Progress, ASES
- Dan Farley, New Mexico, ASES
- Aran Felix, Alaska, ASES
- Carolee Gunn, Utah, ASES
- Laurie Davis, Pearson, TILSA
- Gretchen Ridgeway, DODEA, TILSA
21Content Validity Study (cont.)
- Content validity technical paper Focus Group Reviewers
- Eileen Ahearn, NASDSE, ASES
- Sue Bechard, Measured Progress, ASES
- Carolee Gunn, Utah, ASES
22Generalizability Study
- The study involved 65 teachers and 75 students in seven states. The generalizability study will provide estimates of reliability associated with specific facets of our measurement process. A careful empirical study of dominant types of alternate assessments will provide comparative estimates of measurement validity.
23Generalizability Study
- Active Participating States: Alaska, Iowa, New Mexico, Oregon, Utah, Washington, and West Virginia
- Teachers were asked to collect the following data
- Copies of IEPs
- Completed Instructional Surveys
- Student classroom work
- Copies of Alternate Assessments
24Generalizability Study (cont.)
- Language Perception Assessment Survey
- Rates students on communication skills commonly used during daily living and attendance at school
- Four skill levels
- Traditional
- Beginning
- Emerging
- Pre-Emergent
- Modes of communication may include verbalizations, sign language, and/or augmentative and alternative communication systems
25Generalizability Study (cont.)
- Reading Performance Assessment
- Teachers administered the assessment to students in May
- Common set of tasks that reflect both receptive and expressive dimensions
- Letter and word recognition
- Comprehension
- Students will be administered both types of tasks
- Performance will be rated by trained judges on proficiency of reading skill
- Order of tasks and forms was randomly assigned
26Generalizability Study (cont.)
- Data obtained from the study will be used to
- Correctly estimate the performance error term
- Identify the sources of error variance associated with each of the study facets
27Reviewers for Generalizability Technical Paper
- Mary Roan, North Carolina
- Sharon Hall, Maryland
- Brian Touchette, Delaware
- Sue Bechard, Measured Progress
- Betsy Case, Harcourt
- Sheryl Lazarus, NCEO
28Reliability Study
- APS include Michigan, Connecticut, New Mexico, West Virginia, Maryland, Texas, and Delaware
- APS will participate in the following ways
- Supply samples of training and scoring materials
- Administer the alternate assessment in the fall and re-administer it at a later date in order to measure stability
- Submit extant data stripped of personally identifiable information
29Reliability Study (cont.)
- Results for each type of alternate assessment (portfolio, observation, and performance) will be analyzed for the following reliability characteristics
- Quality Administration
- Fidelity of Administration
- Inter-rater Reliability
- Internal Consistency
30Reliability Study (cont.)
- Quality Administration
- Analysis of teacher training materials and scoring training materials (where applicable)
- Fidelity of Administration
- Administration procedural conformity
- Inter-rater Reliability
- Scoring Agreement (an illustrative sketch follows this slide)
- Internal Consistency
- Item cohesiveness in measuring a single construct
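As a concrete illustration of the scoring-agreement piece, the short sketch below summarizes agreement between two hypothetical raters on a 0-4 rubric using exact agreement, adjacent agreement, and Cohen's kappa; the specific indices reported in the DAATA reliability study may differ.

```python
"""Illustrative inter-rater agreement summary for rubric scores.

Made-up ratings from two judges on a 0-4 rubric; the agreement statistics
reported in the DAATA reliability study may differ.
"""
from collections import Counter

rater_a = [3, 2, 4, 1, 3, 2, 0, 4, 3, 2, 1, 3]
rater_b = [3, 2, 3, 1, 4, 2, 0, 4, 3, 1, 1, 3]
n = len(rater_a)

# Exact agreement: both judges assign the same score.
exact = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Adjacent agreement: scores differ by no more than one rubric point.
adjacent = sum(abs(a - b) <= 1 for a, b in zip(rater_a, rater_b)) / n

# Cohen's kappa: agreement corrected for chance, based on the marginal
# score distributions of the two judges.
p_o = exact
count_a, count_b = Counter(rater_a), Counter(rater_b)
categories = set(rater_a) | set(rater_b)
p_e = sum((count_a[c] / n) * (count_b[c] / n) for c in categories)
kappa = (p_o - p_e) / (1 - p_e)

print(f"exact = {exact:.2f}, adjacent = {adjacent:.2f}, kappa = {kappa:.2f}")
```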
31- Role of ASES and CCSSO
- Sandra Warren
- CCSSO Consultant
32Organization Chart
33ASES DAATA Project Members
- ASES Study Groups
- DAATA Project Director, Beth Cipoletti, West Virginia
- Researcher, Gerald Tindal, University of Oregon/Behavioral Research and Teaching (BRT)
- Researcher, Pat Almond
- BRT Research Staff
- EdProgress
- Technical Advisory Committee
- ASES Coordinator (CCSSO), Sandra Warren
- CCSSO, Mary Yakimowski
34DAATA Management Team
- Role
- Oversee research and work of the project
- Members
- Beth Cipoletti, Project Director
- Gerald Tindal, Researcher
- Pat Almond, Researcher
- Sandra Warren, CCSSO Consultant
35ASES Member Involvement
- Roles
- Study group members
- Research
- Professional Development and Communication
- Policy to Practice
- Active Participating States (APS)
- Reviewers
36DAATA Technical Advisory Committee
- Members
- Diane Browder, University of North Carolina, Charlotte
- Tom Haladyna, Arizona State University West
- Naomi Zigmond, University of Pittsburgh
37Next Steps
38Project DAATA Schedule
39- Content Validity, Research, and Website
- Jerry Tindal
- BRT, University of Oregon
40Research Components of DAATA
41State and Student Case Studies
- Student-level cases provide rich descriptions of students within the context of state standards, instructional programs, and alternate assessments.
- State-level cases provide contextual information on development and implementation of an alternate assessment.
42State Case Study
- Assemblage of Content Evidence Descriptors
- Definition of (conceptual/theoretical) approach
- Definition of items
- Procedural evidence and documents
- Alignment of assessment with standards
43Evidence based on Content: Student Case Studies
- Instructional Program Form
- Instructional Survey (21 items)
- Collection of Work Samples
- Alternate Assessments
- Language Survey
44Example Student Descriptors
- A happy child who is very expressive and willing to try new activities
- Has an interest in computers, movies, and music, and lives in a rural area where he has a horse
- Spends almost half his day removed from the general education classroom
- Has a seizure disorder and shunts that need to be monitored at all times, as well as safety in his environment
- Requires a health plan and special transportation
45Example Student Educational Program
- He spent 60 minutes per day in a special education setting and was provided accommodations, specially designed instruction, supplementary aids and services, supports for school personnel, and support for related services.
- He was given small-group and individual instruction in reading activities that were broken down and repeated, and an associate to provide guidance and monitor his seizures and toileting.
- Mike is given picture cues to help him transition throughout the day.
46Example Student Skills
- Could recognize a few names of family members and his own name, but had to be prompted because he wanted to name just the first letter
- Could recognize 4 words at 80% accuracy (and had a goal to identify 28 words)
47Example Student IEP
- 1. In 36 weeks, in any given setting, __ will transition from activities and places in the building without exhibiting behaviors.
- 2. Student will demonstrate and state quantity, spatial relationships, and attributes at 80%.
- 3. Student will identify the ending sounds and sound out beginning reading words at 80%.
- 4. Student will answer comprehension questions and be able to retell a story in sequential order.
- 5. Student will demonstrate and verbalize math concepts: beginning addition, telling time, identifying coins and values.
- 6. Student will write upper case and lower case letters without a model and write reading words.
- 7. Student will follow directions with 80% compliance throughout the school day.
48District Standards
- Read words using suffixes, prefixes, and context clues.
- Reads, interprets, and responds to a variety of literary and informational texts, with the district age-appropriate grade-level benchmark requiring him to analyze story elements (e.g., characters and settings); finally, the benchmark or extended benchmark was for the student to identify story elements.
49Example Student Alt Assess
- Judgment of reading reflects no achievement in reading; breadth and depth age-appropriate and curriculum-based in difficulty
- Exhibits 81-100% independence in use of adaptations, demonstration of self-determination, and transfer or generalization to 4 or more settings
50Example Student Alt Assess
- A list of 6 words (fish, see, and, ball, car, and yellow)
- Six cards with both a phrase and a picture (a yellow car, a car, a yellow horse, a horse and a car, a horse, and a car and a horse)
- A demonstration sheet in reading showed that the student could attend to a literacy activity, read the summary of the first section, find 4/5 different words, identify the main idea or character, find 5 different words in the second section 5/5 times, tell what the story was about, and self-evaluate on finding words
- When given a passage and asked questions, the student had to be prompted to name two girls and boys; he was correct in stating his age, the color of his hair and shirt, and describing what he does in P.E.
51Example Student Work Samples
- Two pages with a sentence (A boy and I see the airplane) with words listed below in three columns (box, green, in, chicken, little, put) and nonsense words
- A sheet with the words and phrases written: the ball, I see a little car, a horse and a little horse, yellow fish, a boy in yellow, fish, see the airplane, a box, a boy and a horse, and a boy and a little boy
- A sheet with a picture of a girl handing a horse a carrot and a boy holding a chicken out of an open cage; two sentences appear below the picture: A put the chicken in the yellow box. I see the little girl and a horse.
- A sheet with a list of words: little green, airplane, see, chicken, yellow, the fish, put, a girl, car, ball, I, and box
- A sheet with beginning consonants; each set of 3 consonants had a picture above them. For example, p, m, n had a caricature of the moon; v, t, s had a saddle.
52Example Student Program Form
53Generalizability Study
- All students take all types (expressive, receptive), and all types have both forms (A, B). Six raters trained on state standards
- Tasks: symbol meaning, letter names, word reading, sentence reading, passage reading (including syntax), and passage comprehension
- Facets: Tasks, Forms (occasions), and Raters (see the simplified sketch below)
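As a simplified illustration of how a crossed design yields variance components and a generalizability coefficient, the sketch below works through a single-facet persons x raters analysis on made-up ratings; the actual DAATA analysis crosses additional facets (tasks and forms/occasions) and uses the project's own data.

```python
"""Minimal one-facet generalizability (persons x raters) sketch.

Made-up ratings; the DAATA study crosses additional facets (tasks,
forms/occasions) and uses its own estimation procedures.
"""
import numpy as np

# scores[p, r]: rating given to person p by rater r (fully crossed design)
scores = np.array([
    [3, 4, 3],
    [1, 2, 1],
    [4, 4, 5],
    [2, 3, 2],
    [0, 1, 1],
    [3, 3, 4],
], dtype=float)
n_p, n_r = scores.shape
grand = scores.mean()

# ANOVA sums of squares for persons, raters, and the residual
ss_p = n_r * np.sum((scores.mean(axis=1) - grand) ** 2)
ss_r = n_p * np.sum((scores.mean(axis=0) - grand) ** 2)
ss_total = np.sum((scores - grand) ** 2)
ss_pr = ss_total - ss_p - ss_r

ms_p = ss_p / (n_p - 1)
ms_r = ss_r / (n_r - 1)
ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

# Variance components from the expected mean squares
var_pr = ms_pr                   # person-by-rater interaction plus error
var_p = (ms_p - ms_pr) / n_r     # universe-score (person) variance
var_r = (ms_r - ms_pr) / n_p     # rater variance

# Generalizability (relative) and dependability (absolute) coefficients
# for a decision based on the mean over n_r raters
g_coef = var_p / (var_p + var_pr / n_r)
phi_coef = var_p / (var_p + (var_r + var_pr) / n_r)

print(f"var(person)={var_p:.3f}  var(rater)={var_r:.3f}  var(p x r,e)={var_pr:.3f}")
print(f"G coefficient={g_coef:.3f}  Phi={phi_coef:.3f}")
```

Adding task and form facets follows the same expected-mean-squares logic, with more components to separate.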
54State Standards
- Analyze words, recognize words, and learn to read grade-level text fluently across the subject areas
- Listen to, read, and understand a wide variety of informational and narrative text across the subject areas at school and on their own, applying comprehension strategies as needed
- Demonstrate word knowledge through systematic vocabulary development; determine the meaning of new words by applying knowledge of word origins, word relationships, and context clues
- Demonstrate general understanding of grade-level informational text across the subject areas
- Develop an interpretation of grade-level informational text across the subject areas
- Examine content and structure of grade-level informational text across the subject areas
55Reliability
- Standard 2.4
- Each method of quantifying the precision or consistency of scores should be described clearly and expressed in terms of statistics appropriate to the method. The sampling procedures used to select examinees for reliability analyses and descriptive statistics on these samples should be reported (p. 32).
- Standard 2.5
- A reliability coefficient or standard error of measurement based on one approach should not be interpreted as interchangeable with another derived by a different technique unless their implicit definitions of measurement error are equivalent (p. 32).
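One reason such coefficients are not interchangeable (Standard 2.5) is that each implies its own standard error of measurement. Under classical test theory the two are linked by the familiar relation below (a standard textbook formula, not one specific to DAATA):

```latex
% Standard error of measurement implied by a reliability coefficient \rho_{XX'}
SEM = \sigma_X \sqrt{1 - \rho_{XX'}}
```

The same scores paired with a test-retest, alternate-form, or internal consistency coefficient therefore yield different error bands, because each coefficient treats different sources of inconsistency as error.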
56Reliability Studies
- Inter-judge agreements
- Interview a sub-sample of teachers on the test administration in the field (Administration)
- Analyze results for each type of measure (portfolio, observation, and performance) for administration conditions (Administration)
- For each type of alternate assessment (portfolio, observation, and performance), independently obtain another sample (Test-Retest)
- Rescore alternate assessment protocols (Alternate Form)
57Reliability Studies
- What kind of reliability evidence supporting alternate assessments can be documented by states? (a) coefficients derived from parallel forms in independent testing sessions (alternate-form coefficients); (b) coefficients obtained by administration of the same instrument on separate occasions (test-retest or stability coefficients); and (c) coefficients based on the relationships among scores derived from individual items or subsets of the items within a test, all data accruing from a single administration (internal consistency coefficients) (Educational Standards, 1999, p. 27). The three coefficient types are illustrated in the sketch below.
- Agree to participate by taking a survey (ALL states in ASES)
- Agree to pony up a directory and sample of records
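To make the three coefficient types named in the excerpt concrete, the sketch below computes each from made-up data: a correlation between parallel forms, a correlation between a fall administration and a later re-administration, and coefficient alpha from item scores in a single administration. The data, sample sizes, and score scales are illustrative only.

```python
"""Illustrative computation of the three coefficient types named in the
Standards excerpt: alternate-form, test-retest, and internal consistency.
All data are made up.
"""
import numpy as np

rng = np.random.default_rng(0)

# Alternate-form: total scores of the same students on Forms A and B
form_a = np.array([12, 18, 9, 22, 15, 17, 11, 20], dtype=float)
form_b = np.array([13, 17, 10, 21, 14, 18, 12, 19], dtype=float)
alternate_form_r = np.corrcoef(form_a, form_b)[0, 1]

# Test-retest: fall administration and a later re-administration
fall = np.array([30, 25, 41, 18, 36, 29, 33, 22], dtype=float)
retest = np.array([32, 24, 40, 20, 35, 31, 30, 23], dtype=float)
test_retest_r = np.corrcoef(fall, retest)[0, 1]

# Internal consistency: Cronbach's alpha from an items-by-students matrix.
# Simulate correlated items as latent ability plus item-specific noise.
ability = rng.normal(0.0, 1.0, size=8)        # one latent score per student
noise = rng.normal(0.0, 0.7, size=(10, 8))    # item-specific error
items = ability + noise                        # items[i, j]: student j on item i

k = items.shape[0]
item_vars = items.var(axis=1, ddof=1)
total_var = items.sum(axis=0).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(f"alternate-form r = {alternate_form_r:.2f}")
print(f"test-retest r    = {test_retest_r:.2f}")
print(f"Cronbach's alpha = {alpha:.2f}")
```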
58Validity Studies
- Internal Structures
- Response Processes
- Nomological Networks
- Consequences
59WEBSITE
63Web Site Information
- http://www.DAATA.org
- Final DAATA Documents
- Links to Measurement on Alternate Assessment
- Assessing Special Education Students (ASES) Membership
- Minutes from Leadership and Management Teams (secure)
- Calendar of Upcoming Events
- State Technical Adequacy Documents
64- Handbook Prospectus
- Patricia Almond, PhD
- Associate in Research
- Behavioral Research and Teaching
65Handbook
66Handbook Prospectus
- Handbook for
- Developing Alternate Assessment Technical Adequacy (DAATA)
- Producing Documentation for States
- Alternate Assessments for
- Students with Significant Cognitive Disabilities
67Purpose for Handbook
- To assist states in documenting the technical adequacy of alternate assessments in order to
- respond to the technical adequacy requirements in federal legislation
- establish an ongoing continuous improvement cycle with evidence to monitor assessment quality
68Intended Audience
- Several specific groups were considered in developing the contents of this handbook
- State Education Offices or Divisions Responsible for Large-Scale Assessment
- Special Education Offices or Divisions
- Assessment Technical Advisory Committees
- Special Education Advisory Committees
- State Vendors
69Table of Contents
- Section I: Technical Documentation for Alternate Assessment
- Section II: What to do and how to proceed - Detailed Guidance for States
- Section III: Implications for Continuous Improvement and Informing Policy
70Section I: Technical Studies for Alternate Assessment - Chapters
- 1 Applying Testing Standards in a Fresh Context
- 2 The Challenges of NCLB and IDEA
- 3 Construct Validity - the Organizing Concept
- 4 Content Validity
- 5 Sources of Variance and Generalizability
- 6 Reliability - Rater Agreement and More
- 7 Criterion and Predictive Validity
- 8 Consequential Validity
71Section II: What to do and how to proceed - Chapters
- 9 Step-by-Step Self-Study Guides
- 10 Alignment to Standards Plus
- 11 Addressing Variance and Generalizability
- 12 Reliability - Rater Agreement, Internal Consistency, and Fidelity
- 13 Criterion and Predictive Validity - Interpreting the Results: Achievement and Growth
- 14 Consequential Validity - So What? Benefits for Students and Educators
- 15 Stories from the Street - State Examples
72Section III: Continuous Improvement and Informing Policy - Chapters
- 16 Making a Plan for Communication
- 17 Presenting Findings to Your Technical
Advisory Committee - 18 Continuous Improvement Cycle
- 19 Technical Report
73Process
- Each study group oversees one section
- A writer will produce components for review and feedback.
- Study group members provide input at regularly scheduled meetings or via email, conference call, or the website.
74Review cycle
- Develop Draft
- Submit to Study Group for Review
- Capture comments and recommendations
- Revise draft based on comments and recommendations
- Repeat Cycle
75DAATA Website will post
- Components as they are ready
- Related resources
- Updates to the Table of Contents
- Newsletters and progress updates
www.daata.org
76State Perspective
77Project DAATA: A Perspective from New Mexico
- Dan Farley
- Assessment Consultant
- Special Education Bureau
- June 20, 2005
78NM Alternate Assessments
- Original NM Alternate Assessment
- Language Arts (Reading and Writing)
- Mathematics
- Science
- Social Studies
- NM Alternate Assessment for Reading
- DIBELS?
- NM Alternate Assessment for Writing (NMAC)
79Benefits of Participation
- Provides NM with useful technical adequacy reports (did anyone say Peer Review?!?!)
- Allows us to build stronger connections with teachers and Department personnel (now I have some names)
- Professional development opportunity for everyone involved (mostly myself and probably not Jerry)
- Contextualizes technical vocabulary words that I've seen defined in a myriad of ways
- Because the final product of the grant will influence, if not define, technical adequacy requirements for alternate assessments, why not get involved on the front end? (did anyone say future Peer Reviews?!?)
80Content Related Evidence
- Gathered the following information to submit to DAATA
- Grade-level content standards (with Expanded Performance Standards)
- Performance descriptors
- Alternate achievement standards (cut scores)
- Standard setting report
- Test Administration manual
- Sample score report forms
- Decision rules (scoring metric)
- Online training course
- Alternate Assessment FAQs document
81Content Related Evidence
- What NMPED received in return
- Content related evidence report
- Assessment Development
- Elaborated what the researchers found regarding our NM Alternate Assessment perspective
- Overview
- Whoops, no NMPED glossary.
- Thank goodness for ASES, who just published one!
82Content Related Evidence
- What NM received in return (cont.)
- Content related evidence report
- Instrumentation
- Type of assessment
- Domain sampling plan
- Test specifications-blueprint
- Administration
- Items (format and amount)
- Scoring
- Metric
- Standard Setting
83Content Related Evidence
- What NM received in return (cont.)
- Content related evidence report
- Alignment with standards
- Categorical Concurrence
- Depth of knowledge
- Range of knowledge
- Balance of representation
- Reporting Level and student reports
- Reporting system
- Student report
- Interpretation guide
- Report process and protocol
84Moving forward
- New Mexico is also participating in the DAATA sources of variance study; see me if you have any state-related questions.
- Get involved in EAGs in your state; the learning curve is steep, but it needs to be traversed.
- If you're an ASES state, please assist DAATA with the Reliability study!!!
- Thank you!
85- Reflections/Q & A
- Jan Barth, Executive Director
- Office of Student Assessment Services, WVDE