Title: New England Common Assessment Program
1 New England Common Assessment Program
Science Test Item Review Committee
Meeting, August 14-15, 2007, Killington, VT
2 Welcome and Introductions
- New Hampshire
- Tim Kurtz
- Jan McLaughlin
- Brian Cochrane
- Stan Freeda
- Rhode Island
- Mary Ann Snider
- Heather Heineke Agnew
- Linda Jzyk
- Peter McLaren
- Vermont
- Michael Hock
- Gail Hall
- Pat Fitzsimmons
- Dave White
- Measured Progress
- Harold Stephens
- Elliot Scharff
- Amanda Smith
- Josh Evans
- Jim Manhart
- Tori Henkes
- Beneta Brown
- Susan Tierney
3 New England Common Assessment Program
Cabot School, Vermont, Web Project Artwork
4 NECAP: Where are we now?
- Grades 3-8 (Reading, Math, and Writing)
- Oct 2007: Third Administration
- Jan 2008: Release Results
- Grade 11 (Reading, Math, and Writing)
- Oct 2007: First Operational Administration
- Feb 2008: Release Results
- Grades 4, 8, and 11 (Science)
- May 2008: First Operational Administration
- Oct 2008: Release Results
5 Science Overview
- 2007-2008 Schedule
- Test Form Construction
- Bias/Sensitivity
- Depth-of-Knowledge
- Test Item Review: Role of Committees
- Universal Design for Assessment
6 NECAP 2007-2008 Schedule
- Item Review Committee meeting: August 14-15
- 36 teachers, 12 from each state
- Bias Committee meeting: August 14-16
- 18 teachers, 6 from each state
- Face-to-Face meetings: October/November
- Test Form Production: January/February
- DOE Reviews: late February / early March
- Printing: March
- Test Administration Workshops: April 2008
- Shipments to schools: April 25, 2008
- Test Administration Window: May 12-29, 2008
- 108,000 students from the 3 states
7 Overview of Test Design
- Collaborative effort among NH, RI, and VT
- Based on common content from all three states
- Used Big Ideas of Science and the domains of
science as organizing foundations
- Less about isolated facts and more about use and
application of information
8 Test Design: Who?
- Who?
- The NECAP includes all students educated at
public expense in grades 3-8 and 11 in NH, RI,
and VT.
- Through explicit planning during test
construction and the use of accommodations, the
tests will be accessible to as many students as
possible.
- The NECAP does not include each state's
alternate assessment and English language
proficiency assessment programs.
9 Test Design: What?
- What?
- The content, skills, and depth of knowledge
contained in the Assessment Targets of each
state's Grade Span Expectations (GSEs). The
Assessment Targets were developed jointly by the
three states expressly for this assessment
program.
- Physical Science, Life Science, and Earth Space
Science at the end of grades 4, 8, and 11.
- Each test will be designed to measure a range of
student achievement across four performance
levels.
10 Test Design: Why Spring Testing?
- Why spring testing?
- Critical transition points
- Grade 4 to 5, 8 to 9, and HS to beyond
- National Standards
- General agreement at transition points
- High School Schedule
- 4-by-4 block scheduling
- Science is not (yet?) part of AYP
11 Test Design: How?
- How?
- Operational Test
- Three Sessions
- Sessions 1 and 2: MC and CR items grouped
together in three domains: Life Science, Physical
Science, and Earth Space Science
- Session 3: Performance Task
12 Test Design: Performance Task
- Performance Task
- Session 3 will be a performance task
- Looking at inquiry and science process
- Focus on one assessment target within INQ code
- Scenario (story) driven
- Work in groups of two or three to begin the
session, then answer questions individually
- Focus will vary by grade
- Grade 4: Always hands-on; design an experiment
- Grade 8: Sometimes like Grade 4, sometimes like
Grade 11
- Grade 11: Students will be given data and asked
to draw conclusions
13 Test Design: Forms Construction
- Forms Construction: Common/Matrix Design
- Common Items
- A common set of items completed by all students
- All achievement level scores (student, school,
district, and state) are based solely on common
items
- Matrix-Sampled Items
- Unique sets of items distributed across forms
- Includes equating and field test items
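The common/matrix design above can be illustrated with a small sketch. This is not the NECAP form-assembly procedure; the item IDs, block sizes, and the names `Item`, `build_forms`, and `reported_score` are hypothetical and exist only to show the idea: every form carries the same common items plus its own matrix-sampled block, and reported scores use the common items alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str  # hypothetical identifier, for illustration only
    kind: str     # "common", "equating", or "field_test"


def build_forms(common, matrix_blocks):
    """Return one form per matrix block: the shared common items plus that block."""
    return [list(common) + list(block) for block in matrix_blocks]


def reported_score(form, points_by_item):
    """Sum points over common items only; matrix-sampled items are excluded."""
    return sum(
        points_by_item.get(item.item_id, 0)
        for item in form
        if item.kind == "common"
    )


# Assumed toy data: three common items and two matrix-sampled blocks.
common = [Item(f"C{i}", "common") for i in range(1, 4)]
matrix_blocks = [
    [Item("E1", "equating"), Item("F1", "field_test")],
    [Item("E2", "equating"), Item("F2", "field_test")],
]

forms = build_forms(common, matrix_blocks)

# One student's points on form 1; the E1 and F1 points do not affect the score.
points = {"C1": 1, "C2": 0, "C3": 1, "E1": 1, "F1": 1}
score = reported_score(forms[0], points)
```

Because every student answers the same common items, scores stay comparable across forms while the matrix-sampled blocks rotate equating and field test items.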
14 Bias/Sensitivity Review
- How do we ensure that this test works well for
students from diverse backgrounds?
15 What Is Item Bias?
- Bias is the presence of some characteristic of an
assessment item that results in the differential
performance of two individuals of the same
ability but from different student subgroups.
- Bias is not the same thing as stereotyping,
although we don't want either in NECAP.
- We need to ensure that ALL students have an equal
opportunity to demonstrate their knowledge and
skills.
16 Role of the Bias/Sensitivity Review Committee
The Bias/Sensitivity Review Committee DOES need
to make recommendations concerning:
- Sensitivity to different cultures, religions,
ethnic and socio-economic groups, and
disabilities
- Balance of gender roles
- Use of positive language, situations, and images
- In general, items and text that may elicit strong
emotions in specific groups of students and, as a
result, may prevent those groups of students from
accurately demonstrating their skills and
knowledge
17 Role of the Bias/Sensitivity Review Committee
The Bias/Sensitivity Review Committee will not
make recommendations concerning:
- Reading Level
- Grade-Level Appropriateness
- Assessment Target Alignment
- Instructional Relevance
- Language Structure and Complexity
- Accessibility
- Overall Item Design
18 Depth of Knowledge
- How do we ensure that the test contains a range
of complexity?
19 Depth of Knowledge
- Level 1: Recall and Reproduction
- Recall of a fact, information, or
procedure
- Level 2: Skills and Concepts
- Use information or conceptual
knowledge, two or more steps, etc.
- Level 3: Strategic Thinking
- Requires reasoning, developing a
plan or a sequence of steps, some
complexity, more than one possible
answer
- Level 4: Extended Thinking
- Requires an investigation, time to
think and process multiple conditions of
the problem
20 Test Item Review Committees
- This assessment has been designed to support a
quality program in science. It has been informed
by the input of hundreds of NH, RI, and VT
educators. Because we intend to release
assessment items each year, the development
process continues to depend on the experience,
professional judgment, and wisdom of classroom
teachers from our three states.
21 Role of the Test Item Review Committees
- Today you will be looking at test items in
science.
- The role of Measured Progress staff is to
facilitate the discussion and capture
recommendations that are clear and defensible for
test items.
- The role of DOE content specialists is to
listen, ask clarifying questions as necessary,
and explain background information.
- Your role is to advise the states by actively
offering opinions based on content knowledge and
grade-level expertise.
22 Role of Test Item Review Committees
- You will be asked to review all items against the
following criteria:
- Assessment Target Alignment
- Correctness
- Depth of Knowledge
- Language
- Universal Design
- Finally, you will recommend each item for field
testing, revision, or rejection.
- Each committee member will complete a form to
gather this information about each item.
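For committees that want to collect these forms electronically, the review record could be sketched roughly as follows. This is a hypothetical layout, not the form used at the meeting: the five criteria and three recommendations come from the slide above, while the names `ItemReview`, `Recommendation`, and `tally`, and the sample item and reviewer IDs, are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field
from enum import Enum

# The five review criteria listed on the slide.
CRITERIA = (
    "Assessment Target Alignment",
    "Correctness",
    "Depth of Knowledge",
    "Language",
    "Universal Design",
)


class Recommendation(Enum):
    FIELD_TEST = "field test"
    REVISE = "revise"
    REJECT = "reject"


@dataclass
class ItemReview:
    """One committee member's form for one item (hypothetical layout)."""
    item_id: str
    reviewer: str
    recommendation: Recommendation
    meets: dict = field(default_factory=dict)  # criterion -> True/False

    def unmet_criteria(self):
        """Criteria the reviewer did not mark as met, in slide order."""
        return [c for c in CRITERIA if not self.meets.get(c, False)]


def tally(reviews, item_id):
    """Count recommendations across all reviewers for a single item."""
    return Counter(r.recommendation for r in reviews if r.item_id == item_id)


# Assumed sample data: two reviewers looking at the same item.
all_met = {c: True for c in CRITERIA}
reviews = [
    ItemReview("item-01", "reviewer-A", Recommendation.FIELD_TEST, dict(all_met)),
    ItemReview("item-01", "reviewer-B", Recommendation.REVISE,
               {**all_met, "Language": False}),
]
counts = tally(reviews, "item-01")
```

Keeping one record per member per item preserves individual judgments, so the facilitator can see where reviewers disagree before the group settles on a recommendation.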
23 Role of Test Item Review Committees
- You will also be asked to provide group feedback
on the following question:
- Does this item measure more specific knowledge
and ideas that might be part of an end-of-unit
test, or does it measure extended learning that
would be part of a cumulative science assessment?
24 Role of Test Item Review Committees
- You will also be asked to provide group feedback
on the inquiry task by answering the following
questions:
- 1. Is it possible for students at this grade
level to answer the questions without completing
the task?
- 2. Do the questions related to this task require
scientific knowledge and understanding to answer?
25 Role of the Test Item Review Committees
- You are here today to represent your diverse
perspectives. We hope that you:
- share your thoughts vigorously and listen just as
intensely (we have different expertise and we can
learn from each other),
- use the pronouns "we" and "us" rather than "they"
and "them" (we are all working together to make
this the best assessment possible), and
- grow from this experience (I know we will).
- And we hope that today will be the beginning of
some new interstate friendships.