Title: Robert L. Linn

1

Educational Accountability Systems
Robert L. Linn
Paper prepared for The CRESST Conference The
Future of Test-Based Educational Accountability,
January 23, 2007
2
Test-based Accountability

Popular tool for purposes of educational reform
Accountability is one of few tools available to
policymakers to leverage changes in instruction
In use in many states since the early 1990s
Quite a range of approaches to using student test
results for accountability systems
Central component of NCLB

3
Some Rationales for Testing

Clarify expectations for teaching and learning
Motivate greater effort on part of students,
teachers and administrators
Monitor educational progress of schools and
students
Identify schools that need to be improved
Provide a basis for distributing rewards and
sanctions
Monitor achievement gaps and encourage the
closing of those gaps

4
No Child Left Behind

NCLB is the latest in a series of
re-authorizations of the Elementary and Secondary
Education Act (ESEA) of 1965
ESEA was the main educational component of
President Johnson's Great Society program
ESEA, as re-authorized every few years, is the
principal federal law affecting elementary and
secondary education throughout the country

5
Assessments

Basic skills and norm-referenced tests of 1980s
and early 90s
A Nation at Risk encouragement of more ambitious
tests - performance assessments
NCLB increased uniformity of assessments for
grades 3-8 of reading and mathematics

6
Content Standards

States encouraged to develop content standards by
Goals 2000 and IASA
NCLB requires all states to have academic content
standards in reading/English language arts,
mathematics, and science
All states adopted content standards by 2005 to
meet requirements of NCLB if they had not already
done so

7
NCLB

States required to adopt challenging
academic content standards that specify what
children are expected to know and be able to do,
contain coherent and rigorous content, and encourage
the teaching of advanced skills (NCLB, 2001,
part A, subpart 1, Sec. 1111, a (D)).

8
Performance Standards

Called Academic Achievement Standards by NCLB
Absolute rather than normative
Establish fixed criterion of performance
Intended to be challenging
Relatively small number of levels
Apply to all, or essentially all students
Depend on judgment

9
Standards Movement

High expectations of NCLB consistent with the
standards movement of 1990s
National Assessment of Educational Progress
(NAEP) standards (called achievement levels) set
at ambitious levels
NAEP 1990 proficient level in mathematics set at
high levels
Grade 4: 87th percentile, 13% proficient or
above
Grade 8: 85th percentile, 15% proficient or
above
Grade 12: 88th percentile, 12% proficient or
above

10
(No Transcript)
11
(No Transcript)
12
States with the Highest and Lowest Percent
Proficient or Above on State Assessments in 2005

Highest
Reading Grade 4: Mississippi, 89
Reading Grade 8: North Carolina, 88
Math Grade 4: North Carolina, 92
Math Grade 8: Tennessee, 87

Lowest
Reading Grade 4: Missouri, 35
Reading Grade 8: South Carolina, 30
Math Grade 4: Maine and Wyoming, 39
Math Grade 8: Missouri, 16

13
Contrasts of Percent Proficient or Above on
NAEP and State Assessments (Grade 8 Mathematics)

State       NAEP   State Assessment
Missouri    21     16
Tennessee   26     87

14
Alignment

Alignment of assessments and content standards
viewed as critical by proponents of
standards-based reform
NCLB peer review requires states to demonstrate
alignment, usually through studies by independent
contractors

15
Alignment of Assessments to Content Standards

Webb
Categorical concurrence
Depth of knowledge consistency
Range of knowledge correspondence
Balance of representation
Porter
Content categories by cognitive demand matrix

16
Alignment of Assessments to Content Standards
(Cont'd)

Achieve
Content centrality
Performance centrality
Challenge
Balance
Range

17
Approaches to Test-Based Accountability

Status approach: compare assessment results for a
given year to fixed targets (the NCLB approach)
Growth approach: evaluate growth in achievement
(allowed for NCLB pilot program states)
Growth may be measured by comparing performance
of successive cohorts of students
Growth may be evaluated by longitudinal tracking
of students from year to year
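
The three comparisons named on this slide can be contrasted in a short Python sketch. It is illustrative only: the function names, the dictionary-of-scores layout, and the sample numbers are hypothetical and do not come from NCLB or any state system.

```python
from statistics import mean

def status_met(pct_proficient_this_year, target):
    """Status approach: compare one year's result to a fixed target."""
    return pct_proficient_this_year >= target

def cohort_change(pct_this_year, pct_last_year):
    """Successive cohorts: e.g. this year's grade 4 vs. last year's grade 4.
    The two cohorts are different groups of students."""
    return pct_this_year - pct_last_year

def longitudinal_gain(scores_last_year, scores_this_year):
    """Longitudinal tracking: mean gain for students tested in both years.
    Inputs map a student ID to a scale score; unmatched students are dropped."""
    matched = scores_last_year.keys() & scores_this_year.keys()
    return mean(scores_this_year[s] - scores_last_year[s] for s in matched)

# Hypothetical data: student "c" left the school and "d" arrived.
last_year = {"a": 200, "b": 210, "c": 190}
this_year = {"a": 212, "b": 214, "d": 230}
print(status_met(48.0, target=50.0))
print(cohort_change(48.0, 44.0))
print(longitudinal_gain(last_year, this_year))
```

In this sketch only the longitudinal version compares the same students with themselves, which is why it is the one that controls for prior achievement.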

18
Status and Growth Approaches

Status approach has many drawbacks when used to
identify schools as successes or in need of
improvement
Does not account for differences in student
characteristics, most importantly differences in
prior achievement
Growth approach has advantage of accounting for
differences in prior achievement, but may set
different standards for schools that start in
different places

19
NCLB Pilot Program

Five states have received approval to use growth
model approaches to determining AYP
Early results suggest that it does not radically
alter the proportion of schools failing to make
AYP
Constraints on growth models are severe, most
notably the retention of the requirement that
they lead to the completely unrealistic goal of
100% proficiency by 2014

20
Multiple-Hurdle Approach

NCLB uses multiple-hurdle approach
Schools must meet multiple targets each year:
participation and achievement, separately for
reading and mathematics for the total student
body and for subgroups of sufficient size
Many ways to fail to make AYP (miss any target),
but only one way to make AYP (meet or exceed
every target)
Large schools with diverse student bodies at a
relative disadvantage in comparison to small
schools or schools with relatively homogeneous
student bodies
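
A minimal sketch of the multiple-hurdle rule, written in Python, shows why each added subgroup is one more way to miss AYP. The names (`Result`, `makes_ayp`, `MIN_N`), the minimum subgroup size, the target values, and the sample schools are all hypothetical placeholders, not values from NCLB or any state plan.

```python
from dataclasses import dataclass

MIN_N = 30  # hypothetical minimum subgroup size; each state sets its own

@dataclass
class Result:
    group: str             # "all" or a subgroup label
    subject: str           # "reading" or "math"
    n_students: int
    pct_tested: float      # participation rate, in percent
    pct_proficient: float  # percent proficient or above

def makes_ayp(results, participation_target=95.0, proficiency_target=50.0):
    """Conjunctive rule: every reportable group must clear every target."""
    for r in results:
        if r.group != "all" and r.n_students < MIN_N:
            continue  # subgroup too small to count as a separate hurdle
        if r.pct_tested < participation_target:
            return False
        if r.pct_proficient < proficiency_target:
            return False
    return True

# Same overall results; the second school also has one reportable subgroup.
homogeneous = [
    Result("all", "reading", 120, 98.0, 62.0),
    Result("all", "math", 120, 97.0, 55.0),
]
diverse = homogeneous + [Result("subgroup A", "math", 40, 96.0, 41.0)]
print(makes_ayp(homogeneous), makes_ayp(diverse))
```

With these made-up numbers the two schools have identical schoolwide results, yet only the one with a reportable subgroup misses AYP.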

21
Compensatory Approach

State systems often use a compensatory approach
rather than a multiple-hurdle approach
An advantage of compensatory approach is that it
creates fewer ways for a school to fall short of
targets
Hybrid models also possible that use a
combination of compensatory and multiple-hurdle
approaches
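
For contrast with the multiple-hurdle rule, a compensatory rule can be sketched as a single weighted composite compared to one target. The indicator names, weights, target, and scores below are hypothetical; actual state systems define their own composites.

```python
def compensatory_score(indicators, weights):
    """Weighted average of indicator scores.
    A high score on one indicator can offset a low score on another."""
    total_weight = sum(weights[name] for name in indicators)
    weighted_sum = sum(indicators[name] * weights[name] for name in indicators)
    return weighted_sum / total_weight

def meets_target(indicators, weights, target):
    """One hurdle only: the composite must reach the target."""
    return compensatory_score(indicators, weights) >= target

# Hypothetical school: weak in math, stronger elsewhere.
school = {"reading": 62.0, "math": 41.0, "participation": 97.0}
weights = {"reading": 0.4, "math": 0.4, "participation": 0.2}
print(round(compensatory_score(school, weights), 1))
print(meets_target(school, weights, target=55.0))
```

A hybrid model could apply a composite like this one and still keep a small number of hard hurdles, such as a minimum participation rate.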

22
Disaggregation

Critical for monitoring the closing of gaps in
achievement
No real relevance for small schools with
homogeneous student bodies
However, it leads to many hurdles that large,
diverse schools must meet

23
Implications of Subgroup Results

Schools with multiple subgroups at relative
disadvantage to schools with homogeneous student
population
May want to consider combining across more than
one year as is already allowed for students with
disabilities

24
Subgroup Gains in NAEP Mathematics Average Scale
Scores (1996 to 2005)

Group      Grade 4   Grade 8
White      14        8
Black      22        15
Hispanic   19        11
25
Closing Achievement Gaps: NAEP Mathematics
Average Scale Scores (1996 to 2005)

Groups               Grade 4   Grade 8
White and Black      -8        -7
White and Hispanic   -5        -3
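
The gap changes on this slide follow directly from the subgroup gains on the previous slide: the change in a gap is the reference group's gain minus the comparison group's gain. A short Python check, using the slide values (the dictionary layout and the function name `gap_change` are just illustrative choices):

```python
# Gains in NAEP mathematics average scale scores, 1996 to 2005 (from slide 24).
gains = {
    "White": {4: 14, 8: 8},
    "Black": {4: 22, 8: 15},
    "Hispanic": {4: 19, 8: 11},
}

def gap_change(reference, group, grade):
    """Change in the (reference minus group) gap; negative means it narrowed."""
    return gains[reference][grade] - gains[group][grade]

for group in ("Black", "Hispanic"):
    print(group, gap_change("White", group, 4), gap_change("White", group, 8))
```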
26
Use of Academic Achievement Standards

Apparent closing or widening of achievement gaps
using percent above cut scores can depend on
choice of level, e.g., basic or above vs.
proficient or above
See, for example, Holland, P. W. (2002). Two
measures of changes in gaps between CDFs of test
score distributions. JEBS, 27, 3-17.
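
The dependence on the choice of cut score can be illustrated with a small Python sketch. Everything in it is a made-up assumption for illustration: two groups with normally distributed scores, equal standard deviations, and the same 10-point gain, so the gap in average scores does not change at all. The cut scores 200 and 250 are likewise hypothetical.

```python
from statistics import NormalDist

def pct_above(mean, sd, cut):
    """Percent of a normal score distribution at or above a cut score."""
    return 100 * (1 - NormalDist(mean, sd).cdf(cut))

def pct_gap_change(cut, gain=10, mean_a=230, mean_b=200, sd=30):
    """Change in the percent-above-cut gap (group A minus group B)
    when both groups gain the same number of scale-score points."""
    before = pct_above(mean_a, sd, cut) - pct_above(mean_b, sd, cut)
    after = pct_above(mean_a + gain, sd, cut) - pct_above(mean_b + gain, sd, cut)
    return after - before

low_cut_change = pct_gap_change(cut=200)   # a "basic"-like cut
high_cut_change = pct_gap_change(cut=250)  # a "proficient"-like cut
print(round(low_cut_change, 1), round(high_cut_change, 1))
```

Under these assumptions the gap in percent above the lower cut shrinks while the gap in percent above the higher cut grows, even though the difference in means is constant.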

27
Subgroup Gains in NAEP Mathematics Percent At or
Above Basic or Proficient (1996 to 2005)

Group      Grade 4 Basic   Grade 4 Prof.   Grade 8 Basic   Grade 8 Prof.
White      14              20              7               9
Black      33              10              17              5
Hispanic   28              12              13              5
28
Changes in Achievement Gaps: NAEP Mathematics
Percent At or Above Basic or Proficient (1996 to
2005)

Groups               Grade 4 Basic   Grade 4 Prof.   Grade 8 Basic   Grade 8 Prof.
White and Black      -19             10              -10             4
White and Hispanic   -14             8               -6              4
29
Gaps and Percent Above Cuts

Using differences in percent above cut scores
can give a confusing impression of a rather
simple situation (Holland, 2002)
Need to look beyond percents basic or above or
proficient or above
Compare average scale scores, effect size
statistics, and comparisons of distributions
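
One example of an effect size statistic is a standardized mean difference, sketched below in Python. The choice of this particular statistic (a pooled-standard-deviation version, often called Cohen's d) is an assumption; the slide does not say which effect size is meant. The scores are made up.

```python
from math import sqrt
from statistics import mean, variance

def standardized_mean_difference(scores_a, scores_b):
    """Difference in group means divided by the pooled standard deviation."""
    n_a, n_b = len(scores_a), len(scores_b)
    pooled_var = (
        (n_a - 1) * variance(scores_a) + (n_b - 1) * variance(scores_b)
    ) / (n_a + n_b - 2)
    return (mean(scores_a) - mean(scores_b)) / sqrt(pooled_var)

# Hypothetical scale scores for two small groups.
group_a = [2, 4, 6]
group_b = [1, 3, 5]
print(standardized_mean_difference(group_a, group_b))
```

Unlike a percent-above-cut comparison, this statistic uses the whole score distribution and does not depend on where a performance standard happens to be set.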

30
Comparing States on Closing Gaps

Gaps measured in terms of percent proficient or
above on state assessments can be quite
misleading due to the wide variation in the
stringency of state definitions of the proficient
standard

31
Performance Indexes

Focusing only on percent proficient or above has
disadvantages
Does not give credit to students moving from below
basic to basic
Encourages attention to students thought to be
near the proficient cut, possibly at the expense
of other students
Performance Index scores avoid these problems

32
Illustration of MA Index Scores for a
Hypothetical School in 2006 and 2007

Performance Level   Points   N 2006   N 2007   2006 Points   2007 Points
Prof                100      50       50       5,000         5,000
NI high             75       75       100      5,625         7,500
NI low              50       100      125      5,000         6,250
W/F high            25       100      125      2,500         3,125
W/F low             0        75       50       0             0
Total                        400      400      18,125        21,875
33

School Index Scores
2006 Score: 18,125/400 = 45.31
2007 Score: 21,875/400 = 54.69
Percent Proficient or Above
2006: 12.5
2007: 12.5

34
Score Inflation

Defined as "... a gain in scores that
substantially overstates the improvement in
learning it implies" (Koretz, 2005)
Research has found that gains in scores in
high-stakes accountability systems often fail to
generalize to other measures of achievement
Narrow focus on past tests rather than broader
content standards can cause score inflation
Emphasis on alignment and the need to repeat a
substantial percentage of items on assessments
for year-to-year equating may contribute to score
inflation

35
Validity of Causal Inferences

Status approach does not provide a defensible
basis for inferring that a higher-scoring school is
more effective than a lower-scoring school
Making an inference about school quality requires
the elimination of many alternate explanations of
differences in student achievement other than
differences in instructional effectiveness
Prior achievement differences
Differences in support from home

36
Inferences About Schools

Growth models rule out the alternate explanation
of differences in prior achievement
Nonetheless, causal inferences about school
effectiveness are not justified by the growth
approach to test-based accountability
Many rival explanations for between-school
differences in growth besides differences in
school quality or effectiveness
Results better thought of as descriptive for
generating hypotheses about school quality that
need to be evaluated

37
School Characteristics and Instructional Practice

School differences in achievement and in growth
describe outcomes and can be the source of
hypotheses about school effectiveness
Accountability systems need to be informed by
direct information about school characteristics
and instructional practices

38
Conclusions

Test-based accountability has become a pervasive
part of efforts to improve education in the U.S.
The features of accountability systems matter
Requirement to include nearly all students in
test-based accountability has brought needed
attention to groups often ignored in the past

39
Conclusions (continued)

Performance standards are supposed to define the
level of achievement that students should reach,
but
The definition of proficient achievement varies
so widely from state to state that it lacks any
semblance of common meaning
Using percent proficient or above as a primary
indicator does not give credit for gains of
students at other levels
Using percent proficient or above to monitor gaps
in achievement is not an adequate approach

40
Conclusions (continued)

Status-based approach to accountability does not
provide a valid way of distinguishing successful
schools from schools that are in need of
improvement
Growth models have advantages over status models
but still are best thought of as providing
descriptive information rather than providing
the basis for causal inferences about school
quality
