DATA WISE - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: DATA WISE

1
DATA WISE, MARCH 14, 2008
2
Using data effectively does not mean getting
good at crunching numbers. It means getting good
at working together.
The Data Wise Process: developed at Harvard
University to help educators use the piles of
student assessment results that land on their
desks to improve learning in their schools.
3
(No Transcript)
4
ORGANIZING AND PREPARING DATA
DISCUSSING AND USING DATA IN PRODUCTIVE WAYS

  • Data Mentors
  • Data Days at schools
  • Guide professional development of coaches/teachers
  • Intervention Teams
  • Direct support of school data personnel
  • Capacity to influence optional programs based on data
  • Involvement with school improvement
5
Chapter Two: Building Data Assessment Literacy
  • Principles for interpreting data
  • Types of measures
  • Keeping it in perspective
  • What does SLCSD do?
  • Strategies for interpreting data
  • Data through different lenses
6
Technical vs. Cultural Problem
Principles for interpreting data
  • When we ask teachers to look at evidence of
    their school's effectiveness, we are not just
    asking them to crunch numbers and plot graphs.
    That's the technical part. The reality is that
    we are challenging the existing culture.
  • From Getting Excited About Data: Combining
    People, Passion, and Proof to Maximize Student
    Achievement, by Edie Holcomb

7
Principles for Interpreting Data
  • Sampling
  • Discrimination
  • Error
  • Reliability
  • Score Inflation

Multiple measures allow for a more complete
picture of student performance
8
Sampling
  • Concept: the curricular domain
  • It is impossible to test the entire domain
  • The test is a sample of the curricular domain
  • Representativeness describes the extent to which
    the sample covers the domain
  • The point is to generalize from the test results
    to the entire domain
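To make the sampling idea concrete, here is a minimal Python sketch (all numbers are invented for illustration): it treats each test as a random sample of a larger curricular domain and shows how observed scores scatter around the student's true domain mastery.

```python
import random

random.seed(0)

# Hypothetical curricular domain: 500 skills, of which this student
# has truly mastered 70% (all numbers invented for illustration).
domain = [1] * 350 + [0] * 150
true_mastery = sum(domain) / len(domain)

# Each "test" samples 40 items from the domain; the observed score
# generalizes to the domain only as well as the sample represents it.
scores = [sum(random.sample(domain, 40)) / 40 for _ in range(1000)]

print(f"true domain mastery: {true_mastery:.2f}")
print(f"observed test scores ranged from "
      f"{min(scores):.2f} to {max(scores):.2f}")
```

The spread in the printed range is sampling error: the same student, with the same mastery, can score noticeably higher or lower depending on which items happen to be on the test.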

9
Discrimination
  • We count on the test to be useful in determining
    who has or hasn't mastered the content
  • Example

A poorly discriminating item (the distractors are implausible,
so nearly everyone answers correctly regardless of mastery):
When did the Civil War end?  a. 1950  b. 1492  c. 1776  d. 1865

A better-discriminating item (plausible distractors):
When did the Civil War begin?  a. 1861  b. 1865  c. 1867  d. 1863
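One common way to quantify this is the upper-lower discrimination index, D = (proportion correct in the top-scoring group) minus (proportion correct in the bottom-scoring group). Below is a minimal Python sketch with invented data; the function name and the 27% group size are conventional choices, not from the slides.

```python
def discrimination_index(item_correct, total_scores, frac=0.27):
    """Upper-lower discrimination index: D = p_upper - p_lower,
    comparing the top and bottom scoring groups (often 27% each)."""
    n = max(1, int(len(total_scores) * frac))
    order = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    lower, upper = order[:n], order[-n:]
    p_lower = sum(item_correct[i] for i in lower) / n
    p_upper = sum(item_correct[i] for i in upper) / n
    return p_upper - p_lower

# Invented data: ten students' total scores and two items' responses.
totals    = [12, 14, 15, 18, 20, 22, 25, 27, 28, 30]
weak_item = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]   # everyone correct -> D = 0.0
good_item = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]   # tracks mastery   -> D = 1.0

print(discrimination_index(weak_item, totals))
print(discrimination_index(good_item, totals))
```

An item like the first Civil War question above behaves like `weak_item`: it tells us nothing about who has mastered the content.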
10
Error
  • All tests have error
  • Measurement error
  • Random student error
  • Random test error
  • The goal is to minimize error, but it never goes
    away completely, so take this into account in
    analyzing results
  • Utah accountability systems use statistical tools
    to help mitigate the effects of error
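One standard statistical tool for taking error into account is the standard error of measurement, SEM = SD × √(1 − reliability). The slides give no specific figures, so the values below are hypothetical; the sketch shows how an SEM turns a single observed score into a band of plausible true scores.

```python
import math

def sem(sd, reliability):
    """Standard error of measurement: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

# Hypothetical figures: scale SD of 15, reliability of 0.90.
error = sem(sd=15, reliability=0.90)
observed = 172
# Roughly 68% of the time the "true" score lies within +/- 1 SEM.
print(f"SEM = {error:.1f}")
print(f"68% band around {observed}: "
      f"{observed - error:.0f} to {observed + error:.0f}")
```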

11
Validity and Reliability
  • Validity: Does the test really measure what we
    say it measures?
  • Reliability: Are test results stable over
    multiple administrations?
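Test-retest reliability is commonly estimated as the correlation between scores from two administrations of the same test. A minimal sketch using Python's standard library (`statistics.correlation`, Python 3.10+), with invented scores:

```python
from statistics import correlation  # Python 3.10+

# Invented scores for the same ten students on two administrations
# of the same test a few weeks apart.
first  = [55, 61, 64, 70, 72, 75, 80, 84, 88, 93]
second = [58, 59, 66, 68, 74, 73, 82, 85, 86, 95]

# Test-retest reliability is estimated as the correlation between
# the two administrations; values near 1.0 indicate stable results.
print(f"test-retest reliability ~ {correlation(first, second):.2f}")
```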

12
Score Inflation
  • Scores typically rise over time, but rising scores
    do not always indicate more or better student
    learning

Other explanations include:
  • Low-hanging fruit (good, but a limited life cycle)
  • Teaching to the test (the good kind and the bad kind)
  • Cheating, or inadvertent administration mistakes
13
Roadblocks to Using Data Effectively
  • This will just lead to mindless test prep!
  • The best kind of test preparation is one where
    students must think, reason, write, apply, and
    communicate their understanding.

14
Gaming the System
  • Inappropriate reallocation of resources
  • Attendance issues (Sept. 15th)
  • Narrowing the content taught
  • Inflation over time
  • Specific practices (p. 51)

When students gain true mastery, improvement
shows in many places: other tests, the quality of
academic work, not just on one CRT.
15
Examples of Good "Teaching to the Test"
  • Teaching more
  • Working harder
  • Working more effectively
  • Rules of thumb
  • Does it teach in a way that kids will remember
    past the test window?
  • Does it generate real understanding?
  • Would improvements generalize to other tests and
    to the real world?

16
Example
17
Types of Measures
  • Measures are the yardsticks used to measure
    student performance. The more measures used, the
    more robust and complete the picture.
  • State assessments (CRT)
    - Usually taken in spring and reported the
      following fall
    - Not vertically aligned
    - Tests vary from year to year
  • National assessments (ITBS, etc.)
    - Some districts choose to supplement the state
      assessment with a national assessment
    - Many are vertically aligned and are aligned
      from year to year
    - Not aligned to the state curriculum framework
  • Diagnostic assessments (DRA, DIBELS)
    - Help identify students who need interventions
      and supports
    - May not be vertically aligned
  • Formative assessments and locally constructed
    tests

18
Formal Assessments (Measures)
  • "The public, in general, supports high-stakes
    testing and believes that the tests are fair and
    scientific. These tests have the ability to
    reduce and summarize the complexities of reading to
    single raw scores and percentile rankings, and in
    doing so they appear almost magical."
  • Reading Development: Assessment of Early
    Literacy, A Review of the Literature
  • Prepared for the Utah Department of Education
    by Karin K. Hess, Ed.D., Center for Assessment,
    Dover, NH, April 2006; updated February 2007

19
Formal Assessments (Measures)
  • Research shows that teachers (who will be making
    instructional and curricular decisions) tend not
    to use or value test results from large-scale or
    standardized assessments (National Reading
    Conference Policy Brief, 2004). Classroom
    teachers tend to see greater value in formative
    assessment, and skilled teachers do it constantly,
    relying heavily on classroom assessment to make
    instructional and placement decisions.

20
Keeping It in Perspective
People demonstrate growth and proficiency that
would not show up on any single test.

Standardized Tests
  • ADVANTAGES
    - Items generally well written
    - Standard conditions of administration
    - Standard conditions of scoring and interpretation
  • DISADVANTAGES
    - Students may guess
    - Designed for large numbers; not reflective of
      individual differences
    - Scores tend to correlate negatively with risk
      factors

Locally Constructed Tests
  • ADVANTAGES
    - Designed for a specific population
    - Designed for a specific instructional purpose
  • DISADVANTAGES
    - Inadequate for research purposes
    - Reliability/validity issues
    - Should not be used for placement (teacher bias)
    - No longitudinal ability if the test changes
      from year to year

21
Equitable Assessments
Making sure an assessment is fair for all groups
of students: reviewed for (a) stereotypes, (b)
situations that may favor one culture over
another, (c) excessive language demands that
prevent some students from showing their
knowledge, and (d) the assessment's potential to
include students with disabilities or limited
English proficiency.
22
  • Data can be dangerous! You should avoid:
  • Comparing performance on tests that have not been
    aligned. For example, don't compare 3rd grade math
    scale scores to 3rd grade ELA scale scores.
  • Making large inferences from a few data points.
    For example:
  • Be wary of conclusions about a subject area based
    on one item on a test
  • Be wary of conclusions about a student's overall
    level based on performance on one test
  • Be wary of conclusions about a student's
    strengths or weaknesses based on performance on
    one item on one test

23
How do you measure improvement?

Historical comparisons (successive groups at the same grade)
  • ADVANTAGES
    - Simpler to implement
    - Grade-specific tests
    - Designed to measure a school's performance
  • DISADVANTAGES
    - Does not measure effective teaching for one
      group over time ("perpetual motion machine")
    - Susceptible to changes in population

Longitudinal comparisons (the same students over time)
  • ADVANTAGES
    - Measures what students gain over time
    - Judges the effectiveness of specific grade-level
      instruction
  • DISADVANTAGES
    - Only works if the curriculum is vertically aligned
    - Changes in population
    - Groups can suffer from a ceiling effect

24
Strategies for interpreting data
  • "Low" to "minimal" is random guessing; you cannot
    really say this is movement.
  • Students in these ranges often move back and
    forth from year to year depending on core mastery.
  • Vertically modulated: "high partial" in grade one
    to "high partial" in grade two represents one
    year's growth (sort of).
  • Every test has a different cut score needed to
    pass the test (e.g., Grade 1 pass = 77; Grade 2
    pass = 73).
25
(No Transcript)
26
  • ITBS reports different types of results:
  • National Percentile Rank (NP)
  • Grade Equivalent of Average SS (GE)
  • National Stanine of Average Standard Scores (NS)
  • Normal Curve Equivalent of Average SS (NCE)
  • (GE) The GE is a decimal number that describes
    performance in terms of year and month in school.
    For example, if a fifth-grade student obtains a
    GE of 6.4 on the Vocabulary test, his/her score
    is like the score a typical student finishing the
    fourth month of sixth grade would likely get on
    the Vocabulary test.

Grade 3                                        Reading   Mathematics
Number of students included                    1804      1790
Average Standard Score (SS)                    176.6     171.9
Grade Equivalent of Avg. SS (GE)               3.3       3.0
National Stanine (NS)                          5         5
Normal Curve Equivalent (NCE)                  53        49
Percentile Rank, National Student Norms (NP)   47        55
  • (NCE) It is appropriate to average NCEs when
    describing group performance or when checking
    growth over time. These scores are normalized
    standard scores with equal intervals. They range
    from 1 to 99 with an average of 50. Because NCEs
    cover the same score range as percentile ranks
    (1-99), the two types of scores
    are sometimes mistakenly interchanged.
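The NCE scale is conventionally defined as NCE = 50 + 21.06 × z, where z is the normal deviate of the percentile rank; the 21.06 factor pins NCEs 1, 50, and 99 to percentile ranks 1, 50, and 99. A small Python sketch of the conversion (the function names are mine), which also shows why NP and NCE values should not be swapped:

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def percentile_to_nce(np_rank):
    """NCE = 50 + 21.06 * z, where z is the normal deviate of the
    percentile rank; 21.06 pins NCE 1/50/99 to NP 1/50/99."""
    return 50 + 21.06 * STD_NORMAL.inv_cdf(np_rank / 100)

def nce_to_percentile(nce):
    return 100 * STD_NORMAL.cdf((nce - 50) / 21.06)

# NP 47 and NCE 53 (the reading row of the table above) are not
# interchangeable, even though both scales run 1-99:
print(f"NP 47  -> NCE {percentile_to_nce(47):.0f}")   # ~48
print(f"NCE 53 -> NP  {nce_to_percentile(53):.0f}")   # ~56
```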

27
  • (NP) The percentile shows a student's relative
    position or rank compared to the norm group. For
    example, if a student earned a percentile rank of
    62 on the language test, it means that she/he
    scored higher than 62 percent of the students in
    the norm group.
  • (NS) Stanine scores range from 1 to 9 and have
    an average value of 5. They can be considered
    coarse groupings of national percentile ranks,
    and so are less precise. The fact that 23 and 24
    are consecutive NP ranks but fall in different
    stanines (3 and 4) points out the potential
    problems with stanines.

Grade 3                                        Reading   Mathematics
Number of students included                    1804      1790
Average Standard Score (SS)                    176.6     171.9
Grade Equivalent of Avg. SS (GE)               3.3       3.0
National Stanine (NS)                          5         5
Normal Curve Equivalent (NCE)                  53        49
Percentile Rank, National Student Norms (NP)   47        55
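A sketch of the percentile-to-stanine grouping, using one common cut table; exact cut points vary by publisher, but these cuts reproduce the slide's example of NP 23 and NP 24 landing in stanines 3 and 4:

```python
from bisect import bisect_left

# Upper percentile-rank bound of stanines 1 through 8 in one common
# cut table (publishers vary slightly); stanine 9 covers the rest.
CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine(np_rank):
    return bisect_left(CUTS, np_rank) + 1

# Consecutive NP ranks can straddle a stanine boundary:
print(stanine(23), stanine(24))  # 3 4
```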
28
Data through different lenses
  • SNAPSHOT
  • HISTORICAL
  • LONGITUDINAL
  • GAINS
  • STUDENT LISTING
  • ITEM ANALYSIS

29
Snapshot
How did students perform at a certain point in
time?
  • Shows how a group of students performed against a
    given measure at a certain point in time.
  • Limitations: This analysis only presents one
    point in time.

(Graph Type Bar)
30
Historical
How did students at a certain grade-level perform
historically?
  • Looks at how students at a particular grade level
    performed on a given measure across multiple
    years.
  • This is what NCLB uses to calculate AYP.
  • Limitations: This analysis does not take into
    account differences in the group of students from
    year to year.

(Graph Type Stacked Bar)
31
Longitudinal
How did a cohort of students perform over time?
  • Looks at a cohort of students over time.
  • Shows real gains
  • Limitations: Comparisons of a group of students
    from one year to another are only valid using a
    vertically aligned test.

(Graph Type Bar)
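To make the difference between the snapshot, historical, and longitudinal lenses concrete, here is a short pandas sketch over an invented long-format results table; all column names and values are hypothetical.

```python
import pandas as pd

# Invented long-format results: one row per student per year.
df = pd.DataFrame({
    "student": ["a", "b", "c", "a", "b", "c"],
    "year":    [2006, 2006, 2006, 2007, 2007, 2007],
    "grade":   [3, 3, 4, 4, 4, 5],
    "score":   [165, 172, 180, 176, 181, 188],
})

# Snapshot: one grade at one point in time.
snapshot = df[(df.year == 2006) & (df.grade == 3)]["score"].mean()

# Historical: the same grade level across years (different students).
historical = df[df.grade == 4].groupby("year")["score"].mean()

# Longitudinal: the same cohort of students followed across years.
cohort = df[(df.year == 2006) & (df.grade == 3)]["student"]
longitudinal = df[df.student.isin(cohort)].groupby("year")["score"].mean()

print(snapshot)
print(historical)
print(longitudinal)
```

Note that the historical view compares different students at the same grade, while the longitudinal view follows the same students, which is why only the latter shows real gains.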
32
Gains
How did students who performed at each level on a
prior assessment perform on subsequent
assessments?
  • Looks at the extent to which students are
    improving over time or losing ground based on a
    particular measure.
  • Limitations: Caution must be used when drawing
    conclusions about a given student based upon
    performance on two tests.

(Graph Type Stacked Column)
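A gains view is essentially a cross-tabulation of prior level against current level for the same students. A minimal sketch with invented proficiency levels:

```python
import pandas as pd

# Invented proficiency levels for the same five students on a prior
# and a subsequent assessment.
levels = pd.DataFrame({
    "prior":   ["minimal", "partial", "partial", "proficient", "minimal"],
    "current": ["partial", "partial", "proficient", "proficient", "minimal"],
})

# For each prior level, where did students land the next time?
print(pd.crosstab(levels["prior"], levels["current"]))
```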
33
Student Listing
What are the characteristics of specific
students?
  • Allows the analysis of students in a group in
    relation to each other.
  • Conditional formatting can be added to highlight
    outliers.
  • Limitations: Student listings can be difficult to
    interpret when too many data elements are
    included.

34
Item Analysis
How did a group of students perform on an item or
on a set of items on a specific assessment?
  • Displays how students did on each item or within
    a particular standard or strand.
  • Providing reference groups is important for tests
    that are not aligned from year to year because
    that is the only way to determine relative
    performance.
  • Limitations: Smaller sample sizes (e.g.,
    classroom-level) limit the inferences that can be
    made.

(Graph Type High-Low)
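A minimal item-analysis sketch with invented responses: percent correct per item for a class, shown beside a hypothetical reference group (here a made-up district average) so relative performance can be judged.

```python
import pandas as pd

# Invented item-level responses (1 = correct), one column per item.
items = pd.DataFrame({
    "item_1": [1, 1, 0, 1, 1],
    "item_2": [0, 1, 0, 0, 1],
    "item_3": [1, 1, 1, 1, 1],
})

# Percent correct per item for this class, next to the reference group.
class_pct = (items.mean() * 100).rename("class")
district_pct = pd.Series({"item_1": 85, "item_2": 55, "item_3": 90},
                         name="district")
print(pd.concat([class_pct, district_pct], axis=1))
```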
35
  • For Next Time
  • Create a Data Overview
  • Issues of Accountability

36
(No Transcript)